AI Search Engines Cite Incorrect Sources in 60 Percent of Cases

American researchers examined the source citations of eight AI search engines.

A study by the Tow Center for Digital Journalism at Columbia University reveals serious accuracy problems with generative AI models used for news searches. The researchers tested eight AI search tools and found that more than 60 percent of their answers were incorrect.

What’s Going Wrong?

Researchers note that about a quarter of Americans use AI models as an alternative to traditional search engines. This raises serious concerns about reliability, as the study showed a high error rate.

The error rates varied widely by platform. Perplexity provided incorrect information in 37 percent of cases, while ChatGPT Search was wrong in 67 percent of cases. Grok 3 scored the worst, with an error rate of up to 94 percent. The tools often gave answers that seemed plausible but were incorrect.

The researchers gave the AI models excerpts from real news articles and asked each one for the headline, original publisher, publication date, and URL of the article. A total of 1,600 searches were performed. Notably, premium versions of some AI tools performed worse than their free counterparts: Perplexity Pro and Grok 3 Premium gave incorrect answers more often than the free versions. The tools also rarely declined to answer when they lacked reliable information.

The study also showed that some AI search tools ignored publishers’ instructions. For example, Perplexity’s free version correctly identified all ten tested excerpts from paywalled National Geographic content. This shouldn’t be possible, as National Geographic blocks AI web crawlers. Additionally, AI models often cited syndicated versions of articles on platforms like Yahoo News instead of the original publications. As a result, publishers receive less traffic even though their texts form the basis for the AI search results.
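
Publishers commonly tell crawlers what they may fetch through a robots.txt file. The sketch below shows roughly how a well-behaved crawler could check such a file before requesting a page, using Python's standard urllib.robotparser module. The robots.txt content, the crawler name ExampleAIBot, and the example.com URL are all hypothetical and do not reflect the actual configuration of National Geographic or of any AI tool named in the study.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one named AI crawler, allows everyone else.
# This is an illustration, not the real file of any publisher.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The blocked crawler is refused; any other crawler falls under the "*" rule.
blocked = is_allowed(ROBOTS_TXT, "ExampleAIBot", "https://example.com/article")
allowed = is_allowed(ROBOTS_TXT, "OtherBot", "https://example.com/article")
print(blocked, allowed)
```

Compliance with these rules is voluntary: nothing in the file technically prevents a crawler from skipping the check.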

Another common problem is the fabrication of URLs. More than half of the references by Google’s Gemini and Grok 3 led researchers to non-existent or broken links.

AI Also Popular in Europe

This issue is not unique to the US. According to the latest Digimeter from imec, the adoption of AI tools for information searches is also increasing in Flanders. The report states that 28 percent of Flemish people regularly use generative AI, and 21 percent have explicitly used ChatGPT.

A survey by ITdaily shows that 56 percent of respondents, consisting of IT professionals, use AI and are honest about it. 16 percent do not use AI, and 30 percent experience ‘AI shame’. This is mainly related to the ecological impact.

Although OpenAI and Microsoft have responded to the report, they did not provide direct answers to the criticisms. OpenAI emphasized that it supports publishers with clear references and summaries, while Microsoft stated that it adheres to existing protocols and guidelines. For now, the reliability of AI search engines remains a big question mark. It is also unclear whether measures will be taken to avoid incorrect source citations.
