OpenAI proudly debuted ChatGPT search in October as the next stage for search engines. The company boasted that the new feature combined ChatGPT’s conversational skills with the best web search tools, offering real-time information in a more useful form than any list of links. According to a recent review by Columbia University’s Tow Center for Digital Journalism, that celebration may have been premature. The report found ChatGPT to have a somewhat laissez-faire attitude toward accuracy, attribution, and basic reality when sourcing news stories.
What’s especially notable is that the problems crop up regardless of whether a publication blocks OpenAI’s web crawlers or has an official licensing deal for its content. The study asked ChatGPT to source 200 quotes drawn from 20 publications, and the results were all over the place.
Sometimes, the chatbot got it right. Other times, it attributed quotes to the wrong outlet or simply made up a source. OpenAI’s partners, including The Wall Street Journal, The Atlantic, and the Axel Springer and Meredith publications, sometimes fared better, but not with any consistency.
Gambling on accuracy is not what OpenAI or its partners want from ChatGPT’s news answers. The deals were trumpeted as a way for OpenAI to support journalism while improving ChatGPT’s accuracy. Yet when ChatGPT pulled quotes from Politico, an Axel Springer publication, the speaker it cited was often not the person who actually said them.
AI news to lose
The short answer to the problem lies in how ChatGPT finds and digests information. The web crawlers ChatGPT uses to access data can perform perfectly, yet the AI model underneath can still make mistakes and hallucinate. Licensed access to content doesn’t change that basic fact.
Of course, when a publication blocks those web crawlers outright, ChatGPT’s accuracy slides from newshound to wolf in sheep’s clothing. Outlets that use robots.txt files to keep ChatGPT away from their content, like The New York Times, leave the AI floundering and fabricating sources instead of admitting it has no answer for you. More than a third of the responses in the report fit this description, which suggests something deeper than a small coding fix. Arguably worse, when ChatGPT couldn’t access legitimate sources, it would turn to places where the same content had been republished without permission, perpetuating plagiarism.
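For context, this kind of block is trivial for a publisher to set up. A minimal sketch, assuming OpenAI’s documented GPTBot crawler as the target (sites can list other user agents the same way), is just a couple of lines in the site’s robots.txt file:

    # Tell OpenAI's GPTBot crawler to stay off every page on this site
    User-agent: GPTBot
    Disallow: /

Worth noting: robots.txt is a voluntary convention, so it keeps out compliant crawlers like GPTBot but doesn’t technically prevent access, and as the Tow Center found, it doesn’t stop ChatGPT from improvising an answer anyway.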
Ultimately, the misattributed quotes themselves matter less than what they imply for journalism and for AI tools like ChatGPT. OpenAI wants ChatGPT search to be where people turn for quick, reliable answers, properly linked and cited. If it can’t deliver, it undermines trust in both AI and the journalism it’s summarizing. For OpenAI’s partners, the revenue from a licensing deal might not be worth the traffic lost to unreliable links and citations.
So, while ChatGPT search can be a boon for plenty of tasks, be sure to check those links if you want to make sure the AI isn’t hallucinating answers from the internet.