As more publishers close content licensing deals with ChatGPT maker OpenAI, a study published this week by the Tow Center for Digital Journalism, looking at how the AI chatbot produces citations (i.e. sources) for publishers’ content, makes for interesting, or, well, worrying reading.
Simply put, the findings suggest that publishers remain at the mercy of the generative AI tool’s tendency to fabricate or misrepresent information, regardless of whether or not they allow OpenAI to crawl their content.
The research, conducted at Columbia Journalism School, examined citations produced by ChatGPT after it was asked to identify the source of sample quotes drawn from a mix of publishers, some of which had signed deals with OpenAI and some of which had not.
The Tow Center took block quotes from 10 stories produced by each of 20 randomly selected publishers (i.e. 200 different quotes in total), including content from The New York Times (which is currently suing OpenAI in a copyright claim); The Washington Post (which has no affiliation with the ChatGPT maker); The Financial Times (which has signed a licensing deal); and others.
“We chose quotes that, if pasted into Google or Bing, would return the source article in the top three results and evaluated whether OpenAI’s new search tool would correctly identify the article that was the source of each quote,” wrote Tow researchers Klaudia Jaźwińska and Aisvarya Chandrasekar in a blog post explaining their approach and summarizing their findings.
“What we found was not promising for news publishers,” they continue. “While OpenAI emphasizes its ability to provide users with ‘timely responses with links to relevant web sources,’ the company makes no explicit commitment to ensuring the accuracy of those citations. This is a notable omission for publishers who expect their content to be referenced and accurately represented.”
“Our testing found that no publisher, regardless of the degree of affiliation with OpenAI, was spared inaccurate representations of their content in ChatGPT,” they added.
Unreliable sourcing
The researchers say they found “numerous” cases where ChatGPT inaccurately cited publishers’ content, and they also found what they call “a spectrum of accuracy in the responses.” So while they found “some” citations that were entirely correct (i.e., ChatGPT accurately returned the publisher, date, and URL of the block quote shared with it), there were “many” citations that were entirely wrong, and “some” that fell somewhere in between.
In short, ChatGPT’s citations appear to be an unreliable mixed bag. The researchers also found very few cases where the chatbot did not project complete confidence in its (incorrect) answers.
Some of the quotes came from publishers that have actively blocked OpenAI’s search crawlers. In those cases, the researchers say they anticipated ChatGPT would have trouble producing correct citations. But they discovered that this scenario raised another problem, as the bot “rarely” admitted to being unable to produce an answer. Instead, it turned to confabulation to generate some (albeit incorrect) sourcing.
“In total, ChatGPT returned partially or completely incorrect answers on 153 occasions, although it only acknowledged an inability to accurately answer a query seven times,” the researchers said. “Only in those seven results did the chatbot use qualifying words and phrases such as ‘it seems’, ‘it’s possible’ or ‘could’, or statements such as ‘I couldn’t locate the exact article.’”
They compare this unhappy situation to a standard internet search, where a search engine like Google or Bing typically either locates an exact quote and directs the user to the website where it was found, or states that it found no results with an exact match.
ChatGPT’s “lack of transparency about its confidence in a response can make it difficult for users to assess the validity of a claim and understand which parts of a response they can or cannot trust,” they argue.
For publishers, there could also be reputational risks from incorrect citations, they suggest, as well as the commercial risk of readers being directed elsewhere.
Decontextualized data
The study also highlights another issue. It suggests that ChatGPT could effectively be rewarding plagiarism. The researchers recount a case in which ChatGPT erroneously cited a website that had plagiarized a piece of “deeply reported” New York Times journalism (i.e., by copy-pasting the text without attribution) as the source of the NYT story, speculating that, in that case, the bot may have generated this false response to fill an information gap caused by its inability to crawl the NYT’s website.
“This raises serious questions about OpenAI’s ability to filter and validate the quality and authenticity of its data sources, especially when it comes to unlicensed or plagiarized content,” they suggest.
In other findings likely to be worrying for publishers that have signed deals with OpenAI, the study found that ChatGPT’s citations were not always reliable in their cases either, so letting its crawlers in does not appear to guarantee accuracy.
The researchers argue that the fundamental problem is that OpenAI’s technology is treating journalism “as decontextualized content,” seemingly without regard for the circumstances of its original production.
Another problem the study points to is the variation in ChatGPT’s responses. The researchers tested asking the bot the same query multiple times and found that it “typically returned a different answer each time.” While this is typical of GenAI tools in general, in the context of a citation such inconsistency is obviously suboptimal if accuracy is what you are after.
While Tow’s study is small in scale (the researchers acknowledge that “more rigorous” testing is needed), it is notable given the high-profile deals that major publishers are busy striking with OpenAI.
If media companies expected these agreements to lead to special treatment for their content versus their competitors, at least in terms of producing accurate sourcing, this study suggests that OpenAI has not yet offered such consistency.
And for publishers that do not have licensing agreements but also have not completely blocked OpenAI’s crawlers (perhaps in the hope of at least picking up some traffic when ChatGPT returns content about their stories), the study makes for depressing reading too, as citations may not be accurate in their cases either.
In other words, there is no guaranteed “visibility” for publishers in OpenAI’s search engine, even when they do allow its crawlers in.
Nor does completely blocking OpenAI’s crawlers mean publishers can save themselves from the risk of reputational damage by preventing any mention of their stories in ChatGPT. The study found that the bot still incorrectly attributed articles to The New York Times despite the ongoing lawsuit, for example.
‘Little meaningful agency’
The researchers conclude that, as things stand, publishers have “little meaningful agency” over what happens to their content when ChatGPT gets its hands on it (directly or, well, indirectly).
The blog post includes a response from OpenAI to the research findings, which accuses the researchers of conducting an “atypical test of our product.”
“We support publishers and creators by helping ChatGPT’s 250 million weekly users discover quality content through summaries, quotes, clear links and attribution,” OpenAI also told them, adding: “We have collaborated with partners to improve in-line citation accuracy and respect publisher preferences, including enabling how they appear in search by managing OAI-SearchBot in their robots.txt file. We will keep improving search results.”
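For context, robots.txt is the plain-text file at a site’s root through which publishers express those crawler preferences. A minimal sketch of the kind of entry OpenAI is referring to (the directives shown here are illustrative assumptions; a given site’s actual file will differ) might look like this:

User-agent: OAI-SearchBot
Allow: /

Swapping “Allow: /” for “Disallow: /” is the conventional way a publisher would instead block that crawler outright, which is the posture some of the outlets in the study have taken.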