Chatbots generate invalid links to non-existent sources, undermining the very purpose of citing a reference.
By Tal Sokolov, Davidson Institute, the educational arm of the Weizmann Institute of Science
"Insanity is doing the same thing over and over again and expecting different results" - Albert Einstein
Citations, references, and bibliographies are different ways of relying on external sources in writing. They can add credibility, since they originate in an article, report, or book that has been carefully edited and reviewed. In other words, someone else has already approved the content and stands behind the chosen text. Citing a source also gives it additional exposure and helps distribute it. At the same time, a reference does not guarantee that the source is true and accurate. The same applies to the quote that opens this article: it is commonly attributed to Einstein, but probably originated in error.
However, references also have weaknesses. Articles withdrawn from publication because they were found to have significant flaws continue to serve as sources despite the criticism heaped upon them. Worse, researchers sometimes cite articles that do not even exist. From the perspective of article writers, there is a catch here. On the one hand, referring to another source is a kind of warranty certificate: a stamp declaring that this article is well-founded. On the other hand, checking the correctness and reliability of these references is a tedious task that is tempting to skip.
In reviews conducted in previous decades, long before AI-based text generators appeared, it was found that about thirty percent of the tens of thousands of references examined in biomedical articles contained errors in the source details, making the sources difficult to locate. Text generators, which are committed not to factual truth but to producing fluent communication with people, deepen the problem further.
Citing sources plays a central role in research articles: a researcher who cites or refers to a previous article declares that he is following in the footsteps of others. That is, previous studies have already laid the groundwork, and now the researcher relies on the findings of his predecessors or tries to refute them. Such a process allows us to build our knowledge layer upon layer, without having to prove again and again the foundations on which the entire field rests.
Wrong source generator
Search engines are essential mediation tools that allow us to navigate the sea of existing information sources, especially on the Internet. Google Scholar, for example, is a search engine that focuses on scientific articles, while the search tool built into the Windows operating system specializes in locating files on the computer. An article search engine displays the full title of the article, the list of authors, and links to the websites from which the full content can be accessed.
In the last year, many text generators have expanded their applications and now offer search engine services. But unlike regular search engines, text generators do not present the sources as they are. When we ask such a generator to direct us to sources of information on a particular topic, it searches the sources accessible to it, summarizes what is written in them, processes the content, and generates a response, which can sometimes include a link to the source.
In this process, errors can occur, for example when the information the generator relies on is incorrect or inaccurate. Furthermore, text generators may hallucinate illogical texts. Their job is to predict the most likely text in answer to the question posed to them, not necessarily the most reliable and accurate one. In the process of producing the answer, distortions may appear: incorrect titles, errors in the names of the authors, and broken links that point to a different site than the one where the source was published, or even to sites that never existed at all.
A journalistic review that examined eight text generators offering search engine options, including ChatGPT, Google's Gemini, and Grok from X (formerly Twitter), found that they were wrong in 60 percent of their source references. That is, most of the time, when asked to identify the source of a text from an article, the generators got the title wrong, garbled the publisher's name, or provided an incorrect web address. Grok went even further, getting the reference details wrong in 94 percent of the requests it received. More than half of the responses provided by Gemini and Grok included links to incorrect web addresses.
Previous studies have shown that chatbots tend to prefer giving a wrong answer over a qualified one, and the journalists noted that the same phenomenon exists with requests for references. "The chatbot, eager to please, would rather provide an answer out of thin air than admit that it does not have access to an answer," they wrote. ChatGPT got 134 of the 200 reference requests examined in the review wrong, but admitted it was unsure of the answer in only 15 of them.
Misquotations and misattributions also damage the credibility of the sources themselves. The British Broadcasting Corporation, the BBC, examined distortions that appeared in references to articles on its website. The review found that one in five AI-generated references that quoted details from BBC articles misquoted the broadcaster or attributed to it content that does not exist on its website. Since this inaccurate content was attributed to the BBC, it could damage the corporation's credibility and professional standing.
No connection to the link
Examining the involvement of artificial intelligence in composing texts is a challenging problem in itself. Content copied verbatim from text generators is already finding its way into academic articles. Within texts created by generators, checking the consistency of links and references requires additional effort. A skeptical reviewer must open the linked content and make sure it matches what is described in the reference, such as the year of publication or the author's name. Furthermore, the reviewer must read the entire content of the link to verify that it indeed supports the claim for which it was cited. Because of these many checking steps, errors in a reference created by a generator tend to be even more elusive than errors in ordinary generated text.
In a trial that took place in the United States concerning artificial intelligence technology that enables the creation of fake reality (deepfakes), an artificial intelligence expert was asked to submit a written report to the court. Ironically, the expert himself used a text generator to write the report, which is why it contained references that do not exist. The phenomenon of references to non-existent rulings has also been identified in the courts of Israel.
Artificial intelligence products are becoming an increasingly significant obstacle to our ability to distinguish between truth and fiction. References to external sources are supposed to serve as an effective tool for grounding content in reality, and the fact that artificial intelligence products distort this tool as well is telling. Beyond the question of whether the source a link points to even exists, the mere appearance of an external link in a document lends it an aura of authority and credibility, and it is therefore precisely here that we must be doubly vigilant.
By the way, did you bother to verify the content of the last link?