Researchers at the Technion discovered that, unlike artificial language models, which analyze long texts as a single piece, the human brain condenses what it has heard so far into a kind of "summary" that allows it to understand the rest of the text.
LLMs ("Large Language Models"), including some popular "stars" such as GPT and Bard chat, have revolutionized the ability of computer systems to perform tasks and solve problems over the past decade. These models are largely built on artificial neural networks and are learned or trained on very large amounts of text. Today, they can generate texts, translate texts from one language to another, and even identify sentiments in a given text.
Like the artificial neural networks that underlie artificial intelligence in general, LLMs draw inspiration from the human brain, but it is important to recognize the differences between these two worlds. Research conducted at the Technion and published in Nature Communications points out the similarities and differences between the brain and large language models in a specific context: understanding spoken texts.
The research was led by Prof. Roy Reichert and Dr. Rafael Tikochinsky from the Faculty of Data and Decision Sciences. It is part of Dr. Tikochinsky's doctoral thesis, conducted under the joint supervision of Prof. Reichert from the Technion and Prof. Uri Hasson from Princeton University. The research partners are Dr. Ariel Goldstein from the Hebrew University and Yoav Meiri, a master's student in the Faculty of Data and Decision Sciences.
The study is based on fMRI scans (functional brain imaging) of 219 subjects while they listened to stories. The researchers tested the ability of existing LLMs to predict brain activity during listening and found that the prediction succeeds only for short texts, dozens of words at most. When the text is longer, these models fail to predict the activity in the listener's brain.
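Studies of this kind typically rely on an "encoding model": a simple regression that maps a language model's internal representations of the words onto the measured fMRI signal, then scores how well the predicted signal matches the real one. The sketch below illustrates that general idea with synthetic data only; the array sizes, the ridge regression, and the correlation score are generic assumptions for illustration, not the paper's actual pipeline.

```python
# Minimal encoding-model sketch with synthetic data (illustrative only;
# not the pipeline used in the paper).
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

n_words, emb_dim, n_voxels = 1000, 64, 200          # assumed sizes
embeddings = rng.normal(size=(n_words, emb_dim))    # stand-in LLM word representations
true_map = rng.normal(size=(emb_dim, n_voxels))     # unknown brain mapping (synthetic)
bold = embeddings @ true_map + rng.normal(scale=5.0, size=(n_words, n_voxels))

X_train, X_test, y_train, y_test = train_test_split(
    embeddings, bold, test_size=0.2, random_state=0)

# Fit one linear map from word representations to all voxels at once.
model = Ridge(alpha=1.0).fit(X_train, y_train)
pred = model.predict(X_test)

# Score each voxel by the correlation between predicted and measured signal.
r = [np.corrcoef(pred[:, v], y_test[:, v])[0, 1] for v in range(n_voxels)]
print(f"mean prediction correlation across voxels: {np.mean(r):.2f}")
```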
Dr. Tikochinsky and Prof. Reichert showed that the reason for this failure is that with long texts the human brain does not behave like an LLM. The similarity between the two holds for short texts, where both the brain and the model process all the words in the input in parallel, that is, at once. When the text is long, the artificial model continues to process it in the same way, but the brain, which cannot "digest" it in its entirety, switches to a different mode: an accumulation mechanism. The brain condenses the text it has heard so far into a kind of summary of contextual knowledge, and uses that summary to interpret the next words it hears.
The artificial model, on the other hand, can "digest" all the text it has received so far at once, and therefore does not need such an accumulation mechanism. The researchers hypothesized that this fundamental difference is the reason artificial models fail to predict activity in a brain listening to stories, and they went on to demonstrate it: they developed an improved artificial model that works more like the brain. This model was based on dynamic summaries of the text heard so far, on the basis of which it interpreted the rest of the text. The new model did indeed improve the prediction of brain activity, which shows that the listening brain is constantly summarizing the preceding text and basing its understanding on that summary. This strategy allows us to absorb large amounts of information received over a long period of time, as happens, for example, over an entire lecture, an entire book, or an entire podcast.
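To make the contrast concrete, here is a toy sketch of the two processing modes described above: a "parallel" model that sees the whole text at once, versus an "accumulating" model that carries forward only a compact running summary. All function names here are hypothetical, and summarize_so_far is a deliberately crude stand-in that just keeps the most recent words; the researchers' actual model builds real summaries of meaning with a language model.

```python
# Toy contrast between whole-text processing and incremental accumulation.
# Conceptual sketch only; the summarizer is a crude stand-in.

def process_parallel(words):
    """LLM-style: the full text is available as one context window."""
    return words  # every word conditions on all the others at once

def summarize_so_far(summary, chunk, max_len=20):
    """Stand-in summarizer: keep only the most recent words.
    A real system would compress meaning, not merely truncate."""
    return (summary + chunk)[-max_len:]

def process_accumulating(words, chunk_size=10):
    """Brain-style: interpret each chunk in light of a compact running summary."""
    summary, interpretations = [], []
    for i in range(0, len(words), chunk_size):
        chunk = words[i:i + chunk_size]
        # The new chunk is understood against the summary so far...
        interpretations.append((tuple(summary), tuple(chunk)))
        # ...and the summary is then updated to absorb it.
        summary = summarize_so_far(summary, chunk)
    return interpretations

story = [f"w{i}" for i in range(35)]
print("parallel model sees:", len(process_parallel(story)), "words at once")
steps = process_accumulating(story)
print(len(steps), "chunks; context carried into the last chunk:", steps[-1][0])
```

The point of the sketch is the bounded memory: no matter how long the story grows, the accumulating model only ever holds a fixed-size summary plus the current chunk, which is the behavior the researchers attribute to the listening brain.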
Using a complementary analysis, the researchers mapped the brain regions that play critical roles in both short-range and long-range processing, as shown in the figure accompanying the article. At the bottom of the figure are the regions that process short text segments; at the top are the regions responsible for "context aggregation," which allows a person to understand the rest of the text based on what has been heard up to that point.
The research findings show that hierarchical processing in the brain allows flexible integration of information over time, and they offer significant insights for both neuroscience and the development of AI.
The research was supported by a grant from the Israeli Academy of Sciences and the National Institutes of Health (NIH).
To the article in Nature Communications