"The more the field of artificial intelligence advances, the easier it is for these networks to fool us and pretend to be very similar in their behavior to humans. But when we activate the appropriate tools, we can see that there is still a way to go until we reach algorithms that very accurately imitate human behavior
When conversing with chatbots, that is, with networks that process language, it often seems as if there is a human being on the other side of the chat. Researchers from Ben-Gurion University of the Negev and Columbia University tested whether these networks understand and process language the way humans do, and found surprising gaps. The research findings were published in the prestigious journal Nature Machine Intelligence.
When the processing of English sentences is examined in humans and in deep learning systems (artificial neural networks), there appears to be a surprising similarity between the two. This fact puzzled Dr. Tal Golan from the Department of Cognitive and Brain Sciences at Ben-Gurion University of the Negev and Matthew Siegelman, a research student at Columbia University, since there are significant differences between the various networks, as well as between the way these networks are built and operate and the way the human brain does. "If we better understand the similarities and differences between artificial intelligence and natural intelligence, we can better understand how we ourselves work," explained Dr. Golan.
One of the main tools in language research is examining which sentences speakers of a language perceive as "acceptable". For example, "Dana ate a sandwich" is an acceptable sentence, whereas ungrammatical rearrangements of the same words are not. In recent years, scientists began testing artificial neural networks in a similar way, and found, to their surprise, a strong similarity between human judgments and the probabilities that artificial neural networks assign to different sentences.
In the current study, the researchers wanted to test the limits of the similarity between humans and networks. For this purpose, they developed software that constructs pairs of sentences that are "controversial" between the networks. In each such pair, one sentence is judged acceptable by one network and unacceptable by the other, while the second sentence is judged in the opposite way: the first network finds it unacceptable and the second network finds it acceptable.
For example, the sentence "This is the week you have been dying" was judged unacceptable by a GPT-2 network but perfectly acceptable by a BERT network. Conversely, the sentence "That is the narrative we have been sold" was judged acceptable by GPT-2 and unacceptable by BERT. After the researchers created hundreds of such sentence pairs, the sentences were presented to 100 English-speaking human subjects, who were asked to judge, for each pair, which sentence was more acceptable. In such a test, at least one of the networks must err, because the two disagree with each other.
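The idea of "controversial" pairs can be sketched in a few lines of Python. This is a simplified toy version: the two scoring functions below are hypothetical placeholders I invented for illustration, not the actual networks from the study, and the sketch filters candidate sentences rather than synthesizing them the way the researchers' software did. In a real experiment, each sentence would be scored by trained models such as GPT-2 and BERT.

```python
def score_a(sentence):
    # Toy stand-in for "model A": rates short sentences as more acceptable.
    return 1.0 / len(sentence.split())

def score_b(sentence):
    # Toy stand-in for "model B": rates sentences containing "the" as acceptable.
    return 1.0 if "the" in sentence.split() else 0.1

def find_controversial_pairs(sentences, threshold=0.5):
    """Return pairs (s1, s2) where model A judges s1 acceptable and s2
    unacceptable, while model B judges the two sentences the opposite way."""
    pairs = []
    for s1 in sentences:
        for s2 in sentences:
            if (score_a(s1) >= threshold > score_a(s2)
                    and score_b(s2) >= threshold > score_b(s1)):
                pairs.append((s1, s2))
    return pairs

# With these toy scorers, "go now" (short, no "the") and
# "pass the salt please" (long, contains "the") form a controversial pair.
print(find_controversial_pairs(["go now", "pass the salt please"]))
```

Whatever a human judge then says about such a pair, at least one of the two models is contradicted, which is what makes the test so strict.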
The researchers found that under this strict test, all the networks showed significant differences from human judgments: they accepted ungrammatical and illogical sentences as acceptable, while rejecting grammatical and logical sentences as unacceptable. The network found to be most similar to humans was GPT-2, which learns by trying to predict the next word in a text, the same principle applied in the first and main training phase of chatbots such as ChatGPT.
"The research reveals gaps between the way artificial neural networks and humans process written language," explains Dr. Golan. "The more the field of artificial intelligence advances, the easier it is for these networks to fool us and pretend to be very similar in their behavior to humans. But when we activate the appropriate tools, we can see that there is still a way to go until we reach algorithms that very accurately imitate human behavior. It is possible and we will be able to build neural networks that accurately simulate human linguistic judgments only when the networks realize additional cognitive skills, such as environmental sensing and movement control, and not just read millions of books," concluded Dr. Golan.
The research group also included Professor Christopher Baldassano and Professor Nikolaus Kriegeskorte from Columbia University's Department of Psychology.
This research was funded by the U.S. National Science Foundation (grant number 1948004) and the Zuckerman Fellowship.