The chatbots that have revolutionized many areas of our lives in recent years are the product of many gradual developments, a journey that winds through neural networks, psychological therapy, hot dogs (yes, hot dogs), chess, and the Nobel Prize in Physics.
Uri Fogel, Davidson Institute, the educational arm of the Weizmann Institute

The awarding of the 2024 Nobel Prize in Physics to two pioneers of artificial intelligence marks a symbolic milestone for the field. It is a good opportunity to look back and reflect on the journey humanity has taken in this field, from the first definitions of the term "artificial intelligence" to the enormous achievements we are experiencing today, which touch many areas of our lives.
First steps
The history of artificial intelligence is best begun in 1950, when Alan Turing, the British mathematician considered the father of computer science, published an article in which he posed the iconic question: "Can machines think?" To answer it, he proposed a thought experiment that would later be called the "Turing Test." In this experiment, a human interviewer converses with a human and a machine placed in separate rooms, without knowing which of the two is answering. If, after a set amount of time, the interviewer cannot tell the human from the machine, the computer has passed the test. Although the Turing Test proved problematic to implement, the paper is considered a milestone in the development of the concept of artificial intelligence.
In 1956, two years after Turing's death, a small group of mathematicians and computer scientists from various research fields gathered at Dartmouth College in New Hampshire. The goal of the workshop was to create a forum that could characterize the field of "thinking machines" and develop new ideas and methods to advance it. One of the organizers was the American computer scientist John McCarthy, who coined the phrase "artificial intelligence" when he had to decide on a name for the conference theme. The event achieved its goal: it established the first framework for discussion and research on artificial intelligence, and in the eyes of many, the Dartmouth conference is the event that founded the field.
With the establishment of computer science as a discipline in the 1960s and 1970s, artificial intelligence took its first practical steps. The capabilities demonstrated at the time were, of course, very far from what we recognize today as artificial intelligence, and the results of the research remained mostly within the walls of academia, without trickling out into the practical world. The algorithms of those years were based mainly on breaking a problem down into various parameters and defining rules by which the machine would make its decisions.
Joseph Weizenbaum, a German-born Jewish computer scientist, created the first chatbot, ELIZA, which offered the user a kind of psychological therapy. It recognized certain words typed into the chat and responded with a sentence built from an appropriate template. Even today it is possible to correspond with ELIZA and test its skills.
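To get a feel for how simple ELIZA's mechanism really was, here is a minimal sketch in Python of the keyword-and-template idea; the keywords and canned responses are invented for illustration and are not Weizenbaum's actual script.

```python
import re

# A toy illustration of ELIZA's keyword-and-template idea (not Weizenbaum's
# actual script): find a keyword in the user's input and fill a canned template.
RULES = [
    (re.compile(r"\bmother|father|family\b", re.I),
     "Tell me more about your family."),
    (re.compile(r"\bI feel (.+)", re.I),
     "Why do you feel {0}?"),
    (re.compile(r"\bI am (.+)", re.I),
     "How long have you been {0}?"),
]
DEFAULT = "Please go on."

def eliza_reply(text: str) -> str:
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(*match.groups())
    return DEFAULT

if __name__ == "__main__":
    print(eliza_reply("I feel lonely today"))   # -> Why do you feel lonely today?
    print(eliza_reply("My mother called me"))   # -> Tell me more about your family.
```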
Later came the first artificially intelligent robot, developed at the Stanford Research Institute (SRI). Its name was Shakey, and it was capable of performing a number of simple actions it decided on by itself, such as moving through a space or shifting objects from place to place, although, as its name suggests, its movement was not particularly smooth.
In 1959, IBM employee Arthur Samuel coined the term "machine learning," another form of artificial intelligence that is common today, in which a computer learns to solve problems or perform tasks from the data fed to it.

Flowering in winter
1974 was a watershed year for the field of artificial intelligence. The applied mathematician James Lighthill published a review report (PDF), a scathing critique of AI research in Britain, whose main argument was that the field's output was very poor relative to the resources invested in it. The report led to sharp cuts in academic funding in Britain, and subsequently the American defense establishment also cut its investment in academic research. The field entered a slowdown of about two decades, during which no applied breakthroughs were recorded. This period is known as the "AI winter."
In the midst of this winter, in the 1980s, the seeds were planted of the ideas that would become the basis of artificial intelligence as we know it today, based on artificial neural networks. These ideas earned their developers, John Hopfield and Geoffrey Hinton, the 2024 Nobel Prize in Physics.
In fact, such neural networks were not a new idea at the time. As early as 1943, the neuroscientist Warren McCulloch and his colleague, the logician Walter Pitts, published a paper proposing an abstract mathematical model that describes the brain as a complex network of nerve cells (neurons) and the connections between them. A few years later, the psychologist Donald Hebb developed the idea further and argued that learning in the brain happens when the connections between neurons that matter for what is being learned are strengthened, while connections to neurons that contribute little are weakened.
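These two ideas, a neuron that "fires" when the weighted sum of its inputs crosses a threshold, and connections that strengthen with joint activity, can be sketched in a few lines of Python. The numbers below are arbitrary and serve only to illustrate the principle.

```python
import numpy as np

# McCulloch-Pitts-style neuron: outputs 1 only if the weighted sum of its
# binary inputs reaches a threshold. Weights and threshold here are arbitrary.
def threshold_neuron(inputs, weights, threshold):
    return int(np.dot(inputs, weights) >= threshold)

# Hebbian-style update ("cells that fire together wire together"): strengthen a
# connection in proportion to the joint activity of the neurons it links.
def hebbian_update(weight, pre_activity, post_activity, learning_rate=0.1):
    return weight + learning_rate * pre_activity * post_activity

if __name__ == "__main__":
    x = np.array([1, 0, 1])            # binary inputs
    w = np.array([0.5, 0.5, 0.5])      # initial connection strengths
    y = threshold_neuron(x, w, threshold=1.0)
    print("neuron fires:", y)          # 0.5 + 0.5 = 1.0 >= 1.0 -> fires
    # Strengthen the connections whose inputs were active when the neuron fired
    w = np.array([hebbian_update(wi, xi, y) for wi, xi in zip(w, x)])
    print("updated weights:", w)
```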
To understand how such a network could be used for artificial intelligence, recall the entertaining application developed by Jian Yang in the comedy series Silicon Valley, designed to separate images of hot dogs from images of things that are not hot dogs. To do this, the network must learn from examples: after some mathematical preprocessing, images of hot dogs and of non-hot dogs are fed into it. During learning, each "neuron" in the network is responsible for a simple calculation, essentially multiplication and addition, and connections between neurons that contribute to the learning process are given greater weight than connections that do not. After the network has been fed enough examples, it can separate hot dogs from non-hot dogs. Over the following years, various proposals were made to use such models for artificial intelligence, but they did not bear fruit.
Weighting the strength of connections between the "cells" in the network allows the software – after sufficient training – to classify things, such as hot dogs | Screenshot from the series "Silicon Valley", HBO
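As a toy illustration of that weighting idea, here is a minimal sketch of a single artificial "neuron" whose connection weights are adjusted from labelled examples. Real image classifiers work on raw pixels with many layers; the two made-up numerical "features" per image here exist only to show the multiply-add-and-reweight loop.

```python
import numpy as np

# Toy "hot dog / not hot dog" classifier: one artificial neuron whose connection
# weights are adjusted from labelled examples. The two-number "features" per
# image are invented purely to illustrate the idea.
rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Each example: [elongatedness, redness]; label 1 = hot dog, 0 = not hot dog
X = np.array([[0.9, 0.8], [0.8, 0.7], [0.2, 0.1], [0.3, 0.9]])
y = np.array([1, 1, 0, 0])

w = rng.normal(size=2)   # connection weights
b = 0.0                  # bias
lr = 0.5                 # learning rate

for _ in range(1000):
    pred = sigmoid(X @ w + b)          # multiply, add, squash into a score
    error = pred - y
    w -= lr * X.T @ error / len(y)     # strengthen/weaken each connection
    b -= lr * error.mean()

print("hot dog score for [0.85, 0.75]:", sigmoid(np.array([0.85, 0.75]) @ w + b))
```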
In 1982, the American physicist John Hopfield published the neural network model that would later bear his name. Like earlier models, it was composed of neurons and the connections between them. Hopfield's innovation was to use principles from statistical physics to describe those connections. In physics, a system of many particles tends to settle into a ground state in which its energy is minimal. For example, when an external magnetic field is applied to ferromagnetic materials, such as iron or cobalt, the spins of the electrons tend to line up in the same direction; loosely speaking, this is "the easiest arrangement for the electrons," and so the total energy of the system is minimal. Similarly, when new information relevant to the learning process is fed into the neural network, the weights of the connections between the neurons change until the network reaches a state of equilibrium.
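A minimal sketch of the idea, under the simplifying assumption of a single stored pattern, looks like this: the network's "energy" drops as a corrupted input settles back into the remembered state.

```python
import numpy as np

# Minimal Hopfield-network sketch: store one binary pattern, then let a noisy
# version of it settle back into the stored pattern, with the "energy"
# decreasing along the way. The pattern itself is arbitrary.
pattern = np.array([1, -1, 1, -1, 1, 1, -1, -1])

# Hebbian-style storage: connection strengths from the pattern's outer product
W = np.outer(pattern, pattern).astype(float)
np.fill_diagonal(W, 0.0)               # no self-connections

def energy(state):
    return -0.5 * state @ W @ state    # lower energy = more stable state

state = pattern.copy()
state[[1, 4]] *= -1                    # flip two "neurons" to corrupt the memory
print("energy of noisy state:", energy(state))

# Asynchronous updates: each neuron aligns with the weighted sum of its inputs
for _ in range(3):
    for i in range(len(state)):
        state[i] = 1 if W[i] @ state >= 0 else -1

print("energy after settling:", energy(state))
print("recovered the stored pattern:", np.array_equal(state, pattern))
```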
Geoffrey Hinton, a cognitive psychologist by training, built on the idea of the Hopfield network and in 1986 proposed a practical design for how the artificial neural networks we use today work. Hinton's network is divided into three types of layers. The first is the input layer, which receives a mathematical representation of the information; returning to the earlier example, an image of a hot dog or a non-hot dog. From there the information is passed to a hidden layer, or several hidden layers, where it is processed. Finally, the processed information reaches the last layer, the output layer, which gives an assessment of the input. Hinton, too, drew on ideas from statistical physics, in this case the Boltzmann distribution, which describes the probability of finding a system of particles in a given state under certain thermodynamic conditions; this is why Hinton's network is also called a "Boltzmann machine."
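The layered structure described here can be sketched as a tiny forward pass: input layer, one hidden layer, output layer. The weights below are random, so the output is meaningless until training; this is only the skeleton of the architecture, not Hinton's actual training procedure for Boltzmann machines.

```python
import numpy as np

# Skeleton of a layered network: input layer -> hidden layer -> output layer.
# Weights are random, so the "assessment" is meaningless until the network is trained.
rng = np.random.default_rng(42)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_input, n_hidden, n_output = 4, 3, 1
W1 = rng.normal(size=(n_input, n_hidden))    # input -> hidden connections
W2 = rng.normal(size=(n_hidden, n_output))   # hidden -> output connections

def forward(x):
    hidden = sigmoid(x @ W1)          # hidden layer processes the input
    output = sigmoid(hidden @ W2)     # output layer gives an assessment
    return output

x = np.array([0.2, 0.7, 0.1, 0.9])    # a made-up numerical representation of an input
print("network's assessment:", forward(x))
```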
In retrospect, the networks proposed by Hinton and Hopfield were revolutionary ideas, but as noted, those years are considered the AI winter. Using the methods they proposed required large amounts of data and computers with very powerful processing capabilities, conditions that did not exist at the time. Meanwhile, other approaches to artificial intelligence were developed that gave good results relative to the conditions of the day, and the use of artificial neural networks was effectively frozen for the next few decades.
The computer beats the human
In 1996 came the first sign that the AI winter was ending and spring was beginning. A supercomputer developed by IBM, named Deep Blue, faced the reigning world chess champion, Garry Kasparov, who represented the best humanity had to offer. The match was held in Philadelphia, and Kasparov defeated the computer 4-2. In 1997 a rematch was held in New York, after Deep Blue had undergone a series of significant upgrades and could evaluate 200 million positions per second. This time the computer beat the human 3.5-2.5. Although the achievement stemmed mainly from raw computational power rather than a demonstration of "true intelligence" of the kind attributed to humans, the victory reignited public interest in the potential of artificial intelligence.
The 2000s were marked by the development of ever more powerful processors, which increased the computing power available. At the same time, the internet and social networks created a vast amount of information that had not previously been accessible to humanity. One of the first to grasp the great potential of this change was a Chinese-American computer scientist named Fei-Fei Li. In 2005 she completed her doctorate at the California Institute of Technology (Caltech) and took up a teaching position at the University of Illinois. She noticed that most artificial intelligence research was focused mainly on developing and improving algorithms, while the variety, quality, and quantity of the data on which those algorithms were trained received far less attention. To address this gap, she began building a database of labeled images that could be used to train algorithms. In 2009 she published ImageNet, which was then the largest database of its kind, with users from all over the world helping to tag its images according to their content.
Beginning in 2010, the ImageNet group ran a competition in which different image-processing algorithms tried to achieve the best results in classifying images from the database. In 2012, AlexNet, which used multiple layers of neural networks, entered the competition and won first place by a significant margin. In its wake, image-recognition and voice-recognition algorithms, such as Apple's Siri and Amazon's Alexa, became increasingly common, and new and varied neural network architectures for handling different types of information emerged ever more frequently in academia and industry. The pace of development kept accelerating, and in 2017 another significant piece of news came from Google.

Age of Transformers
Google introduced a new neural network architecture called the transformer, which raised the level of language processing and comprehension. The transformer uses a mechanism that breaks the input down into meaningful parts and then gives each part a weight according to its relationship to the other parts. If the input is text, the division is usually into words, but sometimes also into parts of words and symbols. For example, the sentence "I returned to my old house" can be divided into the units "I", "returned", "to", "my", "old", "house" and the final period. Relative to the word "I", the words "returned" and "house" receive a high score, since they are directly related to it, whereas "old" receives a lower score relative to "I". Another option is to split some of the words into smaller meaningful components, such as prepositions, prefixes, and inflections, so that "returned", for instance, becomes "return" plus its past-tense ending. Naturally, the rules of division vary from language to language, depending on its structure.
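A toy version of this weighting mechanism, a simplified form of the transformer's attention computation, can be written in a few lines. The word vectors below are random stand-ins for the representations a real transformer would learn, so the specific weights are meaningless; only the mechanics are illustrative.

```python
import numpy as np

# Toy sketch of the transformer's attention idea: each word in the sentence gets
# a weight relative to every other word. Real transformers learn the word
# vectors; here they are random, so the numbers only illustrate the mechanics.
rng = np.random.default_rng(1)
tokens = ["I", "returned", "to", "my", "old", "house", "."]

d = 8                                     # size of each word vector
Q = rng.normal(size=(len(tokens), d))     # "query" vector per word
K = rng.normal(size=(len(tokens), d))     # "key" vector per word

scores = Q @ K.T / np.sqrt(d)             # how strongly each word relates to each other word
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)  # softmax per row

# Attention weights of the word "I" toward every word in the sentence
for token, w in zip(tokens, weights[0]):
    print(f"{token:>8s}: {w:.2f}")
```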
Transformers revolutionized the performance of algorithms related to language understanding, such as transcription and translation. Just a year after their appearance, OpenAI introduced the first version of what would become the world's best-known chatbot: GPT. The acronym stands for Generative Pre-trained Transformer, a transformer that has been pre-trained on a huge amount of text from across the internet and can then generate text itself in response to the prompt entered into it. The company refined the model through several versions until it decided it was ready for widespread use, launching ChatGPT, a user-friendly version of GPT, in 2022; it has since genuinely changed the world. A few months earlier that year, DALL-E demonstrated the ability to create images from a text description alone, using another complex network architecture, diffusion models, and left many people speechless.
Over the past two years, artificial intelligence has become a basic tool that many of us use every day, whether for searching for information, creating images, writing code, composing music, or a variety of other uses, and it is already hard to keep up with the stream of new tools. Just this week, OpenAI released Sora, its new tool for creating video. While some believe the coming years will bring ever more powerful AI achievements, others dampen the enthusiasm somewhat; their main argument is that we are mostly seeing improvements to existing capabilities rather than revolutionary concepts that shake up the world of artificial intelligence, a bit like the smartphone revolution of a decade and a half ago. We will just have to wait and see who is right.