So what do you do there at the university? Chapter 13: How to get better at checkers in three simple steps - about genetic programming

I met with my colleague Amit Ben-Best to ask him what is being done there at the university

Amit is a PhD student in the Department of Computer Science at Ben-Gurion University of the Negev. He is doing his doctoral thesis in Prof. Moshe Zipper's group, researching the use of genetic programming to develop players in board games. In his free time, Amit writes on the website of the Israeli Skeptics Organization, and lectures here and there on popular science topics.

Amit, so what are you doing there?

Our research group deals with a variety of topics related to evolutionary algorithms, which is a field in computer science that belongs to artificial intelligence. Evolutionary algorithms mean the creation of an abstract simulation of the process of evolution by natural selection on a computer to produce better solutions to problems. I work in a subfield called genetic programming.

How does genetic programming work?

The first step is to produce an initial population of software that solves a particular problem, for example how to play a board game well. Our strategy is usually to improve in the evolution process only a part of the software that is inside an unchanging shell. The part that goes through the process of evolution is what differentiates the individuals in the population.

Can you give an example?

Yes, take for example the game of checkers. The part of the software that we improve in evolution is the one that detects the state of the board and returns a number. The 'board state' describes a single legal state of the game, and in checkers will be defined by the number, type and position of the pieces on the board, and who is next in line to play. The shell software will be some sort of search algorithm.

Figure 1: In case you came from outer space, a checkerboard is ready for play. The illustration is for illustration only! Source: Wikipedia.
Figure 1: In case you came from outer space, a checkerboard is ready for play. The illustration is for illustration only! Source: Wikipedia.

How does the software define a player?

By defining his game strategy. The same player can use the software to check the situation he is currently in, and compare it in his computerized imagination to the situations resulting from all possible legal moves. That player can now choose the mode that got the best score. If we want a better player, we can require him to check for several moves forward as well. The price is runtime, and it grows exponentially.

Can you give an example of the differences in the population?

Yes, for example one software in the population can score a board state by the difference between the number of tools of the two players. If the difference increases in my favor, I am satisfied because I probably ate the opponent's tool. Other software in the population can score a board position according to the same principle but add a proviso that a king is worth more in the count.

These programs are built in a data structure called a GP tree, and consist of different elements in a modular form (see Figure 2). We randomly generate the initial population of at least one hundred programs or trees using software we wrote for this purpose.

Figure 2: An arithmetic expression represented by a tree-shaped data structure. Source: Wikipedia.
Figure 2: An arithmetic expression represented by a tree-shaped data structure. Source: Wikipedia.

How do you determine the identity of the best software?

As the football cliché says, the table doesn't lie. To determine which are the successful programs (in fitness parlance) we hold a tournament among the individuals in the population. The programs get a point for each win, and half a point for a draw. The higher a software is in the table, the higher its chances of being selected for the next generation. For example, if we have a population of XNUMX individuals, we will choose XNUMX individuals to make up the next generation. Each program can be selected more than once, therefore the programs in a high position in the table will be selected a large number of times at the expense of programs in a low position. From the hundred programs selected at the end of the process we will create the next generation.

What do you mean? How is the next generation produced?

There are two actions that follow the selection process and aim to produce a change in the population. One operation is called a mutation, in which a subtree is randomly selected from the entire tree representing the player. The same subtree is replaced by a new subtree. An example of a mutation could be the replacement of the subtree x/11 from Figure 2, to a subtree expressing for example x-5 (see Figure 3). We usually determine the chance of mutation in a tree so that about one tenth of the trees will undergo a mutation.

Figure 3: The tree from Figure 2 before and after mutation.
Figure 3: The tree from Figure 2 before and after mutation.

The second operation is called crossover and simulates sexual reproduction. We choose two trees, and within each tree a random subtree is chosen. Now we do one of the following two possibilities: swap between the two subtrees, or swap the subtree with the tree that did less well in the tournament in the subtree than the tree that did better.

After the mutation and crossover stages, a population consisting of one hundred new individuals is obtained, called generation 1 (the initial population is called generation 0). To obtain a 2nd generation we will repeat the process: league, selection and reproduction (mutation and cross-over), and so on until a predetermined number of generations (see Figure 4). The level of the players is determined objectively by playing against an external player that we wrote without a genetic programming algorithm. The end of the process is the search for the best player of all generations.

Figure 4: Flow chart depicting the genetic programming process.
Figure 4: Flow chart depicting the genetic programming process.

So what about checkers?

One of the first things I did with the help of genetic programming was to write a player for a reverse checkers game, without being a great expert in the game. I let evolution do the work for me. Later I showed that the process can also be used for regular checkers and other games. The genetic programming allowed me to write successful players for various games without the need to invest much effort on top of what I had already invested.

Sounds wonderful, is there a 'but' somewhere?

Look, it must be understood that we are very limited in the number of forward moves that we allow players to calculate. For a large number of moves we will not be able to finish the tournaments of the selection process in a reasonable time because the running time increases, as mentioned, exponentially. It can also be shown that if we produced a player in an evolution process where there was a check of a certain number of forward moves, we could not significantly improve his quality in a simple way by increasing the number of forward moves he checks. That is, the final product is optimal for the conditions in which it was created.

אז מה עושים?

Trying to be smart. Suppose that in a certain board situation each forward move leads to ten action possibilities. If we check only five of them for each move, we can afford to check more moves ahead. To choose the five best possibilities from the ten we can use a player who sees only one move ahead. This process is called pruning in the professional parlance. During the research we were able to show that following the pruning we profit twice. First, these players are better, and second, they can be improved directly by reducing the pruning, for example in the example here to choose a possible seven out of ten instead of five, without further evolutionary process.

How would you summarize the advantages of the method?

The genetic programming makes it possible to learn and improve in the game, without actually knowing anything about it except for the basic moves. The method is generic and easily adapted to different problems, and it can also succeed in solving problems where human logic is not an advantage.

----------------------

I would be happy to meet and talk with any research student (maybe you?) who is willing to participate and tell me a little about what he is doing (and all for the price of a not too long conversation). You can contact me through the contact form.

It's time to tell everyone what you're doing, maybe this time they'll understand too 🙂

I would be happy to meet and talk with any research student (maybe you?) who is willing to participate and tell me a little about what he is doing (and all for the price of a not too long conversation). You can contact me via Contact Us Form.

It's time to tell everyone what you're doing, maybe this time they'll understand too

The article was published on Oren Shaya's blog.to a constant extent"

Leave a Reply

Email will not be published. Required fields are marked *

This site uses Akismet to filter spam comments. More details about how the information from your response will be processed.