The real picture

One of the biggest challenges facing computer scientists is how to teach these machines to extract information about the 3D world from the 2D representation they have, just as people do: through actions such as sorting, classifying, comparing, and using learned knowledge.

From the right: Noam Eigerman, Dr. Roi Foran and Dr. Yaron Lipman.

Many of us, in the modern world, spend much of our time looking at 2D representations of the 3D world: paintings, photographs, and electronic screens. Our brain knows how to automatically translate the flat image and give it depth. In addition, the brain also knows, among other things, how to group objects together, fill in missing information, recognize familiar objects even when they are presented from unfamiliar angles, and measure the distance between objects. The average computer, on the other hand, treats images as a collection of differently colored dots placed on a two-dimensional grid. One of the biggest challenges facing computer scientists is how to teach these machines to extract information about the 3D world from the 2D representation they have, just as people do: through actions such as sorting, classifying, comparing, and using learned knowledge.

One of the fields of activity of Dr. Yaron Lipman, who joined the Department of Computer Science and Applied Mathematics at the institute in 2011, can be defined as the "mathematics of transformations". The transformations in question are both the changes in shape that distinguish two similar objects from each other, and the changes in shape that actually occur when an object rotates, bends, stretches or twists. The applications of his work span a wide variety of fields, from biology and engineering to graphics, animation and computer vision.

One of the basic questions he faces is: how can one determine whether two given objects are the same or different? This is not a trivial task, even for humans. To the untrained eye, a pile of animal bones, for example, will appear as a collection of identical objects, but a professional, such as a morphologist or paleontologist, will be able to classify them into different animal species. This expertise develops over years of practice, but just as the average person can tell an apple from a pear without any hesitation, so the expert distinguishes between bones from different sources without consciously following every step of the thought process that leads to the conclusion. How, then, can this thought process, which is at least partly unconscious, be translated into a computer algorithm?

Dr. Lipman and his research partners developed an algorithm that compares and classifies anatomical surfaces, such as bones or teeth, by analyzing the probable shape changes that occur in the 3D models that represent them. A person approaching the task of comparing objects works "bottom-up": he will usually look for prominent identifying signs, such as a bump or a depression with an easy-to-recognize shape, and connect all the clues to reach a conclusion. The computer takes the opposite, "top-down" approach: it creates a complete match between the two objects, treating their entire surface as one geometric unit, while minimizing the total distortion that the match requires. Dr. Lipman tested this algorithm by comparing its ability to classify bones and teeth with that of an expert paleontologist. In all the tests, the computer's results were very close to those of the person (Figure 1). Dr. Lipman says that the algorithm provides a good solution for scientists who have not yet acquired enough expertise and who wish to quickly and accurately identify animal species based on samples of bones and teeth. In the future, he hopes, it will be possible to extract broader biological information from similar algorithms.
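The "top-down" idea of matching two shapes as whole geometric units can be illustrated with a much simpler stand-in than the algorithm described above. The sketch below is not Dr. Lipman's method: it assumes flat 2D outlines, assumes that point i of one shape already corresponds to point i of the other, and allows only rigid motion (rotation and translation). The function names `center` and `alignment_cost` are illustrative. It finds the single rotation that minimizes the total mismatch over all points at once and reports the leftover mismatch as a dissimilarity score.

```python
import math

def center(points):
    """Translate points so that their centroid sits at the origin."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return [(x - cx, y - cy) for x, y in points]

def alignment_cost(shape_a, shape_b):
    """Root-mean-square mismatch after the best rigid alignment of a onto b.

    Assumption: both lists have the same length and point i of shape_a
    corresponds to point i of shape_b.
    """
    a, b = center(shape_a), center(shape_b)
    # Closed-form angle of the rotation that minimizes the summed squared distances.
    dot = sum(ax * bx + ay * by for (ax, ay), (bx, by) in zip(a, b))
    cross = sum(ax * by - ay * bx for (ax, ay), (bx, by) in zip(a, b))
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    total = 0.0
    for (ax, ay), (bx, by) in zip(a, b):
        rx, ry = c * ax - s * ay, s * ax + c * ay  # rotate the point of shape_a
        total += (rx - bx) ** 2 + (ry - by) ** 2
    return math.sqrt(total / len(a))

triangle = [(0, 0), (2, 0), (0, 1)]
moved = [(5, 5), (5, 7), (4, 5)]       # the same triangle, rotated and shifted
taller = [(0, 0), (2, 0), (0, 3)]      # a genuinely different triangle
print(alignment_cost(triangle, moved))   # close to zero
print(alignment_cost(triangle, taller))  # clearly larger
```

A low score means the two shapes differ only by position and orientation; a high score means no rigid motion can bring them together. Comparing real anatomical surfaces also requires finding the correspondence itself and allowing non-rigid change, which this sketch leaves out.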

Figure 1: A match between anatomical surfaces (a tooth) made by a computer (the four images in the top row) and one made by an expert paleontologist (below). You can see the similarity, and the difference, between the result reached by the computer (top right) and that of the expert (Figure 1a)
Figure 1a - A tooth as drawn by an expert.

The challenge facing computer vision scientists, namely how to automatically interpret, analyze and compare visual content, is becoming more urgent than ever. Unlike the surfaces of objects, images contain many prominent identifying marks and other features that make it easier for the human brain to compare them. The computer, on the other hand, assigns equal importance to every point in the image. As a result, a person will usually recognize with ease that a pair of photos taken in different lighting and from different angles show the same object, while a computer algorithm based on matching points may have difficulty doing so. The solution that Dr. Lipman found is to add an algorithm that bounds deformation, that is, one that sets a mathematical limit on the ways in which one cluster of points can change shape into another. Surprising as it may be, this method prevents most errors of this kind (Figure 2).
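One way to picture a limit on deformation is as a consistency check between candidate point matches. The sketch below is a simplified illustration, not the algorithm from the article: the function name `consistent_matches`, the bound `max_stretch`, and the majority-vote rule are all assumptions chosen for brevity. Two candidate matches agree when the distance between their points changes by no more than a fixed factor from one image to the other, and a match is kept only if it agrees with more than half of the others.

```python
import math
from itertools import combinations

def consistent_matches(matches, max_stretch=1.5):
    """Keep candidate matches whose pairwise distances change by a bounded factor.

    matches: list of ((x, y), (u, v)) pairs, a point in image A and its
             candidate partner in image B.
    max_stretch: hypothetical bound K; two matches agree when the distance
             between their points changes by a factor within [1/K, K].
    """
    votes = [0] * len(matches)
    for i, j in combinations(range(len(matches)), 2):
        (pa, qa), (pb, qb) = matches[i], matches[j]
        d_src = math.dist(pa, pb)  # distance in image A
        d_dst = math.dist(qa, qb)  # distance in image B
        if d_src == 0 or d_dst == 0:
            continue  # coincident points carry no distance information
        ratio = d_dst / d_src
        if 1 / max_stretch <= ratio <= max_stretch:
            votes[i] += 1
            votes[j] += 1
    # Keep a match only if it agrees with more than half of the other matches.
    needed = (len(matches) - 1) / 2
    return [m for m, v in zip(matches, votes) if v > needed]

candidates = [
    ((0, 0), (10, 0)),
    ((1, 0), (11, 0)),
    ((0, 1), (10, 1)),
    ((1, 1), (11, 1)),
    ((0.5, 0.5), (50, 50)),  # a wrong match: it would require an enormous stretch
]
print(consistent_matches(candidates))
```

In this toy input the four matches that describe a simple shift support one another, while the fifth is rejected because accepting it would mean stretching the point cluster far beyond the bound.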

A third topic that interests Dr. Lipman is the creation of models of 3D shape changes, which describe mappings and deformations with desired geometric properties. This topic is relevant to computer animation, where there is a constant search for methods that create livelier, more realistic movement on screen; to engineering, where computer models of objects change shape and are displaced; and to areas such as medical imaging and computer modeling. The mapping is based on representing the objects as a mesh of pyramids, and while building the movement model, the computer examines how the mesh should move, that is, how each pyramid moves in relation to the others. In real life, many variables are involved in this movement, such as flexibility and the typical movement patterns of joints and of the areas where different surfaces meet. Dr. Lipman develops unique models for movement and shape changes (Figure 3) that prevent large distortions and prevent one part of an object from penetrating another part of it (for example, the hand passing into the head when a person scratches his head). These are two of the main requirements that shape-change models must meet if they are to serve as representations of "real life".
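The two requirements named above, no large distortion and no part turning inside out, can be checked one mesh element at a time. The sketch below is a simplified 2D illustration that uses triangles instead of the 3D pyramids described in the article; the function names and the threshold `max_distortion` are assumptions, not part of Dr. Lipman's models. For each triangle it computes the linear map that carries the original shape to the deformed one. A non-positive determinant means the triangle has flipped (an inversion), and a large ratio between the map's strongest and weakest stretch means a large distortion.

```python
import math

def triangle_jacobian(src, dst):
    """Return (j00, j01, j10, j11), the linear part of the map src -> dst."""
    (x0, y0), (x1, y1), (x2, y2) = src
    (u0, v0), (u1, v1), (u2, v2) = dst
    # Edge matrices: the columns are the two edges leaving vertex 0.
    a, b, c, d = x1 - x0, x2 - x0, y1 - y0, y2 - y0  # source edges
    p, q, r, s = u1 - u0, u2 - u0, v1 - v0, v2 - v0  # deformed edges
    det_src = a * d - b * c
    if det_src == 0:
        raise ValueError("degenerate source triangle")
    # Inverse of the source edge matrix, row by row.
    i00, i01, i10, i11 = d / det_src, -b / det_src, -c / det_src, a / det_src
    # Jacobian = (deformed edges) * inverse(source edges)
    return (p * i00 + q * i10, p * i01 + q * i11,
            r * i00 + s * i10, r * i01 + s * i11)

def classify(src, dst, max_distortion=2.0):
    """Label one triangle as 'ok', 'distorted' or 'inverted'."""
    j00, j01, j10, j11 = triangle_jacobian(src, dst)
    det = j00 * j11 - j01 * j10
    if det <= 0:
        return "inverted"  # the triangle flipped over or collapsed
    # Singular values of a 2x2 matrix from its squared Frobenius norm and determinant.
    frob2 = j00 ** 2 + j01 ** 2 + j10 ** 2 + j11 ** 2
    disc = math.sqrt(max(frob2 ** 2 - 4 * det ** 2, 0.0))
    sigma_max = math.sqrt((frob2 + disc) / 2)
    sigma_min = math.sqrt((frob2 - disc) / 2)
    return "distorted" if sigma_max / sigma_min > max_distortion else "ok"

rest = [(0, 0), (1, 0), (0, 1)]
print(classify(rest, [(0, 0), (3, 0), (0, 3)]))   # uniform scaling
print(classify(rest, [(0, 0), (5, 0), (0, 1)]))   # stretched along one axis
print(classify(rest, [(0, 0), (1, 0), (0, -1)]))  # mirrored
```

A check like this only detects problems after the fact. A deformation model with constraints, as described in the article, has to keep every element within such bounds while the deformation is being computed.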

Figure 2: Match between two images. In the center: an array of possible comparison pairs between the two images. Below: the results of the algorithm that selects what it considers to be the "best matching pairs of points"
Figure 3: Deformation of a hand model (left). Using the standard deformation algorithm (center) leads to large distortions (in red) and inversions (in yellow). A deformation model that constrains the amount of distortion can create a similar mapping without inversions and with bounded distortion (right)

4 comments

  1. World visitor
    Our brain is very different from a computer processor. Even Alan Turing thought so, and argued that the brain should be represented as a "link machine" rather than as a serial computer (of course, he did not use the term "Turing machine"...). The computer has theoretical limitations that the brain, even the simplest one, probably does not have. In particular, the brain does not run an algorithm.

  2. Right point in the first part (the image that falls on the retina is two-dimensional....).
    But wrong in the second part (just that the brain does not do all the analysis....).
    The neural network of the brain is very similar, in principle, to the computer processor.
    "Complicated" is not permanent, but a matter of perception only!
    Since mathematics deals with describing patterns, it cannot be said that there is something that cannot be described mathematically, because we have no ability to know about the existence of a pattern that we cannot describe in any way.
    If there is a way to describe a pattern, it is also possible to give it a mathematical representation, because mathematics only describes patterns, it does not create them!
    For example 1+1, is a description of a situation in which a combination of 2 "quite similar" or identical representations occurs, therefore the value is 2.
    Although, contrary to what is taught in school ("How much is 1+1?"), in the real world the value comes before an "exercise" (since one thing cannot be added to another, without the existence of both things in the first place).

  3. The brain knows most objects in their three-dimensional form, so it is much easier for it to recognize them in a two-dimensional form such as a picture, which is the only view computers usually get.

  4. The image that falls on the retina of the eye is two-dimensional and the brain also starts just like a computer, from a two-dimensional world.
    It's just that the brain doesn't do all the analysis with algorithms but with a highly complex neural network, and it makes no sense to try to give a mathematical description of what an entire network does. You can try to derive principles, but that still won't come close to what the network as a whole does.
