
To recognize a dog, for example, the computer needs to grasp the essence of canineness

Why is it so difficult to develop automatic object recognition systems?
By Uri Nitzan

In every bank there are employees who are responsible for deciphering the letters and numbers
written on checks. The sorting work is easy but time-consuming, and every year
banks in the US spend more than two billion dollars on
this kind of sorting service.

Machines that read handwriting could do the sorting work for
the banks, but the technology required to recognize handwriting at a human level of accuracy
does not yet exist. "Every seven-year-old child is able to read his own handwriting and the
handwriting of others," says Prof. Shimon Ullman of the Weizmann Institute, "and
the question arises why the most advanced supercomputers are not able to do this."

Ullman, a mathematician by training, has been researching the visual system for many years,
and as part of an artificial intelligence laboratory he designs and develops "seeing machines".
His doctoral thesis, which he wrote at the Massachusetts Institute of Technology (MIT), dealt
with the connections between three-dimensional vision and the perception of motion. He then stayed
at MIT as a faculty member for 15 years, until he joined the Department of Mathematics and
Computer Science at the Weizmann Institute, which he currently heads.

"Our goal," says Ullman, "is to decipher the overall activity of the nerve
cells involved in vision, and to understand the computational side of the system.
The work is carried out in cooperation with neuroscientists who study the
biological side and the electrical activity of individual nerve cells, and one of its
main goals is the development of an artificial vision system that recognizes objects."

A human easily differentiates between a dog and a cat, yet researchers struggle
to reproduce this ability. Although thousands of people have worked for decades
to solve the problem of automatic object recognition, there is still no system
that does it at the level of a three-year-old child. An artificial vision system must
take into account the fact that images of a particular object can be
very different from each other. If the brain were to retain only two or three images of
an object and compare them to a new picture, we would most likely not
identify objects at all. The viewing angle of the object changes, there is a play of light
and shadow, and when designing the operating systems of "seeing machines" one has to
understand and address all possible changes. The human brain knows the
variety of appearances an object can take, and "neutralizes" the obstacles at an unconscious level.

"Humans filter out irrelevant information, and are able to locate the characteristics
that make a dog a dog and a cat a cat. We all know first hand that the mind
is able to grasp the essence of canineness," says Ullman, "and from the moment we have met
a biting dog, we are careful of any dog that resembles it, from any viewing angle."

Prof. Ullman and his colleagues are trying to reproduce this ability in computerized
systems, and indirectly also to investigate and learn about the processes of vision.
"The ambition is to develop computer programs that take in an image and are able to describe
the objects that appear in it and define them. To this end, we develop mathematical
algorithms and try to understand the parallels between the activity of the brain and the
activity dictated by the algorithm."

The conventional approach to object recognition is based on comparing the image of the object
to many previously learned images. According to this approach, identifying an object as a cat or
as a dog is done as follows: the computer is fed many pictures of dogs
and cats, with an emphasis on different breeds of cats and dogs and on different
photographic angles. The computer compares the image of the object to the existing image
database, and sufficient similarity to one of the examples identifies the object as a dog or a cat.
The conventional approach does not take into account the fact that pets can raise
the tail at one moment and lower it at another.
This variability in the object's shape makes it difficult for the computer to match it successfully
to the examples stored in memory.
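
The weakness described above can be illustrated with a toy sketch. It is not taken from
any real system: the tiny black-and-white "images", the pixel-count distance and the names
distance, classify and EXAMPLES are all assumptions made for illustration. The stored dog has
its tail up; the same dog with its tail down no longer matches closely enough.

```python
# Illustrative sketch only: nearest-example matching on tiny binary "images".
# All names and the toy data are hypothetical, not taken from the article.

def distance(a, b):
    """Count the pixels at which two equally sized images differ."""
    return sum(pa != pb
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))

def classify(image, examples, max_distance):
    """Return the label of the closest stored example,
    or None if no example is similar enough."""
    best_label, best_dist = None, None
    for label, stored in examples:
        d = distance(image, stored)
        if best_dist is None or d < best_dist:
            best_label, best_dist = label, d
    if best_dist is not None and best_dist <= max_distance:
        return best_label
    return None

# Stored examples (assumed toy data): a dog with its tail up, and a cat.
DOG_TAIL_UP = [[0, 0, 0, 1],
               [1, 1, 1, 1],
               [1, 0, 1, 0]]
CAT = [[1, 0, 0, 0],
       [1, 1, 1, 0],
       [1, 0, 1, 0]]
EXAMPLES = [("dog", DOG_TAIL_UP), ("cat", CAT)]

# The same dog a moment later, with its tail lowered.
DOG_TAIL_DOWN = [[0, 0, 0, 0],
                 [1, 1, 1, 1],
                 [1, 0, 1, 1]]

print(classify(DOG_TAIL_UP, EXAMPLES, max_distance=1))    # stored pose is recognized
print(classify(DOG_TAIL_DOWN, EXAMPLES, max_distance=1))  # new pose is rejected
```

Loosening max_distance would accept the lowered tail, but in this toy data the cat is then
almost as close a match as the dog, which is the difficulty the paragraph above describes.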

Ullman's research is based on the automatic detection of partial features.
These features are sub-shapes and segments of the object, which constitute a kind of
basic alphabet from which different objects can be assembled. The algorithm
developed by Ullman and his students cuts the image of the object into "puzzle pieces"
and determines the relative importance of each "piece" to the definition of the dogs or
the cats. "This way we can calculate the amount of unique information that can be
extracted from the dog's tail or the cat's ear. The computer stores the
alphabet of shapes from which dogs and cats are built, and when required to identify
a dog, it will use the database of the partial features that define dogs."
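
One way to picture how the "relative importance" of a piece might be calculated is the
sketch below. The article does not name the measure used, so this example assumes mutual
information between "the piece appears in the image" and "the image shows a dog". The
piece names, the toy detection results and the function mutual_information are
hypothetical, and the step of finding a piece inside an image is assumed to have been done already.

```python
# Illustrative sketch only: ranking image "pieces" by how much they tell us
# about the class. Mutual information is an assumed choice of measure;
# the names and the toy data are hypothetical.
import math

def mutual_information(present, is_dog):
    """Mutual information, in bits, between two equally long lists of booleans."""
    n = len(present)
    mi = 0.0
    for f in (True, False):
        p_f = sum(1 for p in present if p == f) / n
        for c in (True, False):
            p_c = sum(1 for d in is_dog if d == c) / n
            joint = sum(1 for p, d in zip(present, is_dog)
                        if p == f and d == c) / n
            if joint > 0:  # empty cells contribute nothing
                mi += joint * math.log2(joint / (p_f * p_c))
    return mi

# Eight training images: the first four are dogs, the last four are cats.
IS_DOG = [True, True, True, True, False, False, False, False]

# For each piece, whether it was found in each of the eight images (toy data).
PIECES = {
    "dog_tail":    [True, True, True, True, False, False, False, False],
    "pointed_ear": [True, False, False, False, True, True, True, False],
    "fur_patch":   [True, False, True, False, True, False, True, False],
}

SCORES = {name: mutual_information(found, IS_DOG) for name, found in PIECES.items()}
RANKING = sorted(SCORES, key=SCORES.get, reverse=True)

for name in RANKING:
    print(f"{name}: {SCORES[name]:.3f} bits")
```

In this toy data the tail appears in every dog and in no cat, so it carries a full bit of
information; the fur patch appears equally often in both and carries none. A system along
these lines would keep the high-scoring pieces as its "alphabet" and discard the rest.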

An experimental system developed by Ullman's research students already
"knows" how to define and extract from pictures the basic features of
cars, faces and other objects. In a second step, the system uses
the features it has defined to locate faces and cars within a large number
of new photos. "A parallel can be drawn between our system and a child
seeing objects for the first time, cars for example. The child locates and stores
in memory the unique characteristics of the car," explains Ullman,
"and from that day on he will be able to recognize cars he has never seen before."

Israeli research in the field of computer vision is among the leading in the world, and
it has many practical applications. Prof. Ullman is one of the founders of the company
"Orbotech", which implements computer vision systems. The company already produces
and markets a device that performs automatic visual inspection of printed circuits
(the inspection was previously carried out by employees equipped with a magnifying glass).
Printed circuits are part of every electronic system, and their function is to connect
the chips, the building blocks of the system.
Automated visual inspection has become an integral part of the production process
of the circuits, and it includes "reading" the printed electronic
connections and identifying errors and defects.
