About ABBYY Products

How FineReader Recognizes Printed and Handprinted Documents

Giving machines human abilities, such as building a machine that can read, is an old dream of mankind. Over the last 50 years this dream has begun to come true. Character recognition is one of the most successful applications of artificial intelligence. And now there is a solution: a machine able to read almost as well as a human.

This solution is based on the recognition principles used by humans and animals: the principles of Integral Purposeful Adaptive perception, or IPA technology. The solution is called ABBYY FineReader, an example of the highest standard in the recognition of printed and handprinted symbols. ABBYY's approach to developing recognition systems springs from ideas set forth in the 1970s by Professor Marvin Minsky, a pioneer of artificial intelligence. Minsky was the inventor of frames, structures that represent human knowledge and make it usable. His theory of frames offers solutions to various problems such as computer vision, natural language understanding, information retrieval and planning. According to this theory, the human view of the world is kept in structured frames, and the process of thinking relies on various kinds of frames, which are used to recognize images and to understand words, deeds, logic and so on.

We believe that activity is not merely necessary but vital for living beings: neurons must be active, behavior must be active and, last but not least, perception must be active too. Inactivity is lethal for any living being. These notions are close to those underlying the theory of stable non-equilibrium that E.S. Bauer proposed in his book "Theoretical Biology", published in 1937. ABBYY FineReader is active with regard to its environment: just as living beings do, it proposes hypotheses and then proves or rejects them. Like a living being analyzing an object or a situation, FineReader examines the known properties of a given object, proposes hypotheses about its class, and then proves or rejects them. This scheme allows FineReader to make decisions, i.e. to recognize, with better accuracy.

ABBYY developers are working to replace the traditional recognition of separate symbols with "recognition with understanding". The traditional approach, which interprets what is present in the image, gives way to a purposeful search for what is expected to be present in the image.

Such a system could not have been implemented without the principles of Integrity, Purposefulness and Adaptivity. These principles were proposed in the 1970s by Professor Alexander Shamis, now an ABBYY staff member, and a team of scientists at the Research Center of Electronic and Computer Technique. They became the basis of FineReader technology, i.e. of a system that recognizes both printed and handprinted documents. The principles reflect the current state of the art in the understanding of human visual perception.

Integrity

An object of recognition is described as a single entity by a set of basic elements and the relations between them. An object is recognized as belonging to a certain class only if it contains all the necessary elements and relations. For example, if we are recognizing furniture in a room and find an object standing on the floor that consists of four vertical sticks and a horizontal surface attached to their upper ends, we recognize this object as a table.
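The table example can be sketched in a few lines of Python. This is only an illustration of the Integrity principle, not FineReader's implementation: the names `ObjectDescription`, `ClassDescription`, `belongs_to` and the relation labels are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ObjectDescription:
    """An observed object: its parts and the relations found between them."""
    elements: list   # e.g. ["vertical_stick", "vertical_stick", ...]
    relations: set   # e.g. {("stands_on", "vertical_stick", "floor")}


@dataclass
class ClassDescription:
    """A class is defined by the elements and relations it requires."""
    name: str
    required_elements: dict   # element kind -> required count
    required_relations: set


def belongs_to(obj, cls):
    """True only if every required element and relation is present."""
    counts = Counter(obj.elements)
    has_elements = all(
        counts[kind] >= n for kind, n in cls.required_elements.items()
    )
    has_relations = cls.required_relations <= obj.relations
    return has_elements and has_relations


# Hypothetical description of the class "table" from the example.
TABLE = ClassDescription(
    name="table",
    required_elements={"vertical_stick": 4, "horizontal_surface": 1},
    required_relations={
        ("stands_on", "vertical_stick", "floor"),
        ("attached_to_upper_ends", "horizontal_surface", "vertical_stick"),
    },
)

four_sticks_and_top = ObjectDescription(
    elements=["vertical_stick"] * 4 + ["horizontal_surface"],
    relations={
        ("stands_on", "vertical_stick", "floor"),
        ("attached_to_upper_ends", "horizontal_surface", "vertical_stick"),
    },
)
sticks_only = ObjectDescription(
    elements=["vertical_stick"] * 4,
    relations={("stands_on", "vertical_stick", "floor")},
)

print(belongs_to(four_sticks_and_top, TABLE))  # all parts and relations found
print(belongs_to(sticks_only, TABLE))          # surface and its relation missing
```

The point of the sketch is that the object is accepted as a whole: four sticks without the surface, or a surface without the right relation to the sticks, is not a table.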

Purposefulness

Recognition is a process of generating and verifying hypotheses. The traditional approach, which interprets whatever may be present in the image, is replaced with one that purposefully looks for the image features most likely to be there. In our example, we can hypothesize that the object is a table if it has three or four vertical sticks.
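A minimal sketch of this generate-and-verify loop, again using the table example. The cue rules, the feature names (`vertical_sticks`, `surface_on_stick_tops`, `shade_on_top`) and the second class `floor_lamp` are assumptions made up for the illustration.

```python
def propose_hypotheses(observed):
    """Generate candidate classes from a cheap, easily visible cue."""
    hypotheses = []
    if observed.get("vertical_sticks") in (3, 4):
        hypotheses.append("table")
    if observed.get("vertical_sticks") == 1:
        hypotheses.append("floor_lamp")
    return hypotheses


# For each hypothesis, the feature we purposefully go looking for.
VERIFIERS = {
    "table": lambda o: o.get("surface_on_stick_tops", False),
    "floor_lamp": lambda o: o.get("shade_on_top", False),
}


def recognize(observed):
    """Return the first hypothesis that is confirmed, or None."""
    for label in propose_hypotheses(observed):
        if VERIFIERS[label](observed):
            return label   # hypothesis proved
        # otherwise the hypothesis is rejected and the next one is tried
    return None


print(recognize({"vertical_sticks": 3, "surface_on_stick_tops": True}))
print(recognize({"vertical_sticks": 4}))
```

The verifier never scans the object for every possible feature; it checks only what the current hypothesis predicts should be there.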

Adaptivity

Adaptivity is the ability of the system to learn and to train itself for effective recognition of heterogeneous objects. For example, even if the system did not previously know about three-legged tables, once it has recognized one such object as a table it will confidently recognize similar objects as tables from then on.
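One way to picture this is a recognizer that keeps a set of known variants and extends it after a confirmed recognition. The class below is a hypothetical sketch: the confidence values 1.0, 0.5 and 0.0 and the explicit `confirm` step are assumptions, not a description of how FineReader trains.

```python
class AdaptiveTableRecognizer:
    """Toy recognizer that learns new table variants by leg count."""

    def __init__(self, known_leg_counts):
        self.known_leg_counts = set(known_leg_counts)

    def confidence(self, legs, has_top):
        """Return 1.0 for a known variant, 0.5 for a near miss, else 0.0."""
        if not has_top:
            return 0.0
        if legs in self.known_leg_counts:
            return 1.0
        # Unknown variant: tentative match if it is close to a known one.
        nearest = min(abs(legs - known) for known in self.known_leg_counts)
        return 0.5 if nearest == 1 else 0.0

    def confirm(self, legs):
        """Record a variant once its recognition has been accepted."""
        self.known_leg_counts.add(legs)


recognizer = AdaptiveTableRecognizer(known_leg_counts={4})
first_look = recognizer.confidence(legs=3, has_top=True)   # tentative
recognizer.confirm(3)                                      # accepted as a table
second_look = recognizer.confidence(legs=3, has_top=True)  # now confident
print(first_look, second_look)
```

Learning is kept as a separate `confirm` call so that a tentative guess does not silently widen the class on its own.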

In accordance with these three basic principles, ABBYY engineers developed a new structural pattern algorithm for character recognition. It is now used in FineReader alongside other widely known algorithms: feature and raster classifiers.

A structural pattern describes a character as a set of structural elements bound to each other by spatial relations. There are four types of structural elements: segment, arc, circle and point. A structural pattern is matched against a character image by establishing a correspondence between the structural elements and parts of the input image that satisfies all the spatial relations. The structural pattern algorithm provides the highest recognition accuracy even for highly variable symbols, which is particularly important for handprinted text. As a result, symbol size and font are no longer crucial for the recognition system.
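The matching step can be sketched as a search for an assignment of pattern elements to image parts that satisfies every relation. The brute-force search, the data layout and the example patterns for "i" and "!" below are assumptions chosen for brevity; the text does not say how FineReader performs the search.

```python
from itertools import permutations

ELEMENT_TYPES = {"segment", "arc", "circle", "point"}


def above(a, b):
    """Spatial relation: a lies above b (image y grows downward)."""
    return a["y"] < b["y"]


def match_pattern(elements, relations, image_parts):
    """Find a correspondence between pattern elements and image parts.

    elements:    dict of element name -> element type
    relations:   list of (predicate, name_a, name_b)
    image_parts: list of dicts with keys "type", "x", "y"
    Returns a dict of element name -> image part index, or None.
    """
    assert set(elements.values()) <= ELEMENT_TYPES
    names = list(elements)
    # Try every way of assigning distinct image parts to pattern elements.
    for combo in permutations(range(len(image_parts)), len(names)):
        assignment = dict(zip(names, combo))
        types_ok = all(
            image_parts[i]["type"] == elements[n] for n, i in assignment.items()
        )
        if not types_ok:
            continue
        relations_ok = all(
            pred(image_parts[assignment[a]], image_parts[assignment[b]])
            for pred, a, b in relations
        )
        if relations_ok:
            return assignment
    return None


# Hypothetical patterns: "i" is a point above a segment, "!" the reverse.
LETTER_I = ({"dot": "point", "stem": "segment"}, [(above, "dot", "stem")])
EXCLAMATION = ({"dot": "point", "stem": "segment"}, [(above, "stem", "dot")])

# Parts extracted from an input image (coordinates are part centers).
parts = [
    {"type": "segment", "x": 5, "y": 10},
    {"type": "point", "x": 5, "y": 2},
]

print(match_pattern(*LETTER_I, parts))
print(match_pattern(*EXCLAMATION, parts))
```

Because the pattern constrains only element types and their relative positions, the same description fits the character at any size and in any font, which is the property the paragraph above describes.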

In the future we plan to develop a system that will not just recognize separate symbols and words but will "read and understand", thanks to its ability to expect and to confirm the expected. Such a system can be built by using syntactic and semantic context, which allows the machine to "understand" the text. It will work as an active perceptive system, much as living beings do.

These principles are of crucial importance not only for character recognition but also for many other artificial intelligence applications, and ABBYY makes full use of them today in developing a whole range of its software products.

Integral Purposeful Adaptive technology can be used not just for recognition but also for analyzing any type of structured object. In particular, ABBYY is now investigating the use of IPA technology for the syntactic analysis of natural language.

© 1996-2000 ABBYY Software House
Tel: +7 095 234-44-00,
Fax: +7 095 956-47-87
office@abbyy.ru