How FineReader recognizes printed and handprinted documents
Giving machines human abilities, for example creating a machine that is able to read, is an
old dream of mankind. During the last 50 years this dream began to come true. Character
recognition is one of the most successful applications of artificial intelligence. And now
there is a solution: a machine able to read almost as well as a human.
This solution is based upon recognition principles used by humans and animals: the principles
of Integral Purposeful Adaptive perception (IPA technology). This solution is called ABBYY
FineReader, an example of the highest standard of printed and handprinted symbol
recognition. ABBYY's approach to developing recognition systems springs from ideas
set forth in the 1970s by a pioneer of artificial intelligence, Professor Marvin Minsky.
Marvin Minsky was the inventor of frames, structures that serve to represent and use human
knowledge. His theory of frames is a concept that offers solutions to various problems
such as computer vision, natural language understanding, information retrieval and planning.
According to his theory, the human view of the world is kept in structured frames, and the
process of thinking is based upon various kinds of frames, used to recognize images,
understand words, deeds, logic and so on.
We believe that activity is not only necessary but vital for living beings:
neurons must be active, behavior must be active, and, last but not least, perception must
be active too. Inactivity is lethal for any living being. These notions are close to those
that form the basis of the theory of stable non-equilibrium that E. S. Bauer proposed in his
book "Theoretical Biology", published in 1937. ABBYY FineReader is active with respect to
the environment it exists in. Just as a living being analyzes an object or a situation,
FineReader examines a given object according to certain known properties, proposes
hypotheses about its class, and then confirms or rejects them. This scheme allows FineReader
to make decisions, i.e. to recognize, with better accuracy.
ABBYY developers are working to replace the traditional recognition of separate symbols with
"recognition with understanding". The traditional approach, which interprets what is present
in the image, is replaced by a purposeful search for what is expected to be present in the image.
Such a system could not have been implemented without the principles of Integrity,
Purposefulness and Adaptivity. These principles were proposed in the 1970s by Professor
Alexander Shamis, now an ABBYY staff member, and a team of scientists at the Research Center
of Electronic and Computer Technique. The principles of Integrity, Purposefulness and
Adaptivity became the basis of FineReader technology, i.e. of the system that recognizes both
printed and handprinted documents. These principles are in line with the modern
understanding of human visual perception.
Integrity
An object of recognition is described as a single entity via a set of basic elements and
the relations between them. An object is recognized as belonging to a certain class of objects
only if it contains all the necessary elements and the relations between them. For
example, if we are recognizing furniture in a room and we find an object standing on the
floor that consists of four vertical sticks and a horizontal surface attached to the upper
ends of the sticks, we recognize this object as a table.
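The table example can be sketched in a few lines of Python. This is only an illustration of the principle, not FineReader's actual code: the TABLE description, the element and relation names, and the function belongs_to_class are all hypothetical.

```python
from collections import Counter

# Hypothetical description of the class "table": the parts it must contain
# and the relations that must hold between them.
TABLE = {
    "elements": Counter({"vertical_stick": 4, "horizontal_surface": 1}),
    "relations": {
        ("vertical_stick", "stands_on", "floor"),
        ("horizontal_surface", "attached_to_top_of", "vertical_stick"),
    },
}


def belongs_to_class(obj_elements, obj_relations, description):
    """Return True only if every required element and relation is present."""
    found = Counter(obj_elements)
    has_elements = all(
        found[kind] >= count for kind, count in description["elements"].items()
    )
    has_relations = description["relations"] <= set(obj_relations)
    return has_elements and has_relations
```

An object that lacks a single required element or relation is rejected as a whole, which is the point of treating it as one entity rather than as a bag of independent features.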
Purposefulness
Recognition is a process of generating and verifying hypotheses. The traditional approach,
which interprets what may be present in the image, is replaced with an approach that
purposefully looks for the image features that are most expected to be there. In our
example, we can hypothesize that the object is a table if it has three or four vertical sticks.
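A minimal sketch of this generate-and-verify loop is given below. It assumes that an object is summarized as a dictionary of observed features. The HYPOTHESES table, the feature names and the "ladder" class are invented for illustration and are not taken from FineReader.

```python
# Hypothetical class descriptions: a cheap cue that triggers the hypothesis,
# and the features the verifier then looks for on purpose.
HYPOTHESES = {
    "table": {
        "cue": lambda obj: obj.get("vertical_sticks", 0) in (3, 4),
        "expected": ["horizontal_surface_on_top", "stands_on_floor"],
    },
    "ladder": {
        "cue": lambda obj: obj.get("vertical_sticks", 0) == 2,
        "expected": ["horizontal_rungs"],
    },
}


def recognize(obj):
    """Propose hypotheses from cues, then confirm or reject each one."""
    for name, hypothesis in HYPOTHESES.items():
        if not hypothesis["cue"](obj):
            continue  # this hypothesis is never proposed for the object
        if all(obj.get(feature, False) for feature in hypothesis["expected"]):
            return name  # hypothesis confirmed
    return None  # every proposed hypothesis was rejected
```

The verifier checks only the features that the current hypothesis predicts, instead of first describing everything in the image and classifying afterwards.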
Adaptivity
Adaptivity is the ability of the system to learn and to train itself for effective recognition
of heterogeneous objects. For example, if the system did not know about three-legged tables
but once recognized such an object as a table, it will afterwards confidently recognize
similar objects as tables.
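The three-legged table can be turned into a toy Python sketch. The class name, the confidence values 0.5 and 1.0, and the rule that any object with a top and at least three legs is accepted tentatively are assumptions made for this illustration only.

```python
class AdaptiveTableRecognizer:
    """Toy recognizer that remembers table variants it has accepted before."""

    def __init__(self):
        # Initially the system knows only four-legged tables.
        self.known_leg_counts = {4}

    def recognize(self, legs, has_top):
        """Return a confidence in [0, 1] that the object is a table."""
        if not has_top or legs < 3:
            return 0.0  # required parts are missing, hypothesis rejected
        if legs in self.known_leg_counts:
            return 1.0  # familiar variant, confident answer
        # Unfamiliar variant: accept it tentatively and remember it,
        # so that similar objects are recognized confidently next time.
        self.known_leg_counts.add(legs)
        return 0.5
```

The first three-legged table gets a cautious answer, and every later one is treated as a known variant.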
In accordance with these three basic principles, ABBYY engineers developed a new structural
pattern algorithm for character recognition. It is now used in FineReader alongside other
widely known algorithms: feature and raster classifiers.
A structural pattern describes a character as a set of structural elements bound to each
other by spatial relations. There are four types of structural elements: segment, arc, circle
and point. A structural pattern is matched against a character image by establishing a
correspondence between the structural elements and those parts of the input image that
satisfy all the spatial relations. The structural pattern algorithm provides the highest
accuracy of recognition even for variable symbols, a feature that is particularly important
for handprinted texts. As a result, symbol size and font are no longer crucial for the
recognition system.
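The matching step can be illustrated with a small Python sketch. It assumes that the image has already been decomposed into parts and that each part is a dictionary with a kind, an orientation and a vertical position. The pattern for the letter "T", the field names and the brute-force search are all hypothetical and say nothing about how FineReader implements the algorithm.

```python
from itertools import permutations

# Hypothetical structural pattern for the letter "T": two segments, with the
# horizontal one above the vertical one. The element kinds come from the text
# (segment, arc, circle, point); everything else is illustrative.
PATTERN_T = {
    "elements": {"bar": "segment", "stem": "segment"},
    "relations": [
        lambda m: m["bar"]["orientation"] == "horizontal",
        lambda m: m["stem"]["orientation"] == "vertical",
        lambda m: m["bar"]["y"] < m["stem"]["y"],  # image y grows downward
    ],
}


def match(pattern, parts):
    """Try every assignment of image parts to pattern elements and return
    the first one that satisfies all spatial relations, or None."""
    names = list(pattern["elements"])
    for candidate in permutations(parts, len(names)):
        mapping = dict(zip(names, candidate))
        kinds_ok = all(
            mapping[name]["kind"] == pattern["elements"][name] for name in names
        )
        if kinds_ok and all(relation(mapping) for relation in pattern["relations"]):
            return mapping
    return None
```

Because the relations compare positions rather than absolute coordinates or pixel shapes, the same pattern matches a "T" of any size, which is one way to read the remark that size and font stop being crucial.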
In the future we plan to develop a system that will not just recognize separate symbols
and words, but will "read and understand" thanks to its ability to form expectations and
confirm them. Such a system can be developed by using syntactic and semantic context, which
allows the machine to "understand" the text. This system will work as an active perceptive
system, similar to living beings.
These principles are of crucial importance not only for character recognition but also for many
other artificial intelligence applications, and ABBYY makes full use of them today in the
development of a whole range of its software products.
Integral Purposeful Adaptive technology can be used not just for recognition purposes but also
to analyze any type of structured object. In particular, ABBYY is now investigating the
possibility of using IPA technology for syntactic analysis of natural language.