ABOUT OCR
OCR (Optical Character Recognition) applications are used to convert
paper documents into editable electronic form. A paper document may
be an old fax, a press cutting, dot matrix printer or typewriter
output, a photocopy or a page from an old book - any document
containing information that needs to be quoted or entered into a
corporate electronic document system.
In this case the user has two choices: 1) retype the document,
reconstructing its original formatting (tables, columns, pictures
etc.), or 2) use a scanner and an OCR system, which is much faster
and easier provided the OCR application recognizes documents
accurately.
Using an OCR application is quite simple: put the document in the
scanner, start the "Scan&Read" wizard (when using ABBYY FineReader),
verify the recognition result, and export it to one of the office
applications (MS Word, WordPad, WordPro etc.) or save it in one of
the supported formats (RTF, HTML, PDF etc.).
The most laborious parts of the process are verification and
reconstructing the original document layout.
That is why users place such a high value on recognition accuracy and
document layout retention in OCR systems. This is borne out by a
survey conducted by ABBYY, the results of which are given below.
What do OCR users consider important?
A recent survey among FineReader users showed that the most
important features for OCR users are:
Accuracy | 95%
Page layout retention for word processing publications (in MS Word, MS Excel, Word Pro, Word Perfect formats) | 89%
Page layout retention for word processing publications (in PDF, HTML formats) | 87%
Table and multi-column text recognition | 87%
Ease of use | 85%
Reliability | 82%
Easy-to-use error searching procedure | 80%
Colour features (retention of colour pictures, font and background colour) | 63%
Direct export to other applications | 61%
Speed | 55%
Multi-language recognition | 25%
During the verification stage, additional tools such as the
spell-checker built into some OCR packages greatly help the user to
reduce the error rate. For example, ABBYY FineReader can recognize
documents in 52 languages and has a built-in spell-checker for 21
languages.
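To illustrate how a dictionary check can help during verification, here
is a minimal sketch in Python. It is not how FineReader's spell-checker
works; the word list KNOWN_WORDS and the function flag_suspect_words are
hypothetical names made up for this example, and a real spell-checker
would use a full dictionary for each language.

```python
import re

# Hypothetical word list for illustration only; a real spell-checker
# would load a complete dictionary for the document language.
KNOWN_WORDS = {"the", "process", "of", "using", "an", "ocr", "is", "quite", "simple"}


def flag_suspect_words(text, known_words=KNOWN_WORDS):
    """Return (offset, word) pairs for words that are not in the word list."""
    suspects = []
    # Treat runs of letters and digits as words, so that typical OCR
    # confusions such as "0" for "O" or "1" for "i" stay inside the word.
    for match in re.finditer(r"[A-Za-z0-9]+", text):
        word = match.group()
        if word.lower() not in known_words:
            suspects.append((match.start(), word))
    return suspects


# Recognized text with two typical character confusions.
recognized = "The process of using an 0CR is qu1te simple"
for offset, word in flag_suspect_words(recognized):
    print(f"check '{word}' at offset {offset}")
```

The user then only has to look at the flagged words instead of reading
the whole recognition result against the paper original.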
FineReader has led on recognition accuracy in all competitive tests
published in Russia since April 1995 (the launch of FineReader 2.0).
Since its launch, FineReader 4.0 Standard/Professional has received
19 awards from well-known IT magazines worldwide for its unsurpassed
recognition accuracy.
FineReader has one more very important quality: retention of the
original document layout, including tables, columns and pictures, and
export of recognition results to all popular office applications
(Word, WordPro, WordPerfect, Excel). FineReader saves the results in
the following formats: RTF, XLS, CSV, DBF, PDF, HTML etc.
Thus, the whole process of data input from paper to computer (i.e.
from scanning to export of recognition results) takes less than a
minute* (* depending on the document and the hardware used), and the
"electronic" document looks just like the original paper one!