Making Good Use of Adobe Acrobat PDF Files

This topic describes the PDF format supported by Readiris and explains how you can make good use of the resulting PDF files.

File formats

Note: compression is used for all elements. Black-and-white images are stored as Group 4 compressed TIFF files; greyscale and color images are stored as JPEG files at high quality (0.8). The text is compressed using the Gzip mode.
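To give a feel for what Gzip compression does to text, here is a minimal sketch in Python. It uses Python's standard gzip module as a stand-in and is not how Readiris writes PDF files internally; the function names and the sample text are assumptions made for illustration only.

```python
import gzip


def compress_text(text: str) -> bytes:
    """Encode the text as UTF-8 and compress it with gzip."""
    return gzip.compress(text.encode("utf-8"))


def decompress_text(data: bytes) -> str:
    """Reverse compress_text: decompress, then decode as UTF-8."""
    return gzip.decompress(data).decode("utf-8")


# Hypothetical sample: repetitive text, as recognized documents often are.
sample = "Readiris recognized this line of text. " * 50
packed = compress_text(sample)

print(len(sample.encode("utf-8")), "bytes before compression")
print(len(packed), "bytes after compression")
```

The compression is lossless: decompressing the packed bytes returns the original text exactly, which is why it suits text, whereas JPEG compression for images trades some detail for size.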

Tip: Readiris allows you to create bookmarks automatically.

Tip: Readiris allows you to embed the fonts in PDF documents!

Advantages

Editing the recognized text

The recognized text can be edited and reused. (Bitmap images can be viewed but not edited.)

Use the TouchUp Text tool of the Acrobat software to correct small recognition errors in the PDF file.

Tip: you need the appropriate version of Adobe Reader to display the resulting PDF files correctly! To view and print Central-European texts (such as Czech and Polish), Baltic texts, Turkish texts and Cyrillic (“Russian”) texts in the PDF format, you must have the special “CE” (Central-European) version of the Adobe Reader software. (You can find this software on the Readiris CD-ROM.)

Exporting text to other applications

Intelligent searching

Use the Search command of your Adobe Acrobat or Adobe Reader software for simple searches within a document and for advanced searching across several PDF documents.

The Search button of the Adobe Acrobat or Adobe Reader software finds complete words or word parts in the current PDF document. Acrobat looks for the word by sequentially reading every word on every page in the file.

The Search button of the Adobe Acrobat or Adobe Reader software also allows you to perform fast, advanced searches on a collection of indexed PDF documents.

Index-based searching requires that a “full-text” index has been created for the collection of PDF files with the Catalog command. (A “full-text” index is an alphabetized list of every word used in a document or a series of documents. Index-based searching is much faster than sequential reading: Adobe Acrobat and Adobe Reader go right to the word in the list rather than progressively reading through the documents.)
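The difference between sequential reading and index-based searching can be sketched in a few lines of Python. This is only an illustration of the general idea, not the way Acrobat or the Catalog command is implemented; the file names, the sample texts and all function names are assumptions, and the sketch matches complete words only.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def sequential_search(documents, word):
    """Read every word of every document, one after the other."""
    word = word.lower()
    return sorted(name for name, text in documents.items()
                  if word in tokenize(text))


def build_index(documents):
    """Build an alphabetized list of every word, with the documents using it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for token in tokenize(text):
            index[token].add(name)
    return dict(sorted(index.items()))


def indexed_search(index, word):
    """Go straight to the word in the index instead of rereading the texts."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical collection: file names mapped to their recognized text.
documents = {
    "invoice.pdf": "Payment is due within thirty days",
    "manual.pdf": "Scan the page and export the text",
    "report.pdf": "The scan quality affects text recognition",
}

index = build_index(documents)  # done once, like running Catalog
print(indexed_search(index, "scan"))
print(sequential_search(documents, "scan"))
```

Both searches return the same documents. The index has to be built once up front, but every later search is a single lookup, while the sequential search rereads every document each time.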