Training OCR for Special Characters

A training file is a set of pre-recognized text characters that OmniPage Pro compares with characters on a page image during OCR. You can create a training file for special characters that might otherwise be difficult to recognize, such as the copyright symbol © or the registered trademark symbol ®.

NOTE: Most characters do not need to be trained. Train only uncommon characters. Do not train OmniPage Pro on regular characters, because doing so may interfere with recognition.
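The idea of comparing page characters against pre-recognized ones can be illustrated with a minimal Python sketch. It is a conceptual illustration only and does not reflect how OmniPage Pro stores or matches training data: the glyph representation, the names `distance` and `recognize`, and the `max_distance` threshold are all hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical glyph format: one string per pixel row, '#' = ink, '.' = background.
Glyph = Tuple[str, ...]


def distance(a: Glyph, b: Glyph) -> int:
    """Count the pixels that differ between two glyphs of the same size."""
    return sum(pa != pb for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b))


def recognize(glyph: Glyph, training: Dict[Glyph, str], max_distance: int = 2) -> Optional[str]:
    """Return the closest trained character, or None if no sample is close enough."""
    best_char, best_dist = None, max_distance + 1
    for sample, char in training.items():
        # Skip samples whose dimensions do not match the glyph being tested.
        if len(sample) != len(glyph) or any(len(r) != len(g) for r, g in zip(sample, glyph)):
            continue
        d = distance(sample, glyph)
        if d < best_dist:
            best_char, best_dist = char, d
    return best_char


# A toy 5x5 sample standing in for a trained copyright symbol.
COPYRIGHT: Glyph = (".###.", "#...#", "#.#.#", "#...#", ".###.")
training_file = {COPYRIGHT: "©"}

noisy: Glyph = (".###.", "#...#", "#.#.#", "#...#", ".##..")      # one pixel differs
unrelated: Glyph = ("#####", "#####", "#####", "#####", "#####")  # far from the sample

print(recognize(noisy, training_file))      # matched by the trained sample
print(recognize(unrelated, training_file))  # no match; a real engine would use its normal OCR
```

In this toy model, a trained sample takes effect whenever a page glyph falls within the threshold, which suggests why training ordinary characters could override results that the regular recognizer would otherwise get right.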

To create a training file:

  1. Click the Document Source button on the Manual OCR toolbar to open the image file or scan the page that includes characters you want to train.

  2. Using the Image toolbar, create zones around the text that you want to train.

  3. Choose Edit/Create Training File... in the Tools menu.

  4. Click the Original Layout and Output Format button on the Manual OCR toolbar, or choose Recognize Current Page in the Process menu.

    OmniPage Pro analyzes the document and then opens the Train Characters dialog box.

  5. Double-click a character you want to train. Or, select it and click Specify.

    The Specify Character dialog box shows how the selected character appeared in the original page image.

  6. Specify how you want OmniPage Pro to interpret the character during OCR by entering a character in the Character edit box.

  7. Click OK to return to the Train Characters dialog box.

  8. Repeat steps 5–7 to continue specifying characters.

  9. Click Save to save the specified characters to a training file.

    Or, click Append to add the specified characters to another training file.

After saving or appending to a file, you are asked if you want to make this the current training file. Click Yes to recognize the current page using the training file you just created. Click No to return to the image without recognizing it.

Training files are saved in the data folder in your installation folder. You can select them in the OCR tab of the Options dialog box.
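The difference between Save and Append in the procedure above can be sketched in Python. This is an illustration under stated assumptions, not OmniPage Pro's file format or behavior: the JSON layout, the glyph identifiers, and the functions `save_training` and `append_training` are hypothetical, and the sketch assumes that Save writes only the newly specified characters while Append merges them into an existing file.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict


def save_training(path: Path, specified: Dict[str, str]) -> None:
    """Save: write the specified characters to a training file (assumed to replace its contents)."""
    path.write_text(json.dumps(specified, ensure_ascii=False), encoding="utf-8")


def append_training(path: Path, specified: Dict[str, str]) -> None:
    """Append: merge the specified characters into an existing training file."""
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    existing.update(specified)  # newly specified characters win on conflict
    save_training(path, existing)


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "symbols.json"            # hypothetical training file name
    save_training(path, {"glyph-001": "©"})      # first session: Save
    append_training(path, {"glyph-002": "®"})    # later session: Append
    result = json.loads(path.read_text(encoding="utf-8"))

print(result)
```

Under these assumptions, appending lets one training file accumulate special characters from several pages instead of keeping a separate file for each page.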

To edit a training file:

  1. Choose Edit/Create Training File... in the Tools menu.

    Or, select Use Training File, then select a training file in the drop-down list and click the Edit button.

    A dialog box appears listing all your training files.

  2. Double-click the training file you want to edit. Or, select it and click Edit.

    The Train Characters dialog box displays the characters in the selected file.

  3. Edit the characters as desired.

  4. Do one of the following after editing the training file: