SCANTUNE -- Version 1.1
A unique multifaceted menu driven program for cleaning up Optical
Character Recognition (OCR) text files on IBM and compatible computers.
by George McCoy
Documentation by Skip Gore
Copyright (c) 1993, 1994 by George McCoy. All rights reserved.
SCANTUNE USER MANUAL Page 1
TABLE OF CONTENTS:
ScnBRAND:....... 3
WHAT IS SCANTUNE?....... 3
SCANTUNE USERS LICENSE AGREEMENT:....... 4
WARRANTY DISCLAIMER:....... 4
TUTORIAL:....... 6
ON-LINE HELP:....... 6
GETTING STARTED:....... 6
STARTING SCANTUNE:....... 7
CURSOR MOVEMENT KEYS INSIDE THE SCANTUNE MENU:....... 7
THE SCANTUNE MENU:....... 7
CURSOR MOVEMENT KEYS INSIDE THE PROOFREADER:....... 8
CONTINUOUS READ MODE:....... 9
REFRESH SCREEN:....... 9
PROOFREAD THE DOCUMENT:....... 10
COMMAND FUNCTIONS INSIDE THE PROOFREADER:....... 10
CLEAN UP SCANNER ERRORS:....... 15
LEARN MODE:....... 15
SPELLCHECKER:....... 17
BLANK LINES REMOVED:....... 20
STRIP UNNECESSARY LEADING SPACES:....... 20
INDENT PARAGRAPHS 5 SPACES:....... 20
TWO SPACES AFTER SENTENCES:....... 20
RIGHT MARGIN FOR WORD WRAP:....... 20
WORDWRAP TEXT:....... 20
MACROS:....... 21
MACRO ENTRIES:....... 21
EDIT LOAD/CREATE DELETE OPTIMIZE QUIT:....... 22
LOAD/CREATE MACROS:....... 23
DELETING MACROS:....... 23
OPTIMIZE MACROS:....... 23
CREATING MACROS WITH A TEXT EDITOR:....... 24
NON TEXT CHARACTERS REMOVED:....... 24
DECOLUMNIZE TEXT:....... 24
LOG CORRECTIONS:....... 25
MARK ALL OPERATIONS:....... 25
MARK STANDARD OPERATIONS:....... 25
UNMARK ALL OPERATIONS:....... 25
EXECUTE MARKED OPERATIONS:....... 25
ALARM:....... 25
QUIT TO DOS:....... 26
SCANTUNE FROM THE COMMAND LINE:....... 26
SCANTUNE FROM A BATCH FILE:....... 26
USAGE TIPS:....... 28
THE BLOCPARA MACRO:....... 29
THE PARA MACRO:....... 30
FOR YOUR INFORMATION:....... 30
TECHNICAL INFORMATION:....... 30
SAMPLE MACROS:....... 32
OB.stm MACROS:....... 33
TECHNICAL SUPPORT:....... 33
scnbrand:
In order to remove the demonstration version of the ScanTune
screen, type scnbrand and hit enter. This program will brand the copy of
ScanTune with your name and your registration number. NOTE: Always
make a backup copy of all programs in the event you somehow damage or
delete them. Bang Bang.
WHAT IS SCANTUNE?
ScanTune is a utility for cleaning up corrupted text files produced
by OCR systems.
That's the simple definition. Bang Bang.
To elaborate, ScanTune is a macro utility that cleans up scanner
errors, removes blank lines, strips unnecessary leading spaces, indents
paragraphs five spaces, retains two spaces after sentences, word wraps
text, allows the right margin to be modified, retains a log of
corrections made during specific operations, decolumnizes text, allows
selected operations to be performed inside of "blocked text" and if that
isn't enough, ScanTune has a built-in Spellchecker and Learn mode.
As an additional feature, ScanTune also removes nonstandard text
characters from files created by word processors that embed their own
codes, such as WordPerfect files, thus making them readable with
ScanTune and Study (ScanTune's companion).
But wait!! The heart and soul of the ScanTune program is the
intelligent mode which keeps track of OCR errors and their corrections.
This is accomplished with various editable (e d i t a b l e) macros. By
continually adding corrections to the various macro files, ScanTune will
make more and more corrections automatically, without your intervention.
This means that as ScanTune gains experience with your system, the
amount of editing you have to do in order to get good, readable text
becomes less and less.
With the "Learn Mode" feature ScanTune goes through your document,
macro files and the Spell Checker and builds a list of words it doesn't
recognize. When it's completed, based upon the speed of your machine,
ScanTune provides you with options to make the OCR file more readable.
ScanTune can even be operated from a batch file or from the command
line.
SCANTUNE USERS LICENSE AGREEMENT:
ScanTune is distributed as a demonstration version. Only 50K of an
OCR file can be manipulated until the program is registered, whereupon
the size of the file is unlimited. See the last screen of the ScanTune
program for details. As an introductory offer, GLM Enterprises is
offering both ScanTune and its companion Study for the low price of
$100.00. This price is subject to change without notice. Bang Bang.
If ScanTune is not what you've been looking for, simply delete the
program from your computer and thank you for trying ScanTune.
If on the other hand, you decide to register ScanTune you are
granted a limited single user license. You are permitted to use
ScanTune on any computer that you're working on. If others use the
computer they must register ScanTune. You can make as many copies as
you need as long as you are making them for your own needs.
Upon registration you will be entitled to technical support. See
the end of this document for details.
If you want a friend to try out ScanTune please give them the
demonstration version. Remember, your name and license number are in
the opening screen so passing around your licensed copy will also pass
around your name.
Because we have marketed ScanTune at a fraction of its worth we're
sure all will be able to afford their own copy.
If you continue to use ScanTune, you agree to the terms set forth
in this agreement.
If you are interested in a site license please contact GLM
Enterprises for details. That information is located at the end of this
documentation.
GLM Enterprises reserves the right to terminate any license at any
time for violating any aspect of this license agreement.
Any attempt to reverse engineer, decompile or disassemble ScanTune
is forbidden and violates copyright laws.
WARRANTY DISCLAIMER:
GLM Enterprises provides the ScanTune utility as is and without any
warranty. To the extent permitted under applicable law, GLM Enterprises
disclaims all warranties, express or implied. Specifically, GLM
Enterprises makes no representation or warranty that the software is fit
for any particular purpose.
GLM Enterprises shall not be liable for any damages resulting from
the use of this software, including but not limited to, loss of profit,
data or use of the software, or special, incidental or consequential
damages or other similar claims, even if GLM Enterprises has been
specifically advised of the possibility of such damages.
Some states do not allow the exclusion of incidental or
consequential damages, so the foregoing limitation may not apply to you.
GLM Enterprises believes the information in this publication is
accurate as of its publication date. This information is subject to
change at any time without notice.
TUTORIAL:
For those individuals who prefer tutorials, check out
"Tutorial.doc" on the distribution diskette. It is a step by step
tutorial to get you started quickly in the use of ScanTune.
ON-LINE HELP:
A unique feature has been implemented for ScanTune. Like many
other programs, help is available from anywhere within the program. By
hitting F1, a pop-up screen will appear displaying the information
pertaining to that particular feature. But, by hitting alt-F1, you can
re-write the help information to your liking. The alt-F1 is a mini-text
editor. All of the standard editing keys are available. Over-strike
mode is active upon entry. To re-write the entire text help screen, hit
pgup and begin typing. To save your new help file and escape from the
mini-text editor hit ctrl-w. You are back at the exact spot you were
before entering help. NOTE: If you decide to re-write the help files,
make sure that you save them when you upgrade to a new version of
ScanTune. Bang Bang.
GETTING STARTED:
First, double check to ensure that the following files are located
on your distribution diskette:
ScanTune.exe
SCTN.dbt
SCTN.hlp
SCTNhelp.hlp
SCdict.spl
ScanTune.man
Accent.stm
b2h.stm
Blocpara.stm
Default.stm
Ob.stm
Para.stm
Tutorial.raw
Tutorial.doc
Once you have determined that the files exist do the following:
1. Copy the contents of the distribution diskette to any location you
choose and unzip it. It doesn't make any difference to ScanTune where
it is located. However, if the location you choose is not in your path
statement as reflected in your autoexec.bat file, you'll have to correct
it. NOTE: All associated program files must be located in the same
directory in order for ScanTune to work properly.
STARTING SCANTUNE:
Now that the program files are ready, you need an OCR text file to
work on. We recommend that you rename the OCR file with a .txt
extension until you become familiar with the various extensions ScanTune
uses. NOTE: Do not start a session with a .org extension. Why?
Because whatever filename.ext you start with, ScanTune will copy it and
rename it with a .org extension. This was done to keep you from losing
or destroying the original file that has taken valuable time to scan.
ScanTune will then use the filename.ext you chose to begin. When you
exit ScanTune, a copy of the file just completed is copied and renamed
filename.bak so that you can always go back one step in case of a
problem. Once you have renamed the OCR file using the .txt extension,
and you're at the command line, type:
ScanTune filename.txt and hit enter. ScanTune will then display the
main menu.
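The file naming described above can be summarized in a short Python
sketch. It is only an illustration of the naming scheme and is not
ScanTune's own code. The function name session_file_names is
hypothetical, and the sketch assumes the .org and .bak copies keep the
base name of the file you started with.

```python
from pathlib import PureWindowsPath


def session_file_names(start_name):
    """Return the file names the manual describes for one session.

    Hypothetical helper: it models the naming scheme only and does not
    touch the disk.
    """
    path = PureWindowsPath(start_name)
    if path.suffix.lower() == ".org":
        # The manual warns against starting with a .org file, because
        # the protected copy of the original uses that extension.
        raise ValueError("do not start a session with a .org extension")
    return {
        "working": path.name,                        # file you work on
        "original": path.with_suffix(".org").name,   # copy made at startup
        "backup": path.with_suffix(".bak").name,     # copy made at exit
    }


print(session_file_names("chapter1.txt"))
```

Starting with chapter1.txt, the sketch reports chapter1.org as the
protected original and chapter1.bak as the one-step-back copy.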
Before we get into ScanTune's menu, you should be aware of various
cursor movement keys. They are as follows:
CURSOR MOVEMENT KEYS INSIDE THE SCANTUNE MENU:
Up and down arrows move up and down one line at a time.
Pgup and pgdn move up and down one screen at a time.
Ctrl-home highlights the first menu item on the screen.
Ctrl-end highlights the last menu item on the screen.
Ctrl-pgup puts cursor at first menu option.
Ctrl-pgdn puts cursor at last menu option.
Escape quits the ScanTune program and returns to DOS.
THE SCANTUNE MENU:
The ScanTune menu has two selective modes. One is immediate and is
activated upon selection. It has a star (*) to the right of the item
indicating that the selection is immediate. The other is the "marked"
mode. When you first select a marked option it is toggled from "no" to
"yes". Selecting the option again changes it back to "no". For your
convenience, after an option has been selected, the next menu item is
automatically highlighted. To skip a function hit down arrow. Nothing
will happen until the execute marked operation is selected.
The menu arrangement has been designed for optimum use and even
though it appears illogical, it's not. See for yourself:
Proofread a document (*)
Cleanup scanner errors no
Learn Mode (*)
Spell Check the Document (*)
Blank lines removed no
Strip unnecessary leading spaces no
Indent paragraphs five spaces no
Two spaces after sentences no
Right margin for word wrap 72 *
Word wrap text no
Macros *
Non text characters removed no
Decolumnize text no
Log corrections no
Mark all operations *
Mark standard operations *
Unmark all operations *
Execute marked operations *
Alarm Off
Quit to dos *
It's pretty straightforward. To navigate the menu, highlight an
item by arrowing to it or by typing its first letter. In the case of
items that have the same beginning letter just type the letter again
until the option you want is highlighted. Select or toggle the
highlighted item by hitting enter.
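The first-letter navigation just described can be pictured with a
small Python sketch. This is an assumption about the behavior, not
ScanTune's code: the function name next_by_letter is hypothetical, and
wrapping from the bottom of the menu back to the top is a guess that
the manual does not confirm.

```python
def next_by_letter(items, current, letter):
    """Return the index of the next item after `current` that starts
    with `letter`.

    The search wraps around the list (an assumption). If no other item
    matches, the current index is returned unchanged.
    """
    count = len(items)
    for step in range(1, count + 1):
        index = (current + step) % count
        if items[index].lower().startswith(letter.lower()):
            return index
    return current


menu = [
    "Proofread a document",
    "Cleanup scanner errors",
    "Learn Mode",
    "Spell Check the Document",
    "Blank lines removed",
    "Strip unnecessary leading spaces",
]
position = next_by_letter(menu, 0, "s")         # first "S" item
position = next_by_letter(menu, position, "s")  # typing "s" again
print(menu[position])
```

Typing "s" twice from the top of this shortened menu lands on the
second item that begins with "S".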
CURSOR MOVEMENT KEYS INSIDE THE PROOFREADER:
Up arrow moves the cursor to the previous line of text.
Down arrow moves the cursor to the next line of text.
Right arrow moves the cursor to the right one position.
Left arrow moves the cursor to the left one position.
Home moves the cursor to the beginning of the line.
End moves the cursor to the end of the line.
Pgup moves the cursor to the previous screen of text.
Pgdn moves the cursor to the next screen of text.
Ctrl-pgup moves the cursor to the beginning of the file.
Ctrl-pgdn moves the cursor to the end of the file.
Ctrl-right arrow moves the cursor to the beginning of the next word.
Ctrl-left arrow moves the cursor to the beginning of the previous word.
Ctrl-home moves the cursor to the top left corner of the screen.
Ctrl-end moves the cursor to the bottom right corner of the screen.
CONTINUOUS READ MODE:
Continuous read is where your text will automatically scroll
forward. You can continuously display text on the screen by using the
space bar. The rate at which the text moves up the screen can be
adjusted by using the left and right arrow keys. You may hit any key to
stop continuous reading and return to proofread mode.
To use this feature, follow these steps:
1. It is assumed that you have started ScanTune as outlined above and
have entered the Proofread mode.
2. Hit the space bar to begin continuous reading. The text will begin
to scroll slowly up the screen allowing you to sit back and listen or
watch for OCR errors.
3. Use the left and right arrows to adjust the scroll rate. The left
arrow slows down the scrolling of the text while the right arrow speeds
it up.
4. To stop reading, hit any key and you will be returned to proofread
mode.
REFRESH SCREEN:
There is a refresh command for voice synthesizer users who want
immediate access to the screen. These commands are a second way to
access the screen within ScanTune.
Since these commands are unique and created for a specific type of
user, a prefix key is used to access them. The prefix key is Slash (/).
If you hit the slash key and any letter from a through x, the
corresponding line will be redisplayed and your cursor will be moved to
that line. Slash (/) left arrow and slash (/) right arrow work like
ctrl-left arrow and ctrl-right arrow, in that they move to the previous
word and next word respectively.
The following slash commands are also available: /Y = Re-display
the last status line message. /Z = repaints the entire screen via voice
synthesis.
NOTE: The slash (/) key is not held down while other keys are being
hit. It is a prefix key only.
PROOFREAD THE DOCUMENT:
Finally we're ready. To start the proofreader highlight the
proofread the document * menu option and hit enter. Your document's
first screen of text is ready and waiting. NOTE: Some voice
synthesizers will start reading the screen automatically before you're
even ready. To stop this, see your voice synthesizer manual for
instructions. In many cases hitting the alt key will stop the reading.
Before you start proofreading, you should note that at your
fingertips you have numerous command functions that allow you to make
corrections to the loaded OCR file. They are:
COMMAND FUNCTIONS INSIDE THE PROOFREADER:
A = Search and replace. When the letter "A" is hit, ScanTune will
prompt: enter search string. Type the exact characters you want changed
or eliminated and hit enter. ScanTune will prompt: enter replacement
string for ... (the word or phrase you just typed in). Type in the
characters with which you want to replace the search string and hit
enter. ScanTune will prompt: all occurrences, words only cancel entry.
NOTE: If "A" is chosen, the replacement will be made every time the
error is encountered during the cleanup process. If "W" is chosen the
correction will be made only if the error is surrounded by spaces,
punctuation marks or appears at the beginning or end of a line. Cancel
entry will return you to the Proofreader. After you have made your
selection, ScanTune will prompt with: prompt before correcting y/n. By
selecting "y" ScanTune will prompt you about the correction during the
Cleanup process which will be discussed later. By selecting "n"
ScanTune will make the correction automatically during the cleanup
process. Make your selection and ScanTune will prompt: Temporary,
Universal, both cancel entry. NOTE: Temporary and universal are macro
files. One macro, as cited in the examples above, is part of a macro
file. Each file can hold many macros. For the time being consider the
two main macro files, temporary and universal. The temporary macro file
is where you should keep those errors and corrections that are
infrequent and pertain to that particular OCR file. The universal macro
file is named Default.stm and is where frequently occurring errors and
corrections produced by the OCR system are saved. More about temporary
and universal macro files later. Select an option by typing the letter
of your choice (remembering that the "c" will cancel the entry.) After
making your selection, you will be returned to the proofreader. The
correction will be made when you elect to run the Cleanup process.
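The difference between "all occurrences" and "words only" can be
illustrated with a short Python sketch. It is not ScanTune's own
matching code. The function name apply_macro is hypothetical, and the
sketch assumes that "surrounded by spaces, punctuation marks or the
beginning or end of a line" means that no letter or digit touches the
match on either side.

```python
import re


def apply_macro(text, search, replacement, words_only=False):
    """Replace `search` with `replacement` in `text`.

    words_only=False models the "all occurrences" choice.
    words_only=True models the "words only" choice.
    """
    pattern = re.escape(search)
    if words_only:
        # Assumption: a match counts as a word only when no letter or
        # digit sits directly before or after it.
        pattern = r"(?<![A-Za-z0-9])" + pattern + r"(?![A-Za-z0-9])"
    # A function is used as the replacement so that backslashes in
    # `replacement` are inserted literally.
    return re.sub(pattern, lambda match: replacement, text)


line = "tbe cat sat on tbe oatbed."
print(apply_macro(line, "tbe", "the"))        # every match is changed
print(apply_macro(line, "tbe", "the", True))  # "oatbed" is left alone
```

With "all occurrences" the letters inside "oatbed" are changed as
well, which is why "words only" is the safer choice for short search
strings.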
Period (.) plus letter toggles menu options on and off. This feature
allows you to toggle a menu item from the Proofread keyboard without
having to flip back and forth between the proofreader and the main menu.
You can enter either an upper or lower case letter, if it is preceded by
a period (.). The valid letters are:
.A = All options
.B = Blank lines removed
.C = Cleanup scanner errors
.D = Decolumnize
.I = Indent paragraph five spaces
.L = Log corrections
.M = Mark standard operations
.N = Nontext characters removed
.S = Strip leading spaces
.T = Two spaces after sentences
.U = Unmark all operations
and
.W = Word wrap.
To execute the options you have selected from the keyboard, you
must hit escape and return to the ScanTune menu before executing the
marked operations. NOTE: These options are available from the command
line and from batch files, but instead of a period (.) you must use the
slash (/). If you type "e" from the Proofreader keyboard, you will be
automatically placed into the editor for macros. And, you're not ready
for that yet. See Macros for details. The Spellchecker is not
available from the keyboard while in proofread mode. It can only be
accessed from the menu, the command line, or from a batch file.
B = Search backward in your file for any non case sensitive word or
phrase. When you type "B", you will be prompted: type backward search
phrase or escape to exit. At this point, ScanTune is waiting for you to
enter the information you would like to search for. Type in your search
phrase and hit enter. If your search is found, the found phrase will be
highlighted on the top line of your screen and the rest of your screen
will be filled with text. If the text is not found, a bell will sound
and a message to that effect will be displayed before you are returned
to where your search was initiated.
Shift-B = Case sensitive backward search. Whatever you type in is
what ScanTune will search for.
ALT-B = Continue same backward search pattern. If the previous
search was case sensitive, it will be continued as a case sensitive
search. The same applies for non case sensitive searches. This key
function will automatically search for the word or phrase you entered
the first time you initiated the search.
Colon (:) Inside the proofreader will display a file report which
includes size of file in bytes, if there are any lines longer than 80
characters, whether the file contains block style or indented paragraphs
or if there are a large number of blank lines in a row within your file.
This command helps you decide on which options to run to arrange the
file the way you want it.
Semi-colon (;) inside the proofreader displays the percentage read,
current file's name and size. It's a good way to know exactly where you
are in the file.
E = Editor for macros. By hitting "e" from the Proofread mode, you
are placed inside the macro editor where changes can be made to all
macros.
Alt-e = Proofread Mini-Editor. By activating the Proofread
Mini-Editor, you have the capability of making corrections to text in
memory while in Proofread mode. The keys that are available inside the
Proofread Mini-Editor are: Ctrl-W = Exits and saves changes. Ctrl-y =
Delete current line. Ctrl-t = Delete word right. Ctrl-b = Reform
paragraph. Ctrl-v = Toggle insert mode. NOTE: When the Proofread
Mini-Editor is activated you are automatically in insert mode. Escape =
Return to original document and do not save corrections.
Enter = Add the current word the cursor is on to a macro file. You
will be prompted for a replacement string, as you were when using the
"A" function. Follow the prompts as outlined above and in the Macros
section later in this manual.
Escape = Mark your place y n y. If you select y, or hit the enter
key which is the default, your position will be saved so that you can
come back to it later. If you hit n, ScanTune will prompt: remove place
marker y n n. NOTE: If you remove the place marker, ScanTune will be
unable to locate that same position in the current document.
F = Search forward in your file for any non case sensitive word or
phrase. When you type F, you will be prompted: type forward search
phrase or escape to exit. At this point, ScanTune is waiting for you to
enter the information you would like to search for. Type in your search
phrase and hit enter. If your search is found, the found phrase will be
highlighted on the top line of your screen and the rest of your screen
will be filled with text. If the text is not found, a bell will sound
and a message to that effect will be displayed before you are returned
to where your search was initiated.
Shift-F = Case sensitive forward search. Whatever you type in is
what ScanTune will search for.
ALT-F = Continue same forward search pattern. If the previous
search was case sensitive, it will be continued as a case sensitive
search. The same applies for non case sensitive searches. This key
function will automatically search for the word or phrase you entered
the first time you initiated the search.
ALT-G = Delete block. This key is only functional after you have
marked a block of text with alt-l. See alt-l for details.
Ctrl-g = returns you to the marked position when you entered
proofread mode.
I = Move to first line of next indented paragraph and display text
from that position.
SHIFT-I = Move to the first line of the previous indented paragraph
and display text from that position.
TAB = Move to next block style paragraph (preceded by a blank line)
and display text from that position.
SHIFT-TAB = Move to previous block style paragraph and display text
from that position.
J = Jump to DOS. NOTE: Remember, you have to type "Exit" from the
DOS command line to return to ScanTune.
ALT-J = Joins two lines placing the cursor on the second line.
Your cursor can be anywhere on the line when this function is activated.
It will set an internal mark and put your cursor at the beginning of the
next line. The lines and any exact duplicate lines will be joined
during the next cleanup operation.
L = Split current line in two. Place your cursor on the last
character of what you want to be the topmost of the two new lines before
hitting l. This line and all exact duplicate lines will be split during
next cleanup.
ALT-L = Sets marks for delete block, move block and allows user
selected options to be performed on that block. Put your cursor on the
character where you want the mark to begin. Hit alt-l. Move the cursor
to the end of the block placing your cursor on the last character you
want marked. Hit alt-l again. The block is marked. You can delete the
block with alt-g, move it with alt-m or you can return to the ScanTune
menu and select an option to be performed within the marked block.
CTRL-L = Deletes line and all existing lines that are exact
duplicates. Put your cursor anywhere on the line you want deleted.
Remember that it will delete ALL lines that are exact duplicates, so be
careful.
M = Marks text string. To mark a few words or a phrase that is not
quite correct, put your cursor on the first character you want marked
and hit "m". Move your cursor to the last character you want to mark
and hit the letter "M" again. The marked text becomes the search
string. You are prompted for a replacement string. Follow the prompts
as outlined elsewhere. NOTE: You can cross line boundaries with this
feature.
ALT-M = Moves a block of text. Mark a block of text with the alt-l
key as described above. Place the cursor on the character to the left
of where you want the block to be moved and hit alt-m. The text is
deleted from its original position and inserted to the right of the
current cursor position.
P = Move to next page, if a form feed (ctrl-l) is present. This
function will not work if a form feed (ctrl-l) is not present.
SHIFT-P = Move to the previous page that is marked by a form feed
character (ctrl-l).
ALT-P = This command will ask you for the page number you want to
go to. Type in 26 for example. You will be taken to the 26th page with
your cursor on the top line of that page. When using this command, you
can precede the page number with a plus (+) or dash (-). Plus (+) will
take you forward from your current position the number of pages you
specify. Dash (-) will take you backward from your current position the
number of pages you specify.
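The page commands above can be sketched in Python as follows. This is
an illustration under stated assumptions, not ScanTune's code: the
function names target_page and page_starts are hypothetical, pages are
assumed to be separated by form feed characters, and no range checking
is done.

```python
def target_page(entry, current_page):
    """Resolve an ALT-P style entry to an absolute page number.

    "26" means page 26, "+3" means three pages forward from the
    current page, and "-2" means two pages back.
    """
    entry = entry.strip()
    if entry.startswith("+"):
        return current_page + int(entry[1:])
    if entry.startswith("-"):
        return current_page - int(entry[1:])
    return int(entry)


def page_starts(text):
    """Return the offset at which each page begins.

    A new page starts right after every form feed (ctrl-l) character.
    """
    starts = [0]
    for offset, char in enumerate(text):
        if char == "\f":
            starts.append(offset + 1)
    return starts


print(target_page("+3", 5))
print(page_starts("page one\fpage two\fpage three"))
```

A file with no form feeds yields a single page, which matches the
note that the page keys do nothing when no form feed is present.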
R = Repeat the word that the cursor is on. To hear the word
spelled hit "r" again. To hear the word spelled phonetically hit it a
third time. NOTE: There is a two second delay built into this feature
so the spelling of a suspected word can be more understandable. If the
delay is too long for you just hit any key.
ALT-U = Unmark text or block. This command erases the marks made
with the alt-l key without doing anything to the blocked text. This key
is used when you have marked a block of text and then decide you don't
really want it marked after all.
ALT-X = quit program and exit to dos. ScanTune will prompt: mark
your place y n y. If you select y, or hit the enter key which is the
default, your position will be saved. If you hit n, ScanTune will
prompt: remove place marker y n n. NOTE: If you remove the place
marker, ScanTune will be unable to locate that same position in the
current document.
NOTE: When you exit ScanTune and answer the prompts correctly, your
corrections and any modifications you have made to the current file will
be retained in the temporary or universal macro file as selected.
CLEAN UP SCANNER ERRORS:
This is a "marked" option. It is executed when the "e" execute all
marked operations option is selected. It is the first option performed
when "execute" is selected. The clean up option searches the source
file for all errors in the currently loaded macro files and corrects
them according to their execution instructions. If you have some
prompted corrections in the macro files, you will need to tend the
process and answer the prompts. Your prompts are:
y n v r c q
Which means: "Y" = yes, make the correction. "N" = no, don't make
the correction. "v" = vary context character length (from current 78 to
any number from 1 to 255). "R" = make the remainder of the changes
everywhere from current position to the end of the file. "c" = allow
change to be entered manually. "Q" = don't make the change or ask me
about it anymore this cleaning session. These are hot keys. See macros
for more details. Otherwise the process is automatic. As the text is
corrected, ScanTune reports the percentage and the number of
corrections. The number of corrections is incremental and is not the
number of corrections for that percentage of the file that has been
cleaned. When ScanTune is finished making corrections and changes, it
will prompt: flush temporary macro file (y/n). If you hit "Y", all
entries in the temporary macro file will be deleted. If you hit any
other key the temporary macro file is retained. NOTE: If you retain the
temporary file, be sure to delete it when you have finished with that
named OCR file. The temporary macro file is the name of the OCR file
with the .stm extension. For best results, place only those entries
that you want permanently saved into the universal macro file. See
macros for more details.
For fastest results during cleanup use macro entries that correct
errors and/or format with the least number of changes possible.
A feature has been included that will allow the cleaning process to
begin at the point where you last exited proofread mode. This is a
handy feature since portions of a book or manual are formatted
differently than the body of the text. When you select the cleanup
scanner error function, ScanTune will prompt: begin cleaning at place
marker (y/n). If you hit "Y", cleanup begins at the place marker. If
you hit "n" cleanup begins at the beginning of the file. The prompt is:
mark your place y n y. NOTE: The Escape key will stop the cleanup
process and prompt: Save what we've already done y/n. If you select
"y" ScanTune will save the cleaned portion and combine it with the
uncleaned portion for a later cleanup session. If you select "n" that
particular session is aborted and nothing is saved.
NOTE: You can activate log corrections during the cleanup operation
to see exactly what ScanTune is doing. This could be useful for many
things.
LEARN MODE:
To start the Learn mode highlight Learn Mode (*) at the main menu
and hit enter. Your document is already loaded. The ScanTune program
will load the macro files you have selected (see macros for details), or
default.stm at the very least and the Spell Checker. ScanTune will
prompt: Creating Learning environment. Please standby. At this point
ScanTune is creating a .lrn file which is unreadable with a text editor,
loading the macro files you have selected or the default.stm at the very
least and the Spell Checker. ScanTune will prompt: Analyzing text. The
process is time consuming, so be prepared to do something else after you
select this option. You could run this option at night while you're
sleeping. It all depends on the length of the document and the speed of
your machine. Using this combination, ScanTune is creating a list of
words, found in your document, that are unrecognizable as compared with
your macro files and the Spell Checker. The list shows: number of
exceptions in number listings. Accuracy number.number (percent). Under
that ScanTune lists the words found and the number of times they occur
in the document. ScanTune also shows: Highlight item with arrows.
Enter selects, Escape exits. At this point you have the option of
ignore, view/correct, dictionary, macro or quit.
Ignore = Ignore this word and clear it from the list. It could be a
proper name and you might not want it added to the dictionary. The list
reappears without the word you ignored. The next item moves to the top
of the list and is highlighted.
View/Correct = By selecting this option ScanTune will move to the first
location, in the document, where the word is found. You are now in the
Proofreader. The cursor is placed on the first occurrence of the
exception and all command functions inside the proofreader are again
available. For example: By hitting "e" you are placed inside the macro
editor where changes can be made to all macros. You can also hit alt-f
and move forward to the next occurrence of the highlighted word. We
chose to place you into the Proofread mode so that you could get
contextual clues on things that are not obvious in and of themselves as
reflected on the exception list. You can also enter the editor, at this
point, by hitting alt-e where you can make editing changes using the
following cursor movement commands: Ctrl-W = Exits and saves changes.
Ctrl-y = Delete current line. Ctrl-t = Delete word right. Ctrl-b =
Reform paragraph. Ctrl-v = Toggle insert mode. NOTE: When the
Mini-Editor is activated you are automatically in insert mode. When you
hit Ctrl-W or the escape key you are placed back into the proofreader.
Hit escape again and choose Learn mode to get back to the list.
Dictionary = By selecting this option, you are indicating to ScanTune
that you want the word, found on the list, to be added to the
dictionary.
Macro = By selecting this option you can hit the enter key and the word
will be added to the search string (see Macros for details.) ScanTune
will then prompt: enter replacement string. Type in your replacement
and ScanTune will prompt: all occurrences or words only. Make your
selection and ScanTune will prompt: prompt before correcting y/n.
Again, make your selection. ScanTune will prompt: Temporary universal
both cancel. (See macros elsewhere for details.) NOTE: The "both"
option has been added here in the event you might want to add the macro
to both files thereby making the word appear on the next cleanup in your
default.stm macro file. To get back to the list hit escape.
Quit = Takes you back to the main menu.
NOTE: A feature has been added here that will spell the word the cursor
is resting on. To access the word spelling function hit alt-w and the
word will be spelled. Alt-I will ignore the word, alt-d will add the
word to the dictionary, alt-m places you in the macros function while
alt-q quits back to the main menu. NOTE: If you highlight an item and
use the alt-xx hot keys, you don't have to hit the enter key or deal
with the menu bar.
SPELLCHECKER:
To start Spellchecker highlight the Spellchecker * menu option and
hit enter. This is an immediate option. Spellchecker will prompt:
enter minimum word length to be checked. The default is three. By
selecting three, you are telling Spellchecker not to bother with
checking one and two lettered words. You can change this at your
discretion. Select the number you wish and hit enter, or hit enter to
accept the default. NOTE: Spellchecker does not check for numbers!
Spellchecker will then prompt: loading dictionary and macro files,
please standby. NOTE: The first time that the Spellchecker is used, it
will create a file called scdict.ntx. This is the index for the
spelling dictionary and will take some time, depending on the speed of
your machine. When Spellchecker is ready, it will report:
Spellchecking.
When Spellchecker finds a suspected word, it has already checked
the dictionary and the macro files, so it's up to you on what you want
to do. The suspected word will be displayed on the first line of the
screen, in the upper left-hand corner.
You have several options at this point. Consider:
Add word to dictionary
Add word to macro file
Suggested spellings
Correct word manually
Ignore word
Mark word
Repeat word
Display sentence
Prompts off
Maintain dictionary
Unattend no
Log to disk yes
Vary Context Length
Quit the spellchecker
To select a menu item hit enter on the highlighted option or arrow
down to your selection. NOTE: The corresponding highlighted lettered
option is available from the keyboard, but the enter key has to also be
hit to activate that selection. The only exceptions to this are the
letters "d", "I", "r" and "s". See below for details.
By selecting add word to dictionary, Spellchecker automatically
appends that word to the currently loaded dictionary.
By selecting add word to macro file, Spellchecker will prompt:
enter replacement string for.... (See entering macros for more
details). Type in your replacement string and follow the prompts.
By selecting suggested spelling, ScanTune displays a list of words.
Use the up and down arrow keys to select the word you want. If you
don't find the word you are looking for, select "none of the above" and
choose suggested spellings again. More words will appear. If you find
the word you want, select it by hitting the enter key. That word will
replace the word at your cursor.
By selecting correct word manually, Spellchecker will prompt: enter
correction or escape to exit.
By selecting Ignore word, Spellchecker ignores the word and returns
to searching your document for the next suspected word.
By selecting mark word, Spellchecker places two (2) accent marks at
the beginning of the word. Later you can search for these characters in
the Proofreader using the functions outlined elsewhere.
By selecting repeat word, Spellchecker will repeat the word. By
selecting repeat word for a second time, the word will be spelled. By
selecting repeat word for the third time, the word will be spelled
phonetically. A two second delay has been built into this feature so
that the spelling of a suspected word can be more understandable. If
this delay is too long for you just hit any key.
By selecting display sentence, Spellchecker will repeat the
sentence in which the suspected word was found.
By selecting the prompts option, Spellchecker will switch from
announcing the word and the sentence to just announcing the word. If
you wish to switch back, just hit enter on this prompt and when the next
suspected word is found the process will be reversed. NOTE: When
Spellchecker is first activated the default is off resulting in only the
suspected word being spoken.
By selecting maintain dictionary, Spellchecker prompts: add delete
quit. You can either add a word from the keyboard or delete a word from
the keyboard. When you delete a word from the dictionary, relax and
have a cup of coffee, or something. ScanTune is deleting the word, but
in the process has to rewrite the scdict.ntx file.
By selecting unattend mode, Spellchecker will announce the
suspected word and place two accent marks in front of the word, so you
can attend to them later. NOTE: by turning on the log corrections
option in the ScanTune menu or inside the Spellchecker, Spellchecker
will place the suspected word in the ScanTune.log file, so that you can,
at a later time, use the list to write correcting macros, prepare a list
of words to be added to the dictionary, or whatever. The sentence in
which the suspected word was found will also be written just below the
suspected word in the ScanTune.log file if you have prompts on. NOTE:
If you utilize the unattend mode, and are not creating a list with the
log corrections option, you need to use the accent.stm macro to strip
out the two accent marks that were placed there while you were away from
your machine. NOTE: This macro should be the only macro loaded when you
are finished spellchecking your document, or at the very least should be
the last macro loaded if run with other macros. The alt-Rightbracket
key is a toggle that turns unattend mode on or off.
Your computer continues to Spellcheck the document. In this way you can
start a spelling session and type alt-Rightbracket and let the computer
work while you do something else, and if you want a record kept, don't
forget to turn log corrections on. When you come back press
alt-Rightbracket again and continue from the location the spellchecker
is presently working.
By selecting log to disk, you can turn logging on here, instead of
having to do so in ScanTune's main menu. NOTE: The feature is available
in both places for your convenience.
By selecting vary context length, ScanTune will prompt: enter
minimum context length 78. Currently there are 78 characters being
displayed contextually, but you can change this all the way up to 255
characters. NOTE: This feature stays active with the number entered
until changed or a new spelling session is begun.
By selecting quit, Spellchecker prompts: mark your place y n y.
Answer "y" if you want to come back later and do more spellchecking from
where you left off. NOTE: after you have used the spellchecker on a
file and answered this question, the next time, Spellchecker will prompt
you at the beginning of the session on whether or not you want to start
from the previous position. ScanTune will save the corrected document
and return you to the main menu.
NOTE: The Escape key will stop the Spellchecker and prompt: Save
what we've already done y/n. If you select "y" ScanTune will save the
corrected portion and combine it with the uncorrected portion for a
later spellchecker session. If you select "n" that particular session
is aborted and nothing is saved.
When logging spellchecking activity, the phrase "suspected word"
precedes the suspected word.
BLANK LINES REMOVED:
When this option is turned on, all single blank lines are removed.
Also, all groups of blank lines are deleted and replaced with a single
blank line. This option should only be run once on each file. Running
it twice or more will cause all blank lines to be removed. This is a
"marked" option. To turn it on and off, highlight it and hit enter. It
runs when the "execute" (e) option is selected.
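The rule described above can be sketched in Python. This is only an illustration of the documented behavior, not ScanTune's own code; the function name remove_blank_lines is hypothetical. It also shows why a second run removes every remaining blank line.

```python
def remove_blank_lines(lines):
    """Drop single blank lines; collapse groups of blank lines to one."""
    out = []
    i = 0
    while i < len(lines):
        if lines[i].strip() == "":
            # Find the end of this run of blank lines.
            j = i
            while j < len(lines) and lines[j].strip() == "":
                j += 1
            if j - i >= 2:
                out.append("")  # a group collapses to one blank line
            # a single blank line is dropped entirely
            i = j
        else:
            out.append(lines[i])
            i += 1
    return out


text = ["Title", "", "First line", "", "", "", "Second paragraph"]
once = remove_blank_lines(text)
twice = remove_blank_lines(once)
print(once)   # ['Title', 'First line', '', 'Second paragraph']
print(twice)  # ['Title', 'First line', 'Second paragraph']
```

After the first pass only the group of three blank lines survives, as a single blank line. The second pass sees that survivor as a "single blank line" and removes it too.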
STRIP UNNECESSARY LEADING SPACES:
The leading spaces stripper removes all leading spaces from all
lines except for paragraph indentions. Centered text becomes left
justified. This is a "marked" option. To turn it on and off, highlight
it and hit enter. It runs when the "execute" (e) option is selected.
INDENT PARAGRAPHS 5 SPACES:
The paragraph indenter makes sure every paragraph begins with 5
spaces. This is a "marked" option. To turn it on and off, highlight it
and hit enter. It runs when the "execute" (e) option is selected.
TWO SPACES AFTER SENTENCES:
The sentence option makes sure that every sentence is followed by
exactly two spaces and that a lettered initial is followed by only one.
It is useful for reading with voice synthesizers. Many of them read
better if there are two spaces after sentences. This is a "marked"
option. To turn it
on and off, highlight it and hit enter. It runs when the "execute" (e)
option is selected.
RIGHT MARGIN FOR WORD WRAP:
Selecting this option allows you to enter a new right margin for
word wrapping. The new setting remains in effect until you exit the
program or change it during a session. The default is 72. This is an
immediate operation. You are prompted for a new setting right after you
select the option. The margin can be changed to any number up to 255.
We recommend that the default be used for best results with Study, the
companion to the ScanTune utility.
WORDWRAP TEXT:
The word wrapper ensures that no line will be longer than the right
margin. It also "unsyllabifies" words that are divided by a dash at the
ends of lines. At present, this is a "line wrap". This means that it
looks at every line, rather than an entire paragraph when it wraps.
This causes some very short two or three-word lines. We enabled this
kind of wrapping because paragraph wrapping can recolumnize decolumnized
text and do it in such a way that it can't be easily undone. This is a
"marked" option. To turn it on and off, highlight it and hit enter. It
runs when the "execute" (e) option is selected.
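A "line wrap" of the kind described above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not ScanTune's own routine: the function names are hypothetical, the standard textwrap module stands in for the real wrapper, and the rejoining step simply assumes that any letter followed by a dash at the end of a line marks a divided word.

```python
import textwrap


def rejoin_hyphenated(lines):
    """Rejoin words that were divided by a dash at the end of a line."""
    out = []
    carry = ""  # first half of a divided word, waiting for the next line
    for line in lines:
        if carry:
            line = carry + line.lstrip()
            carry = ""
        if len(line) > 1 and line.endswith("-") and line[-2].isalpha():
            # Move the fragment down so it joins its second half.
            head, _, carry = line[:-1].rpartition(" ")
            if head:
                out.append(head)
        else:
            out.append(line)
    if carry:
        out.append(carry + "-")  # dangling fragment on the last line
    return out


def line_wrap(lines, margin=72):
    """Wrap each line on its own; lines are never merged with neighbors."""
    out = []
    for line in lines:
        if len(line) <= margin:
            out.append(line)
        else:
            out.extend(textwrap.wrap(line, width=margin))
    return out


wrapped = line_wrap(["The quick brown fox jumps over", "the lazy dog."], 20)
print(wrapped)  # ['The quick brown fox', 'jumps over', 'the lazy dog.']
print(rejoin_hyphenated(["the docu-", "ment is long"]))
```

Because each line is wrapped by itself, the leftover "jumps over" stays on its own short line instead of flowing into the next one. That is the trade-off the manual describes.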
MACROS:
What is a macro?
A macro is an instruction to the computer to perform repetitive
corrections, such as: spelling, OCR errors and formatting commands. In
addition, macros may be used to execute a series of complex search and
replace sequences. A macro file contains numerous individual entries
and each is called a macro. A ScanTune macro possesses a special
structure:
1. A search string. A letter, word or phrase.
2. A delimiter. A character marking the beginning or end of a unit of
data.
3. A replacement string. A letter, word or phrase.
4. Another delimiter.
5. An execution instruction.
6. A hard carriage return line feed pair.
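As an illustration of the structure listed above, here is a minimal Python sketch that splits one macro line into its three parts. It is not part of ScanTune. The names Macro and parse_macro_line are hypothetical, and the sample entry (including the use of "W" as the stored instruction letter) is an assumption based on the menu choices described in this manual.

```python
from typing import NamedTuple


class Macro(NamedTuple):
    search: str
    replacement: str
    instruction: str


def parse_macro_line(line):
    """Split 'search|replacement|instruction' on the bar delimiter."""
    fields = line.rstrip("\r\n").split("|")
    if len(fields) != 3:
        raise ValueError(
            f"expected 3 bar-delimited fields, got {len(fields)}: {line!r}"
        )
    return Macro(*fields)


m = parse_macro_line("bave|have|W\r\n")
print(m.search, m.replacement, m.instruction)  # bave have W
```

The sketch assumes the bar never appears inside a search or replacement string. The manual does not say how ScanTune would handle that case.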
MACRO ENTRIES:
You can add an entry to the temporary or currently loaded universal
macro file from either the proofreader or the macro editor. ScanTune
will prompt: enter search string. Type the exact characters you want
changed or eliminated and hit enter. ScanTune will prompt: enter
replacement string. Type in the characters with which you want to
replace the search string and hit enter. If you want the replacement
string empty, hit enter. ScanTune will prompt: "search string will be
deleted everywhere! Are you sure you want to do this (y/n)". If the
answer is yes, ScanTune will prompt: all occurrences words only cancel
entry. If "A" is chosen, the replacement will be made every time the
error is encountered during the cleanup process. If "W" is chosen the
correction will only be made if the error is surrounded by spaces,
punctuation marks or appears at the beginning or end of a line. Make
your selection and ScanTune will prompt: prompt before correcting y n n.
If you select "y" ScanTune will ask before making each correction. If
you select "n" ScanTune will make the correction without asking.
ScanTune will then prompt: temporary universal both cancel entry. Make
your selection.
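The difference between "all occurrences" and "words only" can be sketched in Python. This is an approximation for illustration, not ScanTune's own matching code. The function name apply_macro is hypothetical, and "surrounded by spaces, punctuation marks or the beginning or end of a line" is approximated as "not touching a letter or digit".

```python
import re


def apply_macro(text, search, replacement, words_only=False):
    """Apply one case-sensitive search/replace entry to text."""
    if not words_only:
        # "All occurrences": replace the string wherever it appears.
        return text.replace(search, replacement)
    # "Words only": the match may not touch a letter or digit on either side.
    pattern = r"(?<![A-Za-z0-9])" + re.escape(search) + r"(?![A-Za-z0-9])"
    return re.sub(pattern, lambda match: replacement, text)


sample = "You bave the bavens."
print(apply_macro(sample, "bave", "have"))                   # You have the havens.
print(apply_macro(sample, "bave", "have", words_only=True))  # You have the bavens.
```

With "words only" the entry leaves "bavens" alone, because there the search string is followed by another letter.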
NOTE: During the cleanup process, when a prompted correction is
displayed, you will be prompted: y n v r c q. These are hot keys:
"Y" = yes, make the correction.
"N" = no, don't make the correction.
"V" = vary context character length (from current 78 to any number
from 1 to 255).
"R" = make the remainder of the changes everywhere from current
position to the end of the file.
"C" = allow change to be entered manually.
"Q" = don't make the change or ask me about it anymore this cleaning
session.
There are two types of macro files. One is the Universal macro
file. It is called default.stm and is where you should keep errors and
corrections obtained from various OCR files. They can include spelling
errors, and any other error that you want to keep on a permanent basis.
For example, say your OCR exchanges the letter "b" for the letter "h"
and you end up with bave, instead of have. The default.stm would be the
appropriate place for this correction. The second type is the Temporary
macro file. It has the same name as the document, but with an .stm
extension. The file is in the same directory as the document to which
it belongs. This is where you should keep those errors and corrections
that are unique to that particular OCR file. These can also include
spelling errors and any other correction you might want to make.
You can include high ASCII and control characters in your search
and replacement strings. To enter a control character, type a caret
(shift-6) followed by the letter of the control character. To insert a
ctrl-l (ASCII 12) into a search or replacement string you would type:
^l. Similarly, ctrl-a is ^a while ctrl-z is ^z. If you want to search
for characters with other ASCII values, type a caret followed by a
number representing the ASCII value of the character you want. To enter
the numeric value of the soft carriage return (ASCII 141), for example,
you would type ^141. If you need to search for or replace with a
caret, put two of them side by side. The first one will be ignored and
the second one will be retained in the string. The search string and
the replacement string are case sensitive, so what you enter is what
you get.
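The caret notation described above can be decoded with a few lines of Python. This is a sketch for illustration, not ScanTune's own parser. The function name decode_carets is hypothetical, and reading at most three digits after a caret is an assumption, since the manual does not say how a number is told apart from digits that follow it.

```python
import re

# A caret followed by another caret, one letter, or one to three digits.
_CARET = re.compile(r"\^(\^|[A-Za-z]|\d{1,3})")


def decode_carets(s):
    """Expand ^letter, ^number and ^^ into the characters they stand for."""
    def expand(match):
        token = match.group(1)
        if token == "^":
            return "^"                        # ^^ is a literal caret
        if token.isdigit():
            return chr(int(token))            # ^141 is the character 141
        return chr(ord(token.upper()) - 64)   # ^l is ctrl-l, ASCII 12
    return _CARET.sub(expand, s)


print(repr(decode_carets("^l")))      # '\x0c'
print(repr(decode_carets(".^m^j")))   # '.\r\n'
print(repr(decode_carets("^^")))      # '^'
```

Under these rules ^m^j is the carriage return line feed pair (ASCII 13 and 10) and ^t is ctrl-t (ASCII 20).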
By choosing the "Macros" function at the menu, you will be shown:
EDIT LOAD/CREATE DELETE OPTIMIZE QUIT:
To select the edit function, hit either the letter "E" or enter.
ScanTune will prompt: Temporary universal quit. Select the macro file
to edit by typing "u", "t" or "q". If you type "u" you will edit the
currently loaded universal macro file. If you choose "t", you will edit
the current temporary macro file.
For this example, let's select universal. Select universal by
either hitting the letter "U" or enter. ScanTune will prompt: loading
macro file. A list of the search strings contained in your selected
macro file appears. Select the entry to edit by arrowing down to it and
hitting enter. ScanTune will prompt: revise delete quit. Select an
option by typing "r", "d" or "q". If you choose "r", you will be
prompted for a new replacement string and execution instruction as you
are in the proofreader. See macro entries for a complete explanation.
The entries list reappears. If you choose "d", you will be asked if you
are sure you want to delete the selected entry. Hit "y" for yes and "n"
for no. The entries list reappears. If you choose "q", the Macros menu
reappears.
LOAD/CREATE MACROS:
It is possible to load a macro file other than the default as the
universal macro file. This option will allow you to load multiple macro
files for use when you next run cleanup. It is used to combine special
formatting tasks and error correction in one run of the cleanup process.
For example, you could use it to run para.stm followed by default.stm.
The various .stm files will be discussed later in this manual. You can
arrange macro files any way you wish. Arrow to load/create and hit
enter. At this point ScanTune will display a list of macro files as
they appear inside the directory where ScanTune.exe is located. The
list will not include the .stm extension. This should make it easier to
understand. Make your selection by arrowing down to it and hitting
enter. ScanTune will prompt: Adding filename.stm to the load list. You
can keep on making more selections adding to the list. It's your
choice. NOTE: ScanTune processes the macro files from top to bottom in
the load list, so make sure you have your macro files in the order you
want them. If you enter one by mistake, select it again and ScanTune
will prompt: removing filename.stm from the load list. The macro file
is unloaded and all subsequent macro files are moved up one place in
line. To inform ScanTune that you are satisfied with the arrangement
hit F10. ScanTune will then report: writing new load list ... and
repeat load create in case you changed your mind. To return to the
ScanTune menu hit escape.
To create a macro file of your own, arrow down to *create and hit enter.
You will be prompted to enter a name for the new macro file. Choose a
name with not more than eight characters and do not type the .stm
extension. A new macro file is created with the name you entered and it
is loaded as the universal macro file. ScanTune prompts: load/create.
Your new macro file has been created, but there are no entries inside.
It is empty. To make entries into your new macro file arrow up to edit,
hit enter and choose the universal macro file. You will be prompted: do
you want to add entries yes or no. If you select "no" ScanTune will
return to the menu. If you choose "yes" ScanTune will prompt: enter
search string. Follow the same procedure as when you added entries from
the proofreader. See the section on macro entries for more details.
When you are finished adding entries, hit enter when prompted to enter
search string. You will be returned to macros.
DELETING MACROS:
To delete a macro file select the delete option from the Macros
menu. A list of macro files will appear. Arrow down to the one you
want to delete and hit enter. ScanTune will prompt: delete macro file
c:\pathfilename\macro file name (y/n). Hit "Y" and hit enter to delete
the macro file. The Macros menu reappears. Be extremely careful with this
one. Once it has been deleted, it's gone!!
OPTIMIZE MACROS:
This option has been added to check your macros for duplicates and
for macros that do nothing. Select Optimize from the menu and a list of
macro files will be displayed. Make your selection by moving up and
down through the list with the up and down arrows. To choose a macro
file hit enter. Optimize will check that macro file and make any
changes according to internal rules of the macro editor.
A macro that does nothing might look like this:
ber|ber|A
As you can see this macro does absolutely nothing, but rack up
changes during the cleanup process. Since macros are sometimes
difficult to read and since users like to create their own in text
editors, this option is essential for the ScanTune program to function
as designed. Therefore it is a good idea to periodically optimize your
macro files.
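The kind of check Optimize performs can be sketched in Python. The actual "internal rules of the macro editor" are not documented here, so this is only a guess at the two cases named above: exact duplicates and entries whose search and replacement strings are identical. The function name optimize and the sample entries are hypothetical.

```python
def optimize(macro_lines):
    """Drop exact duplicate entries and entries that change nothing."""
    seen = set()
    kept = []
    for line in macro_lines:
        fields = line.split("|")
        if len(fields) != 3:
            kept.append(line)  # leave anything unrecognized untouched
            continue
        search, replacement, _instruction = fields
        if search == replacement:
            continue  # e.g. ber|ber|A replaces a string with itself
        if line in seen:
            continue  # exact duplicate of an earlier entry
        seen.add(line)
        kept.append(line)
    return kept


entries = ["bave|have|W", "ber|ber|A", "bave|have|W", "tbe|the|A"]
print(optimize(entries))  # ['bave|have|W', 'tbe|the|A']
```

The order of the surviving entries is preserved, which matters because macros run from top to bottom.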
CREATING MACROS WITH A TEXT EDITOR:
If you are intent on creating your own macros with an editor or
word processor, make sure that it is a text-based program and that it
does not embed its own code. ScanTune will not recognize it and it
could create a problem you can't fix. Follow the procedure outlined in
"Technical Information" and remember that the delimiter cannot, I
repeat, cannot be changed. ScanTune will only recognize the bar (|)
symbol!! We recommend, however, that you manage macro files from the
ScanTune program to avoid problems.
NON TEXT CHARACTERS REMOVED:
The Nonstandard text option removes all characters with ASCII
values above 126 or below 32 except for carriage returns, line feeds and
form feeds. It converts soft carriage returns (ASCII 141) to carriage
return/line feed pairs and it strips WordPerfect headers and codes,
making the output from a WordPerfect file readable by any text reader.
This option runs rather slowly because it checks each byte of the entire
file. Therefore, don't run it unless you're pretty sure you need it.
This is a "marked" option. To turn it on and off, highlight it and hit
enter. It runs when the "execute" (e) option is selected.
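The byte-by-byte filtering described above can be sketched in Python. This illustrates only the character rules and is not ScanTune's own code: the function name strip_non_text is hypothetical, and the WordPerfect header and code stripping is left out because its details are not given in this manual.

```python
KEEP_CONTROLS = {10, 12, 13}   # line feed, form feed, carriage return
SOFT_RETURN = 141              # soft carriage return


def strip_non_text(data: bytes) -> bytes:
    """Keep ASCII 32-126 plus CR, LF and FF; turn 141 into a CR/LF pair."""
    out = bytearray()
    for b in data:             # every byte is examined, hence the slowness
        if b == SOFT_RETURN:
            out += b"\r\n"
        elif 32 <= b <= 126 or b in KEEP_CONTROLS:
            out.append(b)
        # anything else is dropped
    return bytes(out)


raw = b"caf\x82 \x07au lait\x8dnext\x0c\r\n"
print(strip_non_text(raw))  # b'caf au lait\r\nnext\x0c\r\n'
```

In the sample, the high ASCII byte and the bell character are removed, the soft return becomes a carriage return/line feed pair, and the form feed is kept.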
DECOLUMNIZE TEXT:
The decolumnizer stacks adjacent columns from left to right with
the left-hand column on top. If your OCR software does decolumnization
to your satisfaction, you should go that route rather than use the
ScanTune decolumnizer. The decolumnizer was designed for use with OCR
systems which do not decolumnize or do so erratically. This is a
"marked" option. To turn it on and off, highlight it or hit ".D" and
hit enter. It runs when the "execute" (e) option is selected.
The decolumnizer can even be used inside of blocked text. See
alt-l on blocking text and executing marked operations.
NOTE: The Escape key will stop the decolumnizing process and
prompt: Save what we've already done y/n. If you select "y" ScanTune
will save the decolumnized portion and combine it with the
undecolumnized portion for a later decolumnizing session. If you select
"n" that particular session is aborted and nothing is saved.
LOG CORRECTIONS:
This is a "marked" option. To turn it on and off, highlight it and
hit enter. It only works with cleanup scanner errors, optimizing macros
and the Spellchecker. It can be an extremely handy tool. This feature
is also accessible from inside the Spellchecker.
MARK ALL OPERATIONS:
This option turns on all formatting and cleaning options. It is an
immediate function. All operations will be run when the execute (E)
option is selected.
MARK STANDARD OPERATIONS:
This operation turns on all formatting and cleaning options with
the exception of nonstandard text removal and decolumnization. It is an
immediate option. The marked operations will run when the execute (e)
option is selected.
UNMARK ALL OPERATIONS:
This option turns off all formatting and cleaning options. At the
moment, marks are not cleared after execution allowing you to perform
the same set of operations over and over again. The unmark option is a
good one to keep in mind to clear everything and start with a clean
slate. It is an immediate option.
EXECUTE MARKED OPERATIONS:
This option executes all the cleaning and formatting options you
have turned on, either in the menu or in the proofreader using the
period (.) prefix.
This is also where, if you have marked a block of text using the
alt-l key (see alt-l above), ScanTune will prompt: A block of text has
been marked. Operate on this block only y/n. NOTE: If you answer "yes"
the options that you have selected will be performed inside the blocked
text only. If you answer "no", ScanTune will prompt: Begin cleaning at
place marker y/n. Make your selection. NOTE: The blocks that were set
are still marked. They can be unmarked with alt-u, deleted with alt-g
or moved with alt-m. It's entirely up to you.
ALARM:
This is a "marked" option. To turn it on and off, highlight it and
hit enter. If active an alarm will sound when ScanTune has finished any
process that you have selected while in the main menu. It is especially
handy during cleanup, spellchecking and non-text character removal in
that when the computer is normally silent, a slight clicking sound can
be heard. NOTE: The decibel level of the alarm, at the end of the
program, is dependent upon your particular machine. Therefore, if you
work at night and your significant other is a light sleeper beware!!
QUIT TO DOS:
You can quit to DOS by selecting this option or hitting escape.
SCANTUNE FROM THE COMMAND LINE:
You can start ScanTune from the command line. The menu is bypassed
when this is done, unless the /p switch is used.
Syntax: ScanTune sourcefile macrofile switches
Sourcefile is the file you want to clean up.
The macro file is the name of the macro file you want to be loaded
in place of the default macro file default.stm. You do not have to type
the .stm extension.
The optional switches are those preceded by the slash (/) and the
letter for that option. Slashes are used on the command line and in
batch files even though inside the proofreader it is done with a period.
NOTE: There can be no spaces between the options. Example: /c/b/t
The /u switch is unique in that it will run the Spellchecker with
prompting on, log corrections on and in the unattend mode. In this way
you can run the Spellchecker on a file while you're having dinner or out
for a walk. Then upon your return, the file ScanTune.log is available
for you to edit in your favorite text editor or word processor.
SCANTUNE FROM A BATCH FILE:
You can start ScanTune from a batch file. The menu is bypassed
when this is done, unless the /p switch is used.
Syntax: ScanTune sourcefile macrofile switches
NOTE: Do not use a filename.org to begin a session. We recommend
that you start a session with a .txt extension because ScanTune will
copy whatever filename.ext and give it a filename.org. This is done to
keep you from losing or destroying the original document that has taken
valuable time to scan. When you exit ScanTune a copy of the working
file is copied and renamed filename.bak. This is so that you can always
go back one step in the process and begin again if a problem occurs.
Sourcefile is the file you want to clean up.
The macro file is the name of the macro file you want to be loaded
in place of the default macro file default.stm. All macro files must
reside in the c:\pathfilename subdirectory and have the .stm extension.
You do not have to type the .stm extension.
Switches are those outlined in the menu, but are preceded by a
slash (/) and can be activated from either the command line or from a
batch file. NOTE: There can be no spaces between option switches. For
example, this is the right way to use switches: /c/t/w.
Your entry might look something like the one listed below, but
remember "THIS IS ONLY AN EXAMPLE!!"
ScanTune ScanTune.txt para /m
The line above indicates that:
1. The first ScanTune is the instruction to activate the ScanTune
program.
2. ScanTune.txt is the file you want to correct.
3. para is the macro file you want to use.
4. /m is the switch that marks the standard operations.
The ScanTune switches are as follows:
/A all operations. Use of this switch enables all other options
except the proofreader.
/B blank lines removed. Removes all instances of single blank lines.
This option also substitutes a single blank line for any set of
consecutive blank lines.
/C clean up scanner errors. This option runs the universal and
temporary macro files against the source file, correcting all errors.
/D decolumnize. This option stacks columns of text with the left-most
column on top and the right-most column on the bottom. The decolumnizer
was designed for users of OCR systems without this feature.
/I indent paragraphs. This option makes sure that every paragraph
begins with 5 spaces.
/L Log Corrections. This option keeps a list of all work performed with
Cleanup and the Spellchecker. The file that is generated is called
ScanTune.log.
/N Nonstandard Text removed. This option strips all characters with
ascii values below 32 except line feeds, carriage returns and form
feeds. It also removes all characters with ascii values above 126. If
a Word Perfect file is encountered, the header is removed making the
file readable by normal text readers. Finally, this option converts
soft carriage returns to carriage return/line feed pairs.
/M Mark Standard Operations. This switch turns on all options except
decolumnization and Nonstandard Text removal.
/P enters proof read mode and turns on the menu.
/R right margin. This option allows the user to reset the right margin
for word wrapping purposes. The default is 72. So if you wanted to
change the number your switch would look like: /r80 or /r30, whatever
number you choose.
/S strip leading spaces. This option removes all leading spaces except
for paragraph indention. Centered text is left justified.
/T two spaces after sentences. This option makes sure that every
sentence is followed by two and only two spaces and one after a lettered
initial.
/W word wrap text. Text is wrapped to the right margin, one line at a
time.
NOTE: If the /P switch is called from the command line or from a batch
file it will clear all other switches and turn on the menu. Therefore,
we do not recommend using the /p switch within a batch file where
unattended processing is desired.
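To make the switch rules concrete, here is a small Python sketch that reads a switch string such as /c/b/t or /r80/w. It is an illustration only, not ScanTune's own command-line code. The function name parse_switches is hypothetical, and treating upper and lower case letters alike is an assumption (the manual shows both /c and /P).

```python
import re

KNOWN = set("ABCDILMNPRSTUWZ")   # switch letters named in this manual


def parse_switches(arg, default_margin=72):
    """Return (set of option letters, right margin) for a switch string."""
    # No spaces are allowed between switches, e.g. /c/b/t.
    if not re.fullmatch(r"(/[A-Za-z]\d*)+", arg):
        raise ValueError(f"badly formed switch string: {arg!r}")
    options = set()
    margin = default_margin
    for letter, digits in re.findall(r"/([A-Za-z])(\d*)", arg):
        letter = letter.upper()
        if letter not in KNOWN:
            raise ValueError(f"unknown switch: /{letter}")
        if letter == "R":
            if digits:
                margin = int(digits)      # /r80 sets the margin to 80
        elif digits:
            raise ValueError(f"/{letter} does not take a number")
        else:
            options.add(letter)
    if "P" in options:
        options = {"P"}                   # /P clears all other switches
    return options, margin


print(parse_switches("/r80/w"))  # ({'W'}, 80)
```

A string such as "/c /b" is rejected because of the space, and "/p/c" comes back as only the P option, matching the note above about /P clearing the other switches.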
A /u switch has been enabled and is unique in that it will run the
Spellchecker with prompting on, log corrections on and in the unattend
mode. You can create a batch file that will Spellcheck one, two and
maybe even three books while you're sleeping, depending of course on the
speed of your machine. You need a lot of memory for this feature, if
you're going to Spellcheck more than one book.
The ScanTune.log file will show what you have selected, plus the name of
the file, the time the file was started and the time it ended. Enjoy!!
A /z switch has been enabled for users who prefer to clean a
file that is called from a batch file. Your batch file entry might look
like:
ScanTune sourcefile mymacro /z/c
The above entry informs ScanTune that the cleanup option is being
operated from a batch file and that ScanTune should execute the cleanup
process as if it had been selected from the ScanTune menu. You will
have to attend this process to inform ScanTune what to do when a
prompted macro is encountered. Your options are:
y n v r c q. These are hot keys:
"Y" = yes, make the correction.
"N" = no, don't make the correction.
"V" = vary context character length (from current 78 to any number
from 1 to 255).
"R" = make the remainder of the changes everywhere from current
position to the end of the file.
"C" = allow change to be entered manually.
"Q" = don't make the change or ask me about it anymore this cleaning
session.
USAGE TIPS:
This section describes various ways of using the ScanTune program
to clean up your OCR output. As more people start using the system and
contribute suggestions, this section will change and expand.
A note on macros. The basic idea is to "teach" ScanTune about your
scanner's errors. To do this you must proofread some files. When you
spot an error more than once, it is most likely the result of your
scanner, so mark it and add it to your universal macro file. If you
find later that it wasn't, you can delete that particular entry.
Subsequent files will be checked for these errors when cleanup is run.
Mark errors unique to the file you are reading and save them to your
temporary macro file. These errors will be corrected in the current
file only. As you proof read more files you will gradually build up a
universal macro file that pretty well covers the errors your system
commonly makes. This will make your initial cleanup progressively more
effective giving you very readable text with little proofreading. For
the same reason, it is sometimes desirable to clean a file in "stages".
Proof read a while and when you've marked 50 or 100 temporary errors, go
ahead and run cleanup. It doesn't take as long for each run and you can
actually see the text quality improve as you proof read. After a
cleanup operation, you might want to consider using the Learn Mode
function. It is a little time consuming, depending upon the speed of
your machine, but it's well worth it. Try taking a raw scan and mark
standard operations. Go through the initial cleanup process and then
run Learn Mode. You should end up with a decent OCR file that can be
polished in the Proofread Mode.
THE BLOCPARA MACRO:
This macro file first replaces each carriage return line feed that
follows sentence-ending punctuation with the paragraph (^t) symbol, so
that paragraph ends stay marked while the lines are joined. NOTE: Any
character can be chosen as the marker, but ensure that it does not
appear in the text you are cleaning. If it does, the results could be
disastrous. The blocpara.stm macro file looks like this:
.^m^j|.^t|A
?^m^j|?^t|A
!^m^j|!^t|A
.)^m^j|.)^t|A
."^m^j|."^t|A
?"^m^j|?"^t|A
!"^m^j|!"^t|A
At this point you take all the remaining carriage return line feeds
and replace them with a space. Make sure that you have gone past the
title page and other information that is normally formatted in single
lines before using this macro file.
^m^j| |A
At this point the remainder of the blocpara.stm macro file will
exchange the paragraph (^t) symbol for carriage return line feeds. You
can replace the paragraph (^t) symbol with either one ^m^j or two ^m^js
depending on how you want your paragraphs spaced.
.^t|.^m^j|A
?^t|?^m^j|A
!^t|!^m^j|A
.)^t|.)^m^j|A
."^t|."^m^j|A
?"^t|?"^m^j|A
!"^t|!"^m^j|A
Your OCR file is now formatted to paragraph blocked boundaries.
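The three stages of the blocpara.stm file can be followed in a short
Python sketch. It is an illustration only, not how ScanTune works
internally. The names block_paragraphs, ENDINGS and breaks are invented
for the example, and ^t is assumed to stand for Ctrl-T (character 20).

```python
CRLF = "\r\n"    # the ^m^j pair
MARK = "\x14"    # assumed: ^t is Ctrl-T, used as the paragraph marker

# Punctuation that blocpara.stm accepts in front of a line break.
ENDINGS = [".", "?", "!", ".)", '."', '?"', '!"']


def block_paragraphs(text, breaks=1):
    """Join the lines of each paragraph, mimicking blocpara.stm."""
    # Stage 1: punctuation + line break -> punctuation + marker.
    for end in ENDINGS:
        text = text.replace(end + CRLF, end + MARK)
    # Stage 2: every remaining line break becomes a space.
    text = text.replace(CRLF, " ")
    # Stage 3: the marker is turned back into one or more line breaks.
    for end in ENDINGS:
        text = text.replace(end + MARK, end + CRLF * breaks)
    return text


sample = "The cat sat\r\non the mat.\r\nIt purred.\r\n"
print(repr(block_paragraphs(sample)))
```

Note that any line which happens to end with a full stop, question
mark or exclamation mark is treated as the end of a paragraph, exactly
as the macro file would treat it.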
THE PARA MACRO:
This macro file is exactly the same as the blocpara.stm macro file,
except that it replaces the paragraph (^t) symbol with two carriage
return line feeds instead of one. A macro would look like:
.^t|.^m^j^m^j|A
You can accomplish almost anything with macros, so play around with
them and see how creative you can be. If you develop something that
other users might enjoy, send your macro files to George McCoy.
FOR YOUR INFORMATION:
The ScanTune utility goes through the entire file a chunk at a time
performing all the tasks that you selected, plus the entries in the
macro files. Therefore, the longer the file is that you are cleaning
up, the longer it will take to process it. ScanTune makes all the
changes on the first pass through each chunk and then goes on to
reformatting. The time reformatting takes also depends on how long the
file is.
TECHNICAL INFORMATION:
George McCoy would like to take this opportunity to explain the
heart and soul of ScanTune. The operative word as explained above is
"macro." The concept behind a macro is quite simple, but....
A macro is an instruction to the computer to perform a series of
complex search and replace sequences. A ScanTune macro possesses a
special structure:
1. A search string. A letter, word or phrase.
2. A delimiter. A character marking the beginning or end of a unit of
data.
3. A replacement string. A letter, word or phrase.
4. Another delimiter.
5. An execution instruction.
6. A hard carriage return line feed pair (Ctrl-m Ctrl-j), produced by
pressing the "enter" key.
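The six parts listed above can be picked apart by a few lines of
Python. This is a sketch for readers who want to check macro files
they have written by hand, not a description of ScanTune's own parser.
The names Macro and parse_macro are invented here, the sketch assumes
that the bar (|) never appears inside a search or replacement string,
and the four instruction letters are the ones this manual describes.

```python
from typing import NamedTuple


class Macro(NamedTuple):
    search: str
    replacement: str
    instruction: str


def parse_macro(line):
    """Split one macro line into search, replacement and instruction."""
    body = line.rstrip("\r\n")                 # part 6: the line ending
    head, bar2, instruction = body.rpartition("|")
    search, bar1, replacement = head.partition("|")
    if not bar1 or not bar2 or not search:
        raise ValueError("expected search|replacement|instruction")
    if instruction not in ("A", "W", "a", "w"):
        raise ValueError("unknown instruction: " + repr(instruction))
    return Macro(search, replacement, instruction)


print(parse_macro("I987|1987|A\r\n"))
```

An empty replacement string is accepted, so a line such as -^m^j||A
parses with a replacement of nothing.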
Let's say for example that your OCR system translates the number
1987 as I987. To correct this we would use the macro:
I987|1987|A (followed immediately with the enter key.)
The I987 is the search string. The bar (|) is the delimiter. 1987
is the replacement string. Another delimiter separates the replacement
string from the upper case "A", which is followed by the enter key.
The upper case "A" instructs ScanTune to correct the error
everywhere it is found. ScanTune also uses the upper case "W" to
correct the error only if it is surrounded by spaces or punctuation
marks, or appears at the beginning or end of a line. The lower case "a"
and lower case "w" act the same as their counterparts, but prompt you
before making the correction.
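The difference between "A" and "W" can be seen in a small Python
sketch. It only approximates the behaviour described above: the
function name apply_macro is invented, "surrounded by spaces or
punctuation" is approximated as "not touching a letter or digit", and
the lower case letters are handled as if every prompt were answered
with yes.

```python
import re


def apply_macro(text, search, replacement, instruction):
    """Apply one search and replace in "A" (anywhere) or "W" (word) mode."""
    if not search:
        raise ValueError("search string must not be empty")
    mode = instruction.upper()  # assumption: a/w behave as A/W, always "yes"
    if mode == "A":
        return text.replace(search, replacement)
    if mode == "W":
        # Match only where no letter or digit touches the search string.
        pattern = (r"(?<![A-Za-z0-9])" + re.escape(search)
                   + r"(?![A-Za-z0-9])")
        # A function is used so the replacement is inserted literally.
        return re.sub(pattern, lambda match: replacement, text)
    raise ValueError("unknown instruction: " + repr(instruction))


print(apply_macro("In I987 and I9870", "I987", "1987", "A"))
print(apply_macro("a modem, modems and modem", "modem", "modern", "W"))
```

In "A" mode the text inside I9870 is changed as well, while in "W"
mode the word modems is left alone because a letter follows the match.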
NOTE: You do not, I repeat, do not have to write your own macros.
ScanTune does it for you internally after asking you for the search
string, the replacement string and how you want the macro to perform.
You will be prompted at every turn. But understanding the overall
concept is necessary to use ScanTune to its full capacity.
Let's reinforce the concept of macros in this example:
Say your OCR file has the word modem in it. If your OCR file is
about telecommunications, then this is appropriate. However, if your
file is about modern history, then modem would be incorrect. To fix it,
let's use the following macro:
modem|modern|w (lower case "w" followed by the enter key.)
By using this macro, ScanTune would prompt you with: correct modem
with modern.
One last example.
Let's say that your file has part of a word on one line and the
remainder on another. Example: The word "translate". "trans-" appears
at the end of a line while "late" appears on the next line. To correct
this we could use:
trans-^m^jlate|translate|A
NOTE: ScanTune allows you to search for and replace control
characters. The ^m^j is the hard carriage return line feed pair as
mentioned above.
This would correct the problem, but there is an easier way:
-^m^j||A
Notice how we searched for the dash (-) and the hard carriage
return line feed pair (^m^j) and replaced them with nothing. This macro
removes every dash that is immediately followed by a hard carriage
return line feed pair, along with the pair itself, from your OCR file.
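For readers who like to experiment, the same operation can be tried
out in Python. The sketch below is an illustration under stated
assumptions, not ScanTune's code: the function name decode_carets is
invented, and it assumes that a caret followed by a letter always
means the matching control character (^m is Ctrl-m, ^j is Ctrl-j).

```python
import string


def decode_carets(macro_text):
    """Turn caret notation such as ^m^j into real control characters."""
    out = []
    i = 0
    while i < len(macro_text):
        ch = macro_text[i]
        nxt = macro_text[i + 1] if i + 1 < len(macro_text) else ""
        if ch == "^" and nxt and nxt in string.ascii_letters:
            # Ctrl-A is 1, Ctrl-B is 2, and so on up to Ctrl-Z (26).
            out.append(chr(ord(nxt.upper()) - 64))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


search = decode_carets("-^m^j")   # a dash, then carriage return, line feed
text = "The OCR must trans-\r\nlate each page.\r\n"
print(repr(text.replace(search, "")))
```

Be aware that a genuine hyphenated word split at the end of a line,
such as "well-" followed by "known", is joined in the same way and
loses its hyphen.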
SAMPLE MACROS:
Below are several sample macros that can be included in your
universal macro file, or in a macro file of your choice. Each is
briefly described. If you are really interested in macros and how
they're made, read on. If you're a normal user, this information might
not be of much immediate use to you.
Say your OCR translates the letter s as the number five. In order
to correct this you could use:
5|s|a (note the lower case a.)
This macro would stop each time ScanTune came across the number
five and prompt: y n v r c q.
Say your OCR always translates the word quickly to quicKly. The
following macro would correct it.
quicKly|quickly|A
Or, say you end up with dollar signs where the letter "s" should
be. How about:
$|s|a (note the lower case a.)
Below are a few macros that you might like to try:
th's|this|A
Th's|This|A
h'm|him|A
H'm|Him|A
w'll|will|A
W'll|Will|A
'ournal|journal|A
etemal|eternal|A
moming|morning|A
warml|warm|A
Ianguage|language|A
'mper'al|imperial|A
pol'tical|political|A
dom'nat'on,|domination,|A
OB.stm MACROS:
A file called Ob.stm has been included for those who use "Open
Book" OCR translations. It may prove helpful.
TECHNICAL SUPPORT:
If you care to purchase ScanTune or require technical assistance
contact:
GLM Enterprises
George McCoy
179 Rockvale Road
Union Grove, AL 35175
voice: (205)-498-3877
bbs:
End of ScanTune documentation!!