News on the TSPELL package by Timo Salmi
========================================
University of Vaasa, Finland, Linux Pentium garbo.uwasa.fi has a
large collection of Shareware, Freeware and Public Domain PC
programs available by anonymous FTP, WWW (World Wide Web) and mail
server. The file ftp://garbo.uwasa.fi/pc/ts/0news-ts contains news
about the TS-programs in the /pc/ts directory (in reverse order).
This text, which you now have, is an extract from the 0news-ts file
and the Usenet news.
....................................................................
Prof. Timo Salmi Co-moderator of news:comp.archives.msdos.announce
Moderating at ftp:// & http://garbo.uwasa.fi archives 193.166.120.5
Department of Accounting and Business Finance ; University of Vaasa
ts@uwasa.fi http://uwasa.fi/~ts BBS 961-3170972; FIN-65101, Finland
....................................................................
Sun 17-Mar-96: I have updated my old screen oriented spelling
checker to be
ftp://garbo.uwasa.fi/pc/ts/tspell25.zip
Screen oriented spelling checker & word frequency counter
1) Included a bigger version of the dictionary.
2) Increased the dictionary capacity from 9000 to 15000 words.
3) Put a selftest into the programs to guard against viruses and
tampering.
4) Brought the information material in the package up to date. Also
added the standard FILE_ID.DIZ identification file into the package.
As you probably know, many BBS systems display the contents of these
comment files on their file lists.
OLDER RELEASE NOTES FOR TSPELL
==============================
Release 1.2:
If a file is not found, the user can invoke the directory from
within the programs. The directory routine has been rewritten for a
more relaxed syntax.
Release 1.3:
Some routines of the programs have been optimized for speed. The
programs are now about twice as fast as in the first release.
Release 1.4:
The speed has been further increased in 1.4, and a potential
out-of-memory condition is now tested for first thing.
Release 1.5:
A new program SPELMERG.EXE has been added for fast merging of
dictionaries.
Release 1.6:
Release 1.6 means a major update for the main program SPELLER.EXE.
It now has an inbuilt help. The help is invoked by entering ? at any
question made by the program. Also, the dictionary in memory can now
be updated immediately upon encountering new words in the text file.
It is now also possible to omit checking short words. The details of
the new options need not be documented here, since they are best
learned by using the ? help.
The error that caused SPELLED.EXE to hang in iterated use has been
corrected. For those in the know: the pointers were released
prematurely.
The maximum capacity of WORDLIST.EXE has been increased from 3000
to 3100 words. A wordlist program BGWRDLST.EXE with a capacity of
12500 ordered words is being tested. A second computer and business
economics oriented dictionary SPELLBIG.DNY is under preparation.
BGWRDLST.EXE and SPELLBIG.DNY will be part of a different archive
for registered users (having the potential of using up to 22800 word
dictionaries instead of the 9000 in PD).
A sample run has been added to the end of this document.
Release 1.7:
The screen of SPELLER.EXE has been made larger in order to display
more text at a time.
WORDLIST.EXE now also displays the size of the source text file.
Release 1.8:
The programs have been recompiled with Turbo Pascal 5.0, and they
now can handle read-only files when appropriate.
Release 1.9:
SPELLER and SPELLED have again been made faster. This has been
achieved as follows. The two programs work by loading the dictionary
in memory. Since there is a 64K limit for arrays in Turbo Pascal 4.0
and 5.0, the dictionary was divided into several parts in the
earlier versions, although this is invisible to the user. The new
releases utilize the huge array routines of Turbo Professional 5.0
of TurboPower Software, which partly use fast inline code. In fact
there are virtually (pardon the pun for the initiated) no immediate
technical limits to the potential size of the dictionary (although a
strict limit has been imposed in this set).
Release 2.0:
The spelling checker SPELLER can optionally be made to scroll one
line at a time, besides scrolling a whole page of the text to be
checked.
Scrolling by one line is invoked by pressing return (or enter or
whatever it is called on your system), and scrolling the whole page
by pressing the space bar. (Users having Unix experience will find
this feature very familiar.) This feature is also included in the
WORDLIST program.
SPELLED, SPELMERG, and WORDLIST have added safeguards against disk
full situations.
Release 2.1:
This release enhances the WORDLIST program for extracting all the
words of a text file and calculating the frequency of each word: The
capacity of the program has been increased from handling 3100
*different* words to 8000 different words, and the maximum length of
a word has been increased from 20 to 25 characters. (This was
achieved by using the so-called huge vectors.) The program has been
made more multilingual. It now recognizes all the upper ASCII
Scandinavian, German, and French characters.
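
As an illustration only, the word extraction and frequency counting
described above can be sketched in a few lines of Python. This is not
the WORDLIST source (the programs are written in Turbo Pascal). The
definition of a "word" as a run of letters, the lower-casing, the
truncation of over-long words, and the names word_frequencies and
ordered_listing are all assumptions made for the sketch. Only the two
limits come from the release note.

```python
import re
from collections import Counter

# Assumption: a word is a run of letters, accented letters included.
# WORDLIST's real rules are not documented in this file.
WORD_RE = re.compile(r"[^\W\d_]+")

MAX_DIFFERENT_WORDS = 8000  # capacity quoted for release 2.1
MAX_WORD_LENGTH = 25        # maximum word length quoted for release 2.1


def word_frequencies(text):
    """Count how often each different word occurs, ignoring case."""
    counts = Counter()
    for match in WORD_RE.finditer(text):
        # Assumption: over-long words are simply cut at the limit.
        word = match.group().lower()[:MAX_WORD_LENGTH]
        if word not in counts and len(counts) >= MAX_DIFFERENT_WORDS:
            raise OverflowError("too many different words")
        counts[word] += 1
    return counts


def ordered_listing(counts):
    """Return (word, count) pairs in alphabetical order."""
    return sorted(counts.items())
```

Calling ordered_listing(word_frequencies(text)) gives the kind of
ordered word list with counts that the release note describes.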
As a minor enhancement WORDLIST, SPELLER, SPELLED, and SPELMERG
directory routines and help screen redirection have been upgraded.
Release 2.2:
Weeded out some incorrect words from the SPELLED.DNY dictionary.
My thanks are due to Mr. Hannu Hirvonen for his help in doing this.
Introduced line-editing and input recall. This means e.g. that if
you make an error in giving the dictionary name, you can invoke your
input again next time at the same question simply by pressing the
CursorUp key. You can edit what you have written by using the cursor
keys and the BackSpace, Delete, Insert, and Esc keys. Try it. It is very
convenient. I use it constantly myself. This improvement has been
made to all the programs in this set.
Release 2.3:
This update brings a very significant increase in the speed of the
dictionary editor SPELLED: it is now as much as five times faster
than before. One telling weakness of SPELLED has been that updating
the dictionary is slow. This is because the
alphabetical order of the dictionary must be retained when new words
are added, or old ones deleted. For each new word a slot must first
be made by moving all the words above it up one notch. Since arrays
under MS-DOS are limited to 64K, this puts a heavy load on the
program, because a huge array system must be used within the program
code (unseen by the user). Starting from release 1.9 of TSPELL I have
used the huge array management in Turbo Professional by TurboPower
Software, but now I have replaced it with my own huge array
management code, which is very much faster. One of the reasons is
that TurboPower's code is general. It is written for any variable
type, and has no size limits except the available memory. The code I
have now used is a double pointer system adapted for strings in
particular, and in this technique the size of an array is limited to
16383 rows, and as many columns. In practice this means that the
maximum number of words that the dictionary may hold is 16383. On
the face of it this seems a severe limitation, but, in fact, it is
not. I have noticed that if one builds the dictionary for one's own
purposes, a vocabulary of 10000-12000 words is quite enough. There
is a huge amount of slack in the generic vocabularies, and still
they often do not fulfil the user's specific needs. And the
distributed PD version of the spelling checker is limited to 9000
words, anyway. A 22800 word version of SPELLER will still exist, but
it is not (nor has been) released to Public Domain.
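
The idea of keeping a sorted dictionary in a table of small blocks,
and of making a slot for a new word by moving the later words up one
notch, can be sketched as follows. This is a Python illustration under
stated assumptions, not the Turbo Pascal code used in TSPELL: the class
name ChunkedWordList, the tiny ROW_SIZE, and the binary search are
hypothetical choices for the sketch. The outer list stands in for the
table of pointers and each inner list for one block that stays below
the 64K limit. Only the 16383 word limit is taken from the text.

```python
ROW_SIZE = 4       # words per block; kept tiny so the layout is visible
MAX_WORDS = 16383  # limit quoted in the text


class ChunkedWordList:
    """Sorted word list stored as a table of fixed-size blocks."""

    def __init__(self):
        self.rows = []   # plays the role of the pointer table
        self.count = 0

    def _get(self, i):
        return self.rows[i // ROW_SIZE][i % ROW_SIZE]

    def _set(self, i, word):
        self.rows[i // ROW_SIZE][i % ROW_SIZE] = word

    def _find(self, word):
        """Binary search: first position whose word is >= the given word."""
        lo, hi = 0, self.count
        while lo < hi:
            mid = (lo + hi) // 2
            if self._get(mid) < word:
                lo = mid + 1
            else:
                hi = mid
        return lo

    def __contains__(self, word):
        pos = self._find(word)
        return pos < self.count and self._get(pos) == word

    def insert(self, word):
        """Insert a word in alphabetical order; False if already present."""
        if self.count >= MAX_WORDS:
            raise OverflowError("dictionary is full")
        pos = self._find(word)
        if pos < self.count and self._get(pos) == word:
            return False
        if self.count % ROW_SIZE == 0:
            self.rows.append([None] * ROW_SIZE)  # allocate one more block
        # Make the slot: move every later word up one notch.
        for i in range(self.count, pos, -1):
            self._set(i, self._get(i - 1))
        self._set(pos, word)
        self.count += 1
        return True

    def words(self):
        return [self._get(i) for i in range(self.count)]
```

The shifting loop is what makes each update costly: inserting near the
start of a large dictionary touches almost every entry, across all the
blocks.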
There is also another, slight price to pay for the change of the
technique. At the beginning of the program a brief spell (pardon the
pun) is taken to build up the pointer system for the dynamic memory.
I cannot be really sure, but it seems to me that the old huge array
technique may have been a source of an occasional crash of the
system. In writing programs with pointer arrays I have noticed that
if they are not controlled very accurately, they cause unexpected
behavior. This is natural, since a "wild" pointer may point to
anywhere in memory potentially changing it.
I have made the same change of technique in the spelling checker.
SPELLER was very fast to begin with, and in just checking text the
difference does not really matter, even if it is there. But if you
choose to update the dictionary
immediately with new words as you proceed, then the difference is
significant. If you choose to store new words as you check a text,
they are written in a file called dict.tmp on the default device.
If you break out of SPELLER (ctrl-c or break) the temporary file
is now properly closed. Earlier it was left with zero length in case
of a break-out.
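
For readers who want to see the point about the temporary file in
code, here is a small Python sketch. It is an illustration, not the
SPELLER implementation: the function name store_new_words, the
one-word-per-line layout, and the Latin-1 encoding are assumptions.
Only the file name dict.tmp comes from the text. The with-block closes
the file even when the loop is broken off, so the words written so far
are kept.

```python
def store_new_words(words, path="dict.tmp"):
    """Write new words to the temporary file, one per line.

    The with-block closes (and thereby flushes) the file even if the
    loop is interrupted, for example by Ctrl-C.
    """
    written = 0
    with open(path, "w", encoding="latin-1") as tmp:
        for word in words:
            tmp.write(word + "\n")
            written += 1
    return written
```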
A few words about the original TSPELL philosophy. I made the
spelling checker screen oriented on purpose. There are so many
spelling checkers which make a list of the incorrectly spelled words
(or rather words that aren't found in the dictionary), that I
decided rather to complement them with the screen orientation than
to write yet another conventional checker. If you use a Unix system,
the Unix spell command is a good example of a list oriented spelling
checker. On a fast machine it is very nice to use.
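
To make the contrast concrete, a list oriented checker can be sketched
in Python as below. This is a minimal illustration, assuming that a
word is a run of letters and that case is ignored. It is neither the
Unix spell command nor any of the TS-programs, and the function name
unknown_words and the min_length parameter are hypothetical.

```python
import re

# Assumption: a word is a run of letters, accented letters included.
WORD_RE = re.compile(r"[^\W\d_]+")


def unknown_words(text, dictionary, min_length=1):
    """Return, in alphabetical order, the words of the text that are
    not found in the dictionary. Words shorter than min_length are
    skipped."""
    known = {word.lower() for word in dictionary}
    found = {match.group().lower() for match in WORD_RE.finditer(text)}
    return sorted(w for w in found if len(w) >= min_length and w not in known)
```

A screen oriented checker would instead show each unknown word in its
context and ask what to do with it.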
Release 2.4:
The WORDLIST.EXE program (v2.2) now includes the option of giving
the input and output files already in the program call as e.g.
WORDLIST /iMYTEXT.TXT /oMYLIST.LOG
When WORDLIST asks for the file names you can recall them simply by
pressing the CursorUp key. Very convenient if you wish to examine
the same file repeatedly (assuming, of course, that you have a
command line editor, such as dosedit or ced).
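
How such switches might be picked apart is sketched below in Python.
This only illustrates the /i and /o convention shown in the example
call. The function name parse_wordlist_args, the case-insensitive
matching, and the error on unknown arguments are assumptions, not
documented WORDLIST behavior.

```python
def parse_wordlist_args(args):
    """Pick the input and output file names from /i... and /o... switches."""
    options = {"input": None, "output": None}
    for arg in args:
        key = arg[:2].lower()  # assumption: /I and /i are equivalent
        if key == "/i":
            options["input"] = arg[2:]
        elif key == "/o":
            options["output"] = arg[2:]
        else:
            raise ValueError("unrecognized argument: " + arg)
    return options
```

A name left as None would mean that the program still has to ask for
it interactively.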
SPELLER.EXE, SPELLED.EXE, and SPELMERG.EXE have some minor
refinements, not worth listing.
I have now also released another spelling checking package which
uses the Unix-like method. That is, it makes an alphabetical list
of words which it cannot find in the dictionary. It is available as
/pc/ts/tschek10.arc (or whatever version number is the latest).