Text File | 1992-11-14 | 24KB | 484 lines
WORDFREQ.EXE
version 1.1
from JB Utilities
released as SHAREware
Copyright 1992, Jules Brenner
BRT71EFR.EXE copyright Microsoft, Inc.
SORT.EXE copyright Microsoft, Inc.
WORDFREQ, short for Word Frequency, is a text file analyzer that allows any
writer to check for words he/she may be using too repetitively. Note: to use
it, the DOS command file SORT.EXE MUST be in the PATH. WORDFREQ has been tested
with the DOS 5.0 version of SORT.EXE. See below for a fuller description of
its purpose to the writer.
Registration fee is a mere $15. If you find something that doesn't seem to
work right, or if you wish WORDFREQ did something it doesn't now do, use your
registration to make requests. In any case, if you find yourself using WORDFREQ
regularly, registering is the right thing to do. Please remember the
SHAREware concept: you get to TRY it free. Your trial is the only thing
that's free, though. If you like it and use it, the trial is over and your
obligation (cf., morality, ethics) is to register it, pay the rather modest
fee and encourage me to improve the program and write others which may be of
benefit to you. But... if you truly can't afford the fifteen, at least let
me know you're out there. No software police will visit.
Mail fee to: Jules Brenner
JB Utilities
P.O. Box 46116
Los Angeles, CA 90046-0116
ALTERNATE FEE!!!!!!!!
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
As an alternative to remitting a fee, you may purchase our book, THE BRENNER
RESTAURANT INDEX, A Computerized Guide to Selected Restaurants in the Greater
Los Angeles Area. Its cover price is $12.95, plus $1.07 sales tax if you live
in California, plus $1.05 shipping. The book may be ordered directly from us
and all you have to do is enclose a check or MasterCard/Visa authorization.
With your order, just mention that you wish to have your copy of WORDFREQ.EXE
registered. As a registered user, you will receive an update as well as
additional JB Utilities. By ordering the book, you get three for one: the
book, more utilities, AND the registration.
And, now, the commercial:
Among restaurant books, this one is quite unique. It's organized in such a
way that it will help you pick your restaurant according to what
considerations are important on any given occasion. In L.A., with its
massive size, it's common to find yourself in a part of town you don't know
too well. Just check the restaurants in the Location section for those in
that specific area, and you get an immediate summary of the selected
restaurants there (and, on your way there). Another section sorts the
restaurants by cost in case you're on a budget or want to just consider a
certain price range. If you're in the mood for a particular kind of food,
check the Cuisine section. Finally, there's the alphabetical listing. All
restaurant listings include vital details, including special dishes, hours,
parking, credit cards, and more.
THE BRENNER RESTAURANT INDEX is available in some Los Angeles bookstores,
so if you live here, you can also ask for it in your local bookstore. A few
of the bookstores carrying it at the time of this writing include: Dutton's,
Samuel French, Book Soup, Big & Tall Bookstore. It is also being carried at
some newsstands and restaurants. If you obtain the book from a retail outlet,
just send us a note that you request registration for WORDFREQ.EXE and include a
photocopy of your receipt.
To see what this book looks like, you can use the technology at hand. Simply
download the file RESTINDX.ZIP from a local BBS. It contains a .GIF image of
the book. You can also send a request to us for the file but please send $1
with the request to cover expenses.
While it may appear that this is a book exclusively for Los Angeles residents,
it is perhaps even more valuable for those visiting from other parts of the
world. If you're planning a trip to Disneyland (or wherever), you'll want to
know which are the better restaurants, where to find them and how close they
are to you. If you're staying at a hotel, ask the clerk to help you locate
all the restaurants in the area. Let your local travel agent help you plan your
meals ahead of time if he or she knows a bit about the geography of this huge
county.
Special note for BBSers: the file RESTINDX.ZIP is being distributed around
Los Angeles BBSs. It contains a .GIF picture of the book. Look for it.
A few scenarios:
1. You live in Hollywood and need to meet a business associate who has his
office in Santa Monica. Neither of you wants to come all the way across
town. You check the zip code map (as in the Thomas Guide) and find the area
that's right in the middle of the two areas. It's not too far for either of
you. You next check out the restaurants that are available and recommended
in that area and find 3 that you've been wanting to try and one of your
favorites!
2. You and your sweetheart are rushing to a screening (or, game, concert,
etc.). Getting ready took a little longer than you expected and time is
tight. You decide to eat en route and know of a great little restaurant.
You get there and find others had the same idea and there's a long line. You
pull The Brenner Restaurant Index out of your glove compartment and find one
great restaurant two blocks away and another 4 at slightly greater distances
en route to your goal.
3. You're going to a movie that's hot and you know if you get there merely in
time for the movie, you might not even get into the parking lot. You pull
out The Brenner Restaurant Index and find there's a truly fine restaurant so
close to the theater that you can park at the theater OR the restaurant and
walk over. Getting there at dinner time instead of at screening time
ensures you an easy place to park--and only once! (I've done this by dining
at Chan Dara, one of the great Thai restaurants, when going to the Cinerama
Dome).
- - - -
Note to those using credit cards: Please include the following with your order:
your name (printed) as it appears on the card (include middle initial), the card
number, the expiration date and your signature. (See the registration form
included in the WORDFREQ library [zip] file).
------------------------------------------------------------------------------
WORDFREQ.EXE is self-documenting. Just type
WORDFREQ
on the command line (followed, of course, by a carriage return).
SYNTAX in brief:
WORDFREQ [textfile] [/n]
Where the /n switch counts numbers as words.
Textfile is, ideally, an ASCII file.
(See C below, FILES TO BE PROCESSED).
------------------------------------------------------------------------------
Table of Contents
PREAMBLE ......... ALL ABOVE
A ................ PURPOSE
B ................ WORDFREQ WITH NO PARAMETERS
C ................ FILES TO BE PROCESSED
D ................ MEMORY and WORDFREQ
E ................ CASE SENSITIVITY
F ................ WHAT ARE WORDS... TO WORDFREQ?
G ................ .FRQ: THE ANALYSIS FILE
H ................ ON THE FLY ANALYSIS
I ................ Additional notes:
J ................ Printing this file
K ................ Updates
L ................ BUG REPORTS
M ................ DISTRIBUTION
N ................ REGISTRATION INFORMATION
------------------------------------------------------------------------------
A. PURPOSE
If you write to communicate to other people, you are or
should be concerned with the boring details of grammar,
punctuation and style of expression. To forego these
considerations is to limit the effectiveness of what you
write and the quality of your communication.
WORDFREQ offers one method of analysis in the third
category: style of expression. One thing many writers do
is use a word or a particular group of words with a
frequency of which they're not even aware. The inherent
problem with this, besides rendering your writing
ineffective by being far too repetitive, is not
using the vocabulary you have. The use of a good
vocabulary is a means to more precisely express your
thoughts and a bad habit should not impinge on precision
in language, thought or expression.
The question is, ARE you using certain words too often?
To find out, run WORDFREQ on a document you've recently
written. The results will amaze (and probably not delight)
you!
B. WORDFREQ WITH NO PARAMETERS
Typing WORDFREQ (cr)
brings up the syntax screen. At the end of the syntax message
you are prompted for a file name. This affords two methods of usage:
the first by including a filename on the command line,
the second by invoking the program and then typing the filename.
C. FILES TO BE PROCESSED
WORDFREQ is a text file word-frequency analyzer. In theory,
any file in ASCII is a text file. In practice, not all ASCII
files are necessarily appropriate for word frequency analysis. A
data file in ASCII would be one example. If the data file had
nothing more than a series of numbers, WORDFREQ would report
the frequency of the numbers (with the /n switch only). Not
useful to most of us.
Document files created by a word processor differ from straight
ASCII files in that they include embedded commands for
formatting and other purposes. Though these kinds of files
have not been tested as of this writing, it is likely that
the embedded commands will be seen by WORDFREQ as words and
will be counted for frequency. In case of trouble with these
types of files, go back into your word processor and save the
files as ASCII (aka, non-document).
Kinds of files that would be totally inappropriate for WORDFREQ
are executable and other binary files. Examples of these are
files with the extensions .COM and .EXE. The results of an
analysis of such files would produce nothing more than a lot of
symbolic garbage.
The kinds of files that are most appropriate for a word
frequency analysis are documents of communication. This would
include book chapters, letters, reports, files such as this one.
D. MEMORY and WORDFREQ
Files above a certain size cannot be processed by WORDFREQ.
The limitation is the string space allotted by the conventions
of programming languages and of DOS. This is not the same as
RAM. String space memory is a portion of RAM and is not a
reflection of how much RAM your system has. A system with
16 megabytes of RAM will run out of string space as soon as
a system with 1 megabyte.
Another memory problem is files that are so large they
can't even be loaded. This is not fully tested but we expect
that limit to be in the neighborhood of 120k.
Files larger than about 80k, but still small enough to load,
are likely to run out of string space. When and
if this occurs, WORDFREQ will display exactly how many lines
out of the total number of lines have been processed. In
this way, you can see how much is missing. The resulting
analysis may still be useful, so long as that portion not
counted is kept in mind.
E. CASE SENSITIVITY
WORDFREQ is case-sensitive. A future version is planned to
make this switchable.
Case sensitivity means that 'The', 'the' and 'THE' are seen
as unique words. Keep this in mind when you read the
analysis file (see G).
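The effect of case sensitivity can be illustrated with a short sketch. It is
written in Python purely for illustration (WORDFREQ itself is a compiled BASIC
program), and the function names count_exact and count_folded are hypothetical.
The folded count shows what a switchable option might produce; it is not
something the current version does.

```python
from collections import Counter

def count_exact(words):
    """Count words exactly as spelled, so 'The' and 'the' are separate entries."""
    return Counter(words)

def count_folded(words):
    """Count words after lowercasing (NOT what WORDFREQ 1.1 does)."""
    return Counter(w.lower() for w in words)

sample = ["The", "the", "THE", "the"]
exact = count_exact(sample)    # three unique words
folded = count_folded(sample)  # one unique word

print(len(exact), "unique words when case matters")
print(len(folded), "unique word when case is ignored")
```

When reading a report, adding up the counts for the differently capitalized
spellings of a word gives the same figure the folded count would.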
F. WHAT ARE WORDS... TO WORDFREQ?
Words are those strings of characters that appear between
certain word delimiters. Examples of delimiters are spaces,
commas, semi-colons, colons, periods, end-of-lines, question
marks, and a few others.
In its default mode, numbers are NOT words (unless they are
attached to characters or are embedded within a string).
The switch /n allows you to tell WORDFREQ to consider numbers
as words and to count their frequency.
What all this means is that there are a number of characters
and abbreviations that are not excluded and will therefore
be considered as unique words. You might, for instance, use
the | character to set apart columns, or use single letters
as symbols and abbreviations. All of these will be regarded
as unique words and will be counted for frequency. You must
consider the results against the style of the file and your
preferences and needs when writing it. Strange occurrences
might well be fully appropriate to the file under analysis.
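The rules above can be sketched roughly as follows. This is an illustration
in Python, not WORDFREQ's own code. The delimiter set contains only the
characters named above (the "few others" are unknown, so the real list is
probably longer), and split_words and count_numbers are hypothetical names,
the latter standing in for the /n switch.

```python
import re

# Assumed delimiter set: space, comma, semicolon, colon, period,
# question mark, tab and end-of-line characters.
_DELIMITERS = re.compile(r"[ ,;:.?\t\r\n]+")

def split_words(text, count_numbers=False):
    """Return the strings found between delimiters.

    Tokens made up only of digits are dropped unless count_numbers is
    True. Digits attached to other characters ('5pm') always count.
    """
    tokens = [t for t in _DELIMITERS.split(text) if t]
    if count_numbers:
        return tokens
    return [t for t in tokens if not t.isdigit()]

print(split_words("Call me at 5, or at 5pm; ok?"))
print(split_words("Call me at 5, or at 5pm; ok?", count_numbers=True))
print(split_words("name | phone"))  # the | character comes through as a word
```

Because the | character is not in the delimiter set, it is returned as a word
of its own, which matches the behavior described for column separators.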
G. .FRQ: THE ANALYSIS FILE
WORDFREQ creates a file which lists all the unique words it has
found in your file and the frequency of each. This list is
sorted with the words of highest frequency first. The
analysis file will be named with the name of your text file
but with the extension .FRQ appended. It is an ASCII file
and is readable with any text editor, word processor and/or
file reading utilities, including the DOS TYPE command.
The successful creation of the .FRQ file is reported on
screen at the conclusion of the sorting process. The
program then returns to DOS.
The breakdown of words, the original filename and the date
of analysis are included in the .FRQ file as a header. The
header also includes a reminder about words that may appear
the same but are counted as unique.
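For readers who want a feel for what the analysis file holds, here is a
minimal sketch in Python. It is not the program's actual code. Two things are
assumptions: that the .FRQ extension replaces any existing extension (DOS
file names allow only one), and the "count word" line layout, which is
invented for the example. The header is left out. The names frq_name and
report_lines are hypothetical.

```python
from collections import Counter
from pathlib import PureWindowsPath

def frq_name(textfile):
    """Assumed naming rule: swap the extension for .FRQ."""
    return str(PureWindowsPath(textfile).with_suffix(".FRQ"))

def report_lines(words):
    """Return 'count word' lines, highest count first, ties in word order."""
    counts = Counter(words)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [f"{count:5d} {word}" for word, count in ranked]

words = ["very", "nice", "very", "good", "very", "nice"]
print(frq_name("LETTER.TXT"))
for line in report_lines(words):
    print(line)
```

In the real program the ordering is done by SORT.EXE rather than by an
in-memory sort as shown here.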
IMPORTANT NOTE: Sorting is done by the DOS command file,
SORT.EXE. This file must be in your path or in the default
directory. If it's not, you will get the error message:
"Bad Command or File Name" and the process will be incomplete.
If you need help with the PATH command or the word "default"
we will be happy to help you if you call or write us.
A word about analysis: there are many words that are going
to have a high count in any document. Articles, pronouns,
the glue that holds phrases together are all used frequently
as a matter of course. These are not the target of a word
frequency analysis. Disregard, therefore, high counts of
words such as "the", "and", "an", etc. Pay attention to
descriptive words, adjectives, nouns that you use habitually
and perhaps too frequently. Frequent analysis will point
out a bad habit you may not have been aware of and that is
the purpose of WORDFREQ.
Note also that an analysis of this document will show a
high count of the words, "frequent", "frequently", "words"
and "analysis". You don't even have to run WORDFREQ to
know that. But these words are the subject of the document
and therefore will naturally be repeated many times. So,
expect the subject of your document, as well as words that
describe the subject, to have a high count and this should
not be considered a problem of habit.
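The advice above, to look past the glue words and watch for habitual ones,
can be sketched as a simple filter. This Python example is only an
illustration and is not a WORDFREQ feature. The words "the", "and" and "an"
come from the text above; the rest of the GLUE_WORDS list, the min_count
threshold and the function name habit_candidates are assumptions.

```python
from collections import Counter

# "the", "and", "an" are named in the documentation; the others are assumed.
GLUE_WORDS = {"the", "and", "an", "a", "of", "to", "in", "it", "is"}

def habit_candidates(words, min_count=3):
    """Return (word, count) pairs worth a second look, most frequent first.

    Glue words are skipped regardless of capitalization, and words used
    fewer than min_count times are ignored.
    """
    counts = Counter(words)
    return [
        (word, n)
        for word, n in counts.most_common()
        if n >= min_count and word.lower() not in GLUE_WORDS
    ]

words = ["the"] * 5 + ["really"] * 3 + ["nice"] * 2 + ["The"] * 4
print(habit_candidates(words))
```

The subject words of a document will still show up in such a list, so the
final judgment remains with the writer.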
H. ON THE FLY ANALYSIS
While WORDFREQ is reading and analyzing your file, it reports
its activity and current findings on screen. It will show
you the line number being processed as well as the total
number of lines in the file. In this way you can judge its
progress.
Next, it reports the total unique words and the total number
of words that it's accumulating as it goes along.
Finally, it indicates the average number of words per line.
Analysis can be aborted by pressing the [ESCAPE] key during
the processing. This can be a little sticky, however, and
may require repeated pressing for abort to be effected.
Since WORDFREQ can only handle as many unique words as
string space memory allows, it will stop its analysis when
the limit is reached. The program will not abort here,
however, and will proceed to report on that portion of the
file it has managed to read. The important thing to observe
in this instance is the number of lines read 'out of' the
total in the file. This gives you a fairly exact idea of
the portion of the file the report applies to. If it's
important to have the full file analyzed, it would have to
be broken into parts. This is possible with certain
utilities that are available as shareware. It's a
cumbersome procedure, but if it's essential there's a way
to do it.
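Breaking a file into parts can be sketched as below. This is a Python
illustration, not one of the shareware utilities mentioned, and split_lines
is a hypothetical name. Splitting by a fixed number of lines is an assumption
made for simplicity: the real limit depends on the number of unique words,
so a suitable part size has to be found by trial, using the "lines read out
of total" figure as a guide.

```python
def split_lines(lines, max_lines):
    """Yield successive parts of at most max_lines lines each."""
    if max_lines < 1:
        raise ValueError("max_lines must be at least 1")
    for start in range(0, len(lines), max_lines):
        yield lines[start:start + max_lines]

# Example: a 5-line "file" broken into parts of 2 lines.
text_lines = ["one", "two", "three", "four", "five"]
parts = list(split_lines(text_lines, 2))
for number, part in enumerate(parts, start=1):
    print("part", number, "has", len(part), "lines")
```

Each part would then be saved under its own name and analyzed separately.
The counts from the separate reports have to be added together by hand.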
------------------------------------------------------------------------------
I. Additional notes:
The non-standalone version of WORDFREQ must have BRT71EFR.EXE in the default
directory or in the path. This version is best for you if you are running
more than one program requiring BRT71EFR.EXE since the individual program
files will be smaller.
To make WORDFREQ as useful as possible, it, too, should be on a directory
that is in your path, such as \UTIL. If you run it from another drive, best
results are achieved with the standalone version, which doesn't require the
presence of BRT71EFR.EXE but which is larger in terms of disk space.
WORDFREQ does not contain much in the way of error-checking, beyond what DOS
might or might not offer. Where this may be a problem is if you do a
WORDFREQ on a file in a drive which doesn't exist, where the floppy drive
door is open, something like that. Rare, and we leave it to the user to
avoid such mistakes. An opened drive door could hang up your computer (a
warm or cold boot will cure the temporary situation). More often than not,
however, you'll be able to insert the disk and close the door and then enter
an 'R' for Retry.
------------------------------------------------------------------------------
J. Printing this file
The first thing you may notice if you send this file to your printer, is that
there are no embedded form feeds. My thinking on this is that your printer
is likely to furnish its own form feed when a certain number of lines are
reached. Mine does this, but perhaps that's because it's a laser printer. I
don't know for sure, but maybe dot matrix and impact printers don't do that.
If that's so, then you might want to do your own formatting. It merely
requires you to place a Control-L character every 60 lines or so.
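For anyone who would rather not insert the characters by hand, the idea can
be sketched in a few lines of Python. This is only an illustration under
stated assumptions: a page length of 60 lines, the form feed placed on a line
of its own, and the hypothetical function name add_form_feeds.

```python
FORM_FEED = "\x0c"  # the Control-L character (ASCII 12)

def add_form_feeds(lines, page_length=60):
    """Return a copy of lines with a form feed after every full page.

    No form feed is added after the last line, so the printer does not
    eject an extra blank page.
    """
    out = []
    for number, line in enumerate(lines, start=1):
        out.append(line)
        if number % page_length == 0 and number < len(lines):
            out.append(FORM_FEED)
    return out

pages = add_form_feeds(["text"] * 130)
print(pages.count(FORM_FEED), "form feeds inserted")
```

Printers differ in how many lines fit on a page, so the page_length value may
need adjusting.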
For those whose printers work like mine, not having form feeds built in allows
you to print out with your own utility or application in your own preferred
way. I never have liked those pesky embedded commands, making an assumption
that didn't fit my life's plan or defaults. My print utility allows me to
specify font and font size. I like it that way. I hope the majority of you,
too, like it that way. For those who don't, sorry. Write me a letter. I'd
sincerely love hearing from you.
------------------------------------------------------------------------------
K. UPDATES:
~~~~~~~~
Version 1.1 Recompiled with PDS BASIC 7.1.
The analysis breakdown is now put into the .FRQ
(report) file.
Added a note in this file about SORT.EXE, an
essential file that must be in your path or in
your default directory. SORT.EXE is part of
DOS and is copyright Microsoft, Inc.
------------------------------------------------------------------------------
L. BUG REPORTS:
Users are encouraged to report all bugs to the author. One of the
most common ways for bugs to creep into any program is when new
features are added and all the previous capabilities which were
working flawlessly are not thoroughly tested. Since the author
uses this utility, even obscure bugs will eventually come to his
notice, but in the spirit of partnership, all users are encouraged
to report a bug they encounter. The file, FEEDBACK.DOC, is included
with the program to facilitate such reports. (Of course, feel free
to send us the form merely to report good things, too).
In the event of a bug appearing in a commonly used option, we
advise you go back to the previous version you were using until
the bug(s) is(are) corrected.
------------------------------------------------------------------------------
M. DISTRIBUTION
WORDFREQ may be distributed freely so long as no alterations are made to the
program nor to the archive nor to this documentation. Shareware distributors
may charge up to $5 for the WORDFREQ utility without specific permission so
long as they inform us of their distribution with a copy of the ad or catalog
in which it is offered. For any greater charge, distributors must first
write us to seek permission and to explain the reason for the higher charge.
------------------------------------------------------------------------------
N. REGISTRATION INFORMATION
Registration fee to the end user is a mere $15. (See above for alternate). A
standalone version is available on request when registering. Go ahead. Make
my day. Encourage me to write more by registering. If you find yourself
using WORDFREQ regularly, it's the right thing to do. Registered users will
receive at least 1 update when available as well as notifications (or copies)
of other JB utilities.
This program may not be used in a business, corporation, organization,
government or agency environment without a negotiated site license.
Suggestions, bug reports, etc. appreciated and encouraged -- even by non-
registered users. Yes, if you say, for instance, "hey, if you'd only add
[this capability or that capability], I'd pay your filthy fee!", we'd
probably add it just to part you from your fifteen. So, don't be shy. We'll
even listen to a hard luck story. Heck, if you're only using 1/3 of
WORDFREQ's capabilities, we'll accept $5. You be the judge and executioner.
("Ouch, that rope is rough!").
~~~~~ Mail fee or excuses to JBUTILS, P.O. Box 46116, Los Angeles, CA
90046-0116. Messages may be addressed to me on Prodigy. ID# FTSN96A.
-- J.B., for JBUTILS