Text File | 1991-07-06 | 12KB | 200 lines
The DATAMAGE data management system: Comments on execution speed.
Speed of execution is a topic of interest in any program that performs
significant tasks. Databases can be said to live or die by their ability to do
MASSIVE processing in no time flat. Many of the programs that
compete with DATAMAGE waste incredible amounts of disk space and use
methods that render their files and indexes impenetrable to any other program
in order to accomplish, or to SEEM to accomplish, this impossible goal.
There are ways to speed things up, and it is the purpose of this document to
explain them. These methods fall into three basic categories: get a fast
computer, use the computer's memory instead of the disk drive, and keep your
files all in one place on the disk drive.
HARDWARE:
Certainly, the raw power of the target computer is a very significant
ingredient in the equation. If you are using an old PC or PCXT whose
microprocessor is the original 8088 and runs at 4.77 MHz, you will be quite
dissatisfied with the performance of this, or any other, program. (A MHz, by
the way, is one million clock cycles per second.) If you cannot afford better
hardware, one thing that will enhance your current computer is to pull the 8088
chip out and replace it with a NEC V20. This will almost double your
processing speed, but your computer will still be, in modern terms, VERY SLOW.
The original PCAT, running at 6 MHz, and the later "enhanced" version of same
running at 8 MHz were fast machines in their day. Their time has passed. A
modern AT-class machine should run at a minimum of 12 MHz, and hardware is
available that runs all the way up to 44 MHz! There is little, if any,
difference between the 80286, the 80386 and even the latest 80486 microprocessor
when shuffling datafiles. The CLOCK rules the computer, and dictates its
speed. No other enhancement can approach the clock speed for doing it fast.
SUBSTITUTING MEMORY FOR THE DISK DRIVE:
Memory above MS-DOS's 1,048,576 byte (1 MEGABYTE) limit is available ONLY on
AT-class machines, or with special EEMS cards on XT-class machines. MS-DOS
cannot use this memory, but your computer can make use of it via DRIVERS which
switch the microprocessor into its native (protected) mode, make use of the
memory, then switch it back into 8088 emulation (real) mode and continue to
execute the program that runs under MS-DOS. That's right - when you are running
MS-DOS your microprocessor, be it a '286, '386 or '486, is NOT running its full
instruction set, but is emulating an 8088. In their native mode the newer
microprocessors CAN NOT run MS-DOS, nor can MS-DOS access or manage memory over
1 megabyte.
There are two types of drivers that can add significant speed to DATAMAGE, or
any other program that makes heavy usage of the disk drive: RAM DISKS and DISK
CACHING. These drivers allow the computer to use memory above the 1 megabyte
limit and substitute it for the slower disk drives.
With the RAM DISKS a "fake" disk drive is added to your computer. You can use
it just as you would any real drive. You can do everything except remove and
replace the disk, which is also not possible with a hard disk. DATAMAGE will
prompt you for the disk drive to use for various tasks. If you have sufficient
memory you can set up a RAM disk and greatly profit in terms of speed.
With the DISK CACHING drivers the data in the files currently open on your
computer is read from the disk where it resides and moved into memory above the
1 megabyte limit. When you read or write data it is moved from/to this memory,
which takes quite a bit less time than doing the same operation from the disk.
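The idea behind a caching driver can be sketched in a few lines of Python.
This is only an illustration, not the code of any real driver: the names
BlockCache and fake_disk_read are made up for the example, and the rule used
to decide which block to throw out when memory is full (least recently used)
is an assumption on my part.

```python
from collections import OrderedDict


class BlockCache:
    """Tiny read cache: keeps recently used disk blocks in memory."""

    def __init__(self, read_block, capacity):
        self.read_block = read_block   # function that performs the slow disk read
        self.capacity = capacity       # maximum number of blocks held in memory
        self.blocks = OrderedDict()    # block number -> data, oldest first
        self.hits = 0
        self.misses = 0

    def get(self, number):
        if number in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(number)   # mark as most recently used
            return self.blocks[number]
        self.misses += 1
        data = self.read_block(number)        # slow path: go to the disk
        self.blocks[number] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)   # evict the least recently used block
        return data


def fake_disk_read(number):
    # Hypothetical stand-in for a real disk read: 512 bytes of dummy data.
    return bytes([number % 256]) * 512


cache = BlockCache(fake_disk_read, capacity=4)
for n in [1, 2, 1, 3, 1, 2]:
    cache.get(n)
print(cache.hits, cache.misses)   # 3 reads served from memory, 3 from "disk"
```

Every hit is a trip to the disk drive that never had to happen, and that is
where the time is saved.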
BENCHMARKS:
Before beginning it may be helpful to explain that the standard benchmark of PC
execution speed has become Peter Norton's SYSINFO program. The version of
SYSINFO used for these illustrations was 3.0. This program assigns a numeric
rating to the computer tested, 1.0 being the speed of a standard PC running an
8088 at 4.77 MHz. So, 1.0 is DEAD SLOW.
In order to demonstrate just how much speed can be gained by the various
methods detailed above, here are some operations, done with and without disk
caching and at different clock speeds. They were all done on the same
computer, a CHIPSET (NEAT) '286 running at 10 or 20 MHz. This machine is
certainly not the fastest computer that money can buy, but it ain't too shabby.
This machine consistently attains SYSINFO ratings of 11.5 or 15.5 (depending on
external or internal bus timing) at its 10 MHz speed and 23.0 at its 20 MHz
speed. There are utilities that will slow a computer down in order to play a
video game designed for the older, slower hardware. One such program,
VARISLOW.EXE, was used to decrease the speed of the computer used for testing as
much as possible, to a SYSINFO rating of 3.6 or 4.0.
Of the many operations that can be done by DATAMAGE, sorting the records into
order is the most demanding and time-consuming. The sorts were, therefore,
used for the benchmark tests. The datafile used for the benchmark tests
comprises 3,332 records. The size of this file after it was converted to the
DATAMAGE format (DATAMAGE also found and rejected all the duplicate records it
contained during the conversion!) is 2,050,512 bytes. It was imported from the
dBase format, and is distributed with the SHAREWARE MARKETING SYSTEM from Jim
Hood. Thanks, Jim!
HARDWARE:
For the hardware test the records were first arranged into an order that was
the same for each test, and completely random with respect to the desired
order, then sorted on RECORD NUMBERS. The record numbers are in the computer's
memory, so the disk drive was not accessed.
SI RATE    1ST 1,000    2ND 1,000    3RD 1,000     TIME
=======================================================
    3.6         1:15         3:20         5:25    12:11
   15.5          :23         1:00         2:23     3:09
   23.0          :14          :36          :58     2:11
As you can see, the few extra dollars spent to procure a modern, fast computer
PAY OFF! As I learned long ago: There's NO economy in buying junk! And
there is no point of reference that will demonstrate this axiom more clearly
and frequently than computer hardware.
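For those who like to see the arithmetic, the totals in the TIME column above
can be turned into speed-up factors with a short Python sketch. The helper
name seconds is made up for the example; the times are copied straight from
the table.

```python
def seconds(text):
    """Convert 'h:mm:ss', 'mm:ss' or ':ss' into a number of seconds."""
    total = 0
    for part in text.split(":"):
        total = total * 60 + int(part or 0)   # an empty part (as in ':23') counts as 0
    return total


# SI rating -> total sort time, from the TIME column of the table
totals = {3.6: "12:11", 15.5: "3:09", 23.0: "2:11"}

slowest = seconds(totals[3.6])
for rating, total in totals.items():
    speedup = slowest / seconds(total)
    print(rating, round(speedup, 1))
```

By this reckoning the sort ran about 3.9 times as fast at the 15.5 rating and
about 5.6 times as fast at the 23.0 rating as it did at 3.6.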
HARDWARE and DISK CACHING:
Even more difficult than the sorting of numbers already in the computer's
memory is the sorting of string data. This must be gotten from the disk drive
as the sort progresses, and takes far longer to compare than numeric data.
The same file contains a string field: COMPANY NAME. The file was arranged in
an order that placed the company name at as near as you could come to random
intervals within the file, then sorted into alpha order. As I watched the sort
progress I realized that each and every record was moved.
DBASE AND SORTING ALPHA:
While completing this portion of the documentation I asked a young friend who
has, like a lot of other people, been FAKED OUT by dBase, how long it would
take the industry standard to complete this task. I pointed him to the file in
dBase format, and he got right back to me and, with swollen chest and in a VERY
loud voice, informed me that dBase did this in less than a minute.
I KNEW BETTER!
So, I went over there and asked him to do it for me. He gave dBase the
commands and it did SOMETHING in less than a minute. What it did was to create
a whopping 355K b-tree file containing the data in the company name field. It
did NOT sort the records nor, apparently, is it able to do so!
As we attempted to USE this index we noticed that, while getting the next
record in the order, there was a delay. This time was spent by dBase in
climbing its b-tree in order to FIND said record.
DATAMAGE, on the other hand, produced a MARKER file of only a little over 6K.
The reading of the next record in the order was INSTANTANEOUS, as the MARKER
file contained that record's location in the file. DATAMAGE did ALL of the work
involved in the sort at one time. You could use the results of this work for
the next thousand years and not have to wait one additional nanosecond to see
your records in the desired order.
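To show why a file of this kind can hand over the next record without any
searching, here is a rough Python sketch. It is NOT the actual MARKER file
format, which is not described in this document. It assumes fixed-length
records and a list of 16-bit record numbers stored in sorted order; the names
RECORD_SIZE, build_marker and record_offset are invented for the example.

```python
import struct

RECORD_SIZE = 32   # hypothetical fixed record length in bytes


def build_marker(records, key):
    """Sort once; return the record numbers in key order as packed 16-bit integers."""
    order = sorted(range(len(records)), key=lambda i: key(records[i]))
    return struct.pack("<%dH" % len(order), *order)


def record_offset(marker, position):
    """Byte offset in the datafile of the record at this place in the sorted order."""
    (number,) = struct.unpack_from("<H", marker, position * 2)
    return number * RECORD_SIZE


names = ["Zenith", "Acme", "Moore"]            # records 0, 1 and 2 as stored
marker = build_marker(names, key=str.upper)
print([record_offset(marker, p) for p in range(3)])   # offsets of Acme, Moore, Zenith

big = build_marker(list(range(3332)), key=lambda r: r)
print(len(big))   # 2 bytes per record for 3,332 records
```

Under these assumptions a list for 3,332 records comes to 6,664 bytes, which
is in the same neighborhood as the figure quoted above. Finding the next record
is one multiplication and one seek, with no tree to climb.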
In the immortal words of one Laut Su (Chinese prophet circa 1,000 A.D.): "The
foremost thoughts in a wise man's mind are these: THINGS ARE SELDOM AS THEY
SEEM!" Maybe there is a program that will do a REAL alpha-sort faster than
DATAMAGE. Be that as it may, I will NOT lie or cheat to make my program SEEM
faster than it is. But soon, maybe next year, I will re-write the BASE program
in C, and thereby honestly increase its speed.
HONEST TIMES TO SORT THE DATA:
With these times the speed of access to data on disk becomes a very important
factor, and disk caching begins to have great value. The following table
relates the computer's speed as well as the presence of caching.
NORTON   BYTES  1ST    RECORDS  2ND    RECORDS  3RD    RECORDS  TOTAL    RECORDS
SI RATE  CACHE  1000   /SECOND  1000   /SECOND  1000   /SECOND  TIME     /SECOND
================================================================================
  4.0        0  20:14      .78  29:53      .55  33:26      .50  1:37:14      .57
 11.5        0  11:30     1.45  15:45     1.01  17:15      .97    50:13     1.10
 11.5     1024   7:35     2.20   9:01     1.85  10:29     1.59    30:40     1.81
 23.0        0   9:02     1.83  12:20     1.35  13:17     1.41    39:15     1.41
 23.0     1024   5:08     3.25   5:29     3.03   7:23     2.26    18:07     3.07
You are quite likely more mathematically inclined than myself. Hordes of
numeric data could be extrapolated from the above - have fun! Just one point:
If you have a computer that actually rates 4.0 on SYSINFO you will be
hard-pressed to match the first line. Though the time-wasting utility slowed
the microprocessor down, this machine STILL communicates with a drive that's 3
times as fast as the drive that came with the original PC or XT, down a 16-bit
bus.
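As a starting point for that sort of figuring, here is a short Python sketch
that works out records per second, and the gain from the cache, using the
TOTAL TIME column of the table above. The helper name seconds is made up for
the example; the times and the 3,332 record count come from this document.

```python
RECORDS = 3332   # number of records in the benchmark datafile


def seconds(text):
    """Convert 'h:mm:ss' or 'mm:ss' into a number of seconds."""
    total = 0
    for part in text.split(":"):
        total = total * 60 + int(part)
    return total


# (SI rating, cache size as listed in the table) -> total sort time
totals = {
    (4.0, 0): "1:37:14",
    (11.5, 0): "50:13",
    (11.5, 1024): "30:40",
    (23.0, 0): "39:15",
    (23.0, 1024): "18:07",
}

# Overall records per second for each line of the table
for (rating, cache), total in totals.items():
    print(rating, cache, round(RECORDS / seconds(total), 2))

# How many times faster the same machine sorts with the cache in place
for rating in (11.5, 23.0):
    gain = seconds(totals[(rating, 0)]) / seconds(totals[(rating, 1024)])
    print(rating, "cache speed-up", round(gain, 2))
```

The records-per-second results should land within a hundredth or so of the
last column of the table. The cache cuts the total time by a factor of roughly
1.6 at the 11.5 rating and roughly 2.2 at the 23.0 rating.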
The point should be made here that DATAMAGE can, and usually will, sort your
datafile MUCH faster than this. Look at the times above as sort of a
worst-case scenario. See the file MAIN.DOC, under the heading F-10, MARKER FILE
and the sub-heading RESTORE REMAINING RECORDS to discover how to get your work
done fast. You will only need to sort the raw order ONCE; after that you'll
make much better time via the use of a MARKER FILE.
KEEPING THE FILES TOGETHER - FRAGMENTATION:
As DATAMAGE files grow your disk drive is bound to get FRAGMENTED. The
datafiles will grow piece by piece. Say you have a DATAMAGE file that is 100K
in length; you enter several documents with your word processor, download a
couple hundred Kbytes with your modem, then add some records to your DATAMAGE
file. All the things that happened in between will BE in between on your disk
drive, resulting in DATAMAGE having to hunt all over the drive to read the file.
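The scenario just described can be acted out in a few lines of Python. This is
a toy model only: the "disk" is a plain list of blocks, each labeled with the
file that owns it, and the function name fragments is invented for the
example. Real disks and real defragmenters are a good deal more involved.

```python
def fragments(disk, owner):
    """Count the separate contiguous runs of blocks that belong to one file."""
    runs = 0
    previous = None
    for block in disk:
        if block == owner and previous != owner:
            runs += 1          # a new piece of the file starts here
        previous = block
    return runs


disk = []
disk += ["DATAMAGE"] * 4   # the datafile as first written
disk += ["LETTERS"] * 3    # word processor documents saved afterwards
disk += ["DOWNLOAD"] * 5   # files downloaded with the modem
disk += ["DATAMAGE"] * 2   # new records appended to the datafile

print(fragments(disk, "DATAMAGE"))           # 2 separate pieces

# Crude stand-in for a defragmenter: group each file's blocks together.
defragmented = sorted(disk)
print(fragments(defragmented, "DATAMAGE"))   # 1 piece
```

Each extra piece means another trip across the drive for the heads, which is
the hunting described above.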
There are utilities galore to defragment hard disks. Two GOOD ones that are
available as SHAREWARE from your disk vendor or on a local BBS are DOG
(DiskOrGanizer) and SST. DOG is really safe, but also kinda slow. SST is
really fast but, should your power fail during the session, bye bye disk!
Commercial programs such as NORTON UTILITIES also include defragmentation in
their bag of tricks. Whatever your choice of vehicle, you should defragment
your hard disk AT LEAST once a month, and under heavy usage once a week. Doing
so will have you working more and waiting less.