- EQANDA COPYRIGHT 1995-1996 horio shoichi EQANDA
-
-
-
- NAME
- eqanda.txt - Expected Questions And Answers
-
-
- CONTENTS
- This section collects the questions likely to be raised when
- using concache.exe, the DOS disk cache program, and its
- family of programs, together with their answers. The
- contents of this section are as follows.
-
- Why And How Cache Programs Speed Up Disk Io ?
- What Are The Elements To Limit Concurrency ?
- How Much Memory Should Be Prepared For Cache ?
- How Concache.exe Can Be Tuned, In Terms Of Conventional
- Memory ?
- How Concache.exe Can Be Tuned, In Terms Of Performance ?
- Is There Anything To Note With Relation To Serial
- Communications Software ?
- Troubleshooting
-
-
- QUESTION
- Why And How Cache Programs Speed Up Disk Io ?
-
- ANSWER
- Strictly speaking, disk cache programs don't speed up disk
- io. Instead, they reduce the number of disk io operations,
- so that to the user program disk io appears to complete
- almost at once. They buffer disk data in a large memory area
- called the disk cache buffer (hereafter simply the cache).
- For read requests, if the data to be read reside in the
- cache, they are supplied from the cache. In addition, data
- likely to be read next by user programs are read in advance
- and stored in the cache. This method of speeding up reads is
- called "read ahead" or "preread". For write requests, the
- data to be written are copied into the cache, and user
- programs "think" the data have really been written to disk.
- The data are actually written at the cache program's
- convenience. This method of speeding up writes is called
- "delayed write", "write behind", "write after", or
- "postwrite".
-
- The first generation of PC cache programs was generally
- reluctant to use postwrite, which was regarded as too much
- of a luxury. Data were written to disk as soon as the write
- was requested. This way of handling writes is called
- "write-through".
-
- Concache 1.10 Last Update: 19 June 1996
-
- When cache programs that use postwrite arrived on the
- market, they were found to more than double the speed of
- writes. The reason is that the file allocation table, known
- as the FAT, is located at the start of the disk and the data
- space at the opposite end, so every write request first
- writes the FAT to mark clusters as used, then moves the head
- to the allocated area and writes the data sectors. Postwrite
- in effect eliminates repeated writes to the FAT by handing
- DOS the not yet written FAT from the cache. So not only is
- the actual number of write operations reduced, but most head
- movements are eliminated, because the head no longer needs
- to go back and forth to the FAT area.
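-
- The saving can be illustrated with a small Python sketch.
- It is a toy model, not concache.exe's algorithm: it assumes
- every append touches one FAT sector and one data cluster,
- that the head starts over the FAT area, and that postwrite
- defers only the FAT update until a final flush. The function
- name simulate and its arguments are made up for this
- illustration.

```python
def simulate(appends, postwrite):
    """Count physical writes and long head seeks for a run of appends.

    Toy model (assumption): the disk has two areas, the FAT and the
    data space, and moving between them costs one long seek.
    """
    writes = 0
    seeks = 0
    head = "FAT"          # assume the head starts over the FAT area
    fat_dirty = False

    def physical_write(area):
        nonlocal writes, seeks, head
        if head != area:
            seeks += 1    # long seek between FAT area and data area
            head = area
        writes += 1

    for _ in range(appends):
        if postwrite:
            fat_dirty = True          # FAT update stays in the cache
        else:
            physical_write("FAT")     # write-through: FAT goes out now
        physical_write("DATA")        # the data cluster itself
    if fat_dirty:
        physical_write("FAT")         # one deferred FAT write at flush
    return writes, seeks
```

- Under these assumptions, simulate(100, False) gives 200
- writes and 199 long seeks, while simulate(100, True) gives
- 101 writes and 2 long seeks.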
-
- When working with floppies, you may have experienced severe
- performance degradation if the buffers= statement in the
- config.sys file is set inadequately. You may also have
- observed that writes slow down as your program proceeds.
- What cache programs do, up to this generation, is extend
- the config.sys buffers= statement into a large cache buffer.
-
- Next came the so-called "advanced" cache programs, which
- attempt to write data back concurrently with user programs.
- These cache programs don't wait for keyboard idle time, for
- example, to write back cached data. This means that the
- common belief behind traditional DOS programs, that because
- disk writes are slow the data must be held in the
- application program's buffers until it becomes absolutely
- necessary to write them back, is wrong. Writing data as
- required is in fact faster and, perhaps less importantly,
- eliminates the need for huge buffers in each application
- program. In addition, because data are written as they are
- produced, there is less chance of accidental data loss.
- Programs become faster, safer and leaner.
-
- One could say that real disk speed up began with this
- generation.
-
- Concache.exe belongs to this generation and adds another
- generalization: it allows concurrency wherever there is no
- reason to refrain from it. As a result, one floppy, one BIOS
- disk, and as many SCSI disks as can be configured into DOS
- can be driven concurrently with DOS and user programs.
-
-
- QUESTION
- What Are The Elements To Limit Concurrency ?
-
- ANSWER
- From the hardware point of view, floppies cannot perform io
- concurrently with each other because of the floppy
- controller design. Neither can IDE disks. SCSI disks can
- perform io in parallel, as seen on many multiprogramming
- operating systems. At this level, one floppy, one IDE disk
- and the SCSI disks can operate concurrently.
-
- The next level to consider is the BIOS support for io
- operations. As far as published BIOS listings are concerned,
- there is no reason a floppy and an IDE disk cannot operate
- concurrently. SCSI drivers are usually written to do io
- asynchronously.
-
- Here the BIOS capability to distinguish disk events comes
- into play. A standard BIOS handles only two "types" of
- disks, which is sufficient for floppy and IDE disk
- environments, as found in most PC configurations.
- Fortunately, the ASPI (Advanced SCSI Programming Interface)
- specification, now widely employed, supports a mechanism
- effectively similar to BIOS disk event notification, called
- command posting. (See the appropriate manual about this.)
- This allows each individual disk's events to be handled.
-
- At this level nothing changes with regard to concurrency.
-
- The next factor is the non-reentrancy of device drivers.
- Even if a device driver manages several disks, it expects
- its requests to arrive serially, never while a previous
- request is in progress. In fact, most known device drivers
- lose the reentrancy necessary for concurrency in the very
- first two steps of driver code execution.
-
- Also, io.sys handles int13, through which almost every disk
- device call passes, in a non-reentrant way. You might think
- that if a third party device driver is used, for example
- io.sys for floppies and the third party driver for the other
- disk devices, then at least the combination of one floppy
- and one hard disk should work concurrently. But no. If both
- share int13, they don't work concurrently.
-
- Next comes DOS drive letter availability. If, for example, a
- SCSI disk is split into two partitions, as is done for many
- good reasons, the user gives up an extra drive letter for
- the same disk. The two partitions cannot overlap their io
- operation time.
-
- These constitute the inherent limitations on concurrency.
- In practice, there are also resource limitations for
- programs under DOS. For example, an ASPI driver may limit
- the number of packets it can accept at once.
-
- Likewise, ccdisk.exe can limit the concurrency of SCSI disks
- from its command line.
-
- Finally, concache.exe can limit concurrency in two ways.
-
- 1) The concurrency= option limits the number of concurrent
- devices.
-
- 2) The io_buffers= option, if it specifies too few io
- buffers, prevents devices from working concurrently.
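-
- The limits in this answer can be read as a chain of caps on
- the number of devices doing io at once. The Python sketch
- below is only a rough model under stated assumptions: the
- function effective_concurrency and its parameter names are
- hypothetical and are not concache.exe or ccdisk.exe syntax,
- scsi_limit stands for whatever cap the ASPI driver or
- ccdisk.exe imposes, and one io buffer per concurrently
- driven device is assumed.

```python
def effective_concurrency(floppies, bios_disks, scsi_disks,
                          scsi_limit=None, concurrency=None,
                          io_buffers=None):
    """Rough upper bound on devices doing io at the same time.

    Hypothetical model; names are illustrative, not program options.
    """
    # At most one floppy and one BIOS (IDE) disk can be active at once.
    active = min(floppies, 1) + min(bios_disks, 1)
    # SCSI disks overlap unless the ASPI driver or ccdisk.exe caps them.
    scsi = scsi_disks if scsi_limit is None else min(scsi_disks, scsi_limit)
    active += scsi
    # concurrency= caps the number of concurrent devices.
    if concurrency is not None:
        active = min(active, concurrency)
    # Assumption: each concurrently driven device needs its own io buffer.
    if io_buffers is not None:
        active = min(active, io_buffers)
    return active
```

- For instance, with two floppies, two BIOS disks and three
- SCSI disks the model gives 5, and any of the caps can bring
- that figure down.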
-
-
- QUESTION
- How Much Memory Should Be Prepared For Cache ?
-
- ANSWER
- There are certainly optimal cache sizes. Unfortunately, they
- depend too heavily on the applications and their mix. There
- is no clear way to estimate the required size or the
- resulting performance of the cache.
-
- Fortunately, concache.exe allows the cache size to be
- changed on the fly, so you can observe the performance at
- various cache sizes. If adding memory doesn't improve
- performance, then either your mix needs still more memory,
- or you may decide to decrease the cache memory size without
- degrading performance.
-
- A "pathological" looking example is presented below. This
- kind of anomaly is not uncommon in practice.
-
- Consider the following hypothetical example. I edit,
- compile, link, and debug programs, repeating these steps
- cyclically. For simplicity, assume each step requires
- exactly one megabyte, and that each step needs a set of
- files completely unrelated to the other steps (unrealistic ?
- but keep it simple for now). Now let's have a 3 megabyte
- cache. How will these 3 mb be used ?
-
- The first three steps load the editor and source files into
- the first megabyte, then the compiler, header, source, and
- object files into the next megabyte, and finally the linker,
- library, object and exe files into the last megabyte.
-
- The fourth step finds no free megabyte, so it must select
- one from among the three. Now the familiar algorithm takes
- its turn. Since the content of the first megabyte is the
- least recently used, it is considered unlikely to be used
- very soon. So the algorithm loads the exe file, debugger,
- and test data into, you see, the first megabyte.
-
- I go back to the editor. It is no longer in the first
- megabyte, as you have just witnessed. The editor etc. must
- be loaded into the second megabyte with similar fuss. This
- purges the compiler and so on from the second megabyte. ...
-
- In this example cache performance is no better than if I
- used only a one megabyte cache. If I added another megabyte,
- performance would improve in a jump, but adding still more
- would do no good. If your job mix consists of five mutually
- unrelated steps each requiring one megabyte and the cache
- size is four megabytes, then the four megabytes are no
- better than one megabyte.
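-
- The anomaly is easy to reproduce. The Python sketch below
- models the example under its own simplifications: each step
- is treated as a single one megabyte block, the steps repeat
- in a fixed cycle, and replacement is plain LRU. The function
- name lru_hits is made up for this illustration.

```python
from collections import OrderedDict


def lru_hits(cache_size, steps, rounds):
    """Count cache hits when `steps` one-megabyte working sets are
    visited cyclically `rounds` times with an LRU cache holding
    `cache_size` megabytes."""
    cache = OrderedDict()            # oldest entry first
    hits = 0
    for i in range(steps * rounds):
        step = i % steps
        if step in cache:
            hits += 1
            cache.move_to_end(step)  # mark as most recently used
        else:
            if len(cache) >= cache_size:
                cache.popitem(last=False)  # evict least recently used
            cache[step] = True
    return hits
```

- With four steps, lru_hits(3, 4, 100) and lru_hits(1, 4, 100)
- both return 0, while lru_hits(4, 4, 100) returns 396: every
- access after the first round is a hit. Likewise
- lru_hits(4, 5, 100) returns 0.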
-
- This extreme case arises from the commonly used LRU
- algorithm combined with extremely simplistic assumptions
- about the usage pattern. The least recently used space is
- supposed to be unlikely to be used very soon, but in this
- case it is exactly what is needed next. So, to pick a victim
- out of the three megabytes already in use, let us select it
- randomly. The probability that the next needed megabyte
- survives is 0.67, and cache performance is improved that
- much, isn't it ?
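-
- Random victim selection can be tried out the same way. The
- Python sketch below uses the same toy model (one block per
- step, fixed cycle) and a seeded generator so that runs are
- repeatable; random_hit_ratio is a name made up here. Note
- that 0.67 is the chance that the next needed megabyte
- survives one eviction. Over a long run of this model the
- measured hit ratio comes out near one half, which is still
- far better than the zero hits LRU achieves.

```python
import random


def random_hit_ratio(cache_size, steps, accesses, seed=1):
    """Hit ratio for cyclic access to `steps` one-megabyte working
    sets when the victim is chosen at random from the cache."""
    rng = random.Random(seed)        # seeded for repeatable runs
    cache = []
    hits = 0
    for i in range(accesses):
        step = i % steps
        if step in cache:
            hits += 1
        else:
            if len(cache) >= cache_size:
                # Evict a randomly chosen resident megabyte.
                cache.pop(rng.randrange(len(cache)))
            cache.append(step)
    return hits / accesses
```

- For example, random_hit_ratio(3, 4, 200000) models the three
- megabyte cache and four step cycle discussed above.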
-
- A similar situation arises when copying a large file.
- Records that will never be read or written again flow
- continually into the cache data area, erasing useful data
- from it. So a cache area more than double the file size is
- necessary to keep important data cached.
-
- In practice, however, the situation is not that bad. Even
- for file copying, the FAT and directory images are
- repeatedly referenced from the cache data area, so disk head
- movements, as well as repeated reads and writes to these
- areas on disk, are avoided, which improves the speed of the
- copy operation. In the case of file copy, a rather small
- cache area works as well as a large one.
-
-
- QUESTION
- How Concache.exe Can Be Tuned, In Terms Of Conventional
- Memory ?
-
- ANSWER
- An inevitable penalty of concurrency is its memory
- requirement. Each concurrently driven device needs its own
- io buffer, control and stack space to switch to and fro, a
- request packet to organize io, and, for ccdisk.exe, a SCSI
- control block, in addition to the descriptors needed for the
- drives managed by concache.exe.
-
- The following describes ways to save memory space used by
- concache.exe.
-
- First, you can load concache.exe into upper memory, either
- through config.sys as a device driver or through
- autoexec.bat as a TSR (terminate and stay resident).
-
- Second, the io buffer size can be changed with the
- buffer_size= option, although a smaller buffer can slow down
- data transfers. Note that the size must be at least the size
- of the largest sector to be cached.
-
- Third, the number of io buffers can be changed. This change
- can affect the performance of io done by concache.exe, so
- experiments are needed.
-
- Fourth, directory space can be reduced to the minimum needed
- for the concurrency you want.
-
- Fifth, if the full stack space, currently 440 - 500 bytes,
- is not used, it can be reduced to a bare minimum of 320
- bytes, provided no SCSI disks are used. However, this may be
- affected by other external interrupt devices, so experiments
- may be needed. (After all, under DOS, the proof of the stack
- is in the eating.)
-
- Finally, on the ccdisk.exe command line, concurrency
- requirements can be reduced to the bare minimum. If
- unfortunately concurrency mode cannot be used, then
- specifying "concurrency=1" would save hundreds of bytes.
-
-
- QUESTION
- How Concache.exe Can Be Tuned, In Terms Of Performance ?
-
- ANSWER
- Speed is gained either by making io more efficient or by
- exploiting maximal concurrency.
-
- First, make the tick_delay= value larger, to avoid clashes
- between DOS and concache.exe write back actions. This comes
- with almost no penalty.
-
- Second, make the io buffer size or the number of io buffers
- larger. The options for these two factors work almost
- synonymously, since concache.exe doesn't do io in fixed size
- buffers. This will improve the time of each io and, if the
- number of buffers is sufficiently large, will also allow
- concurrent actions.
-
- Third, the cache data area is split into units of 8kb, which
- is fairly large compared to the cluster size many people
- prefer, so if the drives are heavily fragmented, a large
- amount of space can be wasted in the cache area. Note that
- drive fragmentation has a considerable influence on
- performance, and that this is not particular to concache.exe
- but applies to all disk cache programs that work on FAT
- oriented file systems.
-
- Fourth, distributing files across disks in such a way that
- io overlapping is possible would avoid io clashes.
-
- Fifth, although preread improves performance in most cases,
- it can degrade overall performance in certain cases. If the
- read pattern is random, preread is not only useless but
- slows things down further through access clashes. If such
- files are frequently accessed, it might be better to move
- them to a partition that is not preread. If the cache data
- area is of marginal size, preread can purge still useful
- data from it and read in not yet needed data instead.
-
-
- QUESTION
- Is There Anything To Note With Relation To Serial
- Communications Software ?
-
- ANSWER
- Serial communications are notorious for their severe timing
- requirements. For example, when the communication speed is
- 38.4 kbps and the communication device is a model that lacks
- a buffer, each character received through it must be handled
- within about 260 microseconds (one character time, assuming
- 10 bits per character; a single bit takes about 26
- microseconds). Failing to handle the received character
- within this interval results in the overrun error familiar
- to programmers. Note that this problem is particular to the
- receive side; small delays on the send side usually cause no
- serious problems.
-
- On the other hand, since concache.exe works asynchronously
- with serial io, disk io is initiated and completed
- concurrently with character transmissions. This means
- concache.exe causes various housekeeping tasks in DOS
- context to be performed within that short interval, which is
- almost impossible on most PCs other than recent high
- performance ones.
-
- Fortunately, remedies do exist. Several possible ways are
- listed below.
-
-
- write after mode
- This avoids overlapping disk operations with serial
- transfers, so the severe timing problem disappears.
-
- buffered controller
- If the controller used for serial communication has a
- receive buffer, the short interval can be extended several
- times. For example, an NS16550 chip, when properly
- programmed, lengthens the interval 16 times.
-
- hardware flow control
- If this is possible on your PC and its counterpart, it
- prevents reception when there is no room for it, so the
- short interval is (unlimitedly ?) extended.
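-
- The intervals in this answer follow from simple arithmetic,
- sketched below in Python. The sketch assumes 10 bits per
- character (one start bit, eight data bits, one stop bit) and
- a receive buffer that is emptied completely each time it is
- serviced; the depth of 16 is taken from the "16 times"
- figure above. The function names are made up for this
- illustration.

```python
def char_time_us(bits_per_second, bits_per_char=10):
    """Time one character occupies the line, in microseconds.

    Assumes 10 bits per character unless told otherwise.
    """
    return bits_per_char * 1_000_000 / bits_per_second


def service_deadline_us(bits_per_second, fifo_depth=1, bits_per_char=10):
    """Rough upper bound on how long the receiver may go unserviced
    before an overrun, given a receive buffer of `fifo_depth`
    characters that is emptied on each service."""
    return fifo_depth * char_time_us(bits_per_second, bits_per_char)
```

- At 38400 bits per second this gives roughly 260 microseconds
- for an unbuffered controller and roughly 4.2 milliseconds
- with a 16 character receive buffer.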
-
-
- Troubleshooting
- In the following, common conflicts such as irq, dma, memory,
- and SCSI option settings are not discussed. They are treated
- in the respective manufacturers' manuals, and are (probably)
- not particular to concache.exe per se.
-
- First, check the stack issue, as it causes the most obscure
- effects on the workings of DOS programs.
-
- Concache.exe is designed to work in the environment
- stacks=0,0. However, because of the variety of BIOS
- manufacturers and the existence of so many BIOS versions, it
- is not certain that the estimate of concache's own stack
- requirements is enough in every environment it encounters.
- In addition, there may be programs which expect a large
- stack space to be available at any time. For testing
- purposes, first try an "extremely wasteful" stack space in
- config.sys. If this solves the problem, your remaining task
- is to find the best values for the config.sys line.
-
- Alternatively, the stacksize= option of concache.exe can be
- tried to find out whether concache.exe is experiencing stack
- overflow.
-
- Let's discuss the problems in each mode of concache.exe. The
- respective mode is given by option or by drive description.
-
-
- Fail On Stop Mode
- If concache.exe fails in stop mode, there are two cases to
- consider.
-
- The CPU overhead concache.exe incurs can be the problem. See
- the question on serial communications software above. There
- is no general solution whatsoever.
-
- The conflict can be with third party device drivers or
- hardware. The gnaw_interrupt option of concache.exe may help
- in some cases.
-
-
- Write Through Mode Doesn't Work
- The added complexity from stop mode to write through mode is
- the actual access to the memory manager and device driver.
-
- Empirically, conflicts with memory managers are very rare,
- except for pre-'90 EMS managers.
-
- Some device drivers may not be prepared for recent device
- driver conventions.
-
-
- Write After Mode Doesn't Work
- Concurrency problems start from this mode. A variety of
- assumptions about the single-tasking nature of DOS programs,
- in which io actions are enclosed within DOS context, begin
- to cause conflicts.
-
- Interrupt intensive applications can fail due to the
- switching overhead caused by concache.exe. If this might be
- the case, try write through mode. Slowing down is far better
- than losing data.
-
- Concurrency Mode Fails
- If write after mode works but concurrency mode doesn't, most
- of the problems seem to be ones of synchronization. One of
- the cases encountered while testing compatibility is due to
- improper int2a8x handling.
-
- For example, a network program ignores int2a8x critical
- section interrupts during an int13 period, which is exactly
- when concache.exe issues them. Consequently, the program
- miscounts int2a8x and erroneously identifies a DOS idle
- period.
-
- Another example: there are certain periods during which
- concache.exe does not want to be interrupted and reentered.
- In such cases it issues the DOS synchronization interrupt to
- warn other programs not to call DOS. Unfortunately, the
- interrupt is sometimes ignored or mishandled, causing a
- hang.
-
-
- SEE ALSO
- ccdisk.txt, concache.txt, floppies.txt, overview.txt.