UMSDOS is part of the kernel distribution (since 1.1.36). Many
have complained that it was somewhat slow. Many more accepted that
it had to be slow, but found it useful enough: an acceptable tradeoff.

Good news! There was a very good reason why Umsdos (and the msdos fs)
was slow. This is now history :-)

I will explain why it was slow, how I solved it, give some benchmarks
and tell you what you can do to help me test it.

Please note that I am already running this on my production
machine, so it surely works! It is fully compatible with the
umsdos 0.3 file layout.

History:
========
Umsdos runs piggyback on the Msdos file system driver written
a long, long time ago by Werner Almesberger. It was rock
solid when I started Umsdos back in November 1992. At that
time the buffer cache of Linux could only manipulate 1k blocks.
It simply split a disk into 1k blocks, and there was no way for
a file system driver to read or write anything other than a full
block. Since blocks are 1k large (two 512-byte disk sectors), each
starts on an even sector number.

The FAT msdos file system organises files in clusters. Clusters
are generally 2k, 4k or 8k on a hard drive. And here is the
bad news: the first cluster on a hard drive starts on an odd sector
number. This produces the following strange relation:

                  |--- B1---|--- B2---|--- B3---|--- B4---|--- B5---|
    linux blocks  |-S1- -S2-|-S1- -S2-|-S1- -S2-|-S1- -S2-|-S1- -S2-|
    DOS clusters   -S4-|-S1- -S2- -S3- -S4-|-S1- -S2- -S3- -S4-|
                       |-------file1-------|-------file2-------|

So one DOS cluster (2k) spans two 1k blocks? Wrong! It spans three. For
example, the cluster of "file1" touches the last half of "B1",
all of "B2" and the first half of "B3".

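The overlap can be checked with a little arithmetic. What follows is
only an illustrative Python sketch, not code from the driver: the
function name is made up for this example, sectors are numbered from 0
so that 1k blocks start on even sectors, and block index 0 stands for
"B1" in the diagram.

```python
def blocks_spanned(first_sector, cluster_sectors, block_sectors):
    """Return the indices of the buffer-cache blocks a cluster touches.

    first_sector    -- sector number where the cluster starts
    cluster_sectors -- cluster size in sectors (4 for a 2k cluster)
    block_sectors   -- block size in sectors (2 for a 1k block)
    """
    last_sector = first_sector + cluster_sectors - 1
    first_block = first_sector // block_sectors
    last_block = last_sector // block_sectors
    return list(range(first_block, last_block + 1))

# A 2k cluster starting on an odd sector touches three 1k blocks.
odd_start = blocks_spanned(first_sector=1, cluster_sectors=4, block_sectors=2)
# The same cluster starting on an even sector would touch only two.
even_start = blocks_spanned(first_sector=2, cluster_sectors=4, block_sectors=2)
print(len(odd_start), len(even_start))
```

The odd starting sector is the whole problem: shift the cluster by one
sector and it would fit exactly in two blocks.
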
Given that the msdos driver in linux can only read or write
whole 1k blocks, we get the following problem. Here are the operations
needed to write a full cluster of "file1":

    read B1
    fill the last half of B1 with the first sector of the cluster
    write B1
    write B2 with sectors 2 and 3 of the cluster
    read B3
    fill the first half of B3 with sector 4 of the cluster
    write B3

It has to be like this because, for one, the first sector of block
B1 may belong to another file. The current msdos file system driver
reads every sector before writing it. This is what kills performance.

The fix:
--------
At the time the msdos fs was written, linux was not flexible enough.
Now it is possible to control, at the file system level, the size
of the blocks in the buffer cache, and to do so for each file system
independently.

The trick was rather simple: use a block size of 512 bytes. A 2k cluster
will then span 4 full blocks. If we completely overwrite a cluster,
we have no need to read any block from the disk.

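To make the difference concrete, here is a small Python sketch that
lists the block operations needed to overwrite one cluster for a given
block size. It is a model of the reasoning in this text, not the
driver's actual logic; the function name and the 0-based sector and
block numbering are assumptions made for the example.

```python
def write_plan(first_sector, cluster_sectors, block_sectors):
    """List the block operations needed to overwrite one whole cluster.

    A block that the cluster covers only partly must be read first so
    that the sectors belonging to other data are preserved. A block
    that is covered completely can be written directly.
    """
    end_sector = first_sector + cluster_sectors  # one past the last sector
    ops = []
    first_block = first_sector // block_sectors
    last_block = (end_sector - 1) // block_sectors
    for block in range(first_block, last_block + 1):
        block_start = block * block_sectors
        block_end = block_start + block_sectors
        fully_covered = first_sector <= block_start and block_end <= end_sector
        if not fully_covered:
            ops.append(("read", block))
        ops.append(("write", block))
    return ops

# 2k cluster on an odd sector, 1k blocks (2 sectors per block)
old = write_plan(first_sector=1, cluster_sectors=4, block_sectors=2)
# Same cluster, 512-byte blocks (1 sector per block)
new = write_plan(first_sector=1, cluster_sectors=4, block_sectors=1)
print(old)
print(new)
```

With 1k blocks the plan contains two reads (the first and last block);
with 512-byte blocks it contains four writes and no read at all.
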
This fix has numerous advantages besides speed:

    - It makes the msdos fs simpler.
    - It makes "bmap" always available, so:
        - No special mmap handling anymore.
          Programs use less memory to run because
          they run directly in the buffer cache.
        - No special trick for swap files.
        - LILO can be used in a msdos fs (I have not tried it though).
          Very useful for those who use OS/2 and Linux in the same
          partition.

I have also introduced some support for read ahead. Most of the
changes were done in linux/fs/msdos/*.c

I have made the minimal changes required to make both umsdos and
the msdos fs loadable modules. This is included in the patch.

Some benchmarks:
----------------
I did not do very scientific benchmarks. I have made many tests
and many measurements. I think that one test I have done reflects
the real benefit of my work. I simply did a recursive copy
of the linux kernel (with object files) into another directory.
This test exercises different things:

    - Copying a large number of small files (958):
      a lot of file creation, a lot of access time
      updates.
    - Copying a large amount of data (10 meg), so
      real disk activity has to occur (it overruns the cache).

To give an idea, I have also run the test on DOS and OS/2
on exactly the same data (same partition, same everything,
different utilities though).

Here are the results. Test1 is a copy to a target
directory in the same partition. Test2 uses a target directory
in a different partition, forcing longer disk seeks.
The results are in seconds.

                                        test1   test2
    DOS with BUFFER=50                    117     124
    Standard umsdos 0.4 in 1.1.57          99     136
    OS/2 with 1024k disk cache             70
    DOS with disk cache                    58      46
       (hyperdisk with 1024k)
    New improved Umsdos 0.5                43      43
    New improved msdos                     32      32

On DOS and OS/2 I used xcopy. On linux, I used
"tar cf - linux | (cd dir; tar xvf -)". You will note that
I used the v option for the second tar to be fair, since
xcopy has no silent mode. I did get the same result
with a "cp -R ..." however.

As you can see, the performance improvement is drastic, especially
for test2. I suspect this is the typical case, where you read
something from one end of the disk and write something else
far away on the disk. Umsdos 0.4, by forcing a read before each write,
was defeating the usefulness of the cache (delayed write). As you
can see, test2 and test1 give the same result with umsdos 0.5.

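For a rough sense of scale, the speedup from umsdos 0.4 to 0.5 can be
worked out from the timings in the results table. The short Python
sketch below only redoes that arithmetic; the dictionary layout and the
helper name are choices made for this example.

```python
# Copy times in seconds, taken from the results table: (test1, test2).
times = {
    "umsdos 0.4": (99, 136),
    "umsdos 0.5": (43, 43),
}

def speedup(before, after):
    """How many times faster 'after' is than 'before'."""
    return before / after

test1 = speedup(times["umsdos 0.4"][0], times["umsdos 0.5"][0])
test2 = speedup(times["umsdos 0.4"][1], times["umsdos 0.5"][1])
print(f"test1: {test1:.2f}x  test2: {test2:.2f}x")
```

That is a bit over twice as fast within one partition and a bit over
three times as fast across partitions, for this one copy test only.
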
I did not include results for EXT2. I had little time to
do a fair test. My brief attempt to replicate the conditions
of the test on an EXT2 partition showed that EXT2 was slower
both for reading and writing. Much slower... I have no explanation
for this.

It would be nice to run a serious benchmark suite. From my
benchmark we can only conclude that Umsdos is not slow anymore.

How to test it?
---------------
I have uploaded the following files to
sunsite.unc.edu:/pub/Linux/ALPHA/umsdos

    umsdos-0.5-diff
    umsdos-0.5-diff.lsm
    umsdos-0.5-diff.README (this text).

The patch was made relative to 1.1.57 + a small notify_change
buglet fix. Get the latest kernel (1.1.59) and apply
the patch:

    cd /usr/src/linux
    patch -p1 </usr/src/umsdos-0.5-diff

Run it, beat it, test it, and please email me about anything,
even just to say "hello, it works".

Keep in mind that this is "file system stuff". A backup is always
a good idea.