umsdos.doc (1994-05-21)
1 Introduction
2 What is UMSDOS
3 General strategy
4 File name mangling
5 --linux-.---: the EMD file
6 Hard links
7 Symbolic links
8 Special file
9 Pseudo root
10 Dual mode
11 Miscellaneous
11.1 UMSDOS_create
11.2 UMSDOS_ioctl_dir
11.3 UMSDOS_lookup
11.4 UMSDOS_notify_change
11.5 UMSDOS_readdir
11.6 mount and UMSDOS_remount_fs
11.7 UMSDOS_rename
11.8 Data structure
11.9 Inode management
12 Synchronisation problems
13 Convention and style
14 Weakness and features
15 utilities
15.1 The UMSDOS synchroniser
15.2 Other
16 Test cases
16.1 utstgen
16.2 utstspc
1 Introduction
This document describes the implementation of the UMSDOS file
system for LINUX. It contains mostly implementation notes
rather than a formal description of the concept. It is
intended for a reader who already has a good knowledge of
the LINUX VFS system. It is expected (hoped) that such a
person may find weaknesses (or bugs) in UMSDOS just by reading
this document. It is the first document to read if you
suspect a bug.
2 What is UMSDOS
UMSDOS stands for "Unix in MSDOS" file system. UMSDOS is a
full-featured Unix-like file system for LINUX. It operates
within the limits (and semantics) of a normal MSDOS FAT file
system. chkdsk won't complain at all. No dirty tricks.
In case anyone wonders, UMSDOS does NOT use MSDOS (the OS) to
run. Using a special file in each directory
("--linux-.---"), UMSDOS simulates the full UNIX semantics (I
hope :-) ):
    Long file names
        case sensitive
        free format file names like "This.is.a.sample"
    Permissions and owner (user and group)
    Links (hard and symbolic)
    Device special files and pipes
UMSDOS is powerful enough to act as a ROOT file system for
LINUX. From a Linux kernel standpoint, this is a true
filesystem. Compile a kernel with UMSDOS, put it on a
diskette, use rdev to specify the proper root partition
(any MSDOS partition with Linux in it), and boot. In the
rest of this document, the --linux-.--- file is called the EMD
(Extension to Msdos Directory).
3 General strategy
UMSDOS operates on top of the MSDOS fs for LINUX. Using the
VFS function table, UMSDOS mostly intercepts calls to the MSDOS
fs, does some translation and sometimes carries out the
operation itself. Most of the job is directory searching, both
in the MSDOS fs and in the EMD.
4 File name mangling
[umsdos/mangle.c,259]
#Specification: file name / non MSDOS conforming / mangling
Non MSDOS conforming file names must use some alias to fit
in the MSDOS name space.
The strategy is simple. The name is simply truncated to
8 characters, periods are replaced with underscores and a
number is given as an extension. This number corresponds
to the entry number in the EMD file. The EMD file
only needs to carry the real name.
Upper case is also converted to lower case.
Control characters are converted to #.
Spaces are converted to #.
The following characters are also converted to #.
" * + , / : ; < = > ? [ \ ] | ~
Sometimes, the problem is not in MsDOS itself but in
command.com.
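The base name rules above can be sketched in C as follows. This is only an illustration written for this document, not the code of umsdos/mangle.c; the helper name mangle_base is hypothetical, and the sketch covers the base name only (the numeric extension is a separate step).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: derive the DOS base name (at most 8 characters)
 * from a Unix name, following the rules listed above.
 */
void mangle_base(const char *name, char base[9])
{
    /* Characters the specification converts to '#'. */
    static const char bad[] = "\"*+,/:;<=>?[\\]|~";
    size_t i;

    assert(name != NULL && base != NULL);
    for (i = 0; i < 8 && name[i] != '\0'; i++) {
        unsigned char c = (unsigned char)name[i];

        if (c == '.')
            base[i] = '_';                 /* periods become underscores */
        else if (iscntrl(c) || c == ' ' || strchr(bad, c) != NULL)
            base[i] = '#';                 /* control, space, listed chars */
        else
            base[i] = (char)tolower(c);    /* upper case becomes lower case */
    }
    base[i] = '\0';
}
```

For instance, the name "This.is.a.sample" is cut to its first 8 characters and yields the base "this_is_".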
[umsdos/mangle.c,26]
#Specification: file name / non MSDOS conforming / mangling
Each non MSDOS conforming file has a special extension
built from the entry position in the EMD file.
This number is then transformed into a base 32 number, where
each digit is expressed like a hexadecimal digit, using
digits and letters, except that it uses 22 letters from 'a' to 'v'.
The number 32 comes from 2**5. It is faster to split a binary
number using a base which is a power of two. And I was 32
when I started this project. Pick your answer :-) .
If the resulting digit is '0', it is replaced with '_', simply
to make it odd.
This is true for two of the three characters of the extension.
The remaining one is taken from a list of odd characters, which
are:
{ } ( ) ! ` ^ & @
With this scheme, we can produce 9216 (9 * 32 * 32)
different extensions which should not clash with any useful
extension already popular or meaningful. Since most directories
have far fewer than 32 * 32 files in them, the first character
of the extension of any mangled name will usually be {.
Here are the reasons for doing this kind of mangling.
-The mangling is deterministic. Just by the extension, we
are able to locate the entry in the EMD file.
-By keeping the beginning of the file name almost unchanged,
we are helping the MSDOS user.
-The mangling produces names that are not too ugly, so an msdos
user may live with them (remember them, type them, etc...).
-The mangling produces names ugly enough so no one will
ever think of using such a name in real life. This is not
fool proof. I don't think there is a total solution to this.
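The extension scheme above can be sketched as below. This is a minimal illustration under stated assumptions, not the code of umsdos/mangle.c: the function name emd_entry_to_ext is hypothetical, and the character order (odd character first, then two base 32 digits) is an assumption taken from the example name file.{_1 used elsewhere in this document.

```c
#include <assert.h>
#include <stddef.h>

/* Base 32 digits: '0' is replaced by '_', then '1'..'9', then 'a'..'v'. */
static const char digits32[] = "_123456789abcdefghijklmnopqrstuv";
/* The nine odd characters listed in the specification. */
static const char odd_chars[] = "{}()!`^&@";

/*
 * Hypothetical helper: build the 3 character extension for EMD entry
 * number entry_num. Returns 0 on success, -1 if the entry number is
 * outside the 9 * 32 * 32 values the scheme can encode.
 */
int emd_entry_to_ext(unsigned entry_num, char ext[4])
{
    assert(ext != NULL);
    if (entry_num >= 9u * 32u * 32u)
        return -1;
    ext[0] = odd_chars[entry_num >> 10];        /* assumed to come first */
    ext[1] = digits32[(entry_num >> 5) & 31];   /* high base 32 digit */
    ext[2] = digits32[entry_num & 31];          /* low base 32 digit */
    ext[3] = '\0';
    return 0;
}
```

With this layout, entry number 1 gives the extension {_1, and the odd character only changes from { to } once the entry number reaches 1024.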
[umsdos/mangle.c,302]
#Specification: file name / MSDOS devices / mangling
To avoid files unreachable from MsDOS, any MsDOS conforming
file with a basename equal to one of the MsDOS pseudo
devices will be mangled.
If a file such as "prn" was created, it would be unreachable
under MsDOS because prn is assumed to be the printer, even
if the file does have an extension.
Since the extension is unimportant to MsDOS, we must patch
the basename also. We simply insert a minus '-'. To avoid
conflicts with valid files with a minus in front (such as
"-prn"), we add a mangled extension like for any other
mangled file name.
Here is the list of DOS pseudo devices:
"prn","con","aux","nul",
"lpt1","lpt2","lpt3","lpt4",
"com1","com2","com3","com4",
"clock$"
and some standard ones for common DOS programs
"emmxxxx0","xmsxxxx0","setverxx"
(Thanks to Chris Hall <CAH17@PHOENIX.CAMBRIDGE.AC.UK>
for pointing these out to me).
Is there one missing ?
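The device name test described above could look like the following sketch. The helper name is_dos_device is hypothetical, the table simply repeats the list given above, and the name is assumed to be already converted to lower case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The pseudo device names listed in the specification above. */
static const char *const dos_devices[] = {
    "prn", "con", "aux", "nul",
    "lpt1", "lpt2", "lpt3", "lpt4",
    "com1", "com2", "com3", "com4",
    "clock$",
    "emmxxxx0", "xmsxxxx0", "setverxx"
};

/*
 * Hypothetical helper: return 1 if the base name of `name` (the part
 * before the first period) is a DOS pseudo device, 0 otherwise.
 * The extension is ignored, since MsDOS ignores it too.
 */
int is_dos_device(const char *name)
{
    size_t base_len, i;

    assert(name != NULL);
    base_len = strcspn(name, ".");
    for (i = 0; i < sizeof dos_devices / sizeof dos_devices[0]; i++) {
        if (strlen(dos_devices[i]) == base_len &&
            strncmp(name, dos_devices[i], base_len) == 0)
            return 1;
    }
    return 0;
}
```

A name for which this test returns 1 would then get the minus prefix and a mangled extension, as explained above.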
[umsdos/mangle.c,255]
#Specification: file name / --linux-.---
The name of the EMD file --linux-.--- is mapped to a mangled
name. So UMSDOS does not restrict its use.
[umsdos/mangle.c,147]
#Specification: file name / non MSDOS conforming / base length 0
File names beginning with a period '.' are invalid for MsDOS.
It absolutely needs a base name. So the file name is mangled.
[umsdos/mangle.c,220]
#Specification: file name / non MSDOS conforming / mangling clash
To avoid clashes with the umsdos mangling, any file
with a special character as the first character
of the extension will be mangled. This solves the
following problem:
    touch FILE
    # FILE is invalid for DOS, so mangling is applied
    # file.{_1 is created in the DOS directory
    touch file.{_1
    # For UMSDOS, each file points to a single DOS entry.
    # So file.{_1 has to be mangled.
[umsdos/mangle.c,212]
#Specification: file name / non MSDOS conforming / last char == .
If the last character of a file name is
a period, mangling is applied. MsDOS does
not support such file names.
[umsdos/mangle.c,139]
#Specification: file name / too long
If a file name exceeds the UMSDOS maximum, the file name is silently
truncated. This makes it conform with the other file systems
of Linux (minix and ext2 at least).
5 --linux-.---: the EMD file
Here is the strategy for inode management. UMSDOS lets the MSDOS fs
run and does simple transformations to the in-core inode
content. It also adds information at the end of the inode
structure. See /usr/include/linux/umsdos_fs_i.h.
[include/umsdos_fs.h,47]
#Specification: EMD file / record size
Entries are 64 bytes wide in the EMD file. This allows for a 30 character
name. If a name is longer, contiguous entries are allocated. So a
umsdos_dirent may span multiple records.
[umsdos/emd.c,233]
#Specification: EMD file structure
The EMD file uses a fairly simple layout. It is made of records
(UMSDOS_REC_SIZE == 64). When a name can't be written in a single
record, multiple contiguous records are allocated.
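The number of records an entry occupies can be worked out as in the sketch below. It rests on two assumptions that the text implies but does not spell out: the fixed part of an entry takes 64 - 30 = 34 bytes, and a long name simply continues into the following records. The names EMD_FIRST_NAME_ROOM and emd_records_needed are hypothetical.

```c
#include <assert.h>

#define UMSDOS_REC_SIZE      64   /* record size given in the text */
#define EMD_FIRST_NAME_ROOM  30   /* hypothetical name: name bytes in one record */

/*
 * Hypothetical helper: number of contiguous 64 byte records needed for
 * an entry whose name is name_len bytes long (name_len > 0, since a
 * length of 0 marks an unused entry).
 */
unsigned emd_records_needed(unsigned name_len)
{
    /* Assumed size of the fixed fields stored before the name. */
    const unsigned fixed = UMSDOS_REC_SIZE - EMD_FIRST_NAME_ROOM;

    assert(name_len > 0);
    /* Round (fixed + name_len) up to a whole number of records. */
    return (fixed + name_len + UMSDOS_REC_SIZE - 1) / UMSDOS_REC_SIZE;
}
```

Under these assumptions a 30 character name fits in one record, a 31 character name needs two, and a third record is only needed from 95 characters on.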
[umsdos/emd.c,135]
#Specification: EMD file / empty entries
Unused entries in the EMD file are identified
by the name_len field being equal to 0. However, to
help future extensions (or bug correction :-( ),
empty entries are filled with 0.
[include/umsdos_fs_i.h,80]
#Specification: strategy / in memory inode
Here is the information specific to the inode of the UMSDOS file
system. This information is added to the end of the standard struct
inode. Each file system has its own extension to struct inode,
and so does the umsdos file system.
The strategy is to have the umsdos_inode_info as a superset of
the msdos_inode_info, since most of the time the job is done
by the msdos fs code.
So we duplicate the msdos_inode_info, and add our own info at the
end.
For all file types (and directories) the inode has a reference to:
    the directory which holds this entry: i_dir_owner
    the EMD file of i_dir_owner: i_emd_owner
    the offset in this EMD file of the entry: pos
For a directory, we also have a reference to the inode of its
own EMD file. Also, we have dir_locking_info to help synchronise
file creation and file lookup. This data shares space with
the pipe_inode_info, which is not used by directories. See also
msdos_fs_i.h for more information about pipe_inode_info and
msdos_inode_info.
Special files and fifos do have an inode which corresponds to an
empty MSDOS file.
Symlinks are processed mostly like regular files. The content is the
link.
Fifos add their own extension to the inode. I have reserved some
space for fifos side by side with msdos_inode_info. This is just
for show, because msdos_inode_info already includes the
pipe_inode_info.
The UMSDOS specific extension is placed after the union.
[umsdos/emd.c,146]
#Specification: EMD file / spare bytes
10 bytes are unused in each record of the EMD. They
are set to 0 all the time. So it will be possible
to do new stuff and rely on the state of those
bytes in old EMD files still around.
6 Hard links
[umsdos/namei.c,480]
#Specification: hard link / strategy
Well ... hard links are difficult to implement on top of an
MsDOS FAT file system. Unlike UNIX file systems, there are no
inodes. A directory entry holds the functionality of both the
inode and the entry.
We will use the same strategy as a normal Unix file system
(with inodes) except we will do it symbolically (using paths).
Because anything can happen during a DOS session (defragment,
directory sorting, etc...), we can't rely on the MsDOS pseudo
inode number to record the link. For this reason, the link
will be done using hidden symbolic links. The following
scenario illustrates how it works.
Given a file /foo/file
    ln /foo/file /tmp/file2
becomes internally
    mv /foo/file /foo/-LINK1
    ln -s /foo/-LINK1 /foo/file
    ln -s /foo/-LINK1 /tmp/file2
Using this strategy, we can operate on /foo/file or /tmp/file2.
We can remove one and keep the other, like a normal Unix hard link.
We can rename /foo/file or /tmp/file2 independently.
The entry -LINK1 will be hidden. It will hold a link count.
When all links are erased, the hidden file is erased too.
[umsdos/namei.c,546]
#Specification: hard link / directory
A hard link can't be made to a directory. EPERM is returned
in this case.
[umsdos/emd.c,355]
#Specification: hard link / hidden name
When a hard link is created, the original file is renamed
to a hidden name. The name is "..LINKNNN" where NNN is a
number defined from the entry offset in the EMD file.
[umsdos/namei.c,565]
#Specification: hard link / first hard link
The first time a hard link is made to a file, this
file must be renamed and hidden. Then an internal
symbolic link must be made to the hidden file.
The second link is then made to this hidden file.
It is expected that the Linux MSDOS file system
keeps the same pseudo inode when a rename operation
is done on a file in the same directory.
[umsdos/namei.c,907]
#Specification: hard link / deleting a link
When we delete a file, and this file is a link,
we must subtract 1 from the nlink field of the
hidden link.
If the count goes to 0, we delete this hidden
link too.
7 Symbolic links
[umsdos/namei.c,418]
#Specification: symbolic links / strategy
A symbolic link is simply a file which holds a path. It is
implemented as a normal MSDOS file (not very space efficient :-()
I see 2 other ways to do it. One is to place the link data
in unused entries of the EMD file. The other is to have a separate
file dedicated to holding all symbolic link data.
Let's go for simplicity...
8 Special file
[umsdos/namei.c,732]
#Specification: Special files / strategy
Device special files, pipes, etc ... are created like normal
files in the msdos file system. Of course they remain empty.
One strategy was to create those files only in the EMD file
since they were not important for MSDOS. The problem with
that is that they were not getting an inode number allocated.
The MSDOS filesystem is playing a nice game to fake inode
numbers, so why not use it.
The absence of inode numbers compatible with those allocated
for ordinary files was causing major trouble with hard links
in particular and, I guess, with other parts of the kernel.
9 Pseudo root
[umsdos/inode.c,401]
#Specification: pseudo root / mount
When a umsdos fs is mounted, special handling is done
if it is the root partition. We check for the presence
of the file /linux/etc/init or /linux/etc/rc.
If one is there, we do a chroot("/linux").
We check both because (see init/main.c) the kernel
tries to exec init at different places and if it fails
it tries /bin/sh /etc/rc. To be consistent with
init/main.c, many more tests would have to be done
to locate init. Any complaints ?
The chroot is done manually in init/main.c but the
info (the inode) is located at mount time and stored
in a global variable (pseudo_root) which is used at
different places in the umsdos driver. There is no
need to store this variable elsewhere because there
will always be only one, not one per mount.
This feature allows the installation
of a linux system within a DOS system in a subdirectory.
A user may install their linux stuff in c:\linux,
avoiding any clash with existing DOS files and subdirectories.
When linux boots, it hides this fact, showing a normal
root directory with /etc /bin /tmp ...
The word "linux" is hardcoded in /usr/include/linux/umsdos_fs.h
in the macro UMSDOS_PSDROOT_NAME.
[umsdos/dir.c,65]
#Specification: pseudo root / directory /DOS
When umsdos operates in pseudo root mode (C:\linux is the
linux root), it simulates a directory /DOS which points to
the real root of the file system.
[umsdos/dir.c,480]
#Specification: pseudo root / DOS hard coded
The pseudo sub-directory DOS in the pseudo root is hard coded.
The name is DOS. It is done this way to help standardise
the umsdos layout. The idea is that from now on /DOS is
a reserved path and nobody will think of using such a path
for a package.
[umsdos/dir.c,511]
#Specification: pseudo root / .. in real root
Whenever a lookup is done in the real root for
the directory .., and pseudo root is active, the
pseudo root is returned.
[umsdos/namei.c,166]
#Specification: pseudo root / any file creation /DOS
The pseudo sub-directory /DOS can't be created!
EEXIST is returned.
The pseudo sub-directory /DOS can't be removed!
EPERM is returned.
[umsdos/dir.c,580]
#Specification: pseudo root / dir lookup
For the same reason as readdir, a lookup in /DOS for
the pseudo root directory (linux) will fail.
[umsdos/rdir.c,79]
#Specification: pseudo root / DOS/..
In the real root directory (c:\), the directory ..
is the pseudo root (c:\linux).
[umsdos/rdir.c,88]
#Specification: pseudo root / DOS/linux
Even in the real root directory (c:\), the directory
/linux won't show.
[umsdos/dir.c,542]
#Specification: pseudo root / lookup(DOS)
A lookup of DOS in the pseudo root will always succeed
and return the inode of the real root.
[umsdos/dir.c,165]
#Specification: pseudo root / reading real root
The pseudo root (/linux) is logically
erased from the real root. This means that
ls /DOS won't show "linux". This avoids the
infinite recursion /DOS/linux/DOS/linux while
walking the file system.
[umsdos/rdir.c,127]
#Specification: pseudo root / rmdir /DOS
The pseudo sub-directory /DOS can't be removed!
This is done even if the pseudo root is not a Umsdos
directory anymore (very unlikely), but an accident (under
MsDOS) is always possible.
EPERM is returned.
10 Dual mode
[umsdos/rdir.c,177]
#Specification: dual mode / introduction
One goal of UMSDOS is to allow a practical and simple coexistence
between MsDOS and Linux in a single partition. Using the EMD file
in each directory, UMSDOS adds Unix semantics and capabilities to
a normal DOS file system. To help and simplify coexistence, here is
the logic related to the EMD file.
If it is missing, then the directory is managed by the MsDOS driver.
The names are limited to DOS limits (8.3). No links, no device special
files, no pipes and so on.
If it is there, it is the directory. If it is there but empty, then
the directory looks empty. The utility umssync allows synchronisation
of the real DOS directory and the EMD.
Whenever umssync is applied to a directory without an EMD, one is
created on the fly. The directory is promoted to full unix semantics.
Of course, the ls command will show exactly the same content as before
the umssync session.
It is believed that the user/admin will promote directories to unix
semantics as needed.
The strategy to implement this is to use two function tables (struct
inode_operations): one for true UMSDOS directories and one for
directories with a missing EMD.
Functions related to the DOS semantics (but aware of UMSDOS) generally
have an "r" prefix (r for real) such as UMSDOS_rlookup, to differentiate
them from the ones with full UMSDOS semantics.
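The two function tables described above can be pictured with the toy sketch below. It does not use the kernel's struct inode_operations; struct dir_ops, the stub functions and select_dir_ops are hypothetical names chosen only to show how a directory picks its table depending on the presence of an EMD.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for struct inode_operations (one entry point only). */
struct dir_ops {
    const char *label;                 /* which semantics this table gives */
    int (*lookup)(const char *name);   /* stub entry point */
};

/* Stub with full UMSDOS semantics: would search the EMD first. */
static int sketch_lookup(const char *name)  { (void)name; return 0; }
/* Stub with DOS semantics ("r" for real): would search the DOS directory. */
static int sketch_rlookup(const char *name) { (void)name; return 0; }

static const struct dir_ops umsdos_dir_ops  = { "umsdos", sketch_lookup };
static const struct dir_ops umsdos_rdir_ops = { "dos",    sketch_rlookup };

/* Pick the table for a directory depending on whether it has an EMD. */
const struct dir_ops *select_dir_ops(int has_emd)
{
    const struct dir_ops *ops = has_emd ? &umsdos_dir_ops : &umsdos_rdir_ops;

    assert(ops->lookup != NULL);
    return ops;
}
```

Promoting a directory then amounts to switching the table it points to, without touching the callers.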
[umsdos/rdir.c,111]
#Specification: dual mode / rmdir in a DOS directory
In a DOS directory (no EMD in it), we use a reverse strategy
compared with a Umsdos directory. We assume that a subdirectory
of a DOS directory is also a DOS directory. This is not always
true (umssync may be used anywhere), but it makes sense.
So we call msdos_rmdir() directly. If it fails with -ENOTEMPTY
then we check if it is a Umsdos directory. We check if it is
really empty (only . .. and --linux-.--- in it). If so,
we remove the EMD and do a msdos_rmdir() again.
In a Umsdos directory, we assume all subdirectories are also
Umsdos directories, so we check the EMD file first.
[umsdos/namei.c,693]
#Specification: mkdir / umsdos directory / create EMD
When we create a new sub-directory in a UMSDOS
directory (one with full UMSDOS semantics), we
immediately create an EMD file in the new
sub-directory so it inherits UMSDOS semantics.
11 Miscellaneous
11.1 UMSDOS_create
[umsdos/namei.c,55]
#Specification: file creation / not atomic
File creation is a two-step process. First we create (allocate)
an entry in the EMD file and then (using the entry offset) we
build a unique name for MSDOS. We create this name in the msdos
space.
We have to use a semaphore (sleep_on/wake_up) to prevent lookups
in a directory while we create a file or directory and to
prevent creation while a lookup is going on. Since many lookups
may happen at the same time, the semaphore is a counter.
Only one creation is allowed at a time. This protection
may not be necessary. The problem arises mainly when a lookup
or a readdir is done while a file is partially created. The
lookup process sees that as a "normal" problem and silently
erases the file from the EMD file. Normal because a file
may be erased during an MSDOS session, but not removed from
the EMD file.
The locking is done on a directory per directory basis. Each
directory inode has its own wait_queue.
For some operations like hard link, things get even worse. Many
creations must occur at once (atomically). To simplify the design,
a process is allowed to recursively lock the directory for
creation. The pid of the locking process is kept along with
a counter so a second level of locking is granted or not.
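The locking rules above (a counter for lookups, a single creator, recursive locking by pid) can be modelled with the small sketch below. It is a user space toy, not the kernel code: instead of sleeping with sleep_on, the functions return 0 when the caller would have to wait, and all names (struct dir_lock, try_lock_creation, ...) are hypothetical.

```c
#include <assert.h>

/* Toy per-directory lock state. */
struct dir_lock {
    int creating;   /* nesting depth of the creation lock, 0 = free */
    int owner_pid;  /* pid holding the creation lock, valid if creating > 0 */
    int lookups;    /* number of lookups in progress */
};

/* Returns 1 if the creation lock is granted, 0 if the caller must wait. */
int try_lock_creation(struct dir_lock *l, int pid)
{
    if (l->lookups > 0)
        return 0;                       /* no creation during a lookup */
    if (l->creating > 0 && l->owner_pid != pid)
        return 0;                       /* another process is creating */
    l->owner_pid = pid;
    l->creating++;                      /* same pid: one more level */
    return 1;
}

void unlock_creation(struct dir_lock *l)
{
    assert(l->creating > 0);
    l->creating--;
}

/* Returns 1 if the lookup may start, 0 if the caller must wait. */
int try_start_lookup(struct dir_lock *l)
{
    if (l->creating > 0)
        return 0;                       /* no lookup during a creation */
    l->lookups++;
    return 1;
}

void end_lookup(struct dir_lock *l)
{
    assert(l->lookups > 0);
    l->lookups--;
}
```

A hard link operation would take the creation lock once at the top and let the nested creations take it again with the same pid.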
[umsdos/namei.c,177]
#Specification: create / . and ..
If one tries to create . or .., it always fails and returns
EEXIST.
If one tries to delete . or .., it always fails and returns
EPERM.
This should be tested at the VFS layer level to avoid
duplicating this in all file systems. Any comments ?
11.2 UMSDOS_ioctl_dir
[umsdos/ioctl.c,29]
#Specification: ioctl / access
Only root (effective id) is allowed to do an IOCTL on a directory
in UMSDOS. EPERM is returned for other users.
[umsdos/ioctl.c,37]
#Specification: ioctl / prototypes
The official prototype for the umsdos ioctl on a directory
is:
    int ioctl (
        int fd,     // File handle of the directory
        int cmd,    // command
        struct umsdos_ioctl *data)
The struct and the commands are defined in linux/umsdos_fs.h.
umsdos_progs/umsdosio.c provides an interface in C++ to all
these ioctls. umsdos_progs/udosctl is a small utility showing
all this.
These ioctls generally allow one to work on the EMD or the
DOS directory independently. They are essential to implement
the synchroniser.
[umsdos/ioctl.c,58]
#Specification: ioctl / UMSDOS_GETVERSION
The fields version and release of the structure
umsdos_ioctl are filled with the version and release
number of the fs code in the kernel. This will allow
some form of checking. Users won't be able to run
incompatible utilities such as the synchroniser (umssync).
umsdos_progs/umsdosio.c enforces this checking.
Always returns 0.
[umsdos/ioctl.c,72]
#Specification: ioctl / UMSDOS_READDIR_DOS
One entry is read from the DOS directory at the current
file position. The entry is put as is in the dos_dirent
field of struct umsdos_ioctl.
Return > 0 if success.
[umsdos/ioctl.c,193]
#Specification: ioctl / UMSDOS_RMDIR_DOS
The dos_dirent field of the struct umsdos_ioctl is used to
execute a msdos_rmdir operation. The d_name and d_reclen
fields are used.
Return 0 if success.
[umsdos/ioctl.c,204]
#Specification: ioctl / UMSDOS_STAT_DOS
The dos_dirent field of the struct umsdos_ioctl is
used to execute a stat operation in the DOS directory.
The d_name and d_reclen fields are used.
The following fields of umsdos_ioctl.stat are filled:
st_ino, st_mode, st_size, st_atime, st_mtime, st_ctime.
Return 0 if success.
[umsdos/ioctl.c,182]
#Specification: ioctl / UMSDOS_UNLINK_DOS
The dos_dirent field of the struct umsdos_ioctl is used to
execute a msdos_unlink operation. The d_name and d_reclen
fields are used.
Return 0 if success.
[umsdos/ioctl.c,147]
#Specification: ioctl / UMSDOS_CREAT_EMD
The umsdos_dirent field of the struct umsdos_ioctl is used
as is to create a new entry in the EMD of the directory.
The DOS directory is not modified.
No validation is done (yet).
Return 0 if success.
[umsdos/ioctl.c,81]
#Specification: ioctl / UMSDOS_READDIR_EMD
One entry is read from the EMD at the current
file position. The entry is put as is in the umsdos_dirent
field of struct umsdos_ioctl. The corresponding mangled
DOS entry name is put in the dos_dirent field.
All entries are read including hidden links. Blank
entries are skipped.
Return > 0 if success.
[umsdos/ioctl.c,164]
#Specification: ioctl / UMSDOS_UNLINK_EMD
The umsdos_dirent field of the struct umsdos_ioctl is used
as is to remove an entry from the EMD of the directory.
No validation is done (yet). The mode field is used
to validate S_ISDIR or S_ISREG.
Return 0 if success.
[umsdos/ioctl.c,228]
#Specification: ioctl / UMSDOS_DOS_SETUP
The UMSDOS_DOS_SETUP ioctl allows changing the
default permissions of the MsDOS file system driver
on the fly. The MsDOS driver applies global permissions
to every file and directory. Normally these permissions
are controlled by a mount option. This is not
available for the root partition, so a special utility
(umssetup) is provided to do this, normally in
/etc/rc.local.
Be aware that this applies ONLY to MsDOS directories
(those without EMD --linux-.---). Umsdos directories
have independent (standard) permissions for each
and every file.
The field umsdos_dirent provides the information needed.
umsdos_dirent.uid and gid set the owner and group.
umsdos_dirent.mode sets the permission flags.
[umsdos/ioctl.c,124]
#Specification: ioctl / UMSDOS_INIT_EMD
The UMSDOS_INIT_EMD command makes sure the EMD
exists for a directory. If it does not, it is
created. Also, it makes sure the directory function
table (struct inode_operations) is set to the UMSDOS
semantics. This means that umssync may be applied to
an "opened" msdos directory, and it will change behavior
on the fly.
Return 0 if success.
11.3 UMSDOS_lookup
[umsdos/dir.c,520]
#Specification: locating .. / strategy
We use the msdos filesystem to locate the parent directory.
But it is more complicated than that.
We have to step back even further to
get the parent of the parent, so we can get the EMD
of the parent of the parent. Using the EMD file, we can
locate all the info on the parent, such as permissions
and owner.
[umsdos/dir.c,556]
#Specification: umsdos / lookup
A lookup for a file is done in two steps. First, we locate
the file in the EMD file. If it is not present, we return
an error code (-ENOENT). If it is there, we repeat the
operation on the msdos file system. If this fails, it means
that the file system is not in sync with the EMD file.
We silently remove this entry from the EMD file,
and return -ENOENT.
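The two-step lookup above can be sketched with a toy model in which the EMD and the DOS directory are plain tables of names. This is a simplification written for this document: the real lookup searches the DOS directory with the mangled name, while the sketch uses the same name on both sides, and struct name_table and two_step_lookup are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 8

/* Toy stand-in for a directory: a table of names, NULL = empty slot. */
struct name_table {
    const char *names[MAX_ENTRIES];
};

static int table_find(const struct name_table *t, const char *name)
{
    int i;

    for (i = 0; i < MAX_ENTRIES; i++)
        if (t->names[i] != NULL && strcmp(t->names[i], name) == 0)
            return i;
    return -1;
}

/* Returns 0 if the file exists on both sides, -ENOENT otherwise. */
int two_step_lookup(struct name_table *emd, const struct name_table *dos,
                    const char *name)
{
    int slot;

    assert(name != NULL);
    slot = table_find(emd, name);          /* step 1: the EMD */
    if (slot < 0)
        return -ENOENT;
    if (table_find(dos, name) < 0) {       /* step 2: the DOS directory */
        emd->names[slot] = NULL;           /* out of sync: drop the entry */
        return -ENOENT;
    }
    return 0;
}
```

Note that a failed second step has a side effect: the stale EMD entry is gone afterwards, so the next lookup fails at step 1.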
[umsdos/dir.c,280]
#Specification: umsdos / lookup / inode info
After successfully reading an inode from the MSDOS
filesystem, we use the EMD file to complete it.
We update the following fields:
uid, gid, atime, ctime, mtime, mode.
We rely on MSDOS for mtime. If the file
was modified during an MSDOS session, at least
mtime will be meaningful. We do this only for regular
files.
We don't rely on MSDOS for the mtime of directories because
the MSDOS directory date is the creation time (strange
MSDOS behavior), which fits nowhere in the three UNIX
time stamps.
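The rule above can be summarised by the following sketch. It uses a toy attribute structure instead of the kernel's struct inode; struct attrs and complete_inode are hypothetical names introduced for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy subset of the attributes discussed above. */
struct attrs {
    unsigned uid, gid, mode;
    long atime, ctime, mtime;
};

/*
 * Hypothetical helper: complete an inode read from MSDOS with the
 * attributes recorded in the EMD. For a regular file the mtime coming
 * from MSDOS is kept; for anything else the EMD value is used.
 */
void complete_inode(struct attrs *inode, const struct attrs *emd,
                    int is_regular)
{
    long dos_mtime;

    assert(inode != NULL && emd != NULL);
    dos_mtime = inode->mtime;      /* remember what MSDOS reported */
    *inode = *emd;                 /* uid, gid, mode and the three times */
    if (is_regular)
        inode->mtime = dos_mtime;  /* trust MSDOS for regular files only */
}
```

This keeps a modification made during an MSDOS session visible on a regular file, while a directory keeps the time recorded in the EMD.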
[umsdos/dir.c,305]
#Specification: umsdos / i_nlink
The nlink field of an inode is maintained by the MSDOS file system
for directories and by UMSDOS for other files. The logic is that
MSDOS is already figuring out what to do for directories and
does nothing for other files. For MSDOS, there are no hard links
so all files carry nlink==1. UMSDOS uses some info in the
EMD file to plug in the correct value.
11.4 UMSDOS_notify_change
[umsdos/inode.c,331]
#Specification: notify_change / msdos fs
notify_change operations are done only on the
EMD file. The msdos fs is not even called.
[umsdos/inode.c,275]
#Specification: root inode / attributes
I don't know yet how this should work. Normally
the attributes (permission bits, owner, times) of
a directory are stored in the EMD file of its parent.
One thing we could do is store the attributes of the root
inode in its own EMD file. A simple entry named "." could
be used for this special case. It would be read once
when the file system is mounted and updated in
UMSDOS_notify_change() (right here).
I am not sure of the behavior of the root inode for
a real UNIX file system. For now, this is a nop.
[umsdos/inode.c,268]
#Specification: notify_change / i_nlink > 0
notify change is only done for inode with nlink > 0. An inode
with nlink == 0 is no longer associated with any entry in
the EMD file, so there is nothing to update.
11.5 UMSDOS_readdir
[umsdos/dir.c,128]
#Specification: umsdos / readdir
umsdos_readdir() should fill a struct dirent with
an inode number. The cheap way to get it is to
do a lookup in the MSDOS directory for each
entry processed by the readdir() function.
This is not very efficient, but very simple. The
other way is to maintain a copy of the inode
number in the EMD file. This is a problem because
it has to be kept in sync using tricks.
Remember that MSDOS (the OS) does not update the
modification time (mtime) of a directory. There is
no easy way to tell that a directory was modified
during a DOS session and to synchronise the EMD file.
Suggestions welcome.
So the easy way is used!
[umsdos/dir.c,79]
#Specification: readdir / . and ..
The msdos filesystem manages the . and .. entries properly
so the EMD file won't hold any info about them.
In readdir, we assume that for the root directory
the read position will be 0 for "." and 1 for "..". For
a non root directory, the read position will be 0 for "."
and 32 for "..".
[umsdos/dir.c,200]
#Specification: umsdos / readdir / not in MSDOS
During a readdir operation, if the file is not
in the MSDOS directory anymore, the entry is
removed from the EMD file silently.
11.6 mount and UMSDOS_remount_fs
[umsdos/inode.c,380]
#Specification: mount / options
Umsdos runs on top of msdos. Currently, it supports no
mount options of its own, but happily passes all options
received to the msdos driver. I am not sure if all msdos mount
options make sense with Umsdos. Here are at least those which
are useful.
    uid=
    gid=
These options affect the operation of umsdos in directories
which do not have an EMD file. They behave like normal
msdos directories, with all the limitations of msdos.
11.7 UMSDOS_rename
[umsdos/namei.c,326]
#Specification: rename / new name exist
If the destination name already exists, it will
silently be removed. EXT2 does it this way
and this is the spec of SUNOS. So does UMSDOS.
If the destination is an empty directory it will
also be removed.
11.8 Data structure
[umsdos/inode.c,174]
#Specification: inode / umsdos info
The first time an inode is seen (inode->i_count == 1),
the inode number of the EMD file which controls this inode
is tagged to this inode. This allows operations such
as notify_change to be handled.
11.9 Inode management
[umsdos/inode.c,239]
#Specification: Inode / post initialisation
To completely initialise an inode, we need access to the owner
directory, so we can locate more info in the EMD file. This is
not available the first time the inode is accessed, so we use
a value in the inode to tell if it has been finally initialised.
At first, we tried testing i_count but it was causing
problems. It is possible that two or more processes use the
newly accessed inode. While the first one blocks during
the initialisation (probably while reading the EMD file), the
others believe all is well because i_count > 1. They go bananas
with a broken inode. See umsdos_lookup_patch and umsdos_patch_inode.
12 Synchronisation problems
[umsdos/namei.c,238]
#Specification: create / file exist in DOS
Here is a situation. We are trying to create a file with
UMSDOS. The file is unknown to UMSDOS but already
exists in the DOS directory.
Here is what we are NOT doing:
We could silently assume that everything is fine
and allow the creation to succeed.
It is possible that not all files in the partition
are meant to be visible from linux. By trying to create
those files in some directory, a user may get access
to those files without proper permissions. Looks like
a security hole to me. Of course, sharing a file system
with DOS is some kind of security hole :-)
So ?
We return EEXIST in this case.
The same is true for directory creation.
[umsdos/namei.c,688]
#Specification: mkdir / Directory already exist in DOS
We do the same thing as for file creation.
For all users it is an error.
13 Convention and style
[umsdos/inode.c,352]
#Specification: function name / convention
A simple convention for function names has been used in
the UMSDOS file system. First, all functions use the prefix
umsdos_ to avoid name clashes with other parts of the kernel.
Standard VFS entry points use the prefix UMSDOS (upper case)
so it's easier to tell them apart.
[umsdos/namei.c,760]
#Specification: style / iput strategy
In the UMSDOS project, I am trying to apply a single
programming style regarding inode management. Many
entry points receive an inode to act on, and must
do an iput() as soon as they are finished with
the inode.
For simple cases, there is no problem. When you introduce
error checking, you end up with many iput() calls placed around the
code.
The coding style I use all around is one where I am trying
to provide independent flow logic (I don't know how to
name this). With this style, code is easier to understand
but you rapidly get iput() all around. Here is an example
of what I am trying to avoid.
    if (a){
        ...
        if (b){
            ...
        }
        ...
        if (c){
            // Complex state. Was b true ?
            ...
        }
        ...
    }
    // Weird state
    if (d){
        // ...
    }
    // Was iput finally done ?
    return status;
Here is the style I am using. Still, sometimes I do the
first when things are very simple (or very complicated :-( )
    if (a){
        if (b){
            ...
        }else if (c){
            // A single state gets here
        }
    }else if (d){
        ...
    }
    return status;
Again, while this helps clarify the code, I often get a lot
of iput(), unlike the first style, where I can place a few
"strategic" iput(). "Strategic" also means more difficult
to place.
So here is the style I will be using from now on in this project.
There is always an iput() at the end of a function (which has
to do an iput()). One iput() per inode. There is also one iput()
at the places where a successful operation is achieved. This
iput() is often done by a sub-function (often from the msdos
file system). So I get one too many iput()? At the place
where an iput() is done, the inode is simply nulled, disabling
the last one.
    if (a){
        if (b){
            ...
        }else if (c){
            msdos_rmdir(dir,...);
            dir = NULL;
        }
    }else if (d){
        ...
    }
    iput (dir);
    return status;
Note that the umsdos_lockcreate() and umsdos_unlockcreate() function
pair goes against this practice of "forgetting" the inode as soon
as possible.
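The "one final iput(), nulled when a sub-function already released the
inode" style can be tried out in user space with a reference counter.
This is a sketch only: demo_iput, demo_sub_rmdir and demo_rmdir are
hypothetical stand-ins, and the sketch assumes, as the text above
implies, that an iput() on a NULL inode does nothing.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel inode. */
struct demo_inode { int i_count; };

/* Stand-in for iput(): a NULL inode is silently ignored. */
void demo_iput(struct demo_inode *ino)
{
    if (ino != NULL) {
        assert(ino->i_count > 0);   /* catches a double release */
        ino->i_count--;
    }
}

/* Stand-in for a sub-function which releases dir itself on return. */
int demo_sub_rmdir(struct demo_inode *dir)
{
    demo_iput(dir);
    return 0;
}

/* Entry point: exactly one release of dir on every path. */
int demo_rmdir(struct demo_inode *dir, int allowed)
{
    int status = -EPERM;
    if (allowed) {
        status = demo_sub_rmdir(dir);
        dir = NULL;     /* already released: disable the final iput */
    }
    demo_iput(dir);
    return status;
}
```

Both the success path and the error path drop the count from 1 to 0, and
the assert in demo_iput() would fire if the nulling were forgotten.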
[umsdos/inode.c,24]
#Specification: convention / PRINTK Printk and printk
Here is the convention for the use of printk inside fs/umsdos:
printk carries important messages (error or status).
Printk is for debugging (it is a macro defined at the beginning of
most source files).
PRINTK is a nulled Printk macro.
This convention makes the source easier to read, and Printk easier
to shut off.
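A possible shape for these macros is sketched below, in user space. The
double-parenthesis calling form is an assumption (a common idiom for
macros taking a variable argument list), and demo_printk and
demo_messages are hypothetical names standing in for the kernel printk().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

int demo_messages = 0;      /* counts the messages actually emitted */

/* User-space stand-in for the kernel printk(). */
int demo_printk(const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(fmt != NULL);
    va_start(ap, fmt);
    n = vprintf(fmt, ap);
    va_end(ap);
    demo_messages++;
    return n;
}

#define Printk(x) demo_printk x   /* debugging, currently active */
#define PRINTK(x)                 /* debugging, nulled: expands to nothing */

void demo_trace(int ino)
{
    demo_printk("umsdos: important message\n");   /* always kept */
    Printk(("debug: inode %d\n", ino));           /* easy to shut off */
    PRINTK(("disabled: inode %d\n", ino));        /* compiled out */
}
```

Shutting off one debugging message is a matter of changing the case of
its macro name, and shutting off all of them is a matter of giving
Printk an empty definition.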
14 Weakness and features
The UMSDOS file system is somewhat of a compromise. Here are
the drawbacks.
-Space efficiency. The minimal allocation unit
 is generally 2k. Also, the MSDOS FAT fs does not support
 sparse files (files with gaps of unallocated blocks).
-General performance. UMSDOS runs piggyback on top of another FS.
 This means two directory structures to maintain.
-Maximum number of files is limited to 64k.
It should be a good fs
for many purposes, especially if you have to coexist with
DOS. UMSDOS is trying to emulate the UNIX semantics for file
systems. Here are the known differences and weaknesses.
[umsdos/namei.c,512]
#Specification: weakness / hard link
The strategy for hard links introduces a side effect that
may or may not be acceptable. Here is the sequence:
    mkdir subdir1
    touch subdir1/file
    mkdir subdir2
    ln subdir1/file subdir2/file
    rm subdir1/file
    rmdir subdir1
    rmdir: subdir1: Directory not empty
This happens because there is an invisible file (--link) in
subdir1 which is referenced by subdir2/file.
Any idea?
[umsdos/namei.c,529]
#Specification: weakness / hard link / rename directory
Another weakness of hard links comes from the fact that
they are based on hidden symbolic links. Here is an example.
mkdir /subdir1
touch /subdir1/file
mkdir /subdir2
ln /subdir1/file subdir2/file
mv /subdir1 subdir3
ls -l /subdir2/file
Since /subdir2/file is a hidden symbolic link
to /subdir1/..hlinkNNN, accessing it will fail since
/subdir1 does not exist anymore (it has been renamed).
[umsdos/namei.c,973]
#Specification: weakness / rename
There is a case where UMSDOS rename behaves differently
from a normal UNIX file system. Renaming an open file across
a directory boundary does not work. Renaming an open file within
a directory does work, however.
The problem (not sure) is in the Linux VFS msdos driver.
I believe this is not a bug but a design feature, because
an inode number represents some sort of directory address
in the MSDOS directory structure. So moving the file into
another directory does not preserve the inode number.
15 utilities
Very little has been done here. Not much is missing though.
Look in the directory umsdos_progs.
15.1 The UMSDOS synchroniser
[umsdos_progs/linux/umssync.c,1]
#Specification: utility / synchroniser
The UMSDOS synchroniser (umssync) makes sure that the EMD file
is in sync with the MSDOS directory. Files created during a DOS
session should be added to the EMD. Files removed should be erased
from the EMD.
The UMSDOS file system will operate normally even if the
system is out of sync. However, files will be missing from
directory searches, creating an annoying feeling.
There is no easy way this kind of update may be achieved by
UMSDOS transparently. Here are the reasons:
-This process takes some time for each directory. If there were some
 access time in MSDOS for directories, then, based on boot time, it would
 be possible to do it once per directory. This is not the case.
-When a file is discovered in MSDOS which does not exist in the EMD, we
 need some directives to properly map the file. At least the owner must
 be known.
A set of ioctls is available (wrapper interface in
umsdos_progs/umsdosio.c) to allow independent manipulation of the EMD
and the DOS directory.
A utility is provided. It should be run from /etc/rc.
A man page (umssync.8) describes its options.
[umsdos_progs/linux/umssync.c,417]
#Specification: umssync / default creation mode
Unless overridden with a command line option, files
and directories created by umssync will be owned
by root with mode 755 for directories and mode
644 for files.
[umsdos_progs/linux/umssync.c,427]
#Specification: umssync / depth
Normally, umssync won't recurse into directories.
Option -r allows for depth control. You may specify
how deep you want umssync to work.
When recursing into directories, umssync will use
the owner and group specified on the command line
(see options -g and -u). If option -i+ is specified,
the specs of the sub-directory itself may be used.
umssync won't follow symlinks, and it won't cross mount
points.
[umsdos_progs/linux/umssync.c,84]
#Specification: umssync / mangled name
If a DOS file is missing from the EMD, it is added. If
the file has an extension whose first character
is a member of the restricted set for mangling,
the operation won't be done. A message will be printed.
To synchronise it back into the EMD, the file must be renamed.
If one tries to create such a file with UMSDOS, it is
automatically mangled, producing a different file name in DOS.
This is always done to avoid the following problem:
    Unix command      MsDOS file name created
    ============      =======================
    mkdir DIR         dir.{__
    mkdir dir.{__     dir.{_1
    ...
    mkdir dir.{_1     dir.{10
Now, suppose that dir.{__ does not exist in the EMD. dir.{_1
does exist in DOS. If we try to create it in UMSDOS, this
will create a mangled name. Mangling is based on the
entry offset in the EMD.
So if, say, dir.{_1 exists in DOS but not in UMSDOS, rename
it to anything (dir.111) and synchronise it into the EMD
with umssync.
[umsdos_progs/linux/umssync.c,359]
#Specification: umssync / mount point
umssync won't cross mount points. This means
you must specify each mount point separately.
[umsdos_progs/linux/umssync.c,536]
#Specification: umssync / user mode
To execute umssync, the effective user id must be root.
It is possible to configure umssync to run setuid root.
In this case (when getuid() != geteuid()), umssync
shows a special behavior: options -d -f -g -i -u are not
available anymore. The inheriting mode is automatically
activated, with no way to deactivate it.
A user should be able to umssync their own directory. If a user
applies umssync to a directory, all files uncovered will be given
to the owner of the directory with restrictive permissions
(600 for files, 700 for directories).
Another way would be to limit umssync operation to directories
which belong to the user. Suggestions welcome.
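The setuid rules can be sketched as three small helpers. The function
names are hypothetical and this is not the umssync source. The modes for
the ordinary (root) case are the defaults given in the "default creation
mode" specification: 755 for directories and 644 for files.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* True when running setuid: real and effective user ids differ. */
bool demo_is_setuid_run(void)
{
    return getuid() != geteuid();
}

/* Options -d -f -g -i -u are refused in a setuid run. */
bool demo_option_allowed(char opt, bool setuid_run)
{
    assert(opt != '\0');    /* strchr() would match the terminator */
    return !(setuid_run && strchr("dfgiu", opt) != NULL);
}

/* Permissions given to an entry uncovered by the synchroniser. */
mode_t demo_entry_mode(bool setuid_run, bool is_dir)
{
    if (setuid_run)
        return is_dir ? 0700 : 0600;    /* restrictive, for the owner */
    return is_dir ? 0755 : 0644;        /* documented defaults */
}
```

A caller would evaluate demo_is_setuid_run() once at start-up and pass
the result to the other two helpers.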
15.2 Other
[umsdos_progs/linux/udump.c,4]
#Specification: utilities / udump
udump displays the content of a --linux-.--- file (EMD file).
Simply type:
udump file
This utility was mainly used to debug the UMSDOS file system.
[umsdos_progs/linux/udosctl.c,12]
#Specification: umsdos_progs / udosctl
The udosctl utility gives direct access to the UMSDOS ioctls on a directory.
udosctl command arg
Here are the commands:
ls:
List the content of the DOS directory arg. Bypass the EMD file.
It uses UMSDOS_READDIR_DOS.
create:
Create the file arg in the EMD file. Do nothing on the DOS
directory. Use UMSDOS_CREAT_UMSDOS.
mkdir:
Create the directory arg in the EMD file. Do nothing on the DOS
directory. Use UMSDOS_CREAT_UMSDOS.
rm:
Remove the file arg in the DOS directory. Bypass the EMD file.
Use UMSDOS_UNLINK_DOS.
rmdir:
Remove the directory arg in the DOS directory. Bypass the EMD file.
Use UMSDOS_RMDIR_DOS.
uls:
List the content of the EMD and print the corresponding DOS
mangled name. It uses UMSDOS_READDIR_EMD.
urm:
Remove the file arg from the EMD file. Don't touch the DOS
directory. Use UMSDOS_UNLINK_UMSDOS.
urmdir:
Remove the directory arg from the EMD file. Don't touch the DOS
directory. Use UMSDOS_UNLINK_UMSDOS.
version:
Prints the version of the UMSDOS driver running.
This program was done mostly for illustration of ioctl use
and testing.
16 Test cases
The umsdos_progs/tests directory holds two utilities.
utstgen is a general test suite, "UMSDOS independent". It
tests the general behavior of UMSDOS as a true UNIX-like
file system. utstspc is truly UMSDOS oriented. It (will)
test the proper behavior of UMSDOS, especially when the EMD
and the MSDOS directory are out of sync. Currently utstspc
is not testing much!
16.1 utstgen
[umsdos_progs/tests/utstgen.c,7]
#Specification: umsdos / automated test / general
utstgen.c is a sequence of tests for the UMSDOS file system.
These tests are not really specific to the UMSDOS file system.
You will still find extensive testing of some features which are specific
to the UMSDOS file system. There is a long section on hard links,
which could hardly fail on a normal UNIX file system and were
a nightmare to implement in UMSDOS.
[umsdos_progs/tests/gen/hlink.c,108]
#Specification: utstgen / hard link / cases / across directory boundary
The target of the link is not in the same directory
as the new link.
[umsdos_progs/tests/gen/hlink.c,121]
#Specification: utstgen / hard link / cases / target does not exist
Many hard links are attempted to a file which
does not exist.
[umsdos_progs/tests/gen/hlink.c,127]
#Specification: utstgen / hard link / to a directory
A hard link can't be made to a directory.
[umsdos_progs/tests/gen/hlink.c,50]
#Specification: utstgen / hard links / case / link 2 link 2 link ...
hlink_simple tests a link made to a link made to a link
and so on. On a normal UNIX file system, this test is not
really an issue. Given the fact that a hard link on UMSDOS is
a symlink to a hidden file, it makes sense to test at least
these two cases:
-hard link to an existing file with no link
-hard link to an existing file with more than one link
[umsdos_progs/tests/gen/multi.c,67]
#Specification: utstgen / multi task / basic test
A simple test is performed on a directory by many tasks.
Only one task must succeed at a time. The others must fail
with a specific error code.
So we fork 10 times.
This test hopes it is a sufficient test :-(
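A test of this kind might look like the sketch below. The text does not
say which operation is performed or which error code is expected, so the
choice of mkdir() and EEXIST here is an assumption, and demo_race_mkdir
is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork ntask children which all try to create the same directory.
 * Returns the number of children that succeeded, or -1 if fork()
 * failed or a child saw an error other than EEXIST.
 */
int demo_race_mkdir(const char *path, int ntask)
{
    int i, status, winners = 0, bad = 0;

    assert(path != NULL && ntask > 0);
    for (i = 0; i < ntask; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            bad = 1;
            break;
        }
        if (pid == 0) {
            /* Child: report the outcome through the exit status. */
            if (mkdir(path, 0755) == 0)
                _exit(0);
            _exit(errno == EEXIST ? 1 : 2);
        }
    }
    /* Parent: collect every child forked so far. */
    while (wait(&status) > 0) {
        if (!WIFEXITED(status))
            bad = 1;
        else if (WEXITSTATUS(status) == 0)
            winners++;
        else if (WEXITSTATUS(status) != 1)
            bad = 1;
    }
    return bad ? -1 : winners;
}
```

Called with a path that does not exist yet and ntask set to 10, the
function is expected to return 1 on a correct file system: one winner
and nine EEXIST failures.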
[umsdos_progs/tests/gen/file.c,176]
#Specification: utstgen / Rename test
Rename tests are done with files and directories.
The following cases are tested:
-In the same directory
-In the root directory
-In two independent directories
-In a subdirectory and the parent
The same test is also done on an open file.
[umsdos_progs/tests/gen/file.c,199]
#Specification: utstgen / Rename test / open file
Rename test is done on an open file. We do the following
sequence.
create a file
Open it
Rename it
Write to the open handle
Close it
Open the file using the new name
Read back the data and check it.
This test does not succeed if the file is renamed across
directories. This sounds like a limitation of the Linux msdos
driver. I am not sure at this point. I hope it is not
a critical feature of a Unix file system. Comments are welcome
about this topic. Comments with solutions also :-)
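The sequence above can be written with plain POSIX calls, as in the
following sketch. It is not taken from utstgen: demo_rename_open is a
hypothetical name, and the creation and the open are folded into a single
open() with O_CREAT.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Returns 0 if data written through a handle opened before the rename
 * can be read back under the new name, -1 otherwise.
 */
int demo_rename_open(const char *oldname, const char *newname)
{
    static const char msg[] = "hello";
    char buf[sizeof msg];
    ssize_t n;
    int fd, ok;

    assert(oldname != NULL && newname != NULL);
    fd = open(oldname, O_CREAT | O_TRUNC | O_WRONLY, 0644);
    if (fd < 0)
        return -1;
    if (rename(oldname, newname) != 0) {        /* rename while open */
        close(fd);
        unlink(oldname);
        return -1;
    }
    ok = write(fd, msg, sizeof msg) == (ssize_t)sizeof msg;
    if (close(fd) != 0)
        ok = 0;
    fd = open(newname, O_RDONLY);               /* reopen by new name */
    if (fd < 0) {
        unlink(newname);
        return -1;
    }
    n = read(fd, buf, sizeof buf);
    close(fd);
    unlink(newname);
    if (!ok || n != (ssize_t)sizeof msg || memcmp(buf, msg, sizeof msg) != 0)
        return -1;
    return 0;
}
```

Passing two names in the same directory exercises the case which works on
UMSDOS. Passing names in two different directories exercises the case
described above as failing.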
[umsdos_progs/tests/gen/rename.c,104]
#Specification: utstgen / rename / destination exist
The following rename tests are done. The source is
always file1 and the destination file2 always exists.
file2 is a file or a directory. Here are the different
cases.
file1 is a file, file2 is a file.
file1 is a file, file2 is a hard link to a file.
file1 is a file, file2 exists and is an empty directory.
file1 is a file, file2 exists and is a non empty directory.
file1 is a directory, file2 is a file.
file1 is a directory, file2 is a hard link to a file.
file1 is a directory, file2 exists and is an empty directory.
file1 is a directory, file2 exists and is a non empty directory.
This sequence is performed in the same directory and across directories.
The following combinations are tested.
path/dir1 -> path/dir1
path/dir1 -> path/dir2
path/dir1 -> path
path -> path/dir1
[umsdos_progs/tests/gen/syml.c,49]
#Specification: utstgen / symbolic links / link 2 link 2 link ...
syml_simple tests the number of chained symlinks the
kernel can handle (a symlink pointing to another pointing
to another ... and finally pointing to something).
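A chain test of this kind might be sketched as follows. This is not the
syml_simple source: demo_symlink_chain and the chain_N file names are
hypothetical, and the sketch only checks that a chain of a given depth
still resolves, leaving the search for the kernel limit to the caller.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Build chain_0 -> chain_1 -> ... -> chain_target inside dir and check
 * that the target is reachable through chain_0.
 * Returns 0 on success, -1 otherwise. Everything created is removed.
 */
int demo_symlink_chain(const char *dir, int depth)
{
    char prev[512], cur[512];
    struct stat st;
    int i, fd, result = 0;

    /* Link targets are stored as given, so dir must be absolute. */
    assert(dir != NULL && dir[0] == '/' && depth > 0);
    snprintf(prev, sizeof prev, "%s/chain_target", dir);
    fd = open(prev, O_CREAT | O_WRONLY, 0644);
    if (fd < 0)
        return -1;
    close(fd);

    /* The last link points to the target, each earlier one to the next. */
    for (i = depth - 1; i >= 0 && result == 0; i--) {
        snprintf(cur, sizeof cur, "%s/chain_%d", dir, i);
        unlink(cur);                    /* leftovers from earlier runs */
        if (symlink(prev, cur) != 0)
            result = -1;
        strcpy(prev, cur);
    }
    if (result == 0 && stat(prev, &st) != 0)    /* prev is chain_0 */
        result = -1;

    for (i = 0; i < depth; i++) {
        snprintf(cur, sizeof cur, "%s/chain_%d", dir, i);
        unlink(cur);
    }
    snprintf(cur, sizeof cur, "%s/chain_target", dir);
    unlink(cur);
    return result;
}
```

Calling the function with growing depths until it fails gives the number
of chained symlinks the kernel accepts on the file system under test.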
[umsdos_progs/tests/gen/dir.c,141]
#Specification: utstgen / creating . and ..
A check is done that the special entries . and .. can't be
created nor removed.
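Such a check might look like the sketch below, using mkdir() and rmdir()
on the two special entries of a given directory. The function name is
hypothetical and the error codes are deliberately not examined, since the
text only states that the operations must fail.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Returns 1 when . and .. inside dir can neither be created nor
 * removed, 0 otherwise (also 0 if dir is too long for this sketch).
 */
int demo_dot_entries_protected(const char *dir)
{
    static const char *const names[] = { ".", ".." };
    char path[1024];
    int i, n;

    assert(dir != NULL);
    for (i = 0; i < 2; i++) {
        n = snprintf(path, sizeof path, "%s/%s", dir, names[i]);
        if (n < 0 || (size_t)n >= sizeof path)
            return 0;
        if (mkdir(path, 0755) == 0)
            return 0;       /* creation must be refused */
        if (rmdir(path) == 0)
            return 0;       /* removal must be refused */
    }
    return 1;
}
```

The directory passed in should be a non empty one, so that an incorrect
rmdir() has little chance to do any damage.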
16.2 utstspc
[umsdos_progs/tests/utstspc.c,22]
#Specification: umsdos / automated test / specific
utstspc.c is a sequence of tests for the UMSDOS file system.
These tests are specific to the UMSDOS file system.
[umsdos_progs/tests/utstspc.c,33]
#Specification: utstspc / default environnement
utstspc needs to start from a fresh partition (it reformats it).
So we normally use it on a floppy. utstspc mounts and unmounts
that floppy. The default mount point is /mnt and the default
drive (for mformat) is F:.
This value F: looks very odd. It comes from my own setup.
Here is my definition for /etc/mtools:
    A /dev/fd0      12 0  0 0
    B /dev/fd1      12 0  0 0
    E /dev/fd0h1200 12 80 2 15  # A: 5 1/4
    F /dev/fd1H1440 12 80 2 18  # B: 3 1/2
Using /dev/fd0 and /dev/fd1 for A and B gives me flexibility
for normal operation. I can read any type of floppy without much
question. However, I can't do a mformat A: or B:. I always get an error.
I guess /dev/fd0 and /dev/fd1 are flexible devices, so they do not
impose a format.
The entries E and F replicate A and B, but this time use the specific
devices. So I can do a "mformat f:" and expect to format a 1.44
3 1/2 floppy correctly.
Of course, if you know better, please tell me!
[umsdos_progs/tests/utstspc.c,111]
#Specification: utstspc / floppy only
To avoid disaster, utstspc will only work on a floppy.
A test is done before everything else to ensure that
the drive is indeed a floppy.
It uses /etc/mtools to locate the proper device. It also
assumes a floppy device has a path starting with
"/dev/fd". This is not foolproof!
[umsdos_progs/tests/utilspc.c,86]
#Specification: utstspc / what's needed
utstspc uses several other programs to perform its tests.
It has to reformat, mount, unmount, etc. the floppy
on which it is doing the test.
utstspc assumes that the following utilities are available:
/usr/bin/mformat
/etc/mount or /bin/mount
/etc/umount or /bin/umount
Also, it requires /etc/mtools
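Picking between the two possible locations of mount and umount could be
done along these lines. This is a sketch with a hypothetical function
name, not the utstspc code. It relies only on the POSIX access() call.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Return the first path of a NULL-terminated candidate list that the
 * caller is allowed to execute, or NULL when none qualifies.
 */
const char *demo_first_executable(const char *const *candidates)
{
    assert(candidates != NULL);
    for (; *candidates != NULL; candidates++)
        if (access(*candidates, X_OK) == 0)
            return *candidates;
    return NULL;
}
```

For mount, the candidate list would be "/etc/mount", "/bin/mount" and a
terminating NULL. A NULL result means the test cannot be run.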