- % $Id: overview.tex,v 5.2 90/06/23 22:21:50 jsp Rel $
- %
- % Copyright (c) 1989 Jan-Simon Pendry
- % Copyright (c) 1989 Imperial College of Science, Technology & Medicine
- % Copyright (c) 1989 The Regents of the University of California.
- % All rights reserved.
- %
- % This code is derived from software contributed to Berkeley by
- % Jan-Simon Pendry at Imperial College, London.
- %
- % Redistribution and use in source and binary forms are permitted provided
- % that: (1) source distributions retain this entire copyright notice and
- % comment, and (2) distributions including binaries display the following
- % acknowledgement: ``This product includes software developed by the
- % University of California, Berkeley and its contributors'' in the
- % documentation or other materials provided with the distribution and in
- % all advertising materials mentioning features or use of this software.
- % Neither the name of the University nor the names of its contributors may
- % be used to endorse or promote products derived from this software without
- % specific prior written permission.
- % THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
- % WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
- % MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
- %
- % %W% (Berkeley) %G%
-
-
- \Chapter{Overview}
- \pagenumbering{arabic}
-
- \Amd\ maintains a cache of mounted filesystems. Filesystems are {\em demand-mounted}
- when they are first referenced, and unmounted after a period of inactivity.
- \Amd\ may be used as a replacement for Sun's {\bf automount}(8)
- \cite{usenix:automounter,sun:automount} program.
- It contains no proprietary source code and has been ported
- to numerous flavours of \Unix\ (see table \ref{table:os},~p\pageref{table:os}).
-
- \Amd\ was designed as the basis for experimenting with filesystem
- layout and management. Although \amd\ has many direct applications, it
- is also loaded with additional features which have little practical use.
- At some point the infrequently used components may be removed to
- streamline the production system.
-
- %\Amd\ supports the notion of {\em replicated} filesystems by evaluating
- %each member of a list of possible filesystem locations in parallel.
- %\Amd\ checks that each cached mapping remains valid. Should a mapping be
- %lost -- such as happens when a fileserver goes down -- \amd\ automatically
- %selects a replacement should one be available.
-
- The fundamental concept behind \amd\ is the ability to separate the name used to refer to
- a file from the name used to refer to its physical storage location.
- This allows the same files to be accessed with the same name regardless of where
- in the network the name is used. This is very different from placing
- {\tt /n/hostname} in front of the pathname since that includes location
- dependent information which may change if files are moved to another
- machine.
- By placing the required mappings in a centrally administered database,
- filesystems can be re-organised without requiring changes to password
- files, shell scripts and so on.
-
- \Section{Filesystems and Volumes}
- \Amd\ views the world as a set of fileservers, each containing one or more filesystems
- where each filesystem contains one or more {\em volumes}.
- Here the term volume is used to refer to a coherent set of files such as a user's home directory or
- a \TeX\ distribution.
-
- In order to access the contents of a volume, \amd\ must be told in which filesystem
- the volume resides and which host owns the filesystem.
- By default the host is assumed to be local and the volume is
- assumed to be the entire filesystem.
- If a filesystem contains more than one volume, then a {\em sublink} is used to
- refer to the sub-directory within the filesystem where the volume can be found.
-
- \Section{Volume Naming}
-
- Volume names are assumed to be unique across the entire network.
- A volume name is the pathname to the volume's root as known by the
- users of that volume. Since this name uniquely identifies the volume contents,
- all volumes can be named and accessed from each host, subject to
- administrative controls.
-
- Volumes may be replicated or duplicated. Replicated volumes contain identical
- copies of the same data and reside at two or more locations in the network.
- Each of the replicated volumes can be used interchangeably.
- Duplicated volumes each have the same name but contain different, though
- functionally identical, data. For example, {\tt /vol/tex} might be the
- name of a \TeX\ distribution which varied for each machine architecture.
-
- \Amd\ provides facilities to take advantage of both replicated and
- duplicated volumes. Configuration options allow a single set of configuration
- data to be shared across an entire network by taking advantage of replicated
- and duplicated volumes.
-
- \Amd\ can take advantage of replacement volumes by mounting
- them as required should an active fileserver become unavailable.
-
- \Section{Volume Binding}
-
- \Unix\ implements a namespace of hierarchically mounted filesystems.
- Two forms of binding between names and files are provided.
- A {\em hard link} completes the binding when the name is added to the filesystem.
- A {\em soft link} delays the binding until the name is accessed.
- An {\em automounter} adds a further form in which the binding of name to
- filesystem is delayed until the name is accessed.
-
- The target volume, in its general form, is a tuple (host, filesystem, sublink)
- which can be used to name the physical location of any volume in
- the network.
-
- When a target is referenced, \amd\ ignores the sublink element and determines
- whether the required filesystem is already mounted. This is done by computing
- the local mount point for the filesystem and checking for an existing filesystem
- mounted at the same place. If such a filesystem already exists then it is
- assumed to be functionally identical to the target filesystem. By default
- there is a one-to-one mapping between the pair (host, filesystem) and the local
- mount point so this assumption is valid.
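
The check described above can be sketched as follows. This is a minimal illustration rather than \amd's actual code: the {\tt AUTO\_DIR} prefix, the way the mount point is derived from the host and filesystem names, and all function names are assumptions made for the example.

```python
from typing import NamedTuple, Set

# Hypothetical directory under which filesystems are mounted; an assumption
# made for this sketch, not a value taken from the text above.
AUTO_DIR = "/a"


class Target(NamedTuple):
    """The (host, filesystem, sublink) tuple naming a volume's location."""
    host: str
    filesystem: str
    sublink: str = ""


def local_mount_point(target: Target) -> str:
    # Depends only on (host, filesystem); the sublink is ignored, which
    # gives the one-to-one mapping the text relies on.
    return f"{AUTO_DIR}/{target.host}{target.filesystem}"


def needs_mount(target: Target, mounted: Set[str]) -> bool:
    # A filesystem already mounted at the computed mount point is assumed
    # to be functionally identical to the target filesystem.
    return local_mount_point(target) not in mounted


def volume_root(target: Target) -> str:
    # The sublink only matters once the filesystem itself is mounted.
    mount_point = local_mount_point(target)
    return f"{mount_point}/{target.sublink}" if target.sublink else mount_point
```

Two targets that differ only in their sublink share one mount point, so the second reference finds the filesystem already mounted.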
-
- \Section{Operational Principles}
-
- \Amd\ operates by introducing new mount points into the namespace.
- The kernel sees these mount points as \NFS\ \cite{sun:nfs} filesystems being served by \amd.
- Having attached itself to the namespace, \amd\ is now able to control
- the view the rest of the system has of those mount points.
- RPC \cite{sun:rpc} calls are received from the kernel one at a time.
-
- When a {\em lookup} call is received \amd\ checks whether the
- name is already known. If it is not, the required volume is mounted.
- A symbolic link pointing to the volume root is then returned.
- Once the symbolic link is returned, the kernel will send all
- other requests direct to the mounted filesystem.
-
- If a volume is not yet mounted, \amd\ consults a configuration
- {\em mount-map} corresponding to the automount point.
- \Amd\ then makes a runtime decision on what and where to mount
- a filesystem based on the information obtained from the map.
-
- \Amd\ does not implement all the \NFS\ requests; only those
- relevant to name binding such as {\em lookup}, {\em readlink}
- and {\em readdir}. Some other calls are also implemented
- but most simply return an error code; for example {\em mkdir}
- always returns ``Read-only filesystem''.
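
As a rough sketch of this dispatch, the fragment below models only the two procedures whose behaviour is stated above: {\em lookup}, which mounts on demand and answers with a symbolic link, and {\em mkdir}, which always fails. The reply tuples, the {\tt mount\_volume} callback and the function name are hypothetical, and {\tt errno.EROFS} is used here to stand for the ``Read-only filesystem'' error.

```python
import errno
from typing import Callable, Dict, Tuple, Union

# Hypothetical reply shape: ("symlink", path) or ("error", errno value).
Reply = Tuple[str, Union[str, int]]


def handle_call(proc: str, name: str,
                volume_roots: Dict[str, str],
                mount_volume: Callable[[str], str]) -> Reply:
    if proc == "lookup":
        # Mount the volume only if the name is not already known.
        if name not in volume_roots:
            volume_roots[name] = mount_volume(name)
        # The reply is a symbolic link pointing at the volume root.
        return ("symlink", volume_roots[name])
    if proc == "mkdir":
        # The automount point is never writable through the automounter.
        return ("error", errno.EROFS)
    # Other procedures are outside the scope of this sketch.
    raise NotImplementedError(proc)
```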
-
- \Section{Mounting a Volume}
-
- Each automount point has a mount map. The mount map contains
- a list of key--value pairs. The key is the name of the volume to
- be mounted. The value is a list of locations describing where the
- filesystem is stored in the network.
- In the source for the map the value would look like
- \begin{quote}
- ${\em location}_1\ \ {\em location}_2\ \ \ldots\ \ {\em location}_n$
- \end{quote}
-
- \Amd\ examines each location in turn. Each location may contain {\em selectors}
- which control whether \amd\ can use that location. For example, the location
- may be restricted to use by certain hosts. Those locations which cannot be used
- are ignored.
-
- \Amd\ attempts to mount the filesystem described by each remaining location
- until a mount succeeds or \amd\ can no longer proceed.
- The latter can occur in three ways:
- \begin{itemize}
- \item
- If none of
- the locations could be used, or if all of the locations caused an error,
- then the last error is returned.
-
- \item
- If a location could be used but was being mounted in the background then \amd\ marks
- that mount as being ``in progress'' and continues with the next request; no reply
- is sent to the kernel.
-
- \item
- Lastly, one or more of the mounts may have been {\em deferred}.
- A mount is deferred if extra information is required before the mount
- can proceed. When the information becomes available the mount will
- take place, but in the meantime no reply is sent to the kernel.
- If the mount is deferred, \amd\ continues to try any remaining locations.
- \end{itemize}
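
The selection procedure above can be summarised in a short sketch. It is a simplification under stated assumptions: selectors are reduced to a predicate on the local host name, each mount attempt is a callback that reports its outcome, and the ``last error'' is represented by the name of the last failing location. None of these names come from \amd\ itself.

```python
from enum import Enum
from typing import Callable, List, NamedTuple, Optional


class Outcome(Enum):
    MOUNTED = "mounted"
    IN_PROGRESS = "in progress"  # background mount started; no reply sent
    DEFERRED = "deferred"        # waiting for extra information; no reply sent
    ERROR = "error"


class Location(NamedTuple):
    name: str
    selector: Callable[[str], bool]    # hypothetical: may this host use it?
    try_mount: Callable[[], Outcome]   # hypothetical mount attempt


class Result(NamedTuple):
    outcome: Outcome
    location: Optional[str]


def mount_volume(locations: List[Location], host: str) -> Result:
    # Locations whose selectors reject this host are ignored.
    usable = [loc for loc in locations if loc.selector(host)]
    deferred = False
    last_error = None
    for loc in usable:
        outcome = loc.try_mount()
        if outcome in (Outcome.MOUNTED, Outcome.IN_PROGRESS):
            # Success, or a background mount: stop trying locations.
            return Result(outcome, loc.name)
        if outcome is Outcome.DEFERRED:
            # Remember the deferral but keep trying the remaining locations.
            deferred = True
            continue
        last_error = loc.name
    if deferred:
        return Result(Outcome.DEFERRED, None)
    # Nothing usable, or every usable location failed.
    return Result(Outcome.ERROR, last_error)
```

Only the {\tt MOUNTED} and {\tt ERROR} results would lead to a reply being sent to the kernel; the other two leave the request unanswered.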
-
- %\Section{Task Scheduling}\label{task scheduler}
- %
- %\Amd\ provides a task scheduler to support its non-blocking semantics.
- %The basic operation of the scheduler is to call a procedure when
- %a particular event occurs. A general sleep/wakeup mechanism is used
- %and sub-process support is built on that. The scheduler maintains
- %two queues: one of blocked calls and one of callbacks waiting to
- %be made.
- %When a child process exits, its exit status is picked up by a signal
- %handler and a wakeup is issued on the internal job descriptor for that sub-process.
- %A timeout/untimeout mechanism provides for time dependent processing.
-
- \Section{Automatic Unmounting}
-
- To avoid an ever increasing number of filesystem mounts, \amd\ removes
- volume mappings which have not been used recently. A time-to-live interval
- is associated with each mapping and when that expires the mapping is removed.
- When the last reference to a filesystem is removed, that filesystem is unmounted.
- If the unmount fails, for example because the filesystem is still busy, the mapping
- is re-instated and its time-to-live interval is extended.
- The global default for this grace period is controlled by the ``-w'' command-line
- option (\see \Ref{opt:wait}). It is also possible to set this value on a per-mount basis
- (\see \Ref{opt:utimeout}).
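
A minimal sketch of this expiry cycle follows. It assumes, for simplicity, that each mapping is the only reference to its filesystem, so removing the mapping and unmounting the filesystem are a single step. The class, the {\tt try\_unmount} callback and the {\tt retry\_interval} parameter are illustrative names; timestamps are plain numbers of seconds.

```python
from typing import Callable, Dict, List


class Mapping:
    """A cached volume mapping with a time-to-live."""

    def __init__(self, name: str, ttl: float, now: float) -> None:
        self.name = name
        self.ttl = ttl
        self.expires = now + ttl

    def touch(self, now: float) -> None:
        # Using a mapping restarts its time-to-live interval.
        self.expires = now + self.ttl


def expire_mappings(mappings: Dict[str, Mapping], now: float,
                    try_unmount: Callable[[str], bool],
                    retry_interval: float) -> List[str]:
    """Remove expired mappings; return the names actually removed."""
    removed = []
    for name, mapping in list(mappings.items()):
        if now < mapping.expires:
            continue
        if try_unmount(name):
            del mappings[name]
            removed.append(name)
        else:
            # Unmount failed (for example, still busy): keep the mapping
            # and extend its lifetime by the grace period.
            mapping.expires = now + retry_interval
    return removed
```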
-
- \Section{Keep-alives}\label{keepalives}
-
- Use of some filesystem types requires the presence of a server on another machine.
- If a machine crashes then it is of no concern to processes on that machine
- that the filesystem is unavailable. However, to processes on a remote host using
- that machine as a fileserver this event is important. This situation is
- most widely recognised when an \NFS\ server crashes and the behaviour observed
- on client machines is that more and more processes hang.
- In order to provide the possibility of recovery, \amd\ implements a {\em keep-alive}
- interval timer for some filesystem types.
- Currently only \NFS\ makes use of this service.
-
- The basis of the \NFS\ keep-alive implementation is the observation that
- most sites maintain replicated copies of common system data such as manual
- pages, most or all programs, system source code and so on.
- If one of those servers goes down it would be reasonable to mount one of
- the others as a replacement.
-
- The first part of the process is to keep track of which fileservers are up and
- which are down. \Amd\ does this by sending RPC requests to the servers'
- \NFS\ {\sc NullProc} and checking whether a reply is returned.
- While the server state is uncertain the requests are re-transmitted
- at three second intervals and if no reply is received after four attempts
- the server is marked down. If a reply is received the fileserver is marked
- up and stays in that state for 30 seconds at which time another \NFS\ ping is sent.
-
- Once a fileserver is marked down, requests continue to be sent every 30 seconds
- in order to determine when the fileserver comes back up. During this time
- any reference through \amd\ to the filesystems on that server fails with the
- error ``Operation would block''.
- If a replacement volume is available then it will be mounted, otherwise
- the error is returned to the user.
-
- %\Amd\ keeps track of which servers are up and which are down.
- %It does this by sending RPC requests to the servers' \NFS\ {\sc NullProc} and
- %checking whether a reply is returned. If no replies are received after a
- %short period, \amd\ marks the fileserver {\em down}.
- %RPC requests continue to be sent so that it will notice when a fileserver
- %comes back up.
- %ICMP echo packets \cite{rfc:icmp} are not used because it is the availability
- %of the \NFS\ service that is important, not the existence of a base kernel.
-
- %Whenever a reference to a fileserver which is down is made via \amd\, an alternate
- %filesystem is mounted if one is available.
- Although this action does not protect
- user files, which are unique on the network, or processes which do not access files
- via \amd\ or already have open files on the hung filesystem, it can prevent most new
- processes from hanging.
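
The up/down tracking described in this section amounts to a small state machine, sketched below using the intervals quoted above. The class and method names are hypothetical. The sketch also assumes that a missed reply from a server currently marked up simply puts it back into the uncertain state; the text does not spell this case out.

```python
RETRY_INTERVAL = 3   # seconds between retransmissions while state is uncertain
MAX_ATTEMPTS = 4     # unanswered attempts before the server is marked down
PING_INTERVAL = 30   # seconds between pings once the state is known


class ServerState:
    def __init__(self) -> None:
        self.status = "unknown"   # one of "unknown", "up", "down"
        self.attempts = 0

    def on_reply(self) -> int:
        """A ping was answered; return seconds until the next ping."""
        self.status = "up"
        self.attempts = 0
        return PING_INTERVAL

    def on_timeout(self) -> int:
        """A ping went unanswered; return seconds until the next ping."""
        if self.status == "down":
            # Keep probing so that recovery is noticed.
            return PING_INTERVAL
        self.attempts += 1
        if self.attempts >= MAX_ATTEMPTS:
            self.status = "down"
            return PING_INTERVAL
        # Assumption: any missed reply makes the state uncertain again.
        self.status = "unknown"
        return RETRY_INTERVAL
```

With these values a server that stops answering is marked down on the fourth unanswered ping, and is probed every 30 seconds from then on.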
-
- %With a suitable combination of filesystem management and mount-maps,
- %machines can be protected against most server downtime. This can be
- %enhanced by allocating boot-servers dynamically which allows a diskless
- %workstation to be quickly restarted if necessary. Once the root filesystem
- %is mounted, \amd\ can be started and allowed to mount the remainder of
- %the filesystem from whichever fileservers are available.
-
- \Section{Non-blocking Operation}
-
- Since there is only one instance of \amd\ for each automount point,
- and usually only one instance on each machine, it is important
- that it is always available to service kernel calls.
- \Amd\ goes to great lengths to ensure that it does not block in a system call.
- As a last resort \amd\ will fork before it attempts a system call that may block
- indefinitely, such as mounting an \NFS\ filesystem.
- Other tasks, such as obtaining filehandle information for an \NFS\ filesystem,
- are done using a purpose-built non-blocking RPC library which is integrated
- with \amd's task scheduler.% (\see \Ref{task scheduler}).
- This library is also used to implement \NFS\ keep-alives (\see \Ref{keepalives}).
-
- Whenever a mount is deferred or backgrounded, \amd\ cannot reply to the kernel
- until the mount completes. Waiting for that reply to be constructed would cause
- \amd\ to block. Rather than do this, \amd\ simply {\em drops}
- the call under the assumption that the kernel RPC mechanism will automatically
- retry the request.
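
The interaction between a dropped call and the kernel's retry can be modelled as below. This is a toy simulation, not the RPC mechanism itself: a dropped call is represented by a {\tt None} return value, the kernel's retransmission by a bounded loop, and every name in the sketch is an assumption.

```python
from typing import Callable, Dict, Optional, Set


class Automounter:
    def __init__(self) -> None:
        self.pending: Set[str] = set()     # mounts still in progress
        self.mounted: Dict[str, str] = {}  # name -> volume root

    def lookup(self, name: str) -> Optional[str]:
        if name in self.mounted:
            return self.mounted[name]
        # Start (or keep waiting for) the mount, then drop the call
        # instead of blocking until a reply can be constructed.
        self.pending.add(name)
        return None

    def mount_finished(self, name: str, root: str) -> None:
        self.pending.discard(name)
        self.mounted[name] = root


def kernel_lookup(amd: Automounter, name: str, max_retries: int,
                  between_retries: Callable[[], None]) -> str:
    # Stand-in for the kernel RPC layer, which retransmits a request
    # that received no reply.
    for _ in range(max_retries):
        reply = amd.lookup(name)
        if reply is not None:
            return reply
        between_retries()
    raise TimeoutError(name)
```

The automounter never waits inside {\tt lookup}, so it remains free to service other requests while the mount completes.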
-