LinuxTag 2002 Conference CD-ROM


UMLinux - A Tool for Testing a Linux System's Fault Tolerance
by Hans-Joerg Hoexer
Universität Erlangen-Nürnberg

Other authors: Kerstin Buchacker, Volkmar Sieh. All authors participate in the UMLinux project at the University of Erlangen.

When setting up servers, it is often useful to know how the systems will react to hardware failures such as a defective disk drive, memory chip, or network interface, or to a simple power failure. Will data be lost or corrupted, or will the system merely be inaccessible to clients for some time? Silent data corruption without any error message, for example, is a worst-case scenario for database systems.

It is equally useful to be able to test whether a system designed to keep delivering services in the presence of faults will in fact do so.

To help answer these questions, we have implemented UMLinux, a User Mode Linux that can be used for realistic fault injection experiments. To simulate reality as closely as possible, UMLinux implements kernel memory protection and runs the complete virtual machine (operating system and all processes) as a single process on the real host. UMLinux is binary-compatible with the host, so all binaries that run on the host also run on UMLinux without recompilation.

The system we want to examine is set up using virtual UMLinux machines. For most Linux distributions, we can use the out-of-the-box installation routine to install the virtual machine directly from CD-ROM. The virtual hardware, i.e., the memory size, the hard disk, floppy disk, and CD-ROM drives, and the network interfaces, can be configured freely within the limits imposed by the resources available on the host.

When the virtual server system is up and running, the fault injector can be configured to inject faults into the virtual hardware. We can currently inject bit-flips into CPU registers and main memory, defective bytes into any kind of block device, and network send and receive errors.
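To make two of the fault types above concrete, here is a minimal sketch in Python of a bit-flip and of defective bytes, applied to plain byte buffers. It is not UMLinux code: the function names `flip_bit` and `corrupt_bytes` are hypothetical, and a `bytearray` merely stands in for a memory region or a block-device image.

```python
import random


def flip_bit(memory: bytearray, byte_index: int, bit_index: int) -> None:
    """Invert a single bit in place, modelling a bit-flip in memory or a register.

    Hypothetical helper for illustration; not part of UMLinux.
    """
    if not 0 <= bit_index < 8:
        raise ValueError("bit_index must be in 0..7")
    memory[byte_index] ^= 1 << bit_index


def corrupt_bytes(block: bytearray, offset: int, length: int,
                  rng: random.Random) -> None:
    """Damage a byte range in place, modelling defective bytes on a block device.

    Hypothetical helper for illustration; not part of UMLinux.
    """
    if offset < 0 or length < 0 or offset + length > len(block):
        raise ValueError("range lies outside the block")
    for i in range(offset, offset + length):
        # XOR with a non-zero value guarantees the byte actually changes.
        block[i] ^= rng.randrange(1, 256)
```

Passing a seeded `random.Random` instance keeps an injection run reproducible, which matters when the same fault has to be replayed against a changed system configuration.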

The whole setup is currently controlled from a graphical front end. We are working on a script-driven automatic experiment controller.
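As a rough illustration of what script-driven control could look like, the following Python sketch replays a list of scheduled fault events in time order and dispatches each one to a matching injector callback. Everything here is an assumption made for illustration: the `FaultEvent` fields, the fault kind labels, and `run_experiment` do not come from UMLinux, and the sketch only orders events by timestamp without waiting in real or simulated time.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class FaultEvent:
    """One scheduled fault; all field names and values are hypothetical."""
    time_s: float  # point in the experiment at which the fault is due
    kind: str      # illustrative label, e.g. "memory_bitflip"
    target: str    # illustrative name of the virtual component


def run_experiment(
    events: List[FaultEvent],
    injectors: Dict[str, Callable[[FaultEvent], None]],
) -> List[str]:
    """Dispatch events in time order and return a log of what was done."""
    log: List[str] = []
    for event in sorted(events, key=lambda e: e.time_s):
        injector = injectors.get(event.kind)
        if injector is None:
            # Unknown fault kinds are recorded rather than silently dropped.
            log.append(f"{event.time_s:.1f}s skipped {event.kind} (no injector)")
            continue
        injector(event)
        log.append(f"{event.time_s:.1f}s injected {event.kind} into {event.target}")
    return log
```

Keeping the schedule as plain data and the injectors as callbacks means the same experiment description can be rerun unchanged, and the returned log records exactly which faults were applied.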

Talk materials: PS PDF

LinuxTag 2002 Conference CD-ROM © 2002 LinuxTag e.V.