WWC snapshot of http://www.alw.nih.gov/WWW/ALW-brochure.html taken on Sat Jun 10 17:45:12 1995

Advanced Laboratory Workstation System



INTRODUCTION

The National Institutes of Health's Advanced Laboratory Workstation (ALW) System is a general-purpose, open, distributed computing system currently consisting of about 200 client UNIX* workstations, 10 file servers providing almost 300GB of disk storage, and over 400 registered users. The system was developed by the NIH Division of Computer Research and Technology (DCRT), and has been in production use since June 1991.

All Advanced Laboratory Workstations are interconnected by the NIH campus-wide network, which they use to share resources and access services such as file backup, software maintenance, applications software, online documentation, international electronic mail and news, computation and database servers, laser printers, and an international distributed file system. ALWs are particularly suitable for scientific applications requiring high-performance computing or graphics, or access to large amounts of data. The most popular such applications include medical image processing, DNA and protein sequencing and searching, statistical analysis, and molecular graphics and modeling.

The ALW System won the 1992 Best in Open Systems Solutions (BOSS) award in the Innovation in Hardware, Software, and Networking Approaches category. This award is conferred annually by the Federal Computer Conference and the Government Open Systems Solutions Council to recognize government agencies that have best applied open systems technology.

SYSTEM OVERVIEW

ALWs offer the best features of both personal computers and mainframe timesharing systems. As with a personal computer, a user purchases a workstation for use in the office or lab, and configures it with memory, disk, and peripherals as dictated by the user's needs and budget. An entry-level ALW costs less than $8,000. Nearly all computation is performed on the workstation, so users do not compete with one another for processor time and memory, as they would on a timesharing system. Also like a personal computer, the workstation has a high-speed display that supports an interactive graphical user interface.

Distributed File System

Like a timesharing system, however, an ALW relies on centrally administered files and software. Master copies of the data and program files needed to use an ALW are stored on special-purpose computers called file servers that DCRT operates. Whenever a user needs to access a particular file, the workstation uses the network to contact the file server where the master copy resides, and makes a copy of the file on the local disk. If a user changes or creates a file on the local disk, the workstation sends a copy to a file server, which creates or replaces the master copy. The workstation keeps track of which files it has copied locally, and uses the local copy whenever possible to avoid unnecessary transfers over the network, thus improving performance. This all happens automatically, creating the illusion that all files are stored on the workstation on a gigantic disk, when in reality they reside on file servers elsewhere on the network, perhaps even on the other side of the globe! The software that performs this service is called a distributed file system.
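The caching behavior described above can be sketched as a toy model. This is an illustration only, not the way the ALW file system is implemented: the class and method names (FileServer, Workstation, read, write) and the file path are hypothetical, and the sketch ignores what happens when another workstation changes a file that is already cached.

```python
class FileServer:
    """Holds the master copy of every file (toy stand-in for a file server)."""

    def __init__(self):
        self.master = {}   # path -> contents of the master copy
        self.fetches = 0   # number of transfers served to workstations

    def fetch(self, path):
        self.fetches += 1
        return self.master[path]

    def store(self, path, data):
        # Create or replace the master copy.
        self.master[path] = data


class Workstation:
    """Keeps local copies of the files it has already fetched."""

    def __init__(self, server):
        self.server = server
        self.cache = {}    # path -> contents copied to the local disk

    def read(self, path):
        # Use the local copy whenever possible; contact the server otherwise.
        if path not in self.cache:
            self.cache[path] = self.server.fetch(path)
        return self.cache[path]

    def write(self, path, data):
        # Update the local copy, then send a copy to the file server.
        self.cache[path] = data
        self.server.store(path, data)


server = FileServer()
server.store("/home/user/notes.txt", "draft 1")

ws = Workstation(server)
ws.read("/home/user/notes.txt")   # fetched over the "network"
ws.read("/home/user/notes.txt")   # served from the local copy
ws.write("/home/user/notes.txt", "draft 2")
```

In this sketch the second read never reaches the server, which is the saving in network transfers that the paragraph above describes.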

Storing files in the distributed file system rather than on the workstation's local disk offers many advantages: daily file backup, higher security, automatic software maintenance, file sharing, and user mobility. The following sections describe these in greater detail.

Daily file backup

The ALW operators back up master copies of user files to tape every working day, thereby protecting users from losing more than one day's work in the event of a disk disaster. Because the file servers operate 24 hours a day, 7 days a week, the operators can back up a user's files even if the user's workstation is powered off and locked in an office.

The distributed file system keeps a read-only copy of the previous day's version of each user's files online. This allows users to easily recover a file that may have been inadvertently deleted or modified.

Security

The file servers reside in a locked machine room, and are attended by operators most of the day. Physical access to the machines is thus limited to a small number of authorized personnel. Also, since the file servers are specialized systems, they do not run software such as that for handling electronic mail and file transfer that intruders typically use to gain unauthorized access to files via the network.

ALW staff routinely monitor ALWs for potential security vulnerabilities, and regularly install and distribute software security updates.

Automatic software maintenance

Users rarely need to install or update the software that runs on their ALW. Master copies of all software reside in the distributed file system, and ALWs either run it directly from there, or, in the case of critical system software needed to get the workstation started, copy it when necessary from the distributed file system to the workstation's local disk at boot time. The ALW System's staff maintains all software centrally, with upgrades distributed to all workstations automatically via the network.

File sharing

The distributed file system makes it easy for users to share programs and data with other users if so desired. For example, a user can specify which other users or groups of users are allowed to read or write the files in a particular directory by placing their names on the Access Control List (ACL) for that directory. Users can also define their own user groups, and control who is a member. This is extremely useful when a group of people wish to collaborate on a project.
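As a rough illustration of directory-level access control with user-defined groups, consider the following sketch. It is a simplified model written for this brochure, not the file system's actual interface: the names Directory, grant, and allowed, the users, the group name, and the "read"/"write" rights are all assumptions made for the example.

```python
class Directory:
    """A directory carrying its own access control list (toy model)."""

    def __init__(self, name):
        self.name = name
        self.acl = {}   # user or group name -> set of rights

    def grant(self, who, *rights):
        # Place a user or group on the ACL with the given rights.
        self.acl.setdefault(who, set()).update(rights)


def allowed(user, right, directory, groups):
    """Return True if the user holds the right directly or via a group."""
    if right in directory.acl.get(user, set()):
        return True
    return any(
        user in members and right in directory.acl.get(group, set())
        for group, members in groups.items()
    )


# A user-defined group (hypothetical name) and its members.
groups = {"alice:imaging": {"alice", "bob"}}

project = Directory("/home/alice/project")
project.grant("alice", "read", "write")   # the owner can read and write
project.grant("alice:imaging", "read")    # collaborators can only read
```

Here bob can read the project directory because he belongs to the group, but he cannot write to it, and a user outside the group has no access at all.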

User mobility

The distributed file system also makes it possible for users to access their files from virtually any UNIX machine connected to the NIH network. They can thus conveniently work on several machines without needing to explicitly move files from machine to machine or keep track of where the most recent version of a file resides. Unlike on a personal computer, a user's customized environment moves with the user: the preferred "look and feel," which application programs to start initially, the size and placement of windows on the display, the contents and structure of various menus, and hundreds of other parameters. These characteristics make ALWs ideal for shared or public use.

Network Graphics System

ALWs run the X Window System* to provide a "desktop" graphical, mouse and menu-operated user interface similar to that of the Apple Macintosh* or Microsoft Windows*. However, the X Window System offers an extra dimension over its personal computer counterparts: the application program controlling a window need not be running on the same computer to which the display is connected, but may instead run on another machine and control the window via the network. For example, a user may have one window on his ALW display running a text editor or drawing program locally, another command window running remotely on DCRT's Convex* supercomputer, and a third window displaying the results of a visualization program running remotely on another workstation. A user can select and "cut" text from a Convex window, and "paste" it into the local text editor window just as if both programs were running locally. The window system moves the information across the network automatically. Thus, the X Window System enables users to conveniently access and coordinate applications running on different machines. This makes ALWs ideal for use as front ends to mainframes or special-purpose systems such as DCRT's highly parallel computer.

Electronic Mail

ALW users can send electronic mail around the world. The electronic mail system is connected to the Internet, and gatewayed to most other mail systems, such as BITNET, 3COM, Microsoft*, DECnet, and UUCP.

Optionally, users can exchange multi-media mail with other ALW users via the Andrew Message System (AMS). In addition to plain text, AMS multi-media mail can contain multi-font text, equations, tables, graphics, images, animations, and sound.

Electronic News

The distributed file system gathers and stores thousands of news articles from all over the globe each day. These are organized into nearly one thousand news groups covering topics from "bionet.agroforestry" to "vmsnet.networks.management." Users can browse these articles, and also post articles for distribution to tens of thousands of sites around the world. Users can also arrange with ALW administration to establish NIH-wide newsgroups on topics of local interest.

Applications Software

Major applications include: Analyze*, Asterix*, AVS*, FrameMaker*, GlobalView*, Mathematica*, MATLAB*, SAS*, S-PLUS*, Synchronize*, and WordPerfect*.

TECHNICAL DESCRIPTION

The following sections describe the ALW System's technical characteristics.

Network

All ALWs communicate via the TCP/IP protocol suite over NIHnet, our campus network. NIHnet consists of a high-speed FDDI backbone, which interconnects 10 buildings on the NIH campus, plus T1 links to the remaining on-campus buildings and off-campus sites (about 30 buildings). Approximately 75 LANs (token ring, Ethernet, and FDDI) attach to the FDDI backbone, and 100 LANs attach via T1 links.

Distributed File System

A distinctive feature of our system is its use of the Andrew File System (AFS*) to provide distributed file services. The major advantages of AFS over alternatives such as NFS are scalability, ease of administration, and improved security. However, systems that run only NFS, such as Silicon Graphics* workstations and our Convex minisupercomputer, can still access AFS via the NFS/AFS translator that is included with AFS.

Our use of AFS also positions us perfectly for migration to OSF's Distributed Computing Environment (DCE*), since the next major version of AFS (AFS 4.0) was adopted as the distributed file system component for DCE.

Client Workstations

We currently support as client workstations Sun SPARCstations* running SunOS 4.1.3, DECstations* running ULTRIX* 4.3, the HP* 9000/700 series running HP-UX 9.0.1, and Silicon Graphics workstations running IRIX* 4.0.5.

Most of our workstations are configured as dataless clients: the ALW's local disk contains only that software necessary to boot the machine and get it communicating on the network. The remaining local disk space is used for swap space, temporary files, and the AFS file cache. Since all the information on the local disk can be rapidly reconstructed from data held in the distributed file system, it does not need to be backed up, and users are relieved of this burden.

File Servers

We presently operate five Sun SPARCserver 1000s, one SPARCstation, and three DECstation 5000/240s as AFS servers. They provide a combined total of almost 300GB of storage. The servers are divided into two clusters, each located in a different building.

AFS makes a read-only clone of the files to be backed up early each day, and this "snapshot" is backed up to 8mm tape during the prime shift. We thus obtain a consistent backup of the file system without having to shut down the file servers.

The distributed file system retains the read-only clones online until the next morning. This allows users to easily restore the most recent version of files that they have accidentally deleted or modified without requiring operator assistance.
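The clone-then-backup cycle described above can be pictured with a small model. This is a sketch under stated assumptions, not AFS itself: the Volume class and its methods are hypothetical names, and the clone is made here by simply copying the files, whereas a real file system would likely avoid duplicating the data.

```python
class Volume:
    """A collection of user files plus one read-only clone (toy model)."""

    def __init__(self):
        self.live = {}    # path -> contents users are working on
        self.clone = {}   # frozen copy made by the most recent snapshot

    def snapshot(self):
        # Replace the previous clone with a frozen copy of the current files.
        self.clone = dict(self.live)

    def dump_to_tape(self):
        # The tape backup reads the clone, so it stays consistent even
        # while users keep changing the live files.
        return dict(self.clone)

    def restore(self, path):
        # Recover a file from the clone without operator assistance.
        # Raises KeyError if the file was not present at snapshot time.
        self.live[path] = self.clone[path]


vol = Volume()
vol.live["report.txt"] = "version 1"
vol.snapshot()                      # early each day

vol.live["report.txt"] = "oops"     # later that day: accidental change
tape = vol.dump_to_tape()           # backup still sees the snapshot
del vol.live["report.txt"]          # ... and then accidental deletion

vol.restore("report.txt")           # user recovers the file unaided
```

Because the dump reads only the frozen clone, changes made to the live files during the day do not affect what is written to tape.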

AFS permits us to replicate read-only data to improve system performance and reliability. We operate with two copies of the operating system and application program executables for each system type on physically different servers. This distributes access to these files across two machines, and eliminates a single point of failure: when a file server fails, only those users whose home directories reside on the failed machine are affected. Client workstations accessing replicated data dynamically switch to use the remaining accessible copy.
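The switch to a remaining copy can be sketched as follows. This is a minimal model of the idea, not the AFS client's actual logic: the Replica class, the ServerDown exception, the read_replicated function, and the server names are hypothetical, and the sketch simply tries the copies in a fixed order.

```python
class ServerDown(Exception):
    """Raised when a file server cannot be reached."""


class Replica:
    """One read-only copy of a set of files on a particular server."""

    def __init__(self, name, files):
        self.name = name
        self.files = files
        self.up = True

    def read(self, path):
        if not self.up:
            raise ServerDown(self.name)
        return self.files[path]


def read_replicated(path, replicas):
    """Read from the first reachable replica; return (server name, data)."""
    for replica in replicas:
        try:
            return replica.name, replica.read(path)
        except ServerDown:
            continue   # try the next copy
    raise ServerDown(f"no replica holding {path} is reachable")


executables = {"/usr/bin/editor": "binary image"}
replicas = [
    Replica("server-a", dict(executables)),
    Replica("server-b", dict(executables)),
]

before = read_replicated("/usr/bin/editor", replicas)
replicas[0].up = False                       # first server fails
after = read_replicated("/usr/bin/editor", replicas)
```

The caller gets the same data before and after the failure; only the server that supplied it changes.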

AFS can move collections of files "on-the-fly" between disks or servers. We use this capability to balance disk space utilization.
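One way to think about balancing disk space by moving collections of files is the greedy sketch below. The selection policy is an assumption made for illustration; this brochure does not describe how moves are actually chosen. The function names, disk names, and sizes are hypothetical.

```python
def used(disk):
    """Total size of the file collections stored on one disk."""
    return sum(disk.values())


def rebalance_once(disks):
    """Move one collection from the fullest disk to the emptiest one.

    disks maps a disk name to a dict of {collection name: size in MB}.
    Returns (collection, source, target), or None if no single move
    would narrow the gap between the two disks.
    """
    source = max(disks, key=lambda d: used(disks[d]))
    target = min(disks, key=lambda d: used(disks[d]))
    gap = used(disks[source]) - used(disks[target])

    # Moving a collection of size s leaves a gap of |gap - 2s|, which is
    # an improvement only when 0 < s < gap.
    candidates = [
        (size, name) for name, size in disks[source].items() if 0 < size < gap
    ]
    if not candidates:
        return None

    # Pick the collection that leaves the smallest remaining gap.
    size, name = min(candidates, key=lambda c: abs(gap - 2 * c[0]))
    disks[target][name] = disks[source].pop(name)
    return name, source, target


disks = {
    "disk-a": {"vol1": 500, "vol2": 300, "vol3": 100},
    "disk-b": {"vol4": 200},
}
first = rebalance_once(disks)    # moves vol2: usage becomes 600 vs 500
second = rebalance_once(disks)   # no remaining move narrows the gap
```

Repeating the step until it returns None gives a simple, if not optimal, way to even out utilization.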

Security

User authentication is via Kerberos. Standard UNIX file access is augmented by directory-level Access Control Lists (ACLs). Users can define their own user groups, and control who is a member. This is extremely useful when a group of people wish to collaborate on a project.

Window System and User Interface

ALWs run the X Window System to provide a "desktop" graphical, mouse and menu-operated user interface. All supported system types run Motif, and we additionally provide OpenWindows on Suns.

Electronic Mail Delivery

Mail delivery is handled by the Andrew Message Delivery System (AMDS), which utilizes AFS. Mail is delivered to a subdirectory in each user's home directory, where it can be easily accessed from any ALW and is backed up daily along with other user files. AMDS post office machines can be replicated for robustness and provide an SMTP gateway for connectivity to the rest of the world. ALW users can run SMTP if they prefer.

Software Distribution and Management

Distribution of updates to the system software stored on the local disks of client workstations is done at boot time via the package utility that is supplied with AFS. We have enhanced package to accommodate user customization of ALW configuration; for example, adding a local printer, extra disks, or special peripherals. We have found such customization to be a requirement in our research environment.

Applications software is managed by depot, a C program developed at Carnegie Mellon University. Depot enables system administrators to more easily perform integrated testing and release engineering, simplifies search path management, permits control of application version configuration, and can selectively copy and update individual applications on an ALW's local disk. This last feature is crucial when configuring workstations that must operate when portions of the system are inaccessible; for example, machines used to restore files from backup tapes or run network diagnostics.

User Support

User support is via an online trouble report submission and tracking system and the 496-UNIX telephone hotline.

SUMMARY

ALWs are an enabling technology. To an NIH researcher without access to local computer expertise and support, an ALW is the only practical means of obtaining this high level of computing power and storage capacity on the desktop. By relying on the network for file space and software support, we provide our users with true "plug & play" capability for a variety of UNIX workstations.

The attached article, which appeared in the August 18, 1992 issue of The NIH Record, describes the major benefits of the ALW System, a few of its more demanding scientific applications, and the highly favorable reaction of several users. We have achieved these results by using innovative, state-of-the-art software and hardware, particularly for the distributed file system and for software distribution and management.

FURTHER INFORMATION

Keith Gorlen
Chief, Distributed Systems Section
Computing Facilities Branch
Division of Computer Research and Technology
National Institutes of Health
Bethesda, MD 20892
Phone: (301) 496-1111
FAX: (301) 402-2867
E-mail: kgorlen@alw.nih.gov  


* TRADEMARKS


Comments to www-alw@alw.nih.gov
