- The following document is copyrighted, 1989, by Tim Sankary -
- all rights reserved. It may be copied and distributed freely as long
- as no changes are made and as long as this copyright notice remains
- with the document
-
- I want to preface this document with a personal statement. I
- am aware that Jim Goodwin has published a partial list of his virus
- disassemblies and I can imagine the controversy that will result. I
- do not have an inside track to the "truth" of this Distribute/Don't
- Distribute issue, and I can frankly see both sides of the argument. I
- find it hard, however, to censure a colleague who has performed such
- excellent and dedicated work as Jim has, and I have to admire his
- courage in taking such a controversial step. For those of you who
- anticipate writing or designing Identification and Removal programs
- (CVIA Class III programs) for viruses, I hope you will find something
- of value in the following study. If you have
- access to disassemblies, this document may provide some insights into
- designing your own disinfectant.
- I would like to thank "Doc" John McAfee for his guidance and
- help in developing this paper, and the Computer Virus Industry
- Association for the outstanding visual aids that they contributed.
- These figures have been referenced in the paper but I have been unable
- to create ASCII representations of them for BBS distribution. If you
- obtained this document from an electronic source and would like a copy
- of the figures, they can be obtained by sending a stamped,
- self-addressed envelope to the CVIA, 4423 Cheeney Street, Santa Clara,
- CA 95054. - Tim Sankary
- From the Homebase BBS
- 408 988 4004
-
-
-
- DEVELOPING VIRUS IDENTIFICATION PRODUCTS
-
-
- In January of 1986, the first computer virus to target the IBM PC
- was unleashed upon an unsuspecting and largely defenseless population
- of personal computer users worldwide. The virus originated in Lahore,
- Pakistan, and spread rapidly from country to country through Europe
- and across to the North American Continent. In less than twelve
- months it had infected nearly a half-million computers and was causing
- minor havoc in hundreds of universities, corporations and government
- agencies.
- This virus, later dubbed the "Pakistani Brain", caught the
- user community unawares and the problems resulting from its many
- infections demonstrated how unprepared we were for this phenomenon.
- The computer systems targeted by the virus contained no specific
- hardware or software elements that could prevent or even slow its
- spread, and few utilities could even detect its presence after an
- infection occurrence. Fortunately, the virus was not destructive, and
- it limited its infections to floppy diskettes, avoiding hard disks
- entirely.
- The first defensive procedure developed to counteract this
- virus involved a simple visual inspection of a suspected diskette's
- volume label. The virus erased every infected diskette's
- volume label and replaced it with the character string "@BRAIN".
- Thus, any inspection of the volume label, such as performing a simple
- DIR command, would indicate the presence or absence of the
- virus. An infected diskette could then be reformatted, or the virus
- could be removed by replacing the boot sector. This manual procedure
- is a typical, if somewhat rudimentary, example of the type of
- functions performed by a class of antiviral utilities commonly called
- Infection Identification products.
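-
- The manual label check described above lends itself to automation.
- The following is a minimal sketch in Python, assuming the raw bytes of
- a FAT root directory have already been read from the suspect diskette.
- The function names volume_label and looks_infected are hypothetical,
- and the label text is taken as quoted in this paper.

```python
from typing import Optional

DIR_ENTRY_SIZE = 32       # each FAT directory entry is 32 bytes
ATTR_VOLUME_ID = 0x08     # attribute bit that marks the volume label entry
ATTR_LONG_NAME = 0x0F     # long-name entries (a later FAT extension) also set 0x08
SUSPECT_LABEL = "@BRAIN"  # label text as quoted in this paper


def volume_label(root_dir: bytes) -> Optional[str]:
    """Return the volume label found in raw root-directory bytes, if any."""
    for pos in range(0, len(root_dir) - DIR_ENTRY_SIZE + 1, DIR_ENTRY_SIZE):
        entry = root_dir[pos:pos + DIR_ENTRY_SIZE]
        if entry[0] == 0x00:      # 0x00 marks the end of the directory
            break
        if entry[0] == 0xE5:      # 0xE5 marks a deleted entry
            continue
        attr = entry[11]          # attribute byte sits at offset 11
        if attr & ATTR_VOLUME_ID and attr != ATTR_LONG_NAME:
            return entry[:11].decode("ascii", errors="replace").rstrip()
    return None


def looks_infected(root_dir: bytes) -> bool:
    """True when the volume label starts with the suspect string."""
    label = volume_label(root_dir)
    return label is not None and label.upper().startswith(SUSPECT_LABEL)
```

- A label match alone is only one attribute; a fuller check would
- confirm it against the boot sector contents as well.
-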
- Infection identification products generally employ "passive"
- techniques for virus detection. That is, they work by examining the
- virus in its inert state. This contrasts with active detection
- products which look for specific actions employed by a virus. For
- example, looking for a Format instruction within a segment of code on
- a disk would be a passive method of detecting a potentially
- destructive program. If we detected the Format attempt during program
- execution, however, we would be performing an active detection.
- Passive methods concern themselves with the static attributes of
- viruses; active methods concern themselves with the results of virus
- execution.
- Example active indicators are: the attempted erasure of
- critical files, destruction of the File Allocation Table (FAT),
- re-direction of system interrupt vectors, general slowdown of the
- system, or an attempt to modify an executable program. These
- indicators are generic; that is, they are common to a large class of
- viruses. Because so many viruses perform these common activities,
- however, they are of little use in identifying individual virus
- strains. It is the passive virus indicators that prove most useful to
- a positive identification: the characteristic text imbedded within
- the virus, specific flags, singular filenames or a distinctive
- sequence of instructions unique to the virus. These and other similar
- indicators can best be ascertained by scanning system storage and
- examining the program files and other inert data.
-
- History
- Virus identification products have their genesis in the
- utility programs first developed in 1982 and 1983 to check public
- domain software for bombs or trojans before they were executed. These
- utility programs initially checked for questionable instructions in
- the suspect program's object code. Direct input/output instructions,
- interrupt calls, format sequences and like instructions, if found,
- were flagged and the user was notified. Later versions included tests
- for imbedded data strings that were typically used by trojan
- designers. Suspect programs were scanned for profanity, for keywords
- like "gotcha" or "sucker", and for data strings that had been found in
- specific trojan programs. Some programs looked also for specific
- names of files that were frequently used by trojans and bombs.
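-
- A generic string check of the kind described above might look like
- the following Python sketch. It is an illustration only: the function
- name flag_keywords is hypothetical, and the keyword list contains just
- the two examples quoted in the text.

```python
SUSPECT_KEYWORDS = (b"gotcha", b"sucker")  # keywords quoted in the text


def flag_keywords(image: bytes, keywords=SUSPECT_KEYWORDS):
    """Return (keyword, offset) for the first hit of each keyword.

    The comparison ignores ASCII case, since the message text in a
    suspect program may be written in any mix of upper and lower case.
    """
    lowered = image.lower()
    hits = []
    for word in keywords:
        offset = lowered.find(word.lower())
        if offset != -1:
            hits.append((word.decode("ascii"), offset))
    return hits
```

- A hit of this kind only marks the program as questionable; it says
- nothing about which bomb or trojan, if any, is present.
-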
- These products, however, were seldom able to identify a
- specific bomb or trojan. Rather, they indicated that the suspect
- program contained instructions or messages of a questionable nature -
- implying that the program might be a generic trojan. This, however,
- is not sufficient for dealing with viruses.
- Viruses create entirely different problems than bombs or
- trojans. Viruses replicate, and can infect hundreds or even thousands
- of programs within an installation. They remain invisible for long
- periods of time before they activate and cause damage. And they are
- difficult to remove because they imbed themselves within critical
- segments of the system. It is not sufficient to know that a virus is
- present; it is necessary to know which virus is present. We must know
- how it infects, what actions it takes, and, most importantly, what
- must be done to de-activate and remove the virus.
- Thus, when the first virus identification products emerged in
- 1986 they didn't just look for generic code or messages; they looked
- for specific indications that could identify the individual virus
- strain. This allowed the user to verify a specific infection
- occurrence and take appropriate action. Later versions of these
- products went a step further. They actually removed the virus when an
- infection was identified.
-
- Techniques
- Before we discuss the techniques used by identification
- products, we need to look briefly at how viruses insert themselves
- into programs. As shown in Figure 1, viruses actually modify the
- structure of the programs that they infect. Generally, the virus
- replaces the program's start-up segment with a routine that passes
- control to the main body of the virus. This main body code may be
- inserted within the program in a buffer area, or it may be added to
- the beginning or the end of the program. After execution of the
- virus, the program's original start-up sequence is restored and
- control is passed to the program.
- When removing a virus from an infected program, it is crucial
- to determine exactly how the virus modified the program. Each virus
- differs from other viruses in size, segmentation and technique. Each
- virus chooses a different area for infection, stores the start-up
- sequence in a different location, and returns control in a different
- manner. We must know exactly what the virus did during the infection
- process in order to reverse the steps for removal.
- Thus, it should be clear that in order to develop an antidote
- for a specific virus, we must first obtain a copy of the virus for
- analysis. A thorough analysis of the structure and design of the
- virus will provide all of the details described above.
- When a virus has been disassembled and analyzed, we in theory
- know all there is to know about the virus. We are then able to create
- an "attribute file" for the virus. This file contains all of the
- characteristics that can be uniquely assigned to the virus. For
- example, we may find imbedded data within the virus that
- we would not reasonably expect to find in any other program or data
- file. Or we may find an instruction sequence that is sufficiently
- unusual that we would not expect any other program to use the exact
- same sequence. Figure 2 shows two virus examples that contain unique
- imbedded data. In the Pakistani Brain example, it is clear that we
- would not expect to find the exact same name, address and telephone
- number in any other program.
- In addition to "identification" attributes, the attribute file
- contains all information necessary to reverse the virus infection
- process. Common elements of an attribute file might be:
- - Executable code signatures
- - Volume label flags
- - Hidden file names
- - Absolute sector address contents
- - Key data at specific file offsets
- - Specific interrupt vector modifications
- - ASCII data content
- - Specific increases in bad sector counts
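-
- One way to picture an attribute file is as a simple record. The
- Python sketch below is an assumed layout, not the format used by any
- actual product; the class and field names are hypothetical and cover
- only some of the elements listed above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class OffsetCheck:
    """Key data expected at a specific file offset."""
    offset: int
    expected: bytes


@dataclass
class AttributeFile:
    """Hypothetical container for one virus's identification attributes."""
    name: str
    code_signatures: List[bytes] = field(default_factory=list)
    ascii_strings: List[bytes] = field(default_factory=list)
    volume_label: Optional[str] = None
    hidden_file_names: List[str] = field(default_factory=list)
    offset_checks: List[OffsetCheck] = field(default_factory=list)

    def attribute_count(self) -> int:
        """Number of individual attributes available for cross-checking."""
        return (len(self.code_signatures) + len(self.ascii_strings)
                + len(self.hidden_file_names) + len(self.offset_checks)
                + (1 if self.volume_label is not None else 0))


# Two records built only from details mentioned in this paper.
brain = AttributeFile(name="Pakistani Brain", volume_label="@BRAIN")
israeli = AttributeFile(name="Israeli", ascii_strings=[b"sUMsDOS"])
```

- A real attribute file would also carry the removal information
- discussed above, which this sketch leaves out.
-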
- When the attribute file has been created, it is supplied to
- a program that scans all of the appropriate areas of system storage
- looking for combinations of the attributes. As more attributes are
- discovered, the degree of assurance that the virus is present
- increases. For example, the character string "sUMsDOS" is common to
- all versions of the Israeli virus. It is conceivable, however, that
- the same string could appear randomly in any text file. Therefore,
- the identification program will look for verification attributes, such
- as the file offset where the character string was located, or a
- sequence of instructions following the data.
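-
- The idea of raising confidence as more attributes agree can be
- sketched in Python as follows. The function name match_score is
- hypothetical, and the expected offset and trailing bytes passed to it
- are placeholders that would in practice come from a disassembly.

```python
def match_score(image: bytes, marker: bytes, expected_offset: int,
                following: bytes) -> int:
    """Count how many independent attributes agree (0 to 3).

    Only the first occurrence of the marker is examined.
    """
    pos = image.find(marker)
    if pos == -1:
        return 0                  # primary string absent
    score = 1                     # the marker string is present
    if pos == expected_offset:
        score += 1                # found at the offset the analysis predicts
    end = pos + len(marker)
    if image[end:end + len(following)] == following:
        score += 1                # expected bytes follow the marker
    return score
```

- A text file that merely mentions the string would score 1, while a
- file matching the string, its offset and the bytes that follow would
- score 3; where to draw the line is a design choice for the product.
-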
- When the virus has been identified, the removal phase begins.
- Since the infection attributes of the virus are known, the removal
- process is fairly straightforward. Usually it involves locating the
- main body of the virus and all segments of the original program that
- had been re-located by the virus. The virus is erased, and the
- program is then re-constructed.
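-
- The reconstruction step can be illustrated with a toy model in
- Python. It assumes, purely for illustration, a virus that overwrites
- the first few bytes of a program, appends its body to the end, and
- keeps the original start-up bytes at a known offset inside that body.
- The names RemovalInfo and disinfect are hypothetical, and real viruses
- differ in every one of these details.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemovalInfo:
    """Assumed infection layout for a virus appended to a program."""
    body_length: int      # bytes the virus appended to the end of the file
    stub_length: int      # bytes overwritten at the start of the program
    saved_start_at: int   # offset, within the virus body, of the saved bytes


def disinfect(infected: bytes, info: RemovalInfo) -> bytes:
    """Rebuild the original program from an infected image (toy model)."""
    if len(infected) < info.body_length + info.stub_length:
        raise ValueError("file too short for this infection layout")
    body_start = len(infected) - info.body_length
    body = infected[body_start:]
    saved = body[info.saved_start_at:info.saved_start_at + info.stub_length]
    if len(saved) != info.stub_length:
        raise ValueError("saved start-up bytes are incomplete")
    # Put the original start-up bytes back and drop the appended body.
    return saved + infected[info.stub_length:body_start]
```

- The function returns a new image rather than patching in place, so
- the infected copy remains available if the result proves wrong.
-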
- Clearly, multiple attribute files can be used by a single
- program. Thus, single identification products are able to identify
- multiple strains of viruses (see Figure 3).
-
- Product Advantages
- Infection identification products have a major advantage over
- other types of virus protection products: they are able to determine
- whether or not a system is already infected. This is a serious
- concern in many organizations. Other classes of virus protection
- products must assume that a given system is uninfected at the time the
- products are installed. They log the state of the system at the time
- they are installed and periodically compare the current state to the
- original state. If a virus has infected the system in the interim,
- the change will be detected. If a virus has already infected the
- system before such products are installed, however, the virus will be
- logged as part of the original system, and no change will be detected.
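-
- The blind spot of change-detection products can be shown with a
- small Python sketch. It uses SHA-256 from the standard library as a
- modern stand-in for whatever checksum such a product might use; the
- function names and file contents are hypothetical.

```python
import hashlib


def snapshot(files: dict) -> dict:
    """Record a SHA-256 digest for each file (name -> bytes)."""
    return {name: hashlib.sha256(data).hexdigest()
            for name, data in files.items()}


def changed_files(baseline: dict, files: dict) -> list:
    """Names whose current digest differs from the recorded baseline."""
    current = snapshot(files)
    return sorted(name for name in current
                  if baseline.get(name) != current[name])


# A program already modified before installation is recorded as normal.
system = {"EDIT.COM": b"editor + unwanted code", "SORT.COM": b"sort"}
baseline = snapshot(system)
print(changed_files(baseline, system))        # prints []

system["SORT.COM"] = b"sort + unwanted code"  # a later change is caught
print(changed_files(baseline, system))        # prints ['SORT.COM']
```

- EDIT.COM never appears in either report, because its modified state
- is the only state the product has ever seen.
-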
- Infection identification products, on the other hand, are
- specifically designed to look for and identify pre-existing
- infections. This ability to identify an existing infection is in many
- cases crucial to the success of implementing antiviral measures.
- Since a virus may remain dormant for months or even years before it
- activates and damages the system, pre-existing infections could cause
- widespread destruction in spite of our best efforts at implementing
- protection programs.
- Automatic removal is the second advantage of identification
- products. Virus infections can sometimes involve hundreds or
- thousands of programs within an organization. When the virus is
- discovered, the task of tracking down and disinfecting all of the
- infected programs can become monumental. In many cases, multiple
- versions of a single program may be infected, or the original source
- diskettes may have been lost or misplaced. In some cases, infected
- programs may be overlooked or incorrectly replaced, so that re-
- infection becomes a problem. These and other issues invariably cause
- problems. The identification products, however, automatically find,
- identify and remove the infection, normally at a rate of a few seconds
- per infected program. The time savings alone can be enormous.
- A third advantage to identification programs is that they
- cannot be circumvented by a known virus. Other types of products that
- use active methods for infection prevention or detection can be
- specifically targeted by viruses. The virus can seek out and destroy
- or disable the active element of such products. For example, if the
- product is a filter type product that monitors all system I/O, the
- virus can steal the interrupts from the monitor and thus bypass the
- program's checking function. Likewise, if a protection program uses a
- checksum or other method to look for change within a program, the
- virus can modify the program's checksum routine so that the change
- caused by an infection will not be detected. These and other
- techniques have been used by many viruses to avoid interference by
- antiviral programs that use active detection methods.
- Identification products, on the other hand, cannot be so
- easily circumvented. Since these products use passive techniques, the
- virus has no control over the products' functions. Keep in mind that
- the virus and its resultant system modifications are merely a sequence
- of inert bits as far as the identification product is concerned. Also,
- the virus is not active at the time the product is being used (all
- such products come with their own boot diskettes, and they run
- stand-alone). Thus, the virus can in no way affect the product's
- operation, or even be aware of its presence.
-
- Problem areas
- There are some drawbacks to identification products, however.
- The first problem is that these products only work for known viruses,
- that is, viruses that have been around long enough to be noticed,
- isolated, sampled, disassembled and analyzed. This may take a
- considerable time if the virus is unobtrusive and slow to activate.
- When the virus has been discovered and analyzed, the identification
- product must be designed, implemented, packaged, marketed and
- distributed - a process that could take considerably more time. Thus,
- identification utilities will lag new virus developments by months, or
- in some cases, even years. This time lag implies that there will
- always be new viruses, and thus new dangers, against which no
- identification utility will be effective.
- The second problem with these products is more thorny, and
- requires a high level of product sophistication in order to resolve. At
- issue is a phenomenon that might be called the Uncertainty Factor, and
- it is caused by the increasing tendency of hackers to collect existing
- viruses, modify them and return them to the public domain. These
- modifications sometimes cause viruses to react differently from the ways
- in which they were originally designed, yet they may leave key
- identification attributes unchanged.
- For example, the Jerusalem virus was originally designed to slow
- down the infected machine's processor one-half hour after an infected
- program was executed. This slowdown was a nuisance to the user of the
- infected machine, but it severely limited the spread of the virus,
- because the virus made itself known early in the infection process and
- had limited time to replicate before being removed. In the summer of
- 1988, an unknown hacker modified the virus by changing just one
- instruction (see Figure 4). This modification disabled the routine that
- caused the system to slow down, and as a result, the virus became many
- times more infectious.
- Modifications like this, and other more substantial
- modifications, are made almost daily to existing viruses. The danger
- that these modifications pose to identification products is substantial.
- If an identification product is attempting to remove a virus that has
- infected a program differently than the way in which the product
- expects, then the results of the disinfection will be unpredictable.
- Damage to the system may result, the program may be destroyed or, in the
- worst case, the virus will still be active even though the product
- thinks it has removed it.
- In order to minimize the risks posed by this problem,
- identification products must be designed to cross reference as many
- virus attributes as possible prior to attempting removal. If any one of
- the expected attributes has been changed, or is missing, the product
- should notify the user of the potential problem and defer to manual
- intervention rather than attempt automatic removal.
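-
- A cross-reference check of this kind can be sketched in Python as
- follows. The function names and the attribute table are hypothetical;
- the byte values used in any real table would come from analysis of the
- virus itself.

```python
def missing_attributes(image: bytes, expected: dict) -> list:
    """Return the names of expected attributes that do not match the image.

    `expected` maps an attribute name to (offset, pattern); an offset of
    None means the pattern may appear anywhere in the image.
    """
    missing = []
    for name, (offset, pattern) in expected.items():
        if offset is None:
            found = pattern in image
        else:
            found = image[offset:offset + len(pattern)] == pattern
        if not found:
            missing.append(name)
    return missing


def safe_to_remove(image: bytes, expected: dict) -> bool:
    """Allow automatic removal only when every expected attribute matches."""
    return not missing_attributes(image, expected)
```

- When safe_to_remove returns False, the list from missing_attributes
- gives the user a starting point for manual inspection.
-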
-
- Future Prospects
- Identification products clearly must play a major role in the
- battle against computer viruses. As viruses become more widespread and
- as infections become more common, the need for utilities able to
- identify and help remove viruses will become apparent. It is probable
- that these products will become the dominant form of virus protection in
- the future. A few technical advances, however, would greatly aid their
- general acceptance.
- One of the problems facing identification products is the time
- required to fully scan attached storage devices when searching for a
- virus. For example, as many as ten or more minutes can be required to
- fully scan a 40 megabyte drive while looking for just one virus.
- Multiple virus checks require more time. Because of this, it is
- impractical to perform frequent scans of the system. This is
- unfortunate because it would be advantageous to perform a complete
- identification check of a system each time the system was booted. This
- would provide a high degree of system security, assuming that the
- identification product was kept up to date. More sophisticated
- algorithms for searching attached storage and creative techniques for
- multiple virus scans could alleviate the scan time problem.
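-
- One simple way to check for several viruses in a single pass is to
- index the signatures by their first byte, so that each position in
- storage is compared only against the signatures that could begin
- there. The Python sketch below illustrates the idea; the function
- names are hypothetical and the signatures are placeholder strings.

```python
from collections import defaultdict


def build_index(signatures: dict) -> dict:
    """Group signatures by first byte: {first_byte: [(name, pattern), ...]}."""
    index = defaultdict(list)
    for name, pattern in signatures.items():
        if pattern:               # ignore empty patterns
            index[pattern[0]].append((name, pattern))
    return index


def scan_once(image: bytes, signatures: dict) -> dict:
    """Walk the image once and return {name: [offsets]} for every hit."""
    index = build_index(signatures)
    hits = defaultdict(list)
    for pos, byte in enumerate(image):
        for name, pattern in index.get(byte, ()):
            if image.startswith(pattern, pos):
                hits[name].append(pos)
    return dict(hits)
```

- For large signature sets, an automaton-based matcher such as
- Aho-Corasick would be the usual next step, since its scan time does
- not grow with the number of signatures sharing a first byte.
-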
- A second desirable advance in the technology of these products
- would be the development of techniques that could identify variations of
- known viruses and still provide the capability to remove the modified
- virus. This advance would remove a major limitation of the current
- products and would greatly increase their reliability. Techniques for
- removing variations have already been developed for a few root viruses,
- but there currently exists no generic technique that is effective for a
- large class of viruses. I anticipate that this hurdle will be overcome
- within a year or two.
- A final enhancement would be the ability to fully or partially
- restructure data that has been corrupted by a virus after it has
- activated. Currently, infection identification products are only useful
- if they are used before a virus begins its destructive phase. When the
- destructive phase begins, the virus may destroy critical control tables,
- data files, programs or even itself. At this point all current virus
- products have limited usefulness.
- It is possible in some cases, however, to reverse much of the
- destruction caused by a virus, provided that 1) we know the details of
- the destruction process, and 2) the destructive phase has not gone on
- too long. For example, one of the common PC viruses scrambles the File
- Allocation Table by reversing a number of the entries. Since we know
- the exact way in which the virus scrambles the information, we can
- easily unscramble it. However, after a few days of data scrambling, the
- virus initiates a low level format of the hard disk. At this point, no
- recovery is possible.
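-
- The reason a known scrambling step can be undone is that it is a
- fixed, reversible transformation. The Python sketch below uses an
- assumed model, in which a run of table entries has had its order
- reversed; the paper does not give the actual details, so the model,
- the function name reverse_run and the sample table are illustrative
- only.

```python
def reverse_run(entries: list, start: int, count: int) -> list:
    """Return a copy of `entries` with entries[start:start+count] reversed.

    Reversing the same run twice restores the original order, so the
    function that models the damage also models the repair.
    """
    if start < 0 or count < 0 or start + count > len(entries):
        raise ValueError("run lies outside the table")
    fixed = list(entries)
    fixed[start:start + count] = reversed(entries[start:start + count])
    return fixed


table = [0xFF0, 0xFFF, 3, 4, 5, 0xFFF, 7, 0xFFF]   # made-up sample entries
scrambled = reverse_run(table, 2, 4)
restored = reverse_run(scrambled, 2, 4)
```

- Recovery of this kind depends on knowing exactly which entries were
- touched, which is why the details of the destruction process matter.
-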
- I anticipate that future products will incorporate recovery
- capabilities for a large number of virus destructive acts. This
- capability, and others described above, should provide the best virus
- protection that we can hope to achieve.
-