NetNews Usenet Archive 1993 #1

Internet Message Format | 1993-01-07 | 5.5 KB

Path: sparky!uunet!think.com!yale.edu!jvnc.net!netnews.upenn.edu!netnews.cc.lehigh.edu!news
From: bontchev@fbihh.informatik.uni-hamburg.de (Vesselin Bontchev)
Newsgroups: comp.virus
Subject: How to measure polymorphism?
Message-ID: <0022.9301071651.AA16031@barnabas.cert.org>
Date: 6 Jan 93 21:21:12 GMT
Sender: virus-l@lehigh.edu
Lines: 120
Approved: news@netnews.cc.lehigh.edu

Hello, everybody!

There are already two polymorphic engines available (MtE and TPE) and we are going to see more and more polymorphic viruses in the future. An interesting question arises - how to determine how polymorphic a virus is? How to determine which of two viruses is "more polymorphic"? In other words - how to measure polymorphism in an objective way?

There are several ways one could think about:

1) Determining the total number of different instances that a particular virus can take. Obviously, a virus that can take 6 different appearances (V-Sign) is less polymorphic than a virus that can take 65,536 (Cascade).

Unfortunately, this does not work. First, for some viruses it is almost impossible (well, at least too difficult) to count the total number of appearances. Nobody has yet succeeded in determining the exact number of different decryptors that MtE can generate. Second, a variably encrypted virus with the following decryptor

        mov     si,code_end         ; SI -> end of the encrypted code
        mov     cx,code_length      ; CX = loop count
        mov     ax,key              ; AX = decryption key
decode:
        xor     word ptr [si],ax    ; decrypt one word
        jmp     short skip          ; jump over the garbage byte
garbage db      ?                   ; filler, can hold any value
skip:
        dec     si                  ; step back one word
        dec     si
        loop    decode

where the byte 'garbage' can take any value is not very polymorphic, even though the number of possible decryptors is 256. If we change

garbage db      ?

to

garbage dw      ?

this will mean that the number of possible decryptors is 65,536, but the virus is not more polymorphic...

2) The second idea is to divide polymorphism into classes.
Class 0 means no polymorphism, class 1 means variable encryption with a constant decryptor, class 2 means variable encryption with a decryptor that can be detected by a wildcard scan string, and class 3 is a virus that cannot be detected even with a wildcard string. E.g., Vienna (class 0) is less polymorphic than Cascade (class 1), which is less polymorphic than Suomi (class 2), which is less polymorphic than V2P2 (class 3).

Unfortunately, this is not good enough. First, what to do with viruses that use a limited set of decryptors, one of which is selected randomly (Whale)? Such viruses are obviously more polymorphic than Cascade. But are they more or less polymorphic than Suomi? They can be detected by a set of non-wildcard strings...

Second, what about Bad Boy? It consists of 9 segments of code, 8 of which can appear in any order. This gives 8! = 40,320 variants. But the virus is not even encrypted, so it can be detected with a simple (non-wildcard) scan string...

Third, what is a "wildcard string"? A string allowing "don't care" bytes? A string allowing "don't know how many don't care bytes"? A string allowing "don't care" bits? For instance, Maltese Amoeba cannot be detected by the user-defined scan strings of SCAN or F-Prot, but can be detected with the wildcard scan strings of HTScan/TbScan... So, is Maltese Amoeba a class 2 or a class 3 polymorph?

Fourth, what about V2P2 and V2P6Z? Neither of them can be detected by a scan string (even a wildcard one), but most experts agree that V2P6Z is "more polymorphic" than V2P2 - because the decryptor can be of different lengths, the virus too, and many other things...

3) The third idea was proposed by Dr. Solomon. He proposes to use the length of the longest constant byte sequence that is present in the decryptor. In addition, one could use the sum of the lengths of all constant byte sequences in the decryptor - if there are several short ones that can appear in any order (as in V2P2). This is already better, but still not good enough.
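
To make this measure concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not code from any scanner: the function name longest_constant_run and the sample byte strings are made up, and each decryptor is simply treated as a string of bytes. The function reports the length of the longest byte sequence that occurs in every sample; it does not attempt the "sum of all constant sequences" variant.

```python
def longest_constant_run(samples):
    """Length of the longest byte string that occurs in every sample.

    Brute force: try every substring of the shortest sample, longest
    first, and return as soon as one is found in all other samples.
    Fine for decryptors of a few dozen bytes, too slow for large inputs.
    """
    if not samples:
        return 0
    shortest = min(samples, key=len)
    others = [s for s in samples if s is not shortest]
    for length in range(len(shortest), 0, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            if all(candidate in other for other in others):
                return length
    return 0


# Hypothetical samples: placeholder bytes, not real machine code.
# Three constant bytes, one varying "garbage" byte, two constant bytes.
samples = [b"\x10\x11\x12" + bytes([g]) + b"\x20\x21" for g in (0x00, 0x5A, 0xFF)]
print(longest_constant_run(samples))
```

By this measure a lower number means a more polymorphic decryptor, and two samples with nothing in common score 0.
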
According to this criterion, the MtE-based viruses have polymorphism 1 - because all decryptors contain only a single constant byte (the JNZ instruction), which does not appear at a constant place. (Obviously, by this criterion, the higher the number, the less polymorphic the virus is.)

Unfortunately, as I said, this is still not good enough. There are other viruses with only a single constant byte in the decryptor, which are much less polymorphic than the MtE-based ones. On the other hand, the TPE-based viruses can have 0 constant bytes in the decryptor, but this does not imply that they are more polymorphic than the MtE-based ones... In fact, if one looks at them very carefully, it is possible to see that the different decryptors are pretty similar, if you remove the garbage ("do-nothing") instructions...

It seems that what each anti-virus researcher considers the "level of polymorphism" depends somehow on the technique he uses in his anti-virus product to detect polymorphic viruses. For instance, V2P6Z might look "more difficult" (i.e., more polymorphic) than the MtE-based viruses to those people whose scanners depend on the number of addressing modes used by the instruction that performs the actual decoding...

This article is not meant to provide a solution to the problem. I am just trying to explain the problem and am asking for solutions. Any ideas are welcome - we really need an objective way to measure the level of polymorphism...

Regards,
Vesselin
--
Vesselin Vladimirov Bontchev          Virus Test Center, University of Hamburg
Tel.:+49-40-54715-224, Fax: +49-40-54715-226      Fachbereich Informatik - AGN
< PGP 2.1 public key available on request. > Vogt-Koelln-Strasse 30, rm. 107 C
e-mail: bontchev@fbihh.informatik.uni-hamburg.de    D-2000 Hamburg 54, Germany