home *** CD-ROM | disk | FTP | other *** search
Text File | 1991-05-21 | 70.5 KB | 1,466 lines |
- John A. Thomas
- CompuServe: 75236,3536
- 101 N.W. Eighth St.
- Grand Prairie, TX 75050
-
-
- SURVEY OF DATA ENCRYPTION
- By John A. Thomas
-
-
- Introduction
- ------------
-
- The following article is a survey of data encryption.
- It is intended to provoke discussion among the members of this
- forum and perhaps lead to a creative exchange of ideas.
- Although the basics of the subject seem to be known to few
- programmers, it embraces many interesting and challenging
- programming problems, ranging from the optimization of machine
- code for maximum throughput to the integration of encryption
- routines into editors, communications packages, and perhaps
- products as yet not invented. Governments have dominated this
- technology up until the last few years, but now the need for
- privacy and secrecy in the affairs of a computer-using public
- has made it essential that programmers understand and apply
- the fundamentals of data encryption.
-
- Some Cryptographic Basics
- -------------------------
-
- A few definitions are appropriate first. We use the
- term "encryption" to refer to the general process of making
- plain information secret and making secret information plain.
- To "encipher" a file is to transform the information in the file
- so that it is no longer directly intelligible. The file is then
- said to be in "ciphertext". To "decipher" a file is to
- transform it so that it is directly intelligible; that is, to
- recover the "plaintext."
-
- The two general devices of encryption are "ciphers" and
- "codes". A cipher works on the individual letters of an
- alphabet, while a code operates on some higher semantic level,
- such as whole words or phrases. Cipher systems may work by
- transposition (shuffling the characters in a message into some
- new order), or by substitution (exchanging each character in the
- message for a different character according to some rule), or a
- combination of both. In modern usage, transposition is often
- called "permutation". A cipher which employs both transposition
- and substitution is called a "product" cipher. In general,
- product ciphers are stronger than those using transposition or
- substitution alone. Shannon referred to substitution as
- "confusion", because the output is a non-linear function of the
- input, thus creating confusion as to the set of input
- characters. He referred to transposition as "diffusion" because
- it spreads the dependence of the output from a small number of
- input positions to a larger number.
-
- Every encryption system has two essential parts: an algorithm
- for enciphering and deciphering, and a "key", which consists of
- information to be combined with the plaintext according to the
- dictates of the algorithm. In any modern encryption system, the
- algorithm is assumed to be known to an opponent, and the security of
- the system rests entirely in the secrecy of the key.
-
- Our goal is to translate the language of the plaintext to a
- new "language" which cannot convey meaning without the additional
- information in the key. Those familiar with the concept of
- "entropy" in physics may be surprised to learn that it is also
- useful in information theory and cryptography. Entropy is a
- measure of the amount of disorder in a physical system, or the
- relative absence of information in a communication system. A
- natural language such as English has a low entropy because of
- its redundancies and statistical regularities. Even if many of
- the characters in a sentence are missing or garbled, we can
- usually make a good guess as to its meaning. Conversely, we
- want the language of our ciphertext to have as high an entropy
- as possible; ideally, it should be utterly random. Our guiding
- principle is that we must increase the uncertainty of the
- cryptanalyst as much as possible. His uncertainty should be so
- great that he cannot make any meaningful statement about the
- plaintext after examining the ciphertext; also, he must be just
- as uncertain about the key, even if he has the plaintext itself
- and the corresponding ciphertext (In practice, it is impossible
- to keep all plaintext out of his hands).
-
- A prime consideration in the security of an encryption
- system is the length of the key. If a short key (i.e., short
- compared with the length of the plaintext) is used, then the
- statistical properties of the language will begin to "show
- through" in the ciphertext as the key is used over and over, and
- a cryptanalyst will be able to derive the key if he has enough
- ciphertext to work with. On the other hand, we want a
- relatively short key, so that it can be easily stored or even
- remembered by a human. The government or a large corporation
- may have the means to generate and store long binary keys, but
- we cannot assume that the personal computer user will be able to
- do so.
-
- The other important fact about the keys is that there must be
- very many of them. If our system allows only 10,000 different keys,
- for example, it is not secure, because our opponent could try every
- possible key in a reasonable amount of time. This introduces
- the concept of the "work factor" required to break an encryption
- system. We may not have a system unbreakable in principle, but
- if we can make the work factor for breaking so high it is not
- practical for our opponent to do so, then it is irrelevant that the
- system may be less strong than the ideal. What constitutes an
- adequate work factor depends essentially on the number of
- uncertainties the cryptanalyst must resolve before he can derive
- plaintext or a key. In these days of constantly improving computers,
- that number should probably exceed 2**128. It is easy to quantify the
- work factor if we are talking about exhaustive key trial, but few
- modern ciphers are likely to be broken by key trial, since it is too
- easy to make the key space very large. Most likely they will be
- broken because of internal periodicities and subtle dependency of
- output on input which give the cryptanalyst enough information to
- reduce his uncertainty by orders of magnitude.
-
- A corollary to work factor is the rule that a system need
- only be strong enough to protect the information for however
- long it has value. If a system can be broken in a week, but
- not sooner, then it may be good enough, if the information has
- no value to an opponent after a week.
-
-
- Cryptanalysis
- -------------
-
- Cryptanalysis is the science of deriving plaintext
- without the key information. Anyone intending to design an
- encryption system must acquaint himself to some degree with
- cryptanalytic methods. The methods of attack may range from
- sophisticated statistical analysis of ciphertext to breaking
- into the opponent's office and stealing his keys ("practical
- cryptanalysis"). There are no rules of fair play. The
- cryptanalyist is free to use his puzzle-solving ingenuity
- to the utmost, even to the point of applying the knowledge that
- your dog's name is "Pascal", and that you might be lazy enough
- to use that as your key for the day.
-
- The cryptanalyst may have only ciphertext to work with,
- or he may have both ciphertext and the corresponding plaintext,
- or he may be able to obtain the encipherment of chosen
- plaintext. Some cryptographic systems are fairly strong if the
- analyst is limited to ciphertext, but fail completely if he has
- corresponding plaintext. Your system should be strong enough to
- resist attack even if your opponent has both plaintext and
- ciphertext.
-
- Computer power can greatly aid cryptanalysis, but
- many systems that appear strong can be broken with pencil-and-
- paper methods. For example, the Vigenere family of
- polyalphabetic ciphers was generally believed to be unbreakable
- up until the late nineteenth century. A polyalphabetic cipher is
- a substitution cipher in which a different alphabet is used for
- each character of plaintext. In these systems, the key
- determines the order of the substitution alphabets, and the
- cycle repeats with a period equal to the length of the key. This
- periodicity is a fatal weakness, since fairly often a repeated
- letter or word of plaintext will be enciphered with the same key
- letters, giving identical blocks of ciphertext. This exposes
- the length of the key. Once we have the length of the key, we
- use the known letter frequencies of the language to gradually
- build and test hypotheses about the key. Vigenere ciphers can
- be easily implemented on computers, but they are worthless
- today. A designer without knowledge of cryptanalysis however,
- might be just as ignorant of this fact as his colleagues of
- the last century. Please see the references at the end of
- this article for information on cryptanalytic technique.
-
- A Survey of Cryptographic systems
- ---------------------------------
-
- We now review some representative encryption schemes,
- starting with traditional ones and proceeding to the systems
- which are only feasible to implement on computers.
-
- The infinite-key cipher, also known as the "one time
- pad," is simple in concept. We first generate a key which
- is random and at least the same length as our message. Then,
- for each character of plaintext, we add the corresponding
- character of the key, to give the ciphertext. By "addition," we
- mean some reversible operation; the usual choice is the
- exclusive-or. A little reflection will show that given a random
- key at least the size of the plaintext (i.e., "infinite" with
- respect to the plaintext because it is never repeated), then the
- resulting cipher is unbreakable, even in principle. This scheme
- is in use today for the most secret government communications,
- but it presents a serious practical problem with its requirement
- for a long random key for each message and the need to somehow
- send the lengthy key to the recipient. Thus the ideal infinite
- key system is not practical for large volumes of message
- traffic. It is certainly not practical for file encryption on
- computers, since where would the key be stored? Be wary of
- schemes which use software random-number generators to supply
- the "infinite" key. Typical random-number algorithms use the
- preceeding random number to generate the succeeding number, and
- can thus be solved if only one number in the sequence is found.
-
- Some ciphers have been built to approximate the
- infinite-key system by expanding a short key. The Vernam system
- for telegraph transmission used long paper tapes containing
- random binary digits (Baudot code, actually) which were
- exclusively-or'ed with the message digits. To achieve a long
- key stream, Vernam and others used two or more key tapes of
- relatively prime lengths, giving a composite key equal to their
- product. The system is still not ideal, since eventually the
- key stream will repeat, allowing the analyst to derive the
- length and composition of the keys, given enough ciphertext.
- There are other ways to approach the infinite-key ideal, some of
- which are suggested in the author's article (with Joan
- Thersites) in the August '84 issue of DDJ.
-
- The "rotor" systems take their name from the
- electromechanical devices of World War II, the best known being
- perhaps the German ENIGMA. The rotors are wheels with
- characters inscribed on their edges, and with electrical
- contacts corresponding to the letters on both sides. A
- plaintext letter enters on one side of the rotor and is mapped
- to a different letter on the other side before passing to
- the next rotor, and so on. All of the rotors (and there may be
- few or many) are then stepped, so that the next substitution is
- different. The key is the arrangement and initial setting of
- the rotor disks. These devices are easy to implement in software
- and are fairly strong. They can be broken however; the British
- solution of the ENIGMA is an interesting story outside the
- scope of this note. If you implement a rotor system, consider
- having it operate on bits or nybbles instead of bytes, consider
- adding permutation stages, and consider how you are going to
- generate the rotor tables, since you must assume these will
- become known to an opponent.
-
- In 1977 the National Bureau of Standards promulgated the
- Data Encryption Standard (DES) as the encryption system to be
- used by all federal agencies (except for those enciphering data
- classified under any of the national security acts). The
- standard is available in a government publication and also in a
- number of books. The DES was intended to be implemented only in
- hardware, probably because its designers did not want users to
- make changes to its internal tables. However, DES has been
- implemented in software and is available in several
- microcomputer products (such as Borland's Superkey or IBM's
- Data Encoder).
-
- The DES is a product cipher using 16 stages of
- permutation and substitution on blocks of 64 bits each. The
- permutation tables are fixed, and the substitutions are
- determined by bits from a 56-bit key and the message block.
- This short key has caused some experts to question the security
- of DES. Controversy also exists regarding the involvement of the
- NSA in parts of the DES design. The issues are interesting, but
- beyond the scope of this note.
-
- Since DES was intended for hardware implementation, it
- is relatively slow in software. Software implementations of DES
- are challenging because of the bit-manipulation required in the
- key scheduling and permutation routines of the algorithm. Some
- implementations gain speed at the expense of code size by using
- large pre-computed tables.
-
- The public key cipher is an interesting new development
- which shows potential for making other encryption systems
- obsolete. It takes its name from the fact that the key
- information is divided into two parts, one of which can be made
- public. A person with the public key can encipher messages, but
- only one with the private key can decipher them. All of the
- public key systems rely on the existence of certain functions
- for which the inverse is very difficult to compute without the
- information in the private key. These schemes do not appear to
- be practical for microcomputers if their strength is fully
- exploited, at least for eight-bit machines. One variety of
- public key system (the "knap-sack") has been broken by solution
- of its enciphering function, but this is no reflection on other
- systems, such as the RSA scheme, which use different enciphering
- functions. All public-key systems proposed to date require
- heavy computation, such as the exponentiation and division of
- very large numbers (200 decimal digits for the RSA scheme). On
- the other hand, a public-key system that worked at only 10
- bytes/sec might be useful if all we are sending are the keys for
- some other system, such as the DES.
-
-
- Some random thoughts
- --------------------
-
- To wrap up this too-lengthy exposition, I append a few
- questions for the readers:
-
- Must we operate on blocks instead of bytes? Block
- ciphers seem stronger, since they allow for permutation. On the
- other hand, they make life difficult when file size is not
- an integral multiple of the block size.
-
- Can we make a file encryption system OS-independent?
- This is related to the question above on blocks vs bits. How do
- we define the end-of-file if the plaintext is ascii and the
- ciphertext can be any 8-bit value?
-
- Can we find an efficient way to generate and store a
- random key for the infinite-key system? Hardware random-number
- generators are not hard to build, but would they be of any use?
-
- Bit-fiddling is expensive. Can it be avoided and still
- leave a secure system? What are the relative costs of
- manipulating bits on the Z80 vs the 68000, for example?
-
- No file-encryption system can erase a file logically and
- be considered secure. The information can be recovered until it
- is overwritten. Overwriting files adds to processing time. I
- am informed that it is possible to reliably extract information
- even from sectors that HAVE been overwritten. Is this so?
- If it is, what is the solution?
-
- How do we integrate encryption systems into different
- tools? Should a telecommunications program transparently
- encrypt data if the correspondent is compatible? What about an
- editor-encryption system wherein plaintext would never exist on
- the disk, only on the screen? How would we manage to
- encipher/decipher text as we scroll through it and make changes,
- and still get acceptable performance?
-
- By their nature, encryption schemes are difficult to
- test. In practice, we can only have confidence that a system is
- strong after it has been subjected to repeated attack and
- remained unbroken. What test might we subject a system to that
- would increase our confidence in it?
-
-
- References
- ----------
-
- Here are a few useful books and articles. This is by no means
- a complete bibliography of the subject:
-
- Kahn, David. "The Code Breakers". The basic reference for the
- history of cryptography and cryptanalysis. Use it to
- learn where others have gone wrong.
-
- Konheim, Alan G. "Cryptography, A Primer". Survey of
- cryptographic systems from a mathematical perspective.
- Discusses rotor systems and the DES in great detail.
-
- Sinkov, Abraham. "Elementary Cryptanalysis". Very basic, but
- very useful, introduction to the mathematical concepts
- of cryptanalysis.
-
- Foster, Caxton C. "Cryptanalysis for Microcomputers". Covers
- the cryptanalysis of simple systems, but still a good
- introduction to cryptanalytic technique. Describes the
- operation of many traditional systems in detail.
-
- Shannon, Claude. "Communication Theory of Secrecy Systems".
- Bell System Technical Journal (October 1949) : 656-715.
- Discusses secrecy systems from viewpoint of information
- theory. No practical tips, but useful orientation.
-
- Rivest, R. et al. "A method for Obtaining Digital Signatures and
- Public Key Cryptosystems". Comm. of the ACM, Vol. 21,
- No. 2, (February 1978) : 120-126. This article
- describes what has come to be known as the RSA
- public-key system.
-
- "Data Encryption Standard", Federal Information Processing
- Standard (FIPS), Publication No. 46, National Bureau of
- Standards, U.S. Dept. of Commerce, January, 1977.
-
- ---
-
- To start off, I'll discuss some *good* points of the Data Encryption
- Standard promulgated by the federal government (DES). We all know the key is
- too small - the NSA almost certainly has the means to exhaust it. But it
- would be a pity not to examine the DES just for that reason. It uses a
- brilliant cryptographic technique that once understood can be used by us in
- families of other crypto-systems, with much larger keys.
-
- The DES shows us how to use one-way functions in cryptography. At first
- sight, a one-way function would seem to be useless - if we encrypt 'A' and get
- 'R', we have to be able to decrypt 'R' and get back 'A'. However, a one-way
- function, if it could be used, allows very complex transformations of text
- that are practically impossible to undo without knowledge of the key.
- However, the DES is as complicated as it is complex, so for a beginning I'm
- going to explain how to use a one-way function cryptographically without
- reference to the DES. If there's enough interest, we can later relate this to
- the DES.
-
- Perhaps the simplest way to define a one-way function is with a table, such
- as:
-
- 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
- -----------------------------------------------
- 14 05 01 14 10 04 08 00 03 02 15 08 09 11 07 15
-
- The top numbers are indexes into the one-way table. Given an index, you can
- get a value by table look up, but given a table value there's no guarantee
- you'll get the index you started with because both 0 and 3 map to 14, 6 and 11
- map to 8, and so on. BTW, the table values were generated by a random
- process.
-
- Now, let's use this cryptographically. Take an ASCII letter, say 'a' and
- split it into two nibbles, left and right
-
- LEFT RIGHT
- 61h -> 6 1
-
- The DES trick is to use one half as an argument of a one-way function to
- obtain a value with which to encrypt the other half, so let's do it. Using
- RIGHT as the index into the table, we obtain the value 5. Now, we need a
- primitive encryption function that is, and must be, invertible. The DES uses
- XOR, but we will use ADD, and add 5 to LEFT, mod 16, obtaining 11. We have
-
- LEFT RIGHT
- 11 1
-
- This encrypts LEFT. Notice that even though we used a one-way function, the
- encryption is completely reversible for two reasons. First, RIGHT, the
- argument of the table lookup is unchanged, so getting the correct value from
- the table for decryption is assured. Second, the encryption primitive is
- invertible; we need only to subtract the table value mod 16 from encrypted
- left.
-
- But there seems to be a problem. RIGHT isn't encrypted, and if that's how we
- left matters, we wouldn't have much of a cipher. The answer suggests itself -
- use the new value of LEFT in the same way to encrypt RIGHT. Let's do it.
- Using 11 as an index into the table gives us 8, which we add to RIGHT giving
-
- LEFT RIGHT
- 11 9
-
- which on the IBM PC is a funny graphics character. Now, is this reversible?
- Of course it is, the argument into the table that encrypted RIGHT is
- completely unchanged, and if used we get 8 which we subtract from 9 giving 1.
- And we have already shown that 11 1 will undo properly to give us 6 1.
-
- So far, we have nothing but a curious simple substitution cipher. Repeating
- the process isn't encouraging either. It clearly must cycle if we continue
- to alternately encrypt LEFT and RIGHT, using the values from the previous
- encryption as input. In fact, there are a number of cycles of different
- lengths, but it's interesting that some cycles don't cycle back to the
- starting value. For example, starting with 01, we get 01, 55, 99, BB, 33,
- EE, 55... The reason is that our table is not a permutation of the integers
- 0 through 15.
-
- To make our sample cipher more than a curiosity, we shall do what the DES
- does, use another one-way function (that is, a table) in a second *round*, and
- repeat this process with the new table.
-
- 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
- -----------------------------------------------
- 04 13 12 06 13 07 03 15 13 15 06 09 09 09 07 10 07
-
- LEFT RIGHT
- 11 9 -> 15 + 11 = 6
- 11 = 9 + 3 <- 6 9
- 6 11
-
- This is already a considerably more complex cipher. It is still reversible,
- but we must remember to *decrypt* it starting with the last table, not the
- first. This cipher, like the DES, is sensitive to direction, encrypt or
- decrypt.
-
- It still is a simple substitution cipher with no strength. You will notice
- however, that the length of the second table is one more than that of the
- first. Obviously, I intend to rotate both tables one position before
- encrypting the next letter of a message. Since the lengths are relatively
- prime, I can encipher 16x17=272 letters before my tables repeat. This is no
- longer a simple substitution cipher, but one far more complex. Still not good
- enough by far. But if I had only one message not too long to protect, this
- would be practically unbreakable. It might be good for hundreds of letters
- before repetitions would enable cryptanalysis.
-
- Of course, it isn't strong enough to protect against an attack based on known
- plaintext. With enough known plaintext, both tables can be reconstructed. If
- the DES used only two rounds, it too could be cryptanalyzed. It can be broken
- because with only two rounds, there aren't enough addends for known input and
- output that the addends can't be exhausted. Several more rounds are necessary
- to make exhaustion of possible addends impractical. It is quite important in
- designing a crypto-system that you design it against *known* plaintext. In
- practice, it is impossible to prevent known plaintext from falling into the
- cryptanalyst's hands.
-
- Later, I will develop this sample into a very tough cipher indeed, though I
- won't make any claims for its ultimate strength (I haven't examined it that
- well yet). But for now, let's sum up what we have done.
-
- 1. We used a one-way function for a crypto-system. To give credit where it is
- due, this technique was invented by Horst Feistel of IBM, originally for the
- Lucifer. It is in my opinion an absolutely brilliant cryptographic
- innovation. The technique is fundamental to the DES.
-
- 2. We cascaded one-way functions for complexity. The DES uses cascading
- similarly for strength.
-
- 3. Unlike the DES, we have developed the germ of a byte cipher, rather than a
- block cipher. Very great complexity may be had with a block cipher, but there
- is still doubt that block ciphers are 'better'.
-
- It is the trick of alternately enciphering 'left' and 'right' that permits the
- use of a one-way function. In practice, it is easier to swap or exchange
- right and left after each round. That is, we use RIGHT to encrypt LEFT, then
- exchange RIGHT and LEFT, and repeat encryption for the next round. This
- simplifies code and circuitry, though it may confuse us as we try to follow
- what happens in the DES. Therefore, to avoid confusion, we will always keep
- RIGHT right, and LEFT left.
-
- ---
-
- To begin, we shall consider the DES encipherment a little abstractly, then
- later in more detail.
-
- The heart of the DES one-way function consists of the eight S-boxes, each
- of which transforms six bits of the input to four bits of output. We
- can understand the function of the S-boxes better if we first consider
- the transformation they effect as they act in concert upon the entire
- input block of 48 bits. Imagine one table consisting of 2^48 numbers of 32
- bits each. Each 32 bit number is repeated 65,536 times, in a more or less
- random order. Also imagine a 'letter', not of one byte, but of 64 bits divided
- into a LEFT of 32 bits and a RIGHT of 32 bits. RIGHT is combined with 48 key
- bits selected out of 56 in a way that will be described later, without,
- however, changing RIGHT, and this combined value is used as an argument into
- the huge table to return a pseudo-random 32 bit value. This returned value is
- XORed with LEFT to encrypt LEFT. The same table lookup is repeated in the next
- round using LEFT as the argument and encrypting RIGHT. Except that a different
- arrangement of 48 key bits out of the 56 is used. The DES repeats this process
- 16 times, that is, encrypts RIGHT eight times, and LEFT eight times. There is
- a reason for eight that we will discuss later. Each iteration is called a
- 'round'.
-
- It is clear that this huge table is a one-way function, and that the
- encryption technique is almost exactly what we described for the byte cipher
- discussed in the previous upload. There is an important difference - we are
- now using a key in addition to a table. In our byte cipher, the key is the
- table itself. Also, the DES uses a different permutation of the original key
- for every round in a clever way that ensures that every bit of final
- ciphertext depends complexly on every bit of the key. This is important.
- A sure-fire cryptanalytic technique is to suppose a few key bits - not
- too many possibilities to exhaust - then to test the supposition against
- sample ciphertext compared with known plaintext. In this way, although a key
- space may be too large to exhaust by brute force, the key is gradually
- recovered. A good example is the simple substitution where the key space is
- 26! (about 2^88). But, by forcing all ciphertext bits to depend on all key
- bits, this sure-fire attack is inhibited if not made impossible. You can't
- solve for a few key bits, you have to solve for all or none.
-
- Why a key? The chief reason is that the one-way table is published. It is no
- secret as we suppose our byte cipher's tables are. And, to be a standard,
- we aren't supposed to change the DES tables. Further, the tables are not
- random; the inventors state that the DES's particular tables have been worked
- out for cryptographic strength, and that to their own surprise discovered that
- random tables, which they had naively believed to be sufficient, aren't so
- good. And, nobody will say what criterion should be used to design DES
- tables, and nobody has figured them out (at least, not publicly). Naturally,
- this has bred suspicion, and the fact that the NSA helped design the tables
- hasn't helped. Later, I would like to offer my own speculation on what this
- criterion might be. To summarize, the key serves as the secret ingredient
- because the tables can't be secret.
-
- Obviously, a table of this size is a practical impossibility. So, how does
- the DES achieve a 'virtual' table of this size? Basically, nibble by nibble.
- It uses the nibbles of RIGHT, in order, as indexes into relatively small
- matrixes to get eight encryption values per round. But the encryption values
- don't encrypt the corresponding nibbles of LEFT. Oh no, that would be fatally
- simple. Each result nibble of the one-way lookup encrypts scattered bits of
- LEFT so that each bit of the value impacts the most nibbles possible in LEFT.
- Now, when LEFT becomes the argument, the nibble values returned from the table
- have dependencies on many bits of RIGHT. Within five rounds, all ciphertext
- bits become complex dependents of all plaintext bits and all key bits. The
- dependency is extremely violent. Believe me, a single bit of difference in
- either plaintext or key gives you an unrecognizably different ciphertext.
-
- We should be ready now to see how a nibble of RIGHT (actually, three will be
- involved), and some key bits, are used as table arguments, and how this
- encrypts four bits of LEFT in a single round. Let's ignore how the different
- permutations of the key bits are generated for now. Just imagine there are
- somehow six key bits. The first nibble one-way function is this matrix:
-
- Box S1
- 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
- |-----------------------------------------------
- 0 |14 4 13 1 2 15 11 8 3 10 6 12 5 9 0 7
- 1 | 0 15 7 4 14 2 13 1 10 6 12 11 9 5 3 8
- 2 | 4 1 14 8 13 6 2 11 15 12 9 7 3 10 5 0
- 3 |15 12 8 2 4 9 1 7 5 11 3 14 10 0 6 13
-
- Recall that the input block of 32 bits is first expanded to 48 bits by
- the "selection operation". If you consult this table in the standard,
- you will see that the first six bits of the result are the first RIGHT
- nibble, plus the last bit of RIGHT, plus the first bit of the next nibble
- to form six bits:
-
- 32 1 2 3 4 5
-
- This is one argument into the one-way function. Also, six bits of the key are
- an argument, and the six key bits and these RIGHT bits are XORed to form the
- actual table argument. The first and last bits of this XOR-result index the row
- of the S1 box. The middle four bits index the column. The intersection of row
- and column gives us a four-bit value, which, after passing through the
- permutation operation P, is used to encrypt LEFT bits 9 17 23 and 31.
-
- This is surely one-way. You can't determine the row and column indexes from
- the table value because there are four row and column indexes that map to the
- same value. It should also be clear that LEFT bits 9 17 23 31 are decryptable
- because RIGHT was never changed. We have only to combine the appropriate key
- bits with RIGHT's 32 1 2 3 4 5, and we'll get the same value out of the S box,
- which, XORed with LEFT, will yield our original 9 17 23 and 31.
-
- Notice the scatter. By encrypting these particular bits, we are encrypting
- the 3rd, 5th, 6th, and 8th nibbles which will be immediately used in the next
- round as arguments. Because of their positions, they will form part of the
- 2nd, 3rd, 4th, 5th, 6th, and 8th nibble arguments of the future LEFT. And
- thus, when RIGHT gets encrypted, its old first nibble will be quite
- spread out. The scatter makes far more bits dependent on very few bits than
- is at first apparent. The only unaffected nibbles are the 1st and 7th, but
- this is only one round, and we tracked only one nibble of RIGHT.
-
- This is getting lengthy, so let me list the RIGHT bit arguments corresponding
- to the eight S boxes and the LEFT bits that get encrypted
-
- RIGHT bits box LEFT bits
- 32 1 2 3 4 5 ----> S1 ----> 9 17 23 31
- 4 5 6 7 8 9 ----> S2 ----> 13 28 2 18
- 8 9 10 11 12 13 ----> S3 ----> 24 16 30 6
- 12 13 14 15 16 17 ----> S4 ----> 26 20 10 1
- 16 17 18 19 20 21 ----> S5 ----> 8 14 25 3
- 20 21 22 23 24 25 ----> S6 ----> 4 29 11 19
- 24 25 26 27 28 29 ----> S7 ----> 32 12 22 7
- 28 29 30 31 32 1 ----> S8 ----> 5 27 15 21
-
- In this manner, a nibble by nibble encryption is spread out so that a virtual
- table of 2^48 elements is achieved. Note that we have not really considered
- the key yet. And note that I have shown the contents of only one box. The
- boxes are listed in FIPS Pub 46, which you should use if you ever implement
- the DES, because other sources usually have typos, the slightest of which is
- fatal.
-
- Also, pay attention to how RIGHT's bits are expanded to 48. The last bit of
- the previous nibble plus the first bit of the next are tacked onto each nibble
- of the argument fore and aft. This builds an inter-nibble dependency into the
- DES. Even more important, one encrypting bit from the table can do a lot of
- 'damage'. Look at the nibble value coming out of the second S-box; its first
- bit will encrypt the 13th bit of LEFT. But look where the 13th bit occurs in
- expanded RIGHT! The result of this encryption is that when LEFT becomes the
- table argument, the third and fourth table nibbles are dependent on just one
- bit. As a matter of fact, the value out of any S-box encrypts exactly two
- nibble boundary bits and two nibble inner bits.
-
- We saw how the DES uses one-way functions, each specific to a nibble, and how
- the results of the one-way functions are carefully scattered. The purpose of
- the scattering is to build up complex dependencies for the final result on all
- bits of message and key as fast as possible. The scatter is achieved by a
- permutation, called P, that the standard describes as occurring after
- gathering the eight table values. For software implementation, there is a
- great savings in speed by replacing all S-box values with 32-bit entries
- pre-permuted by the inverse of P - that is, by the encrypting bit positions
- we already listed, and doing away with P entirely.
-
- To illustrate, the first value of the first S-box is 14, in binary, 1 1 1 0.
- Therefore, to do away with P, we replace this value with its equivalent
-
- 1 2 3
- 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
- ---------------------------------------------------------------
- 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
- 1 1 1 0
-
- The gain in speed can't be overemphasized. There are more implementation
- tricks like this that hopefully we can get into later. On an 8088, at 4.77
- mHz, encryption speeds of 2500 bytes/sec are possible in pure software. On a
- 68000, we think that software speeds of 6000, or better, bytes/sec are
- achievable. Perhaps even as high as 8000. The important lesson is that you
- don't have to code in the dumb plodding way the standard implies. It seems
- that the standard was written to be as uninstructive as possible.
-
- We should now discuss the key. We won't go into a bit by bit description, for
- that, see FIPS 46. The main idea is to generate 16 48-bit keys out of the
- original 56 that
-
- 1. are simply generated
- 2. vary one from the other as much as possible
- 3. fit the 'scatter' to involve all key bits in all cipher bits
-
- For this, two (actually, three) permutations are used, called PC1 and PC2 in
- the standard, awkward names, but we'll use them. This 'key scheduling' is not
- perfect; it permits weak keys (keys that are their own inverse) and semi-weak
- keys (keys that result in alternating patterns so that all encryptions of LEFT
- are as if encrypted by an invariant key, and a different invariant key for
- RIGHT). The key scheduling just isn't good enough to avoid these weaknesses.
-
- PC1 is a geometric permutation that doesn't seem to have deep reason. It
- discards the so-called parity bits, thus reducing 64 bits to 56, and divides
- the key bits into two halves. The halving *is* important. This is a picture
- of the PC1 transformation of the original key bits with their 'parity' bits
-
- 1 2 3 4 5 6 7 8
- 9 10 11 12 13 14 15 16
- 17 18 19 20 21 22 23 24
- 25 26 27 28 29 30 31 32
- 33 34 35 36 37 38 39 40
- ^ 41 42 43 44 45 46 47 ^ 48
- | 49 50 51 52 53 54 55 | 56
- | 57 58 59 60 61 62 63 | 64
- C half D half discarded
-
- C = 57 49 .. 44 36
- D = 63 55 .. 4 12 20 28
-
- Very geometric, which, combined with the geometric shifts used with PC2, cause
- a larger number of weak keys than would be strictly necessary. Now, PC2,
- which is actually two permutations, one that operates on circular shifts of C,
- and one that operates on circular shifts of D, has an odd property; it
- re-arranges 24 bits out of C and 24 bits out of D so that the C bits directly
- combine only with the first 16 bits of RIGHT (or LEFT), and the D bits only
- with the last 16 bits of RIGHT (or LEFT). (I'm not considering the nibble
- boundary bits). Bit 41, in other words, will never combine with bits 17
- through 32 of the one-way argument. Similarly, bit 47 will never combine with
- bits 1 through 16 of the one-way argument.
-
- Indirect combination, due to the scatter, is another story. My guess is, that
- cutting the key bits into halves is deliberate to ensure that all key bits
- encrypt all cipher bits as quickly and simply as possible. If you carefully
- examine which bits get encrypted by the table values, you will see that the
- scatter also repositions substitution bits by halves. That is, in our
- illustration, the first 16 bits plus the nibble boundary bits, of RIGHT,
- encrypt half of the first 16 bits of LEFT and half of the last 16 bits of LEFT
- (almost). Also, the last 16 bits of RIGHT cause encryption of the remaining
- first and second halves of LEFT. I think this completely explains the purpose
- of the P permutation described (described? just listed!) in the standard.
-
- For PC2, see FIPS 46. Let me just mention that since PC2 selects 24 bits out
- of each half, in any of the key schedules there are always 4 bits out of the
- 56 missing. This makes it harder for the cryptanalyst to work the cipher
- backwards by supposing and testing assumptions about the key.
-
- Finally, let me add an implementation note. You don't have to go through the
- key scheduling algorithm for every block like the standard describes.
- Instead, you program an initialization call that generates the 16 key
- schedules one time. You can do this because the schedules are invariant from
- block to block. Such an initialization call makes the difference between
- encryption rates of 80 bytes/sec and 700 bytes/sec for an otherwise uninspired
- implementation, or 2500 bytes/sec for what is in my opinion the very finest
- ever for the IBM PC, the Data Encoder.
-
- To make all cipher bits depend on all key bits is no mean accomplishment. The
- attempt is fraught with booby traps. Most ways you can think of to force this
- dependency have the unfortunate result of transforming the actual key into a
- simpler and much smaller equivalent key. This danger the designers of the DES
- managed to avoid. I mention this because we have seen some weaknesses in the
- DES's key scheduling, but these weaknesses should not blind us to the DES's
- good points if we want to learn how to design ciphers.
-
- The DES has another weakness, the complementary property, that effectively
- halves the key space. This property is caused by the two uses of XOR in the
- algorithm. Let's use '~' to indicate boolean 'not', 'm' to indicate
- 'message', and 'k' to indicate 'key'.
-
- The complementary property is as follows:
-
- DES(k,m) = ~DES(~k,~m)
-
- Read this as: the DES encryption of a message, m, under a key, k, is IDENTICAL
- to the complement of the DES encryption of ~m under key ~k.
-
- Remember that the key bits are combined with the message bits by a XOR to look
- up an encrypting value in an S-box. Because of this XOR, m and k, or ~m and
- ~k, map to the identical value in an S-box because of the boolean identity
-
- (m XOR k) = (~m XOR ~k)
-
- Remember also that LEFT is encrypted by XORing it with the looked-up value.
- Due to another boolean identity
-
- (LEFT XOR VALUE) = ~(~LEFT XOR VALUE)
-
- we have the complementary property. The result is that for known plaintext,
- the DES's key space is 2^55, not 2^56. And cryptographers must assume that
- plaintext is known when analyzing a cipher; that simply isn't unreasonable.
-
- This weakness would have been easy to avoid. We have only to *add* the key to
- RIGHT (or LEFT) mod 2^48, and we would not map to the same S-box values for
- complements of message and key; and to *add* the looked-up value to LEFT (or
- RIGHT) mod 2^32. And as a general rule, XOR is not a good primitive for
- ciphers. True, it is convenient because XOR is its own inverse, while if we
- used ADD to encrypt, we would have to use SUB to decrypt. But in cryptography
- there are more important things than convenience, for example, keeping the
- actual key space close to the potential.
-
- Why this weakness is not corrected is hard to understand. DES defenders claim
- that the complementary property is useful for verifying the correctness of
- implementations. Basically, you code the DES then test it to see if the
- complementary property holds for randomly chosen key and plaintext, and if it
- does, you are supposed to get a 'warm' feeling. But this argument can't be
- valid. Errors in key scheduling and in S-box values can't be caught by this
- test. Matter of fact, most errors in coding the DES can't be caught by the
- complementary property, so long as XOR is used to combine RIGHT and key, and
- to encrypt LEFT. It only proves that XOR is used both places, not that things
- are right!
-
- DES defenders also claim that XOR is necessary to keep decryption like
- encryption, that is, to avoid sensitivity to direction. However,
- the DES, whether it uses XOR or not, *is* sensitive to direction. To encrypt,
- you must start with the *first* key schedule, and work your way to the 16th.
- To decrypt, you must start with the 16th key schedule, and work your way
- backwards to the first. Since the DES is sensitive to direction anyhow, it
- wouldn't hurt a thing to use ADD one way, and SUB the other.
-
- My opinion is that XOR is a mistake realized too late, and that correction is
- resisted because too much is now invested in this mistake, and that defense of
- the XOR are rationalizations, not reasons. And yes, it does matter. The key
- space of 2^56 isn't enough anyhow. If the NSA can exhaust all 2^56 keys in a
- day (and on average, it need only to exhaust half that, or 2^55), then for
- known plaintext, which is very common, it can exhaust all possible 2^55 keys
- in half a day (or on average 2^54 keys in one quarter of a day).
-
- But our interest in the DES is not its defense, but to learn good ciphering
- from both its good and bad points. The lesson is clear; avoid XOR for
- ciphers, because it halves the key space. If you implement the DES and
- don't need to maintain the standard, I recommend using ADD and SUB instead
- of XOR. Software implementation of the DES is frowned on, not because it
- is slow, but because it permits a knowledgeable person to tamper with the DES
- to the NSA's distress. The NSA prefers hardware implementations only (ROM
- qualifies as 'hardware') and will more readily grant export licenses for
- hardware than software.
-
- Here is one suggestion that practically increases the key space to
- 120 bits. Begin with a seed of 64 bits, the seed being an extension
- of your 56-bit key. Encrypt the seed by your key and use the resulting
- 16 nibbles of the ciphertext to shuffle the nibbles of the first row of
- the first S-box. Encrypt the ciphertext (not the seed) again with
- the altered S-box. Use the second ciphertext block to shuffle the
- nibbles of the next row of the S-box. Repeatedly encrypt blocks
- and shuffle rows until all 32 rows have been altered. Thereafter,
- use your 56-bit key and the altered DES tables to encrypt and decrypt
- your messages. In all probability, the resulting ciphertext won't
- be as nice cryptographically with these randomized tables as with
- the contrived ones, but this scheme will thwart a brute-force attack
- based on exhaustive key trial. A key search on a special-purpose
- machine is possible over a 55-bit key space (given known plaintext),
- but it is not possible over a 120-bit key space. We may take up
- later other pros and cons of modifying the DES tables.
-
- The inventors of the DES state that five rounds are required for all
- ciphertext bits to become dependent on all key bits and all message bits. Yet
- the DES uses 16 rounds. How come? Wouldn't it be better to limit the rounds
- to five for a great increase in throughput? Or would it? In examining a
- cipher it is very good not to take anything for granted, to ask oneself 'why
- this, instead of that?' and attempt to find a coherent reason for it.
-
- I haven't been able to answer this question satisfactorily for myself.
- However, the DES is too carefully crafted for me to lightly believe that it
- uses 16 rounds just because that is a nice round number in binary arithmetic.
- Perhaps my current thinking will help others to puzzle it out.
-
- The DES is breakable, not just by brute force but by cryptanalysis, if it used
- two rounds instead of 16, and if plaintext is known. To see this, let's list
- knowns and unknowns for two rounds. We need a notation now for LEFT and RIGHT
- to show rounds, and we will use L[i] and R[i]. When i=0, LEFT and RIGHT are
- plaintext; when i=n, LEFT and RIGHT are ciphertext. Also, we will designate
- key schedules by K[j]. This is the picture when n=1.
-
- L[0] R[0] known by assuming known plaintext
- L[1] R[1] known by intercepting ciphertext
- K[1] unknown
- K[2] unknown
-
- Remember that encryption means a 32-bit number was picked out of the S-boxes
- as a function of R[0] and K[1], and this number encrypted L[0] to produce
- L[1]. Similarly, another 32-bit number was picked out of the S-boxes as a
- function of L[1] and K[2], and the second number encrypted R[0] to produce
- R[1].
-
- There is another way to look at this (the poet Wallace Stevens listed thirteen
- ways of looking at a blackbird). One nibble is picked out of the first S-box
- as a function of the bits 32 1 2 3 4 5 of R[0] and the first six bits of K[1]
- (bits 10 51 34 60 49 17 of the key), and encrypts bits 9 17 23 and 31 of L[0]
- by XOR. And so on.
-
- Now, if we have known plaintext, it is simply no problem to determine what the
- S-box values were that encrypted both LEFT and RIGHT. We just take bits 9 17
- 23 and 31 of L[0] and XOR them against the same bits of L[1]. And so on.
- Let's call these nibbles the encrypting *sums*, and because there is only one
- round per side, the sums are also direct S-box values.
-
- For each S-box value, there are four possible 6-bit keys that will map to
- the S-box value. For all the nibbles that encrypted left, the possible key
- space used with R[0] is reduced from 2^48 to (2^2)^8, or 2^16. This key space
- is easily exhausted, and we can now attack K[2]. By the same procedure, the
- key space for K[2] is also reduced to 2^16, however, K[2] must be consistent
- with K[1]. That is, the bits of K[2] are practically the same as for K[1],
- but rearranged. So, in fact, the key space for K[2] is far smaller than 2^16,
- I think it is only 2^4 (but I haven't counted it yet). This breaks a
- two-round DES.
-
- Now suppose four rounds; two for LEFT and two for RIGHT, and list the knowns
- and unknowns again
-
- L[0] R[0] known (plaintext)
- L[1] R[1] K[1] K[2] unknown! (intermediate ciphertext)
- L[2] R[2] known (ciphertext)
- K[3] K[4] unknown
-
- Now when we derive the encrypting sums we no longer have the S-box values, but
- two S-box *addends*, and for any sum, there must be 16 pairs of addends. We
- simply don't know what L[1] and R[1], the intermediate ciphertext, are
- anymore. The possible keys for K[1] increase from 2^16 to 2^32. This begins
- to look computationally difficult. Instead of four possible 6-bit keys per
- nibble, we now have 16. However, let us remember the consistency requirement
- of the key bits. If we do suppose a particular key bit is 1, it must be 1 in
- all the schedules in which it occurs, regardless of position. So, the
- combinatorics aren't quite as daunting as they first appear. It seems that
- this is still solvable.
-
- To work this out accurately, you would have to construct tables of key bits
- according to the key schedules, then note the bits repeated (they will be in
- different positions) from round to round.
-
- For six rounds, the number of possible keys used with R[0] jumps to 2^48.
- With a large machine, it should be possible to solve. But any more rounds, we
- would be solving for 2^56 possible keys; that is, we are back to a brute force
- attack.
-
- With 16 rounds, there are seven intermediate unknown LEFTs and RIGHTs
- and the combinatorics become too great to backsolve for the key. I suspect
- that if the math is worked out, 16 rounds make it a practical certainty that
- backsolving is infeasible, whether we attack all bits used in one round, or a
- few bits used in all rounds. It would be nice to know the exact number of
- rounds that reach this infeasibility; the math, however, escapes me.
-
- We must remember this when we return to our byte cipher (discussed in
- the file "DES"), because we should determine how many rounds are
- required to make backsolving with known plaintext infeasible. The
- number of rounds is important; too many is inefficient, too few is fatal.
-
- ---
-
- Experts have studied the S-boxes. To a casual eye each row of
- the S-boxes appears to be a random permutation of the nibbles 0
- through 15. But they aren't random. So far, nobody has figured
- out their ordering or why.
-
- I haven't figured out the S-boxes, and it is unlikely I ever
- will. Nevertheless, I am inclined to believe the inventors when
- they say there is no trap door, and the computed boxes really
- are 'better' cryptographically than random boxes. Keep in mind
- that many mathematicians, computer scientists, and gifted
- amateur cryptanalysts who owe nothing to the NSA have pored over
- these boxes. Also, the inventors' tone in their defense of the
- DES ('Cryptography: A New Dimension in Computer Security'; Carl
- Meyer and Stephan Matyas, John Wiley and Sons) strikes me as
- sincere. They seem to me genuinely surprised to discover that
- non-random boxes are better than random.
-
- I can't explain the S-boxes, but I do have an idea why
- non-random boxes might be 'better'. It turns out that the
- ENIGMA rotors also were not randomly wired. Rather, they were
- wired to ensure 'distance'. The idea is that similar plaintext
- should not encrypt under a similar key to a similar ciphertext.
- I haven't studied the ENIGMA in detail yet, so I don't know what
- threat 'closeness' poses to the key that the German inventors
- were trying to avoid. Yet, it must have been a threat, because
- the German cryptographers deliberately avoided random rotor
- wiring, in spite of the weaknesses 'distance' wiring introduced.
-
- There is the same idea in symbol table hashing algorithms.
- Here, one wants similar variables, perhaps differing only by one
- character to hash to far different places in a symbol table.
- The reason is, we want to avoid 'collisions' - different
- variable names accidentally hashing to the same symbol table
- location, causing extra effort to find an unused slot for each
- variable name. Most hashing schemes have the effect of
- discarding the high order character, so without 'distance', in
- FORTRAN, for example, collision is practically assured because
- FORTRAN uses the initial letter of a variable to classify the
- data type, and we tend to name variables systematically.
-
- Might this not be the secret of the S-box design criterion? It
- seems to me that the message space mapping of a cipher is not in
- principle different from symbol table hashing. And if
- 'distance' is *not* a criterion of the mapping, maybe it ought
- to be.
-
- What I'm saying, strictly as speculation, is that very similar
- plaintext, differing perhaps by only a bit, should map to wildly
- different points in the message space, that it may be impossible
- to guarantee distance with random table values, and that the
- S-box values were carefully computed to ensure distance.
- Exactly why the lack of distance should be weak, I haven't
- figured out. In support of my guess, the DES does indeed
- exhibit the property that similar input maps to violently
- different results.
-
- But even if my guess is right, this leaves unanswered the
- important question of how to design the boxes. Still, it's a
- start, and it raises very interesting questions in math and
- cryptography.
-
- The plaintext is not simply divided into LEFT and RIGHT like we
- described. It is permuted first by a so-called Initial
- Permutation, IP, that transposes the rows and columns of a
- bit-byte matrix in an odd way.
-
- 1 2 3 4 5 6 7 8
- 9 10 11 12 13 14 15 16
- 17 18 19 20 21 22 23 24
- 25 26 27 28 29 30 31 32
- 33 34 35 36 37 38 39 40
- 41 42 43 44 45 46 47 48
- 49 50 51 52 53 54 55 56
- 57 58 59 60 61 62 63 64
- -----------------------
- 1 2 3 4 take out order for LEFT
- 5 6 7 8 take out order for RIGHT
-
- In other words, transcribe the bytes in row order, then take
- bits out in the column order indicated, from bottom to top.
-
- This permutation is bemusing. There is no apparent reason for
- it, and many cryptanalysts believe IP has no cryptographic
- significance. The American Cryptogram Association, an
- organization of amateur cryptanalysts, has considered IP without
- being able to reach a conclusion about it.
-
- This unexplained permutation does something distinctive,
- however. Let's look at eight consecutive bytes in a different
- way:
-
- x..x x..x
- x..x x..x
- x..x x..x
- x..x x..x
- x..x x..x
- x..x x..x
- x..x x..x
- x..x x..x
-
- What we are doing is distinguishing between nibble boundary bits
- and nibble inner bits. Now, it is a fact that IP rearranges
- boundary and inner bits like this
-
- .... ....
- x..x x..x
- .... ....
- x..x x..x
- x..x x..x
- .... ....
- x..x x..x
- .... ....
-
- In other words, half the normal boundary bits are exchanged with
- inner bits, and the original boundary bits that still remain
- boundary bits are relocated. The reason for this is obvious -
- the boundary bits have quite pronounced statistics in normal
- natural language text, especially in EBCDIC. For example, a
- blank is 0 1 0 0 0 0 0 0. The result is that 20 percent of all
- boundary bits are 0 because blanks make up 20 percent of all
- text. By continuing to separate the boundary nibbles by their
- natural language statistics, you can derive a frequency table,
- in addition to 0..0, for
-
- 0..1
- 1..0
- 1..1
-
- Do this as an exercise, and you will see that the frequencies
- are quite pronounced. Use EBCDIC; remember, the inventors of
- the DES were thinking IBM, not micros and ASCII.
-
- The effect of IP is to even out these frequencies for the
- boundary bits so that they are more nearly uniform. It is
- impossible to believe that something as remarkable as this was
- not intended. This smoothing out of boundary bit statistics
- must be the purpose of IP.
-
- How come? Why are the boundary bits so important that they are
- evened out at the expense of the inner bits? The answer I
- suppose is to compensate for the reduction of the DES key from
- 128 bits to 56.
-
- It doesn't take too long studying the DES to realize that its
- dimensions are wrong. It is very lopsided. There are eight
- boxes of four rows and 16 columns each. In short time, you get
- the feeling that 16 boxes of 16 rows and columns each would have
- been more natural. The key should have been 128 bits; LEFT and
- RIGHT should have consisted of 64 bits each. The inter-nibble
- dependency should have been TWO bits from the previous nibble
- (not one) concatenated to the current nibble, concatenated to
- TWO bits of the following nibble. In other words, instead of 32
- 1 2 3 4 5, it should have been 31 32 1 2 3 4 5 6. Bits 31 32 5
- 6 should have been the row coordinate into the first S-box, not
- 32 and 5.
-
- The result of this brutal reduction is that there are far less
- rows per nibble to select values out of than there are columns.
- Guessing which row is selected, rather than guessing which
- value, there are
-
- (2^2)^8 = 2^16
-
- choices per round, or for one nibble for all rounds. This is
- not a formidable number.
-
- For known input, correctly guessing a row gives you two bits of
- the key. Because of the key bit consistency requirement, it
- also helps determine other key bits. Also, guessing the row
- helps determine the column as well, since the selfsame message
- row bits participate as column selection bits. It looks like
- row selection is the weakest part of the DES, and if a
- cryptanalytic attack is achievable, it should be the spot to
- concentrate on. My guess is that a strong bias of nibble
- boundary bits might help such an attack, and that IP is a
- stop-gap measure to thwart it. If the rows *can* be determined,
- 32 key bits are recovered, and the DES falls. The remaining 24
- bits could not resist even a microcomputer.
-
- Perhaps IP is not a bemusing oddity. Perhaps it is symptomatic
- of a deep weakness born of the DES's butchering. By the way,
- this is exactly how cryptanalysis works - find a weak point and
- hammer it.
-
- ---
-
- This note will describe one more implementation trick for very
- fast software DES, then recap implementation tricks.
-
- You will remember how the 32 bits of RIGHT (or LEFT) must be
- expanded before they can be combined with the key schedule for a
- round. The following bits
-
- 1 2 3 4
- 5 6 7 8
- 9 10 11 12
- 13 14 15 16
- 17 18 19 20
- 21 22 23 24
- 25 26 27 28
- 29 30 31 32
-
- must be expanded to
-
- 32 1 2 3 4 5
- 4 5 6 7 8 9
- 8 9 10 11 12 13
- 12 13 14 15 16 17
- 16 17 18 19 20 21
- 20 21 22 23 24 25
- 24 25 26 27 28 26
- 28 29 30 31 32 1
-
- This expansion is called E in FIPS 46. You will also remember
- that IP arranges the bits of the plaintext into these 32 bits
- for RIGHT and 32 bits for LEFT. Now to perform this expansion
- in code for every round happens to be quite a bit of work, even
- using a micro's fastest instruction forms. However, IP may be
- modified to generate LEFT and RIGHT in *expanded* form. Then,
- if the S-box values are adjusted to compensate for an expanded
- LEFT and RIGHT, the same encryption is achieved, but without the
- expense of performing E for every round. Including the S-box
- adjustment to do away with P, the cost is 48-bit S-box values
- for every original 4-bit element in the S-boxes.
-
- To see how this works, remember that *all* values in the first
- S-box encrypt only bits 9 17 23 and 31 of either LEFT or RIGHT,
- and the first S-box's first value is 1 1 1 0, in binary. We
- therefore replace this value with the following
-
- 0 0 0 0 0 0
- 0 0 0 0 0 1
- 0 1 0 0 0 0
- 0 0 0 0 0 1
- 0 1 0 0 0 0
- 0 0 0 1 0 0
- 0 0 0 0 0 0
- 0 0 0 0 0 0
-
- Let's assume the 8088 chip and a clock rate of 4.77 MHz, and
- recap. Setting the key schedules in a one-time initialization
- call increases software speed from about 80 bytes per second to
- close to 800. Getting rid of P by precomputing it in the
- S-boxes increases the speed from about 800 bytes/second to about
- 1700. Making E a one-time expense by putting it into IP, and
- adjusting the S-boxes accordingly increases the speed from about
- 1700 to 2500 bytes/second.
-
- There is one more trick if you can live with a c * 64K table,
- where 'c' is the length of one table entry. You can collapse two
- S-boxes into a single table, thereby halving the number of S-box
- lookups. This should bring the speed up to about 5000
- bytes/second. However, this large a table isn't practical for
- some encryption applications.
-
- ---
-
- The files permf.s and permg.s are the 68000 versions of the
- permutation routines described in Z80 code in the article
- "Designing a File Encryption System" in the Aug '84 issue of
- DDJ. Permf performs the forward permutation of the bits in a
- 256-bit block as specified by a table of bytes. Permg performs
- the inverse permutation.
-
- The file cycle.s is the 68000 version of the corresponding
- Z80 code for the routine which "cycles" the permutation tables
- to the next position.
-
- For example, if the permutation table has the values:
-
- 1 15 115 57 .... 0
-
- then the forward permutation means to put the 1st bit of the
- block in the 0th place, the 15th bit in the 1st place, the 115th
- bit in the 2nd place, and so on until the 0th bit goes in the
- 255th place.
-
- The inverse permutation with the same table means to place the
- 0th bit of the block in the 1st place, the 1st bit in the 15th
- place, the 2nd bit in the 115th place, and so on until the 255th
- bit goes in the 0th place.
-
- The routines address bits in the block by deriving a bit index
- from the byte value of the permutation table. The upper five
- bits of that value index to the particular byte in the block,
- and the lower three bits then index to the particular bit within
- that byte.
-
- In the original cryptographic use, the permutation table was
- assumed to be cycled to its next permutation after the
- encryption of each block. The idea is to use the same thing in
- as many different ways as possible to get a long period before
- it repeats. Consider the following permutation list of 10
- elements:
-
- element: 7 4 1 3 5 9 8 6 2 0
- position: 0 1 2 3 4 5 6 7 8 9
-
- This is the table permf and permg use. In cyclic notation this
- list becomes:
-
- (0 7 6 8 2 1 4 5 9) (3)
-
- That is, there are two cycles, one of degree 9 and one
- singleton. It means that 7 goes to 0, 6 goes to 7, 8 goes to 6,
- and so on. If we "rotate" the cycles to the second position, we
- get this list:
-
- element: 6 5 4 3 9 0 2 8 1 7
- position: 0 1 2 3 4 5 6 7 8 9
-
- Thus from one cycle table we can get 9 different permutation
- lists. The cycle routine constructs these lists for permf and
- permg, given a table in cyclic notation as above. It can handle
- tables of 256 elements, each of which may contain a number of
- cycles. For example, if we had two tables, the first containing
- two cycles, of degrees 83 and 173, and the second containing
- cycles of degrees 197 and 59, the composite cycle would not
- repeat until 83 * 173 * 197 * 59 = 1.669 x 10^8 blocks had been
- processed.
-
- The routines now run about four times faster that the Z80
- versions.
-
- *
- * PERMF for the 68000. Permute a 256-bit bit vector, BLOCK
- * by table BITPERM. On call:
- *
- * a0 -> BLOCK
- * a1 -> BITPERM
- *
- * On return, a0 -> permuted BLOCK.
- *
- * Register usage:
- *
- * a2 -> WORK, a 32-byte temporary work area
- * d0 = byte from BITPERM, shifted to bit index
- * d1 = index to byte of BLOCK
- * d2 = rotated bit for masking and inner loop control
- * d3 = #31, outer loop control
- * d4 = #$07 immediate masking value
- *
- * 5/19/86. Execution time 4.5 ms.
- *
- .globl permf,work
-
- .text
-
- permf:
- movem.l d0-d4/a0-a3,-(a7)
- moveq #7,d0 clear work area
- lea work,a2 -> work
- move.l a2,a3 copy ptr for later use
- clrloop:
- clr.l (a2)+
- dbra d0,clrloop
- move.l a3,a2 retrieve work pointer
- move #$80,d2 masking bit and inner loop control
- moveq #31,d3 outer loop control
- moveq #7,d4 masking value
- clr d0 keep word clear
- permloop:
- move.b (a1)+,d0 get byte from BITPERM
- move d0,d1 we will need it twice
- lsr #3,d1 compute byte index in BLOCK
- and d4,d0 save lower 3 bits for bit index
- eor d4,d0 reverse bit order for btst
- btst d0,(a0,d1) is bit on in BLOCK?
- beq permf1 if so, we must set bit in WORK
- or.b d2,(a2) set bit in WORK
- permf1:
- ror.b #1,d2 shift masking bit
- bcc permloop next bit of work?
- addq #1,a2 else, next byte of work
- dbra d3,permloop do for 32 bytes
-
- movloop:
- move.l (a3)+,(a0)+ move data from WORK to BLOCK
- dbra d4,movloop use #7 already in d4
- movem.l (a7)+,d0-d4/a0-a3
- rts all done
-
- .end
-
- *
- * PERMG for the 68000. Inversely permute a 256-bit bit
- * vector by table BITPERM. On call:
- *
- * a0 -> BLOCK
- * a1 -> BITPERM
- *
- * On return, a0 -> permuted BLOCK.
- *
- * Register usage:
- *
- * a2 -> WORK, a 32-byte temporary work area
- * a3 -> BLOCK
- * d0 = outer loop control
- * d1 = inner loop control
- * d2 = bit counter
- * d3 = longword from block
- * d4 = byte from BITPERM
- * d5 = temporary
- * d6 = #7 masking value
- *
- * 5/19/86. Execution time is 3.3 ms.
- *
- .globl permg,work
-
- .text
-
- permg:
- movem.l d0-d6/a0-a3,-(a7)
- moveq #7,d0 clear work area
- lea work,a2
- move.l a2,a3 copy ptr for later use
- clrloop:
- clr.l (a2)+
- dbra d0,clrloop
- move.l a0,a3 save a0 ptr for later
- moveq #7,d0 outer loop control
- moveq #0,d2 count of bits
- move d2,d4 need word clear
- moveq #7,d6 masking value
- permg1:
- moveq #31,d1 inner loop control and bit to test
- move.l (a0)+,d3 get longword from BLOCK
- bitloop:
- btst d1,d3 check for bit on
- beq permg2 if on, set BITPERM[d2] bit in WORK
- move.b (a1,d2),d4 get byte BITPERM[COUNT]
- move d4,d5 save for reuse
- lsr #3,d4 index to byte of WORK
- and d6,d5 compute bit # in that byte
- eor d6,d5 reverse bit order
- bset d5,(a2,d4) set the bit in WORK
- permg2:
- addq #1,d2 bump bit count
- dbra d1,bitloop and do for all bits in this word
- dbra d0,permg1 do for all words of BLOCK
-
- movloop:
- move.l (a2)+,(a3)+ move WORK to BLOCK
- dbra d6,movloop use #7 already in d6
- movem.l (a7)+,d0-d6/a0-a3
- rts all done
-
- .end
-
- *
- * CYCLE: Convert a table of permutation cycles to a
- * permutation list at the Nth permutation. This version
- * of cycle can deal with a table having any number of
- * cycles (up to word value) of various degrees. The cyclic
- * permutation table is RANDP and the permutation list
- * table is BITPERM. BITPERM may then be used to permute
- * a block of elements. The procedure is:
- *
- * consult the cycle table structure to obtain the number
- * and degree of the cycles and the rotation to be applied
- * to each
- *
- * k <- 0 /* index into RANDP */
- * for i = 1 to number_of_cycles do
- * get degree_of_this_cycle from structure
- * cycle_base <- k
- * for j = 1 to degree_of_this_cycle do
- * BITPERM[RANDP[k]] <- RANDP[RANDP[(k - cycle_base + rotation)
- * mod (degree_of_this_cycle)]]
- * k <- k + 1
- * end for
- * end for
- *
- * On call:
- * a0 -> RANDP
- * a1 -> BITPERM
- * a2 -> cycle structure
- *
- * On return:
- * all registers saved
- * BITPERM is established from RANDP
- *
- * The cycle structure is built as follows for each set of tables:
- *
- * dc.w xx number of cycles in this table
- * dc.w yy degree of first cycle
- * dc.w zz rotation of first cycle
- * . . ..and so forth for each cycle
- * . .
- * CAUTION: This structure holds words, but this implementation assumes
- * that RANDP and BITPERM are tables of BYTES. See indirect addressing
- * inside degloop below.
- *
- * Version of 4/6/86.
-
- .text
- .globl cycle
-
- cycle:
- movem.l d0-d7/a0-a2,-(a7)
- move (a2)+,d0 get # of cycles
- subq #1,d0 adjust for dbra
- clr d3 clear index into RANDP ("k" above)
- clr d7 clear scratch register
- cycloop:
- move (a2)+,d1 get degree of this cycle
- move (a2)+,d2 get rotation for this cycle
- move d1,d5 use degree for loop control
- subq #1,d5 adjust for dbra
- move d3,d4 set current cycle base = k
- degloop:
- move d3,d6 first, see if outside cycle
- sub d4,d6 RANDP index - current cycle base
- add d2,d6 add in rotation
- cmp d1,d6 does that put us outside this cycle?
- bcs degok branch if not
- sub d1,d6 else, adjust mod degree
- degok:
- add d4,d6 add cycle base back to index + rotation
- move.b (a0,d6),d6 get RANDP[index + rotation]
- move.b (a0,d3),d7 get RANDP[index]
- move.b (a0,d6),(a1,d7) put byte in BITPERM
- addq #1,d3 bump RANDP index
- dbra d5,degloop loop until this cycle done
- dbra d0,cycloop loop until all cycles done
-
- movem.l (a7)+,d0-d7/a0-a2
- rts
-
- .end
-
- /*EOF*/