home *** CD-ROM | disk | FTP | other *** search
Text File | 1988-11-07 | 153.3 KB | 5,185 lines |
-
-
-
-
-
-
- BEYOND DES:
-
- DATA COMPRESSION AND THE MPJ ENCRYPTION ALGORITHM
-
- by
-
- Michael Paul Johnson
-
- B. S., University of Colorado at Boulder, 1980
-
-
-
-
-
-
-
-
-
- A thesis submitted to the
-
- Faculty of the Graduate School of the
-
- University of Colorado in partial fulfillment
-
- of the requirements for the degree of
-
- Master of Science
-
- Department of Electrical Engineering
-
- 1989
-
- Copyright (C) 1989 Michael Paul Johnson.
-
- All Rights Reserved.
-
- This thesis for the Master of Science degree by
-
- Michael Paul Johnson
-
- has been approved for the
-
- Department of
-
- Electrical Engineering
-
- by
-
-
-
-
- Mark A. Wickert
-
-
-
-
- Charles E. Fosha, Jr.
-
-
-
-
- Rodger E. Ziemer
-
-
-
- Date
-
- Johnson, Michael Paul (M. S., Electrical Engineering)
-
- Beyond DES: Data Compression and the MPJ Encryption Algorithm
-
-
- Thesis directed by Assistant Professor Mark A. Wickert
-
- Many encryption algorithms have come and gone as cryptography, cryptanalysis,
- and technology have progressed. Today's communication and computer
- technologies need cryptography to truly secure data in many applications.
- The demands on the cryptography needed for some commercial applications
- will exceed the security offered by the National Bureau of Standards
- Data Encryption Standard (DES) in the near future due to advances
- in technology, advances in cryptanalysis, and the increasing rewards
- for breaking such a heavily used algorithm. To meet part of this
- need, a new block encryption algorithm is proposed. A Pascal program
- to implement this algorithm is given.
-
- One way to further increase security of encrypted data, as well as
- to achieve storage and/or transmission economy, is by redundancy reduction
- prior to encryption. A linguistic approach to redundancy reduction,
- together with an example computer program to implement it, is given
- for this purpose.
-
- LIST OF FIGURES viii
-
- I. INTRODUCTION 1
-
- A. Motivation 2
-
- B. Approach 4
-
- II. HISTORY OF CRYPTOGRAPHY 6
-
- III. ELEMENTS OF ENCRYPTION 8
-
- A. Substitution 8
-
- 1. Monoalphabetic 8
-
- 2. Polyalphabetic 10
-
- B. Permutation 11
-
- C. Noise Addition 12
-
- D. Feedback & Chaining 13
-
- 1. Plain Text Feedback 13
-
- 2. Cipher Text Feedback 14
-
- E. Analog Encryption 15
-
- IV. FACTORS RELATED TO ENCRYPTION 17
-
- A. Change of Language 17
-
- B. Digitization 18
-
- C. Compression 18
-
- D. Multiplexing 19
-
- V. COMPARISON OF SELECTED ALGORITHMS 20
-
- A. One-Time Key Tape 20
-
- B. Linear Shift Register Feedback 21
-
- C. Exponential Encryption 22
-
- D. Knapsack 24
-
- E. Rotor Machines 24
-
- F. Codes 27
-
- G. Galois Field and Hill Cryptosystems 28
-
- H. DES 30
-
- VI. DESIGN CONSIDERATIONS FOR MPJ 33
-
- A. Strength Based on Key 33
-
- B. Usability of Random Keys 33
-
- C. Key Length & Block Size 34
-
- D. Effort Required to Break 34
-
- E. Computational Efficiency 35
-
- F. Communication Channel Efficiency 36
-
- G. No Back Doors or Spare Keys 36
-
- VII. MPJ ENCRYPTION ALGORITHM 37
-
- A. Description 37
-
- 1. Overall Structure of MPJ 37
-
- 2. Substitution Boxes 39
-
- 3. Wire Crossings 39
-
- 4. Key Generation 40
-
- B. Implementation in Pascal 44
-
- 1. Exceptions from Standard Pascal 44
-
- 2. Main Program 45
-
- 3. Procedures Encrypt & Decrypt 46
-
- 4. Procedures Permute & Ipermute 47
-
- 5. Procedures Substitute & Isubst 47
-
- C. Implementation in Hardware 47
-
- D. Strengths & Weaknesses 48
-
- VIII. DATA COMPRESSION 50
-
- A. Purpose 50
-
- B. Linguistic Parsing 53
-
- C. Huffman Coding 55
-
- D. Pascal Programs 55
-
- IX. CONCLUSION 58
-
- REFERENCES 59
-
- APPENDIX
-
- A. MPJ in Pascal 64
-
- B. Linguistic Data Compression Programs 72
-
- FIGURE
-
- III.B Permutation. 11
-
- III.C. Noise Addition 12
-
- III.D. Block Cipher Modes. 14
-
- III.E. Analog Time & Frequency Encryption 15
-
- V.A. One-time key tape (AKA One-time pad) 21
-
- V.A.1 Typex Rotor Machine 25
-
- V.E.2. Wired Rotors 25
-
- V.H.1. DES Enciphering 30
-
- V.H.2. DES Nonlinear Function 30
-
- V.H.3. DES Internal Key Generation 31
-
- VII.A.1. Overall Structure of MPJ 38
-
- VII.A.3. Wire Crossings 39
-
- I. INTRODUCTION
-
- The increasing proliferation of digital communication and computer
- data base storage has brought with it increasing difficulty of maintaining
- the privacy of that data. There is only one effective way to protect
- the privacy of communications sent over such channels as satellites,
- terrestrial microwave, and cellular telephones. This is by encryption. It
- is clearly impossible to deny unauthorized access by a determined
- and knowledgeable interceptor to the communications, but it is possible
- to render the communications totally unintelligible to all but the
- intended receiver(s).
-
- There are many ways to reversibly transform data from its plain form
- to something that looks unintelligible, but many of these can be figured
- out (broken) by someone else. The study of how to hide secrets is
- cryptology. Trying to figure out the secrets that someone else has
- hidden is cryptanalysis. These two sciences are, of course, very
- much intertwined. History reveals many examples of cryptology that
- worked, and that didn't [KAH]. Successful cryptanalysis depends on
- taking advantage of as many of the following as are available to the
- cryptanalyst: (1) taking advantage of the redundancy in any natural
- language to determine the validity of assumptions, (2) clues gained
- from corresponding plain and cipher text, (3) information that might
- be known about the algorithm(s) used, (4) the general expected content
- of the cryptograms, (5) all of the cipher text that is available in
- the same system & key, (6) compromised keys, (7) as much computational
- and analytical power as can be obtained, and (8) mistakes made in
- the users of the cryptographic system. The cryptographer can make
- life as difficult as possible for the cryptanalyst by depriving him
- of some of these things by (1) using redundancy reduction before encryption,
- (2) using an algorithm that is resistant to the known plain text attack,
- (3) & (4) using a strong enough algorithm that these clues aren't
- really useful, (5) changing keys often and selecting them properly,
- (6) guarding keys as closely as the data they protect justifies, (7)
- ensuring that there aren't enough computers in the world to do a brute
- force attack on the algorithm, and (8) making sure users of the system
- understand how to properly use it.
-
- This thesis proposes one solution to the above challenge by proposing
- (1) an approach to linguistically based redundancy reduction, and
- (2) proposing a new data encryption algorithm (MPJ) that can be used
- where DES is in use now, but is more secure.
-
- A. Motivation
-
- There has been a great deal of discussion of the security of DES in
- the open literature. Most of it has been favorable to DES [MEY],
- but there are a few indications that it would be wise to supplement
- the aging DES and eventually replace it.
-
- DES has been in use since 1977, and has been used in a large number
- of applications where people have had many possible motivations for
- trying to break this cipher. During this time, it is possible that
- someone has discovered a computationally feasible method for doing
- so [KRU]. Under such circumstances, it is highly unlikely that such
- a discovery would be made known. There is no way to really know this
- for sure, unless you are one of the ones who has broken DES. The
- closest thing that I have found to an open admission of breaking DES
- is a story of the FBI successfully decrypting a file of drug transaction
- records that were encrypted on a PC using a DES board. [MCL] The DES
- board that the criminal used has an algorithm to generate a key from
- a word or phrase. By an exhaustive search of an English dictionary
- and key names from the criminal's family & friends using a supercomputer,
- the file was solved. This indicates some weakness in DES, but even
- more weakness in the way the key was chosen. The key was not chosen
- randomly, so the FBI's job was much easier.
-
- DES is subject to attacks that require precomputation that could tie
- up a supercomputer for a few years, after which it would take only
- a few days to solve a DES cryptogram. This is becoming less of a
- barrier as the price of computers drops and the speed and storage
- capacity of computers increase.
-
- The U. S. National Security Agency (NSA) is acting to release some of its
- own algorithms for ``sensitive but unclassified'' applications, such
- as communications between government contractors. The NSA also releases
- some classified algorithms to chip manufacturers for use in classified
- devices, but under strict controls [ULT]. This will take some of
- the load off of the DES algorithm, but there is a catch. They intend
- to keep the algorithms secret and control the keys. Since the security
- of the algorithm is dependent on the key in a way that only the NSA knows,
- this gives the NSA the exclusive ability to read everybody's' communications. That
- is not all bad, as it discourages the use of one of those systems
- by someone engaged in spying or other illegal activity. Unfortunately,
- these systems are not an option for protection of a corporations proprietary
- data that may have a great deal to do with profits and losses, but
- are not really in the domain of the NSA.
-
- There are other alternatives to DES now, but none of them in the public
- domain are even as good, let alone better, for general cryptography. For
- example, RSA encryption has great advantages in the authentication
- of digital signatures, but the complications of selecting good keys
- and the fact that the security of RSA relies heavily on the failure
- of the state of the art in mathematics to progress makes it at least
- inconvenient to use and at most insecure.
-
- Because of the above considerations, it is my intentions to suggest
- a better algorithm (MPJ) for general use in the private sector. Although
- the MPJ algorithm might be useful for government applications, too,
- the design requirements for the two areas of application vary [CHA]
- and the latter is kept shrouded in secrecy, so we shall leave it to
- the NSA.
-
- B. Approach
-
- First, the problem is defined. As explained in the above section,
- the basic problem is to create an algorithm to supplement DES. Design
- criteria are then conceived such that an algorithm that meets those
- criteria will solve the problem as defined. The design criteria chosen
- for the MPJ algorithm are discussed in chapter VI. To avoid repetition
- of one or more of the many mistakes that have been made throughout
- history with cryptography, it is, of course, necessary to research
- what was done in the past, both distant and recent. From this, many
- good ideas can be gleaned that apply to the problem at hand. These
- ideas, along with a knowledge of current technology are then applied
- to design criteria with a bit of creativity and a lot of hard work
- to define the new algorithm.
-
- In the process of doing all of the above, it became apparent that
- the security of encrypted data was vastly improved by first removing
- as much redundancy from the data as possible. Therefore, as sort
- of a bonus, a linguistic approach to data compression is also presented
- that can either be used in conjunction with the MPJ encryption algorithm,
- another encryption method, or just by itself for the savings that
- it gains in communications channel capacity and/or data storage space.
-
- II. HISTORY OF CRYPTOGRAPHY
-
- One of the best works on the history of cryptography is <MI>The Codebreakers,
- by David Kahn [KAH]. In that book, David Kahn discusses cryptography
- from the prehistory to World War II. The first codes and ciphers
- were in written form, and used to protect the privacy of communications
- sent by courier or mail through hostile or unknown territory. Some
- of these were reasonably good, but most were not difficult to break
- using manual methods, provided that the interceptor had sufficient
- cipher text and perhaps some probable text. The use of radio, especially
- by the military, increased the need for cryptography, as well as increasing
- the rewards for those who could break the encryption schemes in use. Kahn's
- documentation of the efforts of those who broke some very complex
- encryption schemes, like the German Enigma and the Japanese Purple
- ciphers, lend great insight to the kind of process cryptanalysis really
- is.
-
- Kahn points out the kinds of mistakes the inventors and users of cryptographic
- algorithms tend to make that reduce the security of their communications. For
- example, German users of Enigma tended to choose a three-letter indicator
- for their messages that consisted of three consecutive letters on
- the keyboard. This substantially reduced the number of keys that
- had to be searched to determine the one that they were using. While
- the designer of an algorithm may calculate the great number of combinations
- of keys that there are, the cryptanalyst looks at ways to isolate
- parts of the key so that the difficulty of a solution is much less
- than the size of the key space indicates. The difference in mind
- set between the concealer of secrets and the one who prys into them
- has caused many an inventor of an encryption algorithm to be overconfident.
-
- The job of the cryptanalyst is a tedious one. He tries all kinds
- of things to try to unscramble the cipher text in front of him. Sometimes
- the search is fruitless. Sometimes the search yields something that
- looks like a meaningful language. It is this ability to recognize
- a meaningful message when it comes out of the various operations that
- the cryptanalyst tries that makes the whole process possible. It
- is also helpful for the cryptanalyst to know some probable plain text
- that is contained in a message. This is almost always the case. For
- example, military messages even now have a very stereotyped format,
- with the from, to, and date indicators in the same places in the message.
- The cryptanalyst almost always knows what language to expect a message
- to be written in, and this is a great help. Natural languages contain
- a great deal of redundancy. A message that is only 90% recovered
- is usually readable. Natural languages also have very consistent
- statistical properties that are very useful in cryptanalysis, especially
- when the cryptanalysis is automated. The only time that these things
- don't help the cryptanalyst is in the ``one-time pad.''
-
- III. ELEMENTS OF ENCRYPTION
-
- Several basic elements make up the basis of a multitude of possible
- encryption algorithms, either alone or in combination [BEK][BAL]. Although
- most of these elements may be used to form the basis of an encryption
- method all by itself, the most secure methods of encryption will several
- of them. For example, substitution and permutation are basic elements
- of both DES and MPJ encryption algorithms. These algorithms, in turn,
- are best used with some form of feedback.
-
- A. Substitution
-
- A substitution cipher simply substitutes one symbol from the plain
- text alphabet for another one from a cipher text alphabet. The plain
- text alphabet can be a natural language alphabet, ASCII, EBCDIC, Baudot,
- the set {0,1}, or any other finite set. Cipher text alphabets, likewise,
- may be any finite set. To be able to easily apply computer methods
- to the manipulation of these alphabets, it is preferable to use alphabets
- that are groups of binary digits (bits). This is not a severe limitation,
- since a correspondence can be set up between any finite set and a
- set of binary numbers.
-
- 1. Monoalphabetic
-
- A monoalphabetic substitution cipher is one in which each letter of
- the plain text alphabet is always substituted for by the same cipher
- text alphabet. The cipher text alphabet need not be in any particular
- order. A subset of the monoalphabetic substitution cipher is the
- Caesar cipher. This cipher replaces each letter with a letter that
- is n letters later in the alphabet. This cipher is so named because
- it was reportedly first used by Caesar. [KAH] With only 26 possible
- keys, this is obviously not very secure. In fact, any substitution
- cipher that substitutes one letter for another is vulnerable to solution
- by frequency analysis, and can be solved, given enough cipher text
- with the same key. How much cipher text is enough depends on the
- nature of the plain text, but for plain English, as many characters
- as there are letters in the alphabet is sufficient for the probabilities
- of occurrences of letters to betray enough letter identities to make
- solution probable. While ciphers of this type no longer have any
- value for serious cryptography, they do make fun puzzles for elementary
- school children. For example, the following cryptogram is presented
- for your solution:
-
- UIJT JT B DBFTBS TVCTUJUVUO DJQIFS/ XPVME ZPV USVTU VPVS TFDSFST
- UP JU@
-
- Substitution ciphers can be made more secure by operating on substitutions
- of more than one letter at a time. For example, the substitution
- could be made on digraphs or trigraphs (groups of two or three letters). A
- more practical example is the Data Encryption Standard. DES is actually
- a substitution cipher with an effective alphabet size of 2^64,
- or about 1.84 x 10^19, and the MPJ Encryption Algorithm is a
- substitution cipher with an effective alphabet size of 2^128
- or about 3.4 x 10^38. Not only is it difficult to get that
- large of a sample to analyze for statistics, but the memory required
- to store the substitution table is impractical even for computer systems
- using large optical storage disks.
-
- 2. Polyalphabetic
-
- A polyalphabetic substitution cipher is one in which each letter of
- the plain text alphabet may be replaced a letter from one of many
- cipher alphabets. The selection of which alphabet to take the substitution
- from may be determined in many ways. The most usual method is by
- selecting the cipher alphabet by a combination of a message indicator
- and the number of characters from the beginning of the message. The
- classic example of this type of cipher is the rotor-based machine. The
- substitution occurs by physical wire crossings within rotors. After
- each character is enciphered, one or more rotors are rotated by a
- ratchet mechanism. The message indicator, which is part of the key,
- determines the starting position of the rotors. This effectively
- allows the use of more cipher alphabets than there are letters in
- the message. Because of this, the use of statistical analysis on
- any one message to guess at the substitutions is useless. There is,
- however a good way to attack this kind of cipher when it is used heavily.
- This is by analyzing the statistics of the first character in each
- message, then the second, etc. This is called analyzing the messages
- ``in depth.''
-
- Although the rotor-based polyalphabetic ciphers used by Germany and
- Japan in World War II contributed greatly to their losses because
- their messages were read regularly by the allies, it is possible to
- create a more secure polyalphabetic substitution with computer technology.
- The primary weakness of the mechanical rotor machines was the infrequent
- changing of many of the parts of the key such as rotor wiring and
- ratchet construction. A more general method of using a different
- set of alphabets for each message would be very secure. Unfortunately,
- it would result in keys larger than the message. Since the minimum
- key length for absolute, provable cryptographic security, as determined
- by C. E. Shannon [SHA], is exactly the length of the message, this
- solution is not very efficient with respect to key management.
-
- B. Permutation
-
- Permutation is taking the same symbols and writing them in a different
- order to confuse an interceptor. There are many ways of doing this
- geometrically. For example, words may be written vertically then transcribed
- horizontally, or they may be written on a strip of paper wrapped along
- the length of a rod of a given diameter, then unwrapped for delivery.
-
- In general, a permutation operation may be considered as an operation
- on a block of symbols, where each symbol's position is changed to
- another position within the block according to some rule. The blocks
- may overlap each other if the permutation is undone from the end of
- the message to the beginning.
-
- The rule(s) determining how the permutation is done may be fixed,
- or they may depend in some way on a key. Although historical permutation
- ciphers tended to be rather simple, computer methods allow them to
- be potentially rather complex, especially if the symbols that are
- scrambled are individual bits.
-
- Permutation, when used in combination with substitution, is very useful
- in further increasing the complexity of a cipher, since the two algorithms
- act in different ways to baffle the interceptor. This is significant,
- since it is not possible to actually increase the security of a substitution
- cipher by simply repeating the same thing with a different key. The
- result in such a case would be another substitution cipher with the
- same complexity.
-
- C. Noise Addition
-
- Noise addition can either be a way of enhancing another encryption
- scheme, or of hiding the very existence of a message. For example,
- a grid can be set up on a piece of paper where only certain positions
- are significant. These positions are then concatenated together to
- reveal the secret message. The rest of the paper is then filled with
- something that uses those characters, but contains something different,
- like a letter to someone's mother. The same thing can be done electronically,
- by defining a block of bits with only part of the bits significant. The
- rest of the bits are then filled with truly random noise, like the
- output of a Geiger counter. This method can be very effective, especially
- if the position of the information-bearing bits vary from block to
- block in a pseudorandom manner, and if there is a relatively large
- proportion of truly random noise. The obvious disadvantage to this
- kind of scheme is that it makes very poor use of communications channels
- and data storage facilities.
-
- One useful variation on the noise addition method is to multiplex
- several encrypted streams of data (encrypted with another method)
- together with some pseudorandom interleaving. Since the other encrypted
- streams of data act like the random noise added to any one stream
- of data, this is a good way to improve security when a large volume
- of encrypted data must be sent through the same channel.
-
- D. Feedback & Chaining
-
- A block cipher like DES or MPJ, when applied directly to an input
- file with a highly repetitive structure (i. e., lots of spaces between
- columns) will also display some of that structure. Although it may
- not be possible to determine the exact contents of what has been encrypted,
- it may be rather easy to determine something about the nature of what
- has been encrypted. To deny the interceptor even this information,
- and to further increase the complexity of the encrypted information,
- feedback and chaining may be used. Chaining refers to making the
- results of the encryption of one block dependent on previous results. This
- is usually done in one of two ways. One is by adding the plain text
- from the last block to the cipher text of this block, modulo two. The
- other is by adding the cipher text from the preceding block to the
- cipher text of the current block, modulo two. These are referred
- to as plain text feedback and cipher text feedback, respectively.
-
- 1. Plain Text Feedback
-
- Plain text feedback has the property that any errors in transmission
- get propagated clear through the rest of the message. This may be
- a desirable property if it is necessary to detect any tampering with
- the message, but it makes it difficult to use real (error-prone) channels. If
- it is necessary to have the property of error propagation to detect
- tampering and to use real channels, then the message should be transmitted
- using an error detection and correction protocol of some sort.
-
- 2. Cipher Text Feedback
-
- Cipher text feedback also causes some error propagation, but not clear
- through to the end of a message. An error in one block will cause
- an error in that block in the bit positions where the error occurred
- and in all of the next block, but the receiving station can recover
- all subsequent error-free blocks. This method is a good compromise
- between security and error recovery. It is still desirable to wrap
- encrypted data in an error detection and correction protocol to avoid
- having a one bit error wipe out many bits.
-
- E. Analog Encryption
-
- There are several methods of analog encryption. None of them are
- as secure as digital methods can be. Analog encryption is generally
- based on spectrum inversion, spectrum scrambling, time slice scrambling,
- and (for video signals) suppression of synchronization information. These
- elements are also used in combination [LOD]. These methods are commonly
- used by cable TV companies (and sometimes on satellite channels) to
- ensure that people only get the programming that they have paid for. Articles
- on how to build devices to defeat some of these things appear periodically
- in electronic hobbyist magazines.
-
- The state of the art in current use for protection of satellite TV
- signals is the General Instruments Video Cipher II. This device used
- digital (DES) encryption of the audio information and a partially
- analog encryption of the video signal. This device is a deterrent
- to those who receive satellite TV without paying a cable TV operator,
- but it has been broken by a hacker who figured out that it was possible
- to modify the microcode in the device so that a key purchased for
- one machine would work on all of them with modified code. This is,
- of course, illegal, but it allows one person to pay for a service
- and many people to use it for free [DOH]. The problem here is not
- one of the encryption algorithm, but in the key management.
-
- IV. FACTORS RELATED TO ENCRYPTION
-
- A. Change of Language
-
- Although it is not really encryption, the use of a different language
- from what the interceptor is likely to know does increase the difficulty
- of cryptanalysis. For example, I know that it would be far more difficult
- for me to solve even a simple Caesar substitution cipher in Chinese
- than in English, Spanish, or German. This principal was used effectively
- by the United States against Japan in World War II by using the Navajo
- language for spoken radio communications [KAH]. Since the Navajo
- language had no written form, and was only spoken by a small group
- of Native Americans, there were no Japanese who knew the language. The
- Japanese never did figure that ``code'' out. Thus, the Navajo ``Code
- Talkers'' made a great contribution to their country.
-
- Part of the reason for the success of the Navajo Code Talkers was
- that the Japanese had no idea what was going on. Since this success
- is now well known, it is difficult to determine the potential success
- of a repeat of this kind of approach. There are still some languages
- that are spoken only by a very few people in the world, some of which
- have no written form. These could be used to increase the effective
- ``key space.''
-
- Although a change of language was a great victory for the Navajo People
- and the United States of America during World War II, it is difficult
- to adapt this success to computerized communication methods.
-
- B. Digitization
-
- The very act of digitizing an audio or video signal provides some
- protection from the casual eavesdropper. Although this provides no
- protection against a determined snooper, it is an appropriate level
- of additional security for protecting the privacy of things like mobile
- telephone conversations. Of course, once an analog signal is digitized,
- there is a wide range of digital encryption methods available. Of
- course, some methods would be overkill, since the same communication
- would probably be transmitted in the clear over wire and/or microwave
- channels as well as by radio between the car and the stationary cellular
- transceiver.
-
- Although there has been some ill-conceived efforts by the mobile telephone
- industry to legislate privacy for such applications by saying that
- it is illegal to listen to someone else's phone calls around 800 MHz
- (where cellular phones usually operate), this kind of thing is unenforceable.
- Worse yet, such legislation provides a false sense of security in
- a world where scanners and other receivers that operate in that range
- are readily available.
-
- C. Compression
-
- Data compression itself tends to obscure things by taking, for example,
- and ASCII text file in English and reducing it to a binary file that
- is not as easy to read. There is a more important effect of data
- compression, however. The removal of the natural redundancy of the
- plain text data before encrypting it with some kind of encryption
- algorithm makes it much more difficult to cryptanalyze the cipher
- text. In the extreme case where all of the redundancy of a message
- is removed, all messages of a given length would be meaningful, and
- it would be impossible for the cryptanalyst to determine which one
- was intended.
-
- D. Multiplexing
-
- Multiplexed signals are less readable to the casual observer, but
- standard multiplexing offers no real increase in cryptographic strength. It
- can, however, increase the strength of multiple streams of encrypted
- data when the multiplexing is done with a pseudorandom interleave
- (as discussed above under noise addition). This makes the multiplexing
- and demultiplexing processes more complex with respect to synchronization
- and timing, and is likely to introduce more delay into the system
- than more straight forward multiplexing arrangements.
-
- V. COMPARISON OF SELECTED ALGORITHMS
-
- A. One-Time Key Tape
-
- The one-time key tape, also called the one-time pad, has a tremendous
- advantage. It is the only commonly used encryption method that is
- provably secure. Proving the security of most other systems generally
- reduces to an impossible negative proof <197> an attempt to prove
- the lack of a method to defeat it.
-
- The way the one-time key tape works is to add each character of the
- message to a character of the key modulo the alphabet size. When
- working with any binary stream of data, this is easy to implement
- by exclusive-oring the input stream with the key stream. At the receiving
- end, the same thing is done to decrypt the data. Since P + K + K
- = P (modulo 2), the original stream is recovered.
-
- As implied by the name, it is important that each key be used only
- once. If it were used more than once, then the system would be vulnerable
- to attack with the known cipher text and corresponding plain text
- attack. The key is obtained from the boolean identity: K = (P +
- K) + P, where K is the key, P is the plain text, P + K is the cipher
- text, and all additions are modulo the alphabet size (usually 2). The
- key thus recovered would then be used to decipher the next message
- that used it.
-
- According to C. E. Shannon [SHA], for a cipher to be provably secure,
- the number of keys must be as great as the number of potential messages. The
- number of keys he refers to, of course, is the number of keys that
- yields a distinctly different transformation of the message text into
- cipher text.
-
- This method of encryption is provably secure, since an exhaustive
- search of the keys applied to any cipher text of a given length will
- yield all possible plain text messages of the same length. Since
- there are many messages of the same length that make sense, many of
- which have totally contradictory or unrelated meanings, the cryptanalyst
- has no way of knowing which one was intended unless he has the key.
-
- Naturally, any method of encryption this secure has a price. The
- price is the sheer volume of keying material that must be kept secure
- and managed. For any communication network or data storage system
- that handles large amounts of data, key management becomes a nightmare
- with this system.
-
- B. Linear Shift Register Feedback
-
- Linear shift registers with selected feedback taps are commonly used
- to generate pseudorandom sequences for such things as spread spectrum
- communications. They could be (and are, in some cases) applied to
- cryptography by exclusive-oring the pseudorandom stream with the plain
- text stream.
-
- The main weakness to this approach is the cipher text with corresponding
- plain text attack. If the cryptanalyst obtains the plain text that
- came from a certain cipher text, then he can recreate a portion of
- the pseudorandom stream by exclusive-oring the two together. The
- linear shift register feedback function can then be expressed as a
- system of linear boolean equations that will solve the portion of
- the cycle that was received. A solution of this system of equations
- is possible with only enough bits to be a few times as long as the
- shift register at the most [DEN]. Since this amount of plain text
- required to break the cipher is so much less than the length of the
- pseudorandom sequence, it is likely that this solution can then be
- applied to additional cipher text to recover it, too.
-
- C. Exponential Encryption
-
- The classic algorithm of this type is the RSA algorithm, which is
- named after the initials of its creators (R. L. Rivest, A. Shamir,
- and L. Aldeman). The security of this algorithm rests on the
- fact that even with supercomputers, it is very difficult to factor
- the product of two very large prime numbers. The RSA algorithm has
- a unique property in that the key has two parts, one of which may
- be made public. This makes this algorithm very well suited to the
- use of digital signatures.
-
- The RSA algorithm defines the relationships between the plain text
- message, the cipher text, and the elements of the key as follows [JAC]:
-
- C = P^e mod m
-
- P = C^d mod m
-
- m = pq
- where C = Cipher text, P = Plain text, e = encryption key, d = private
- decryption key, m = public modulus, and p and q are randomly chosen
- large (> 150 digits each) prime numbers. The receiver chooses a private
- key d and calculates a public key e. d must be relatively prime to
- (p - 1) and (q - 1). The algorithm works because ed = 1 mod
- the least common multiple of (p - 1)(q - 1). The algorithm
- is only secure if very large prime numbers are used. A number of
- 100 digits is too small - this size of number can be factored
- now with a multiprocessor technique developed by Arjen K. Lenstra
- of the University of Chicago and Mark Manasse of Digital Equipment
- Corporation [COS].
-
- The key generation for RSA is a bit nasty, but is reasonable using
- Rabin's test for primality: Let n = 2^rd+1, where d is odd.
- Choose a at random from 1 < a < n - 1. Accept n as prime if either
- a^d = 1 mod n or a^2jd = -1 mod n for some j such that
- 0 <= j < r, otherwise reject it.
-
- RSA is excellent for public key and authentication use, but it does
- have certain disadvantages for general encryption use. The processing
- time required for key generation and for the encryption/decryption
- process makes it a poor choice for use with high data rates, although
- there are some reasonably efficient ways to do the exponentiation
- [CHI]. The complexity of the algorithm makes it more difficult to
- implement than many others [VAN]. The worst problem, though is the
- way that the complexity of solving the algorithm could be reduced
- by several orders of magnitude by mathematical research. For example,
- J. Sattler and C. P. Schnorr recently published some algorithms that
- would appear to reduce the complexity of solving an RSA key by a factor
- of about 10^4 [SAT]. Granted that we are talking about taking
- 9 x 10^8 years instead of 1.5 x 10^13 years, but it is
- the potential for other discoveries that may drastically reduce this
- time even further that is the real threat.
-
- D. Knapsack
-
- A knapsack or trapdoor algorithm is one that is relatively easy to
- perform, but which is very difficult to invert. The RSA algorithm
- is one such algorithm. Another one, proposed by Ralph C. Merkle,
- uses the following vector operation: given an integer s and an integer
- vector a = a1, a2, ..., an find a vector k
- = x1, x2, ..., xn where xi is in {0,1}
- such that s = x * a. (* denotes dot product.) [MER] This
- algorithm is referred to in another paper as having been broken [VAN]. Another
- algorithm using matrix operations is suggested by H. Retkin [RET]. This
- algorithm may be good, but it is not obvious to me that no one will
- come up with a good solution to breaking it, too. Therefore, I prefer
- to use encryption algorithms that have simple brute force solutions
- that are well understood, but just take too long to perform to be
- of concern.
-
- E. Rotor Machines
-
- Rotor machines were once the state of the art in cryptography. They
- are a way to form a polyalphabetic solution automatically. An excellent
- history of these machines is contained in [KAH], [DEA], and [KOZ]. Although
- they are obsolete now due to the superiority of many computer algorithms,
- they played a major role in the history of cryptography.
-
- There were many different variations on the rotor machine, but the
- main principle of each of them was that the input, which is normally
- a indicated by one of 27 (or whatever the alphabet size in use is)
- wires is connected to a voltage level by a keyboard. This wire is
- then connected via rotary contacts to a rotor that physically permutes
- the position of the wire. This causes a simple substitution for each
- character, or a poly alphabetic substitution. The output is taken
- from rotary contacts on the other side of the rotor. Several of these
- rotors are placed in series. After each character is encrypted, one
- or more of the rotors is moved to a new position by a ratchet mechanism. The
- net effect is a different simple substitution for each letter in the
- message. Common additions to this scheme are a plugboard style permutation
- performed before or after the rotors and the addition of a reflecting
- rotor to run the signal back through the rotors in the opposite direction,
- using each rotor twice.
-
- The key used is a combination of several things: (1) the permutation
- performed by each rotor (changed by rewiring it), (2) the order in
- which the rotors were assembled, (3) the starting position of the
- rotors, (4) the plugboard connections, (4) the number of rotors used,
- and (5) the ratchet arrangement. Taken all together, these form a
- large key space. If the entire key space is varied simultaneously
- and for every message, then this polyalphabetic substitution is very
- secure. Due to the construction of these machines, some of the variables
- were not as easy to change as others. Therefore in practical use,
- the variables were not changed all at once. For example, the number
- of rotors and the ratchet arrangement would probably be fixed for
- the life of the machine. Therefore, if this portion of the key is
- solved for once, that is enough until a new kind of machine replaces
- it, perhaps years later. The rotor wiring is tedious to change, and
- therefore unlikely to be changed very often. The order of the rotors
- might be changed periodically, but during heavy use, the only thing
- that was changed for each message is usually the starting position
- of the rotors.
-
- Using an alphabet size of 26 and five rotors, varying only the starting
- position of the rotors provides a key space of less than 12 million,
- which is within the range of possibility of solution by mechanical
- computer, and a quick task for one of today's PCs. With many messages
- encrypted with the same rotors, the rotor wiring can be solved by
- frequency analysis ``in depth.'' In other words, the permutation
- applied to the first letter of each message can be solved by determining
- the frequency of occurrence of each code letter and comparing that
- with the language of the source language. This can be done for each
- position in the message. From these substitutions, the rotor wiring
- and plugboard connections may be reduced. This is a more lengthy
- process than solving for the rotor starting position, but once done
- the solution is likely to be good for a while. Using this kind of
- approach, isolating the parts of the key, the Allies read many German
- and Japanese messages during World War II.
-
- Rotor machines can be simulated in a computer program with more flexibility
- than in mechanical hardware, with many improvements. There are, however,
- better methods to use in computer algorithms that were not easy to
- use in mechanical devices. Therefore, rotor machines are interesting
- to study for historical value, but they are obsolete at a time when
- computers are becoming almost as common as televisions.
-
- F. Codes
-
- Codes, as opposed to ciphers, operate on linguistic units like words,
- phrases, and sentences, rather than directly on the units of the alphabet. Codes
- have the advantage that they generally compress the message. For
- example, the codeword ``APPLE'' might mean ``The supplies will be
- shipped on Monday via private courier'' and ``GRAPE'' might mean ``The
- bid is too high. Reduce it by 10%.'' If there are only a few different
- messages that might be encoded, a substantial savings in space can
- be realized.
-
- Codes are useful for compression of information, even when no encryption
- is intended. For example, the Uniform Commercial Code, used for telegraphy
- saves on tolls, even though it is published and gives no real security. Another
- example is the use of Q signals on amateur radio, where encryption
- is illegal but brevity is important, especially when using a slower
- mode of communication like Morse Code. There QST means ``The following
- is a message of interest to all Amateurs'' and QSL means ``My location
- is _____.''
-
- When used to hide the meaning of the message, the code must be changed
- often. This is much more difficult to do than to change the key to
- an encryption algorithm. Therefore, this technique has limited value
- except for some diplomatic and military codes. For commercial applications
- where brevity and security are both required, it is easier to first
- encode the plain text with a fixed code to reduce its size, then to
- encrypt the encoded text with an encryption algorithm whose key can
- be easily changed. This technique is more secure than the use of
- the encryption algorithm alone, even if the code used is widely known. See
- chapter VIII for more on the effects of data compression when used
- with encryption.
-
- G. Galois Field and Hill Cryptosystems
-
- A Galois Field cryptosystem is one in which the letters of the encryption
- alphabet are assigned arbitrarily to the integers from 0 to p^n,
- where p is prime and n is an integer. These integers are then operated
- on in blocks of m letters by matrix multiplication with a matrix that
- represents monic irreducible polynomials of order m - 1. All operations
- are done modulo p^n. To decrypt the data, the ciphertext is
- multiplied in the same manner by the inverse of the encryption matrix
- (obtained using the same modulo arithmetic). The resulting numbers
- are then substituted back for the letters they represent.
-
- The Hill cryptosystem is similar, except that n is fixed at one, and
- that p need not be absolutely prime as long as several integers less
- than p are relatively prime to it. For more details on how these
- cryptosystems work, see the articles by Hill, Cooper, and Patti [HIL]
- [HLL] [COO] [PAT].
-
- Both of these systems are based on linear algebra over a limited field
- of integers. Therefore, they are subject to solution of the key in
- use by linear algebra if enough cipher text and corresponding plain
- text is available. If the permutation of the encryption alphabet
- in use is known, then the amount of corresponding plain and cipher
- text needed for a solution is merely the key length. If the permutation
- of the encryption alphabet in use is not known then there must be
- a sufficient quantity to also perform statistical analysis attacks
- on the alphabet in use. Therefore, I do not recommend the use of
- these systems for serious encryption because of these vulnerabilities. Not
- only are they less secure than their key length (the combined length
- of the coefficients of their matrices) would indicate, they are computationally
- less efficient than DES and MPJ. Key generation for these systems
- becomes increasingly complex as the size of the key grows, too, because
- the probability of coming up with a valid set of monic irreducible
- polynomials that form an invertible matrix decreases rapidly with
- matrix size.
-
- These cryptosystems, do, however, provide interesting mathematical
- illustrations on what can be done with Galois Fields. They also provide
- adequate security for cases where the information being secured is
- not of great commercial value, and is therefore unlikely to be cryptanalyzed.
-
- H. DES
-
- The National Bureau of Standards (now known as NITS) Data Encryption
- Standard (DES) was published in the Federal Information Processing
- Standards Publication Number 46, dated January 15, 1977 (FIPS PUB
- 46) [DES]. For details, please refer to that publication. A summary
- of the algorithm follows.
-
- Encryption consists of an initial permutation, sixteen rounds of encryption,
- then an inverse of the initial permutation. Each of the sixteen rounds
- of encryption consist of taking the right half of the input block
- (32 of the 64 bits) and running it through a nonlinear function of
- the 32 bits and an internal key, then adding this result to the left
- half of the input modulo two. This 32 bit answer becomes the next
- round's right half block. The next round's left half becomes the
- right half block without modification. The nonlinear function used
- consists of a bit selection E that selects 48 bits from the input
- of 32 (several of the bits are repeated). These 48 bits are added
- modulo 2 to the round key of 48 bits. The result of that operation
- are then fed six bits each into eight substitution boxes. Each of
- the eight substitution boxes are different, but the same set of eight
- boxes are used for each round. Each substitution box gives an output
- of 4 bits. The output of these boxes are fed into a permutation P
- that rearranges the output in a fixed manner.
-
- The sixteen internal keys are generated from the 56 bit input key
- by feeding the input key into a fixed permutation that rearranges
- the order of the key bits. The key is then split into left and right
- halves called C and D. Each half is shifted left one or two times
- (according to a fixed table) before generating each internal key. Each
- of the sixteen internal keys is generated by taking the two halves
- of the key as shifted and permuting them in a fixed manner.
-
- The key and the resulting internal keys are the only things that vary
- in this algorithm. The initial and final permutations and the contents
- of each of the substitution boxes are constant. The two permutations
- used in generating the internal keys are constant. The bit selection
- and permutation used within the nonlinear function are constant.
-
- The strengths of the DES is that its cryptographic strength depends
- only on the key, that the algorithm is easy to implement in a single
- IC, that it has been well tested an no one has publicly announced
- a solution, that hardware and software that uses it is readily available,
- and that the algorithm places very few restrictions on key generation
- so that random numbers may be generated by the users for use as keys.
-
- The weaknesses of the DES are that the key is too short for security
- in the face of anticipated increases of computing power, that it is
- old enough that it is likely that someone has broken it (found a short-cut
- solution), that hardware implementations of the DES are too slow for
- some applications, and that it limits itself to be simpler than is
- really necessary with current technology.
-
- VI. DESIGN CONSIDERATIONS FOR MPJ
-
- The following paragraphs describe the design considerations that I
- used in creating the MPJ Encryption Algorithm, and the reasons for
- each.
-
- A. Strength Based on Key
-
- The strength of the system must rely on the security of the key only. It
- cannot depend on the algorithm being kept secret, because the algorithm
- will be published. Even if the algorithm were not published, it would
- probably be reverse engineered from software implementations of the
- algorithm. The algorithm must be constructed in such a way that there
- is no computationally feasible way to derive the key from samples
- of corresponding plain text and cipher text.
-
- B. Usability of Random Keys
-
- The key selection should be as easy as the random selection of a number
- in a given range. Selecting a very secure key should be no more difficult
- than flipping a coin once for each bit of the key, or generating keys
- using a pseudorandom sequence combined with random events such as
- timing of keystrokes on a computer. A one bit change in the key should
- provide a drastically different transformation, so that a potential
- cryptanalyst has no idea when a key that he guesses might be ``close''
- to the right one.
-
- C. Key Length & Block Size
-
- The key length should be significantly longer than the DES 56 bit
- size. A key size of 128 bits (sixteen 8-bit bytes) was chosen as
- being very manageable, yet highly secure even when attacked by multiple
- array supercomputers. The block size was also chosen as 128 bits
- (twice the size of DES) to provide a significant increase in complexity
- of the encryption.
-
- D. Effort Required to Break
-
- The effort required to break the algorithm by any method should be
- so great as to make such a task unfeasible even if significant advances
- are made in computer technology.
-
- This requirement is intimately linked to the choice for key size and
- block size. For example, if one thousand keys could be tried every
- nanosecond (10^12 attempts per second), an exhaustive key search
- that resulted in success after only testing one millionth of the possible
- keys (extremely good luck) would take 2^128/((10^12)(10^6))
- = 3.4 x 10^20 seconds = 1 x 10^13 years. If someone figured
- out a computational method or weakness in the algorithm that reduced
- the complexity by a factor of a million, and that someone also figured
- out how to compute the solution a thousand times faster, and they
- still believed in one in a million chances at good luck, they could
- possibly come up with a solution in only 10,000 years. By way of contrast,
- testing keys at the rate of 10^12 per second for DES with its
- 56 bit key, all of the keys can be tested in 7.2 x 10^4 seconds,
- or about 20 hours. While trying a trillion keys a second is not realistic
- with current general purpose computers, current technology could be
- applied to dedicated, highly parallel encryption/decryption engines
- that could approach that speed - at a cost that would be prohibitive
- for all but large governments at today's prices.
-
- Other than time, the other main measure of complexity of an algorithm
- is the storage capacity required to implement a lookup table attack. Suppose
- a lookup table were to be constructed that contained the encrypted
- version of just one block of cipher text corresponding to a very common
- block of plain text (such as 16 ASCII spaces) for every key possible. For
- the MPJ algorithm, this would take 2^128 x 128 = 4.356 x 10^40
- bits. If this memory were constructed with some kind of device that
- could store one bit per atom of silicon, this would take 4.356 x 10^40
- bits x 28.086 amu/atom x 1.660531 x 10^-24 grams/amu x 10^-6
- metric tons/gram = 2.03 x 10^12 metric tons of silicon. That
- is about one thousandth of the mass of Deimos, the smaller of Mars'
- two moons. [CRC] For DES, the same table would only require about
- 215 micrograms of silicon. Nobody has come up with memory that dense,
- and fundamental physical limits make such a task difficult, indeed. It
- is not difficult to conceive of some breakthroughs in optical storage
- technology coming close, say using a thousand molecules of something
- per bit. Under those circumstances, DES would be vulnerable to such
- an attack, where MPJ would still be safe.
-
- E. Computational Efficiency
-
- The MPJ encryption algorithm must be computationally efficient enough
- to be implemented in software on a standard IBM PC or compatible (or
- on an Apple computer of comparable power), and fast enough to handle
- at least 10 megabits per second when implemented in dedicated hardware. Note
- that this is less restrictive with respect to the hardware for DES,
- which was required to be simple enough to implement on a single chip
- using 1970s technology.
-
- F. Communication Channel Efficiency
-
- The MPJ encryption algorithm must not significantly increase the size
- of the plain text when encrypting it. This precludes the use of noise
- addition as a technique to be used.
-
- G. No Back Doors or Spare Keys
-
- While it may be impossible to guarantee that no ``back doors'' or ways
- to decipher a message without the key exist, the algorithm should be
- sufficiently complex combination of simple, well-understood operations that
- no help is offered to the cryptanalyst from the structure of the
- algorithm. Spare keys (the situation where more than one unique key
- will decipher a message) are avoided by making the number of keys possible
- much less than the number of possible transformations that can be
- done on a set of blocks. This is true for MPJ because 2^128
- << 2^128!.
-
- VII. MPJ ENCRYPTION ALGORITHM
-
- A. Description
-
- The MPJ Encryption Algorithm can be looked at from the outside like
- a rather large electronic codebook that operates on 128 bit blocks. The
- algorithm may be used in several modes. It can be used directly in
- electronic codebook mode, in block chaining with ciphertext feedback
- mode, in block chaining with plain text feedback mode, and in stream
- cipher generation mode.
-
- Given a 128 bit key, instructions to encrypt or decrypt, and a 128
- bit input block, a unique 128 bit output block comes out. Encryption
- and decryption are inverse operations that can be done in either order. For
- example, if P is the plain text block, C is the cipher text block,
- EK1 is encryption with key 1, and DK1 is decryption
- with key 1, then C = EK1(P); P = DK1(P). A different
- cipher text results if the plain text is decrypted first, but DK1(EK1(P))
- = P = EK1(DK1(P)). Multiple encryption can be done
- with several keys, as in these examples with three:
-
- C = EK1(EK2(EK3(P))); P = DK1(DK2(DK3(C)))
- or a different method:
-
- C1 = EK1(DK2(EK3(P))); P = DK1(EK2(DK3(C)))
-
- 1. Overall Structure of MPJ
-
- Before encryption or decryption occurs, the substitution boxes are
- filled based on the input key.
-
- To encrypt a 128 bit input block, it is run sequentially through ten
- rounds of substitution. Each round of substitution operates on the
- 16 eight-bit bytes that make up the input block individually. In
- between each round of substitution is a round of permutation, for
- a total of 9 rounds of permutation. The permutation operation is
- identical each time. Each operation (substitution and permutation)
- has an inverse operation. Decryption is done by running the inverse
- operation of each of the above steps in the opposite order.
-
- After one round of permutation, each bit in the output block depends
- on every bit in the byte that it is in and on eight bits of the key. After
- a round of permutation and the second round of substitution, each
- bit in the block depends on eight bytes of the input block and on
- eight bytes in the key. After the third round of substitution, each
- bit in the block is functionally dependent on 15 of the input bytes
- and on 15 bytes of the key. After the fourth round of substitution,
- each bit in the block is functionally dependent on every bit of the
- input block and on each bit of the key. After the tenth round of
- substitution the functional dependence on every bit of the key and
- the input block is so complex that it would be infeasable to determine
- the key used, even if the cryptanalyst has known corresponding plain
- and cipher text and knows the algorithm used. Because the substitution
- boxes are totally nonlinear functions, the problem of finding each
- of the 40,960 internal key bits is a tough one, indeed. It would
- be easier to find the original key of only 128 bits by brute force
- - something that is difficult, indeed.
-
- 2. Substitution Boxes
-
- The substitution boxes are designed to be big enough to provide a
- large number of possible arrangements, but small enough to still fit
- comfortably within the memory of an MS-DOS computer. Each substitution
- box is therefore a collection of 16 substitution boxes that operate
- on only 8 bits at a time, with a total of 256 entries. The substitution
- boxes may be an array or look-up table in a computer, or they may
- be implemented in hardware using RAM.
-
- The substitution boxes are filled during the internal key generation
- process with a permutation of numbers that depends on the main key
- in a different way for each substitution box.
-
- 3. Wire Crossings
-
- The wire crossings serve to extend the functional dependence of the
- output block on the input block across the internal byte boundaries. This
- is done by making each byte of the output of the permutation to contain
- one bit from each of eight different bytes. The selection of bits
- was chosen to make a computer implementation fairly efficient. A
- pure hardware implementation of this operation is, of course, much
- faster, since it involves only the propagation delay of the signal
- from one end of a short wire to the other. The input block bit positions
- are numbered from left to right (MSB to LSB) as 127 down to 0, then
- the new positions those bits occupy after the wire crossing are (MSB
- to LSB):
-
- 55, 46, 37, 28, 19, 10, 1, 120, 111, 102, 93, 84, 75, 66, 57, 48,
- 39, 30, 21, 12, 3, 122, 113, 104, 95, 86, 77, 68, 59, 50, 41, 32,
- 23, 14, 5, 124, 115, 106, 97, 88, 79, 70, 61, 52, 43, 34, 25, 16,
- 7, 126, 117, 108, 99, 90, 81, 72, 63, 54, 45, 36, 27, 18, 9, 0
-
- 4. Key Generation
-
- Internal key generation in the MPJ encryption algorithm consists of
- filling all of the substitution boxes (arrays in software or RAM in
- hardware). There are 16 substitution boxes used for each of 10 rounds. Each
- substitution box contains 256 entries of 8 bits each. Therefore,
- the actual internal keys have a combined length of 16 x 10 x 256 x
- 8 = 327,680 bits. These are obtained by manipulation of the 128 input
- key bits. Note that trying to attack the MPJ encryption algorithm
- by brute force trial and error using internal keys instead of using
- the 128 input key bits is ridiculous. The calculations given above
- indicate how difficult even a 128 bit random key is to solve by brute
- force. The only way that an attack on internal keys is useful is
- if there are parts of the internal keys that can be solved for separately,
- without having to solve for as many as 128 bits at once.
-
- Attacks against the internal key structure are made computationally
- intractable in three ways. First, the smallest chunk of the internal
- key that could be meaningfully solved for is the contents of one of
- the substitution boxes. Each of these contain 256 bytes, or 16 times
- the length of the input key. Remember that the addition of each bit
- doubles the complexity of the solution, so this is not attractive
- compared to solving for the input key. The second impediment to this
- attack is to make the relationship between the contents of the substitution
- boxes and the input key quite nonlinear. The way this is done is
- by using a combination of rotating bit selection and the use of the
- last substitution box filled to compute the contents of the next one. The
- third line of defense against the solution of internal keys is the
- use of ten nonlinear rounds of encryption to make any attempt at constructing
- a system of linear equations to solve for the internal key given known
- plain text and corresponding cipher text infeasable. Note that each
- round of encryption in MPJ causes the whole 128 bit block to be changed,
- where each of the rounds in DES only modifies half of its 64 bit block. Therefore
- the internal structure of MPJ is very difficult to solve for, with
- each bit of the output having a nonlinear functional dependence on
- every bit of the input and on the contents of 1 + 8 + 15 + (7 x 16)
- = 136 of the substitution boxes, containing a total of 34,816 bits. If
- the substitution boxes were linear, this solution would be possible,
- but there is no method I know of to solve such a system of nonlinear
- equations, either here or the simpler case of the DES internal structure.
-
- To find the inverse of the substitution box functions, separate inverse
- substitution boxes are formed that place each address of a substitution
- box at the address of the data contained in the substitution box. For
- example, if the contents of the first substitution box in the first
- round of encryption for the input value of 1 is 219, then the contents
- of the first inverse substitution box in the last round of decryption
- for the input of 219 will be 1. In equations, SI[i, j, S[i, j, k]]
- = k for all i, j, k, where the first index of the array is the round
- of encryption, numbered from 0 to 9, or the round of decryption, numbered
- from 9 to 0 (defined differently for programming convenience); the
- second index is the position of the 8 bits that the substitution applies
- to in the input block, numbered from 0 to 15, left to right; and the
- third index is the value of the 8 bits input to the block.
-
- The actual algorithm for substitution box filling consists of three
- main parts. The first one is a nested loop to fill one substitution
- box, permute the key with the same permutation used for the encryption
- operation, then to run each byte of the key through the substitution
- box that was just filled. This is done for each round of encryption,
- from 0 to 9, with each substitution box used in one round of encryption,
- from left to right, done within each of these rounds. In other words,
- the index of the three dimensional array that selects the position
- of the substitution box within each array varies faster than the index
- that selects the number of the round of encryption that the box uses.
-
- The second algorithm is the one that fills each substitution box based
- on the contents of the input key, as modified by the previous steps. This
- algorithm uses bit selection and rotation to determine where each
- value of the output of the substitution box goes. To be an invertable
- function, each of the possible output values from 0 to 255 must appear
- exactly once somewhere within the array with 256 possible slots, numbered
- from 0 to 255. The first number placed in the array is 255, and there
- are 256 possible places to put it. The second one is 254, with 255
- places to put it. This is continued until the last value, 0 is placed
- in the one remaining slot. For the first half of the values, the
- formula pos = (n * m) div 255 is used to determine which of the unfilled
- slots (counting in index order from 0 to n) is to be used to contain
- n. The variable pos is the position (counting only unfilled slots),
- * denotes multiplication, and div is the integer division operator
- (remainders are ignored). The variable m is determined for the first
- value of n (255) by selecting one bit from each byte of the key, starting
- with the least significant bit from the least significant byte of
- the key. The next bit (place value of 2) is selected from the next
- most significant bit of the key, and so on until m has eight bits. For
- the next value of n (254), the same thing is done, except that the
- bit selected is one bit position to the left of the one selected last
- time. The place value of the bit selected is the same in m as it
- is in the byte it came from. The bit to the left of the MSB in a
- byte is the LSB of the byte to the left of it. The byte to the left
- of the most significant byte is the least significant byte. For the
- values of n from 127 to 0, then the same thing is done, except that
- only seven bits are selected for m, and the position is determined
- from pos = (n * m) div 127. For a more concise explanation of this
- process, see the commented pascal procedure makesbox in the program
- in appendix A.
-
- The third algorithm used in internal key generation is the generation
- of inverse substitution boxes. This is done with a simple nested
- loop that fills the inverse substitution boxes according to the formula
- si[i, j, s[i, j, k]] for i from 0 to 9, j from 0 to 15, and k from
- 0 to 255. The array si is the collection of inverse substitution
- boxes, and the array s is the collection of substitution boxes filled
- by the above two algorithms. Filling of inverse substitution boxes
- is only needed for decryption mode.
-
- B. Implementation in Pascal
-
- Pascal source code for a program to implement the MPJ Encryption Algorithm
- is given in Appendix A.
-
- 1. Exceptions from Standard Pascal
-
- This program was written for MS-DOS computers and compiled using Borland
- Turbo Pascal 5.0. To adapt this program for use on another system,
- note that the following features of Turbo Pascal that do not conform
- to the ANSI/IEEE770X3.97-1983 Standard Pascal were used in this program:
-
- (i) The assign procedure is used to associate an operating system
- file name with a Pascal file name.
-
- (ii) The nonstandard file handling procedures blockread, blockwrite,
- close, and seek, were used. The nonstandard function filepos was
- used.
-
- (iii) The type longint (a 32 bit integer) was used for input handling
- to allow range checking without invoking a system error for a number
- that was only slightly out of range. A 16 bit integer could be used
- instead.
-
- (iv) A generic file type was used, which is not defined in ANSI Pascal.
-
- (v) The nonstandard operators shl and xor were used. The shl could
- be replaced with integer division by 2 (div 2), and the xor could
- be replaced by expressions of the form ((A and (not B)) or ((not A)
- and B)).
-
- (vi) Logical operators (and, or, not, xor) were used with integers
- to perform bit-wise operations.
-
- (vii) The + operator was used to concatenate strings.
-
- (viii) The uses clause links in separately compiled units.
-
- (ix) The include comment {$I filename} includes additional source
- code to be compiled with the current file.
-
- (v) Additional nonstandard features were used in writing startup and
- the file handling functions that are unique to MS-DOS. These functions
- would best be rewritten for a different target system.
-
- 2. Main Program
-
- The main program calls startup to initialize variables, get the key
- from the user, and determine which files are to be encrypted or decrypted.
- It then calls makesbox to generate the internal keys (fill the substitution
- boxes). If the decryption mode is to be used, it calls makesi to fill
- the inverse substitution boxes. It then does the actual encryption
- or decryption.
-
- The main program allows for the encryption of multiple files (specified
- using MS-DOS wildcards), and passes these files to the encryption
- routine one block at a time. To provide additional security, files
- are encrypted in place, with each block of output overwriting the
- corresponding input block in the same place on the disk. If it is
- desired to keep a copy of both encrypted and plain text versions of
- the file, the input file should be copied before running this program.
-
- Because this program uses block chaining with ciphertext feedback,
- only the encryption mode of the MPJ algorithm is used, the source
- of the feedback (input or output) being determined by the decryption
- boolean variable. This mode is used because (1) it obscures repetitive
- patterns in the source file that the electronic codebook mode might
- not hide, (2) it accommodates files of any number of bytes (including
- a short block at the end) with no complications and results in an
- output file that is the same length as the input file, and (3) it
- is slightly faster in operation because the procedure makesi does
- not have to be called. This mode does require an initialization vector,
- but since the initialization vector is encrypted before use, a constant
- initialization vector may be used.
-
- To use the electronic codebook mode, delete the for statement following
- the calls to encrypt, replace one call to encrypt with a call to decrypt,
- and remove the comment delimiters from around the call to makesi.
-
- 3. Procedures Encrypt & Decrypt
-
- These procedures directly reflect the overall structure of MPJ. The
- are simply calls to the routines that perform the substitutions and
- permutations, calling them in the proper order and passing a parameter
- to the substitution routines that select the right set of substitution
- (or inverse substitution) boxes for the operation being done. The
- procedure encrypt makes 10 calls to substitute, with 9 calls to permute
- (one call to permute between each pair of calls to substitute. The
- procedure decrypt makes 10 calls to isubst, with 9 calls to ipermute,
- done in the inverse order of encryption.
-
- 4. Procedures Permute & Ipermute
-
- Permute just scrambles the order of the bits in the 128 bit block
- in such a way as to maximize the dependence of each byte on every
- other byte of the input. Each byte is formed by selecting the least
- significant bit from the least significant bit of the corresponding
- input byte, then the next most significant bit from the next byte
- to the left, and so on. The least significant byte is considered
- to be to the left of the most significant byte. Ipermute is just the
- inverse of this permutation, with bytes being selected to the right
- instead of the left.
-
- 5. Procedures Substitute & Isubst
-
- These procedures are simple array lookups in which each byte of the
- input block is used as the third index of the array s for the procedure
- substitute or si for the procedure isubst. The output is the contents
- of the array. The first index is the round of encryption or decryption
- (counting backwards if it is decryption), and the second index of
- the array is the byte position (0 to 15) within the input and output
- blocks.
-
- C. Implementation in Hardware
-
- For use with high data rates (greater than 10 Megabits per second),
- it is desirable to implement the MPJ algorithm directly in hardware.
- This also provides the advantage of greater security against tampering
- than a program on a general purpose computer might be subject to.
-
- The hardware implementation consists of six basic elements: (1) Key
- generation, (2) serial to parallel conversion, (3) substitution, (4)
- wire crossings, (5) parallel to serial conversion, and (6) timing
- control. The key generation is done by a program running in a microprocessor. This
- program takes the key as input and fills the substitution boxes, which
- are simply static RAM chips. Serial to parallel conversion and parallel
- to serial conversion is done with shift registers. The wire crossings
- are done with physical wire crossings (or printed circuit board trace
- crossings) - a technique that is hard to beat for speed. Timing
- control is done with a sequential circuit that is synchronized to
- the incoming data stream.
-
- The output data stream will, of necessity, be delayed by at least
- the block size of 128 bits, and some kind of protocol must be used
- to synchronize the blocks at the sending and receiving ends.
-
- The following diagram shows a block diagram of a hardware implementation
- of the MPJ Encryption Algorithm. There are, of course, many variations
- possible. For example, block chaining could be used with hardware
- exclusive-or gates to do the modulo 2 addition and flip-flops for
- unit delays.
-
- D. Strengths & Weaknesses
-
- The main strengths of the MPJ encryption algorithm are:
-
- It is much more secure than DES.
-
- When used in conjunction with data compression, MPJ is probably
- more secure than many of the best military and diplomatic codes &
- ciphers.
-
- It is easy to generate a key for MPJ.
-
- It is capable of high data rates when implemented in hardware.
-
- It will run on a personal computer.
-
- It takes more time to generate internal keys from an external
- key than to actually encrypt or decrypt data, making exhaustive key
- search attack more difficult than normal use.
-
- The algorithm is published and may be freely used for any
- legal purpose.
-
- There has not been a lot of time for anyone to search for
- a short-cut solution to MPJ, nor the volume of sensitive material
- to motivate such a search as DES has had.
-
- The main weaknesses of the MPJ encryption algorithm are:
-
- The key and block sizes are fixed (optimized for the memory
- and processing power of the IBM PC), so it doesn't take very good
- advantage of more powerful computers other than running faster.
-
- The software implementation isn't very fast.
-
- There is no way to prove that there is no short-cut solution
- to MPJ (or DES, or almost any encryption algorithm except for the
- one-time key tape).
-
- Since the MPJ encryption algorithm is my own invention,
- I could possibly be blind to some weakness that it might contain.
-
- VIII. DATA COMPRESSION
-
- A. Purpose
-
- Data compression has two obvious purposes: to save on communication
- channel capacity required to send a message and to save on media capacity
- required to store a message. The third purpose for data compression,
- pointed out by C. E. Shannon, is to make decryption of a message more
- difficult [SHA]. In reading through the history of cryptanalysis,
- it became very obvious to me that the one thing that all cryptanalysis
- relies on the most is the presence of a great deal of redundancy present
- in natural languages. This is especially true for messages that the
- cryptanalyst already has something of an idea of what the message
- is about. It is this redundancy that enables the cryptanalyst to
- tell which of several possible solutions is the right one - the
- one that ``makes sense.'' For example, if the message is between
- English-speaking parties, then the of the possible messages ``ACCEPT
- THE PROPOSAL,'' ``A;LSDI FUPE ZAARED,'' and ``XXDKFJ DDDLKJDJFFZA,''
- the first message is instantly recognized as the only one that makes
- sense. The reason that this is so obvious is because of the redundancy
- of natural languages. If all of the redundancy of a message is removed,
- then there is no criteria to select one possible answer over another,
- and the cryptanalyst is left baffled.
-
- The task at hand is to try to eliminate or at least reduce the amount
- of redundancy in a message before encryption, then restore the message
- to its original meaning after decryption. If all of the redundancy
- of a message is removed, it means that all possible messages that
- can be constructed from an alphabet have meaning, and that the length
- of a message is inversely proportional to its probability. This ideal
- can probably be achieved only for special circumstances with very
- limited messages. For use with general correspondence in a natural
- language, however, it is not practical to totally remove the redundancy
- from all messages. In fact, it is very difficult to even measure
- redundancy or entropy in a natural language. The concepts of redundancy
- and entropy (as defined in communication theory texts) are useful
- for comparisons of various methods of redundancy reduction.
-
- Most existing methods of redundancy reduction use either a manually
- constructed code like the amateur radio operator's Q-signals or an
- automatic method that operates on either the alphabet or some groupings
- of alphabet symbols, like trigraphs. Manual techniques are fine for
- their intended application, but automatic methods that are well-suited
- for computer implementation are better for use with encryption. Automatic
- methods are likely to achieve good results with much greater ease
- of use and less opportunities for errors.
-
- The probability of occurrence of various letters in most natural languages
- are fairly constant over a wide range of types of text, and have been
- studied and published many times. These probabilities for English
- letters are published in several of the texts on cryptography listed
- in the references section of this thesis. Probabilities of digraphs
- (groups of two letters) and trigraphs (groups of three letters) have
- likewise been studied, and are fairly constant for a given language. Just
- changing the representation of trigraphs to variable length codes
- that are shorter for more common trigraphs and longer for less common
- trigraphs results in a substantial savings in length of an English
- message.
-
- There are many such methods for compressing data that are fairly straight
- forward. This and other methods, many of which are discussed by James
- Storer in his book on data compression [STO]. There are also several
- public domain and share-ware programs that are available on many bulletin
- boards that perform data compression. My favorite is PKARC, written
- by Phil Katz. It dynamically analyzes each input file to determine
- which one of five compression methods or storing with no compression
- results in the smallest file. The compression methods used by PKARC
- are repetition coding, Huffman encoding, Ziv-Lempel-Welch compression,
- and two different variations on Dynamic Ziv-Lempel-Welch compression
- [KAT]. PKARC also has an option that allows a rather crude encryption
- of the archive files with a password. Although this doesn't provide
- strong cryptographic security, it does help to obscure the redundancy
- that PKARC adds back in, in the form of cyclic redundancy checks on
- files and file headers. The CRC checks and file headers would be
- a good point of vulnerability when trying to cryptanalyze an archived
- (compressed) file.
-
- My feeling was that it was possible to compress natural language text
- even more than was commonly being done by using linguistic parsing
- of the text to form a dictionary. I was partially right. My method
- does remove more redundancy and make most text files smaller and more
- secure. It won't win any speed contests with PKARC, however, which
- I recommend that you use if speed is of great importance or if you
- are going to compress binary files (such as executable programs).
-
- In comparing the performance of SQUEEZE with PKARC, I compressed a
- 4,316,030 byte ASCII text file (the King James Version of the Holy
- Bible) to 1,312,493 bytes with a language code file of 163,236 bytes. PKARC
- compressed the same file to 1,779,184 bytes. This is a removal of
- 69.6% of the redundancy for SQUEEZE (or 65.8% if you add the size
- of the language code file), compared with a redundancy reduction of
- only 58.8% for PKARC. This may not seem like much of a difference,
- but the smaller compressed file fits on a 1.44 Megabyte 3 1/2 inch
- floppy disk, and the other one doesn't. This would be more of a significant
- achievement if SQUEEZE weren't so slow. It took well over two hours
- to do this running on one of the fastest PC's available (25 MHz 80386),
- while PKARC worked for about a minute on the task. I believe that
- both the speed and amount of redundancy reduction can be improved
- upon. I am therefore publishing the source code for what I did in
- Appendix B, so that someone else may be able to build upon this idea.
-
- B. Linguistic Parsing
-
- Instead of developing a Huffman code based on trigraphs or other such
- arbitrary chunks of fixed length, why not develop a Huffman code based
- on the way people read words? Letter combinations made from statistically
- properly distributed trigraphs will probably contain a lot of valid
- English words, but they will contain a lot of garbage, too. If, however,
- all symbols correspond to English words (or punctuation), then a collection
- of random symbols will look a lot more like English. Of course, the
- arrangement of words will probably not make sense in either group,
- but an intuitive analysis indicates that a code based on linguistic
- parsing of words will contain a lot less redundancy than will a fixed-length
- code.
-
- To define linguistic symbols, I called any grouping of alphabetic
- symbols unbroken by non-alphabetic symbols a word symbol. I called
- any group of contiguous spaces a space symbol. I called a carriage
- return + line feed combination a new line symbol. Any other punctuation
- and numbers were called symbols themselves. There may be more efficient
- ways to split up text into symbols, since, for example, words are
- almost always followed by a space. I also counted upper and lower
- case words different symbols. It may have been more efficient to
- use a modifier symbol to indicate that the following word is capitalized
- or written in all capitol letters. Of course, if all of the input
- text is all upper case, as would be the case for some telegraphic
- type communications systems, this is not a concern.
-
- Once a linguistic symbol is detected in the input text, it is encoded
- with its corresponding Huffman code representation. This is then
- decoded at the other end of the communications link or after retrieving
- the data from storage, using the same Huffman code.
-
- So that the Huffman code generation does not have to be done each
- time a file is compressed, an escape code that meant ``the following
- symbol of ____ bits is not in the code, and is to be interpreted directly''
- is used. For these symbols, a simple suppression of the most significant
- bit is used, resulting in a savings of one out of eight bits.
-
- C. Huffman Coding
-
- Given a set of symbols and their associated probabilities of occurrence,
- a Huffman code for those symbols is constructed by generating a tree. Start
- with a set of symbols as individual nodes. While there are still
- separate nodes, combine the two nodes with the lowest probability
- (resolve ties arbitrarily) into a new node with a probability equal
- to the sum of the probabilities of the two nodes that were combined. Attach
- one of the nodes to the new node via a 1 and the other node via a
- 0. After all of the nodes are combined, the code for each symbol
- is the sequence of ones and zeroes assigned to the branches of the
- tree starting at the root and ending at the symbol. [STO]
-
- Note that this algorithm results in a very similar code to a Shannon-Fanno
- code (which generates the tree by starting at the root and building
- out to the leaves), but it is computationally more efficient to implement.
-
- D. Pascal Programs
-
- The pascal programs that implement this linguistic file compression
- are in appendix B. The first one, COUNT, merely counts the frequency
- of occurrence of words in sample text. It is not necessary that the
- text analyzed be the same text to be compressed, just that it be similar. For
- example, better compression would be expected when using a code based
- on personnel records when using that code to compress personnel records
- than when compressing medical discussions. However, the significant
- gains made in compressing the more common words in the code offset
- the lesser compression (or even possible expansion) encountered when
- words not in the code are encountered. It is also desirable to COUNT
- a large sample of text to ensure accurate word frequencies.
-
- The second program, MAKETREE takes the frequency data from the output
- file generated by COUNT and generates a Huffman code from it. It
- then writes this tree out into a file that is used by SQUEEZE. To
- get reasonable speed from this program, it uses as much memory as
- MS-DOS will let it for the more frequently accessed information, and
- uses a temporary file for the rest of the information. The program
- runs significantly faster if this temporary file is on a RAM disk
- in extended or expanded memory (beyond the basic 640 KBytes of the
- MS-DOS domain).
-
- SQUEEZE, the third program, does the squeezing and unsqueezing of
- files based on the output file of MAKETREE. The same Huffman coding
- tree can be used for many different input text files, because the
- statistics of English (especially when the same types of text, such
- as all business letters or all technical reports) remain fairly constant
- even when the content of the text conveys different meanings. SQUEEZE
- handles words not in its dictionary in stride, but highlights them
- as they scroll across the screen in processing so that the user can
- get a feel for how well the text is matched to the code tree in use.
- If only a few proper names and such are highlighted, then the code
- in use is a good one to use. If not, then perhaps a regeneration
- of the code would be in order. Note that the code file used to SQUEEZE
- a file must be identical to the one used to unSQUEEZE it, so it must
- either be stored or sent with the file, or sent in advance and agreed
- upon by communicating parties.
-
- After testing these programs on various ASCII text files, it can be seen that
- significant compression can be obtained, and that the compressed files can be
- recovered reliably. There is, however, room for improvement in the exact way
- that text is parsed and in the speed of the implementation. Nevertheless,
- when maximum security is desired, I recommend first compressing an ASCII text
- file with SQUEEZE, then encrypting it with CRYPTMPJ. The code file
- (LANGUAGE.COD) used by SQUEEZE and the encryption key should be sent by a
- separate, secure channel to the receiving party (or stored in a physically
- secure location for data storage) in advance of the message. The combination
- of minimal natural redundancy in the message sent and the use of a good
- encryption algorithm should be difficult to crack.
-
- IX. CONCLUSION
-
- The MPJ Encryption algorithm provides an alternative to DES that is
- more secure, providing that the software and/or hardware does not
- get corrupted or tampered with, and providing that the keys are managed
- effectively. I have implemented and tested the algorithm in software
- for the IBM PC and compatible machines. The algorithm may be implemented
- directly in hardware for faster data rates. The MPJ Encryption Algorithm
- may be used by itself, or it may be improved upon by compressing the
- data before encryption.
-
- The use of data compression in conjunction with any encryption algorithm
- drastically increases the security of the encrypted data. For natural
- language text, one approach that yields improved compression over
- the compression of constant size blocks of data is the compression
- of linguistic units, such as words. I have achieved reversible compression
- of such files to less than 35% of their original size using linguistic
- parsing and a Huffman code. For binary data, such as computer programs,
- the use of existing techniques, such as those in PKARC, written by
- Phil Katz and available on most computer bulletin boards, are recommended.
-
- It is hoped that the MPJ encryption algorithm, as well as some of
- my ideas on data compression, will make a positive contribution towards
- data security, communications privacy, and efficiency of data storage
- and transmission. May God prosper those who use this knowledge and
- build upon it for good.
-
- REFERENCES
-
- [ALB] Douglas J. Albert and Stephen P. Morse,``Combating
- Software Piracy by Encryption and Key Management,''Computer,
- Vol. 21, No. 5, April 1984, pp. 68-73 Los Alamitos, CA: IEEE Computer
- Society.
-
- [AME] Stanley R. Ames, Jr., Morrie Gasser, and Roger
- R. Schell, ``Security Kernel Design and Implementation: An Introduction,''
- Computer, Vol. 16, No. 7, July 1983, pp. 14-22 Los Alamitos,
- CA: IEEE Computer Society.
-
- [BAL] W. W. Rouse Ball & H. S. M. Coxeter, Mathematical
- Recreations & Essays. New York: The MacMillan Company, 1962.
-
- [BAU] Friedrich L. Bauer, ``Cryptology - Methods
- and Maxims,'' in Proceedings of the Workshop on Cryptography at
- Burg Feuerstein, Germany, March 29-April 2, 1982, Berlin, Heidelberg,
- New York: Springer-Verlag, 1983.
-
- [BEK] H. J. Beker, ``A Survey of Encryption Algorithms,''
- in International Conference on Secure Communication Systems,
- 22-23 February 1984, London, New York: Institute of Electrical
- Engineers, 1984. TK5102.5.I519.1984
-
- [BET] Thomas Beth, Introduction to Proceedings of
- the Workshop on Cryptography at Burg Feuerstein, Germany, March 29
- - April 2, 1982, Berlin, Heidelberg, New York: Springer-Verlag,
- 1983.
-
- [BON] D. J. Bond, ``Practical Primality Testing,'' in
- International Conference on Secure Communication Systems, 22-23
- February 1984, London, New York: Institute of Electrical Engineers,
- 1984. TK5102.5.I519.1984
-
- [BRO] C. B. Brookson and S. C. Serpell, ``Security on
- the British Telecom Satstream Service'' in International Conference
- on Secure Communication Systems, 22-23 February 1984, London,
- New York: Institute of Electrical Engineers, 1984. TK5102.5.I519.1984
-
- [BUR] Dave Bursky, ``Protect Your EEPROM Data and Gain
- More Flexibility,'' Electronic Design, 26 May 1988, pp.43-48.
- Waseca, MN: Brown Printing Co.
-
- [CHA] Leslie S. Chalmers, ``An Analysis of the Differences
- Between the Computer Security Practices in the Military and Private
- Sectors,'' in Proceedings of the IEEE 1986 Symposium on Security
- and Privacy, Oakland, CA, April 7-9, 1986. QA76.9.A25.595.1986
-
- [CHI] H. R. Chivers, ``A Practical Fast Exponentiation
- Algorithm for Public Key,'' in International Conference on Secure
- Communication Systems, 22-23 February 1984, London, New York:
- Institute of Electrical Engineers, 1984. TK5102.5.I519.1984
-
- [COO] R. H. Cooper, ``Linear Transformations in Galois
- Fields and their Application to Cryptography,'' Cryptologia Magazine,
- Volume 4, Number 3, July 1980, pages 184-188. Republished electronically
- by Tony Patti.
-
- [COS] Terry Costlow, ``Global Computer Network Cracks
- Cryptographic Mathematics Barrier,'' Electronic Engineering Times,
- Issue 508, 17 October 1988, p. 14, Manhasset, NY: CMP Publications,
- Inc.
-
- [CRC] Handbook of Chemistry and Physics, 53rd
- Edition, Cleveland, OH: The Chemical Rubber Company, 1972.
-
- [DEA] Cipher A. Deavers and Louis Kruh, Machine
- Cryptography and Modern Cryptanalysis, Norwood, MA: Artech House,
- Inc., 1985. Z103.D43.1985
-
- [DEN] Dorothy Elisabeth Robling Denning, Cryptography
- and Data Security, Reading, MA; Menlo Park, CA; London; Amsterdam;
- Don Mills, Ontario; Sydney: Addison-Wesley Publishing Company, 1982.
- QA76.9.A25.D46.1982
-
- [DES] National Bureau of Standards, Federal Information
- Processing Standards Publication Number 46 Dated 15 January 1977.
-
- [DOH] Richard Doherty, ``FBI Nabs Pirated VCII,'' Electronic
- Engineering Times, Manhasset, N. Y.: CMP Publications, Inc., Issue
- 500, August 22, 1988.
-
- [EET] ``NSA OKs Decryption Unit,'' Electronic Engineering
- Times, August 29, 1988, Manhasset, N. Y.: CMP Publications, Inc.
-
- [FRA] Ole Immanuel Franksen, Mr. Babbage's Secret
- - The Tale of a Cipher and APL, Englewood Cliffs, NJ: Prentice-Hall,
- 1985. Z103.B2.F72.1985
-
- [FRI] H. J. Beker, J. M. K. Friend, and P. W. Halliden,
- ``Simplifying Key Management in Electronic Fund Transfer Point of
- Sale Systems,'' in International Conference on Secure Communication
- Systems, 22-23 February 1984, London, New York: Institute of
- Electrical Engineers, 1984. TK5102.5.I519.1984
-
- [FRM] Lester J. Fraim, ``Scomp: A Solution to the Multilevel
- Security Problem,'' Computer, Vol. 16, No. 7, July 1983,pp.
- 26-34, Los Alamitos, CA: IEEE Computer Society.
-
- [GOO] G. Goos and J. Hartmans, Lecture Notes in
- Computer Science. Z102.5.W67.1982
-
- [GOR] J. A. Gordon and H. Retkin, ``Are Big S-Boxes
- Best?'' in Proceedings of the Workshop on Cryptography at Burg
- Feuerstein, Germany, March 29-April 2, 1982, Berlin, Heidelberg,
- New York: Springer-Verlag, 1983.
-
- [HAE] D. G. Haenshke, D. A. Kettler, and E. Oberer,
- ``Network Management and Congestion in the U. S. Telecommunications
- Network,'' IEEE Transactions on Communications, volume COM-29,pp.
- 376-385, April 1981.
-
- [HEL] Gilbert Held and Thomas R. Marshall, Data
- Compression, Second Edition; Chichester, New York, Brisbane,Toronto,
- Singapore: John Wiley & Sons, 1987. QA76.9.D33.H44.1987
-
- [HIL] Lester S. Hill, ``Cryptography in an Algebraic
- Alphabet,'' American Mathematical Monthly, June 1929, pages
- 306-312, the American Mathematical Society. Republished electronically
- by Tony Patti by permission.
-
- [HLL] Lester S. Hill, ``Concerning Certain Linear Transformation
- Apparatus of Cryptography,'' American Mathematical Society,
- March 1931, pages 135-154. Republished electronically by Tony Patti
- by permission.
-
- [JAC] A. M. Jackson, N. A. McEvoy, and B. B.Newman,
- ``Project Universe Encryption Experiment'' in International Conference
- on Secure Communication Systems, 22-23 February 1984, London,
- New York: Institute of Electrical Engineers, 1984. TK5102.5.I519.1984
-
- [KAH] David Kahn, The Codebreakers - The Story
- of Secret Writing, New York: The MacMillan Company, 1987. Z103.K28
-
- [KRU] Cipher A. Deavers, David Kahn, Louis Kruh, Greg
- Mellen, and Brian Winkel, Cryptology Yesterday, Today and Tomorrow,
- Norwood, MA: Artech House, Inc., 1987. Z103.C76.1987
-
- [LAN] Carl E. Landwehr, The Best Available Technologies
- for Computer Security,'' Computer, Vol. 16, No. 7, July 1983,pp.
- 86-100, Los Alamitos, CA: IEEE Computer Society.
-
- [KAT] Phil Katz, PKARC FAST! Archive Create/Update
- Utility Version 3.5 04-27-87, published electronically in PKX35A35.EXE.
-
- [KON] Alan G. Konheim, Cryptography: A Primer,
- New York: John Wiley & Sons. Z103.K66
-
- [KOZ] Wladyslaw Kozaczuk, Enigma - How the German
- Machine Cipher was Broken and How it was Read by the Allies in World
- War Two, University Publications of America, Inc., 1984.D810.C88.K6813.1984
-
- [LIN] Christer Lind<130>n, ``Electronic Seal for Protection
- of Electronic Money in Sweden, Finland, and Norway'' in Proceedings
- of 1986 International Carnahan Conference on Security Technology:
- Electronic Crime Countermeasures, August 12-14, 1986. QA76.9.A25.C41.1986
-
- [LOD] N. Lodge, B. Flannaghan, and R. Morcom,``Vision
- Scrambling of C-MAC DBS Signals,'' in International Conference
- on Secure Communication Systems, 22-23 February 1984, London,
- New York: Institute of Electrical Engineers, 1984. TK5102.5.I519.1984
-
- [LU] W. P. Lu and M. K. Sandareshan, ``A Hierarchical
- Key Management Scheme for End-to-End Encryption in Internet Environments''
- in Proceedings of the IEEE 1986 Symposium on Security and Privacy,
- Oakland, CA, April 7-9, 1986. QA76.9.A25.595.1986
-
- [MCL] Vin McLellan, ``Drugs and DES: A New Connection,''
- Digital Review, August 24, 1987.
-
- [MER] Ralph C. Merkle, Secrecy, Authentication,
- and Public Key Systems, Ann Arbor, MI: UMI Research Press,1982. QA76.9.A25M47.1982
-
- [MEY] Carl H. Meyer & Stephen M. Matyas,Cryptography: a
- New Dimension in Computer Data Security - A Guide for the Design
- and Implementation of Secure Systems, New York, Chichester, Brisbane,
- Toronto, Singapore: John Wiley & Sons, 1982. Z103.M55.1982
-
- [MIT] C. J. Mitchell, ``A Comparison of the Cryptographic
- Requirements for Digital Secure Speech Systems Operating at Different
- Bit Rates,'' in International Conference on Secure Communication
- Systems, 22-23 February 1984, London, New York: Institute of
- Electrical Engineers, 1984. TK5102.5.I519.1984
-
- [PAT] Tony Patti, A Galois Field Cryptosystem,
- 1986. Published electronically by Tony Patti, editor of Cryptosystems
- Journal, 9755 Oatley Lane, Burke, VA 22015.
-
- [PIP] Fred Piper, ``Stream Ciphers,'' in Proceedings
- of the Workshop on Cryptography at Burg Feuerstein, Germany, March
- 29-April 2, 1982, Berlin, Heidelberg, New York: Springer-Verlag,
- 1983.
-
- [PER] Tekla S. Perry and Paul Wallich, ``Can Computer
- Crime be Stopped?,'' IEEE Spectrum, May, 1984, pp.34-49. New
- York, NY: The Institute of Electrical and Electronic Engineers, Inc.
-
- [RET] H. Retkin, ``Multi-Level Knapsack Encryption,''
- in International Conference on Secure Communication Systems,
- 22-23 February 1984, London, New York: Institute of Electrical
- Engineers, 1984. TK5102.5.I519.1984
-
- [SAT] J. Sattler and C. P. Schnorr, ``Ein Effizienzvergleich
- Der Faktorisierungsverfahren von Morrison-Brillhart und Schroeppel,''
- in Proceedings of the Workshop on Cryptography at Burg Feuerstein,
- Germany, March 29-April 2, 1982, Berlin, Heidelberg, New York:
- Springer-Verlag, 1983.
-
- [SCH] C. P. Schnorr, ``Is the RSA Scheme Safe?'' in
- Proceedings of the Workshop on Cryptography at Burg Feuerstein,
- Germany, March 29-April 2, 1982, Berlin, Heidelberg, New York:
- Springer-Verlag, 1983.
-
- [SHA] C. E. Shannon, ``Communication Theory of Secrecy
- Systems,'' Bell System Technical Journal, Volume 28, 1949.
-
- [STO] James A. Storer, Data Compression Methods
- and Theory. Rockville, MD: Computer Science Press, 1988. QA76.9.D33S76
-
- [TIL] Henk C. A. van Tilborg, An Introduction to
- Cryptology, Boston, Dordrecht, Lancaster: Kluwer Academic Publishers,
- 1988. Z103.T54.1987
-
- [RET] H. Retkin, ``Multi-Level Knapsack Encryption,''
- in International Conference on Secure Communication Systems,
- 22-23 February 1984, London, New York: Institute of Electrical
- Engineers, 1984. TK5102.5.I519.1984
-
- [ULT] Ultron Labs Corporation, ``Crypto Module Processes
- TOP SECRET Data,'' EE Product News, October 1988, p. 16, Overland
- Park, KS: Intertec Publishing.
-
- [VAN] J. Vandewalle, R. Govaerts, W. De Becker, M. Decroos,
- & G. Speybrouk, ``RSA-Based Implementation of Public Key Cryptographic
- Protection in Office Systems'' in Proceedings of 1986 International
- Carnahan Conference on Security Technology: Electronic Crime Countermeasures,
- August 12-14, 1986. QA76.9.A25.C41.1986
-
- APPENDIX A. MPJ in Pascal
-
- {$R-} {Range checking off}
-
- {$S-} {Stack checking off}
-
- {$I+} {I/O checking on}
-
- {$N-} {Don't use 80x87 numeric coprocessor}
-
- program cryptmpj;
-
- { Encryption designed to exceed DES in security value and be implementable
- on
-
- an IBM PC compatible machine. }
-
-
-
- Uses Crt, Dos; { Include screen handling & MS DOS interface functions.
- }
-
-
-
- const maxinbuffer = 16;
-
- ap = #39; { Apostrophe for write
- statements. }
-
-
-
- type blocks = array[0..15] of byte; { 16 byte (128 bit)
- blocks. }
-
- stype = array[0..9,0..15,0..255] of byte; { Holds substitution
- boxes. }
-
- ptrstype = ^stype; { Dynamic variable required to get beyond
- 64K limit. }
-
- string80 = string[80]; { Long character strings.
- }
-
- string15 = string[15]; { Short character
- strings. }
-
-
-
- var initial, { Initialiazation vector. }
-
- feedback, { Chaining feedback array. }
-
- key, { Encryption/decryption key. }
-
- buffer, outbuf: blocks; { File read/write buffers. }
-
- s, { Substitution boxes. }
-
- si: ptrstype; { Inverse substitution boxes. }
-
- bytesdone, { Number of bytes done. }
-
- number: longint; { Used to read in key & initialization
- vector. }
-
- que, { Position in file specification que.
- }
-
- i, j, k, { Iteration & array indexes. }
-
- actualread: integer; { Actual number of bytes read in from
- file. }
-
- inputfile: file; { File to encrypt or decrypt in place.
- }
-
- keyfile: text; { External key file. }
-
- intkeyfile: file of stype; {Holds s and si. }
-
- keyname, { Name of external key file. }
-
- intkeyname, { Name of internal key file. }
-
- path, { Used in parsing file specification.
- }
-
- temp: string[255]; { Used in determining encrypt/decrypt
- option. }
-
- inputfilename, { MS-DOS name of inputfile. }
-
- filespec: string80; { File specification that may include
- wild cards}
-
- fileque: array[0..15] of string80; { Names of multiple file specifications}
-
- name: string15; { Name of next file to process. }
-
- decryption, { True iff decryption is desired. }
-
- filefound: boolean; { At least one file was found to process.}
-
- ch: char; { Character input. }
-
- search: searchrec; { Used in finding matches to wild cards.
- }
-
-
-
- procedure permute(x: blocks; var y: blocks);
-
- { This procedure is designed to make each bit of the output dependent
- on as
-
- many bytes of the input as possible, especially after repeated application.
-
- Each output byte takes its least significant bit from the corresponding
-
- input byte. The next higher bit comes from the corresponding bit
- of the
-
- input byte to the left. This is done until all bits of the output
- byte
-
- are filled. Where there is no byte to the left, the byte at the
- far right
-
- is used. }
-
-
-
- begin
-
- y[0] := (x[0] and 1) or (x[1] and 2) or (x[2] and 4) or
-
- (x[3] and 8) or (x[4] and 16) or (x[5] and 32) or
-
- (x[6] and 64) or (x[7] and 128);
-
- y[1] := (x[1] and 1) or (x[2] and 2) or (x[3] and 4) or
-
- (x[4] and 8) or (x[5] and 16) or (x[6] and 32) or
-
- (x[7] and 64) or (x[8] and 128);
-
- y[2] := (x[2] and 1) or (x[3] and 2) or (x[4] and 4) or
-
- (x[5] and 8) or (x[6] and 16) or (x[7] and 32) or
-
- (x[8] and 64) or (x[9] and 128);
-
- y[3] := (x[3] and 1) or (x[4] and 2) or (x[5] and 4) or
-
- (x[6] and 8) or (x[7] and 16) or (x[8] and 32) or
-
- (x[9] and 64) or (x[10] and 128);
-
- y[4] := (x[4] and 1) or (x[5] and 2) or (x[6] and 4) or
-
- (x[7] and 8) or (x[8] and 16) or (x[9] and 32) or
-
- (x[10] and 64) or (x[11] and 128);
-
- y[5] := (x[5] and 1) or (x[6] and 2) or (x[7] and 4) or
-
- (x[8] and 8) or (x[9] and 16) or (x[10] and 32) or
-
- (x[11] and 64) or (x[12] and 128);
-
- y[6] := (x[6] and 1) or (x[7] and 2) or (x[8] and 4) or
-
- (x[9] and 8) or (x[10] and 16) or (x[11] and 32) or
-
- (x[12] and 64) or (x[13] and 128);
-
- y[7] := (x[7] and 1) or (x[8] and 2) or (x[9] and 4) or
-
- (x[10] and 8) or (x[11] and 16) or (x[12] and 32) or
-
- (x[13] and 64) or (x[14] and 128);
-
- y[8] := (x[8] and 1) or (x[9] and 2) or (x[10] and 4) or
-
- (x[11] and 8) or (x[12] and 16) or (x[13] and 32) or
-
- (x[14] and 64) or (x[15] and 128);
-
- y[9] := (x[9] and 1) or (x[10] and 2) or (x[11] and 4) or
-
- (x[12] and 8) or (x[13] and 16) or (x[14] and 32) or
-
- (x[15] and 64) or (x[0] and 128);
-
- y[10] := (x[10] and 1) or (x[11] and 2) or (x[12] and 4) or
-
- (x[13] and 8) or (x[14] and 16) or (x[15] and 32) or
-
- (x[0] and 64) or (x[1] and 128);
-
- y[11] := (x[11] and 1) or (x[12] and 2) or (x[13] and 4) or
-
- (x[14] and 8) or (x[15] and 16) or (x[0] and 32) or
-
- (x[1] and 64) or (x[2] and 128);
-
- y[12] := (x[12] and 1) or (x[13] and 2) or (x[14] and 4) or
-
- (x[15] and 8) or (x[0] and 16) or (x[1] and 32) or
-
- (x[2] and 64) or (x[3] and 128);
-
- y[13] := (x[13] and 1) or (x[14] and 2) or (x[15] and 4) or
-
- (x[0] and 8) or (x[1] and 16) or (x[2] and 32) or
-
- (x[3] and 64) or (x[4] and 128);
-
- y[14] := (x[14] and 1) or (x[15] and 2) or (x[0] and 4) or
-
- (x[1] and 8) or (x[2] and 16) or (x[3] and 32) or
-
- (x[4] and 64) or (x[5] and 128);
-
- y[15] := (x[15] and 1) or (x[0] and 2) or (x[1] and 4) or
-
- (x[2] and 8) or (x[3] and 16) or (x[4] and 32) or
-
- (x[5] and 64) or (x[6] and 128);
-
- end;
-
-
-
- procedure ipermute(x: blocks; var y: blocks);
-
- { This is the inverse of the procedure permute. }
-
-
-
- begin
-
- y[0] := (x[0] and 1) or (x[15] and 2) or (x[14] and 4) or
-
- (x[13] and 8) or (x[12] and 16) or (x[11] and 32) or
-
- (x[10] and 64) or (x[9] and 128);
-
- y[1] := (x[1] and 1) or (x[0] and 2) or (x[15] and 4) or
-
- (x[14] and 8) or (x[13] and 16) or (x[12] and 32) or
-
- (x[11] and 64) or (x[10] and 128);
-
- y[2] := (x[2] and 1) or (x[1] and 2) or (x[0] and 4) or
-
- (x[15] and 8) or (x[14] and 16) or (x[13] and 32) or
-
- (x[12] and 64) or (x[11] and 128);
-
- y[3] := (x[3] and 1) or (x[2] and 2) or (x[1] and 4) or
-
- (x[0] and 8) or (x[15] and 16) or (x[14] and 32) or
-
- (x[13] and 64) or (x[12] and 128);
-
- y[4] := (x[4] and 1) or (x[3] and 2) or (x[2] and 4) or
-
- (x[1] and 8) or (x[0] and 16) or (x[15] and 32) or
-
- (x[14] and 64) or (x[13] and 128);
-
- y[5] := (x[5] and 1) or (x[4] and 2) or (x[3] and 4) or
-
- (x[2] and 8) or (x[1] and 16) or (x[0] and 32) or
-
- (x[15] and 64) or (x[14] and 128);
-
- y[6] := (x[6] and 1) or (x[5] and 2) or (x[4] and 4) or
-
- (x[3] and 8) or (x[2] and 16) or (x[1] and 32) or
-
- (x[0] and 64) or (x[15] and 128);
-
- y[7] := (x[7] and 1) or (x[6] and 2) or (x[5] and 4) or
-
- (x[4] and 8) or (x[3] and 16) or (x[2] and 32) or
-
- (x[1] and 64) or (x[0] and 128);
-
- y[8] := (x[8] and 1) or (x[7] and 2) or (x[6] and 4) or
-
- (x[5] and 8) or (x[4] and 16) or (x[3] and 32) or
-
- (x[2] and 64) or (x[1] and 128);
-
- y[9] := (x[9] and 1) or (x[8] and 2) or (x[7] and 4) or
-
- (x[6] and 8) or (x[5] and 16) or (x[4] and 32) or
-
- (x[3] and 64) or (x[2] and 128);
-
- y[10] := (x[10] and 1) or (x[9] and 2) or (x[8] and 4) or
-
- (x[7] and 8) or (x[6] and 16) or (x[5] and 32) or
-
- (x[4] and 64) or (x[3] and 128);
-
- y[11] := (x[11] and 1) or (x[10] and 2) or (x[9] and 4) or
-
- (x[8] and 8) or (x[7] and 16) or (x[6] and 32) or
-
- (x[5] and 64) or (x[4] and 128);
-
- y[12] := (x[12] and 1) or (x[11] and 2) or (x[10] and 4) or
-
- (x[9] and 8) or (x[8] and 16) or (x[7] and 32) or
-
- (x[6] and 64) or (x[5] and 128);
-
- y[13] := (x[13] and 1) or (x[12] and 2) or (x[11] and 4) or
-
- (x[10] and 8) or (x[9] and 16) or (x[8] and 32) or
-
- (x[7] and 64) or (x[6] and 128);
-
- y[14] := (x[14] and 1) or (x[13] and 2) or (x[12] and 4) or
-
- (x[11] and 8) or (x[10] and 16) or (x[9] and 32) or
-
- (x[8] and 64) or (x[7] and 128);
-
- y[15] := (x[15] and 1) or (x[14] and 2) or (x[13] and 4) or
-
- (x[12] and 8) or (x[11] and 16) or (x[10] and 32) or
-
- (x[9] and 64) or (x[8] and 128);
-
- end;
-
-
-
- procedure makesbox(key: blocks);
-
- { This procedure generates internal keys by filling the substitution
- box array
-
- s^ based on the external key given as input. }
-
- var i, j, k: integer;
-
- procedure makeonebox(i, j: integer; key: blocks);
-
- var pos, m, n, p, startbit, bitmask, startbyte, keybyte: word;
-
- empty: array[0..255] of boolean;
-
- begin
-
- for m := 0 to 255 do { The empty array is used to make sure
- that }
-
- empty[m] := true; { each byte of the array is filled only
- once. }
-
- startbit := 1;
-
- startbyte := 0;
-
- for n := 255 downto 128 do { n counts the number of bytes left
- to fill }
-
- begin
-
- keybyte := startbyte;
-
- bitmask := startbit;
-
- m := 0;
-
- for p := 0 to 7 do { m is obtained by bit selection on
- the key }
-
- begin
-
- m := m or (key[keybyte] and bitmask);
-
- bitmask := bitmask shl 1;
-
- if bitmask > 128 then
-
- begin
-
- bitmask := 1;
-
- inc(keybyte);
-
- if keybyte > 15 then keybyte := 0;
-
- end;
-
- end;
-
- pos := (n * m) div 255; { pos is the position among the
- UNFILLED }
-
- m := 0; { components of the s^ array that
- the }
-
- p := 0; { number n should be placed. }
-
- while m < pos do
-
- begin
-
- inc(p);
-
- if empty[p] then inc(m);
-
- end;
-
- while not empty[p] do inc(p);
-
- s^[i, j, p] := n;
-
- empty[p] := false;
-
- startbit := startbit shl 1; { The starting position of
- the bit }
-
- if startbit > 128 then { selection for the key is
- rotated }
-
- begin { left one bit for the next
- n. }
-
- startbit := 1;
-
- inc(startbyte);
-
- if startbyte > 15 then startbyte := 0
-
- end;
-
- end;
-
- startbyte := 0;
-
- startbit := 1;
-
- for n := 127 downto 1 do { This half of the algorithm
- is the }
-
- begin { same as the upper half, except
- that }
-
- keybyte := startbyte; { only 7 bits are selected
- for m. }
-
- bitmask := startbit;
-
- m := 0;
-
- for p := 0 to 6 do
-
- begin
-
- m := m or (key[keybyte] and bitmask);
-
- bitmask := bitmask shl 1;
-
- if bitmask > 64 then
-
- begin
-
- bitmask := 1;
-
- keybyte := keybyte + 3;
-
- if keybyte > 15 then keybyte := keybyte - 16;
-
- end;
-
- end;
-
- pos := (n * m) div 127;
-
- m := 0;
-
- p := 0;
-
- while m < pos do
-
- begin
-
- inc(p);
-
- if empty[p] then inc(m);
-
- end;
-
- while not empty[p] do inc(p);
-
- s^[i, j, p] := n;
-
- empty[p] := false;
-
- startbit := startbit shl 1;
-
- if startbit > 64 then
-
- begin
-
- startbit := 1;
-
- inc(startbyte);
-
- if startbyte > 15 then startbyte := 0
-
- end;
-
- end;
-
- p := 0;
-
- while not empty[p] do
-
- inc(p);
-
- s^[i,j,p] := 0;
-
- end;
-
-
-
- begin
-
- new(s);
-
- for i := 0 to 9 do
-
- for j := 0 to 15 do
-
- begin
-
- write(#13,'Filling substitution boxes for round ',i+1,'
- of 10, byte ',
-
- j+1,' of 16. ');
-
- makeonebox(i, j, key);
-
- permute(key, key); { Shuffle key bit positions. }
-
- for k := 0 to 15 do { Run key through last s-box
- before }
-
- key[k] := s^[i, j, key[k]]; { making the next s-box. }
-
- end;
-
- writeln;
-
- end;
-
-
-
- procedure makesi;
-
- { This procedure fills the inverse substitution box array si^. It
- is not
-
- necessary to call this procedure unless the decryption mode is used. }
-
- var i, j, k: integer;
-
- begin
-
- new(si);
-
- for i := 0 to 9 do
-
- for j := 0 to 15 do
-
- for k := 0 to 255 do
-
- si^[i, j, s^[i, j, k]] := k;
-
- end;
-
-
-
- {$I CRYPTINC.PAS} { Include additional file handling & user interface.
- }
-
-
-
- procedure substitute(round: integer; x: blocks; var y: blocks);
-
- var i: integer;
-
- begin
-
- for i := 0 to 15 do
-
- y[i] := s^[round,i,x[i]];
-
- end;
-
-
-
- procedure isubst(round: integer; x: blocks; var y: blocks);
-
- var i: integer;
-
- begin
-
- for i := 0 to 15 do
-
- y[i] := si^[round,i,x[i]];
-
- end;
-
-
-
- procedure encrypt(x: blocks; var y: blocks);
-
- { Encrypt a block of 16 bytes. }
-
- var z: blocks;
-
- begin
-
- substitute(0, x, y);
-
- permute(y, z);
-
- substitute(1, z, y);
-
- permute(y, z);
-
- substitute(2, z, y);
-
- permute(y, z);
-
- substitute(3, z, y);
-
- permute(y, z);
-
- substitute(4, z, y);
-
- permute(y, z);
-
- substitute(5, z, y);
-
- permute(y, z);
-
- substitute(6, z, y);
-
- permute(y, z);
-
- substitute(7, z, y);
-
- permute(y, z);
-
- substitute(8, z, y);
-
- permute(y, z);
-
- substitute(9, z, y);
-
- end;
-
-
-
- procedure decrypt(x: blocks; var y: blocks);
-
- { Decrypt a block of 16 bytes. }
-
- var z: blocks;
-
- begin
-
- isubst(9, x, y);
-
- ipermute(y, z);
-
- isubst(8, z, y);
-
- ipermute(y, z);
-
- isubst(7, z, y);
-
- ipermute(y, z);
-
- isubst(6, z, y);
-
- ipermute(y, z);
-
- isubst(5, z, y);
-
- ipermute(y, z);
-
- isubst(4, z, y);
-
- ipermute(y, z);
-
- isubst(3, z, y);
-
- ipermute(y, z);
-
- isubst(2, z, y);
-
- ipermute(y, z);
-
- isubst(1, z, y);
-
- ipermute(y, z);
-
- isubst(0, z, y);
-
- end;
-
-
-
- begin
-
- startup; { Initialize files, call makesbox (and makesi if required).
- }
-
- que := 0;
-
- while (que < 16) and (fileque[que] <> '') do
-
- begin
-
- filespec := fileque[que];
-
- inputfilename := filespec;
-
- path := '';
-
- repeat
-
- i := pos('\',inputfilename);
-
- if i > 0 then
-
- begin
-
- path := path + copy(inputfilename, 1, i);
-
- inputfilename := copy(inputfilename, i+1, length(inputfilename));
-
- end;
-
- until i = 0;
-
- findfirst(filespec, $22, search);
-
- while doserror = 0 do
-
- begin
-
- inputfilename := path + search.name;
-
- assign(inputfile, inputfilename);
-
- reset(inputfile,1);
-
- filefound := true;
-
- bytesdone := 0;
-
- for i := 0 to 15 do
-
- feedback[i] := initial[i];
-
- if decryption then
-
- begin
-
- writeln('Decrypting ',inputfilename);
-
- while not eof(inputfile) do
-
- begin
-
- blockread(inputfile,buffer,maxinbuffer,actualread);
-
- encrypt(feedback,outbuf);
-
- { Note that decrypt is not ever called when using
- block
-
- chaining with ciphertext feedback. It would be
- called
-
- if the electronic codebook mode were used, instead.
- }
-
- for i := 0 to 15 do
-
- begin
-
- outbuf[i] := outbuf[i] xor buffer[i];
-
- feedback[i] := buffer[i]
-
- end;
-
- seek(inputfile, filepos(inputfile) - actualread);
-
- blockwrite(inputfile, outbuf, actualread);
-
- bytesdone := bytesdone + actualread;
-
- write(#13, bytesdone:9,' bytes decrypted. ');
-
- end;
-
- writeln(#13, bytesdone:9,' bytes decrypted. ');
-
- end
-
- else
-
- begin
-
- writeln('Encrypting ',inputfilename);
-
- while not eof(inputfile) do
-
- begin
-
- blockread(inputfile,buffer,maxinbuffer,actualread);
-
- encrypt(feedback,outbuf);
-
- for i := 0 to 15 do
-
- begin
-
- outbuf[i] := outbuf[i] xor buffer[i];
-
- feedback[i] := outbuf[i]
-
- end;
-
- seek(inputfile, filepos(inputfile) - actualread);
-
- blockwrite(inputfile, outbuf, actualread);
-
- bytesdone := bytesdone + actualread;
-
- write(#13, bytesdone:9,' bytes encrypted. ');
-
- end;
-
- writeln(#13, bytesdone:9,' bytes encrypted. ');
-
- end;
-
- close(inputfile);
-
- findnext(search);
-
- end;
-
- inc(que);
-
- end;
-
- if not filefound then writeln('No matching files found.');
-
- writeln('Done.');
-
- end.
-
- APPENDIX B. Linguistic Data Compression Programs
-
- program count;
-
- {Program to count the probability of occurances of words in text.}
-
-
-
- uses dos, crt;
-
- const maxbuf = 4096;
-
- maxstr = 15;
-
- ap = #39;
-
-
-
- type wd = string[maxstr];
-
- ptr = ^entry;
-
- yarn = string[80];
-
- entry = record
-
- wrd: wd; {Linguistic word}
-
- count: longint; {How many times this word was found}
-
- next: ptr; {Next record in sequence}
-
- end;
-
-
-
- var outfile, {File with output report}
-
- infile: text; {File with input text to be analyzed}
-
- bottom, {Last entry record}
-
- newbuf, {New entry record}
-
- prev: ptr; {Previous entry record}
-
- buffer, {Current entry record}
-
- top: array[1..255] of ptr; {First entry record}
-
- words: wd; {Word under construction}
-
- temp,
-
- path, {Path of input file}
-
- filemask, {File name mask}
-
- filespec, {Path + name (including wild cards)}
-
- inname, {Name of input file}
-
- outname: yarn; {Name of output file}
-
- ch: char;
-
- i, j, cmdparameter, topindex: integer;
-
- cnt: longint;
-
- filefound: boolean;
-
- inbuf, outbuf: array[1..maxbuf] of byte;
-
- s: searchrec; {Record fill: array[1..21] of byte;attr:byte;time,
-
- size:longint;name: string[12]}
-
-
-
- function exist(filename: wd): boolean;
-
- {Returns TRUE iff the file exists.}
-
- var f: file;
-
- begin {exist}
-
- assign(f,filename);
-
- {$I-}
-
- reset(f);
-
- {$I+}
-
- if IOresult = 0 then
-
- begin
-
- exist := true;
-
- close(f)
-
- end
-
- else
-
- exist := false;
-
- end; {exist}
-
-
-
- function punct(ch: char): boolean;
-
- {TRUE iff ch is not a letter or chr(255).}
-
- var c: integer;
-
- begin
-
- c := ord(ch);
-
- if ((c > 10) and (c <= 64)) or ((c >= 91) and (c <= 96))
-
- or ((c >= 123) and (c <= 254)) or ((c > 0) and (c < 10)) then
-
- punct := true
-
- else
-
- punct := false;
-
- end;
-
-
-
- function letter(ch: char): boolean;
-
- {TRUE iff ch is a letter A-Z or a-z.}
-
- var c: integer;
-
- begin
-
- c := ord(ch);
-
- if ((c >= 65) and (c <= 90)) or ((c >= 97) and (c <= 122)) then
-
- letter := true
-
- else
-
- letter := false;
-
- end;
-
-
-
- procedure parse(var words: wd; var ch: char);
-
- {Extracts a word or item of punctuation from input file.}
-
- begin {parse}
-
- words := '';
-
- while not (letter(ch) or punct(ch) or eof(infile)) do
-
- begin
-
- read(infile,ch);
-
- end;
-
- if punct(ch) then
-
- begin
-
- if ord(ch) = 13 then
-
- begin
-
- words := chr(255)+chr(255);
-
- if not eof(infile) then read(infile, ch);
-
- end
-
- else
-
- if ch = ' ' then
-
- begin
-
- words := ch;
-
- while (not eof(infile)) and (ch = ' ') and (length(words)
- < maxstr) do
-
- begin
-
- read(infile,ch);
-
- if ch = ' ' then words := words + ch;
-
- end;
-
- if (length(words) = maxstr) and (not eof(infile)) then
-
- read(infile,ch);
-
- end
-
- else
-
- begin
-
- words := ch; {return one punctuation character}
-
- if not eof(infile) then read(infile, ch);
-
- end;
-
- end
-
- else
-
- begin
-
- if letter(ch) then
-
- begin
-
- words := ch;
-
- while (not eof(infile)) and letter(ch) and (length(words)
- < maxstr) do
-
- begin
-
- read(infile,ch);
-
- if letter(ch) then words := words + ch;
-
- end;
-
- if (length(words) = maxstr) and (not eof(infile)) then
-
- read(infile,ch);
-
- end;
-
- end;
-
- end; {parse}
-
-
-
- procedure update(words: wd);
-
- {Increments the count associated with a word.}
-
- begin
-
- topindex := ord(words[1]);
-
- buffer[topindex] := top[topindex];
-
- {Find record that matches word, if it exists.}
-
- while (buffer[topindex]^.wrd < words) and (buffer[topindex]^.next
- <> nil) do
-
- begin
-
- prev := buffer[topindex];
-
- buffer[topindex] := buffer[topindex]^.next
-
- end;
-
-
-
- if buffer[topindex]^.wrd = words then
-
- {Increment count on word that already exists}
-
- inc(buffer[topindex]^.count)
-
- else
-
- begin {Add word to middle of list with count of one}
-
- new(newbuf);
-
- newbuf^.wrd := words;
-
- newbuf^.count := 1;
-
- newbuf^.next := prev^.next;
-
- prev^.next := newbuf;
-
- end;
-
- end;
-
-
-
- procedure report;
-
- {Produces a report file.}
-
- var total: longint;
-
- begin
-
- assign(outfile, paramstr(1));
-
- rewrite(outfile);
-
- total := 0;
-
- for topindex := 1 to 255 do
-
- begin
-
- buffer[topindex] := top[topindex];
-
- while (buffer[topindex]^.next <> nil) do
-
- begin
-
- buffer[topindex] := buffer[topindex]^.next;
-
- if buffer[topindex]^.count > 0 then
-
- begin
-
- writeln(outfile,buffer[topindex]^.wrd);
-
- writeln(outfile,buffer[topindex]^.count);
-
- total := total + buffer[topindex]^.count;
-
- end;
-
- end;
-
- end;
-
- writeln(outfile, chr(255));
-
- writeln(outfile, '1');
-
- writeln(outfile, 'Total count: ',total);
-
- close(outfile);
-
- end; {report}
-
-
-
- begin
-
- clrscr;
-
- writeln(' This program counts words in a set This program may
- be copied freely ');
-
- writeln(' of input files to determine their for educational
- use or to try it ');
-
- writeln(' frequency of occurance. This data out. If you like
- it, donations ');
-
- writeln(' is then stored in a file for use by will be accepted
- by the author. ');
-
- writeln(' MAKETREE, which constructs a code This program is
- believed to be ');
-
- writeln(' tree with the Huffman algorithm. reliable, but it
- is the user',ap,'s ');
-
- writeln(' The code tree is used by SQUEEZE. responsibility
- to determine its ');
-
- writeln(' fitness for use. No
- liability ');
-
- writeln(' Mike Johnson will be assumed
- by the author. ');
-
- writeln(' P. O. Box 1151 Copyright (C) 1988
- Mike Johnson. ');
-
- writeln(' Longmont, CO 80502-1151 All rights reserved. ');
-
- if paramcount < 2 then
-
- begin
-
- writeln;
-
- writeln('Syntax: COUNT outputfilename inputfilename[s]');
-
- writeln(' input file names may include * and ?');
-
- writeln(' Input files are assumed to be ASCII text.');
-
- writeln;
-
- writeln('If a previous COUNT ouput file named LANGUAGE.OUT exists,
- then new counts');
-
- writeln('will be added to the counts in LANGUAGE.OUT.');
-
- writeln('Maximum effectiveness will be obtained if the nature
- of the files COUNTed');
-
- writeln('is similar to the nature of the files to be compressed
- with SQUEEZE.');
-
- end
-
- else
-
- begin
-
- filefound := false;
-
- new(bottom);
-
- bottom^.wrd := chr(255);
-
- bottom^.count := 0;
-
- bottom^.next := nil;
-
- for topindex := 1 to 255 do
-
- begin
-
- new(top[topindex]);
-
- top[topindex]^.wrd := chr(1);
-
- top[topindex]^.count := 0;
-
- top[topindex]^.next := bottom;
-
- buffer[topindex] := top[topindex];
-
- end;
-
- ch := chr(0);
-
- words := ' ';
-
-
-
- if exist('LANGUAGE.OUT') then
-
- begin
-
- assign(infile,'LANGUAGE.OUT');
-
- reset(infile);
-
- writeln('Reading in existing LANGUAGE.OUT.');
-
- while (not eof(infile)) and (words <> chr(255)) do
-
- begin
-
- readln(infile,words);
-
- if words <> chr(255) then
-
- begin
-
- topindex := ord(words[1]);
-
- readln(infile,cnt);
-
- new(newbuf);
-
- newbuf^.wrd := words;
-
- newbuf^.count := cnt;
-
- newbuf^.next := bottom;
-
- buffer[topindex]^.next := newbuf;
-
- buffer[topindex] := newbuf;
-
- write(#13,words,' ');
-
- end;
-
- end;
-
- close(infile);
-
- writeln;
-
- end;
-
-
-
- {Initialize input files.}
-
- for cmdparameter := 2 to paramcount do
-
- begin
-
- filespec := paramstr(cmdparameter);
-
- filemask := filespec;
-
- path := '';
-
- repeat
-
- i := pos('\',filemask);
-
- if i > 0 then
-
- begin
-
- path := path + copy(filemask, 1, i);
-
- filemask := copy(filemask, i+1, length(filemask));
-
- end;
-
- until i = 0;
-
- findfirst(filespec, readonly + hidden + archive, s);
-
- while doserror = 0 do
-
- begin
-
- filefound := true;
-
- inname := path + s.name;
-
- assign(infile, inname);
-
- reset(infile);
-
- words := ' ';
-
- writeln('>>>>>>>>>>>>>>>>>>>>> Processing ',inname);
-
- while (not eof(infile)) and (words <> '') and (not keypressed)
- do
-
- begin
-
- parse(words,ch); {Find word or punctuation}
-
- if words = chr(255)+chr(255) then
-
- writeln
-
- else
-
- write(words);
-
- if length(words) > 0 then
-
- update(words); {Increment word counter}
-
- end;
-
- close(infile);
-
- findnext(s);
-
- end;
-
- end;
-
-
-
- if not filefound then writeln('No matching files found.');
-
- report;
-
- end;
-
- end.
-
- {$R+}
-
- {$M 3072,4096,655360}
-
- program maketree;
-
-
-
- uses crt;
-
- const maxstr = 15;
-
- ap = #39;
-
-
-
- type wd = string[maxstr];
-
- yarn = string[80];
-
- mem = ^mementry;
-
- entry = record
-
- wrd: wd; {Which linguistic word in input file}
-
- memory: mem; {Corresponding mementry.}
-
- end;
-
-
-
- mementry = record
-
- count: longint; {Word count.}
-
- onezero: byte; {Is this record pointed to by 1
- or 0?}
-
- root, {Position of record nearer root.}
-
- next: mem; {Position of next record.}
-
- end;
-
-
-
- var temp: file of entry; {Temporary file}
-
- infile: text; {File with input text to be analyzed}
-
- buffer, {Current entry record}
-
- newbuf: entry; {New entry record}
-
- mtop,
-
- mbottom,
-
- mbuffer,
-
- mnewbuf,
-
- mprev: mem;
-
- inname, {Name of input file}
-
- outname, {Name of output file}
-
- words: wd; {Word under construction}
-
- ch: char;
-
- cnt1,
-
- cnt2,
-
- cnt: longint;
-
- i, j: integer;
-
- tempname: yarn;
-
-
-
- function exist(filename: wd): boolean;
-
- {Returns TRUE iff the file exists.}
-
- var f: file;
-
- begin {exist}
-
- assign(f,filename);
-
- {$I-}
-
- reset(f);
-
- {$I+}
-
- if IOresult = 0 then
-
- begin
-
- exist := true;
-
- close(f)
-
- end
-
- else
-
- exist := false;
-
- end; {exist}
-
-
-
- procedure fanno;
-
- begin
-
- cnt1 := 0;
-
- repeat
-
-
-
- {Find smallest two records.}
-
- mprev := mtop^.next;
-
- mbuffer := mprev^.next;
-
-
-
- {Combine lowest 2 probabilities of counts.}
-
- mtop^.next := mbuffer^.next;
-
- new(mnewbuf);
-
- mnewbuf^.count := mbuffer^.count + mprev^.count;
-
- mnewbuf^.onezero := 2;
-
- mnewbuf^.root := nil;
-
- mprev^.root := mnewbuf;
-
- mprev^.onezero := 1;
-
- mbuffer^.root := mnewbuf;
-
- mbuffer^.onezero := 0;
-
-
-
- {Place combined entry in proper place in chain.}
-
- mbuffer := mtop;
-
- repeat
-
- mprev := mbuffer;
-
- mbuffer := mbuffer^.next;
-
- until (mnewbuf^.count <= mbuffer^.count);
-
- mprev^.next := mnewbuf;
-
- mnewbuf^.next := mbuffer;
-
-
-
- {Check for completion.}
-
- inc(cnt1);
-
- write(#13,cnt1,' nodes combined. ');
-
- until (cnt1 >= cnt2);
-
- end; {fanno}
-
-
-
- procedure placeit;
-
- begin
-
- write(#13,cnt:11,' ',words,' ');
-
- mbuffer := mtop;
-
- repeat
-
- mprev := mbuffer;
-
- mbuffer := mbuffer^.next;
-
- until mbuffer^.count >= cnt;
-
- new(mnewbuf);
-
- newbuf.wrd := words;
-
- newbuf.memory := mnewbuf;
-
- write(temp,newbuf);
-
- mnewbuf^.count := cnt;
-
- mnewbuf^.onezero := 2;
-
- mnewbuf^.root := nil;
-
- mnewbuf^.next := mbuffer;
-
- mprev^.next := mnewbuf;
-
- end;
-
-
-
- procedure filegen;
-
- var cod: file of byte;
-
- ascnum: yarn;
-
- c: char;
-
- b, d: byte;
-
- i, shifter: integer;
-
-
-
- begin
-
- assign(cod,outname);
-
- rewrite(cod);
-
- reset(temp);
-
- repeat
-
- read(temp,buffer);
-
- mbuffer := buffer.memory;
-
- if buffer.wrd <> '' then
-
- begin
-
- for i := 0 to length(buffer.wrd) do
-
- begin
-
- b := ord(buffer.wrd[i]);
-
- write(cod,b);
-
- end;
-
- ascnum := '';
-
- while (mbuffer^.root <> nil) do
-
- begin
-
- if mbuffer^.onezero = 0 then
-
- ascnum := ascnum + '0'
-
- else
-
- ascnum := ascnum + '1';
-
- mbuffer := mbuffer^.root;
-
- end;
-
- d := ord(ascnum[0]);
-
- write(cod,d);
-
- shifter := 1;
-
- b := 0;
-
- for i := d downto 1 do
-
- begin
-
- if ascnum[i] = '1' then b := b + shifter;
-
- shifter := shifter shl 1;
-
- if shifter >= 256 then
-
- begin
-
- write(cod,b);
-
- b := 0;
-
- shifter := 1
-
- end;
-
- end;
-
- if shifter > 1 then write(cod,b);
-
- write(#13,buffer.wrd:maxstr,' = ',ascnum,' ');
-
- end;
-
- until eof(temp);
-
- close(cod);
-
- close(temp);
-
- erase(temp);
-
- end;
-
-
-
- begin
-
- clrscr;
-
- writeln(' MAKETREE takes the word count This program may
- be copied freely ');
-
- writeln(' data created by COUNT and creates for educational
- use or to try it ');
-
- writeln(' a code tree using the Huffman out. If you like
- it, donations ');
-
- writeln(' algorithm. The resulting code tree will be accepted
- by the author. ');
-
- writeln(' is used by SQUEEZE when it This program is
- believed to be ');
-
- writeln(' compresses ASCII text files. reliable, but it
- is the user',ap,'s ');
-
- writeln(' responsibility
- to determine its ');
-
- writeln(' fitness for use. No
- liability ');
-
- writeln(' Mike Johnson will be assumed
- by the author. ');
-
- writeln(' P. O. Box 1151 Copyright (C) 1988
- Mike Johnson. ');
-
- writeln(' Longmont, CO 80502-1151 All rights reserved. ');
-
- if paramcount < 2 then
-
- begin
-
- writeln('Syntax: MAKETREE infile outfile [tempfile]');
-
- writeln(' infile is input file name (generated
- by COUNT)');
-
- writeln(' outfile is output file name to be
- used by SQUEEZE');
-
- writeln(' tempfile is filename to use for temporary
- file');
-
- writeln('The temporary file is best put on a RAM disk in extended
- or expanded memory,');
-
- writeln('but a hard disk or even a floppy will do if that is
- the fastest you have.');
-
- writeln('Don',ap,'t use a RAM disk in conventional (lower 640K)
- memory, as this program');
-
- writeln('already uses most of that in generating the coding
- tree.');
-
- end
-
- else
-
- begin
-
- inname := paramstr(1);
-
- outname := paramstr(2);
-
- if exist(inname) then
-
- begin
-
- assign(infile,inname);
-
- reset(infile);
-
- end
-
- else
-
- begin
-
- writeln('Unable to open ',inname);
-
- exit
-
- end;
-
- if paramcount > 2 then
-
- tempname := paramstr(3)
-
- else
-
- tempname := 'ZYXWVUTS.$$$';
-
- assign(temp,tempname);
-
- {$I- }
-
- rewrite(temp);
-
- {$I+ }
-
- if ioresult <> 0 then
-
- begin
-
- writeln('I/O error attempting to open temporary file ',tempname);
-
- halt;
-
- end;
-
- new(mtop);
-
- new(mbottom);
-
- mtop^.count := -1;
-
- mtop^.onezero := 2;
-
- mtop^.root := nil;
-
- mtop^.next := mbottom;
-
- mbottom^.count := 2147483647;
-
- mbottom^.onezero := 2;
-
- mbottom^.root := nil;
-
- mbottom^.next := nil;
-
- cnt2 := 2;
-
- writeln('Reading in and sorting word frequency data.');
-
- repeat
-
- readln(infile,words);
-
- readln(infile,cnt);
-
- placeit;
-
- inc(cnt2);
-
- until (words = chr(255)) or eof(infile);
-
- close(infile);
-
- words := chr(255) + chr(255);
-
- placeit;
-
- words := words + chr(255);
-
- placeit;
-
- words := words + chr(255);
-
- placeit;
-
- writeln(#13,cnt2,' nodes to add to Huffman coding tree. ');
-
- fanno;
-
- writeln;
-
- writeln('Writing output file.');
-
- filegen;
-
- writeln(#13,'Done. ');
-
- end;
-
- end.
-
- {$R+}
-
- {$M 2048,4096,655360}
-
- program squash;
-
-
-
- uses dos, crt;
-
-
-
- const maxbuf = 4096;
-
- maxstr = 15;
-
- ap = #39;
-
-
-
- type wd = string[maxstr];
-
- yarn = string[80];
-
- ptr = ^entry;
-
- entry = record
-
- wrd: wd; {Linguistic word}
-
- count: byte; {How many bits in code}
-
- code: longint; {Code (right justified in 4 byte integer)}
-
- next: ptr; {Pointer to next entry}
-
- end;
-
-
-
- var outfile, {File with output report}
-
- infile: file; {File with input text to be analyzed}
-
- temp,
-
- path, {Path of input file}
-
- filemask, {File name mask}
-
- filespec, {Path + name (including wild cards)}
-
- inname, {Name of input file}
-
- outname: yarn; {Name of output file}
-
- words: wd; {Word under construction}
-
- ch: char;
-
- top,
-
- bottom: array[1..32] of ptr;
-
- last,
-
- newbuf,
-
- newline,
-
- endfile,
-
- buf: ptr;
-
- mask,
-
- masked,
-
- place,
-
- long,
-
- cd,
-
- cnt: longint;
-
- infilepos, {Current position in input file buffer}
-
- inbitpos, {Current bit mask in input file buffer}
-
- inbytes, {Actual number of bytes read}
-
- outfilepos, {Current position in output file buffer}
-
- outbitpos, {Current bit mask in output file buffer}
-
- outbytes: word; {Actual number of bytes to write}
-
- cmdparameter, {Command line parameter number}
-
- start, {Starting point for command line file
- name scan}
-
- strlen,
-
- bite,
-
- b, i, j, k, m: integer;
-
- squish, filefound, done: boolean;
-
- outbite: byte;
-
- inbuf,
-
- outbuf: array[1..maxbuf] of byte;
-
- s: searchrec; {Record fill: array[1..21] of byte;attr:byte;time,
-
- size:longint;name: string[12]}
-
-
-
- function exist(filename: yarn): boolean;
-
- {Returns TRUE iff the file exists.}
-
- var f: file;
-
- begin {exist}
-
- assign(f,filename);
-
- {$I-}
-
- reset(f);
-
- {$I+}
-
- if IOresult = 0 then
-
- begin
-
- exist := true;
-
- close(f)
-
- end
-
- else
-
- exist := false;
-
- end; {exist}
-
-
-
- procedure getbit(var b: integer);
-
- begin
-
- if infilepos = 0 then
-
- begin
-
- if eof(infile) then
-
- done := true
-
- else
-
- begin
-
- blockread(infile, inbuf, maxbuf, inbytes);
-
- infilepos := 1;
-
- inbitpos := 1;
-
- end;
-
- end;
-
- if not done then
-
- begin
-
- b := inbuf[infilepos] and inbitpos;
-
- if b > 0 then b := 1;
-
- inbitpos := inbitpos shl 1;
-
- if inbitpos > 128 then
-
- begin
-
- inbitpos := 1;
-
- inc(infilepos);
-
- if infilepos > inbytes then infilepos := 0;
-
- end;
-
- end;
-
- end;
-
-
-
- procedure getbite(var b: integer);
-
- begin
-
- if infilepos = 0 then
-
- begin
-
- if eof(infile) then
-
- begin
-
- done := true;
-
- b := 0
-
- end
-
- else
-
- begin
-
- blockread(infile, inbuf, maxbuf, inbytes);
-
- infilepos := 1;
-
- end;
-
- end;
-
- if not done then
-
- begin
-
- if inbitpos > 1 then
-
- begin
-
- inc(infilepos);
-
- writeln('Bit sync error in getbite!')
-
- end;
-
- b := inbuf[infilepos];
-
- inbitpos := 1;
-
- inc(infilepos);
-
- if infilepos > inbytes then infilepos := 0;
-
- end;
-
- end;
-
-
-
- procedure putbit(b: integer);
-
- begin
-
- if (b > 1) or (b < 0) then writeln('Bad data passed to putbit.');
-
- outbite := outbite or (outbitpos * b);
-
- outbitpos := outbitpos shl 1;
-
- if outbitpos > 128 then
-
- begin
-
- inc(outfilepos);
-
- outbuf[outfilepos] := outbite;
-
- outbite := 0;
-
- outbitpos := 1;
-
- if outfilepos >= maxbuf then
-
- begin
-
- blockwrite(outfile, outbuf, maxbuf, outbytes);
-
- if outbytes < maxbuf then
-
- begin
-
- writeln('Disk full error.');
-
- done := true;
-
- end;
-
- outfilepos := 0;
-
- end;
-
- end;
-
- end;
-
-
-
- procedure putbite(b: integer);
-
- begin
-
- if (b > 255) or (b < 0) then writeln('Bad data passed to putbite.');
-
- outbite := b;
-
- inc(outfilepos);
-
- outbuf[outfilepos] := outbite;
-
- outbite := 0;
-
- outbitpos := 1;
-
- if outfilepos >= maxbuf then
-
- begin
-
- blockwrite(outfile, outbuf, maxbuf, outbytes);
-
- if outbytes < maxbuf then
-
- begin
-
- writeln('Disk full error.');
-
- done := true;
-
- end;
-
- outfilepos := 0;
-
- end;
-
- end;
-
-
-
- procedure flushbit;
-
- begin
-
- if outbitpos > 1 then
-
- begin
-
- inc(outfilepos);
-
- outbuf[outfilepos] := outbite;
-
- end;
-
- blockwrite(outfile, outbuf, outfilepos, outbytes);
-
- outbite := 0;
-
- outbitpos := 1;
-
- outfilepos := 0;
-
- end;
-
-
-
- function punct(ch: char): boolean;
-
- {TRUE iff ch is not a letter or chr(255).}
-
- var c: integer;
-
- begin
-
- c := ord(ch);
-
- if ((c > 10) and (c <= 64)) or ((c >= 91) and (c <= 96))
-
- or ((c >= 123) and (c <= 254)) or ((c > 0) and (c < 10)) then
-
- punct := true
-
- else
-
- punct := false;
-
- end;
-
-
-
- function letter(ch: char): boolean;
-
- {TRUE iff ch is a letter A-Z or a-z.}
-
- var c: integer;
-
- begin
-
- c := ord(ch);
-
- if ((c >= 65) and (c <= 90)) or ((c >= 97) and (c <= 122)) then
-
- letter := true
-
- else
-
- letter := false;
-
- end;
-
-
-
-
-
-
-
- procedure parse(var words: wd; var ch: char);
-
- {Extracts a word or item of punctuation from input file.}
-
- var byt: integer;
-
- begin {parse}
-
- words := '';
-
- while not (letter(ch) or punct(ch) or done) do
-
- begin
-
- getbite(byt);
-
- ch := chr(byt);
-
- end;
-
- if punct(ch) then
-
- begin
-
- if ord(ch) = 13 then
-
- begin
-
- words := chr(255)+chr(255);
-
- if not done then begin getbite(byt); ch := chr(byt) end;
-
- end
-
- else
-
- if ch = ' ' then
-
- begin
-
- words := ch;
-
- while (not done) and (ch = ' ') and (length(words) < maxstr) do
-
- begin
-
- getbite(byt);
-
- ch := chr(byt);
-
- if ch = ' ' then words := words + ch;
-
- end;
-
- if (length(words) = maxstr) and (not done) then
-
- begin
-
- getbite(byt);
-
- ch := chr(byt)
-
- end;
-
- end
-
- else
-
- begin
-
- words := ch; {return one punctuation character}
-
- if not done then begin getbite(byt); ch := chr(byt) end;
-
- end;
-
- end
-
- else
-
- begin
-
- if letter(ch) then
-
- begin
-
- words := ch;
-
- while (not done) and letter(ch) and (length(words) < maxstr)
- do
-
- begin
-
- getbite(byt);
-
- ch := chr(byt);
-
- if letter(ch) then words := words + ch;
-
- end;
-
- if (length(words) = maxstr) and (not done) then
-
- begin
-
- getbite(byt);
-
- ch := chr(byt)
-
- end;
-
- end;
-
- end;
-
- end; {parse}
-
-
-
- begin
-
- clrscr;
-
- writeln(' SQUEEZE is designed to remove This program may
- be copied freely ');
-
- writeln(' redundancy from ASCII text files, for educational
- use or to try it ');
-
- writeln(' making them smaller for transmission out. If you like
- it, donations ');
-
- writeln(' and storage. It also improves the will be accepted
- by the author. ');
-
- writeln(' security of files encrypted after This program is
- believed to be ');
-
- writeln(' SQUEEZING. It uses LANGUAGE.COD, reliable, but it
- is the user',ap,'s ');
-
- writeln(' created with COUNT and MAKETREE. responsibility
- to determine its ');
-
- writeln(' fitness for use. No
- liability ');
-
- writeln(' Mike Johnson will be assumed
- by the author. ');
-
- writeln(' P. O. Box 1151 Copyright (C) 1988
- Mike Johnson. ');
-
- writeln(' Longmont, CO 80502-1151 All rights reserved. ');
-
- if (paramcount < 3) then
-
- begin
-
- writeln;
-
- writeln('Syntax: SQUEEZE a|s|u|x outputfile inputfile(s)');
-
- writeln('a and s both mean add or squash, u and x both mean
- unsquash or extract.');
-
- writeln('Input and output files must be different.');
-
- writeln('Input file name(s) may include wild cards.');
-
- writeln('LANGUAGE.COD (made with MAKETREE) must be in default
- directory.');
-
- end
-
- else
-
- begin
-
- {Read in language code dictionary.}
-
- assign(infile,'LANGUAGE.COD');
-
- reset(infile,1);
-
- infilepos := 0;
-
- inbitpos := 1;
-
- done := eof(infile);
-
- newline := nil;
-
- last := nil;
-
- endfile := nil;
-
- if paramcount >= 1 then
-
- temp := paramstr(1)
-
- else
-
- temp := 'X';
-
- temp := upcase(temp[1]);
-
- if (temp = 'A') or (temp = 'S') then
-
- squish := true
-
- else
-
- squish := false;
-
- writeln('Reading in LANGUAGE.COD.');
-
- for i := 1 to 32 do
-
- begin
-
- new(top[i]);
-
- top[i]^.wrd := '';
-
- top[i]^.count := 0;
-
- top[i]^.code := 0;
-
- top[i]^.next := nil;
-
- bottom[i] := top[i];
-
- end;
-
- last := nil;
-
- while not done do
-
- begin
-
- new(newbuf);
-
- newbuf^.wrd := '';
-
- getbite(strlen);
-
- if not done then
-
- begin
-
- for i := 1 to strlen do
-
- begin
-
- getbite(bite);
-
- newbuf^.wrd := newbuf^.wrd + chr(bite);
-
- end;
-
- getbite(bite);
-
- newbuf^.count := bite;
-
- getbite(bite);
-
- newbuf^.code := bite;
-
- if newbuf^.count > 8 then
-
- begin
-
- getbite(bite);
-
- long := bite;
-
- newbuf^.code := newbuf^.code + (long shl 8);
-
- if newbuf^.count > 16 then
-
- begin
-
- getbite(bite);
-
- long := bite;
-
- newbuf^.code := newbuf^.code + (long shl 16);
-
- if newbuf^.count > 24 then
-
- begin
-
- getbite(bite);
-
- long := bite;
-
- newbuf^.code := newbuf^.code + (long shl
- 24);
-
- end;
-
- end;
-
- end;
-
- newbuf^.next := nil;
-
- if newbuf^.wrd = chr(255) then
-
- begin
-
- last := newbuf;
-
- end
-
- else
-
- if newbuf^.wrd = chr(255) + chr(255) then
-
- newline := newbuf
-
- else
-
- if newbuf^.wrd = chr(255) + chr(255) + chr(255)
- then
-
- endfile := newbuf;
-
- if squish then
-
- begin
-
- if ord(newbuf^.wrd[1]) > 107 then
-
- i := length(newbuf^.wrd) + 15
-
- else
-
- i := length(newbuf^.wrd)
-
- end
-
- else
-
- i := newbuf^.count;
-
- bottom[i]^.next := newbuf;
-
- bottom[i] := newbuf;
-
- write(#13,newbuf^.wrd,' ');
-
- end;
-
- end;
-
- if newline = nil then
-
- writeln('Error: newline symbol not in LANGUAGE.COD! ');
-
- writeln;
-
- close(infile);
-
-
-
- {Initialize input & output files and global variables.}
-
- outname := paramstr(2);
-
- filefound := false;
-
- assign(outfile,outname);
-
- if exist(outname) then
-
- begin
-
- reset(outfile,1);
-
- seek(outfile,filesize(outfile))
-
- end
-
- else
-
- rewrite(outfile,1);
-
-
-
- outfilepos := 0;
-
- outbitpos := 1;
-
- outbite := 0;
-
- if squish then start := 3 else start := 2;
-
- for cmdparameter := start to paramcount do
-
- begin
-
- filespec := paramstr(cmdparameter);
-
- filemask := filespec;
-
- path := '';
-
- repeat
-
- i := pos('\',filemask);
-
- if i > 0 then
-
- begin
-
- path := path + copy(filemask, 1, i);
-
- filemask := copy(filemask, i+1, length(filemask));
-
- end;
-
- until i = 0;
-
- findfirst(filespec, readonly + hidden + archive, s);
-
- while doserror = 0 do
-
- begin
-
- filefound := true;
-
- inname := path + s.name;
-
- assign(infile, inname);
-
- reset(infile,1);
-
- infilepos := 0;
-
- inbitpos := 1;
-
- done := eof(infile);
-
-
-
- {Perform compression on input file(s).}
-
- if squish then
-
- begin
-
- writeln('Compressing ',inname);
-
- ch := chr(0);
-
- while not done do
-
- begin
-
- parse(words,ch); {Find word or punctuation}
-
- if length(words) > 0 then
-
- begin
-
- if ord(words[1]) > 107 then
-
- buf := top[length(words) + 15]
-
- else
-
- buf := top[length(words)];
-
- mask := 1;
-
- while (buf^.wrd <> words) and (buf^.next <>
- nil) do
-
- buf := buf^.next;
-
- if buf^.wrd <> words then
-
- begin
-
- highvideo;
-
- if words = chr(255) + chr(255) then
-
- writeln('Newline code not found in tree!
- ')
-
- else
-
- write(words);
-
- for i := 1 to last^.count do
-
- begin
-
- masked := last^.code and mask;
-
- mask := mask shl 1;
-
- if masked = 0 then
-
- putbit(0)
-
- else
-
- putbit(1);
-
- end;
-
- for i := 0 to length(words) do
-
- begin
-
- for j := 0 to 6 do
-
- begin
-
- b := (ord(words[i]) shr j) and
- 1;
-
- putbit(b)
-
- end;
-
- end;
-
- end
-
- else
-
- begin
-
- for i := 1 to buf^.count do
-
- begin
-
- masked := buf^.code and mask;
-
- mask := mask shl 1;
-
- if masked = 0 then
-
- putbit(0)
-
- else
-
- putbit(1);
-
- end;
-
- lowvideo;
-
- if words = chr(255) + chr(255) then
-
- writeln
-
- else
-
- write(buf^.wrd);
-
- end;
-
- end;
-
- end;
-
- mask := 1;
-
- for i := 1 to endfile^.count do
-
- begin
-
- masked := endfile^.code and mask;
-
- mask := mask shl 1;
-
- if masked = 0 then
-
- putbit(0)
-
- else
-
- putbit(1);
-
- end;
-
- end
-
- else
-
-
-
- {Decompress input file.}
-
- begin
-
- writeln('Decompressing ',inname);
-
- cd := 0;
-
- mask := 1;
-
- cnt := 0;
-
- while not done do
-
- begin
-
- getbit(b);
-
- if b > 0 then cd := cd or mask;
-
- mask := mask shl 1;
-
- if mask <= 0 then writeln('Error: code too long.');
-
- inc(cnt);
-
- buf := top[cnt];
-
- while (buf^.next <> nil) and
-
- ((cnt <> buf^.count) or (cd <> buf^.code))
- do
-
- buf := buf^.next;
-
- if (cnt = buf^.count) and (cd = buf^.code) then
-
- begin
-
- if buf = endfile then
-
- done := true
-
- else
-
- if buf^.wrd = chr(255) then
-
- begin
-
- j := 0;
-
- for k := 0 to 6 do
-
- begin
-
- getbit(b);
-
- j := j + (b shl k);
-
- end;
-
- for i := 1 to j do
-
- begin
-
- m := 0;
-
- for k := 0 to 6 do
-
- begin
-
- getbit(b);
-
- m := m + (b shl k)
-
- end;
-
- putbite(m);
-
- highvideo;
-
- write(chr(m));
-
- end;
-
- end
-
- else
-
- if buf^.wrd = chr(255) + chr(255) then
-
- begin
-
- writeln;
-
- putbite(13);
-
- putbite(10);
-
- end
-
- else
-
- begin
-
- for i := 1 to length(buf^.wrd) do
-
- begin
-
- b := ord(buf^.wrd[i]);
-
- putbite(b)
-
- end;
-
- lowvideo;
-
- write(buf^.wrd)
-
- end;
-
- cd := 0;
-
- cnt := 0;
-
- mask := 1;
-
- end;
-
- end;
-
- end;
-
-
-
- {Finish up or loop back for more work to do.}
-
- flushbit;
-
- close(infile);
-
- findnext(s);
-
- end;
-
- end;
-
-
-
- if not filefound then writeln('No matching files found.');
-
- flushbit;
-
- close(outfile);
-
- end;
-
- end.
-
-
-
- {That's all, folks!}
-
-
-
-