- From: gnu@hoptoad.uucp (John Gilmore)
- Newsgroups: comp.os.msdos.apps,sci.crypt
- Subject: Word Perfect "locked document encryption" is trivial to break
- Message-ID: <12163@hoptoad.uucp>
- Date: 27 Aug 90 22:58:27 GMT
- Organization: Cygnus Support, Palo Alto
-
- One thing that came up at Crypto '90 was a short paper from Ms. Helen
- Bergen at Queensland U. in Australia. She noticed the 'locked
- document' commands in Word Perfect, used by all the secretaries in her
- dept., and looked to see how strong it was. It turned out that the
- MSDOS DEBUG command and an envelope for scratch paper are enough for
- anyone to decode both a document AND the key used for it! Word Perfect
- Corp. didn't care about her results (letter reproduced below), but I
- thought that some Word Perfect losers, I mean users, here on the net
- might want to know.
-
- You should consider WP locked documents like ROT13: fine to keep the
- text garbled until you type a command, useless for keeping things private.
-
- John Gilmore
-
- From: <CSZBERGEN@qut.edu.au>
- Date: Mon, 27 Aug 90 10:28 +1000
- To: cygint!gnu
-
- Dear John,
-
- Here is the letter and a copy of the Latex source of my paper. It
- will be published in CRYPTOLOGIA in the near future. Thanks for your
- interest,
-
- Regards,
- Helen Bergen
-
- ****************************************************
- Quote from letter received from WordPerfect Pacific:
-
- Thankyou for the copy of your paper entitled "File Security in
- WordPerfect 5.0". I sent a copy of the paper to WordPerfect Corporation
- in the USA and recently received a reply from them.
-
- They confirmed that people have written programs to break the password.
- However, WordPerfect Corporation does not have such a program and
- therefore has no way of breaking it. They also pointed out that very
- few users would know how to write such a program.
-
- It is possible that the manual may be amended in a future edition to
- clarify the protection that a password gives. They recommend that
- anyone concerned about security may want to take higher precautions
- than the password protection.
-
- Thankyou for your interest in WordPerfect
-
- ********************************
-
- FILE SECURITY IN WORDPERFECT 5.0
-
- H.A. Bergen School of Computing Science
- W.J. Caelli Information Security Research Centre
-
- Faculty of Information Technology
- Queensland University of Technology
- G.P.O. Box 2434, Brisbane, Q 4001, AUSTRALIA
-
- ABSTRACT: Cryptanalysis of files encrypted with the 'locked document'
- option of the word processing package WordPerfect V5.0, is shown to be
- remarkably simple. The encryption key and the plaintext are easily
- recovered in a ciphertext only attack. File security is thus
- compromised and is not in accord with the claim by the manufacturer
- that: "If you forget the password, there is absolutely no way to
- retrieve the document".
-
- KEYWORDS: Cryptanalysis, WordPerfect.
-
- INTRODUCTION
-
- WordPerfect is one of the most popular word processing packages in use
- today. It has a 'locked document' option which aims at protection of a
- WordPerfect file from unauthorised access. The manual states "You can
- protect or lock your documents with a password so that no one will be
- able to retrieve or print the file without knowing the password - not
- even you". The manual also claims that "If you forget the password,
- there is absolutely no way to retrieve the document" [1].
-
- This option is used to 'add' a password to an existing or newly created
- WordPerfect file. The file is then encrypted using the password as the
- cryptographic key, and is stored on disk. Any subsequent retrieval or
- printing of the file via WordPerfect requires the entry of the correct
- password. With the increasing use of distributed file systems and
- sharing of data, this option might appear to be an attractive means of
- protecting sensitive files, particularly where they may reside on a
- shared network server. It is easily implemented without the expense
- and installation of another software protection/encryption package.
-
- The encryption algorithm used in the WordPerfect 4.2 version, however,
- was successfully cryptanalysed by Bennett [2]. He concluded that the
- encryption system was unsatisfactory for protection of sensitive
- documents.
-
- The present study extends this work to an investigation of the security
- of the WordPerfect 5.0 encryption system on both the IBM PC and DEC VAX
- systems as well as WordPerfect 5.1 on the IBM PC.
-
- WORDPERFECT FILES
-
- WordPerfect version 5.0 was used on an IBM-PC and other compatible
- systems to create various files consisting of original documents and
- their associated ciphertext with different passwords. The DOS utility
- DEBUG was used to display the content of the files in hexadecimal
- notation.
-
- The WordPerfect files were created on three different systems. By this
- we mean, three different licenced copies of WordPerfect running on
- different Personal Computers with different printers. An example from
- just one of these systems has been given in detail.
-
- Version 4.2 format
-
- Files created under 4.2 contain just the ASCII representation of the
- character text. Printer definitions and setup parameters are in
- separate files and are used only when the file is to be printed.
-
- For example, a file may contain zeros (ASCII code 30 hex) and new line
- characters (these are converted to the ASCII line feed character, 0A
- hex). The plaintext file in hexadecimal would be
-
- 30 30 30 30 30 30 30 30 30 30 0A
- 30 30 30 30 30 30 30 30 30 30 0A
- 30 30 30 30 30 30 30 30 30 30
-
- The corresponding ciphertext file with a key value of the ASCII letter A is
-
- FE FF 61 61 41 00
- 73 72 75 74 77 76 79 78 7B 7A 47
- 7C 7F 7E 61 60 63 62 65 64 67 5C
- 69 68 6B 6A 6D 6C 6F 6E 51 50
-
- Encrypted files contain an extra 6 bytes, shown in the first line of
- the above. The first 4 bytes are constant for all keys and are used by
- the WordPerfect program to determine whether the file is plaintext or
- ciphertext. The latter 2 bytes contain a checksum derived from the
- key, as described by Bennett [2].
-
- For example,
- FE FF 61 61 43 00 key = C
- FE FF 61 61 71 C0 key = AA
-
- Version 5.0 format
-
- Files created under version 5.0 are stored in a different format. With
- the default WordPerfect format, the file contains the document text
- appended to printer setup information. There are other options to save
- the file in DOS text format or in 4.2 format, and in these the printer
- information is omitted. For example, a document containing 32
- characters of text is saved in 5.0 format as a file of approximately
- 600 - 1000 bytes (depending on the particular printer system) and in
- 4.2 or DOS format as a file of 32 bytes.
-
- The locked document option in version 5.0 allows encryption of files
- only in WordPerfect format, the one containing all the printer
- information.
-
- * All version 5.0 files, original and encrypted forms have
- the same four characters in byte positions 0 - 3 :
-
- FF 57 50 43 (HEX)
- or W P C (ASCII)
-
- These codes were unchanged for files created on three different
- systems, i.e., three different licenced copies of WordPerfect 5.0 on
- three different PC's using different printers.
-
- * BYTES 4 - 7 hold the offset address of the text, i.e. the start of
- the document text, stored low byte first (6B 02 00 00 in the example
- file later on means 026B hex, i.e. byte 619).
-
- * BYTES 8 - 11 are constant for all files:
-
- 01 0A 00 00
-
- * ENCRYPTED AND PLAINTEXT FILES DIFFER FROM HERE ON (key checksum in
- bytes 12 - 13, encrypted data from byte 16)
-
- * BYTES 12 - 15 are constant for a plaintext file:
-
- 00 00 00 00
-
- For an encrypted file, however, bytes 12 and 13 in the above contain a
- checksum related to the key value used. This checksum appears to be
- the same as that used in the 4.2 version [2].
-
- * BYTES 16 - 21 were constant for files prepared on the three
- different systems and contained:
-
- FB FF 05 00 32 00
-
- * BYTES 22 - 31. Of these 10 bytes, 22, 23, 28 are file and system
- dependent, but bytes 24 - 27 and 29 - 31 are constant (00 00 07 00 and
- 00 00 00 respectively).
-
- * BYTES 32 - 39 were constant for files prepared on the three
- different systems and contained
-
- 42 00 00 00 02 00 56 00
-
- * BYTES 40 - 47. Of these 8 bytes, 42, 46, 47 are file and system
- dependent, but bytes 40, 41, 43, 44, 45 are constant with value 00.
-
- * The remaining bytes of the printer header information are
- dependent on the particular hardware and printer in use. These might
- change according to the printer setup values.
-
- * The remaining bytes are document text. The offset address in
- bytes 4 - 5 gives the start of the document text. This is dependent on
- the size of the printer information and this can obviously change from
- one system to another.
-
- * Other systems may well have different printers or setup parameters
- which change some of the bytes that we found to be constant. In general
- though, there will be a reasonable number of constant known plaintext
- bytes.
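-
- The fixed fields listed above can be checked mechanically. The
- following is a minimal Python sketch, not part of the original
- analysis. The function name describe_wp5_header is hypothetical, the
- little-endian reading of bytes 4 - 7 is inferred from the example file
- (6B 02 00 00, document text at byte 619), and treating a non-zero
- checksum as "encrypted" is only a heuristic.
-
```python
import struct

WP5_MAGIC = bytes.fromhex("FF575043")  # FF 'W' 'P' 'C'


def describe_wp5_header(header: bytes) -> dict:
    """Summarise the first 16 bytes of a WordPerfect 5.0 file."""
    if len(header) < 16 or header[:4] != WP5_MAGIC:
        raise ValueError("not a WordPerfect 5.0 file header")
    # Assumption: bytes 4-7 are a little-endian offset to the document text.
    (text_offset,) = struct.unpack_from("<I", header, 4)
    checksum = header[12:14]
    return {
        "text_offset": text_offset,
        "checksum": " ".join(f"{b:02X}" for b in checksum),
        # Heuristic: plaintext files carry 00 00 here.
        "encrypted": checksum != b"\x00\x00",
    }


# Bytes 0-15 of the plaintext and ciphertext example files from the paper.
plain_header = bytes.fromhex("FF5750436B020000010A000000000000")
cipher_header = bytes.fromhex("FF5750436B020000010A00006E500000")

print(describe_wp5_header(plain_header))
print(describe_wp5_header(cipher_header))
```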
-
- ANALYSIS
-
- The encryption algorithm was found to be the same as that used in the
- 4.2 version [2]. The main differences between version 4.2 and 5.0 are
- in the file formats.
-
- Bytes 0 - 15 of the original and encrypted files contain some useful
- information. The offset address in bytes 4-5 gives the starting point
- of the document text. The checksum of the key in the encrypted file is
- in bytes 12 - 13. This gives the key directly if the key is a single
- character.
-
- The encryption of the file starts at byte number 16, so all the printer
- information as well as the document is encrypted.
-
- The Encryption Algorithm
-
- * Firstly, the ciphertext is XORed with an ascending sequence of
- bytes based on the sequence in Hexadecimal :
-
- 02 03 04 ... 79 7A 7B ... FD FE FF 00 01 02 03 ...
-
- Note that the sequence repeats from 00 not 02 after reaching FF. The
- keylength determines the starting point of the sequence to be used, ie.
-
- starting point = keylength + 1
-
- For example, for key = QWERTY the starting point of the ascending
- sequence would be at position 6 in the sequence giving a starting value
- of 07.
-
- * Secondly, the resulting text is XORed with the key characters,
- repeated in blocks of key length, to restore the original plaintext.
- Since both steps are XORs, applying the same two steps to plaintext
- produces the ciphertext. This type of polyalphabetic substitution is
- called a Vigenere cipher. The analysis of Vigenere ciphers is well
- known and covered in the standard cryptography literature, e.g. [3,4,5].
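-
- The two XOR steps can be sketched in a few lines of Python. This is an
- illustration written from the description above, not the WordPerfect
- code itself; the name wp_xor is hypothetical. It is checked against the
- version 4.2 example given earlier (three lines of zeros, key A).
-
```python
def wp_xor(data: bytes, key: bytes) -> bytes:
    """Apply both XOR steps; the same call encrypts and decrypts."""
    if not key:
        raise ValueError("key must not be empty")
    start = len(key) + 1  # starting value of the ascending sequence
    out = bytearray()
    for i, byte in enumerate(data):
        counter = (start + i) % 256   # ... FD FE FF 00 01 02 ...
        key_char = key[i % len(key)]  # key repeated in blocks of key length
        out.append(byte ^ counter ^ key_char)
    return bytes(out)


plaintext = b"0000000000\n0000000000\n0000000000"
ciphertext = wp_xor(plaintext, b"A")
print(ciphertext.hex(), wp_xor(ciphertext, b"A") == plaintext)
```
-
- The output starts 73 72 75 74, matching the ciphertext listed for the
- version 4.2 file after its 6 byte prefix.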
-
- Plan of attack
-
- In the 4.2 version, the only text encrypted was that contained in the
- actual document. This is unknown plaintext. In version 5.0, however,
- the printer information as well as the document text is encrypted. We
- have identified bytes 16 - 21, 24 - 27, 29 - 41, 43 - 45 as being
- constant for a particular system (as defined earlier, a particular
- licenced copy of WordPerfect on a particular PC and printer), and they
- do not change markedly from one system to another.
-
- So we have the ideal situation of known plaintext for a reasonable
- number of bytes. This can greatly simplify our attack as it makes it
- possible to recover the actual key. Then it is trivial to recover the
- plaintext by using WordPerfect to retrieve the file with the
- recovered key as the "password". Alternatively, a program could be
- written to do this, as the encryption/decryption algorithm is known. We
- outline a strategy with the following example from one particular
- system:
-
- Document text consists of three lines of ten ASCII zeros each. The
- size of the original file and the encrypted file is 651 bytes.
-
- 0000000000
- 0000000000
- 0000000000
-
- Plaintext file contains in hexadecimal (for a particular printer):
-
- BYTES 0-15 FF 57 50 43 6B 02 00 00-01 0A 00 00 00 00 00 00
- 16-31 FB FF 05 00 32 00 2D 02-00 00 07 00 11 00 00 00
- 32-47 42 00 00 00 02 00 56 00-00 00 53 00 00 00 0C 00
- . ........
- .
- 619-623 30 30 30 30 30
- 624-639 30 30 30 30 30 0A 30 30-30 30 30 30 30 30 30 30
- 640-650 0A 30 30 30 30 30 30 30-30 30 30
-
- Ciphertext file contains in hexadecimal:
-
- BYTES 0-15 FF 57 50 43 6B 02 00 00-01 0A 00 00 6E 50 00 00
- 16-31 B0 B4 42 41 7E 47 6C 46-53 53 58 59 45 5F 59 4C
- 32-47 19 5B 57 51 5E 57 07 74-63 63 3C 69 64 6F 65 7C
- . .......
- .
- 619-623 19 14 1F 19 0C
- 624-639 1B 1B 17 11 1C 2D 11 14-03 03 0F 09 04 0F 09 1C
- 640-650 31 0B 07 01 0C 07 01 E4-F3 F3 FF
-
- We will illustrate a known ciphertext only attack, even though we
- obviously know the exact plaintext in this particular example. So we
- assume that we have a ciphertext file produced on some other hardware
- system using a different licenced copy of WordPerfect. As explained
- earlier, we can be confident that a substantial portion of text is
- common to all systems. Thus to summarise, the known plaintext we have
- is
-
- BYTES 16 - 21 known
- BYTES 22 - 31 known except for 22, 23, 28
- BYTES 32 - 39 known
- BYTES 40 - 47 known except for 42, 46, 47
-
- * Firstly, look at bytes in positions 12 - 15 in the ciphertext file
- above which contain the checksum of the key. If the key is one
- character, it will be evident in byte number 12. For longer keys bytes
- 12 and 13 are probably non zero. In this example the checksum is 6E 50
- which implies a key size greater than 1.
-
- * Now we consider bytes 16 - 47. For byte number 16, we will try to
- deduce the key character used. To do this, choose a keylength, starting
- with likely values of, say, 4 to 10 characters. Then XOR the known
- plaintext byte with the corresponding value of the ascending sequence
- (see the algorithm section), which starts at the value keylength + 1.
- XORing that result with the associated ciphertext byte yields the
- candidate key character. For example,
-
- keylength of 4 keylength of 8
-
- plaintext FB 1111 1011 FB 1111 1011
- sequence 05 0000 0101 09 0000 1001
- xor --------- ---------
- 1111 1110 1111 0010
-
- cipher B0 1011 0000 B0 1011 0000
- xor --------- ---------
- 0100 1110 0100 0010
- => key 4 E 4 2
-
- Thus we get the following table:
-
- keylength starting possible
- sequence key character
-
- 4 05 4E
- 5 06 4D
- 6 07 4C
- 7 08 43
- 8 09 42
- 9 0A 41
- 10 0B 40
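-
- The table can be regenerated with a short Python sketch (the helper
- name candidate_key_byte is hypothetical). It uses only the known
- plaintext byte FB and the ciphertext byte B0 at position 16.
-
```python
def candidate_key_byte(plain: int, cipher: int, keylength: int,
                       offset: int = 0) -> int:
    """Key byte implied at `offset` bytes past position 16."""
    counter = (keylength + 1 + offset) % 256
    return plain ^ counter ^ cipher


print("keylength  sequence  key")
for keylength in range(4, 11):
    start = keylength + 1
    guess = candidate_key_byte(0xFB, 0xB0, keylength)
    print(f"{keylength:>9}  {start:02X}        {guess:02X}")
```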
-
- * Now for a keylength of 4, byte 16 gives a possible key character
- of 4E. Bytes 20, 24, 28 ... must also have been created from the same
- key character, so we deduce a potential key character for these other
- bytes to see if it is also 4E. It turns out that the other potential
- key characters are not 4E.
-
- * So we take the next possible key length, 5. Deduce the key
- character for bytes 21, 26 .. to see if they match the value for byte
- number 16 for that keylength, which is 4D. They do not.
-
- * When a match is obtained for the first key character, deduce the
- key characters for the remaining positions. We show the full analysis
- for bytes 16 - 31 for a keylength of 8. As we stated earlier, bytes at
- positions 22, 23 and 28 are unknown, and we signify these as ??.
-
- BYTES 16 - 23
- plaintext FB FF 05 00 32 00 ?? ??
- sequence 09 0A 0B 0C 0D 0E 0F 10
- xor ------------------------------
- F2 F5 0E 0C 3F 0E ?? ??
-
- cipher B0 B4 42 41 7E 47 6C 46
- xor ------------------------------
- => key 42 41 4C 4D 41 49 ?? ??
-
- BYTES 24 - 31
- plaintext 00 00 07 00 ?? 00 00 00
- sequence 11 12 13 14 15 16 17 18
- xor ------------------------------
- 11 12 14 14 ?? 16 17 18
-
- cipher 53 53 58 59 45 5F 59 4C
- xor -------------------------------
- => key 42 41 4C 4D ?? 49 4E 54
-
- * The repeating sequence for the key is obvious, even with three
- unknown bytes at positions 22, 23 and 28, and so the key characters
- are:
-
- 42 41 4C 4D 41 49 4E 54
- B A L M A I N T
-
- * Further checks on the key could be done using the known bytes from
- 32-47, if the repeating pattern of the key characters is ambiguous.
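-
- The hand search over keylengths can also be automated. The sketch
- below is an illustration only, with hypothetical function names. It
- assumes the known plaintext for bytes 16 - 47 listed earlier, rejects a
- keylength as soon as two positions imply different key characters, and
- returns the shortest keylength that is consistent and fully
- determined. Longer keylengths with little overlap between known
- positions may also pass the consistency check, which is why the
- search stops at the shortest one.
-
```python
# Known plaintext for file offsets 16-47; None marks the file and system
# dependent bytes 22, 23, 28, 42, 46 and 47.
KNOWN = [0xFB, 0xFF, 0x05, 0x00, 0x32, 0x00, None, None,
         0x00, 0x00, 0x07, 0x00, None, 0x00, 0x00, 0x00,
         0x42, 0x00, 0x00, 0x00, 0x02, 0x00, 0x56, 0x00,
         0x00, 0x00, None, 0x00, 0x00, 0x00, None, None]


def keys_for_length(cipher: bytes, keylength: int):
    """Return the implied key bytes, or None if two positions disagree."""
    key = [None] * keylength
    for i, (plain, enc) in enumerate(zip(KNOWN, cipher)):
        if plain is None:
            continue
        guess = plain ^ ((keylength + 1 + i) % 256) ^ enc
        slot = i % keylength
        if key[slot] is None:
            key[slot] = guess
        elif key[slot] != guess:
            return None
    return key


def shortest_consistent_key(cipher: bytes, max_length: int = 24):
    """Try keylengths 1..max_length; return the first complete key."""
    for keylength in range(1, max_length + 1):
        key = keys_for_length(cipher, keylength)
        if key is not None and None not in key:
            return bytes(key)
    return None


# Ciphertext bytes 16-47 of the example file.
cipher_16_47 = bytes.fromhex(
    "B0B442417E476C4653535859455F594C"
    "195B57515E57077463633C69646F657C"
)
print(shortest_consistent_key(cipher_16_47))  # b'BALMAINT'
```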
-
- * In general, the probability of deducing the key bytes is dependent
- on the keylength. Some definitions relating to the key byte are
- useful:
-
- * Known: the key byte may be determined at two or more
- different positions which correspond to known plaintext.
-
- * Possible: the key byte may be determined at only one position.
-
- * Unknown: the key byte may not be determined as there is no
- overlap of this byte with known plaintext.
-
- In summary, for a keylength of 1-9, the key bytes are all known and
- thus all of the key may always be deduced. For a keylength of 10-13,
- 15-17, there is a small proportion of possible to known key bytes. Thus
- all the key may be deduced with a high probability. Keys with
- keylengths of 14, 18-24 contain one, two or three unknown key bytes and
- an increasingly high proportion of possible to known key bytes. At
- least five bytes of the key may always be determined.
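-
- The classification of key bytes can be reproduced by counting how many
- known plaintext positions fall on each key byte. The following Python
- sketch is an illustration (the name classify is hypothetical) and
- assumes the known positions stated earlier: 16 - 21, 24 - 27, 29 - 41
- and 43 - 45.
-
```python
KNOWN_POSITIONS = [*range(16, 22), *range(24, 28),
                   *range(29, 42), *range(43, 46)]


def classify(keylength: int) -> dict:
    """Count known, possible and unknown key bytes for a keylength."""
    hits = [0] * keylength
    for pos in KNOWN_POSITIONS:
        hits[(pos - 16) % keylength] += 1  # encryption starts at byte 16
    return {
        "known": sum(h >= 2 for h in hits),
        "possible": sum(h == 1 for h in hits),
        "unknown": sum(h == 0 for h in hits),
    }


for n in range(1, 25):
    print(n, classify(n))
```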
-
- * Retrieve the plaintext using WordPerfect with the
- key as the password. This is the easiest way to decrypt
- the document text.
-
- * If no access to WordPerfect is available, then it is
- straightforward to recover the plaintext with a short C
- program which implements the decryption algorithm as described
- previously. This has been done successfully.
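-
- As an illustration of such a program, here is a minimal sketch in
- Python rather than C. It is not the authors' program, and the name
- decrypt_fragment is hypothetical. It decrypts any run of bytes from an
- encrypted version 5.0 file, given the file position of the first byte,
- and is applied here to bytes 619 - 650 of the example ciphertext file
- with the recovered key BALMAINT.
-
```python
def decrypt_fragment(fragment: bytes, key: bytes, file_offset: int) -> bytes:
    """Decrypt bytes that start at `file_offset` in an encrypted file."""
    if not key:
        raise ValueError("key must not be empty")
    if file_offset < 16:
        raise ValueError("bytes before offset 16 are not encrypted")
    out = bytearray()
    for i, byte in enumerate(fragment):
        n = file_offset - 16 + i              # index into the encrypted area
        counter = (len(key) + 1 + n) % 256    # ascending sequence value
        out.append(byte ^ counter ^ key[n % len(key)])
    return bytes(out)


# Ciphertext bytes 619-650 of the example file (the document text).
tail = bytes.fromhex(
    "19141F190C"
    "1B1B17111C2D111403030F09040F091C"
    "310B07010C0701E4F3F3FF"
)
text = decrypt_fragment(tail, b"BALMAINT", 619)
print(text.decode("ascii"))
```
-
- The result is the three lines of ten zeros that make up the example
- document.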
-
- CONCLUSION
-
- The encryption key is easily recovered in an apparent KNOWN CIPHERTEXT
- ONLY attack, as the system provides enough known plaintext in the
- printer information regardless of the document plaintext. The
- analysis, as shown, can literally be done on the back of a (large)
- envelope.
-
- The analysis may be slightly more difficult where the physical system
- on which the files were prepared is completely unknown and vastly
- different to any system we have encountered, as this may reduce the
- amount of known plaintext. In these situations, statistical analysis
- based on the characteristic frequencies of characters in a language is
- used to decipher text files. This is a standard method which is
- straightforward although a program may have to be written.
-
- In summary, the cryptanalysis of files encrypted with the 'locked
- document' option in WordPerfect version 5.0 is remarkably simple. The
- inclusion of portions of known plaintext in the encrypted file is a
- fatal flaw in the system, since it provides a mechanism of attack in
- which the key can be recovered by hand, and document plaintext easily
- retrieved. All of the key can easily be recovered for keylengths of
- 1-13 and 15-17, far in excess of commonly used passwords of 8
- characters. A high proportion of the key can be deduced for keylengths
- of 14 and 18-24. The cipher used is too weak, providing little or no
- protection.
-
- If the attacker has knowledge of any other unencrypted file from the
- same system, the analysis is made even more simple. We stress that
- **both the key and the plaintext can be recovered**, independent of
- the content of the plaintext.
-
- The worst problem is that it may give a false sense of security. For
- example, an attacker may decrypt a document, modify it and re-encrypt
- so that the originator is unaware of the alterations. We conclude that
- the file security is not consistent with claims made by the
- manufacturer and is not sufficient to protect sensitive documents from
- anything but the most naive attack.
-
- References
-
- 1. WORDPERFECT CORPORATION (1989): WordPerfect for IBM Personal
- Computers.
- 2. BENNETT, J (1987): Analysis of the encryption algorithm
- used in the WordPerfect Word Processing Program,
- Cryptologia, Vol XI. No 4. pp 206-210.
- 3. KONHEIM, A G (1981): Cryptography, A Primer, Wiley.
- 4. DENNING, D E (1981): Cryptography and Data Security,
- Addison Wesley.
- 5. CARROLL, J and ROBBINS, L E (1989): Computer Cryptanalysis
- of Product Ciphers, Cryptologia, Vol XIII. No 4. pp 303-326.
-
- Biographical
-
- Helen Bergen is a Lecturer in the School of Computing Science, Faculty
- of Information Technology, at the Queensland University of Technology.
- Her research interests within the Information Security Research Centre,
- Faculty of Information Technology, include cryptology and the
- application of supercomputers.
-
- Bill Caelli is Director of the Information Security Research Centre
- within the Faculty of Information Technology at the Queensland
- University of Technology. He is also Technical Director and Founder of
- ERACOM Pty. Ltd., a manufacturer of cryptographic equipment. His
- research interests lie in the development and application of
- cryptographic systems to enhance security, control and management of
- computer and data network systems.
- --
- John Gilmore {sun,pacbell,uunet,pyramid}!hoptoad!gnu gnu@toad.com
- The Gutenberg Bible is printed on hemp (marijuana) paper. So was the July 2,
- 1776 draft of the Declaration of Independence. Why can't we grow it now?
-