home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Esprit de Apple Corps
/
EDAC-2.iso
/
Graphics
/
Bulla.Disk
/
LOD3.6.DOC
< prev
next >
Wrap
Text File
|
1990-05-18
|
30KB
|
652 lines
The LOD/H Technical Journal, Issue #3: File 06 of 11
|||||||||||||||||||||||||||||||||||||||||||||||||||
+-+-+-+-+-+-+/ \+-+-+-+-+-+-+
\ L \ Secure Data Encryption with Cellular Automatons / L /
\ O \ / O /
\ D \ by / D /
+-+-+-+ +-+-+-+
\ L \ The Mentor / L /
\ O \ / O /
\ H \ A Legion of Doom Presentation! / H /
+-+-+-+ +-+-+-+
\_\_\_\_________________________________/_/_/_/
One of the key issues that concerns anyone who has sensitive or illegal
information on their computer system is preventing unauthorized access to this
information. Even if you hit a key that deletes everything on the hard disk
when you see that four-door sedan pull up in the driveway, any idiot with
Norton's Utilities (IBM) or Copy II+ (Apple) can recover anything that's on
your drive with minimal effort. A delete command only changes a flag in the
VTOC (volume table of contents), it doesn't actually *remove* the file from
your system.
There are two methods to ensure that your data can't be read by a sector
editor or recovered by NU. The first is to overwrite everything with a NULL
(FF) or anything else for that matter. I've seen one batch file that does a
global delete, creates a file that says 'EAT HOT DEATH', and then begins
copying it until disk space is full. Unfortunately, you can't always guarantee
that you will be able to get to your computer before someone else does.
The second method is to encrypt your data. Most people have visions of
data encryption being some sort of arcane process akin to summoning demons or
talking with Dead Cow Cult members (two closely related process- es.) In
practice, it isn't that difficult. This file is intended to show some very
short programs that will encrypt data beyond the ability of any- thing short of
a dedicated mainframe to crack.
How to use: The code examples I provide will be in MicroSoft's
AmigaBASIC. It is fairly generic and you should be able to convert it over to
IBM, //e,c,gs, Mac, ST, C64, or any flavor of mainframe you like. For those of
you setting up systems on Packet-Switched Networks (such as the LOD/H system
one of our members is implementing) data encryption should be considered
absolutely necessary to maintain security.
The terseness of the routines make them easy to insert in a bulletin
board also, although conversion into C or Assembly would be necessary for
decent speed.
Intro to Cryptography: Long before computers were around, there was a
need for data security. Everyone used lemon juice as 'invisible ink' when they
were a kid, heating it over a candle to bring it out. And everyone has seen
the substitution code where "A" = 1, B = "2", "Z" = 26, etc...
The easiest form of encryption involves a variation of the previous.
First of all, don't think of A = 1 as a substitution, think of it as a
remapping. Let's say we have a language made up of the five vowels, and we
wanted to remap them to the numbers 1-5. Our map would look like this:
"AEIOU12345" and our mapping function would be f(c) = POSITION(c) + x where c =
the letter to map and x is the key (in this case 5.) So every time we needed
to encrypt a letter, we would take its position in the map, add 5 to it, and
come up with the character to substitute. For the entire alphabet, the mapping
function would be f(c) = POS(c) + 26 for the map "A..Z,1..26".
Your map should be composed of all the characters that will be used for
encryption. In a text only encrypter, this will consist of all the printable
characters your machine can use. The same method can be used to encrypt binary
files, but it's not as clear as text only for a teaching example.
The problem with this simple form of encryption is that your average C64
could crack it in a matter of minutes. Enter into the next level of
cryptography, random numbers.
During World War II the Allied Forces created a system to generate
random electric noise, recorded this noise onto a wax cylinder, and scram- bled
radio transmissions by mixing the seemingly random noise with the voice
transmission. The soldiers in the field needed an imprint of the same cylinder
to be able to understand the message. Think of the wax cy- linder as a
'filter' for the crypted message.
A random number generator can be easily used to encrypt data providing
you realize the following- a random number generator on a computer is not
really random. If you initialize the generator with the same seed value on two
seperate occasions, it will return the same sequence of psuedo- random
numbers. Most BASIC's use the RANDOMIZE <seed> command to start the generator
off. If you leave off the seed, they get a seed from the system clock or some
other fairly random source, providing a much truer random selection. But by
declaring the seed yourself, you can be sure that you will be able to reference
this same string of numbers, a string that is very hard to figure out without
the key (seed.)
Program Listing 1 is an example of a BASIC encrypt/decrypt system that
uses the built-in random number generator include on the machine (or language
implementation.)
Program Listing 1
-----------------
REM ************************************************************************
REM
REM Ok, this is an example of very basic encryption. It takes the input
REM string and the input key and processes them using the machine's built
REM in random number generator. This version is written in AmigaBASIC 1.2.
REM It can be compacted quite a bit by writting it in C, but it's an easy
REM algorithm to crack.
REM
REM ************************************************************************
INPUT "String to be encoded"; C$
INPUT "Key Please! ";K
REM ************************************************************************
REM
REM When you use the RANDOMIZE command, it seeds the random number gener-
REM ator with the key K. *EVERY* time you seed the generator with the same
REM value, you will get the same sequence of psuedo-random numbers. Since
REM the built in random-number generator uses a linear algorithm to gener-
REM ate the sequence, it's easy (relatively) to crack.
REM
REM ************************************************************************
RANDOMIZE K
REM ************************************************************************
REM
REM The only difference between encoding and decoding is which way you
REM move in your Q$ array space. Encoding takes the original and shifts
REM to the right, decoding takes the codes value and shifts to the left.
REM
REM ************************************************************************
REREAD:
INPUT "Encode or Decode ? ";A$
A$=LEFT$(A$,1)
IF A$="E" OR A$="e" THEN
A=1
GOTO HEAD
END IF
IF A$="D" OR A$="d" THEN
A=-1
ELSE
GOTO REREAD
END IF
REM ************************************************************************
REM
REM Q$ contains all the characters that can be encoded. Every encoded
REM character will be mapped to a character in this array. I haven't
REM included any non-standard characters, so you will have to customize
REM it to your particular keyboard/system. I've included an error check
REM that will abort the encryption if it encounters a character that isn't
REM in Q$. I have to use the STRING$ command to insert the spacebar and
REM the quote into the string. It could also be done with a ASC(##) in
REM many basics. You could expand this to include any non-printable
REM characters you'd like so you could do non-text files.
REM
REM ************************************************************************
HEAD:
SPACE = 32
QUOTE = 34
Q$="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
Q$=Q$+"1234567890!@#$%^&*()-=_+[]{};:'.,<>/?\|`~"
Q$=Q$+STRING$(1,SPACE)+STRING$(1,QUOTE)
QSIZ = LEN(Q$)
REM ************************************************************************
REM
REM This is the main loop. L = length of the string to encrypt. In this
REM example, I am only encrypting a single string. Most people who will
REM actually use this will change the FOR loop to run until an EOF is
REM encountered in the input file. Since the syntax for that will vary
REM widely from system to system, I'll leave it out.
REM
REM ************************************************************************
L=LEN(C$)
FOR I = 1 TO L
REM /* Finds the character I in the input string */
X$ = MID$(C$,I,1)
REM /* Finds the integer location of the character in Q$
REM returns variable POZ */
GOSUB LOKPOZ
REM /* RND returns a random # between 0 and 1. Multiply it by the
REM size of array Q$ and you get the number of positions to move
REM when encoding or decoding. */
POZMV = (RND * QSIZ)
REM /* If you are encoding, you will shift to the right using addition.
REM you take the modula base QSIZ to keep the new character within
REM the bounds of Q$. */
IF A = 1 THEN
NUPOZ = (POZ + POZMV) MOD QSIZ
ELSE
REM /* Otherwise, you subtract, which takes a bit more math to keep
REM up with. Once you have the distance to shift, you must
REM convert it to a positive integer and then subtract two to
REM account for the head & tail of the array. */
NUPOZ = (POZ - POZMV) MOD QSIZ
NUPOZ = NUPOZ -2
IF NUPOZ < 1 THEN
NUPOZ = QSIZ - ABS(NUPOZ)
END IF
END IF
REM /* Now you assign the new character in array Q$ to Y$, and append
REM it to your converted string */
IF NUPOZ < 1 THEN
NUPOZ = QSIZ - ABS(NUPOZ)
END IF
Y$ = MID$(Q$,NUPOZ,1)
D$ = D$ + Y$
NEXT I
PRINT "Original = ";C$
PRINT "Modified = ";D$
END
REM /* This finds character X$ in array Q$ and returns an integer
REM value of the location. Called from the main loop. */
LOKPOZ:
FOUND = 0
POZ = 1
TOP:
IF FOUND = 1 THEN
RETURN
ELSE
TMP$ = MID$(Q$,POZ,1)
IF X$ = TMP$ THEN
FOUND = 1
END IF
POZ = POZ + 1
IF POZ > QSIZ THEN
PRINT "Error: Character '";X$"' not in array Q."
END
END IF
END IF
GOTO TOP
REM **********************************************************************
End of Program Listing 1
This method, while extremely simple, tight, and fast, is not fool-
proof. Most computers use the following algorithm for generating pseudo-
random number sequences: x(t+1) = ax(t) + b
x(t+1) = next random number
x(t) = previous random number
a & b are constants that will cause a fairly even distribution
For example, if you were using a three-bit system (8 possible postive
integers) you might make a = 3 & b = 7 (there's a reason behind using prime
numbers that is beyond the scope of this file.) If you seed the argument with
RANDOMIZE 5 you would get the following:
First x: x = 3*5 + 7 | Since we're restricting ourselves to three bits, and
22 won't fit in three bits, we'd need to perform a modula 8 on the
number. (Modulo divides x by eight and keeps the remainder as the
new value of x.) So MOD(22,8) is equal to 6 (16 + 6 = 22).
Ok, let's do some simple mapping using our vowel set and the above
three-bit random number generator. Let's say that the message reads "AAEOU"
Our first random number was 6. Our map looks like "AEIOU12345". POS(A) + 6
gives us 2 as the character.
Second x: x = 3*6 + 7 | MOD (25,8) = 1 | POS(A) + 1 gives us E.
Third x: x = 3*1 + 7 | MOD (10,8) = 2 | POS(E) + 2 gives us O.
Fourth x: x = 3*2 + 7 | MOD (13,8) = 5 | POS(O) + 5 gives us 4.
Fifth x: x = 3*5 + 7 | MOD (22,8) = 6 | POS(U) + 6 wraps around the map to A.
So our original "AAEOU" is crytped into "2E04A". This may at first
seem difficult to crack since 'A' mapped into a '2' on one pass and an 'E' on
the other, thus preventing a freuquency analysis from breaking the code.
Unfortunately, if someone knows the random number algorithm, they can
easily hack out the key. Since most of the people using this will be using it
on a pc, it would be trivial to get another pc to hack it out. And even if you
protect your random number algorithm, it is still a straight linear algebra
problem that an AT could work on over a weekend and probably figure it out,
especially if there is a fairly small map to work with.
Solution: What we need to do is combine the random mapping with a
random number generator that is tougher to figure out. Enter cellular
automatons.
CA's were first invented in the 1940's when John von Neumann (he of
the famous bottleneck) started to explore the mathmatic implications of very
simple machines. CA's are made of geometric patterns of cells that change
their state at each tick of a clock according to a fixed rule. Early work
provided automatons that could imitate a basic computer. Since the CA's are
inherently parallel (the entire geometry is updated each clock tick) and easy
to put on a chip, there is speculation that the next generation of parallel
processing computers will use CA's as a base rather than the Turing machine
model.
You have probably seen a CA at work and not realized it if you've
ever seen the computer graphic simulation 'LIFE' developed by John Conway at
MIT to model real organisms. The rule for automaton reproduction was incr-
edibly simple: If a cell has two or three neighbors, no change in the cell.
Fewer or more neighbors, it starves or is overcrowded to death, and repro-
duction occurs when a blank space has exactly three neighbors.
Using these simple rules, incredibly complex patterns can be produced.
Anything that can produce complex and varied results from a small algorithm is
a good target for a random number application. Enter Steven Wolfram from the
Institute of Advanced Studies in Princeton, NJ.
Wolfram has been doing research on one-dimensional cellular machines,
which have the advantage of being able to work with both todays machines and
future parallel machines. Wolfram has developed an automaton that is a one
dimensional circular array modified by the rule:
a(x,t) = a(x-1,t-1) XOR (a(x,t-1) OR a(x+1,t-1)) MOD k
Where x is the position in the array and t is the time,
k is the number of available characters (k = LEN(Q$)),
and a is the one-dimensional array.
This rule has several interesting properties. The problem we had with
linear algorithms was that simple algebra could be used to analyze the
evolution of the algorithm (the patterns produced.) All that you have to do is
figure out how *one* cell evolves, then apply that pattern across the entire
array. In the above case, there is no way of analyzing the array at time t
without loading the initial conditions and running the whole thing.
The second thing to note is that there are k to the power of w (where w
is the width (number of cells) in array a) possible states the machine can be
in, and not all of these states have a predecessor that generates it. These
states are called 'Garden of Eden' states, and can only occur if they are set
as an ititial condition.
As a result, this rule is neither a one-to-one mathmatical mapping,
nor is it and onto mapping of the set of arrays into itself. In laymans'
terms, this means that for any given array state it is impossible to tell what
(if any) previous state generated it by mere pattern analysis.
While this isn't a file on breaking codes- about the only way to crack
this one that's been discovered is to load *every* k**w state into memory and
page through them searching for a pattern. This method can be defeated easily
by setting w to more than 30 cells (assuming k=256, all the ASCII characters.)
If you are using my array Q$, you might want to set w to 40 or more. Since 256
to the 30th power is about a zillion bits, roughly equal to the largest memory
bank in existence, there isn't much chance of anyone breaking it. For the
truly paranoid, set w to 50 and sleep easy at night.
Anyway, back to the algorithm...
Each of the cells is filled on one of the k integers from 0 to k-1,
giving each cell k possible states. Wolfram found that the string of bits
occupying the 0 position (a(0,1), a(0,2), a(0,3)...) forms a random sequence
that passes all statistical tests, sometimes with better results than standard
DES algorithms.
Let's break this down and see what it's doing. First of all, we will
need two arrays. Each array is set up thus:
Array for Time One
|------| |------| |------| |------|
|---->|a(0,1)|------>|a(1,1)|------>|a(2,1)|----->|a(3,1)|-----|
| |------| |------| |------| |------| |
|--------------------------------------------------------------|
Array for Time Two
|------| |------| |------| |------|
|---->|a(0,2)|------>|a(1,2)|------>|a(2,2)|----->|a(3,2)|-----|
| |------| |------| |------| |------| |
|--------------------------------------------------------------|
The reason we need two arrays is so you can update the array without
destroying anything in it. In other words, you start out with array 1 active,
then you update the array into array 2 and change the active array to 2. On
the next clock tick you will update the active array (now 2) into the inactive
one (now 1) and set the active array back to 1. You keep swapping like this.
Logically, you only have one array- the active one.
To initialize the array, the ASCII values of each character in the key
are plugged into the first LEN(KEY$) spaces in the array. If you want to use a
short key, modify the code to fill the *entire* array with values of the key
(keep repeating a loop from 1 to W pulling characters out of K). Since the key
can be anything printable, use something 10-20 characters long that you can
remember- "HACK TO LIVE, LIVE TO HACK" is one of my favorites. Anyway, if you
use a short (less than 10) key in this program, the distri- bution will be
skewed for the first W MOD LEN(KEY$) itereations of the automaton, but will
smooth out nicely after that.
After the array is filled, it operates exactly like the first program
*except* when it need a random number of positions to move. Then it drops
down, updates each cell in the automaton, and then reads the value in A(0,time)
as the random number to shift by.
Let's look at the modified encryption code.
REM ************************************************************************
REM
REM This is an modification of Program 1 that doesn't use a machine
REM specific random number generator, but instead uses a cellular automaton
REM algorithm. W is the width of the actual automaton. A is dimensioned
REM at 32 to avoid having to reference element 0 of the array, which is
REM legal on some systems and illegal on the others. This way it can
REM be implemented on anything. Y is set for time 1, Y1 for time 2.
REM These correspond to the second dimension in array A.
REM
REM ************************************************************************
W = 30
DIM A(32,2)
Y = 1
Y1 = 2
REM ************************************************************************
REM
REM Once again, you can set this up to use files instead of strings. And
REM note that, unlike the first example, the key doesn't have to be
REM numeric.
REM
REM ************************************************************************
INPUT "String to be encoded"; C$
INPUT "Key Please! (Can be alpha-numeric) ";K$
REM ************************************************************************
REM
REM This is where K$ is broken down into a series of characters and their
REM ASCII value shoved sequentially into array A.
REM
REM ************************************************************************
FOR I = 1 TO LEN(K$)
T$ = MID$(K$,I,1)
A(I,Y1) = ASC(T$)
NEXT I
REM ************************************************************************
REM
REM The only difference between encoding and decoding is which way you
REM move in your Q$ array space. Encoding takes the original and shifts
REM to the right, decoding takes the codes value and shifts to the left.
REM
REM ************************************************************************
REREAD:
INPUT "Encode or Decode ? ";A$
A$=LEFT$(A$,1)
IF A$="E" OR A$="e" THEN
A=1
GOTO HEAD
END IF
IF A$="D" OR A$="d" THEN
A=-1
ELSE
GOTO REREAD
END IF
REM ************************************************************************
REM
REM Q$ contains all the characters that can be encoded. Every encoded
REM character will be mapped to a character in this array. I haven't
REM included any non-standard characters, so you will have to customize
REM it to your particular keyboard/system. I've included an error check
REM that will abort the encryption if it encounters a character that isn't
REM in Q$. I have to use the STRING$ command to insert the spacebar and
REM the quote into the string. It could also be done with a ASC(##) in
REM many basics. You could expand this to include any non-printable
REM characters you'd like so you could do non-text files.
REM
REM ************************************************************************
HEAD:
SPACE = 32
QUOTE = 34
Q$="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
Q$=Q$+"1234567890!@#$%^&*()-=_+[]{};:'.><,/?\|"
Q$=Q$+STRING$(1,SPACE)+STRING$(1,QUOTE)
QSIZ = LEN(Q$)
REM ************************************************************************
REM
REM This is the main loop. L = length of the string to encrypt. In this
REM example, I am only encrypting a single string. Most people who will
REM actually use this will change the FOR loop to run until an EOF is
REM encountered in the input file. Since the syntax for that will vary
REM widely from system to system, I'll leave it out.
REM
REM ************************************************************************
L=LEN(C$)
FOR H = 1 TO L
REM /* Finds the character I in the input string */
X$ = MID$(C$,H,1)
REM /* Finds the integer location of the character in Q$
REM returns variable POZ */
GOSUB LOKPOZ
REM /* CELLULAR updates the cells in the automaton, switches the active
REM time value, and returns X as the number of positions to shift. */
GOSUB CELLULAR
REM /* If you are encoding, you will shift to the right using addition.
REM you take the modula base QSIZ to keep the new character within
REM the bounds of Q$. */
IF A = 1 THEN
NUPOZ = (POZ + X) MOD QSIZ
ELSE
REM /* Otherwise, you subtract, which takes a bit more math to keep
REM up with. Once you have the distance to shift, you must
REM convert it to a positive integer and then subtract two to
REM account for the head & tail of the array. */
NUPOZ = (POZ - X) MOD QSIZ
NUPOZ = NUPOZ - 2
IF NUPOZ < 1 THEN
NUPOZ = QSIZ - ABS(NUPOZ)
END IF
END IF
REM /* Now you assign the new character in array Q$ to Y$, and append
REM it to your converted string */
IF NUPOZ < 1 THEN
NUPOZ = QSIZ - ABS(NUPOZ)
END IF
Y$ = MID$(Q$,NUPOZ,1)
D$ = D$ + Y$
NEXT H
PRINT "Original = ";C$
PRINT "Modified = ";D$
END
REM /* This finds character X$ in array Q$ and returns an integer
REM value of the location. Called from the main loop. */
LOKPOZ:
FOUND = 0
POZ = 1
TOP:
IF FOUND = 1 THEN
RETURN
ELSE
TMP$ = MID$(Q$,POZ,1)
IF X$ = TMP$ THEN
FOUND = 1
END IF
POZ = POZ + 1
IF POZ > QSIZ THEN
PRINT "Error: Character '";X$"' not in array Q."
END
END IF
END IF
GOTO TOP
REM ***********************************************************************
REM
REM This is the cellular automaton
REM
REM ***********************************************************************
CELLULAR:
REM /* Goes through the loop updating into the inactive time (1 or 2 dep-
REM ending on how Y and Y1 are assigned) */
FOR I = 1 TO W
A(I,Y) = A(I-1,Y1) XOR (A(I,Y1) OR A(I+1,Y1))
NEXT I
REM /* Updates the ends of the array (logical positions 0 and 31) that
REM are used in calculating other fields. */
A(0,Y) = A(W+1,Y1) XOR (A(0,Y1) OR A(1,Y1))
A(W+1,Y) = A(W,Y1) XOR (A(W+1,Y1) OR A(0,Y1))
REM /* Assigns the first cell to X as a random number */
X = A(1,Y)
REM /* Flips the active time */
TMP = Y
Y = Y1
Y1 = TMP
RETURN
Ok, let's trace through a *small* example. Assume our earlier
map of "AEIOU12345" and an automaton of width 5. For a key, we'll use
"A15".
1) Assign ASC(A) to a(1,1), ASC(1) to a(2,1), ASC(5) to a(3,1).
("0" will represent an empty cell in this example.)
A(time 1) = 0 65 49 53 0 0 0
(Remember that an array of width 5 is going to have 7 actual elements)
2) Now then, we want to encrypt the string "EE3"
First, we locate where E is in our map. LOKPOZ("E") = 2
3) Now then, we update the automaton.
a(1,2) = 0 XOR (65 OR 49)
a(2,2) = 65 XOR (49 OR 53)
a(3,2) = 49 XOR (53 OR 0)
a(4,2) = 53 XOR (0 OR 0)
a(5,2) = 0 XOR (0 OR 0)
Since this isn't a tutorial on binary numbers and boolean algebra, you'll
have to trust me on this one...
a(1,2) = 113
a(2,2) = 116
a(3,2) = 4
a(4,2) = 53
a(5,2) = 0
4) Now we update the ends.
a(0,2) = 0 XOR (0 OR 65)
a(6,2) = 0 XOR (0 OR 0)
Again...
a(0,2) = 65
a(6,2) = 0
5) Now we switch the active time from 1 to 2, and our new automaton is
a(time 2) = 65 113 116 4 53 0 0
6) We then pull off a(1,2) as the number to shift by.
7) Postion 2 + 113 (we're encoding, so we add) is 5 (modulo arithmatic.)
8) "E" is encoded into "U".
9) We repeat this two more times (you don't really want me to step through
it all, do you?) and end up with the encrypted version.
Well, that's going to pretty much wrap this file up. If you are
interested in more files of this nature, let me know. If you find this totally
confusing, but want to learn more, call The Phoenix Project at 512/441-3088
(300/1200/2400, 24 hours a day). Our friendly and helpful LOD/H staff will be
glad to assist you. Other people who you might want to talk to about
encryption include Dr. Cypher, Tuc, and Prime Suspect.
Also, if you are interested in seeing the above algorithm applied in
other languages let me know. If there's enough of a demand I'll release C,
Modula-2, and ADA versions.
This has been a Legion of Doom/Legion of Hackers presentation!
The Mentor
LOD/H
*****************************************************************************
References and Acknowledgments:
"How to Generate Cryptographically Strong Sequences of Pseudo-Random Bits";
M. Blum & S. Micali; SIAM Journal of Computing, vol. 13, p. 850 (1984)
"Functions of Random Variables"; John Freund & Ronald Walpole;
Mathmatical Statistics, 4th Edition; Prentice-Hall Inc., NJ; pp. 240-71
"Building an Encryption System"; Peter Wayner
Computer Language, Vol. 4, Num. 12, p. 67 (Dec. 1987 Issue)
"Random Sequence Generation by Cellular Automata"; Institute for Advanced
Study; Advances in Applied Mathmatics;
"Breaking Pseudo-Random Number Based Cryptographic Algorithms"; M. Vahle &
L. Tolendino; CRYPTOLOGIA, Oct 1982, p. 319
Also my thanks to: TUC, The Leftist, Prime Suspect, and Dr. Cypher, who all
contributed to this one way or another.
***************************************************************************