NetNews Usenet Archive 1992 #18

home *** CD-ROM | disk | FTP | other *** search

/ NetNews Usenet Archive 1992 #18 / NN_1992_18.iso / spool / sci / crypt / 2937 < prev next >

Wrap

Internet Message Format | 1992-08-18 | 14.7 KB

Path: sparky!uunet!dtix!darwin.sura.net!jvnc.net!yale.edu!yale!mintaka.lcs.mit.edu!hal.gnu.ai.mit.edu!knight From: knight@hal.gnu.ai.mit.edu (Knight of Tossing and Turning) Newsgroups: sci.crypt Subject: mcrypt again Message-ID: <1992Aug18.233021.15859@mintaka.lcs.mit.edu> Date: 18 Aug 92 23:30:21 GMT Sender: news@mintaka.lcs.mit.edu Organization: MIT Laboratory for Computer Science Lines: 300 In <1992Aug13.153842.24060@mintaka.lcs.mit.edu> knight@hal.gnu.ai.mit.edu (Knight of Tossing and Turning) says: >>This is the post to the creation method of the "Encryption Grid" of >>"Mass Encryption" aka mcrypt. >> example deleted > I'm afraid that I do not understand this fully. Mass Encryption was > earlier supposed to be a byte Simple Substitution followed by > bit-transposition on a 1K block: >> ...stuff deleted > How does the User Key create the *two* necessary permutations? > Or does the cipher simply re-use the *same* permutation for *both* > the substitution and transposition stages? But how can this be? > The Simple Substitution on bytes needs 256 entries of 0..255; a > complete bit-permutation for a 1K block needs 8192 entries of > 0..8191. As I explained earlier, I figured everyone knew what a simple substitution looked like, so I wouldn't waste time with it. Basically, I was going to use a program which is basically the algorythm used for shuffling decks of cards in Video Poker programs, just using some elements of the password line to generate them. I figured that this method would be safe as long as I didn't generate the bit transposition phase and the simple-substitution phase the exact same way or in some way directly linked outside of the password field. In other words, bits are used in the transposition phase, and integers and positioning are used in the simple-substitution phase. Therefore, not directly linked by anything but the password. >> For password string "X", and distribution of "X" in a 32 * 32 >> matrix is represented by the function GENKEY, . . . > OK, I think I see how the process works now: > ... accurate stuff deleted > It also means that adjacent bits in the input stream are about > 50% likely to be adjacent when enciphered; 25% likely to be separated > by one bit; 12.5% likely to be separated by two bits, etc. This > is in no sense a general permutation; it is better described as > key-controlled multiplexing or "braiding" (to recall the discussion > we had last year) without either the random control or the noise data > channel. Your right, I had this problem proving my system earlier. I wrote a program which took into consideration the rules which I generated and came up with a number in the whereabouts of 2^54. Since then, I've been debating if I should remove the required 0<x<255 rule. Removal of this would cause the permutations to jump dramatically. If you are off by a single bit, the entire encryption will likely be ruined and the simple substitution will cause a 1-in-256 chance of anything you've generated being accurate at all. It is basically a deck of cards -- pull out the wrong bit and the whole scheme will collapse. I wrote a program to solve for this series of numbers, I posted the results of the program to sci.crypt several days back. > On average, 16 bits of ciphertext contain two intermixed 8-bit > sequences; take 32 bits of ciphertext and you are sure to get a couple > of complete plaintext characters. In other words, the diffusion is > very weak, and diffusion is all that transposition does. Okay, lets get more to the heart of what you really were trying to point out -- and that was the arrangement must be stream-like, and therefore has a fairly finite method of extraction from 32 bits. First of all, lets say you extract 3 bytes of data from 32 bits of ciphertext. First of all, some of that data will be sequential to your needs. That is -- in order of extraction. However, some of that plaintext will be out of order -- after inverse takes place. Therefore, bytes 1, 2, and 3 could be mixed with bytes 678, 679, and 680. This is severely "out of phase" since you haven't even determined what bytes 4-677 are supposed to be yet. To repeatitively "take guesses" would be 95*95 (if in password field) or 10,355 ways to extract the very first BYTE, the first piece of pseudo-plaintext, and THAT could be one of 256 values, because of the simple-substitution made earlier. Therefore, the ultimate chances of extracting the very first byte, in accordance to the "system" which will extract all additional bytes accurately, is 1 in 2,650,880! And then you will have to guess the NEXT byte so that it follows the exact same parameters, which cannot be derrived from the first byte. Yes, I realize that there are only 256 possible numbers for the first byte, but this the encryption is highly structured. To take the first byte that "matches" will doubtless prevent the extraction of any other byte accurately -- you'd be "making it up" as you go along. > Moreover, the input data separations (gaps between the runs) are > not random, indeed, they *make sense* in the underlying reference > grid; they are, in fact, ASCII letter bit positions. Known-plaintext > (after any substitution) may well recover these letters, and thus the > actual User Key itself. Any phrase-like structure in the User Key > will help to fill in gaps in the cryptanalysis. Yes, but this is basically true of any encryption. However, this isn't ALWAYS true. Hence: 1 2 3 4 1 C H I N 2 A * 3 4 The "*" was a distinct, NON-ASCII character. Therefore, to assume you can treat any series at all as only ASCII bit positions, you would have the problem dealing with the length of the password itself. And how well do they make sense? Lets take the following pattern that I will agree to: for the first 6 digits (assuming a 6 digit minimum) the high (7) bit will be always "0". After digit 6, the number could be ASCII or the NON-ASCII value (190 possibilities). After that, it could be hashed to a number as high as 254 or as low as 1. So the "pattern" permutations jump right where most people start placing a password! You'd need as few as 7 digits to watch it climb past the "ascii" assumption. By lowering the number of digits "necessary" in my program, it causes the permutations to jump all the sooner. It would be maximized by a single letter password. > One remaining issue is how strong the Simple Substitution is at > preventing the information necessary for attack. Usually Simple > Substitution is considered very weak, and it will be much weaker > if the same "reference grid" is used for both substitution and > mixing. Apparently the substitution step was not displayed in the > "debug mode" output, so it is hard to know how this is done. Sorry, this step wasn't even in my program when I originally placed was requested to show my idea to "sci.crypt". I'll have this placed for all to view (and my source code) probably within the week, provided my relatives ever leave the house (and my telephone ever starts working right for a consistant period of time.) Even though simple substitution is weak, it was included for two basic reasons: 1) It adds doubt to anything removed out of the "bit transposition" phase (eliminating the bulk of the plaintext prior to the really tough(?) encryption.) Unless you have the real password, you will have to guess as to the result of this step will be. 2) It prevents the argument that if the encrypted file has the same number of bits as the unencrypted file, and the two files were of large enough proportions, they were the same file. >> The rest of the grid can be generated from happenstance values >>from manipulation of characters from the password characters that >>make up X. A few formulas would be: >> .. pointless hash functions deleted. > These formulas simply to expand the User Key. They do not produce > independent results; the results are completely defined by the > User Key. Thus, the maximum possible uniqueness is the size of the > User Key alphabet (95) to the power of the size of the User Key > string (1024), or 95^1024. Of course, since only 1024 bits are > used to encipher a block, there can be no more than 2^1024 > possibilities. Still huge, to be sure, but note the difference > from the claim: >> Total permutations for the GENKEY() function = 95 ^ (1024!). + Deleted 2^1024 vs 2^8192 harp, since everyone attacked it already. Also, my claim of 95^1024! is due to the fact that length is provided in the password "key grid" by placing of the non-ascii value. Therefore, for each limited length string between 1 and 1024 characters long, there exists a unique key grid -- which is reminicant of a factorial value. Admittably, I wish I had some computer to tell me which of the two numbers is smaller, but I can generate the factorial value inside of the 2^8192 grid, but only because I "trick" the "1024" position into not having to supply me with a length. (Hence, if all 1024 bytes are ascii, then I don't have to place a non-ascii value to tell me the length.) Looks to me as if my claim still holds valid. > Few users will have a 1024-character User Key, so the average > uniqueness will be much lower. The multiplexing will not be > well-distributed or random, because a User Key of any size will > probably be a language phrase, and that plaintext phrase will control > the multiplexer directly. The expansion formulas may well be used to > expand any cryptanalytic "breaks" to multiple places within the block. > The expansion formulas themselves may relate individual User Key > characters to multiple locations in the block, yielding multiple > attack possibilities (it may be possible to attack individual User > Key characters this way). For icing on the cake, the plaintext User > Key itself will exist in a known position the "reference grid," just > begging to be the first thing attacked. And this is not to mention > anagramming, the classical attack on transposition. A-HA! Great points! Let me see if I can defend against this: 1) Like all 'password' based encryptions, attacks against language phrases by using phonics can be used to reduce the total number of "attempts" required IF the word is indeed a word. Furthermore, the "BUG" story comes into effect as far as people using really small passwords, as many people prefer to do that. To quote Helen Gaines's book "Cryptanalysis", p3: "Even a basically weak cipher, in the hands of an expert, can be made to serve its purpose; and the strongest can be made useless when inproperly used." 2) Okay, admittably everything along the line is basically a series starting with 95 ways times several possibilties that are approximately 95 ways until reaching value 1024. But this is the limiting factor, which is still a huge number. Even a six digit password would be 95*95*95*95*95*95, or 95^6, which is basically 735,091,890,625 possibilities. That may look fairly weak, but if your not sure about the length of the password, the 7 digit possibilities- 69833729609375 and the 8 digit possibilties - 6634204312840625, are basically enough to make "standard" brute force unfeasable against those attempts. DES encryption has the problem that two strings could generate the same result. I believe this also holds true with RSA due to the relatively small number of permutations compared to possible passwords that can be entered. 3) Attack against the fact the password itself is used in the "key" grid. Lets consider the following bits were placed using mcrypt, and that these bits represent the area of the "grid" where the password (in ASCII format) was used to generate the positioning of these bits, find the password: POSITION byte 1 byte 2 byte 3 byte 4 byte 5 byte 6 byte 7 01234567 01234567 01234567 01234567 01234567 01234567 01234567 D: 01001011 11011011 11011001 00100110 11001000 11101110 10100001 D = Encrypted data. Starting from scratch, not knowing what the data inside contains, knowing it could very well have been compressed prior to encryption, knowing that the values of the bytes have been altered prior to bit transposition, how do you extract the password using nothing but that data? If the data could be anything, then the password itself likewise could be anything. How long is the password? Is it 6 digits? Is it 7 digits? Is it larger then the segment I removed? I took VERY careful consideration to make sure that "clues" to the password never made it into the encrypted file. I didn't think that using the ascii characters themselves in the encryption grid is a problem. Sure, it looks vulernable when you stare at them on paper, but after the file is encrypted, the entire thing is discarded completely. I figured it was the opposite, a STRENGTH to the encryption. That way, no matter what, I was guarenteed the maximum number of keys generatable by ASCII. > But the real problem is that the underlying multiplexing step > is so very weak. An eight digit password has 6,634,204,312,840,625 possible arrangements of bits. This doesn't include passwords of earlier length. If you mess up but a single BIT anywhere in that combination, the rest of the encryption will fall apart and you will unlikely be able to generate any provable plaintext from the entire system (due to the simple substitution.) > One might argue that the multiplexing still provides 2^[8192] options > for the data block (although I argue that correlation effects greatly > reduce this value). But few would disagree that each additional > block enciphered in the same way would greatly improve the ability > to attack the cipher. Every "additional" block encrypted will cause extensive damage to the I/O usage and DRAM required to analyze the encryption. I'm not sure if you are aware of what components of the computer are slower and faster then others, but the more you rely on disk drive seeks and calls to stored data in Dynamic Memory, the less likely your program has a chance to ever finish. More calls to the operating system to handle that, the more slowdown you will receive, also. "Hardware" protected? Maybe. This brings about a point not covered recently, and that is the lack of a "verification" system. Basically, Mcrypt has no way of telling if the data was properly extracted. You will have to create your own methods, which adds considerable amounts of time. > References > > [1] Geffe, P. 1973. How to protect data with ciphers that are > really hard to break. Electronics. January 4. 99-101. > > [2] Siegenthaler, T. 1985. Decrypting a Class of Stream Ciphers > Using Ciphertext Only. IEEE Transactions on Computers. > C34: 81-85. Will try to get, thank you. > --- > Terry Ritter ritter@cactus.org --- Eric Knight, knight@gnu.ai.mit.edu