- .oO Phrack 49 Oo.
-
- Volume Seven, Issue Forty-Nine
-
- 10 of 16
-
-
- A Steganography Implementation Improvement Proposal
-
- by: cjm1@concentric.net
-
- [ For those of you who do not know, steganography is a cryptographic
- technique that simply hides messages inside of messages. The sender composes
- an innocuous message and then, using one of many tactics, injects the secret
- message into it. Some techniques involve: invisible inks, character
- distortion, handwriting differences, word/letter frequency doping, bit
- flipping, etc. The method the author discusses hinges upon a well-known
- steganographic implementation, low-order bit flipping in graphic images. -d9 ]
-
- Steganography is a technique for hiding data in other data. The
- general method is to set the low-order bit of each byte in a group of 8 bytes
- so that reading those 8 bits back in order yields one character. This allows
- one to use a picture or a sound file and hide data, resulting in a small bit
- of hopefully unnoticeable noise in the data and a safely hidden cache of data
- that can later be extracted. This paper details a method for making
- steganographically hidden data safer by using pseudo-random dispersion.
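-
- The basic scheme can be sketched in a few lines of Python. This is a
- minimal illustration, not the author's program: the names embed_lsb and
- extract_lsb are made up here, and the cover is assumed to be a raw buffer of
- already-decoded sample bytes (parsing a real GIF or sound file is left out).

```python
def embed_lsb(cover: bytes, message: bytes) -> bytearray:
    """Hide message in the low-order bits of cover, one bit per cover byte."""
    if len(message) * 8 > len(cover):
        raise ValueError("cover too small: need 8 cover bytes per message byte")
    stego = bytearray(cover)
    for i, byte in enumerate(message):
        for bit in range(8):
            value = (byte >> (7 - bit)) & 1            # most significant bit first
            pos = i * 8 + bit
            stego[pos] = (stego[pos] & 0xFE) | value   # overwrite only the low bit
    return stego


def extract_lsb(stego: bytes, length: int) -> bytes:
    """Read length characters back out, 8 low-order bits per character."""
    out = bytearray()
    for i in range(length):
        byte = 0
        for bit in range(8):
            byte = (byte << 1) | (stego[i * 8 + bit] & 1)
        out.append(byte)
    return bytes(out)
```

- Each cover byte changes by at most 1, which is the "hopefully unnoticeable
- noise" described above.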
-
- Ordinarily, if someone suspects that you have data hidden in, say, a
- GIF file, they can simply run the appropriate extractor and find the data. If
- the data is not encrypted, it will be plain for anyone to see. This can be
- ameliorated by using a simple password protection scheme, hiding the password
- in the GIF as a header, encrypting it first with itself. If someone does not
- know the password, they cannot extract the data. This is of course reasonably
- safe, depending on the encryption scheme used, and I recommend it. But the
- hidden data can be made even safer.
-
- Pseudo-random dispersion works by hiding a password and a seed for a
- random-number generator in the encrypted header. Then, a pseudo-random number
- of bytes is passed by before each low-order bit is set.
-
- To do this, one must first calculate how many bytes a bit can take up
- for itself. For instance, to hide an 800 character message in a GIF would
- mean each character needs 8 bytes (8 bits per character, 1 byte per low-order
- bit), so you need 6,400 bytes of data to hide the message in, 8 bytes per
- character. Let's say we have a GIF that is 10 times this size: 64,000 bytes.
- Thus we have 80 bytes per character to hide data in. Since each bit takes a
- byte, we have 10 bytes per bit to hide data in! Therefore, if we take a
- pseudo-random number between 1 and 10, and use that byte to hide our low-order
- bit in, we have achieved a message dispersed through the GIF in a pseudo-random
- fashion, much harder to extract. A message in which every byte carries a bit
- of the steganographically hidden message can be extracted with ease relative
- to a message in which there are 10 possible bytes for each bit of each
- character. The latter is exponentially harder to extract without knowledge
- of the seed.
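-
- A rough sketch of this fixed-window dispersion follows. It is one possible
- reading of the scheme, under several assumptions: Python's random.Random
- (not a cryptographically strong generator) stands in for the seeded
- generator, the extractor is assumed to already know the message length, and
- the function names are invented for illustration.

```python
import random


def window_size(cover_len: int, message_len: int) -> int:
    """Cover bytes available per hidden bit, e.g. 64000 // (800 * 8) = 10."""
    if message_len < 1:
        raise ValueError("message must not be empty")
    return cover_len // (message_len * 8)


def bit_positions(cover_len: int, message_len: int, seed: int) -> list:
    """Pass by a pseudo-random 1..window bytes before each hidden bit."""
    window = window_size(cover_len, message_len)
    if window < 1:
        raise ValueError("cover too small for this message")
    rng = random.Random(seed)       # same seed gives the same positions
    pos = -1
    positions = []
    for _ in range(message_len * 8):
        pos += rng.randint(1, window)
        positions.append(pos)
    return positions


def embed(cover: bytes, message: bytes, seed: int) -> bytearray:
    stego = bytearray(cover)
    for n, pos in enumerate(bit_positions(len(cover), len(message), seed)):
        bit = (message[n // 8] >> (7 - n % 8)) & 1
        stego[pos] = (stego[pos] & 0xFE) | bit
    return stego


def extract(stego: bytes, message_len: int, seed: int) -> bytes:
    out = bytearray(message_len)
    for n, pos in enumerate(bit_positions(len(stego), message_len, seed)):
        out[n // 8] |= (stego[pos] & 1) << (7 - n % 8)
    return bytes(out)
```

- Because the largest possible skip equals the window, the last bit can never
- land past the end of the cover.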
-
- A slight improvement can be made to this algorithm. By re-calculating
- the number of available bytes left for each bit after each bit is hidden, the
- data is dispersed more evenly throughout the file, instead of being bunched up
- at the start, which would be a normal occurrence. If you use a pseudo-random
- number generator picking numbers from 1-10, the values will average out to
- about 5.5 over time, so the message uses only a little over half of the file.
- This will cause the hidden message to be clustered at the beginning
- of the GIF. By re-calculating the number of available bytes left each time,
- we spread the data out throughout the file, with the added bonus that later
- bits will be further spread apart than earlier ones, resulting in possible
- search spaces of 20, 30, 100, or even 1,000 possible bytes per bit. This too
- serves to make the data much harder to extract.
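-
- The re-calculation step might look like the sketch below. As before, this
- is only an illustration under assumptions: random.Random stands in for the
- seeded generator, and dispersed_positions is a made-up name. Before each
- bit, the window is recomputed as the bytes still unused divided by the bits
- still to hide.

```python
import random


def dispersed_positions(cover_len: int, n_bits: int, seed: int) -> list:
    """Choose one cover byte per hidden bit, re-sizing the window each time."""
    if n_bits < 1 or cover_len < n_bits:
        raise ValueError("need at least one cover byte per hidden bit")
    rng = random.Random(seed)
    positions = []
    pos = -1                                   # index of the last byte used
    for remaining_bits in range(n_bits, 0, -1):
        remaining_bytes = cover_len - (pos + 1)
        # Never below 1: each skip is at most the window, so enough bytes
        # are always left over for the bits that remain.
        window = remaining_bytes // remaining_bits
        pos += rng.randint(1, window)
        positions.append(pos)
    return positions
```

- Whenever a skip comes up shorter than the window, the unused bytes are
- handed on to the remaining bits, which is why later windows tend to grow.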
-
- I recommend a header large enough for an 8-character ASCII password,
- an integer random-number seed, an integer version number, and a placeholder
- left for future uses. The version number allows us to tweak the
- algorithm and still be compatible with past versions of the
- program. The header should be encrypted and undispersed (i.e., 1 byte per
- bit of data) since we haven't seeded the random-number generator yet for
- dispersion purposes.
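-
- One possible layout for such a header is sketched below. The field widths
- (a 32-bit seed, a 16-bit version, a 16-bit reserved field) and the byte
- order are assumptions; the article only names the fields. Encryption is
- left out here because no particular cipher is specified.

```python
import struct

# Assumed layout: 8-byte password, 32-bit seed, 16-bit version,
# 16-bit reserved field, little-endian with no padding.
HEADER_FORMAT = "<8sIHH"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)   # 16 bytes


def pack_header(password: bytes, seed: int, version: int = 1) -> bytes:
    if len(password) > 8:
        raise ValueError("password is limited to 8 characters")
    padded = password.ljust(8, b"\0")          # pad short passwords with NULs
    return struct.pack(HEADER_FORMAT, padded, seed, version, 0)


def unpack_header(raw: bytes):
    password, seed, version, _reserved = struct.unpack(HEADER_FORMAT, raw)
    return password.rstrip(b"\0"), seed, version
```

- With this layout the header is 16 bytes long, so hiding it undispersed
- takes the first 128 bytes of the cover.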
-
- It is useful to make the extractor in such a way that it always
- extracts something, regardless of the password being correct or not. Doing
- this means that it is impossible to tell if you have guessed a correct password
- and gotten encrypted data out, or merely gotten out garbage that looks like
- encrypted data. Use of a password can also be made optional, so that none is
- necessary for extraction. A simple default password can be used in these
- cases. When hiding encrypted data, there is no difference to the naked
- eye between what is extracted and what is garbage, so no password is
- strictly necessary. This means no password has to be remembered, or
- transmitted to other parties. A third party cannot tell if a real password
- has been used or not. It is important for safety purposes to not hide the
- default password in the header if no password is used. Otherwise, a simple
- match can be made by anyone who knows the default password.
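-
- The sketch below shows one way an extractor could behave like this. It is
- heavily simplified and rests on assumptions: the SHA-256 based XOR stream is
- only a stand-in for whatever cipher is really used, DEFAULT_PASSWORD is a
- hypothetical value, and the header is cut down to a password field and a
- seed. Filling the password field with random bytes when no password is
- given is one way to avoid storing the default password.

```python
import hashlib
import os
import struct

DEFAULT_PASSWORD = b"default"      # hypothetical built-in value


def keystream(password: bytes, length: int) -> bytes:
    """Stand-in cipher stream (an assumption, not a vetted cipher)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(password + counter.to_bytes(4, "little")).digest()
        counter += 1
    return out[:length]


def seal_header(seed: int, password: bytes = b"") -> bytes:
    if len(password) > 8:
        raise ValueError("password is limited to 8 characters")
    key = password or DEFAULT_PASSWORD
    # With no password, store random filler instead of the default password.
    stored = password.ljust(8, b"\0") if password else os.urandom(8)
    plain = struct.pack("<8sI", stored, seed)
    return bytes(a ^ b for a, b in zip(plain, keystream(key, len(plain))))


def open_header(raw: bytes, password: bytes = b"") -> int:
    key = password or DEFAULT_PASSWORD
    plain = bytes(a ^ b for a, b in zip(raw, keystream(key, len(raw))))
    _stored, seed = struct.unpack("<8sI", plain)
    return seed    # never validated: a wrong password just gives another seed
```

- open_header never reports success or failure. Whatever seed comes out is
- used for extraction, so a wrong guess simply produces different data.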
-
-