- Y coding
-
- Daniel J. Bernstein
-
- Copyright 1991. All rights reserved.
- Draft 4b, March 19, 1991.
-
- This is a draft. That means it's supposed to be unreadable, misleading,
- boring, useless, and generally wrong. Any deviations from the norm are
- accidents---happy accidents, but accidents nonetheless. End of warning.
-
-
- ----- 1. Introduction
-
- --- LZW coding
-
- Fix an alphabet A, and take a string I over that alphabet. Construct a
- ``dictionary'' D---a one-to-one mapping from strings to integers---as
- follows:
-
- 0. Start by mapping each symbol of A to a unique integer.
- 1. Find the longest prefix p of I contained in D. (Rather, contained
- in the domain of D. It is convenient to ignore this distinction.)
- 2. Take the next symbol c after p in I.
- 3. Add pc to the dictionary, mapping to another unique integer.
- 4. Strip p from the front of I, so that I begins with c.
- 5. Repeat from step 1 until I is empty (i.e., until step 2 fails).
-
- For example, say A is the set abcdefghijklmnopqrstuvwxyz, and say I is
- the string yabbadabbadabbadoo. D starts with all single characters from
- A. Now the longest match between I and D is I's first character, y; and
- the character after that is a. So we add ya to the dictionary and strip y
- from the front of I.
-
- Now I is abbadabbadabbadoo, and D has all single characters plus ya. The
- longest match between D and I is a, so we add ab to the dictionary and
- remove the a. We continue in this manner until I is empty:
-
- I                   match  add  new dictionary
- yabbadabbadabbadoo  y      ya   ya (plus all single characters)
- abbadabbadabbadoo   a      ab   ya ab
- bbadabbadabbadoo    b      bb   ya ab bb
- badabbadabbadoo     b      ba   ya ab bb ba
- adabbadabbadoo      a      ad   ya ab bb ba ad
- dabbadabbadoo       d      da   ya ab bb ba ad da
- abbadabbadoo        ab     abb  ya ab bb ba ad da abb
- badabbadoo          ba     bad  ya ab bb ba ad da abb bad
- dabbadoo            da     dab  ya ab bb ba ad da abb bad dab
- bbadoo              bb     bba  ya ab bb ba ad da abb bad dab bba
- adoo                ad     ado  ya ab bb ba ad da abb bad dab bba ado
- oo                  o      oo   ya ab bb ba ad da abb bad dab bba ado oo
- o                   o
-
- While we construct the dictionary we can output the value of each match
- under the dictionary. This produces a sequence of numbers which, as we
- will see below, is sufficient to reconstruct the original string. This
- mapping from strings to sequences of numbers is called LZW coding.
-
- Typically the numbers in the dictionary are chosen sequentially, from 0
- to |A| - 1 for the initial single-character strings and then from |A| on
- up. In the above example, the matches are y a b b a d ab ba da bb ad o o,
- which have numbers 24 0 1 1 0 3 27 29 31 28 30 14 14. That sequence is
- the result of yabbadabbadabbadoo under LZW coding.
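-
- For concreteness, here is the whole construction as a few lines of
- Python. This is our editorial sketch, not code from any particular
- LZW implementation; the function name and the default alphabet are
- ours.
-
-     def lzw_encode(I, A="abcdefghijklmnopqrstuvwxyz"):
-         # D maps strings to integers; the symbols of A come first.
-         D = {c: i for i, c in enumerate(A)}
-         out = []
-         p = I[0]
-         for c in I[1:]:
-             if p + c in D:            # the match extends
-                 p = p + c
-             else:
-                 out.append(D[p])      # emit the number of the match
-                 D[p + c] = len(D)     # add pc under the next integer
-                 p = c                 # I now effectively begins with c
-         out.append(D[p])              # emit the final match
-         return out
-
- lzw_encode("yabbadabbadabbadoo") returns exactly the sequence above:
- 24 0 1 1 0 3 27 29 31 28 30 14 14.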
-
-
- --- LZW decoding
-
- How do we reconstruct the original string if we're given the coded
- sequence? The secret is to reconstruct the dictionary.
-
- 0. Start by mapping each symbol of A to a unique integer. Set I to the
- empty string. Read a number from the input, and set p to the
- corresponding single-character string.
- 1. Append p to I.
- 2. Read a number from the input, and let c be the first character of
- the corresponding string in the dictionary.
- 3. Add pc to the dictionary, mapping to the next unique integer.
- 4. Set p to the dictionary string corresponding to the number read.
- 5. Repeat from step 1 until end-of-input (i.e., until step 2 fails).
-
- For example, take the input sequence 24 0 1 1 0 3 27 29 31 28 30 14 14.
- D starts with all single characters from A; p starts as the single
- character y; and I starts empty.
-
- Now p is appended to I, so that I contains y. The 0 in the input means
- that the next character of I is an a; and ya is added to the dictionary.
- p is then set to a.
-
- Next, p is appended to I, so that I contains ya. The 1 in the input
- means that the next character of I is b; and ab is added to the
- dictionary. p is then set to b. We continue this way through the entire
- input:
-
- input   c   added      new p   I
- 24                     y       y
- 0       a   (ya,26)    a       ya
- 1       b   (ab,27)    b       yab
- 1       b   (bb,28)    b       yabb
- 0       a   (ba,29)    a       yabba
- 3       d   (ad,30)    d       yabbad
- 27      a   (da,31)    ab      yabbadab            27 is ab, so c is its first character, a
- 29      b   (abb,32)   ba      yabbadabba          29 is ba, so c is its first character, b
- 31      d   (bad,33)   da      yabbadabbada
- 28      b   (dab,34)   bb      yabbadabbadabb
- 30      a   (bba,35)   ad      yabbadabbadabbad
- 14      o   (ado,36)   o       yabbadabbadabbado
- 14      o   (oo,37)    o       yabbadabbadabbadoo
-
- Notice the slightly twisted statement of steps 2 through 4. p is set to
- the dictionary string corresponding to the number read from the input,
- but first c is set to the first character of that string, and the old pc
- is added to the dictionary. This roundabout presentation is necessary
- because the number on the input may be the number of the very last
- dictionary string added---and that last string depends on the first
- character of the current string. This overlap is always safe but is one
- of the first problems to look out for in a new implementation.
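-
- Here is the decoder in the same editorial Python sketch (names
- ours). The else branch is exactly the overlap just described: the
- number read may refer to the entry about to be added, whose first
- character must then equal p's first character.
-
-     def lzw_decode(codes, A="abcdefghijklmnopqrstuvwxyz"):
-         # D maps integers back to strings; symbols of A come first.
-         D = {i: c for i, c in enumerate(A)}
-         codes = iter(codes)
-         p = D[next(codes)]
-         I = p
-         for n in codes:
-             if n in D:
-                 c = D[n][0]       # first character of the string read
-             else:                 # n is the entry about to be added,
-                 c = p[0]          #   so its first character is p's
-             D[len(D)] = p + c     # add pc under the next integer
-             p = D[n]
-             I += p
-         return I
-
- Feeding it 24 0 1 1 0 3 27 29 31 28 30 14 14 reconstructs
- yabbadabbadabbadoo.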
-
-
- --- So who cares about this LZW stuff anyway?
-
- What's the point of LZW coding? When the input string has lots of the
- same characters or sequences of characters, the dictionary will rapidly
- grow to include those sequences. Then a single number in the output
- sequence might represent several characters of input, and will use less
- space.
-
- For example, take the simplest natural encoding of a string over our
- lowercase alphabet A. There are 26 symbols, so we can represent each
- with 5 bits. Our string yabbadabbadabbadoo has 18 characters and takes
- 90 bits under this encoding.
-
- On the other hand, the string after encoding has just 13 numbers:
- 24 0 1 1 0 3 27 29 31 28 30 14 14. For these values there are 26 27 28
- 29 30 31 32 33 34 35 36 37 38 choices respectively, so they take 5 5 5 5
- 5 5 5 6 6 6 6 6 6 bits in the natural way, for a total of just 71 bits.
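-
- The count is easy to reproduce mechanically. In Python, values 0
- through n-1 need (n - 1).bit_length() bits, so the (ours) snippet
-
-     codes = [24, 0, 1, 1, 0, 3, 27, 29, 31, 28, 30, 14, 14]
-     sum((26 + k - 1).bit_length() for k in range(len(codes)))
-
- evaluates to 71, since the k-th code, counting from 0, is a choice
- among 26 + k strings.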
-
- Most strings that come up in practice have similar redundancy, and LZW
- coding will produce an output that takes less space than its input. This
- is why LZW coding is usually called LZW compression. It often reduces
- longer strings to 50% or even 30% of their original size, so that they
- take a fraction as much disk space and communication time as they would
- in their uncompressed form.
-
-
- --- A bit of jargon
-
- Many audio and video compression methods throw away some information.
- They don't reconstruct exactly the original pictures and sounds, but
- nobody really notices. These are called inexact, or sometimes lossy,
- compressors. In contrast, LZW always lets you get back exactly the
- original string. It is an exact, or lossless, compressor.
-
- Any compressor that splits its input into substrings and codes the
- strings with a dictionary is called (logically enough) a dictionary
- compressor.
-
- A substring of an input string---particularly a substring that is coded
- as a single output number---is called a ``phrase.'' In LZW, one string
- is added to the dictionary per phrase.
-
- Some compression methods use fixed tables: ``the'' becomes a special,
- single character, ``of'' becomes another, etc. They're called static, or
- nonadaptive, compressors. Others, like LZW, build their dictionaries
- dynamically to adapt to any input; LZW is an example of an adaptive
- compressor. Somewhere in between are compressors that make two passes
- over the input, building a dictionary the first time through and then
- producing output the second time through. They're called semiadaptive.
-
- One useless theoretical property of LZW is that it is universal. This
- means that on a certain wide class of inputs it will give as close as
- you want to the best possible compression of any method. The longer your
- strings are, the closer it comes to optimal. Universality doesn't make
- one whit of difference in practice---to get within 1% of optimal with
- LZW you'd have to start compressing the planet---but it generally
- indicates that a method is both simple and trustworthy. Non-universal
- compressors are usually much more complicated, and will fail miserably
- on inputs they weren't designed specifically to handle.
-
-
-
- ----- 2. Implementing LZW
-
- --- Encoding
-
- The most natural way to find the longest match between I and strings in
- D is one character at a time. Start with p as the first character of I
- (or as the null string if that is more convenient). Read the next
- character, c. If pc is in the dictionary, set p to pc and repeat.
- Otherwise p is the longest match and c is the character that follows it.
-
- So the dictionary has to support two basic operations: searching for pc
- if p is known to be there already, and adding pc if it is not there.
- Most implementors use some form of trie---array trie, list trie, hash
- trie, huptrie---for this job. The array trie, as in Woods' ``hyper-LZ''
- [9], offers extreme speed at the expense of |A| (typically 256) words
- of memory for each string in the trie. The huptrie and straight hash
- trie use only a small amount of memory per string but still allow
- reasonably fast searches in the average case. We will not consider these
- structures in further detail.
-
- In any case at most a fixed number of operations happen for every input
- character, so LZW coding takes linear time no matter how long the input
- is. Unfortunately, computers don't have infinite memory. Implementations
- typically limit the dictionary to some size, usually between 1024 and
- 65536 strings. When the dictionary reaches that size it can either
- remain fixed for the rest of the input, or it can be cleared and start
- from single characters again. The second strategy effectively breaks the
- input into ``blocks'' with independent dictionaries. If the input
- ``feels'' different in different blocks, blocking will produce good
- results, because it will weed useless strings out of the dictionary.
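-
- As an illustration, here is the inner loop with a hash trie and the
- simplest full-dictionary strategy, freezing. The sketch is ours;
- real implementations keep the same (parent number, character) keying
- but manage the hash table far more carefully.
-
-     def lzw_encode_hashed(I, A="abcdefghijklmnopqrstuvwxyz", maxsize=4096):
-         # A phrase is identified by its number; strings are never
-         # stored. The trie is a hash keyed by (p's number, c).
-         code = {c: i for i, c in enumerate(A)}
-         trie = {}
-         nextcode = len(code)
-         out = []
-         p = code[I[0]]
-         for c in I[1:]:
-             if (p, c) in trie:           # pc is in the dictionary
-                 p = trie[(p, c)]
-             else:
-                 out.append(p)
-                 if nextcode < maxsize:   # once full, stay fixed
-                     trie[(p, c)] = nextcode
-                     nextcode += 1
-                 p = code[c]
-         out.append(p)
-         return out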
-
- The ``compress'' program [8] introduced an important blocking variant.
- Instead of clearing the dictionary as soon as it reaches its maximum
- size, compress periodically checks how well it is doing. It will only
- clear the dictionary when its compression ratio deteriorates. This way
- the blocks adapt to the input. Note that all of these blocking
- techniques require space for a special output code to clear the
- dictionary.
-
- Another variant is LRU replacement: the least-recently-used strings are
- removed from the dictionary at some opportune time. This provides a
- smoother transition between blocks than clearing the dictionary.
- Unfortunately, it is difficult to find good heuristics for when to
- remove what strings. Blocking is still more of an art than a science.
-
-
- --- Preparing for encryption
-
- It is very useful to compress a message before encrypting it, as
- compression removes much of the statistical regularity of straight text.
- However, the usual coding leaves a lot of redundancy. If the dictionary
- has 33 strings, for example, and 6 bits are used to represent a choice
- among those strings, then the high bit will almost always be 0. These
- high bits come in predictable locations, so an attacker can almost
- always figure out the key given a long enough stretch of compressed
- text.
-
- To prevent this, the compressor should introduce random bits as
- follows. Say there is a choice among 41 strings, numbered 0 through
- 40, so that 6 bits are used and codes 41 through 63 are never needed.
- Then 0 through 22 are encoded either as themselves or as 41 through
- 63 (the value plus 41), the choice being made by a high-delay random
- number generator such as a subtractive generator; 23 through 40 are
- always encoded as themselves. This method greatly
- reduces the amount of extra information available per bit without
- appreciably slowing down the coding. Because random bits are used at
- unpredictable times, an attacker cannot easily take advantage of the
- determinism of the pseudo-random generator.
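-
- A sketch of the idea, with our names; Python's built-in generator
- stands in here for the subtractive generator recommended above.
-
-     import random
-
-     def randomize(value, nstrings, nbits, rng=random.Random()):
-         # Codes 0..nstrings-1 are meaningful; the top
-         # 2**nbits - nstrings codes are free.
-         spare = (1 << nbits) - nstrings      # e.g. 64 - 41 = 23
-         if value < spare and rng.getrandbits(1):
-             return value + nstrings          # 0..22 -> 41..63
-         return value
-
-     def derandomize(code, nstrings):
-         return code - nstrings if code >= nstrings else code
-
- With nstrings = 41 and nbits = 6 this is exactly the 0-through-40
- example above.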
-
- Most implementations also add a recognizable header to compressed text.
- This header should be removed before encryption.
-
-
- --- Decoding
-
- LZW decoding is even easier than encoding: the dictionary does not have
- to support searching. The easiest (and generally fastest) method is to
- keep I in memory as it is reconstructed, and to keep track of which
- substring of I a given dictionary number corresponds to. To add pc to
- the dictionary means to add the pair (pointer to start of pc within I,
- length of pc) at the right spot in an array indexed by dictionary
- numbers. There are methods which take slightly less memory than this,
- but they are slower.
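-
- Here is that decoder as a sketch (ours). Entry k, for k >= |A|, is a
- (start, length) pair into I; the entry for pc is simply p's pair with
- the length increased by one, since c is written immediately after p's
- copy in I. Copying character by character keeps the overlap case
- correct.
-
-     def lzw_decode_flat(codes, A="abcdefghijklmnopqrstuvwxyz"):
-         pairs = []                    # pairs[k - len(A)] = (start, len)
-         I = []
-
-         def emit(n):                  # append entry n's text to I
-             if n < len(A):
-                 I.append(A[n])
-                 return len(I) - 1, 1
-             start, length = pairs[n - len(A)]
-             here = len(I)
-             for i in range(length):   # safe even if pc overlaps itself
-                 I.append(I[start + i])
-             return here, length
-
-         codes = iter(codes)
-         p_start, p_len = emit(next(codes))
-         for n in codes:
-             pairs.append((p_start, p_len + 1))   # add pc
-             p_start, p_len = emit(n)
-         return ''.join(I)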
-
-
-
- ----- 3. MW and AP coding
-
- --- MW: Adapting better
-
- LZW adapts to its input relatively slowly: strings in the dictionary
- only get one character longer at a time. If LZW is fed a million
- consecutive x's, its longest dictionary string will only have around
- 1400 x's, and it will only achieve around 1000:1 compression. Is there
- any better way for it to notice the redundancy?
-
- The answer is yes. Instead of adding p plus one character of the next
- phrase to the dictionary, we can add p plus the entire next phrase to
- the dictionary. This defines MW coding. For example:
-
- I                   match  added to dictionary
- yabbadabbadabbadoo  y
- abbadabbadabbadoo   a      ya (concatenation of last two matches)
- bbadabbadabbadoo    b      ab
- badabbadabbadoo     b      bb
- adabbadabbadoo      a      ba
- dabbadabbadoo       d      ad
- abbadabbadoo        ab     dab
- badabbadoo          ba     abba
- dabbadoo            dab    badab
- badoo               ba     dabba
- doo                 d      bad
- oo                  o      do
- o                   o
-
- Even such a short example shows how quickly MW begins to adapt. By the
- fifteenth character ``dabba'' is already established as a dictionary
- phrase. On the other hand, MW will miss shorter phrases where there is
- less redundancy, so it generally performs only somewhat better than LZW
- in practice. In this example it outputs 13 numbers, just like LZW. But
- given a million x's it will end up with a dictionary string of length
- around half a million, and it will output just 30 bytes.
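-
- A deliberately naive sketch of MW (ours). The longest-prefix scan
- below tries every length, which hides the backtracking problem
- discussed in the next subsection; the membership test before adding
- guards against duplicate additions.
-
-     def mw_encode(I, A="abcdefghijklmnopqrstuvwxyz"):
-         D = {c: i for i, c in enumerate(A)}
-         out = []
-         prev = None                    # the previous match, if any
-         while I:
-             for k in range(len(I), 0, -1):
-                 if I[:k] in D:         # longest dictionary prefix
-                     break
-             match = I[:k]
-             out.append(D[match])
-             if prev is not None and prev + match not in D:
-                 D[prev + match] = len(D)   # concatenate last two matches
-             prev = match
-             I = I[k:]
-         return out
-
- On yabbadabbadabbadoo this emits the 13 numbers for the matches in
- the table above.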
-
-
-
-
- --- Problems of MW
-
- MW lacks some properties of LZW that we don't realize are so helpful
- until they are taken away. Most importantly, not every prefix of a
- dictionary string is in the dictionary. This means that the
- character-at-a-time search method we saw for LZW doesn't work. Instead,
- every prefix of a string in the dictionary must be added to the trie,
- and every node in the trie must be given a tag saying whether it is in
- the dictionary or not. Furthermore, finding the longest string may
- require backtracking: if the dictionary contains xxxx and xxxxxxxx, we
- don't know until we've read to the eighth character of xxxxxxxy that we
- have to choose the shorter string. This seems to imply that MW coding
- requires backtracking and hence is fundamentally slower than LZW coding.
- (Decoding can be made reasonably fast, though.)
-
- Another problem is that a string may be added to the dictionary twice.
- This rules out certain implementations and requires extra care in
- others. It also makes MW a little less safe than LZW before encryption.
-
-
- --- AP: Adapting faster
-
- There is a natural way to preserve the good adaptation of MW while
- eliminating its need for backtracking: instead of just concatenating the
- last two phrases and putting the result into the dictionary, put all
- prefixes of the concatenation into the dictionary. (More precisely, if S
- and T are the last two matches, add St to the dictionary for every
- nonempty prefix t of T, including T itself.) This defines AP coding. For
- example:
-
- I                   match  added to dictionary
- yabbadabbadabbadoo  y
- abbadabbadabbadoo   a      ya
- bbadabbadabbadoo    b      ab
- badabbadabbadoo     b      bb
- adabbadabbadoo      a      ba
- dabbadabbadoo       d      ad
- abbadabbadoo        ab     da, dab (d + all prefixes of ab)
- badabbadoo          ba     abb, abba (ab + all prefixes of ba)
- dabbadoo            dab    bad, bada, badab
- badoo               bad    dabb, dabba, dabbad
- oo                  o      bado
- o                   o
-
- Since AP adds more strings to the dictionary, it takes more bits to
- represent a choice. However, it does provide a fuller range of matches
- for the input string, and in practice it achieves slightly better
- compression than MW in quite a bit less time.
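-
- The same sketch adjusted for AP (ours). Because every prefix of a
- dictionary string is itself in the dictionary, the longest match
- could be found one character at a time with no backtracking; the
- scan below stays naive for brevity.
-
-     def ap_encode(I, A="abcdefghijklmnopqrstuvwxyz"):
-         D = {c: i for i, c in enumerate(A)}
-         out = []
-         prev = None
-         while I:
-             for k in range(len(I), 0, -1):
-                 if I[:k] in D:             # longest dictionary prefix
-                     break
-             match = I[:k]
-             out.append(D[match])
-             if prev is not None:
-                 for j in range(1, k + 1):  # prev + every nonempty prefix
-                     if prev + match[:j] not in D:
-                         D[prev + match[:j]] = len(D)
-             prev = match
-             I = I[k:]
-         return out
-
- On yabbadabbadabbadoo it emits 12 numbers, one per match in the
- table above.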
-
-
- ----- 4. Y coding
-
- --- Completing the square
-
- LZW adds one dictionary string per phrase and increments strings by one
- character at a time. MW adds one dictionary string per phrase and
- increments strings by several characters at a time. AP adds one
- dictionary string per input character and increments strings by several
- characters at a time. These properties define three broad classes of
- methods and point naturally to a fourth: coders that add one dictionary
- string per input character and increment strings by one character at a
- time. An example of such a method is Y coding. (It is worth noting at
- this point that LZ77 variants are characterized by adding several
- dictionary strings per input character.)
-
-
- --- The incomprehensible definition
-
- Y coding is defined as follows: The dictionary starts with all single
- characters. One string pc starting at each input position is added to
- the dictionary, where p is in the dictionary before that character and
- pc is not.
-
- To put it differently, ``yowie'' appears in the dictionary as soon as
- the regular expression yo.*yow.*yowi.*yowie matches the text. It is
- added at the final e. So yabbadabbadabbadoo leads to the dictionary
- ya, ab, bb, ba, ad, da, abb, bba, bad, ada, dab, abba, bbad, bado,
- ado, do, oo.
-
- We haven't defined the coding yet. While we build the dictionary, we
- keep track of a current longest match (initially empty). Right after
- reading an input character and before integrating it into the
- dictionary, we see whether the longest match plus that character is
- still in the dictionary *constructed before the start of that longest
- match*. If so, we use that as the new longest match, and proceed.
- Otherwise, we output the number of the longest match, and set the
- longest match to just that new character.
-
- To decode, we read these matches, one by one. We take each one as a
- sequence of characters as defined by the dictionary, and output that
- sequence. Meanwhile we take each character and feed it as input to the
- dictionary-building routine.
-
-
- --- A comprehensible example
-
- We can run through the string keeping track of dictionary matches, as
- follows:
-
- I                       add      current matches
- abcabcabcabcabcabcabcx           a
- bcabcabcabcabcabcabcx   ab       b
- cabcabcabcabcabcabcx    bc       c
- abcabcabcabcabcabcx     ca       a
- bcabcabcabcabcabcx               ab b
- cabcabcabcabcabcx       abc      bc c
- abcabcabcabcabcx        bca      ca a
- bcabcabcabcabcx         cab      ab b
- cabcabcabcabcx                   abc bc c
- abcabcabcabcx           abca     bca ca a
- bcabcabcabcx            bcab     cab ab b
- cabcabcabcx             cabc     abc bc c
- abcabcabcx                       abca bca ca a
- bcabcabcx               abcab    bcab cab ab b
- cabcabcx                bcabc    cabc abc bc c
- abcabcx                 cabca    abca bca ca a
- bcabcx                           abcab bcab cab ab b
- cabcx                   abcabc   bcabc cabc abc bc c
- abcx                    bcabca   cabca abca bca ca a
- bcx                     cabcab   abcab bcab cab ab b
- cx                               abcabc bcabc cabc abc bc c
- x                       abcabcx  x
-                         bcabcx
-                         cabcx
-                         abcx
-                         bcx
-                         cx
-
- The strings on the right side are the current matches-so-far after each
- input character. We start with no matches. When we read a character, we
- append it to each match so far; if the results are in the dictionary, we
- make them the new matches, and add the character as a match by itself.
- If any of them are not in the dictionary we take them out of the list
- and add them to the dictionary.
-
- Before reading the fifth c, for example, the matches are bcab, cab, ab,
- and b. We append c to each one: bcabc doesn't match, so we add it to the
- dictionary. The rest are still in the dictionary, so our new list is
- cabc, abc, bc, and a lone c. And so on.
-
- When the x is read, the current list is abcabc bcabc cabc abc bc c.
- Now none of abcabcx bcabcx cabcx abcx bcx cx are in the dictionary,
- so we add all of them, and the list becomes just a single x.
-
-
- --- The crucial observation
-
- It is not clear so far that Y is a linear-time algorithm: each input
- character demands additions to several matches. However, *every
- substring of a dictionary string is in the dictionary*. This is clear
- from the regular-expression characterization of dictionary strings: if
- yo.*yow.*yowi.*yowie matches the text, then ow.*owi.*owie (for example)
- certainly does as well.
-
- Say the current match list is wonka onka nka ka a, and we read the
- character x. The next list has to be one of the following:
-
- wonkax onkax nkax kax ax x
- onkax nkax kax ax x
- nkax kax ax x
- kax ax x
- ax x
- x
-
- If onkax matches, for example, then nkax must match as well, and so must
- kax and ax and x.
-
- So the match list will always consist of one string and its suffixes.
- Hence we can keep track of just the longest string in the match list.
- For example, with the input string oompaoompapaoompaoompapa:
-
- input   test    add     match
- o                       o
- o       oo      oo      o
- m       om      om      m
- p       mp      mp      p
- a       pa      pa      a
- o       ao      ao      o
- o       oo              oo
- m       oom     oom     om
- p       omp     omp     mp
- a       mpa     mpa     pa
- p       pap     pap
-         ap      ap      p
- a       pa              pa
- o       pao     pao     ao
- o       aoo     aoo     oo
- m       oom             oom
- p       oomp    oomp    omp
- a       ompa    ompa    mpa
- o       mpao    mpao    pao
- o       paoo    paoo    aoo
- m       aoom    aoom    oom
- p       oomp            oomp
- a       oompa   oompa   ompa
- p       ompap   ompap
-         mpap    mpap    pap
- a       papa    papa
-         apa     apa     pa
-
-
- --- A comprehensible definition
-
- Here is another definition of how to construct the Y dictionary, based
- on the chart above.
-
- 0. Start with the dictionary mapping every character of A to a unique
- integer. Set m to the empty string.
- 1. Append the next character c of I to m.
- 2. If m is not in the dictionary, then add m to the dictionary, remove
- the first character of m, and repeat this step.
- 3. Repeat from step 1 until the input is exhausted.
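-
- This dictionary construction is only a few lines of our Python
- sketch:
-
-     def y_dictionary(I, A="abcdefghijklmnopqrstuvwxyz"):
-         D = {c: i for i, c in enumerate(A)}
-         m = ""
-         for c in I:
-             m += c                 # step 1: append the next character
-             while m not in D:      # step 2: add, chop, repeat
-                 D[m] = len(D)
-                 m = m[1:]
-         return D
-
- Run on yabbadabbadabbadoo it adds exactly the strings listed earlier,
- in the order ya, ab, bb, ba, ad, da, abb, bba, bad, ada, dab, abba,
- bbad, bado, ado, do, oo.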
-
- We must be careful in defining output: the output is not synchronized
- with the dictionary additions, and we must be careful not to have the
- longest output match overlap itself. To this end we split the dictionary
- into two parts, the first part safe to use for the output, the second
- part not safe.
-
- 0. Start with S mapping each single character to a unique integer;
- T empty; m empty; and o empty. (S and T are the two parts of the
- dictionary.)
- 1. Read a character c. If oc is in S, set o to oc; otherwise output
- S(o), set o to c, add T to S, and remove everything from T.
- 2. While mc is not in S or T, add mc to T (mapping to the next
- available integer), and chop off the first character of m.
- 3. After m is short enough that mc is in the dictionary, set m to mc.
- 4. Repeat as long as there are enough input characters (i.e., until
- step 1 fails).
- 5. Output S(o) and quit.
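-
- Here is the whole coder as a sketch (ours); S and T are hashes from
- strings to numbers, and the merge of T into S happens exactly where
- step 1 says it does.
-
-     def y_encode(I, A="abcdefghijklmnopqrstuvwxyz"):
-         S = {c: i for i, c in enumerate(A)}   # safe for output
-         T = {}                                # not yet safe
-         nextcode = len(S)
-         m, o, out = "", "", []
-         for c in I:
-             if o + c in S:                    # step 1
-                 o = o + c
-             else:
-                 out.append(S[o])
-                 o = c
-                 S.update(T)                   # T becomes safe
-                 T.clear()
-             m += c                            # steps 2 and 3
-             while m not in S and m not in T:
-                 T[m] = nextcode
-                 nextcode += 1
-                 m = m[1:]
-         if o:
-             out.append(S[o])                  # step 5
-         return out
-
- On yabbadabbadabbadoo this sketch emits 13 numbers,
- 24 0 1 1 0 3 27 29 31 33 3 14 14, for the phrases
- y a b b a d ab ba da bba d o o.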
-
- Here's how to decode:
-
- 0. Initialize S, T, and m as above.
- 1. Read a number from the input and take its inverse under S to find
-    o. (At the start of every phrase after the first, add T to S and
-    empty T first, just as the coder does when it outputs.)
- 2. As long as o is not the empty string, find the first character c
-    of o, update m, S, and T as in steps 2 and 3 above, output c, and
-    chop it off from the front of o.
- 3. Repeat from step 1 until the input is exhausted.
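-
- And the decoder, again as our sketch. It mirrors the coder character
- by character; the only subtlety is that T is merged into S at the
- start of every phrase after the first, matching the moment at which
- the coder produced each number.
-
-     def y_decode(codes, A="abcdefghijklmnopqrstuvwxyz"):
-         S = {i: c for i, c in enumerate(A)}   # safe: number -> string
-         known = set(A)                        # strings in S or T
-         T = {}                                # pending: number -> string
-         m, out = "", []
-         for k, n in enumerate(codes):
-             if k:                             # a new phrase: T is now safe
-                 S.update(T)
-                 T.clear()
-             o = S[n]
-             out.append(o)
-             for c in o:                       # rebuild the dictionary
-                 m += c
-                 while m not in known:
-                     T[len(S) + len(T)] = m
-                     known.add(m)
-                     m = m[1:]
-         return "".join(out)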
-
- The coding only requires two fast operations on strings in the
- dictionary: testing whether sc is there if s's position is known, and
- finding s's position given cs's position. The decoding only requires the
- same operations plus finding the first character of a string given its
- spot in the dictionary.
-
-
- --- Some properties of Y coding
-
- We propose ``per-phrase dictionary'' to describe the dictionaries
- constructed by LZW and MW, ``per-character dictionary'' for AP and Y.
- We also propose ``phrase-increment'' to describe MW and AP,
- ``character-increment'' for LZW and Y. More precisely, a per-phrase
- dictionary is one which contains one string per output phrase, while a
- per-character dictionary contains one string per input character. Each
- string in a character-increment dictionary is exactly one character
- longer than a string added at a previous step; in contrast, a string in
- a phrase-increment dictionary can be much longer than any of its
- prefixes formerly in the dictionary. According to this terminology,
- Y is an exact character-increment per-character dictionary compressor.
-
- Y has a similar feel to LZJ, which has a dictionary consisting of all
- unique substrings up to a certain length in a block of the input.
- However, Y can adapt much more effectively to redundant text.
-
-
-
- ----- 5. Results
-
- The author has implemented Y coding [2] along with, for instruction and
- amusement, AP coding. In this section we survey various implementation
- issues and compare the effectiveness of the coding methods explained
- here.
-
- The author's implementation, yabbawhap, uses a huptrie [3] for the
- dictionary. yabba keeps track of the separate dictionaries S and T for Y
- coding by remembering the highest position within the dictionary that
- is ``safe,'' i.e., inside S; all higher positions are in T.
-
- yabbawhap uses compress's strategy of adaptive blocking. The user may
- vary most aspects of compression dynamically, including two parameters
- of the heuristic used to control blocking.
-
- Here are results of these methods on the Bell-Witten-Cleary text corpus,
- along with a highly redundant 180489-byte ``makelog'' added by the
- author to show an extreme case.
-
- __bib__book1__book2____geo_makelog___news___obj1___obj2_paper1_paper2_
- Z12 54094 394419 325147 78026 34757 230765 16528 160099 29433 40881
- Z14 46817 357387 281354 77696 25647 202594 14048 138521 25077 37196
- Z16__46528_332056_250759__77777___25647_182121__14048_128659__25077__36161_
- MW___41040_336793_230862__80296____8299_168652__13273_109266__22749__34234_
- AP2 47056 389702 297205 84582 20925 219665 13824 134547 26937 39415
- AP6 40770 338046 261270 79471 14762 190502 13824 123323 22413 34637
- AP3__40311_322178_228978__80106____8821_167896__13825_113296__22414__33320_
- Y2 46882 363339 287110 80817 28159 212617 13858 141783 26131 38037
- Y6 40874 320622 256578 76275 21220 185097 13858 125900 22452 33671
- Y3___40456_306813_229851__76695___14411_168287__13859_114323__22453__32733_
-
- paper3_paper4_paper5_paper6_____pic__progc__progl__progp__trans_
- Z12 23567 7091 6670 22333 66236 21825 31987 22936 46185
- Z14 22163 6957 6580 18695 63277 19143 27116 19209 39605
- Z16__22163___6957___6580__18695___62215__19143__27148__19209__38240_
- MW___21495___6697___6169__16899___65102__16976__22223__15095__27742_
- AP2 22293 6595 6146 19770 74061 18868 27191 17962 38781
- AP6 20869 6595 6146 16786 68349 16691 22716 15138 30415
- AP3__20870___6596___6147__16787___67980__16692__22451__15139__28056_
- Y2 21609 6443 6033 19418 71952 18897 27607 19429 40444
- Y6 20355 6443 6033 16677 66117 17063 23625 16616 33026
- Y3___20356___6444___6034__16678___65377__17064__23512__16617__31300_
-
- Z12 is LZW with 12 bits (i.e., a dictionary of at most 4096 strings),
- using compress -b12; Z14 and Z16 are the same with 14 and 16 bits. MW
- is MW using the author's squeeze program [4], with a dictionary of at
- most 300000 strings (roughly equivalent to 1500000 input characters).
- AP2 is AP with an input block size of 21000, using whap -m21000; AP6
- has a block size of 65533; and AP3 has a block size of 300000. Y2, Y6,
- and Y3 are like AP2, AP6, and AP3 but use yabba.
-
- Y is notably effective upon book1 and news.
-
- XXX what else do people want to see in this section?
-
-
-
- ----- 6. Conclusion
-
- --- Life goes on
-
- Y coding is not much more complex than LZW coding and produces generally
- better results. It is one of the most effective non-Huffman-based
- LZ78-style compressors.
-
-
- --- How to achieve fame and fortune
-
- It may be possible to implement Y so that the decoding does not have to
- do all the work of the coding. In particular, three-fourths of the
- dictionary strings are typically not used at all; they should not have
- to be handled during decoding. Even if Y cannot be sped up, there is
- probably some character-increment per-character dictionary compressor
- that achieves similar results to Y but runs as quickly as LZW or AP.
- This is an area for future research.
-
- We have not discussed different ways to code the output numbers. Huffman
- coding and arithmetic coding both produce quite respectable improvements
- over and above the compression of the basic Y method. Little is known
- about the best way to code a sequence from a dynamically expanding
- alphabet; it is a bit counterintuitive that semiadaptive Huffman coding
- upon an output Y sequence can produce worse results than adaptive
- Huffman coding.
-
- It would be interesting to compare a straight modeller with Y plus a
- Huffman coder to see which produces better results.
-
-
- --- Acknowledgments
-
- The author would like to thank James Woods for his support and
- encouragement, as well as for many copies of papers; Richard Stallman,
- for motivating the author to find Y coding; and Herbert J. Bernstein for
- general comments and for suggesting the order of presentation of this
- material.
-
-
- --- So who are these LZWMWAPY people anyway?
-
- LZW stands for Lempel, Ziv, Welch. In 1977 Ziv and Lempel discovered the
- first in what are now called the LZ family of dictionary compressors.
- The original LZ algorithms were like LZW, but transmitted both p and c
- and then skipped past pc. They also started with a dictionary of just
- the null string. (For detailed surveys of various LZ methods, the reader
- should consult [1], [5], [7], [11].) Welch popularized the LZ variant,
- now called LZW, in which the extra character was not transmitted but
- became the first character of the next phrase.
-
- Miller and Wegman independently discovered several LZ variants in 1984,
- including LZW and MW. In fact, Seery and Ziv had in 1978 [6] proposed an
- LZ variant where two adjacent phrases were concatenated; this is related
- to MW in approximately the same way that LZ is related to LZW. (Seery
- and Ziv also introduced an important improvement for binary alphabets:
- if 01101 and 01100 are in the dictionary, there is no way that 0110 can
- possibly be a longest match, so it can effectively be removed from the
- dictionary.)
-
- AP stands for ``all prefixes.'' It was originally discovered by Storer
- in 1988 (?) and independently rediscovered by this author in 1990.
-
- Y actually does stand for yabbadabbadoo. The author discovered Y coding
- on December 26, 1990; he used yabbadabbadabbadoo as an example in
- explaining the method to Woods the next day. Woods replied [10] ``I'll
- have to look at your yabbadabbadoo code again to see how it differs from
- LZR.'' The author promptly adopted ``Y coding'' as the official name.
- (Ross Williams has suggested [12] that ``LZDB coding'' might be more
- appropriate.)
-
-
- --- References
-
- Yeah, yeah, I'm still writing up proper citations. XXX
-
- [1] Bell, Witten, Cleary, compression article, 1989
- [2] Bernstein, yabbawhap program, 1990-1991
- [3] Bernstein, huptrie article, 1990
- [4] Bernstein, squeeze program, 1989
- [5] Miller and Wegman, compression article, 1987
- [6] Seery and Ziv, LZ78 extensions article, 1978
- [7] Storer, compression book, 1988
- [8] Woods et al., compress program, 1985
- [9] Woods, private communication, 1990
- [10] Woods, private communication, 1990
- [11] Williams, compression book, 1991
- [12] Williams, private communication, 1991
-