NetNews Usenet Archive 1993 #3

home *** CD-ROM | disk | FTP | other *** search

/ NetNews Usenet Archive 1993 #3 / NN_1993_3.iso / spool / comp / compress / 4760 < prev next >

Wrap

Text File | 1993-01-21 | 1.8 KB | 40 lines

Newsgroups: comp.compression Path: sparky!uunet!zaphod.mps.ohio-state.edu!uwm.edu!linac!att!att!dptg!ulysses!allegra!princeton!tyrolia!jmd From: jmd@tyrolia.Princeton.EDU (John M. Danskin) Subject: Re: request for compression algorithm for small text buffers Message-ID: <1993Jan20.143605.4160@Princeton.EDU> Originator: news@nimaster Sender: news@Princeton.EDU (USENET News System) Nntp-Posting-Host: tyrolia.princeton.edu Reply-To: jmd@tyrolia.Princeton.EDU (John M. Danskin) Organization: Dept. of Computer Science, Princeton University References: <23244@venera.isi.edu> Date: Wed, 20 Jan 1993 14:36:05 GMT Lines: 25 In article <23244@venera.isi.edu>, sondeen@isi.edu (Jeff Sondeen) writes: |> I'm looking for a compression routine that I can use for small text |> buffers accessed separately within huge text files. For example, a |> dictionary whose word entries are hashed (ala gdbm) to their |> definition text -- but I want the definition text strings to be |> compressed with minimal (space) overhead. Would the separate "table" |> overheads of LZ type compressions be a disadvantage compared to a |> huffman/arithmetic compression based on a fixed table? Thanks for any |> tips. |> |> /jeff It seems to me that you could improve on the effect of a fixed huffman table by generating a standard LZW table from a small sample subset of your dictionary definitions. When you want to encode a new definition, always start from this sample table. Sizing the sample table is an interesting trade off between initial code size and code power. In practice, once you had a system running, it would be easy to try sample tables resulting in several different code sizes, and just use the best one. John Danskin | (609) 258-5386 | Gradual student (609) 258-1771 fax | Graphics systems jmd@cs.princeton.edu | (efficient low bandwidth graphics?)