home *** CD-ROM | disk | FTP | other *** search
-
- Notes on GIF Data Structures
-
- Jake Lund 76703,3051
-
-
- This is a brief explanation of fast and memory-efficient tree
- technique for encoding GIF pictures. It could easily be adapted
- to GIF decoding. I wrote the encoder in 6502 assembly language
- for the Commodore 64, but the idea is general enough that it could
- be written in any language.
-
- I've looked at other GIF encoders/decoders and when I see a
- series of weird multiplies and adds I know a hash table is coming
- up. I don't like hash tables. They use up more memory than they
- should and there's always a problem with resolving collisions.
- Hash tables are sloppy. Now that I've said that, here's the tree
- that avoids hashing...
-
- GIF codes can be up to 12 bits long (4096). The table has to
- accommodate up to 4096 codes and this technique needs only five
- bytes per code (at most a little over 20K of memory, which easily
- fits in a 64 or almost any other computer).
-
- First, you have to imagine memory being split up into five-
- byte chunks, each of which is called a "record". Each record has
- a "record number," and records are stored sequentially according
- to their record numbers. If you were doing this in C, you'd need
- a one-dimensional array of structures where each struct had a one-
- byte char and two two-byte ints. Record numbers start at zero.
-
- Let's say 16384 (hex $4000) is where the code table resides.
- Record 0 is at 16384, record 1 is at 16384+5, record 2 is at
- 16384+10, and so on. You also need a five-byte buffer and two
- routines: one for copying the buffer up to a specific address in
- memory and one for copying a record down to the buffer. In
- assembly, you shift left twice (times four), add the original
- record, and add the address of the table.
-
- Each record is arranged like this:
-
- Byte 0: The "name"
- Bytes 1-2: The record # for the "oldest child"
- Bytes 3-4: The record # for the "next-youngest sibling"
-
- If the child is zero, it means there is no child. Likewise
- for the sibling.
-
- When you add a new record, you give it a name, but you also
- mark it as having no children or siblings. At the same time, you
- have to go back to the parent or older sibling and link in the new
- arrival.
-
- At the beginning of an encoding session, you already know the
- palette and its size. If you have 16 colors, records 0-15
- correspond to the original colors. Records 16 and 17 are reserved
- for the "clear table" and the "end of picture" codes. The next
- available record (NEXT) would be 18. When you start a new picture
- (or encounter a clear table code), you have to clear the root
- entries (0-15 or whatever) so they have no children and no
- siblings. NEXT has to be reset to 18 (or "end of picture" plus
- one).
-
- Here's an example. There are four colors in the palette.
- The colors are ABCD (codes 0-3) and their record numbers are 0-3.
- Code 4 means "clear the table" and code 5 means "end of pic." The
- next available code (NEXT) is 6.
-
- Let's say the color stream looks like this: ABACACCADACCD.
- Get the first color (A). It's automatically code 0 (the GIF code
- at this point is 0). Copy record 0 to the buffer and get the next
- color, which is B. Check the buffer for children. Record 0 has
- no children. First link NEXT (record 6) as the oldest child --
- record 0 has a child 6. Copy the buffer back to memory with the
- record # of the child 6. Child 6 has a name (B), but no children
- or siblings. Copy record 6 to memory. Send the GIF code (0) to
- the data stream. Now we have the following situation:
-
- . data stream: 0
- . color stream: xBACACCADACCD ("x" means the color was handled)
- . last color: B
- . NEXT: 7
- . ancestry: 0A 1B 2C 3D
- . !
- . 6B
-
- Record 6 is the oldest child of 0. The name of record 6 is
- color "B."
-
- The last color was B, so the GIF code is automatically 1.
- Get another color, which is A. Copy record 1 to the buffer. It
- doesn't have any kids, so record 7 (NEXT) has to be linked as the
- oldest kid to record 1. Insert $0007 in bytes 1-2 of the buffer
- and copy back to the code table. Add a record 7, with a name "A"
- and no kids or siblings.
-
- . data stream: 0 1
- . color stream: xxACACCADACCD
- . last color: A
- . NEXT: 8
- . ancestry: 0A 1B 2C 3D
- . ! !
- . 6B 7A
-
- The last color was A, so the GIF code is 0. Get a color,
- which is C and check record 0 for kids. Aha! It has a child,
- record #6. Copy that from memory to the buffer. Check the name.
- We want a C, but the kid's name is B. Plus, B doesn't have any
- younger siblings. We have to add a younger brother/sister (#8) to
- record 6. Add $0008 to positions 3-4 in the buffer and copy 6 up
- to memory. Add a record #8, with a name C and no kids or
- siblings. Send out the GIF code 0.
-
- . data stream: 0 1 0
- . color stream: xxxCACCADACCD
- . last color: C
- . NEXT: 9
- . ancestry: 0A 1B 2C 3D
- . ! !
- . 6B = 8C 7A
-
- By the way, in the tree above, the "!" means "has a child"
- and the "=" means "has a younger sibling." Records 0 and 1 have
- children but no siblings. Record 6 has a sibling but no children.
- Records 2, 3, 7, and 8 have no kids and no siblings.
-
- The current/last color was C (root record #2) and the next
- color is A. C doesn't have any kids, so we add A as the oldest
- child and send out a 2.
-
- . data stream: 0 1 0 2
- . color stream: xxxxACCADACCD
- . last color: A
- . NEXT: 10
- . ancestry: 0A 1B 2C 3D
- . ! ! !
- . 6B = 8C 7A 9A
-
- Now it gets interesting. We've got an A (GIF code 0). The
- next incoming color is a C. A has at least one child, so we check
- record 6. Doesn't match. The name of 6 is B and we want a C.
- But record 6 has a sibling record 8. Copy 8 down to the buffer.
- Bingo! The name matches, so there's a new GIF code (8). Loop
- back to the get-a-color routine. The next one is C. That's no
- good because record 8 has no offspring. So the program does three
- things: 1) add NEXT (record 10) as the child of record 8, 2) add
- record 10 to memory (name C, no kids or siblings), 3) send out GIF
- code 8.
-
- . data stream: 0 1 0 2 8
- . color stream: xxxxxxCADACCD
- . last color: C
- . NEXT: 11
- . ancestry: 0A 1B 2C 3D
- . ! ! !
- . 6B = 8C 7A 9A
- . !
- . 10C
-
- The last color was C. The GIF code is 2. Get another color,
- which is A. Does record 2 have children? Yes, record 9. The
- name of 9 is A, which is exactly what we need. The new GIF code
- is 9 and we get a color (D). Record 9 has no kids, so insert the
- number 11 as the child of 9. Add record 11 (name D, no kids or
- siblings). Send out a GIF code 9.
-
- . data stream: 0 1 0 2 8 9
- . color stream: xxxxxxxxDACCD
- . last color: D
- . NEXT: 12
- . ancestry: 0A 1B 2C 3D
- . ! ! !
- . 6B = 8C 7A 9A
- . ! !
- . 10C 11D
-
- Last color was D. GIF code 3. Get a new color (A). Record
- 3 has no kids, so A is added. Record 3 now has a child 12.
- Record 12 has a name A but no kids or brothers/sisters. Send out
- the GIF code 3.
-
- . data stream: 0 1 0 2 8 9 3
- . color stream: xxxxxxxxxACCD
- . last color: A
- . NEXT: 13
- . ancestry: 0A 1B 2C 3D
- . ! ! ! !
- . 6B = 8C 7A 9A 12A
- . ! !
- . 10C 11D
-
- Start with A. GIF code 0. Record 0 has a child 6 (name B),
- which is wrong, but 6 has a sibling 8 (name C), which is right.
- The GIF code is 8. Next color is C. Record 8 has a child 10
- (name C), which matches. The GIF code is 10. Record 10 has no
- kids or siblings, so 1) add 13 as the child of 10, 2) add record
- 13 with the name D (no children or siblings), and 3) send out GIF
- code 10.
-
- . data stream: 0 1 0 2 8 9 3 10
- . color stream: xxxxxxxxxxxxD
- . last color: D
- . NEXT: 14
- . ancestry: 0A 1B 2C 3D
- . ! ! ! !
- . 6B = 8C 7A 9A 12A
- . ! !
- . 10C 11D
- . !
- . 13D
-
- The encoding would continue from there. Decoding would be
- even easier (all you need is a name and a parent).
-
- I hope this explanation was understandable. By the way, the
- Commodore 64 encoder is only about 1.5K of assembly language.
- With a little tweaking, we should be able to shrink it down to
- about 1K.
-
- Jake Lund 76703,3051
-
-
-