home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Resource Library: Graphics
/
graphics-16000.iso
/
formats
/
gif
/
encdat.asc
< prev
next >
Wrap
Text File
|
1988-03-21
|
9KB
|
223 lines
Notes on GIF Data Structures
Jake Lund 76703,3051
This is a brief explanation of fast and memory-efficient tree
technique for encoding GIF pictures. It could easily be adapted
to GIF decoding. I wrote the encoder in 6502 assembly language
for the Commodore 64, but the idea is general enough that it could
be written in any language.
I've looked at other GIF encoders/decoders and when I see a
series of weird multiplies and adds I know a hash table is coming
up. I don't like hash tables. They use up more memory than they
should and there's always a problem with resolving collisions.
Hash tables are sloppy. Now that I've said that, here's the tree
that avoids hashing...
GIF codes can be up to 12 bits long (4096). The table has to
accommodate up to 4096 codes and this technique needs only five
bytes per code (at most a little over 20K of memory, which easily
fits in a 64 or almost any other computer).
First, you have to imagine memory being split up into five-
byte chunks, each of which is called a "record". Each record has
a "record number," and records are stored sequentially according
to their record numbers. If you were doing this in C, you'd need
a one-dimensional array of structures where each struct had a one-
byte char and two two-byte ints. Record numbers start at zero.
Let's say 16384 (hex $4000) is where the code table resides.
Record 0 is at 16384, record 1 is at 16384+5, record 2 is at
16384+10, and so on. You also need a five-byte buffer and two
routines: one for copying the buffer up to a specific address in
memory and one for copying a record down to the buffer. In
assembly, you shift left twice (times four), add the original
record, and add the address of the table.
Each record is arranged like this:
Byte 0: The "name"
Bytes 1-2: The record # for the "oldest child"
Bytes 3-4: The record # for the "next-youngest sibling"
If the child is zero, it means there is no child. Likewise
for the sibling.
When you add a new record, you give it a name, but you also
mark it as having no children or siblings. At the same time, you
have to go back to the parent or older sibling and link in the new
arrival.
At the beginning of an encoding session, you already know the
palette and its size. If you have 16 colors, records 0-15
correspond to the original colors. Records 16 and 17 are reserved
for the "clear table" and the "end of picture" codes. The next
available record (NEXT) would be 18. When you start a new picture
(or encounter a clear table code), you have to clear the root
entries (0-15 or whatever) so they have no children and no
siblings. NEXT has to be reset to 18 (or "end of picture" plus
one).
Here's an example. There are four colors in the palette.
The colors are ABCD (codes 0-3) and their record numbers are 0-3.
Code 4 means "clear the table" and code 5 means "end of pic." The
next available code (NEXT) is 6.
Let's say the color stream looks like this: ABACACCADACCD.
Get the first color (A). It's automatically code 0 (the GIF code
at this point is 0). Copy record 0 to the buffer and get the next
color, which is B. Check the buffer for children. Record 0 has
no children. First link NEXT (record 6) as the oldest child --
record 0 has a child 6. Copy the buffer back to memory with the
record # of the child 6. Child 6 has a name (B), but no children
or siblings. Copy record 6 to memory. Send the GIF code (0) to
the data stream. Now we have the following situation:
. data stream: 0
. color stream: xBACACCADACCD ("x" means the color was handled)
. last color: B
. NEXT: 7
. ancestry: 0A 1B 2C 3D
. !
. 6B
Record 6 is the oldest child of 0. The name of record 6 is
color "B."
The last color was B, so the GIF code is automatically 1.
Get another color, which is A. Copy record 1 to the buffer. It
doesn't have any kids, so record 7 (NEXT) has to be linked as the
oldest kid to record 1. Insert $0007 in bytes 1-2 of the buffer
and copy back to the code table. Add a record 7, with a name "A"
and no kids or siblings.
. data stream: 0 1
. color stream: xxACACCADACCD
. last color: A
. NEXT: 8
. ancestry: 0A 1B 2C 3D
. ! !
. 6B 7A
The last color was A, so the GIF code is 0. Get a color,
which is C and check record 0 for kids. Aha! It has a child,
record #6. Copy that from memory to the buffer. Check the name.
We want a C, but the kid's name is B. Plus, B doesn't have any
younger siblings. We have to add a younger brother/sister (#8) to
record 6. Add $0008 to positions 3-4 in the buffer and copy 6 up
to memory. Add a record #8, with a name C and no kids or
siblings. Send out the GIF code 0.
. data stream: 0 1 0
. color stream: xxxCACCADACCD
. last color: C
. NEXT: 9
. ancestry: 0A 1B 2C 3D
. ! !
. 6B = 8C 7A
By the way, in the tree above, the "!" means "has a child"
and the "=" means "has a younger sibling." Records 0 and 1 have
children but no siblings. Record 6 has a sibling but no children.
Records 2, 3, 7, and 8 have no kids and no siblings.
The current/last color was C (root record #2) and the next
color is A. C doesn't have any kids, so we add A as the oldest
child and send out a 2.
. data stream: 0 1 0 2
. color stream: xxxxACCADACCD
. last color: A
. NEXT: 10
. ancestry: 0A 1B 2C 3D
. ! ! !
. 6B = 8C 7A 9A
Now it gets interesting. We've got an A (GIF code 0). The
next incoming color is a C. A has at least one child, so we check
record 6. Doesn't match. The name of 6 is B and we want a C.
But record 6 has a sibling record 8. Copy 8 down to the buffer.
Bingo! The name matches, so there's a new GIF code (8). Loop
back to the get-a-color routine. The next one is C. That's no
good because record 8 has no offspring. So the program does three
things: 1) add NEXT (record 10) as the child of record 8, 2) add
record 10 to memory (name C, no kids or siblings), 3) send out GIF
code 8.
. data stream: 0 1 0 2 8
. color stream: xxxxxxCADACCD
. last color: C
. NEXT: 11
. ancestry: 0A 1B 2C 3D
. ! ! !
. 6B = 8C 7A 9A
. !
. 10C
The last color was C. The GIF code is 2. Get another color,
which is A. Does record 2 have children? Yes, record 9. The
name of 9 is A, which is exactly what we need. The new GIF code
is 9 and we get a color (D). Record 9 has no kids, so insert the
number 11 as the child of 9. Add record 11 (name D, no kids or
siblings). Send out a GIF code 9.
. data stream: 0 1 0 2 8 9
. color stream: xxxxxxxxDACCD
. last color: D
. NEXT: 12
. ancestry: 0A 1B 2C 3D
. ! ! !
. 6B = 8C 7A 9A
. ! !
. 10C 11D
Last color was D. GIF code 3. Get a new color (A). Record
3 has no kids, so A is added. Record 3 now has a child 12.
Record 12 has a name A but no kids or brothers/sisters. Send out
the GIF code 3.
. data stream: 0 1 0 2 8 9 3
. color stream: xxxxxxxxxACCD
. last color: A
. NEXT: 13
. ancestry: 0A 1B 2C 3D
. ! ! ! !
. 6B = 8C 7A 9A 12A
. ! !
. 10C 11D
Start with A. GIF code 0. Record 0 has a child 6 (name B),
which is wrong, but 6 has a sibling 8 (name C), which is right.
The GIF code is 8. Next color is C. Record 8 has a child 10
(name C), which matches. The GIF code is 10. Record 10 has no
kids or siblings, so 1) add 13 as the child of 10, 2) add record
13 with the name D (no children or siblings), and 3) send out GIF
code 10.
. data stream: 0 1 0 2 8 9 3 10
. color stream: xxxxxxxxxxxxD
. last color: D
. NEXT: 14
. ancestry: 0A 1B 2C 3D
. ! ! ! !
. 6B = 8C 7A 9A 12A
. ! !
. 10C 11D
. !
. 13D
The encoding would continue from there. Decoding would be
even easier (all you need is a name and a parent).
I hope this explanation was understandable. By the way, the
Commodore 64 encoder is only about 1.5K of assembly language.
With a little tweaking, we should be able to shrink it down to
about 1K.
Jake Lund 76703,3051