home *** CD-ROM | disk | FTP | other *** search
- To: A.Pereira@cs.ucl.ac.uk
- From: Eddie Edwards <eddie@powerslv.demon.co.uk>
- Subject: Re: Fast texture coordinate calculation
- Reply-To: eddie@powerslv.demon.co.uk
- Date: Tue, 02 May 1995 20:23:49 +0100
- Message-ID: <19950502.202349.32@powerslv.demon.co.uk>
- Organization: I Inhaled
- X-Mailer: Archimedes TTFN Version 0.36
-
- > Cool! Maybe we'll be able to figure out how Michael Abrash managed to
- > get all of Quake's divides for free!
-
- *All* of them - now that's impressive. What I've worked out is somewhat
- more modest than that. It can only be used to speed up a series of
- 'similar' divides (I'll define 'similar' in a minute!). So it's not much
- use for, say, doing perspective transformations on random groups of
- points. In fact, I'd be amazed if it was *possible* to eliminate all the
- divides from a 3D game - but then, that's Michael Abrash for you ;-)
-
- I'll try not to get *too* mathematical, but it is a mathematical trick so
- it's inevitable to some extent.
-
- PROBLEM
- =======
-
- To calculate u in the fast texture coordinate calculation for DOOM-style
- walls (see previous post) requires a divide per strip. We don't want that.
-
- ANALYSIS
- ========
-
- The divide occurs in the following statement:
-
- u = (128 * y2)/height
-
- y2 and height are both functions of x (the integer loop variable,
- representing the current strip's x coordinate) and hence u is too.
- Assuming for simplicity that x starts at 0 (and goes upwards in increments
- of 1),
-
- y2 = y2inc * x
- height = lheight + (heightinc * x)
-
- Since y2inc, lheight and heightinc are arbitrary fixed-point numbers, what
- we need is a general way of calculating the value of
-
- A.n / (C.n + D)
-
- For A,C,D any fixed-point numbers and n an integer going from zero upwards.
-
- In fact, why not make the work pay off later and say we want to calculate
-
- (A.n + B) / (C.n + D)
-
- Then it might come in handy in another program ;-)
-
- THOUGHTS
- ========
-
- Well, the Bresenham line algorithm is famous for eliminating the divide and
- using only integer arithmetic. It basically calculates values of the
- function
-
- A.x / B
-
- without calculating A/B first. I am going to introduce a new way of
- looking at the Bresenham algorithm, in terms of integer equalities.
-
- Firstly, this assumes *fixed point* numbers. This is important, because
- there must exist a value X such that A.X is an integer for any value of
- A. For, say, 16.16 fixed-point numbers, this value is 65536. Basically,
- you're just 'forgetting' the point, but this isn't as easy to do with
- floating point, and FP is too slow anyway (certainly without a copro ;-)
-
- Since y is an integer (it's a pixel coordinate) and x is too, then when we
- write
-
- y = A.x / B
-
- we are lying! y is an integer, but A.x / B is a rational number and very
- probably *not* an integer. They are almost certainly *not* equal! To make
- this into an *equality* we must add an error term:
-
- y = A.x / B + e
-
- e is the actual error in the calculation. I always round down (rounding to
- nearest or rounding up can be implemented using offsets to the original
- values, as in Bresenham, if you prefer), so to get the best value of y we
- must have e in the range 0 (inclusive) to 1 (exclusive).
-
- I'm going to stop there and move back to the problem of
- (A.n + B) / (C.n + D).
- The methods applied later can be brought back and applied to the Bresenham
- equality above, and Bresenham's algorithm can be derived (as an exercise
- for the reader ;-)
-
- DERIVING THE SOLUTION
- =====================
-
- For the problem in hand, the equation is:
-
- y = (A.x + B) / (C.x + D) + e 0 <= e < 1
-
- Let's multiply through by (C.x + D) to get rid of that divide
-
- y.(C.x + D) = A.x + B + e.(C.x + D) 0 <= e < 1
-
- Trick #1: e is just a random error, and we don't care what value it has.
- We just need to limit it's value so that we get the right value
- of y. Therefore, just rename e.(C.x + D) to f and change the
- limits of f to be between 0 and (C.x + D). Then rename f to e
- again ;-)
-
- y.(C.x + D) = A.x + B + e 0 <= e <= (C.x + D)
-
- Now, multiply through by 65536 and every fixed-point number becomes an
- integer.
- Simply forget the points on all the numbers. Easy! The above is now an
- integer equality.
-
- The algorithm must take a value of y and x, increment x, and return a
- value of y for the new x. This might hurt ...
-
- Keep the equality 'God'. That equation above must always be satisfied. If
- you change the equation by adding something to both sides, or switching
- values around, you must still have a valid equation.
-
- We want to replace all the 'x's in the equation above with '(x+1)'s. We
- do so in two stages, at all times *keeping the equality*. First we change
- the right x to x+1, and to keep the equality we subtract A from e:
-
- y.(C.x + D) = A.(x+1) + B + e' e' = (e - A) (1)
-
- But now e' might be out of range. We check to see if it is (only one check
- is needed if you know the sign of A in advance) and if it is we adjust y by
- units of 1, and adjust e by units of (C.x + D) (again, keeping the
- equality) until it is correct. This is exactly analogous to the Bresenham
- step.
-
- Now adjust x on the left:
-
- y.(C.(x+1) + D) = A.(x+1) + B + e" e" = (e' + C.y) (2)
-
- Then adjust e" and y (using (C.(x+1) + D) as the adjustment) until e" is in
- the *new* range 0 to (C.(x+1) + D).
-
- It's almost like mutating the numbers until they form the next equation in
- the set. Look at what we have:
-
- y'.(C.x' + D) = A.x' + B + e'
-
- x' is x+1, e' is some new error - don't care what it is, but it's between
- 0 and (C.x' + d). Which means that y' is the number we are after!
- Hoorah! ;-)
-
- To start the ball rolling, we just need the value of y and e for x = 0.
- This is just y = B/D and e = Dy - B (note that e will not be zero because
- of the finite precision of the divide). If B happens to be zero, y starts
- at zero and not one divide is needed. This is the case in the
- implementation below.
-
- IMPLEMENTATION
- ==============
-
- You can see a lot of interlacing of work here. The calculation of C.y
- looks expensive, but in fact you need merely keep a running total which
- you adjust when you adjust y.
-
- Here's the full algorithm, plus a few typos (undoubtedly!) in pseudo-C.
-
- map(lx,lh,lt,ly,rx,rh,rt,ry,scrn)
- /*
- l* and r* are for the left and right sides of the wall to be drawn
- lx,rx - x coordinates
- lh,rh - heights
- lt,rt - texture x coordinates
- ly,ry - top pixel of strip
-
- All these are ints, and the left ones are inclusive values while the
- right ones are exclusive. I.e. we start at the left values and
- approach, but never quite reach, the right values.
-
- scrn is a pointer to the frame buffer, which is 320 pixels wide
- */
-
- w = rx - lx // width of wall in pixels
- h = lh << 16 // height in fixed-point ( > 0 )
- hinc = ((rh - lh) << 16) / w // height increment per x pixel
- y = ly << 16 // top y in fixed-point ( > 0 )
- yinc = ((ry - ly) << 16) / w // y increment per x pixel
- t = lt // current value of texture index (int)
- hinct = 0 // current value of C.x for calcs
- et = 0 // current value of error term
- esub = (((rt - lt) << 16) / w) * rh // the value subtracted first from e
- // note that A is greater than zero
-
- for (i = lx ; i < rx ; i++)
- {
- sp = scrn + (y >> 16)*320 + i // not a multiply, 2 shifts
-
- plot_strip(sp, h >> 16, t) // (target, height, texture coordinate)
-
- et -= esub // now in equation (1)
-
- while (et < 0) // only need to check for less than
- { // as A is positive
- et += h // adjust et
- t += 1 // adjust t
- hinct += hinc // adjust C.t
- }
-
- h += hinc // adjust (C.x + D) - the height
- et += hinct // adjust et by C.t (equation (2))
-
- while (et < 0) // this loop not needed if hinc < 0
- {
- et -= h // adjust et down
- t -= 1 // adjust t down
- hinct -= hinc // adjust C.t down
- }
-
- while (et > 0) // this loop not needed if hinc > 0
- {
- et += h
- t += h
- hinct += hinc
- }
-
- y += yinc // update y coordinate of wall
- }
-
- EXTENSIONS
- ==========
-
- In plot_strip you'll probably want to divide the texture height by the
- strip height, to get an increment value. This expression is of the form:
-
- A / (C.x + D)
-
- and can also be calculated using this method. Note that the divides at the
- start (3 of them) are all by the same number, so only one divide is really
- needed. Add the divide to get the initial value A/D (or texture height
- over initial strip height) for the plot_strip increment, and thats two
- divides per polygon! Not bad ;-)
-
- Note the following fascinating fact: If hinc > 0, then *every* while loop
- increments t by 1. Therefore, the total number of while loops executed is
- exactly equal to (rt - lt). This limits the overall cost of the algorithm.
- If hinc < 0, we can just write another function which plots walls
- backwards and brings hinc > 0 again ;-)
-
- TO THOSE WHO SNEER AT DOOM-STYLE WALLS
- ======================================
-
- In the future, when Owls have Virtual Reality, these algorithms will become
- important once more. <grin>
-
- --
- Eddie xxx (eddie@powerslv.demon.co.uk, ee@datcon.co.uk, leg@lize.it)