The Datafile PD-CD 3

home *** CD-ROM | disk | FTP | other *** search

/ The Datafile PD-CD 3 / PDCD_3.iso / utilities / utilsf / graphtext / Eddie2 < prev

Wrap

Internet Message Format | 1995-06-12 | 9.5 KB

To: A.Pereira@cs.ucl.ac.uk From: Eddie Edwards <eddie@powerslv.demon.co.uk> Subject: Re: Fast texture coordinate calculation Reply-To: eddie@powerslv.demon.co.uk Date: Tue, 02 May 1995 20:23:49 +0100 Message-ID: <19950502.202349.32@powerslv.demon.co.uk> Organization: I Inhaled X-Mailer: Archimedes TTFN Version 0.36 > Cool! Maybe we'll be able to figure out how Michael Abrash managed to > get all of Quake's divides for free! *All* of them - now that's impressive. What I've worked out is somewhat more modest than that. It can only be used to speed up a series of 'similar' divides (I'll define 'similar' in a minute!). So it's not much use for, say, doing perspective transformations on random groups of points. In fact, I'd be amazed if it was *possible* to eliminate all the divides from a 3D game - but then, that's Michael Abrash for you ;-) I'll try not to get *too* mathematical, but it is a mathematical trick so it's inevitable to some extent. PROBLEM ======= To calculate u in the fast texture coordinate calculation for DOOM-style walls (see previous post) requires a divide per strip. We don't want that. ANALYSIS ======== The divide occurs in the following statement: u = (128 * y2)/height y2 and height are both functions of x (the integer loop variable, representing the current strip's x coordinate) and hence u is too. Assuming for simplicity that x starts at 0 (and goes upwards in increments of 1), y2 = y2inc * x height = lheight + (heightinc * x) Since y2inc, lheight and heightinc are arbitrary fixed-point numbers, what we need is a general way of calculating the value of A.n / (C.n + D) For A,C,D any fixed-point numbers and n an integer going from zero upwards. In fact, why not make the work pay off later and say we want to calculate (A.n + B) / (C.n + D) Then it might come in handy in another program ;-) THOUGHTS ======== Well, the Bresenham line algorithm is famous for eliminating the divide and using only integer arithmetic. It basically calculates values of the function A.x / B without calculating A/B first. I am going to introduce a new way of looking at the Bresenham algorithm, in terms of integer equalities. Firstly, this assumes *fixed point* numbers. This is important, because there must exist a value X such that A.X is an integer for any value of A. For, say, 16.16 fixed-point numbers, this value is 65536. Basically, you're just 'forgetting' the point, but this isn't as easy to do with floating point, and FP is too slow anyway (certainly without a copro ;-) Since y is an integer (it's a pixel coordinate) and x is too, then when we write y = A.x / B we are lying! y is an integer, but A.x / B is a rational number and very probably *not* an integer. They are almost certainly *not* equal! To make this into an *equality* we must add an error term: y = A.x / B + e e is the actual error in the calculation. I always round down (rounding to nearest or rounding up can be implemented using offsets to the original values, as in Bresenham, if you prefer), so to get the best value of y we must have e in the range 0 (inclusive) to 1 (exclusive). I'm going to stop there and move back to the problem of (A.n + B) / (C.n + D). The methods applied later can be brought back and applied to the Bresenham equality above, and Bresenham's algorithm can be derived (as an exercise for the reader ;-) DERIVING THE SOLUTION ===================== For the problem in hand, the equation is: y = (A.x + B) / (C.x + D) + e 0 <= e < 1 Let's multiply through by (C.x + D) to get rid of that divide y.(C.x + D) = A.x + B + e.(C.x + D) 0 <= e < 1 Trick #1: e is just a random error, and we don't care what value it has. We just need to limit it's value so that we get the right value of y. Therefore, just rename e.(C.x + D) to f and change the limits of f to be between 0 and (C.x + D). Then rename f to e again ;-) y.(C.x + D) = A.x + B + e 0 <= e <= (C.x + D) Now, multiply through by 65536 and every fixed-point number becomes an integer. Simply forget the points on all the numbers. Easy! The above is now an integer equality. The algorithm must take a value of y and x, increment x, and return a value of y for the new x. This might hurt ... Keep the equality 'God'. That equation above must always be satisfied. If you change the equation by adding something to both sides, or switching values around, you must still have a valid equation. We want to replace all the 'x's in the equation above with '(x+1)'s. We do so in two stages, at all times *keeping the equality*. First we change the right x to x+1, and to keep the equality we subtract A from e: y.(C.x + D) = A.(x+1) + B + e' e' = (e - A) (1) But now e' might be out of range. We check to see if it is (only one check is needed if you know the sign of A in advance) and if it is we adjust y by units of 1, and adjust e by units of (C.x + D) (again, keeping the equality) until it is correct. This is exactly analogous to the Bresenham step. Now adjust x on the left: y.(C.(x+1) + D) = A.(x+1) + B + e" e" = (e' + C.y) (2) Then adjust e" and y (using (C.(x+1) + D) as the adjustment) until e" is in the *new* range 0 to (C.(x+1) + D). It's almost like mutating the numbers until they form the next equation in the set. Look at what we have: y'.(C.x' + D) = A.x' + B + e' x' is x+1, e' is some new error - don't care what it is, but it's between 0 and (C.x' + d). Which means that y' is the number we are after! Hoorah! ;-) To start the ball rolling, we just need the value of y and e for x = 0. This is just y = B/D and e = Dy - B (note that e will not be zero because of the finite precision of the divide). If B happens to be zero, y starts at zero and not one divide is needed. This is the case in the implementation below. IMPLEMENTATION ============== You can see a lot of interlacing of work here. The calculation of C.y looks expensive, but in fact you need merely keep a running total which you adjust when you adjust y. Here's the full algorithm, plus a few typos (undoubtedly!) in pseudo-C. map(lx,lh,lt,ly,rx,rh,rt,ry,scrn) /* l* and r* are for the left and right sides of the wall to be drawn lx,rx - x coordinates lh,rh - heights lt,rt - texture x coordinates ly,ry - top pixel of strip All these are ints, and the left ones are inclusive values while the right ones are exclusive. I.e. we start at the left values and approach, but never quite reach, the right values. scrn is a pointer to the frame buffer, which is 320 pixels wide */ w = rx - lx // width of wall in pixels h = lh << 16 // height in fixed-point ( > 0 ) hinc = ((rh - lh) << 16) / w // height increment per x pixel y = ly << 16 // top y in fixed-point ( > 0 ) yinc = ((ry - ly) << 16) / w // y increment per x pixel t = lt // current value of texture index (int) hinct = 0 // current value of C.x for calcs et = 0 // current value of error term esub = (((rt - lt) << 16) / w) * rh // the value subtracted first from e // note that A is greater than zero for (i = lx ; i < rx ; i++) { sp = scrn + (y >> 16)*320 + i // not a multiply, 2 shifts plot_strip(sp, h >> 16, t) // (target, height, texture coordinate) et -= esub // now in equation (1) while (et < 0) // only need to check for less than { // as A is positive et += h // adjust et t += 1 // adjust t hinct += hinc // adjust C.t } h += hinc // adjust (C.x + D) - the height et += hinct // adjust et by C.t (equation (2)) while (et < 0) // this loop not needed if hinc < 0 { et -= h // adjust et down t -= 1 // adjust t down hinct -= hinc // adjust C.t down } while (et > 0) // this loop not needed if hinc > 0 { et += h t += h hinct += hinc } y += yinc // update y coordinate of wall } EXTENSIONS ========== In plot_strip you'll probably want to divide the texture height by the strip height, to get an increment value. This expression is of the form: A / (C.x + D) and can also be calculated using this method. Note that the divides at the start (3 of them) are all by the same number, so only one divide is really needed. Add the divide to get the initial value A/D (or texture height over initial strip height) for the plot_strip increment, and thats two divides per polygon! Not bad ;-) Note the following fascinating fact: If hinc > 0, then *every* while loop increments t by 1. Therefore, the total number of while loops executed is exactly equal to (rt - lt). This limits the overall cost of the algorithm. If hinc < 0, we can just write another function which plots walls backwards and brings hinc > 0 again ;-) TO THOSE WHO SNEER AT DOOM-STYLE WALLS ====================================== In the future, when Owls have Virtual Reality, these algorithms will become important once more. <grin> -- Eddie xxx (eddie@powerslv.demon.co.uk, ee@datcon.co.uk, leg@lize.it)