void Test(void)
{
    float latitude, longitude;
    float dToR = M_PI / 180.0;

    /* Stands in for glClear() so that no rendering work is done. */
    glColor3f(0, 0, 0);

    for (latitude = -90; latitude < 90; ++latitude) {
        glBegin(GL_QUAD_STRIP);
        for (longitude = 0; longitude <= 360; ++longitude) {
            GLfloat x, y, z;

            /* Vertex on the lower edge of the strip. */
            x = sin(longitude * dToR) * cos(latitude * dToR);
            y = sin(latitude * dToR);
            z = cos(longitude * dToR) * cos(latitude * dToR);
            /* Two glColor3f() calls stand in for glNormal3f() and glVertex3f(). */
            glColor3f(x, y, z);
            glColor3f(x, y, z);

            /* Vertex on the upper edge of the strip. */
            x = sin(longitude * dToR) * cos((latitude+1) * dToR);
            y = sin((latitude+1) * dToR);
            z = cos(longitude * dToR) * cos((latitude+1) * dToR);
            glColor3f(x, y, z);
            glColor3f(x, y, z);
        }
        glEnd();
    }
}
% cc -o perf -O -p perf.c -lGLU -lGL -lX11
% perf
% prof perf
-------------------------------------------------------------
Profile listing generated Wed Jul 19 17:17:03 1995
    with:       prof perf
-------------------------------------------------------------
samples   time    CPU    FPU     Clock   N-cpu  S-interval  Countsize
    219   2.2s  R4000  R4010  100.0MHz       0      10.0ms   0(bytes)
Each sample covers 4 bytes for every 10.0ms (0.46% of 2.1900sec)
----------------------------------------------------------------------
  -p[rocedures] using pc-sampling.
  Sorted in descending order by the number of samples in each procedure.
  Unexecuted procedures are excluded.
-----------------------------------------------------------------------
samples   time(%)       cum time(%)    procedure (file)
    112   1.1s( 51.1)   1.1s( 51.1)    __sin (/usr/lib/libm.so:trig.s)
     29  0.29s( 13.2)   1.4s( 64.4)    Test (perf:perf.c)
     18  0.18s(  8.2)   1.6s( 72.6)    __cos (/usr/lib/libm.so:trig.s)
     16  0.16s(  7.3)   1.8s( 79.9)    Finish (/usr/lib/libGLcore.so:../EXPRESS/gr2_context.c)
     15  0.15s(  6.8)   1.9s( 86.8)    __glexpim_Color3f (/usr/lib/libGLcore.so:../EXPRESS/gr2_vapi.c)
     14  0.14s(  6.4)     2s( 93.2)    _BSD_getime (/usr/lib/libc.so.1:BSD_getime.s)
      3  0.03s(  1.4)   2.1s( 94.5)    __glim_Finish (/usr/lib/libGLcore.so:../soft/so_finish.c)
      3  0.03s(  1.4)   2.1s( 95.9)    _gettimeofday (/usr/lib/libc.so.1:gettimeday.c)
      2  0.02s(  0.9)   2.1s( 96.8)    InitBenchmark (perf:perf.c)
      1  0.01s(  0.5)   2.1s( 97.3)    __glMakeIdentity (/usr/lib/libGLcore.so:../soft/so_math.c)
      1  0.01s(  0.5)   2.1s( 97.7)    _ioctl (/usr/lib/libc.so.1:ioctl.s)
      1  0.01s(  0.5)   2.1s( 98.2)    __glInitAccum64 (/usr/lib/libGLcore.so:../soft/so_accumop.c)
      1  0.01s(  0.5)   2.2s( 98.6)    _bzero (/usr/lib/libc.so.1:bzero.s)
      1  0.01s(  0.5)   2.2s( 99.1)    GetClock (perf:perf.c)
      1  0.01s(  0.5)   2.2s( 99.5)    strncpy (/usr/lib/libc.so.1:strncpy.c)
      1  0.01s(  0.5)   2.2s(100.0)    _select (/usr/lib/libc.so.1:select.s)
    219   2.2s(100.0)   2.2s(100.0)    TOTAL

Almost 60% of the program's time for a single frame is spent computing trigonometric functions (__sin and __cos).
There are several ways to improve this situation. First, consider reducing the resolution of the quad strips that model the sphere. The current representation has over 60,000 quads (180 strips of 360 quads each, or 64,800), which is probably more than a high-quality image needs. After that, consider other changes. For example:
void Test(void)
{
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    glCallList(1);
}
....
void RunTest(void){...
    glNewList(1, GL_COMPILE);
    for (latitude = -90; latitude < 90; ++latitude) {
        glBegin(GL_QUAD_STRIP);
        for (longitude = 0; longitude <= 360; ++longitude) {
            GLfloat x, y, z;
            x = sin(longitude * dToR) * cos(latitude * dToR);
            y = sin(latitude * dToR);
            z = cos(longitude * dToR) * cos(latitude * dToR);
            glNormal3f(x, y, z);
            glVertex3f(x, y, z);
            x = sin(longitude * dToR) * cos((latitude+1) * dToR);
            y = sin((latitude+1) * dToR);
            z = cos(longitude * dToR) * cos((latitude+1) * dToR);
            glNormal3f(x, y, z);
            glVertex3f(x, y, z);
        }
        glEnd();
    }
    glEndList();
    printf("%.2f frames per second\n", Benchmark(Test));
}

This version of the program achieves a little less than 2.5 frames per second, a noticeable improvement.
When the glClear(), glNormal3f(), and glVertex3f() calls are again replaced with glColor3f(), the program runs at roughly 4 frames per second. Because removing the rendering work now changes the frame rate, the program is no longer CPU-limited, so you need to look further to find the bottleneck.