AltiVec and the G3 (and earlier processors)

Apple does not provide an AltiVec emulator for the G3 and earlier processors. If one of these processors encounters AltiVec instructions, it may take an illegal instruction exception. If your application will be running on older processors you must check at run time for the presence of the AltiVec unit.

Checking for AltiVec

To check whether the user's machine supports AltiVec in Classic or CFM Carbon, use Apple's Gestalt Manager:

//For Classic, use Gestalt.h
#include <Gestalt.h>

//For Carbon, use CoreServices.h instead
#include <CoreServices/CoreServices.h>

Boolean IsAltiVecAvailable( void )
{
    long cpuAttributes;
    Boolean hasAltiVec = false;
    OSErr err = Gestalt( gestaltPowerPCProcessorFeatures, &cpuAttributes );

    //Compare against 0 so that the result is a clean true/false value
    //rather than a bit mask stored in a Boolean
    if( noErr == err )
        hasAltiVec = 0 != ( ( 1 << gestaltPowerPCHasVectorInstructions ) & cpuAttributes );

    return hasAltiVec;
}

For Mach-O applications, use sysctl():

#include <sys/sysctl.h>

//returns: 0 for scalar only, 1 for AltiVec
//Note: may return >1 in the future
int GetAltiVecTypeAvailable( void )
{
    int sels[2] = { CTL_HW, HW_VECTORUNIT };
    int vType = 0; //0 == scalar only
    size_t length = sizeof(vType);
    int error = sysctl(sels, 2, &vType, &length, NULL, 0);

    if( 0 == error ) return vType;

    return 0;
}

Finally, for applications that must also work outside of Mac OS X, you may as a last resort install a signal handler for illegal instructions, attempt a vector instruction, and see whether the handler was called. This should work on any platform that supports signals:

#include <setjmp.h>
#include <signal.h>

volatile int gIsAltiVecPresent = -1;

static sigjmp_buf gEnv;

void sig_ill_handler( int sig )
{
    (void) sig; //unused

    //Set our flag to 0 to indicate AltiVec is illegal
    gIsAltiVecPresent = 0;

    //long jump back to safety
    siglongjmp( gEnv, 1 );
}

int IsAltiVecAvailable( void )
{
    if( -1 == gIsAltiVecPresent )
    {
        sigset_t signame;
        struct sigaction sa_new, sa_old;

        //Set AltiVec to ON
        gIsAltiVecPresent = 1;

        //Set up the signal mask
        sigemptyset( &signame );
        sigaddset( &signame, SIGILL );

        //Set up the signal handler
        sa_new.sa_handler = sig_ill_handler;
        sa_new.sa_mask = signame;
        sa_new.sa_flags = 0;

        //Install the signal handler
        sigaction( SIGILL, &sa_new, &sa_old );

        //Attempt to use AltiVec. Saving the signal mask (second argument 1)
        //lets siglongjmp() unblock SIGILL again when it leaves the handler.
        if( 0 == sigsetjmp( gEnv, 1 ) )
        {
#if defined( __GNUC__ )
            asm volatile ( "vor v0, v0, v0" );
#elif defined( __MWERKS__ )
            asm{ vor v0, v0, v0 }
#else
#error unknown compiler
#endif
        }

        //Restore the old signal handler
        sigaction( SIGILL, &sa_old, NULL );
    }

    return gIsAltiVecPresent;
}

Please note that the signal-based method relies on the compiler to honor the volatile declaration of gIsAltiVecPresent. In addition, inline assembly is used in an attempt to keep the compiler from generating the vector stack save and restore instructions that the PowerPC AltiVec ABI normally requires in the signal-based IsAltiVecAvailable() function. If the compiler decides to generate a stack frame for AltiVec anyway, the function will trigger an illegal instruction exception on a G3 or earlier processor before it reaches the first signal system call, likely causing your application to terminate prematurely. We are therefore relying on implementation-dependent compiler behavior for correct operation of your application. Depending on your tolerance for risk, it may be safer at minimum to move the vor statement to a separate function and take steps to ensure it is not inlined. (See below for further details.) Safer yet would be to write the function in PowerPC assembly. Note that some assemblers use "0" as the register name rather than "v0".

For these and other reasons, it is recommended that developers use either the Gestalt() or sysctl() methods instead of the signal method.
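
If the signal method is used anyway, the suggestion above about isolating the vor instruction might look roughly like the following. This is only a sketch: TouchVectorUnit() and PROBE_NOINLINE are hypothetical names, the noinline attribute is GCC-specific, and the preprocessor test for a PowerPC target is an assumption about which macros your compiler defines. On other targets the function compiles to a stub so that the file still builds.

```c
/* GCC-specific request that the helper never be inlined into its caller.
   Other compilers need their own mechanism (assumption: none is set here). */
#if defined(__GNUC__)
#define PROBE_NOINLINE __attribute__((noinline))
#else
#define PROBE_NOINLINE
#endif

/* Hypothetical helper: executes a single harmless AltiVec instruction.
   Returns 1 if a vector instruction was attempted, 0 if this build
   does not target PowerPC and the instruction was compiled out. */
PROBE_NOINLINE int TouchVectorUnit(void)
{
#if defined(__GNUC__) && (defined(__ppc__) || defined(__powerpc__) || \
                          defined(__ppc64__) || defined(__powerpc64__))
    /* Some assemblers want "vor 0, 0, 0" instead. */
    __asm__ volatile ("vor v0, v0, v0");
    return 1;
#else
    return 0;
#endif
}
```

The signal-based IsAltiVecAvailable() would then call TouchVectorUnit() inside the sigsetjmp() block in place of the inline asm statement, so that the calling function contains nothing that could lead the compiler to emit vector save and restore code.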

Precautions Necessary for Safe Conditional Execution

An mfspr vrsave instruction is likely to be generated automatically by the compiler in the prolog of any function that uses vector types. (This behavior is required for proper function on Mac OS.) The vrsave special purpose register is not present on G3 and earlier processors, so trying to access it with mfspr will cause problems. For this reason, a simple if statement such as the following is not in itself sufficient to prevent earlier processors from encountering vector code.

//FAILS because the presence of vector code in this block
//induces a mfspr vrsave in the (invisible) prolog to this
//C function!
if( IsAltiVecAvailable() )
{
    vector unsigned char a_constant = vec_splat_u8(0);
    ...
}
else
{
    unsigned char a_constant = 0;
    ...
}

You must make separate functions to hold the AltiVec and scalar versions of the same code. Be careful of automatic inlining: if the compiler inlines the AltiVec function into its caller, the vrsave code ends up in the caller's prolog again.
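
As a rough illustration of that split, the sketch below keeps all vector types inside one function and leaves the dispatcher free of them. The names ClearBlock(), ClearBlockScalar(), ClearBlockVector() and NOINLINE are hypothetical, the noinline attribute is GCC-specific, and the hasAltiVec argument is assumed to come from one of the detection functions shown earlier. The vector path is compiled only when the compiler defines __ALTIVEC__, so the file also builds for scalar-only targets.

```c
#include <stddef.h>
#include <string.h>

#if defined(__ALTIVEC__)
#include <altivec.h>
#endif

/* GCC-specific; other compilers need their own way to suppress inlining. */
#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

/* Scalar version: no vector types, safe on any processor.
   Clears blocks16 blocks of 16 bytes each. */
NOINLINE void ClearBlockScalar(unsigned char *dst, size_t blocks16)
{
    memset(dst, 0, blocks16 * 16);
}

#if defined(__ALTIVEC__)
/* Vector version: dst must be 16-byte aligned. Any vrsave handling the
   compiler generates stays inside this function's own prolog. */
NOINLINE void ClearBlockVector(unsigned char *dst, size_t blocks16)
{
    const vector unsigned char zero = vec_splat_u8(0);
    size_t i;

    for (i = 0; i < blocks16; i++)
        vec_st(zero, 0, dst + 16 * i);
}
#endif

/* Dispatcher: contains no vector types, so it is safe to call on a G3. */
void ClearBlock(unsigned char *dst, size_t blocks16, int hasAltiVec)
{
#if defined(__ALTIVEC__)
    if (hasAltiVec) {
        ClearBlockVector(dst, blocks16);
        return;
    }
#else
    (void)hasAltiVec;   /* no vector build available */
#endif
    ClearBlockScalar(dst, blocks16);
}
```

A caller would pass the detection result once per call, for example ClearBlock(buffer, 2, IsAltiVecAvailable()) for a 32-byte, 16-byte-aligned buffer.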

 

Tips for Writing Code that Runs on Both G3 and G4 / G5

If you intend to write an application that uses AltiVec but which also must run on a G3 or earlier processor, here are some tips to help make the process go more smoothly.

  • For functions that you know will have two versions, one AltiVec accelerated, one not, write the AltiVec accelerated version first.
    • AltiVec has more stringent data alignment and organization requirements than do the scalar units.
    • This will help ensure that you avoid committing to data formats or software architectures that are not amenable to vectorization.
    • It is much easier to write scalar code to mirror vector code than the other way around. This is because to a limited extent, the scalar units can be used as a vector unit:
      • A PowerPC has two or three scalar integer units that can be used in parallel. Formally this is like a single 64- or 96-bit vector register. Only one of the integer units can do multiplication and division, however.
      • The FPU is pipelined to a depth of several cycles. In order to avoid FPU data dependency stalls, you need multiple independent data "streams". An FPU with a 4-cycle pipeline can be thought of as a single (unpipelined) vector unit with registers that hold four elements.
      • For these reasons, design elements that work well for vector code (efficient cache usage, high-throughput function design, increased parallelism) also benefit scalar code.
    • In situations where writing for AltiVec first would be premature optimization, but an AltiVec version seems likely, spend a few minutes designing your probable vector approach before writing the scalar version. This will help reduce the probability that you will have to rewrite both later.

  • Separate vector and scalar versions into code units that can be accessed polymorphically.
    • In C++, use a factory method to instantiate a scalar or vector class instance depending on the run time environment. Put the runtime dependent code in virtual class methods.
    • In C, you can use function pointers to achieve the same results. It may be useful to move the vector and scalar code elements out into plug-ins, and load the appropriate plug-in at run time.
    • This sort of function pointer based interface helps guarantee that the compiler will not inline these functions where it should not.
  • The best place to branch to scalar vs. AltiVec is not in leaf functions. Usually, it is at the lowest level at which a function still knows how to address all of the data, not a small subset of it. For example, if you are vectorizing PaintRect(), don't branch to vector code at the level of the function that draws each horizontal row of the rectangle, or an individual pixel. Branch at the level of the function in charge of drawing the whole rectangle.
    • AltiVec functions typically have high stack frame setup overhead and are easily data starved.
    • AltiVec functions perform best with complex calculations.
    • Knowing where all the data is makes it easier to correctly prefetch data into the caches
    • Don't branch so high up that the calculation becomes so complex that the function runs out of vector registers.
  • Use the scalar and vector versions of identical functions to check each other for correctness.
  • Avoid lookup tables. If you must use them for scalar code, document thoroughly how the table was derived, so that they may be replaced with direct calculation in the vector unit later.
  • Use vector-friendly data layouts for the scalar code.
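
Several of the tips above (function pointer dispatch, scalar code shaped like vector code, and using two versions to check each other) can be sketched together in C. Everything here is illustrative: SumFunc, SumScalarSimple(), SumScalarFourWide(), ChooseSumFunc() and SumVersionsAgree() are hypothetical names, and no real AltiVec implementation is included. The four-accumulator scalar routine merely stands in for the slot where a vector version would be returned.

```c
#include <stddef.h>

/* Common signature so callers reach either version through a pointer. */
typedef float (*SumFunc)(const float *data, size_t count);

/* Straightforward reference version: one accumulator, so every add
   depends on the one before it. */
float SumScalarSimple(const float *data, size_t count)
{
    float sum = 0.0f;
    size_t i;

    for (i = 0; i < count; i++)
        sum += data[i];
    return sum;
}

/* Scalar version organized like a four-element vector: four independent
   accumulators give a pipelined FPU four separate data streams. */
float SumScalarFourWide(const float *data, size_t count)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;

    for (; i + 4 <= count; i += 4) {
        s0 += data[i + 0];
        s1 += data[i + 1];
        s2 += data[i + 2];
        s3 += data[i + 3];
    }
    for (; i < count; i++)      /* leftover elements */
        s0 += data[i];
    return (s0 + s1) + (s2 + s3);
}

/* Hypothetical selector. In a real program the first branch would return
   a pointer to an AltiVec routine kept in its own function or plug-in;
   here the four-wide scalar routine is only a stand-in. */
SumFunc ChooseSumFunc(int hasAltiVec)
{
    return hasAltiVec ? SumScalarFourWide : SumScalarSimple;
}

/* Cross-check helper. Exact comparison is only appropriate when every
   partial sum is exactly representable (e.g. small integers); otherwise
   the two summation orders round differently and a tolerance is needed. */
int SumVersionsAgree(const float *data, size_t count)
{
    return SumScalarSimple(data, count) == SumScalarFourWide(data, count);
}
```

Because callers only ever hold a SumFunc pointer chosen once at startup, the compiler has no opportunity to inline the selected routine into code that must also run on a G3.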

Many of these topics and their rationale will be covered in more detail in the sections that follow. Please see the G5 section for key differences between G4 and G5.
