Tutorial

Programming Example: Hello World

Like all good programming tutorials, here is the "Hello World" program done in AltiVec.
The program starts by declaring a variable 'v1' of a vector type. The AltiVec C programming model defines eleven vector types.
Then the program sets the variable v1 to the sequence of characters 'Hello World 1234'. This is done by casting a set of 16 chars to a vector. Note that parentheses () are used here instead of brackets {} to assign literals to a vector.

Finally, the program calls printf to write out the variable v1. printf has been extended to understand vectors. Vectors can be formatted as follows:

Note: printf() may not correctly recognize vector types on MacOS 10.0.x and MacOS 10.1.x. In such cases you will generally see the format specifier (e.g. %vc) printed instead of the vector.

Compiling and Linking

Compiling and linking the Hello World program using MrC, Project Builder / GCC and most third party compilers are simple tasks. Details for each are provided below:

Project Builder / GCC
Xcode

Xcode is similar to Project Builder, in that it uses GCC to do the compiling. Apple has made turning on AltiVec easier by adding a button for this purpose. Select the appropriate target in the Xcode main window, and then select Show Inspector... from the Project menu. Click the Build pane, then select the GNU C/Obj C Compiler subset of settings in the drawer at the left side of the inspector window. "Enable AltiVec Extension" should be visible. Make sure the check box has a check mark in it.

Other Compilers

Please check with your compiler vendor about how best to use AltiVec with that product. If the compiler offers you the choice to auto-generate VRSAVE instructions or keep the vrsave register up to date for you, you should enable that option. Proper use of the vrsave register is required on both MacOS 9 and X.

The operating system uses the value stored in the vrsave special purpose register to keep track of which vector registers to save in the event of a thread or task switch or other variety of preemption, such as an interrupt level task. Only those registers marked in use in the vrsave special purpose register will be restored when control is returned to your application. All others may be overwritten with a garbage value such as 0x7FFFDEAD.

On MacOS X, registers may be saved and restored in sets. Also, vector register context switches may be done lazily. As a result, it is not guaranteed that every register marked unused by vrsave will be overwritten by 0x7FFFDEAD when a context switch happens.

Please see the FORTRAN page for more information about FORTRAN compilers.
Example: Two's Complement

This second example shows how vectors can be passed to subroutines. The C programming model allows vectors to be passed both by value and by reference. In this example we will pass by value. Passing by value is in general a lot more efficient with vectors, as it reduces the amount of loading and storing that needs to be done.

The AltiVec instruction set does not contain a 'Complement' instruction. To perform a two's complement we can either first perform a one's complement and then add 1, or subtract the number from zero. The AltiVec instruction set also does not include a 'not' instruction; however, it does include a 'nor' instruction, which will perform a 'not' if both of the parameters are the same. This approach uses the one's complement plus 1 method:

```c
#include <stdio.h>

vector signed int vec_2sComp (vector signed int x);

int main (void)
{
    vector signed int v1 = (vector signed int) (-2,0,2,8);
    vector signed int v2;

    v2 = vec_2sComp (v1);

    /* printf has been extended to format vectors (see Hello World) */
    printf ("Numbers %9.8vld\n", v1);
    printf ("2's Comp %9.8vld\n", v2);

    return 0;
}

/* Two's complement: one's complement (nor of x with itself), then add 1 */
vector signed int vec_2sComp (vector signed int x)
{
    vector signed int one = (vector signed int) (1);   /* 1 in every element */

    x = vec_nor (x, x);
    x = vec_add (x, one);

    return x;
}
```

As in the last example, we can initialize a vector variable by casting a set of scalars to be a vector. The set of scalars must have either one element or the same number of elements as the vector. The compiler is cognizant of a number of vectors that can be generated "on the fly" rather than being loaded from memory. Refer to the constants example for more details. The constant '1' could also be generated as it is needed by moving it into the vec_add instruction, as shown here:
In fact, for those of you who like to write condensed code, this can all be written as one line, like this:
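The one's-complement-plus-one method can be checked lane by lane without AltiVec hardware. The sketch below is a portable scalar model, not AltiVec code. The `v4i32` struct and the function names are hypothetical, standing in for a `vector signed int` and for an expression along the lines of `vec_add (vec_nor (x, x), (vector signed int) (1))`. It assumes a compiler that wraps when converting an out-of-range `uint32_t` to `int32_t`, as GCC and Clang do.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a `vector signed int`: four 32-bit lanes. */
typedef struct { int32_t lane[4]; } v4i32;

/* Scalar model of vec_add(vec_nor(x, x), 1), applied to each lane. */
static v4i32 model_2s_comp_nor_add(v4i32 x)
{
    v4i32 r;
    for (int i = 0; i < 4; ++i) {
        uint32_t u = (uint32_t)x.lane[i];
        uint32_t ones_comp = ~(u | u);          /* nor(u, u) is the same as not(u) */
        r.lane[i] = (int32_t)(ones_comp + 1u);  /* modular add of the constant 1 */
    }
    return r;
}

/* Usage: the input from the example program, (-2, 0, 2, 8). */
static void demo_nor_add(void)
{
    v4i32 v1 = { { -2, 0, 2, 8 } };
    v4i32 v2 = model_2s_comp_nor_add(v1);
    assert(v2.lane[0] == 2 && v2.lane[1] == 0);
    assert(v2.lane[2] == -2 && v2.lane[3] == -8);
}
```

Calling `demo_nor_add()` from a test harness confirms that each lane is negated independently, which is what the vector version does in a single pair of instructions.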
Another approach to the problem is to subtract the number from zero. This approach yields the following code.
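As a rough check of the subtract-from-zero method, here is a portable scalar model. It is a sketch, not AltiVec code: `v4i32` and `model_2s_comp_sub_from_zero` are hypothetical names, and the AltiVec form is presumably something like `vec_sub (zero, x)` with `zero` declared as `(vector signed int) (0)`. Because `vec_sub` is the modular (non-saturating) subtract, the model uses unsigned arithmetic so that the most negative value wraps back to itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a `vector signed int`: four 32-bit lanes. */
typedef struct { int32_t lane[4]; } v4i32;

/* Scalar model of vec_sub(zero, x), applied to each lane. */
static v4i32 model_2s_comp_sub_from_zero(v4i32 x)
{
    const uint32_t zero = 0u;  /* a typed constant, like (vector signed int) (0) */
    v4i32 r;
    for (int i = 0; i < 4; ++i) {
        /* Unsigned subtraction wraps modulo 2^32, like vec_sub. */
        r.lane[i] = (int32_t)(zero - (uint32_t)x.lane[i]);
    }
    return r;
}

/* Usage: negate (-2, 0, 2, 8) and check each lane. */
static void demo_sub_from_zero(void)
{
    v4i32 v1 = { { -2, 0, 2, 8 } };
    v4i32 v2 = model_2s_comp_sub_from_zero(v1);
    assert(v2.lane[0] == 2 && v2.lane[1] == 0);
    assert(v2.lane[2] == -2 && v2.lane[3] == -8);
}
```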
This routine is so simple now that we might be tempted to turn it into a macro instead of a subroutine. The problem is the constant zero. It has been given a specific type, vector signed int. If we write a macro with this constant definition in it, we cannot use it as a general purpose routine to find the two's complement of any vector, regardless of type. We need to come up with a way of generating the value zero instead of defining it. This can be done in many ways, such as subtracting x from itself or XORing x with itself. Here is the subroutine using subtract:
And here it is as a macro:
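To illustrate why generating zero as x - x makes the macro type-independent, here is a scalar sketch in portable C. The macro name `MODEL_2S_COMP` and the wrapper functions are hypothetical. The AltiVec macro would presumably look something like `#define vec_2sComp(x) vec_sub (vec_sub ((x), (x)), (x))`, but that form is an assumption based on the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of the macro: zero is generated as (x - x), so the macro
 * carries no typed constant and works for any arithmetic type. */
#define MODEL_2S_COMP(x) (((x) - (x)) - (x))

/* The same macro body serves several element types.
 * Signed scalar overflow is undefined in C, so do not pass INT32_MIN to the
 * signed wrappers; the vector subtract wraps instead. */
static int16_t  negate_i16(int16_t x)  { return (int16_t)MODEL_2S_COMP(x); }
static int32_t  negate_i32(int32_t x)  { return MODEL_2S_COMP(x); }
static uint32_t negate_u32(uint32_t x) { return MODEL_2S_COMP(x); }

/* Usage: one macro, three element types. */
static void demo_generic_macro(void)
{
    assert(negate_i16(5) == -5);
    assert(negate_i32(-8) == 8);
    assert(negate_u32(1u) == UINT32_MAX);  /* unsigned arithmetic wraps */
}
```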
It should be noted that because implied type conversion is not allowed with vector types and overt type coercion does not actually change any bits in the vector, extremely stringent type checking is not quite as necessary with vectors as it is with scalar integer or floating point types. This is not to say that it is unnecessary, however. Types ensure that you get the correct form of an operation when there is more than one kind available. In addition, AltiVec macros are still subject to problems common to preprocessor macro expansion. Here is a classic example where a macro would fail, but a normal function would succeed:
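One well-known failure of this kind is an argument with side effects, which a macro evaluates once for every place it appears in the macro body. The sketch below is an assumption about the kind of failure meant, written as a portable scalar model. `NEGATE_MACRO`, `negate_inline`, and `next_value` are hypothetical names, with `next_value` standing in for something like a load that advances a pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Macro form: the argument expression appears three times in the body. */
#define NEGATE_MACRO(x) (((x) - (x)) - (x))

/* Function form: the argument is evaluated once, then copied into x. */
static inline int32_t negate_inline(int32_t x) { return (x - x) - x; }

/* Counts how many times the argument expression is evaluated. */
static int calls;

static int32_t next_value(void)
{
    ++calls;
    return 7;
}

static int evaluations_with_macro(void)
{
    calls = 0;
    (void)NEGATE_MACRO(next_value());   /* expands to three calls */
    return calls;
}

static int evaluations_with_inline(void)
{
    calls = 0;
    (void)negate_inline(next_value());  /* a single call */
    return calls;
}
```

With a source that returns a different value on each call, the three evaluations inside the macro would no longer cancel, and the result would be wrong rather than merely slow.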
As a result, you may find that much of your code is better written as an inline function. Inline functions prevent use of the function with types for which it does not make sense. For example, using vec_2sComp() on a vector float is dubious: the trick used here to generate zero does not work if x is NaN or ±Inf. So, you may choose to provide vec_2sComp as inline functions for just those vector types that make sense:
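The floating-point problem is easy to reproduce in scalar C, because x - x is NaN when x is NaN or infinite under IEEE 754 arithmetic. The sketch below is a portable model with hypothetical names, not the AltiVec inline functions themselves. It shows the failing zero trick for `float` and typed inline negations offered only for integer element types. It assumes default floating-point settings (no fast-math) and a compiler that wraps on narrowing conversions, as GCC and Clang do.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* The zero-generation trick, x - x, applied to a float. */
static inline float zero_via_sub_f32(float x) { return x - x; }

/* Typed inline negations for integer element types only. Unsigned
 * arithmetic is used so the result wraps, as the modular vector subtract does. */
static inline int8_t model_2s_comp_s8(int8_t x)
{
    return (int8_t)(uint8_t)(0u - (uint8_t)x);
}

static inline int16_t model_2s_comp_s16(int16_t x)
{
    return (int16_t)(uint16_t)(0u - (uint16_t)x);
}

static inline int32_t model_2s_comp_s32(int32_t x)
{
    return (int32_t)(0u - (uint32_t)x);
}

/* Usage: the trick gives zero for finite floats but NaN for Inf and NaN. */
static void demo_float_zero_trick(void)
{
    assert(zero_via_sub_f32(3.5f) == 0.0f);
    assert(isnan(zero_via_sub_f32(INFINITY)));
    assert(isnan(zero_via_sub_f32(NAN)));
}
```

Because no float overload exists, a call with a float argument does not silently pick up the broken zero trick; in the vector case the mismatch is rejected outright, since implied conversion between vector types is not allowed.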
Inline functions should have similar performance to the macro as long as the compiler honors the inline request. C++ template functions are another approach that gives stronger type checking.