FMUL4X4: OPCODE: db,f1 IIT ONLY
This instruction is available only on the IIT (Integrated
Information Technology Inc.) math processors.
Takes 242 clocks.
The instruction performs a 4x4 matrix multiply in one
instruction using four banks of 8 floating-point registers.
The operands must be loaded into specific banks in a specific
order. The equation solved can be represented by:
Xn = (A00 * Xo) + (A01 * Yo) + (A02 * Zo) + (A03 * Vo)
Yn = (A10 * Xo) + (A11 * Yo) + (A12 * Zo) + (A13 * Vo)
Zn = (A20 * Xo) + (A21 * Yo) + (A22 * Zo) + (A23 * Vo)
Vn = (A30 * Xo) + (A31 * Yo) + (A32 * Zo) + (A33 * Vo)
where Xo, Yo, Zo and Vo are the original values and Xn, Yn,
Zn and Vn the results. Operands must be loaded into the
following registers of the specified banks, in the specified
order.
                Before FMUL4X4          After FMUL4X4
                     bank                   bank
  Register:     0      1      2              0
  ST(0)         Xo     A33    A31            Xn
  ST(1)         Yo     A23    A21            Yn
  ST(2)         Zo     A13    A11            Zn
  ST(3)         Vo     A03    A01            Vn
  ST(4)                A32    A30            ?
  ST(5)                A22    A20            ?
  ST(6)                A12    A10            ?
  ST(7)                A02    A00            ?
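The register layout can be checked against a small software model. The sketch below is not IIT code: `iit_model`, `load_operands` and `fmul4x4_model` are hypothetical names, and it assumes the operation is the usual matrix-by-vector product, in which each row of A is combined with all four of Xo, Yo, Zo and Vo. `bank[b][i]` stands for ST(i) in bank b.

```c
/* Hypothetical software model of the operand layout (not IIT code).
   bank[b][i] stands for register ST(i) in bank b; bank 3 is omitted
   because it is an internal scratchpad. */
typedef struct {
    double bank[3][8];
} iit_model;

/* Place the operands as the table prescribes.
   a[r][c] is A<r><c>; v[] holds Xo, Yo, Zo, Vo in that order. */
static void load_operands(iit_model *m, const double a[4][4],
                          const double v[4])
{
    for (int r = 0; r < 4; r++) {
        m->bank[0][r]     = v[r];     /* bank 0, ST(0..3) = Xo..Vo   */
        m->bank[1][3 - r] = a[r][3];  /* bank 1, ST(0..3) = A33..A03 */
        m->bank[1][7 - r] = a[r][2];  /* bank 1, ST(4..7) = A32..A02 */
        m->bank[2][3 - r] = a[r][1];  /* bank 2, ST(0..3) = A31..A01 */
        m->bank[2][7 - r] = a[r][0];  /* bank 2, ST(4..7) = A30..A00 */
    }
}

/* Assumed semantics: result row r = sum over c of A<r><c> * input c.
   Results replace ST(0..3) of bank 0. ST(4..7) of bank 0 are shown
   as "?" in the table, so the model leaves them untouched. */
static void fmul4x4_model(iit_model *m)
{
    double in[4], out[4];

    for (int i = 0; i < 4; i++)
        in[i] = m->bank[0][i];

    for (int r = 0; r < 4; r++) {
        out[r] = m->bank[2][7 - r] * in[0]   /* A<r>0 * Xo */
               + m->bank[2][3 - r] * in[1]   /* A<r>1 * Yo */
               + m->bank[1][7 - r] * in[2]   /* A<r>2 * Zo */
               + m->bank[1][3 - r] * in[3];  /* A<r>3 * Vo */
    }

    for (int i = 0; i < 4; i++)
        m->bank[0][i] = out[i];
}
```

Calling `load_operands` followed by `fmul4x4_model` leaves Xn, Yn, Zn and Vn in `bank[0][0]` through `bank[0][3]`. The model says nothing about timing or rounding on the real chip.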
All four banks can be selected with the bank-switching
instructions, but only banks 0, 1 and 2 are useful, since bank
3 is an internal scratchpad. Each bank holds 8 floating-point
values and may be re-used with normal instructions. Each bank
acts like an independent i80287, except when banks are
switched in between operations without the initial status
being maintained.
Pseudo-multichip operation can be performed in each bank
and even in multiple banks at the same time (although only
one instruction will operate on one register at any given
time), provided that the active register and top register
are not changed after switching from bank to bank.
EXAMPLE:
        FINIT                       ; reset control word
        FSBP1                       ; select bank 1
        FLD DWORD PTR es:[si]       ; first original
        FLD DWORD PTR es:[si+4]     ; second original
        FLD DWORD PTR es:[si+8]     ; third original
        FSTCW WORD PTR [bx]         ; save FPU control status
        FSBP2                       ; NOTE! you will see three
                                    ; active registers in this
                                    ; bank when using a debugger
        FINIT                       ; nothing visible
        FLD DWORD PTR [si]          ; new value
        FLD DWORD PTR [si+4]        ; second new value
        FADD ST,ST(1)               ; two values visible
        FSTP DWORD PTR [si+8]       ; one value visible
        FSBP1                       ; one original visible
        FLDCW WORD PTR [bx]         ; restore FPU status to the
                                    ; one active in bank 1,
                                    ; causing original three
                                    ; values to be visible again
                                    ; in correct sequence
        ; ... simply continue with what you wanted to do with
        ; those numbers from es:[si], they are still there.
        FLD DWORD PTR [si+8]        ; for instance...
This feature of the IIT chips can be used to perform complex
operations in registers when many components remain the
same across a large dataset: save the intermediate result
to ONE memory location, switch to the bank holding the next
series of operands, load that ONE operand and continue the
calculation with the operands already in that bank. This
requires one extra read into the new bank, but may save time
and memory space compared to memory-based operands or
multi-pass algorithms with multiple arrays of intermediate
results.
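The bank-switching example can be mimicked in software to see why the original values survive. This is only a sketch under stated assumptions, not a description of the hardware: it assumes that the data registers are private to each bank while the stack-top pointer and the "register in use" flags are shared by all banks, which is what the example's comments suggest. All names (`fpu_model`, `select_bank`, `save_state` and so on) are hypothetical, and `save_state`/`restore_state` stand in for the save and restore steps of the example.

```c
/* Hypothetical model: four banks of eight data registers, with ONE
   stack-top pointer and ONE set of in-use flags shared by all banks
   (an assumption inferred from the example's comments). */
typedef struct {
    double   reg[4][8];  /* data registers, private to each bank  */
    int      bank;       /* currently selected bank               */
    int      top;        /* shared stack-top index, 0..7          */
    unsigned used;       /* shared mask: bit i set = reg i in use */
} fpu_model;

typedef struct { int top; unsigned used; } fpu_state;

/* Reset the shared stack state; the bank selection is kept (assumed). */
static void finit(fpu_model *f) { f->top = 0; f->used = 0; }

static void select_bank(fpu_model *f, int b) { f->bank = b; }

/* Push a value onto the stack of the current bank. */
static void fld(fpu_model *f, double x)
{
    f->top = (f->top + 7) & 7;            /* decrement modulo 8 */
    f->reg[f->bank][f->top] = x;
    f->used |= 1u << f->top;
}

/* Read ST(i) of the current bank. */
static double st(const fpu_model *f, int i)
{
    return f->reg[f->bank][(f->top + i) & 7];
}

/* Model of FADD ST,ST(1): ST(0) = ST(0) + ST(1), no pop. */
static void fadd_st1(fpu_model *f)
{
    f->reg[f->bank][f->top] += st(f, 1);
}

/* Pop ST(0) of the current bank and return it. */
static double fstp(fpu_model *f)
{
    double x = f->reg[f->bank][f->top];
    f->used &= ~(1u << f->top);
    f->top = (f->top + 1) & 7;
    return x;
}

/* Number of registers a debugger would show as active. */
static int visible(const fpu_model *f)
{
    int n = 0;
    for (int i = 0; i < 8; i++)
        n += (f->used >> i) & 1u;
    return n;
}

static fpu_state save_state(const fpu_model *f)
{
    fpu_state s = { f->top, f->used };
    return s;
}

static void restore_state(fpu_model *f, fpu_state s)
{
    f->top  = s.top;
    f->used = s.used;
}
```

In this model, pushing three values in bank 1, saving the state, working in bank 2 after `finit`, then switching back and restoring the state makes the three bank 1 values readable again through `st(f, 0)` to `st(f, 2)`, because nothing done in bank 2 ever wrote to bank 1's data registers.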
BANKSWITCH INSTRUCTIONS: