-
- [Note: this is a fairly technical reply to Jered's question, but there
- are a bunch of assembler-heads on this list, so I thought I'd post it.
- Most people will just want to skip this message.]
-
- >>>>> "jered" == jered <jered@MIT.EDU> writes:
-
- jered> I just saw on the linux kernel discuss meeting that 486es
- jered> and higher have a special instruction for converting
- jered> big-endian to/from little-endian. Does anyone know if gcc
- jered> (djgpp) uses this and optimizes for it, what sort of
- jered> performance increase it might give, and if it would be
- jered> worth anyone's while to have a 486-higher executable of
- jered> Executor?
-
- Jered is talking about the "bswap" instruction, which byte swaps a
- four-byte value in a register in one cycle. It was added when the
- 80486 came out, so it isn't present on 80386's. It's non-pairable on
- the Pentium.
-
- gcc doesn't generate the "bswap" instruction, because it won't work on
- an 80386, and even code compiled with the -m486 flag must still run
- on an 80386. I don't know whether gcc has any special handling for
- byte swaps in the first place. From gcc.info:
-
- `-m486'
- `-mno-486'
- Control whether or not code is optimized for a 486 instead of an
- 386. Code generated for an 486 will run on a 386 and vice versa.
-
- Executor's C code uses inline assembly to byte swap with three rotate
- instructions, which works on both the 80386 and 80486+. Our CPU
- emulator (syn68k) decides at runtime if you have an 80486 or better
- and generates bswap instructions "on the fly" if you do. Otherwise,
- it generates three rotate instructions.
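
For readers who want to see the two approaches side by side, here is a rough sketch in portable C. It is not taken from the Executor or syn68k sources. It assumes the usual 80386 idiom of rotating the low 16 bits by 8, rotating the whole register by 16, then rotating the low 16 bits by 8 again. The function names and the has_bswap parameter are made up for illustration, and the real CPU check is not shown.

```c
#include <stdint.h>

/* Rotate the low 16 bits of x by 8 and leave the high 16 bits alone.
   This models what "rorw $8, %ax" does to %eax. */
static uint32_t rot16_low_by_8(uint32_t x)
{
    uint32_t lo = x & 0xFFFFu;
    lo = ((lo >> 8) | (lo << 8)) & 0xFFFFu;
    return (x & 0xFFFF0000u) | lo;
}

/* Rotate all 32 bits by 16, modelling "rorl $16, %eax". */
static uint32_t rot32_by_16(uint32_t x)
{
    return (x >> 16) | (x << 16);
}

/* 80386-safe byte swap built from three rotates (assumed sequence). */
static uint32_t swap32_rotates(uint32_t x)
{
    x = rot16_low_by_8(x);  /* AABBCCDD -> AABBDDCC */
    x = rot32_by_16(x);     /* AABBDDCC -> DDCCAABB */
    x = rot16_low_by_8(x);  /* DDCCAABB -> DDCCBBAA */
    return x;
}

/* The result a single bswap produces, written with shifts and masks. */
static uint32_t swap32_bswap_equiv(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0xFF00u) |
           ((x << 8) & 0xFF0000u) | (x << 24);
}

typedef uint32_t (*swap32_fn)(uint32_t);

/* Hypothetical runtime selection: has_bswap stands in for a real
   80486-or-better check, which is not shown here. */
static swap32_fn choose_swap32(int has_bswap)
{
    return has_bswap ? swap32_bswap_equiv : swap32_rotates;
}
```

Both routines give the same answer for every input. The only difference is that the second one corresponds to a single instruction on an 80486 or better. A caller would pick a routine once at startup with choose_swap32() and use the returned pointer from then on.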
-
- A version of Executor that didn't work on 80386's would be a little
- smaller and a little faster than the current one, but there's no
- reason to think the performance difference would be huge. We
- benchmarked such a version long, long ago and found that an
- 80486-specific version was something like 5-10% faster.
-
- Note that since NEXTSTEP/Intel only works on 80486's or better,
- Executor/NEXTSTEP/Intel assumes the presence of an 80486 and takes
- advantage of it.
-
- The new, faster blitter I'm writing may take advantage of the bswap
- instruction, if present. Once that's done, the CPU emulator and the
- graphics engine will both use bswap, so there won't be much
- performance to gain by creating an 80486-specific version of Executor.
-
- Thanks for the suggestion, though.
-
- -Mat
-
-