Performance of C-ified code

The speed-up clearly depends on the amount of C-ification and on the statistical weight of C-ified code in the execution profile of a program (see figure [*]). We have observed speed increases of 10-20% for programs that make moderate use of C-ified code. As these programs spend only 20-30% of their time in C-ified sequences, performance is expected to scale correspondingly when we extend this approach to the full BinProlog instruction set and implement low-level gcc direct jumps instead of function calls and anti-calls.
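
The scaling argument can be made concrete with an Amdahl-style estimate. This is only a sketch under our own assumptions: the symbols $f$, $k$ and $S$ are introduced here for illustration, and the value of $k$ is not a measured quantity. Let $f$ be the fraction of emulated running time spent in sequences that are C-ified, and let $k$ be the factor by which those sequences run faster after C-ification. The overall speed-up is then

\[
S \;=\; \frac{T_{\mathrm{emulated}}}{T_{\mathrm{C\text{-}ified}}} \;=\; \frac{1}{(1-f) + f/k} .
\]

Taking $f = 0.25$ (the middle of the 20-30% range) and, purely for illustration, $k = 2$ gives

\[
S \;=\; \frac{1}{0.75 + 0.125} \;=\; \frac{1}{0.875} \;\approx\; 1.14 ,
\]

that is, roughly a 14% speed increase, which falls within the observed 10-20% range. Under the same assumed $k$, the speed-up approaches $S = k$ as $f$ approaches 1, which is the sense in which gains should grow as more of the instruction set is C-ified.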

Figure: Performance of emulated (emBP) and partially C-ified BinProlog 3.22 (C-BP) compared to emulated (emSP) and native (natSP) SICStus 2.1_9 on a Sparc 10/20.

Code sizes for C-ified BinProlog executables (dynamically linked on Sparcs with Solaris 2.3) are usually even smaller than `compact' SICStus code, which uses classical instruction folding (a few hundred opcodes) to speed up the emulator.

The following table shows some code-size/execution-speed variations with respect to the threshold for the semi-ring (SEMI3) benchmark. Clearly, excessively small chunks can adversely affect not only size but also speed. A threshold of about 20 looks like a practical optimum for this program.

threshold    0     4     8     20    30    1000  emBP  emSP  natSP
size (K)     34.5  32.2  29.9  16.3  13.1  12.9  4.8   22.0  31.9
speed (ms)   1480  1430  1440  1450  1810  1790  1800  1810  1310
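
One way to read the table is sketched below in Python. The figures are copied from the table above. The selection rule (keep the settings within 2% of the fastest time, then take the smallest code) is a hypothetical criterion of ours, used only to illustrate why a threshold of about 20 is a reasonable choice. It is not a procedure described in the text.

```python
# SEMI3 figures copied from the table above (C-ified BinProlog columns only).
thresholds = [0, 4, 8, 20, 30, 1000]
size_k = [34.5, 32.2, 29.9, 16.3, 13.1, 12.9]   # code size in K
time_ms = [1480, 1430, 1440, 1450, 1810, 1790]  # execution time in ms

EMBP_TIME_MS = 1800  # emulated BinProlog (emBP) time, used as the baseline


def speedup_over_embp(t_ms):
    """Ratio of the emulated time to the given time (above 1 means faster)."""
    return EMBP_TIME_MS / t_ms


rows = [
    (th, sz, t, speedup_over_embp(t))
    for th, sz, t in zip(thresholds, size_k, time_ms)
]

# Hypothetical rule (our assumption): among the settings whose time is
# within 2% of the fastest one, prefer the smallest code size.
TOLERANCE = 1.02
best_time = min(time_ms)
candidates = [r for r in rows if r[2] <= best_time * TOLERANCE]
choice = min(candidates, key=lambda r: r[1])

for th, sz, t, s in rows:
    print(f"threshold={th:<5} size={sz:>5.1f}K time={t}ms speed-up={s:.2f}")
print("selected threshold:", choice[0])
```

With these figures the rule selects threshold 20: it is within 2% of the fastest setting (threshold 4) while needing about half the code size.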