There have been no updates to Compaq's CPML library for Alpha Linux since 2002. I downloaded the latest CPML distribution from the Alpha Tools website and examined the timestamps on the contents.
$ rpm --verbose --package --query --list cpml_ev6-5.2.0-1.alpha.rpm
drwxr-xr-x 2 root root 0 Feb 15 2002 /usr/doc/cpml-5.2.0
-rwxr-xr-x 1 root root 9142 Feb 15 2002 /usr/doc/cpml-5.2.0/README
-rwxr-xr-x 1 root root 519 Feb 15 2002 /usr/doc/cpml-5.2.0/Release_Notes-5.2.0
lrwxrwxrwx 1 root root 31 Feb 15 2002 /usr/include/cpml.h -> ../lib/compaq/cpml-5.2.0/cpml.h
drwxr-xr-x 2 root root 0 Feb 15 2002 /usr/lib/compaq/cpml-5.2.0
-rwxr-xr-x 1 root root 5000 Apr 17 2000 /usr/lib/compaq/cpml-5.2.0/cpml.h
-rw-r--r-- 1 root root 1250518 Feb 11 2002 /usr/lib/compaq/cpml-5.2.0/libcpml_ev6.a
-rw-r--r-- 1 root root 0 Feb 15 2002 /usr/lib/compaq/cpml-5.2.0/libcpml_ev6.so
lrwxrwxrwx 1 root root 33 Feb 15 2002 /usr/lib/libcpml.a -> ./compaq/cpml-5.2.0/libcpml_ev6.a
lrwxrwxrwx 1 root root 34 Feb 15 2002 /usr/lib/libcpml.so -> ./compaq/cpml-5.2.0/libcpml_ev6.so
As you can see, the last modifications are from 2002.
The cpml distribution includes benchmark data comparing cpml to libm for some double precision maths functions. It is reasonable to assume that, in the years since the last update to cpml, the glibc developers have made some progress with maths performance. I wanted to find out whether libm has caught up with cpml.
These are the benchmark statistics quoted by Compaq:
| subroutine | libcpml | libm |
|---|---|---|
| acos | 95 | 1185 |
| asin | 99 | 1193 |
| atan | 83 | 242 |
| atan2 | 108 | 408 |
| cos | 73 | 251 |
| exp | 67 | 376 |
| log | 62 | 299 |
| log10 | 62 | 367 |
| pow | 118 | 1047 |
| sin | 74 | 367 |
| sqrt | 68 | 1017 |
| tan | 101 | 505 |
Note: these figures are in cycles, lower is better.
Note: I'm not including the F_ functions, as they use relaxed checking to improve performance, similar to gcc's -ffast-math.
If these figures are accurate, the cpml implementations offer a 10 fold or greater performance increase over the libm equivalents for several functions (acos, asin and sqrt). I own the same revision of CPU that Compaq states was used to obtain these statistics, so I decided to replicate the benchmarks with an up-to-date libm.
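To make the size of the gap concrete, the ratios can be worked out directly from the figures Compaq quotes. The following is only a small sketch: the struct and function names (`bench`, `speedup`, `largest_speedup`, `print_speedups`) are made up for illustration and are not part of cpml or of the benchmark program.

```c
#include <stddef.h>
#include <stdio.h>

/* One row of the table quoted by Compaq (cycles, lower is better). */
struct bench {
    const char *name;
    double cpml;   /* cycles quoted for libcpml */
    double libm;   /* cycles quoted for libm */
};

static const struct bench compaq_figures[] = {
    { "acos",   95, 1185 }, { "asin",   99, 1193 }, { "atan",   83,  242 },
    { "atan2", 108,  408 }, { "cos",    73,  251 }, { "exp",    67,  376 },
    { "log",    62,  299 }, { "log10",  62,  367 }, { "pow",   118, 1047 },
    { "sin",    74,  367 }, { "sqrt",   68, 1017 }, { "tan",   101,  505 },
};

#define N_FIGURES (sizeof compaq_figures / sizeof compaq_figures[0])

/* How many times more cycles libm needs than cpml for one function. */
double speedup(const struct bench *b)
{
    return b->libm / b->cpml;
}

/* Index of the table row with the largest libm/cpml ratio. */
size_t largest_speedup(void)
{
    size_t i, best = 0;

    for (i = 1; i < N_FIGURES; i++)
        if (speedup(&compaq_figures[i]) > speedup(&compaq_figures[best]))
            best = i;
    return best;
}

/* Print one "name ratio" line per function. */
void print_speedups(void)
{
    size_t i;

    for (i = 0; i < N_FIGURES; i++)
        printf("%-6s %5.1fx\n", compaq_figures[i].name,
               speedup(&compaq_figures[i]));
}
```

Calling `print_speedups()` from a `main` lists every ratio. By these figures the widest gap is sqrt, at roughly 15 times, with acos and asin at around 12 times.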
To do this, I wrote a quick C program that times the different implementations using the same input values as Compaq, which they describe in the cpml documentation. You can download the program I used here.
Use gcc -O2 -ldl -o cpml cpml.c to compile.
NOTE: This program should work on x86 as well, so you can also use it to compare the maths functions that come with icc.
To get results that were as accurate as possible, I compiled the program and then changed to single user mode to minimise interference from other processes. I ran the program 3 times and took the average of the results to compile this table.
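The general shape of such a measurement can be sketched as below. This is not the program linked above, just a minimal illustration under some assumptions: it measures wall-clock nanoseconds with `clock_gettime` rather than cycles (multiply by the clock frequency in GHz for an estimate in cycles), and `square`, `ns_per_call` and `mean_of_three` are hypothetical names. `square` is only a stand-in so the sketch builds without linking a maths library; in practice you would pass a pointer to the function under test.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime */
#endif
#include <time.h>

typedef double (*math_fn)(double);

/* Hypothetical stand-in for a maths routine. */
double square(double x)
{
    return x * x;
}

/* Average wall-clock nanoseconds per call of fn over `calls` calls. */
double ns_per_call(math_fn fn, double arg, long calls)
{
    struct timespec start, end;
    volatile double sink = 0.0;   /* stops the compiler discarding the calls */
    long i;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < calls; i++)
        sink += fn(arg);
    clock_gettime(CLOCK_MONOTONIC, &end);
    (void) sink;

    return ((double) (end.tv_sec - start.tv_sec) * 1e9
            + (double) (end.tv_nsec - start.tv_nsec)) / (double) calls;
}

/* Mean of three runs, mirroring the averaging described above. */
double mean_of_three(math_fn fn, double arg, long calls)
{
    return (ns_per_call(fn, arg, calls)
            + ns_per_call(fn, arg, calls)
            + ns_per_call(fn, arg, calls)) / 3.0;
}
```

The loop overhead is included in the result, so the absolute numbers are only meaningful when both libraries are timed the same way.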
I didn't include the F_ "fast" functions, although my program does test them.
| subroutine | libcpml | libm | comment |
|---|---|---|---|
| acos | 99 | 111 | Ten fold improvement in libm since testing by compaq, now only nanoseconds slower than cpml. |
| asin | 106 | 119 | Another great improvement in libm. |
| atan | 89 | 52.3 | libm implementation now faster than cpml. |
| atan2 | 167 | 146 | libm atan2 now outperforms cpml equivalent. |
| cos | 79 | 199 | libm implementation improved by about 20% since compaq testing (251 to 199 cycles), however cpml still the better performer. |
| exp | 72 | 49 | libm overtakes cpml, a dramatic 8 fold improvement. |
| log | 80.3 | 162 | libm almost 50% faster since compaq tests, dropping from 299 to 162 cycles. |
| log10 | 66 | 245 | libm down by about a third (367 to 245 cycles), cpml still leader. |
| pow | 65 | 212 | Considerable improvement from libm, cpml still ahead. cpml pow() performed better for me than it did for compaq! |
| sin | 92 | 206 | libm down from 367 to 206 cycles since compaq tests, cpml still more than twice as fast. |
| sqrt | 75.6 | 26 | libm overtakes cpml for sqrt() performance! |
| tan | 101 | 198 | libm tan() about 2.5 times faster since compaq test (505 to 198 cycles). |
A massive performance improvement in glibc makes the difference between libm and cpml negligible on some functions, and libm even overtakes cpml on others: atan(), atan2(), exp() and sqrt().
If nanoseconds matter and high performance maths is required, the cpml functions should be used selectively, so that the libm implementations that now outperform cpml are still used. The dramatic 10 fold performance improvement initially offered by cpml no longer exists; the libm developers have improved their performance significantly.
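The selection rule is simply "use whichever library needed fewer cycles for that function". A minimal sketch over the measured figures from the second table follows; the names `measured`, `faster_provider` and `count_cpml_wins` are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <string.h>

enum provider { USE_CPML, USE_LIBM };

/* One row of the measured results table (cycles, lower is better). */
struct measured {
    const char *name;
    double cpml;
    double libm;
};

static const struct measured results[] = {
    { "acos",   99,   111 }, { "asin",  106,  119 }, { "atan",   89, 52.3 },
    { "atan2", 167,   146 }, { "cos",    79,  199 }, { "exp",    72,   49 },
    { "log",  80.3,   162 }, { "log10",  66,  245 }, { "pow",    65,  212 },
    { "sin",    92,   206 }, { "sqrt", 75.6,   26 }, { "tan",   101,  198 },
};

#define N_RESULTS (sizeof results / sizeof results[0])

/* Library that needed fewer cycles; unknown names default to libm. */
enum provider faster_provider(const char *name)
{
    size_t i;

    for (i = 0; i < N_RESULTS; i++)
        if (strcmp(results[i].name, name) == 0)
            return results[i].cpml < results[i].libm ? USE_CPML : USE_LIBM;
    return USE_LIBM;
}

/* Number of benchmarked functions for which cpml is still faster. */
size_t count_cpml_wins(void)
{
    size_t i, wins = 0;

    for (i = 0; i < N_RESULTS; i++)
        if (results[i].cpml < results[i].libm)
            wins++;
    return wins;
}
```

On these numbers cpml remains the better choice for eight of the twelve functions (acos, asin, cos, log, log10, pow, sin and tan), while libm wins for atan, atan2, exp and sqrt.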
To use the cpml functions that still offer better performance, whilst allowing the libm functions to be used otherwise, a preload library can be used.
#define _GNU_SOURCE
#include <stdio.h>
#include <dlfcn.h>

/* # gcc -O2 -ldl -shared -o /usr/local/lib/libcpmlloader.so
 * # echo /usr/local/lib/libcpmlloader.so >> /etc/ld.so.preload
 */

/* Pointers to the implementations that are forwarded to cpml. */
static double (* acosptr)(double x),
              (* asinptr)(double x),
              (* cosptr)(double x),
              (* hypotptr)(double x, double y),
              (* logptr)(double x),
              (* log10ptr)(double x),        /* log10 takes a single argument */
              (* powptr)(double x, double y),
              (* sinptr)(double x),
              (* tanptr)(double x);

/* Runs when the preload library is loaded, before main(). */
static void __attribute__ ((constructor)) init (void)
{
    void * library;

    /* Prefer cpml; fall back to libm if it cannot be opened. */
    if (!(library = dlopen ("libcpml.so", RTLD_LAZY))) {
        fprintf (stderr, "failed to open libcpml.so: %s\n", dlerror());
        if (!(library = dlopen ("libm.so", RTLD_LAZY))) {
            fprintf (stderr, "attempted to fall back to libm.so, but that failed as well: %s\n",
                     dlerror());
            /* continue anyway, but this probably won't be good... */
        }
    }

    /* Resolve each symbol from whichever library was opened. */
    acosptr  = (double (*)(double)) dlsym (library, "acos");
    asinptr  = (double (*)(double)) dlsym (library, "asin");
    cosptr   = (double (*)(double)) dlsym (library, "cos");
    hypotptr = (double (*)(double, double)) dlsym (library, "hypot");
    logptr   = (double (*)(double)) dlsym (library, "log");
    log10ptr = (double (*)(double)) dlsym (library, "log10");
    powptr   = (double (*)(double, double)) dlsym (library, "pow");
    sinptr   = (double (*)(double)) dlsym (library, "sin");
    tanptr   = (double (*)(double)) dlsym (library, "tan");
}

/* Wrappers that override the libm symbols and forward to the resolved pointers. */
double acos (double x) { return acosptr (x); }
double asin (double x) { return asinptr (x); }
double cos (double x) { return cosptr (x); }
double hypot (double x, double y) { return hypotptr (x, y); }
double log (double x) { return logptr (x); }
double log10 (double x) { return log10ptr (x); }
double pow (double x, double y) { return powptr (x, y); }
double sin (double x) { return sinptr (x); }
double tan (double x) { return tanptr (x); }