Version 0.21 of the free fast math routines (libffm, preliminary version) now
released!
Hi,
Kazushige Goto and I are now releasing version 0.21 of the free fast math
routines, which are meant to eventually replace libm. These routines are based
on work I did years ago for another RISC CPU, but they now use different
(*better*!) approximations and some other ideas that have grown in the
meantime. Kazushige Goto again did a really great job optimizing the assembler
code, "vectorizing" the polynomial evaluation code and improving instruction
scheduling to make the code as fast as it is now!
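To illustrate what "vectorizing" polynomial evaluation can mean, here is a
minimal C sketch. It is not the libffm assembler code; the names poly_horner
and poly_split are hypothetical, and the even/odd split is only one common way
to expose independent operations to the instruction scheduler. Plain Horner
evaluation forms one long dependency chain, while the split version runs two
shorter chains in x*x that a pipelined CPU can overlap.

```c
#include <assert.h>

/* Plain Horner scheme for c[0] + c[1]*x + ... + c[n-1]*x^(n-1).
 * Every multiply-add has to wait for the previous one. */
static double poly_horner(const double *c, int n, double x)
{
    assert(n >= 1);
    double r = c[n - 1];
    for (int i = n - 2; i >= 0; i--)
        r = r * x + c[i];
    return r;
}

/* Even/odd split (illustrative assumption, not taken from libffm):
 * p(x) = E(x*x) + x * O(x*x), where E holds the even-index and O the
 * odd-index coefficients. The two Horner chains do not depend on each
 * other, so their multiply-adds can be interleaved. */
static double poly_split(const double *c, int n, double x)
{
    assert(n >= 1);
    double x2 = x * x;
    double even = 0.0;
    double odd = 0.0;
    for (int i = n - 1; i >= 0; i--) {
        if (i & 1)
            odd = odd * x2 + c[i];
        else
            even = even * x2 + c[i];
    }
    return even + x * odd;
}
```

Both functions compute the same polynomial, but the rounding of the two
variants can differ in the last bits, because the operations are carried out
in a different order.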
Some of the routines even seem to run nearly twice as fast as the already
fast "Cray" routine I ported some weeks ago, at comparable accuracy.
Version 0.21 of libffm does not yet check for invalid arguments. This is
planned for the next release, along with sinh, cosh, tanh, a full-precision
pow function, and other small utility functions such as fmod, ldexp, frexp
and the like (time permitting!).
Support for ECOFF and profiling has already been added.
Further modifications could include (besides the argument checking) fine-tuning
the last bits of the constants used and the order of evaluation, to minimize
or compensate for the effect of rounding errors.
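As a small illustration of why the order of evaluation matters for rounding,
consider the following C sketch. It is not taken from libffm; the function
name sum_in_order is hypothetical, and the example assumes IEEE 754 double
precision with round-to-nearest. Adding the same three terms in two different
orders gives two different results.

```c
/* Add the terms strictly from left to right. The accumulator is volatile
 * so that every partial sum is rounded to double, even on machines that
 * keep intermediates in wider registers. */
static double sum_in_order(const double *t, int n)
{
    volatile double s = 0.0;
    for (int i = 0; i < n; i++)
        s = s + t[i];
    return s;
}

/* Same three terms, two orders (assumes IEEE 754 doubles):
 *   1e16 + 1 + 1 : each 1 is lost when rounding, the sum stays 1e16
 *   1 + 1 + 1e16 : the small terms combine to 2 first and survive */
static const double big_first[3]   = {1e16, 1.0, 1.0};
static const double small_first[3] = {1.0, 1.0, 1e16};
```

With these inputs, sum_in_order(big_first, 3) yields 1e16 while
sum_in_order(small_first, 3) yields 1e16 + 2, which is the exact sum.
Evaluating the small terms of a polynomial first follows the same idea.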
The license has now been changed to the GNU Library GPL, as requested.
The routines can be downloaded from
http://people.frankfurt.netsurf.de/Joachim.Wesner/libffm.0.21.tar.gz
See the README file for further instructions and details.
Approximate running times in microseconds (us), measured in a tight loop with
random arguments in 0..10 (0..1 for asin/acos) on a 533 MHz LX 21164 (n = 63):
             libffm   libfm   libm
sin           0.14    0.21    0.43
cos           0.15    0.21    0.43
tan           0.15    0.27    0.54
cotan         0.15    ----    ----
asin          0.21    ----    1.33
acos          0.21    ----    1.27
atan          0.19    ----    0.58
atan2         0.20    ----    0.72
log2          0.17    ----    ----
log           0.17    0.16    0.44
log10         0.17    0.22    0.53
exp2          0.10    ----    ----
exp           0.14    0.17    0.50
exp10         0.14    ----    ----
powr(x,n)     0.10    0.35    1.67 (pow)
powr(x,y)     0.35    0.35    1.67 (pow)
sqrt          0.13    0.13    0.19
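For anyone who wants to repeat such measurements, a timing loop of the kind
described above could look roughly like the following C sketch. This is not
the benchmark program used for the table; the names time_per_call_us,
sample_square and bench_sink are hypothetical, and the use of clock() and
rand() is an assumption made only to keep the example portable.

```c
#include <stdlib.h>
#include <time.h>

typedef double (*unary_fn)(double);

/* Results are stored here so the compiler cannot drop the calls. */
static volatile double bench_sink;

/* Hypothetical stand-in for the routine under test. */
static double sample_square(double x)
{
    return x * x;
}

/* Average CPU time per call in microseconds for f, using nargs random
 * arguments in [lo, hi], each evaluated reps times in a tight loop.
 * Returns a negative value if the arguments are unusable. */
static double time_per_call_us(unary_fn f, double lo, double hi,
                               size_t nargs, size_t reps)
{
    if (nargs == 0 || reps == 0)
        return -1.0;
    double *args = malloc(nargs * sizeof *args);
    if (args == NULL)
        return -1.0;

    /* Generate the arguments up front so rand() is not timed. */
    for (size_t i = 0; i < nargs; i++)
        args[i] = lo + (hi - lo) * ((double)rand() / RAND_MAX);

    double sum = 0.0;
    clock_t t0 = clock();
    for (size_t r = 0; r < reps; r++)
        for (size_t i = 0; i < nargs; i++)
            sum += f(args[i]);
    clock_t t1 = clock();

    bench_sink = sum;
    free(args);
    return 1e6 * (double)(t1 - t0) / CLOCKS_PER_SEC
           / ((double)nargs * (double)reps);
}
```

With <math.h> included, a call such as
time_per_call_us(sin, 0.0, 10.0, 1000, 100000) would time whichever sin the
program is linked against. Since clock() is fairly coarse, the repetition
count has to be large enough for the whole loop to run for a noticeable
fraction of a second.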
Joachim
<joachim.wesner@frankfurt.netsurf.de>
<goto@statabo.rim.or.jp>