This file is examples/README and is also linked to from the doxygen main page.
/**
@page ExamplesREADME Examples, Tests and Benchmarks
The examples/ directory contains tests, examples and timing
harnesses for the components of the Random123 library.
@section building Compiling and Running the code
Installing and using Random123 requires only the header files; there are
no prerequisites other than a reasonable C99 or C++98 compiler.
With a modern GNU make (3.80 or newer), building and running the core tests
and examples can be as easy as running gmake with no arguments.
Note, though, that the provided examples/GNUmakefile intentionally avoids setting
any of the standard make variables: CC, CXX, CPPFLAGS, CFLAGS,
CXXFLAGS, TARGET_ARCH, LDFLAGS, LOADLIBES, LDLIBS. GNU make
will inherit settings for these variables from the environment,
or they may be set on the command line. If none are set,
compilation will proceed using system-wide default flags, generally
without advanced optimization, architectural tuning, warnings, or other
common options.
Before putting the Random123 library to use in an application,
it is important to test it using the same compiler flags and
features that the application will use. In other words,
the conventional make variables should be set
the same way when testing the library as they will be set when the
library is actually compiled into your application.
Something like:
@code
gmake CFLAGS="-std=c99" CXXFLAGS="-std=c++0x" CPPFLAGS="-I/alternate/location/include -O3 -Wall -Wstrict-aliasing=2" TARGET_ARCH="-march=native"
@endcode
would confirm that all is well with optimization enabled and code generation targeted at
an architecture with the same capabilities as the machine running the compilation.
Very old versions of GNU make (pre-2002) and non-GNU
make will not work with examples/GNUmakefile. Lacking a suitably modern GNU make,
our advice is to invoke the
C or C++ compiler directly on the source files in the examples/ directory.
The file examples/BUILD.LOG contains a list of sample build commands; they
will almost certainly need to be adapted to the target system.
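For example, building one of the C examples directly might look something like
this (the compiler name, include path and flags are illustrative and will need
to be adjusted for your system):
@code
cc -std=c99 -O3 -Wall -I/path/to/Random123/include -o simple simple.c
@endcode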
For Windows users, BUILDVC.BAT invokes the Microsoft
Visual Studio compiler. Edit it as needed for your platform.
@section tests Tests
It is recommended that Random123 be tested <b>on the target system,
with the target compiler, intended optimization levels, options,
target architectures, etc.</b>
before relying on it. The library
uses architecture- and compiler-specific intrinsics,
features and assembly language. We have seen cases where
one compiler (open64 version 4.2.4) masquerades as another compiler (it defines __GNUC__), accepts extensions
specific to the other compiler (__uint128_t)
without error or warning, and then silently produces incorrect code.
The only way to guard against this kind of misbehavior is to
compile and run the tests with the compiler and options that you intend to use and
the platform that you intend to run on.
@subsection kat Known Answer Tests
It is critically important to verify that your compiled code computes the same
"Known Answers" as the reference implementation, which has been subjected to
the Crush batteries of statistical tests.
The file \c examples/kat_vectors contains a few thousand "Known Answer
Test" vectors, i.e., tuples of (method, counter, key, answer). The
source file kat.c is incorporated into kat_c.c (C),
kat_cpp.cpp (C++), kat_cuda.cu (CUDA) and kat_opencl.c
(OpenCL), which are compiled into kat_c, kat_cpp, kat_cuda
and kat_opencl, respectively. Each of these will read kat_vectors
and verify that the compiled code obtains the same "known answers".
The kat vectors are not language-specific. Implementations of CBRNGs in
other languages could also be validated against \c kat_vectors. The
kat vectors are also byte-order independent. In other
words, the CBRNGs in the library should produce the same numerical
results on little-endian and big-endian hardware, but this behavior
is largely untested.
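As an illustration of what each test checks: a CBRNG's output is a pure
function of its counter and key, so a freshly compiled library can be compared
against recorded answers. The sketch below uses arbitrary counter and key
values (not taken from \c kat_vectors) and only checks reproducibility; the
kat_* programs additionally compare the output against the recorded answer.
@code
#include <Random123/philox.h>
#include <cassert>

int main() {
    r123::Philox4x32 philox;
    // Arbitrary illustrative counter and key; a real known-answer test takes
    // these, and the expected answer, from a line of kat_vectors.
    r123::Philox4x32::ctr_type ctr = {{0x12345678u, 0u, 0u, 0u}};
    r123::Philox4x32::key_type key = {{0x9abcdef0u}};
    r123::Philox4x32::ctr_type a = philox(ctr, key);
    r123::Philox4x32::ctr_type b = philox(ctr, key);
    assert(a == b);   // the same (counter, key) always yields the same answer
    return 0;
}
@endcode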
@subsection ut Unit Tests
examples/ also contains tests of specific components of the library. While not
exhaustive, these tests verify that a variety of invariants are satisfied
by the public methods (e.g., that incr(N) is the same as incr() N times). They
also serve to verify some of the compile-time feature-test logic which, if incorrect, can
lead to mysterious errors (e.g., is it necessary to <c>#include <smmintrin.h></c>?).
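For instance, a minimal sketch of the incr invariant mentioned above, in the
spirit of ut_carray (this is not the test itself; the counter type and the
count of 3 are arbitrary):
@code
#include <Random123/philox.h>
#include <cassert>

int main() {
    // Two identical zero-initialized counters.
    r123::Philox4x32::ctr_type a = {{}};
    r123::Philox4x32::ctr_type b = {{}};
    a.incr(3);                       // advance by 3 in one call
    b.incr(); b.incr(); b.incr();    // advance by 1, three times
    assert(a == b);                  // the two paths must agree
    return 0;
}
@endcode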
Unit tests include:
<ul>
<li> ut_features - verifies compile-time feature-test logic.
<li> ut_carray - verifies the capabilities of the @ref arrayNxW "r123arrayNxW" types.
<li> ut_M128 - verifies the capabilities of the r123m128i type (only when SSE2 is available).
<li> ut_ReinterpretCtr - verifies the r123::ReinterpretCtr wrapper template.
<li> ut_Engine - verifies the capabilities of the r123::Engine wrapper template.
<li> ut_aes - verifies that the @ref AESNI "AESNI" CBRNGs match known answers from FIPS-197.
<li> ut_gsl - tests the @ref GSL_CBRNG adapter. <b>Requires the GNU Scientific Library.</b>
</ul>
@section examples Examples
@subsection simple Simple examples in C and C++
There are two extremely short examples that show all the code necessary to
obtain and print a few random numbers in C and C++:
<ul>
<li> simple.c
<li> simplepp.cpp
</ul>
@subsection generation Generation of uniformly distributed real values.
The uniformly distributed integers that the CBRNGs produce are rarely
what is required by applications. Sampling other distributions
is beyond the scope of Random123. Many
distributions can be sampled with GSL (using \<Random123/conventional/gsl_cbrng.h\>)
or with C++11's \<random\> (using \<Random123/MicroURNG.hpp\> or \<Random123/conventional/Engine.hpp\>).
Nevertheless, some distributions are so simple that the
machinery of \<random\> or GSL seems like overkill. We provide
code to generate uniformly distributed real numbers in
the ranges (0, 1) and (-1, 1) in two header files:
<ul>
<li> uniform.hpp
<li> u01fixedpt.h
</ul>
We encourage you to copy these header files and use them (or modify them)
to suit your needs. They are not as thoroughly tested or as portable
as the headers in the library itself, but they should be safe to use
on any platform with IEEE-754 floating point support. They are
documented in comments in the files themselves.
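As a rough sketch of how they might be used (this assumes the r123::u01 and
r123::uneg11 templates described in uniform.hpp's comments; check the header
itself for the exact names and ranges in your copy):
@code
#include <Random123/philox.h>
#include "uniform.hpp"   // copied from examples/ alongside your own code
#include <iostream>

int main() {
    r123::Philox4x32 rng;
    r123::Philox4x32::ctr_type c = {{1u, 0u, 0u, 0u}};
    r123::Philox4x32::key_type k = {{42u}};          // arbitrary example key
    r123::Philox4x32::ctr_type r = rng(c, k);
    // Convert 32-bit outputs to reals: u01 -> (0, 1), uneg11 -> (-1, 1).
    double x = r123::u01<double>(r.v[0]);
    double y = r123::uneg11<double>(r.v[1]);
    std::cout << x << " " << y << "\n";
    return 0;
}
@endcode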
@subsection pi Estimating pi using different APIs
Using random numbers to estimate pi is a classic example. The idea
is to choose points at random in a square and to count how many of
them lie within the inscribed circle. Since the area of the square
is 4*r^2 and the area of the circle is pi*r^2, the ratio of the
number of points in the circle to the total number of points should
approach pi/4 as the number of points grows.
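For concreteness, here is a minimal sketch of this estimate. It is not one of
the pi_* programs listed below; the key is arbitrary and the integer-to-real
conversion is done inline rather than with uniform.hpp.
@code
#include <Random123/philox.h>
#include <iostream>

int main() {
    r123::Philox4x32 rng;
    r123::Philox4x32::key_type k = {{0xdeadbeefu}};   // arbitrary example key
    r123::Philox4x32::ctr_type c = {{}};
    unsigned long hits = 0;
    const unsigned tries = 1000000u;
    for (unsigned i = 0; i < tries; ++i) {
        c.v[0] = i;                                   // one counter value per sample
        r123::Philox4x32::ctr_type r = rng(c, k);
        // Map two 32-bit outputs to a point in [-1, 1) x [-1, 1).
        double x = -1.0 + 2.0 * (r.v[0] / 4294967296.0);
        double y = -1.0 + 2.0 * (r.v[1] / 4294967296.0);
        if (x * x + y * y < 1.0) ++hits;              // inside the unit circle?
    }
    std::cout << "pi is approximately " << 4.0 * hits / tries << "\n";
    return 0;
}
@endcode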
We give several examples of pi estimation, each of
which illustrates a slightly different API:
<ul>
<li> pi_capi - using only the basic C API
<li> pi_cppapi - using only the basic C++ API
<li> pi_u01 - using the C++ API and uniform.hpp
<li> pi_gsl - using a Random123 generator, but a gsl distribution to obtain real-valued random numbers. <b>Requires the GNU Scientific Library</b>
<li> pi_microurng - using a Random123 generator, but a C++0x \<random\> distribution to obtain real-valued random numbers
<li> pi_cuda - using the Random123 library with CUDA, runnable on an NVIDIA GPU
<li> pi_cudapp - using the C++ API with CUDA, runnable on an NVIDIA GPU
<li> pi_opencl - using the Random123 library with OpenCL, runnable on any OpenCL platform: e.g. NVIDIA or ATI GPUs or Intel or AMD CPUs. The actual
compute kernel lives in the \c pi_opencl_kernel.ocl file and is transformed by \c gencl.sh into strings that get included in \c pi_opencl.c, since
the OpenCL kernels get compiled for the target OpenCL platform at run-time
<li> pi_aes - uses the AESNI4x32 Random123 generator
</ul>
@section timers Measuring performance
We include some timing harnesses that can be used to measure
the performance of these CBRNGs on various platforms. These
timing harnesses report a cycles-per-byte (cpB) metric, which
should be independent of clock rate and number of cores, but which
does depend on the compiler and on the architecture of
the processor it runs on. They also report aggregate throughput
in GB/sec: a more direct
measure of performance, but one that depends on clock speed
and the number of cores being used.
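As a rough rule of thumb (an approximation, not a number reported by the
harnesses), per-core throughput in bytes per second is the clock rate divided
by cpB; e.g., a generator running at 2 cpB on a single 3 GHz core delivers
about 1.5 GB/sec.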
The timing harnesses are obscured by the tricks required for
portability across platforms and CBRNG types.
As a result, they are not
recommended as examples of the use of the library
and its APIs.
<ul>
<li> time_serial - uses the C API and reports performance for a
single core.
<li> timers - uses the C++ API, and is the only tool that reports
AESNI1xm128i and ARS1xm128i performance (if your CPU supports the AES-NI instruction
extensions).
<li> time_thread - uses the C API and pthreads to report
multithreaded performance, using all cores available on the platform.
<li> time_cuda - uses the C API within NVIDIA CUDA to run on NVIDIA GPUs.
<li> time_opencl - uses the C API within OpenCL to run on GPUs or CPUs.
</ul>
time_serial, time_thread, time_cuda, time_opencl all use a common
kernel defined in time_random123.h. They all use various
util_* header files for utility functions and platform-related
boilerplate (also used by the pi_* examples).
*/