.. Copyright (c) 2016, Johan Mabille and Sylvain Corlay

   Distributed under the terms of the BSD 3-Clause License.

   The full license is in the file LICENSE, distributed with this software.

Basic usage
===========

Manipulating abstract batches
-----------------------------

Here is an example that computes the mean of two batches, using the best
architecture available, based on compile-time information:

.. literalinclude:: ../../test/doc/manipulating_abstract_batches.cpp

Depending on the target architecture, the batch holds 4 single-precision
floating-point numbers (e.g. on Neon) or 8 (e.g. on AVX2).
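
Because the batch width is only known at compile time, code that walks over
arrays usually reads it from ``xsimd::batch<float>::size`` instead of
hard-coding 4 or 8. The following is a minimal sketch of that pattern, not part
of the xsimd test suite; the helper name ``mean_arrays`` is hypothetical, and
the two input vectors are assumed to have the same length:

.. code:: cpp

   #include <cstddef>
   #include <vector>

   #include "xsimd/xsimd.hpp"

   namespace xs = xsimd;

   // Hypothetical helper (not an xsimd function): element-wise mean of two
   // vectors that are assumed to have the same size.
   std::vector<float> mean_arrays(const std::vector<float>& a,
                                  const std::vector<float>& b)
   {
       using batch = xs::batch<float>;            // best architecture available
       constexpr std::size_t width = batch::size; // e.g. 4 on Neon, 8 on AVX2

       std::vector<float> res(a.size());
       std::size_t i = 0;

       // Vectorized part: process `width` elements per iteration.
       for (; i + width <= a.size(); i += width)
       {
           batch va = batch::load_unaligned(&a[i]);
           batch vb = batch::load_unaligned(&b[i]);
           ((va + vb) / 2).store_unaligned(&res[i]);
       }

       // Scalar tail: remaining elements that do not fill a whole batch.
       for (; i < a.size(); ++i)
       {
           res[i] = (a[i] + b[i]) / 2;
       }
       return res;
   }

The same source compiles for any supported target; only the number of
iterations taken by the vectorized loop changes with the batch width.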

Manipulating parametric batches
-------------------------------

The previous example can be made fully parametric, both in the batch type and
the underlying architecture, as shown in the following example:

.. literalinclude:: ../../test/doc/manipulating_parametric_batches.cpp

At its core, a :cpp:class:`xsimd::batch` is bound to the scalar type it contains, and to the
instruction set it can use to operate on its values.
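
To illustrate how both template parameters shape a batch, here is a small
sketch under stated assumptions: the helpers ``lanes`` and
``mean_of_broadcast`` are hypothetical names introduced for this illustration,
and the architecture parameter defaults to ``xsimd::default_arch`` so the code
builds on any supported target:

.. code:: cpp

   #include <cstddef>

   #include "xsimd/xsimd.hpp"

   namespace xs = xsimd;

   // Hypothetical helper: number of values of type T held by one batch
   // for the architecture Arch.
   template <class T, class Arch = xs::default_arch>
   constexpr std::size_t lanes()
   {
       return xs::batch<T, Arch>::size;
   }

   // Hypothetical helper: broadcast two scalars to full batches, average
   // them lane by lane, and return the first lane of the result.
   template <class T, class Arch = xs::default_arch>
   T mean_of_broadcast(T x, T y)
   {
       using batch = xs::batch<T, Arch>;
       batch lhs(x);
       batch rhs(y);
       batch m = (lhs + rhs) / 2;

       T buffer[batch::size];
       m.store_unaligned(buffer);
       return buffer[0];
   }

Passing an explicit architecture tag as the second template argument selects a
specific instruction set, provided the compiler is allowed to target it.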

Explicit use of an instruction set extension
--------------------------------------------

Here is an example that loads two batches of 4 double-precision floating-point
values, and computes their mean, explicitly using the AVX extension:

.. literalinclude:: ../../test/doc/explicit_use_of_an_instruction_set.cpp

Note that in this case, the instruction set is explicitly specified in the batch type.

This example outputs:

.. code::

   (2.0, 3.0, 4.0, 5.0)

.. warning::

   If you allow your compiler to generate AVX2 instructions (e.g. through
   ``-mavx2``) there is nothing preventing it from optimizing the above code
   using AVX2 instructions.
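
A batch bound to ``xsimd::avx`` can only be compiled when AVX code generation
is enabled. One way to keep such code portable is sketched below, assuming the
``XSIMD_WITH_AVX`` configuration macro is used as the guard; the helper name
``mean4`` and the scalar fallback are illustrative choices, not part of xsimd:

.. code:: cpp

   #include <array>
   #include <cstddef>

   #include "xsimd/xsimd.hpp"

   namespace xs = xsimd;

   // Hypothetical helper: mean of two groups of 4 doubles, using AVX when
   // the compiler targets it and a plain scalar loop otherwise.
   std::array<double, 4> mean4(const std::array<double, 4>& a,
                               const std::array<double, 4>& b)
   {
       std::array<double, 4> res;
   #if XSIMD_WITH_AVX
       // One AVX batch holds exactly 4 double-precision values.
       using batch = xs::batch<double, xs::avx>;
       batch va = batch::load_unaligned(a.data());
       batch vb = batch::load_unaligned(b.data());
       ((va + vb) / 2).store_unaligned(res.data());
   #else
       // Fallback when AVX is not enabled at compile time.
       for (std::size_t i = 0; i < res.size(); ++i)
       {
           res[i] = (a[i] + b[i]) / 2;
       }
   #endif
       return res;
   }

Both branches compute the same values, so callers do not need to know which
one was compiled.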