# How to create custom kernel dispatchers

A kernel dispatcher is a kernel implementation that calls other kernel implementations.
By default, a dispatcher is generated by the build system for every kernel such that:
  * the best aligned implementation is called when all pointer arguments are aligned,
  * and otherwise the best unaligned implementation is called.
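
The default behavior described above can be sketched without VOLK itself. This is a minimal illustration, not the code the build system generates: `SKETCH_ALIGNMENT`, `sketch_is_aligned`, `sketch_add_a`, `sketch_add_u`, and `sketch_default_dispatch` are hypothetical stand-ins, and the 16-byte alignment is an assumption (the real requirement depends on the machine).

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical alignment requirement in bytes; must be a power of two.
#define SKETCH_ALIGNMENT 16

// Stand-in alignment check: true when ptr is a multiple of SKETCH_ALIGNMENT.
static inline bool sketch_is_aligned(const void* ptr)
{
    return ((uintptr_t)ptr & (SKETCH_ALIGNMENT - 1)) == 0;
}

// Hypothetical stand-ins for the best aligned and unaligned implementations.
// Both compute the same sums here; a real pair would differ in how they
// load and store data.
static inline void
sketch_add_a(float* c, const float* a, const float* b, unsigned int n)
{
    for (unsigned int i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}

static inline void
sketch_add_u(float* c, const float* a, const float* b, unsigned int n)
{
    for (unsigned int i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}

// Calls the aligned stand-in only when every pointer argument is aligned.
// Returns 1 if the aligned path was taken and 0 otherwise, so that the
// choice can be observed from the outside.
static inline int
sketch_default_dispatch(float* c, const float* a, const float* b, unsigned int n)
{
    assert((SKETCH_ALIGNMENT & (SKETCH_ALIGNMENT - 1)) == 0);
    if (sketch_is_aligned(c) && sketch_is_aligned(a) && sketch_is_aligned(b)) {
        sketch_add_a(c, a, b, n);
        return 1;
    }
    sketch_add_u(c, a, b, n);
    return 0;
}
```

With three 16-byte-aligned buffers the sketch takes the aligned path. Offsetting the pointers by one `float` makes it fall back to the unaligned path.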

The author of a VOLK kernel may create a custom dispatcher,
to be called in place of the automatically generated one.
A custom dispatcher may be useful to handle head and tail cases,
or to implement different alignment and bounds checking logic.
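
As an illustration of a head case, the sketch below counts how many leading points a dispatcher would have to process before a `float` pointer reaches an aligned address. `SKETCH_ALIGNMENT` and `sketch_head_points` are hypothetical names, not part of VOLK, and the 16-byte alignment is an assumption.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical alignment requirement in bytes.
#define SKETCH_ALIGNMENT 16

// Number of leading floats before ptr reaches the next aligned address,
// capped at num_points. Assumes ptr is at least float-aligned.
static inline unsigned int sketch_head_points(const float* ptr, unsigned int num_points)
{
    const uintptr_t addr = (uintptr_t)ptr;
    assert(addr % sizeof(float) == 0);

    const uintptr_t misalign = addr % SKETCH_ALIGNMENT;
    if (misalign == 0) {
        return 0; // already aligned, no head to process
    }

    const unsigned int head =
        (unsigned int)((SKETCH_ALIGNMENT - misalign) / sizeof(float));
    return head < num_points ? head : num_points;
}
```

A dispatcher could pass that many points to the generic implementation, then call the aligned implementation on the rest. For a kernel with several pointer arguments this only helps when all of them have the same misalignment, because every pointer advances by the same number of points.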

## Code for an example dispatcher with a tail case

```cpp
#include <volk/volk_common.h>

#ifdef LV_HAVE_DISPATCHER

static inline void volk_32f_x2_add_32f_dispatcher(float* cVector, const float* aVector, const float* bVector, unsigned int num_points)
{
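    // Split the input: num_points_x is a multiple of 4,
    // num_points_r is the tail of 0 to 3 leftover points.
    const unsigned int num_points_r = num_points % 4;
    const unsigned int num_points_x = num_points - num_points_r;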

    if (volk_is_aligned(VOLK_OR_PTR(cVector, VOLK_OR_PTR(aVector, bVector))))
    {
        volk_32f_x2_add_32f_a(cVector, aVector, bVector, num_points_x);
    }
    else
    {
        volk_32f_x2_add_32f_u(cVector, aVector, bVector, num_points_x);
    }

    // The generic implementation handles the tail.
    volk_32f_x2_add_32f_g(cVector + num_points_x, aVector + num_points_x, bVector + num_points_x, num_points_r);
}

#endif //LV_HAVE_DISPATCHER
```

## Code for an example dispatcher with a tail case and an accumulator

```cpp
#include <volk/volk_common.h>

#ifdef LV_HAVE_DISPATCHER

static inline void volk_32f_x2_dot_prod_32f_dispatcher(float * result, const float * input, const float * taps, unsigned int num_points)
{
    const unsigned int num_points_r = num_points%16;
    const unsigned int num_points_x = num_points - num_points_r;

    if (volk_is_aligned(VOLK_OR_PTR(input, taps)))
    {
        volk_32f_x2_dot_prod_32f_a(result, input, taps, num_points_x);
    }
    else
    {
        volk_32f_x2_dot_prod_32f_u(result, input, taps, num_points_x);
    }

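    // Compute the tail into a temporary so it does not overwrite the body result,
    // then accumulate it into the final result.
    float result_tail = 0;
    volk_32f_x2_dot_prod_32f_g(&result_tail, input + num_points_x, taps + num_points_x, num_points_r);

    *result += result_tail;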
}

#endif //LV_HAVE_DISPATCHER
```
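
The accumulator pattern above can be checked in isolation with a self-contained sketch. `sketch_dot_prod` and `sketch_dot_prod_split` are hypothetical stand-ins that use a plain loop in place of the VOLK implementations. The sketch assumes, as the dispatcher above does, that each implementation overwrites `*result` rather than adding to it.

```cpp
#include <assert.h>
#include <stddef.h>

// Hypothetical stand-in for a dot-product implementation: overwrites *result.
static inline void sketch_dot_prod(float* result,
                                   const float* input,
                                   const float* taps,
                                   unsigned int num_points)
{
    float acc = 0.0f;
    for (unsigned int i = 0; i < num_points; i++) {
        acc += input[i] * taps[i];
    }
    *result = acc;
}

// Same body/tail split as the dispatcher above, using the stand-in for both parts.
static inline void sketch_dot_prod_split(float* result,
                                         const float* input,
                                         const float* taps,
                                         unsigned int num_points)
{
    assert(result != NULL);

    const unsigned int num_points_r = num_points % 16;
    const unsigned int num_points_x = num_points - num_points_r;

    // Body: writes the partial sum over the first num_points_x points.
    sketch_dot_prod(result, input, taps, num_points_x);

    // Tail: computed separately, then added to the body result.
    float result_tail = 0.0f;
    sketch_dot_prod(
        &result_tail, input + num_points_x, taps + num_points_x, num_points_r);

    *result += result_tail;
}
```

For 19 points, the body covers the first 16 and the tail covers the remaining 3. Writing the tail directly into `result` would discard the body's partial sum, which is why the temporary is needed.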