File: ops_summary.rst

.. meta::
  :description: rocPRIM documentation and API reference library
  :keywords: rocPRIM, ROCm, API, documentation

.. _ops-summary:

********************************************************************
 Summary of the Operations
********************************************************************

Basics
=========

* ``transform`` applies a function to each element of the sequence, equivalent to the functional operation ``map``
* ``select`` takes the first ``N`` elements of the sequence satisfying a condition (via a selection mask or a predicate function)
* ``unique`` returns unique elements within a sequence
* ``histogram`` generates a summary of the statistical distribution of the sequence
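As a rough host-side illustration of the semantics of ``transform``, ``select`` and ``unique``, the sketch below uses the C++ standard library rather than the rocPRIM device API, whose signatures are not shown in this summary. The helper names ``square_all``, ``select_even`` and ``unique_runs`` are hypothetical, and ``unique`` is assumed to collapse runs of consecutive equal elements, as ``std::unique`` does.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper: "transform" as a map that squares every element.
std::vector<int> square_all(const std::vector<int>& in)
{
    std::vector<int> out(in.size());
    std::transform(in.begin(), in.end(), out.begin(), [](int x) { return x * x; });
    return out;
}

// Hypothetical helper: "select" with a predicate, keeping even elements in order.
std::vector<int> select_even(const std::vector<int>& in)
{
    std::vector<int> out;
    std::copy_if(in.begin(), in.end(), std::back_inserter(out),
                 [](int x) { return x % 2 == 0; });
    return out;
}

// Hypothetical helper: "unique" keeps the first element of each run of
// consecutive equal elements (assumption: std::unique-like semantics).
std::vector<int> unique_runs(const std::vector<int>& in)
{
    std::vector<int> out;
    std::unique_copy(in.begin(), in.end(), std::back_inserter(out));
    return out;
}
```

For example, ``unique_runs`` applied to ``{1, 1, 2, 2, 2, 3, 1}`` keeps the trailing ``1`` because it is not adjacent to the leading run of ones.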

Aggregation
============

* ``reduce`` traverses the sequence while accumulating some data, equivalent to the functional operation ``fold_left``
* ``scan`` is the cumulative version of ``reduce`` which returns the sequence of the intermediate values taken by the accumulator
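The relationship between ``reduce`` and ``scan`` can be sketched on the host with the C++ standard library. This is not the rocPRIM API. The helper names are hypothetical, and addition is chosen as the accumulation operator purely for illustration.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: "reduce" with addition, i.e. a left fold starting at 0.
int sum_reduce(const std::vector<int>& in)
{
    return std::accumulate(in.begin(), in.end(), 0);
}

// Hypothetical helper: inclusive "scan"; out[i] is the sum of in[0..i].
std::vector<int> inclusive_sum(const std::vector<int>& in)
{
    std::vector<int> out(in.size());
    std::partial_sum(in.begin(), in.end(), out.begin());
    return out;
}

// Hypothetical helper: exclusive "scan"; out[i] is init plus the sum of in[0..i-1].
std::vector<int> exclusive_sum(const std::vector<int>& in, int init)
{
    std::vector<int> out(in.size());
    int acc = init;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        out[i] = acc;   // value of the accumulator before consuming in[i]
        acc += in[i];
    }
    return out;
}
```

The last element of the inclusive scan equals the result of the reduction over the same input, which is what "cumulative version" means here.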

Differentiation
=================

* ``adjacent_difference`` computes the difference between the current element and the previous or next one in the sequence
* ``discontinuity`` detects a change in value between the current element and the previous or next one in the sequence
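A minimal host-side sketch of both operations, comparing each element with the previous one, is given below. It uses the C++ standard library, not the rocPRIM API. The helper names are hypothetical, and flagging the first element as the start of a run is an assumption made for this example.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: difference with the previous element.
// std::adjacent_difference copies the first element unchanged.
std::vector<int> diff_with_previous(const std::vector<int>& in)
{
    std::vector<int> out(in.size());
    std::adjacent_difference(in.begin(), in.end(), out.begin());
    return out;
}

// Hypothetical helper: "head flags" mark positions where the value differs
// from the previous element (assumption: the first element is always flagged).
std::vector<bool> head_flags(const std::vector<int>& in)
{
    std::vector<bool> flags(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        flags[i] = (i == 0) || (in[i] != in[i - 1]);
    }
    return flags;
}
```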

Rearrangement
================

* ``sort`` rearranges the sequence by sorting it, either according to a comparison operator or by key value using a radix approach
* ``partial_sort`` rearranges the sequence by sorting it up to and including a given index, according to a comparison operator
* ``nth_element`` places the nth element in its sorted position, with smaller elements before it and greater elements after it, according to a comparison operator
* ``exchange`` rearranges the elements according to a different stride configuration which is equivalent to a tensor axis transposition
* ``shuffle`` rotates the elements
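The difference between ``sort``, ``partial_sort`` and ``nth_element`` can be illustrated on the host with their C++ standard library counterparts. This is a sketch of the semantics, not the rocPRIM API. The helper names are hypothetical, and the index arguments are assumed to be smaller than the sequence length.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: full sort with the default less-than comparison.
std::vector<int> sorted_copy(std::vector<int> v)
{
    std::sort(v.begin(), v.end());
    return v;
}

// Hypothetical helper: sort up to and including index `middle`.
// std::partial_sort takes an exclusive end iterator, hence the + 1.
std::vector<int> partially_sorted(std::vector<int> v, std::size_t middle)
{
    auto mid = v.begin() + static_cast<std::ptrdiff_t>(middle) + 1;
    std::partial_sort(v.begin(), mid, v.end());
    return v;
}

// Hypothetical helper: only position `n` is guaranteed to hold its sorted value;
// elements before it are not greater, elements after it are not smaller.
std::vector<int> with_nth_placed(std::vector<int> v, std::size_t n)
{
    auto nth = v.begin() + static_cast<std::ptrdiff_t>(n);
    std::nth_element(v.begin(), nth, v.end());
    return v;
}
```

Only ``sorted_copy`` fixes the position of every element; the other two leave the order of the remaining elements unspecified, which is what allows them to do less work.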

Partition/Merge
====================

* ``partition`` divides the sequence into two or more sequences according to a predicate while preserving some ordering properties
* ``merge`` merges two ordered sequences into one while preserving the order
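A two-way ``partition`` and a ``merge`` can be sketched on the host as follows, using the C++ standard library rather than the rocPRIM API. The helper names are hypothetical. Note that ``std::stable_partition`` preserves the relative order within both groups, which may be a stronger guarantee than a device implementation provides.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: two-way partition that moves even elements to the front.
std::vector<int> evens_first(std::vector<int> v)
{
    std::stable_partition(v.begin(), v.end(), [](int x) { return x % 2 == 0; });
    return v;
}

// Hypothetical helper: merge two sequences that are already sorted ascending.
std::vector<int> merge_sorted(const std::vector<int>& a, const std::vector<int>& b)
{
    std::vector<int> out(a.size() + b.size());
    std::merge(a.begin(), a.end(), b.begin(), b.end(), out.begin());
    return out;
}
```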

Data Movement
===============

* ``store`` stores the sequence to a contiguous memory region. Variants exist that use an optimized path or that specify how the sequence is stored, to better fit the access patterns of the compute units (CUs).
* ``load`` is the complementary operation to ``store``: it loads the sequence from a contiguous memory region, with the same kinds of variants.
* ``memcpy`` copies bytes between device sources and destinations

Sequence Search
===============

* ``find_first_of`` searches for the first occurrence of any of the provided elements.
* ``adjacent_find`` searches a given sequence for the first occurrence of two consecutive equal elements.
* ``search`` searches a sequence for the first occurrence of a given subsequence.
* ``search_n`` searches for the first occurrence of a run of ``count`` consecutive elements all equal to ``value``.
* ``find_end`` searches a sequence for the last occurrence of a given subsequence.
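The five searches differ only in what counts as a match. The host-side sketch below uses the C++ standard library algorithms of the same names, not the rocPRIM API. The ``index_*`` helper names and the convention of returning the sequence length when nothing is found are assumptions made for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using Seq = std::vector<int>;

// Convert an iterator into s to an index; s.size() means "not found".
static std::size_t to_index(const Seq& s, Seq::const_iterator it)
{
    return static_cast<std::size_t>(it - s.begin());
}

// Hypothetical helper: first position holding any of the values in `keys`.
std::size_t index_find_first_of(const Seq& s, const Seq& keys)
{
    return to_index(s, std::find_first_of(s.begin(), s.end(), keys.begin(), keys.end()));
}

// Hypothetical helper: first position i where s[i] == s[i + 1].
std::size_t index_adjacent_find(const Seq& s)
{
    return to_index(s, std::adjacent_find(s.begin(), s.end()));
}

// Hypothetical helper: start of the first occurrence of subsequence `sub`.
std::size_t index_search(const Seq& s, const Seq& sub)
{
    return to_index(s, std::search(s.begin(), s.end(), sub.begin(), sub.end()));
}

// Hypothetical helper: start of the first run of `count` elements equal to `value`.
std::size_t index_search_n(const Seq& s, std::size_t count, int value)
{
    return to_index(s, std::search_n(s.begin(), s.end(), count, value));
}

// Hypothetical helper: start of the last occurrence of subsequence `sub`.
std::size_t index_find_end(const Seq& s, const Seq& sub)
{
    return to_index(s, std::find_end(s.begin(), s.end(), sub.begin(), sub.end()));
}
```

On the input ``{1, 2, 3, 3, 2, 3, 4}``, the subsequence ``{2, 3}`` first occurs at index 1 and last occurs at index 4, which is the distinction between ``search`` and ``find_end``.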

Other operations
======================

* ``run_length_encode`` generates a compact representation of a sequence
* ``binary_search`` finds for each element the index of an element with the same value in another sequence (which has to be sorted)
* ``config`` selects a kernel's grid/block dimensions to tune the operation to a GPU
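To make ``run_length_encode`` and ``binary_search`` concrete, here is a minimal host-side sketch in standard C++. It is not the rocPRIM API. The helper names are hypothetical, the (value, count) pair representation is an assumption, and the lookup shown is the lower-bound form, which returns the first position at which a value could be inserted while keeping the sorted order.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: encode each run of equal values as (value, run length).
std::vector<std::pair<int, std::size_t>> rle_host(const std::vector<int>& in)
{
    std::vector<std::pair<int, std::size_t>> runs;
    for (int x : in)
    {
        if (!runs.empty() && runs.back().first == x)
        {
            ++runs.back().second;                  // extend the current run
        }
        else
        {
            runs.emplace_back(x, std::size_t{1});  // start a new run
        }
    }
    return runs;
}

// Hypothetical helper: for each needle, the lower-bound index in a sorted haystack.
std::vector<std::size_t> lower_bound_indices(const std::vector<int>& haystack,
                                             const std::vector<int>& needles)
{
    std::vector<std::size_t> out;
    out.reserve(needles.size());
    for (int n : needles)
    {
        auto it = std::lower_bound(haystack.begin(), haystack.end(), n);
        out.push_back(static_cast<std::size_t>(it - haystack.begin()));
    }
    return out;
}
```

When a needle is absent from the haystack, as ``6`` is in ``{1, 3, 5, 7}``, the lower-bound index points at the first larger element (index 3 here) rather than signalling a miss.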