==========================
Vector Predication Roadmap
==========================

.. contents:: Table of Contents
  :depth: 3
  :local:

Motivation
==========

This proposal defines a roadmap towards native vector predication in LLVM,
specifically for vector instructions with a mask and/or an explicit vector
length.  LLVM currently has no target-independent means to model predicated
vector instructions for modern SIMD ISAs such as AVX512, ARM SVE, the RISC-V V
extension and NEC SX-Aurora.  Only some predicated vector operations, such as
masked loads and stores, are available through intrinsics [MaskedIR]_.

The Vector Predication (VP) extensions are a concrete RFC and prototype
implementation for achieving native vector predication in LLVM.  The VP
prototype and all related discussions can be found in the VP patch on
Phabricator [VPRFC]_.

Roadmap
=======

1. IR-level VP intrinsics
-------------------------

- There is a consensus on the semantics/instruction set of VP.
- VP intrinsics and attributes are available on IR level.
- TTI has capability flags for VP (names still tentative: ``supportsVP()``,
  ``haveActiveVectorLength()``).

Result: VP usable for IR-level vectorizers (LV, VPlan, RegionVectorizer),
potential integration in Clang with builtins.
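
As a rough illustration of what a predicated operation with a mask and an
explicit vector length computes, the following scalar model may help.  It is a
minimal sketch, not LLVM code: ``vpAddModel`` is a hypothetical name, and the
model assumes that disabled lanes carry no defined value, which it represents
as ``std::nullopt``.

.. code-block:: c++

  #include <array>
  #include <cassert>
  #include <cstddef>
  #include <cstdint>
  #include <optional>

  // Hypothetical scalar reference model of a predicated vector add.
  // This is an illustration only; it is not part of any LLVM API.
  template <std::size_t N>
  std::array<std::optional<std::uint32_t>, N>
  vpAddModel(const std::array<std::uint32_t, N> &a,
             const std::array<std::uint32_t, N> &b,
             const std::array<bool, N> &mask, std::size_t evl) {
    assert(evl <= N && "explicit vector length exceeds the lane count");
    // Value-initialization leaves every lane as std::nullopt.
    std::array<std::optional<std::uint32_t>, N> result{};
    for (std::size_t i = 0; i < N; ++i) {
      // A lane is active only if its mask bit is set and it lies below evl.
      if (mask[i] && i < evl)
        result[i] = a[i] + b[i]; // unsigned add wraps on overflow
      // Inactive lanes stay std::nullopt (assumption: no defined value).
    }
    return result;
  }

Calling ``vpAddModel`` on four lanes with mask ``{1, 0, 1, 1}`` and a vector
length of 3 yields values in lanes 0 and 2 only: lane 1 is masked off and
lane 3 lies beyond the vector length.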

2. CodeGen support
------------------

- VP intrinsics translate to first-class SDNodes
  (e.g. ``llvm.vp.fdiv.*`` -> ``vp_fdiv``).
- VP legalization: legalize the explicit vector length to a mask (AVX512), and
  legalize VP SDNodes to pre-existing ones (SSE, NEON).

Result: Backend development based on VP SDNodes.
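
Legalizing the explicit vector length to a mask, as needed for targets that
only offer mask predication, can be pictured as folding the length into the
mask.  The sketch below is a scalar model under that assumption;
``foldEvlIntoMask`` is a hypothetical helper name and does not refer to an
existing LLVM function.

.. code-block:: c++

  #include <array>
  #include <cassert>
  #include <cstddef>

  // Hypothetical helper (not an LLVM API): folds an explicit vector length
  // into the mask, so that the operation can afterwards run over all N lanes
  // under mask predication alone.
  template <std::size_t N>
  std::array<bool, N> foldEvlIntoMask(const std::array<bool, N> &mask,
                                      std::size_t evl) {
    assert(evl <= N && "explicit vector length exceeds the lane count");
    std::array<bool, N> folded{};
    for (std::size_t i = 0; i < N; ++i) {
      // Lanes at or beyond evl are switched off; the rest keep their mask bit.
      folded[i] = mask[i] && i < evl;
    }
    return folded;
  }

With the folded mask, a vector length equal to the full lane count selects
exactly the lanes that the original mask and vector length selected together.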

3. Lift InstSimplify/InstCombine/DAGCombiner to VP
--------------------------------------------------

- Introduce ``PredicatedInstruction``, ``PredicatedBinaryOperator``, ... helper
  classes that match standard vector IR and VP intrinsics.
- Add a matcher context to PatternMatch and context-aware IR Builder APIs.
- Incrementally lift DAGCombiner to work on VP SDNodes as well as on regular
  vector instructions.
- Incrementally lift InstCombine/InstSimplify to operate on VP as well as
  regular IR instructions.

Result: Optimization of VP intrinsics on par with standard vector instructions.

4. Deprecate llvm.masked.* / llvm.experimental.reduce.*
-------------------------------------------------------

- Modernize ``llvm.masked.*`` / ``llvm.experimental.reduce.*`` by translating
  to VP.
- Remove (DCE) the transitional APIs.

Result: VP has superseded earlier vector intrinsics.

5. Predicated IR Instructions
-----------------------------

- Vector instructions have an optional mask and vector length parameter. These
  lower to VP SDNodes (from Stage 2).
- Phase out VP intrinsics, only keeping those that are not equivalent to
  vectorized scalar instructions (reductions, shuffles, ...).
- InstCombine/InstSimplify expect predication in regular Instructions (Stage 3
  has laid the groundwork).

Result: Native vector predication in IR.

References
==========

.. [MaskedIR] ``llvm.masked.*`` intrinsics,
   https://llvm.org/docs/LangRef.html#masked-vector-load-and-store-intrinsics

.. [VPRFC] RFC: Prototype & Roadmap for vector predication in LLVM,
   https://reviews.llvm.org/D57504