File: GATHER.md

<!---======================= begin_copyright_notice ============================

Copyright (C) 2020-2021 Intel Corporation

SPDX-License-Identifier: MIT

============================= end_copyright_notice ==========================-->

 

## Opcode

  GATHER = 0x39

## Format

| | | | | | | |
| --- | --- | --- | --- | --- | --- | --- |
| 0x39(GATHER) | Elt_size | Is_modified | Num_elts | Surface | Global_offset | Element_offset |
|              | Dst      |             |          |         |               |                |


## Semantics




    for (i = 0; i < exec_size; ++i) {
      if (ChEn[i]) {
        // Offsets are in units of the element size; each element is 1, 2, or 4 bytes.
        dst[i] = surface[global_offset + element_offset[i]];
      }
    }
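
The loop above can be emulated on the host to check expectations about offsets and channel enables. The following is a minimal sketch, not the compiler's implementation: the function name `gather`, the use of a `bytes` object as the surface, little-endian byte order, and zero-extension of 1 and 2 byte elements are all assumptions made for illustration (the specification leaves the upper bytes undefined). It also assumes every access is in bounds.

```python
def gather(surface, global_offset, element_offset, ch_en, elt_size, dst):
    """Emulate the per-channel GATHER loop on a bytes-like surface.

    Offsets are in units of elt_size bytes, as the field descriptions state.
    Little-endian order and zero-extension are assumptions of this sketch.
    """
    for i in range(len(element_offset)):
        if ch_en[i]:
            # Convert the element-granular offset to a byte address.
            byte_addr = (global_offset + element_offset[i]) * elt_size
            chunk = surface[byte_addr:byte_addr + elt_size]
            dst[i] = int.from_bytes(chunk, "little")
    return dst


surface = bytes(range(16))
dst = [0xFFFFFFFF] * 4
gather(surface, 1, [0, 2, 1, 0], [True, False, True, True], 4, dst)
```

Channel 1 is disabled in this call, so `dst[1]` keeps its previous value while the other channels receive the 4 byte elements at element indices 1, 3 is skipped, 2, and 1.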

## Description


    Performs a scattered read of 1, 8, or 16 elements from <surface> and stores the result in <dst>.

- **Elt_size(ub):** 
 
  - Bit[1..0]: encodes the byte size of each element
 
    - 0b00:  1 byte 
    - 0b01:  2 bytes 
    - 0b10:  4 bytes
- **Is_modified(ub):** This field is ignored; the read always returns the last write from this thread.

- **Num_elts(ub):** 
 
  - Bit[1..0]: encodes the number of elements that will be read
 
    - 0b00:  8 elements 
    - 0b01:  16 elements 
    - 0b10:  1 element 
  - Bit[7..4]: encodes the execution mask as described in Table 4.

- **Surface(ub):** Index of the surface variable. It must be a buffer. Valid values are:
 
  - 0: T0 - Shared Local Memory (SLM) access 
  - 5: T255 - Stateless surface access
- **Global_offset(scalar):** The global offset applied to all elements, in units of the element size. Must have type UD.

- **Element_offset(raw_operand):** The first Num_elts elements are used as the per-element offsets into the surface (after the global offset is added), in units of the element size. Must have type UD.

- **Dst(raw_operand):** The variable that stores the results of the read. The first Num_elts elements are written. For 1 and 2 byte accesses, the upper bytes have undefined values. Must have type UD, D, or F.
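
The bit encodings of Elt_size and Num_elts listed above can be decoded as follows. This is a sketch under stated assumptions: the helper names `decode_elt_size` and `decode_num_elts` are hypothetical, the code treats the unlisted encoding 0b11 as an error, and the execution-mask bits are returned raw because their meaning is defined in Table 4, which is not reproduced here.

```python
# Encodings taken from the field descriptions above.
ELT_SIZE_BYTES = {0b00: 1, 0b01: 2, 0b10: 4}
NUM_ELTS = {0b00: 8, 0b01: 16, 0b10: 1}


def decode_elt_size(field):
    """Return the element size in bytes from Bit[1..0] of Elt_size."""
    code = field & 0b11
    if code not in ELT_SIZE_BYTES:
        # 0b11 is not listed in the description; rejecting it is an assumption.
        raise ValueError(f"unlisted Elt_size encoding: {code:#04b}")
    return ELT_SIZE_BYTES[code]


def decode_num_elts(field):
    """Return (element count, raw execution-mask bits) from Num_elts."""
    code = field & 0b11
    if code not in NUM_ELTS:
        raise ValueError(f"unlisted Num_elts encoding: {code:#04b}")
    mask_bits = (field >> 4) & 0xF  # Bit[7..4]; interpretation is per Table 4
    return NUM_ELTS[code], mask_bits
```

For example, `decode_num_elts(0x31)` yields a count of 16 with raw mask bits 3.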

#### Properties
- **Out-of-bound Access:** On read: zeros are returned. 
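
The out-of-bound rule can be modeled with a bounds check in front of the read. This is a minimal sketch: `read_element` is a hypothetical helper, the surface is modeled as a `bytes` object with little-endian elements, and treating an element as out of bounds when any of its bytes falls outside the surface is an assumption, since the text above does not say how partially out-of-bound elements behave.

```python
def read_element(surface, elem_index, elt_size):
    """Read one element, returning zero for an out-of-bound access."""
    start = elem_index * elt_size
    end = start + elt_size
    # Assumption: an element is out of bounds if any byte lies outside the surface.
    if start < 0 or end > len(surface):
        return 0
    return int.from_bytes(surface[start:end], "little")


surface = bytes([1, 2, 3, 4])
in_bounds = read_element(surface, 0, 4)   # whole surface as one 4 byte element
past_end = read_element(surface, 1, 4)    # starts at byte 4, outside the surface
```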


## Text
```
    GATHER.<elt_size> <surface> <global_offset> <element_offset> <dst>
```



## Notes