File: SCATTER_SCALED.md

package info (click to toggle)
intel-graphics-compiler 1.0.12504.6-1%2Bdeb12u1
  • links: PTS, VCS
  • area: main
  • in suites: bookworm
  • size: 83,912 kB
  • sloc: cpp: 910,147; lisp: 202,655; ansic: 15,197; python: 4,025; yacc: 2,241; lex: 1,570; pascal: 244; sh: 104; makefile: 25
file content (105 lines) | stat: -rw-r--r-- 2,803 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
<!---======================= begin_copyright_notice ============================

Copyright (C) 2020-2021 Intel Corporation

SPDX-License-Identifier: MIT

============================= end_copyright_notice ==========================-->

 

## Opcode

  SCATTER_SCALED = 0x79

## Format

| | | | | | | |
| --- | --- | --- | --- | --- | --- | --- |
| 0x79(SCATTER_SCALED) | Exec_size | Pred           | Block_size | Num_blocks | Scale | Surface |
|                      | Offset    | Element_offset | Src        |            |       |         |


## Semantics




      for (i = 0; i < exec_size; ++i) {
        if (ChEn[i]) {
          surface[offset+element_offset[i]] = src[i]; //<num_blocks> bytes
        }
      }

## Description


    Performs <exec_size> element scattered write into <surface> using the values from <src>. For each channel <num_blocks> bytes are written. The address to be written is determined by the offset and each element offset.

- **Exec_size(ub):** Execution size
 
  - Bit[2..0]: size of the region for source and destination operands
 
    - 0b000:  1 element (scalar) 
    - 0b001:  2 elements 
    - 0b010:  4 elements 
    - 0b011:  8 elements 
    - 0b100:  16 elements 
    - 0b101:  32 elements 
  - Bit[7..4]: execution mask (explicit control over the enabled channels)
 
    - 0b0000:  M1 
    - 0b0001:  M2 
    - 0b0010:  M3 
    - 0b0011:  M4 
    - 0b0100:  M5 
    - 0b0101:  M6 
    - 0b0110:  M7 
    - 0b0111:  M8 
    - 0b1000:  M1_NM 
    - 0b1001:  M2_NM 
    - 0b1010:  M3_NM 
    - 0b1011:  M4_NM 
    - 0b1100:  M5_NM 
    - 0b1101:  M6_NM 
    - 0b1110:  M7_NM 
    - 0b1111:  M8_NM
- **Pred(uw):** Predication control

- **Block_size(ub):** This field is ignored and block size is always one byte

- **Num_blocks(ub):** 
 
  - Bit[1..0]: encodes the number of blocks to read at each address.
 
    - 0b00:  1 block 
    - 0b01:  2 blocks 
    - 0b10:  4 blocks
- **Scale(uw):** This field is ignored and scale is always zero

- **Surface(ub):** Index of the surface variable.  When scale is non-zero, only T0 (SLM) and T5 (stateless) are supported

- **Offset(scalar):** The global byte offset. Must have type UD

- **Element_offset(raw_operand):** The first Exec_size elements of the operand will be used as the byte offsets into the surface. Must have type UD

- **Src(raw_operand):** The variable providing the values to be written. The first exec_size elements will be used. For 1 and 2 byte accesses the upper bytes will be ignored. Must have type UD,D,F

#### Properties
- **Out-of-bound Access:** On write: data is dropped.


## Text
```
    

		[(<P>)] SCATTER_SCALED.<num_blocks> (<exec_size>) <surface> <offset> <element_offset> <src>
```



## Notes



    The behavior is undefined if more than one channel writes to the same address.