File: SCATTER4_TYPED.md

intel-graphics-compiler 1.0.17791.18-1
<!---======================= begin_copyright_notice ============================

Copyright (C) 2020-2022 Intel Corporation

SPDX-License-Identifier: MIT

============================= end_copyright_notice ==========================-->

## Opcode

  SCATTER4_TYPED = 0x4c

## Format

| | | | | | | |
| --- | --- | --- | --- | --- | --- | --- |
| 0x4c(SCATTER4_TYPED) | Exec_size | Pred | Channels | Surface | U | V |
|                      | R         | LOD  | Src      |         |   |   |


## Semantics


```

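      UD ch_pos = 0;
      for (c = 0; c < 4; ++c) {  // c: 0 = R, 1 = G, 2 = B, 3 = A
        if (ch_mask[c] == 1) {
          // Each enabled channel reads its data starting at the next register of src
          UD ch_start = ch_pos * max(exec_size, GRF_SIZE / 4);  // GRF_SIZE is the register size in bytes
          for (i = 0; i < exec_size; ++i) {
            if (ChEn[i]) {
              // u, v, and r each hold one offset per lane
              *(&surface[u[i]][v[i]][r[i]] + c) = src[ch_start+i]; // one pixel, with type conversion
            }
          }
          ch_pos++;
        }
      }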
```
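
As a rough illustration of the pseudocode above, the following C sketch simulates the write on a tiny in-memory 2D surface. `Surface`, `SURF_W`, `SURF_H`, `src_channel_start`, and `scatter4_typed_2d` are hypothetical names and not part of vISA. The sketch assumes 32-byte registers and a raw 32-bit UINT surface (so no type conversion takes place), and it ignores the R and LOD operands.

```c
#include <stdint.h>

enum { GRF_SIZE = 32 };            /* assumption: register size in bytes */
enum { SURF_W = 4, SURF_H = 4 };   /* hypothetical tiny 2D surface */

/* Hypothetical surface: one 32-bit value per RGBA channel per pixel. */
typedef struct {
    uint32_t px[SURF_W][SURF_H][4];
} Surface;

/* First src element used by the ch_pos-th enabled channel. */
static uint32_t src_channel_start(uint32_t ch_pos, uint32_t exec_size) {
    uint32_t dwords_per_grf = GRF_SIZE / 4;
    uint32_t stride = exec_size > dwords_per_grf ? exec_size : dwords_per_grf;
    return ch_pos * stride;
}

/* ch_mask: bit 0 = R ... bit 3 = A; ch_en[i] != 0 means lane i is enabled. */
static void scatter4_typed_2d(Surface *s, uint32_t exec_size, uint8_t ch_mask,
                              const uint8_t *ch_en, const uint32_t *u,
                              const uint32_t *v, const uint32_t *src) {
    uint32_t ch_pos = 0;
    for (uint32_t c = 0; c < 4; ++c) {
        if (!((ch_mask >> c) & 1u)) {
            continue;                       /* channel not selected */
        }
        uint32_t ch_start = src_channel_start(ch_pos, exec_size);
        for (uint32_t i = 0; i < exec_size; ++i) {
            if (!ch_en[i]) {
                continue;                   /* lane disabled */
            }
            if (u[i] >= SURF_W || v[i] >= SURF_H) {
                continue;                   /* out-of-bound write: data is dropped */
            }
            s->px[u[i]][v[i]][c] = src[ch_start + i];
        }
        ++ch_pos;
    }
}
```

With `exec_size` 8 and a channel mask of `0x5` (R and B), lane `i` takes its R value from `src[i]` and its B value from `src[8 + i]`; disabled lanes and out-of-bound offsets leave the surface untouched.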

## Description





```
    Performs <exec_size>*<num_enabled_channels> dword scattered writes into <surface>, using the values from <src>.
```


- **Exec_size(ub):** Execution size

  - Bit[2..0]: size of the region for source and destination operands

    - 0b011:  8 elements
  - Bit[7..4]: execution mask (explicit control over the enabled channels)

    - 0b0000:  M1
    - 0b0001:  M2
    - 0b0010:  M3
    - 0b0011:  M4
    - 0b0100:  M5
    - 0b0101:  M6
    - 0b0110:  M7
    - 0b0111:  M8
    - 0b1000:  M1_NM
    - 0b1001:  M2_NM
    - 0b1010:  M3_NM
    - 0b1011:  M4_NM
    - 0b1100:  M5_NM
    - 0b1101:  M6_NM
    - 0b1110:  M7_NM
    - 0b1111:  M8_NM

- **Pred(uw):** Predication control


- **Channels(ub):** Channel write mask

  - Bit[3..0]: write mask for the RGBA channels, with R being bit 0 and A being bit 3. At least one channel must be enabled (i.e., "0000" is not allowed)


- **Surface(ub):** Index of the surface variable. It must be a 1D, 2D, or 3D surface.

  - T0 (SLM): no
  - T5 (stateless): no

- **U(raw_operand):** The first exec_size elements contain the U offset. Must have type UD.


- **V(raw_operand):** The first exec_size elements contain the V offset. Must have type UD.


- **R(raw_operand):** The first exec_size elements contain the R offset. Must have type UD.


- **LOD(raw_operand):** The first exec_size elements contain the LOD. Must have type UD.


- **Src(raw_operand):** The values to be written. For each enabled channel, in RGBA order, exec_size elements are written to the surface, subject to predication. Each subsequent enabled channel takes its data from the next register. Must have type UD, D, or F.
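
The bit layouts of the Exec_size and Channels bytes listed above can be unpacked with a few shifts and masks. The C sketch below is only illustrative: `ExecSize`, `decode_exec_size`, `channels_valid`, and `num_enabled_channels` are hypothetical names. It also assumes that Bit[2..0] of Exec_size is a log2 element count; this page lists only 0b011 (8 elements), so other codes should not be taken as valid for this instruction.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t num_elements;  /* assumption: 1 << Bit[2..0]; the page lists only 0b011 = 8 */
    uint32_t mask_index;    /* 1..8, i.e. M1..M8 */
    bool     no_mask;       /* true for the _NM variants (0b1000..0b1111) */
} ExecSize;

static ExecSize decode_exec_size(uint8_t byte) {
    ExecSize e;
    uint8_t mask_bits = (uint8_t)((byte >> 4) & 0xFu);   /* Bit[7..4] */
    e.num_elements = 1u << (byte & 0x7u);                /* Bit[2..0] */
    e.mask_index = (uint32_t)(mask_bits & 0x7u) + 1u;
    e.no_mask = (mask_bits & 0x8u) != 0;
    return e;
}

/* "0000" is not allowed: at least one of R, G, B, A must be set. */
static bool channels_valid(uint8_t channels) {
    return (channels & 0xFu) != 0;
}

/* Number of set bits in Bit[3..0] of the Channels byte. */
static uint32_t num_enabled_channels(uint8_t channels) {
    uint32_t n = 0;
    for (uint32_t c = 0; c < 4; ++c) {
        n += (channels >> c) & 1u;
    }
    return n;
}
```

Under these assumptions, the number of dwords the instruction writes is at most `decode_exec_size(b).num_elements * num_enabled_channels(ch)`, matching the `<exec_size>*<num_enabled_channels>` count in the Description.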


#### Properties
- **Out-of-bound Access:** On write: data is dropped.




## Text
```
    [(<P>)] SCATTER4_TYPED.<channels> (<exec_size>) <surface> <u> <v> <r> <lod> <src>

    //<channels> is one of R, G, B, A, RG, RB, RA, RGB, RGBA, GB, GA, GBA, BA
```
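
The `<channels>` suffix spells out the set bits of the channel mask in RGBA order. A minimal C sketch of that mapping follows; `channel_suffix` is a hypothetical helper, and the list of suffixes in the comment above remains the authority on which spellings are accepted in the text format.

```c
#include <stdint.h>

/*
 * Writes the letters for the set bits of a 4-bit channel mask
 * (bit 0 = R ... bit 3 = A) into out, which must hold at least 5 chars.
 * Returns the number of letters written; 0 means an invalid (empty) mask.
 */
static int channel_suffix(uint8_t mask, char out[5]) {
    static const char letters[4] = {'R', 'G', 'B', 'A'};
    int n = 0;
    for (int c = 0; c < 4; ++c) {
        if ((mask >> c) & 1u) {
            out[n++] = letters[c];
        }
    }
    out[n] = '\0';
    return n;
}
```

For example, a mask of `0x5` yields `RB`, so the instruction would be written as `SCATTER4_TYPED.RB`.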
## Notes






Type conversion from register to surface is performed based on the following rules:

| Src Data Type | Surface Format Type | Write Conversion |
| --- | --- | --- |
| F | FLOAT | IEEE float conversion. Round to even and denormalize if destination is narrower. |
| F | SNORM, UNORM | Convert IEEE float to fixed point. Round to even and clamp to min/max if destination is narrower. |
| D | SINT | Clamp to min/max if destination is narrower. |
| UD | UINT | Clamp to min/max if destination is narrower. |
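
To make the conversion rules above concrete, here is a hedged C sketch for a hypothetical surface with 8-bit channels. The function names are illustrative, the choice of 8-bit destinations is an assumption, and mapping NaN to 0 in the UNORM case is also an assumption; the actual conversion is performed by hardware according to the surface format.

```c
#include <stdint.h>

/* F -> UNORM8: clamp to [0, 1], scale by 255, round to nearest even. */
static uint8_t float_to_unorm8(float f) {
    if (!(f > 0.0f)) {
        return 0;                    /* negative values, zero, and NaN (assumption) */
    }
    if (f >= 1.0f) {
        return 255;                  /* clamp to max */
    }
    float scaled = f * 255.0f;
    uint32_t n = (uint32_t)scaled;   /* truncates toward zero */
    float frac = scaled - (float)n;
    if (frac > 0.5f || (frac == 0.5f && (n & 1u))) {
        ++n;                         /* ties go to the even neighbour */
    }
    return (uint8_t)n;
}

/* D -> SINT8: clamp to the destination range. */
static int8_t d_to_sint8(int32_t v) {
    if (v > INT8_MAX) {
        return INT8_MAX;
    }
    if (v < INT8_MIN) {
        return INT8_MIN;
    }
    return (int8_t)v;
}

/* UD -> UINT8: clamp to the destination maximum. */
static uint8_t ud_to_uint8(uint32_t v) {
    return v > UINT8_MAX ? (uint8_t)UINT8_MAX : (uint8_t)v;
}
```

Under these assumptions, `0.5f` scales to 127.5 and rounds to the even value 128, while an out-of-range integer such as 300 is clamped to 127 (SINT8) or 255 (UINT8).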


The behavior is undefined if more than one channel writes to the same address.

If an offset operand is not applicable for the surface accessed (e.g., V and R for 1D surfaces), it should be set to V0 (the null variable).