File: FCVT.md

<!---======================= begin_copyright_notice ============================

Copyright (C) 2020-2022 Intel Corporation

SPDX-License-Identifier: MIT

============================= end_copyright_notice ==========================-->

## Opcode

  FCVT = 0x1d

## Format

| | | | |
| --- | --- | --- | --- |
| 0x1d(FCVT) | Exec_size | Dst | Src0 |


## Semantics


```
for (i = 0; i < exec_size; ++i) {
  if (ChEn[i]) {    // ChEn[i] is always true if dst has FP8 type
    dst[i] = src0[i];    // the value is converted to the type of dst
  }
}
```
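
The per-channel semantics above can be mirrored in a few lines of Python. This is only an illustrative sketch: `fcvt_channels` and its `convert` callback are hypothetical names, not part of vISA, and the callback stands in for whichever conversion the operand types select. Disabled channels keep their previous destination value.

```python
from typing import Callable, List, Sequence


def fcvt_channels(dst: List[int], src0: Sequence[int],
                  ch_en: Sequence[bool],
                  convert: Callable[[int], int]) -> None:
    """Write convert(src0[i]) into dst[i] for every enabled channel i."""
    for i in range(len(src0)):
        if ch_en[i]:
            dst[i] = convert(src0[i])


# Example: BF8 bytes widened to HF bit patterns, with channel 1 disabled.
dst = [0xAA] * 4
fcvt_channels(dst, [0x3C, 0x40, 0x42, 0x44],
              [True, False, True, True],
              lambda b: (b & 0xFF) << 8)
```

With a NoMask control (or a BF8 destination, per the comment above) every entry of `ch_en` would be true.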

## Description





```
    Performs type conversion between FP8 and HF from <src0> to <dst>. FP8 here means BF8: an 8-bit float with a 1-bit sign, 5-bit exponent, and 2-bit mantissa, also known as E5M2. HF-to-BF8 conversion uses the RTE (round-to-nearest-even) rounding mode, and denorms are retained. BF8-to-HF conversion is exact, so no rounding is involved. BF8 is denoted by type UB, as vISA has no BF8 type.

    {PVC_XT+} It also converts from float to TF32 (TensorFloat: 1-bit sign, 8-bit exponent, and 10-bit mantissa), using RTE rounding. Denorms are flushed to zero. There is no conversion from TF32 to float, as a TF32 value is already a valid F value.


```
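
The conversions described above can be sketched on raw bit patterns. Because BF8 (E5M2) and HF share the same sign and exponent widths, a BF8 value is the upper byte of an HF bit pattern, so HF-to-BF8 reduces to rounding away the low 8 mantissa bits, and BF8-to-HF is a left shift. The sketch below is not the hardware or compiler implementation. Its function names are hypothetical, and three behaviours are assumptions the text does not specify: NaN inputs produce a quiet NaN, a flushed denorm keeps its sign, and a TF32 result sits in a 32-bit container with the low 13 mantissa bits cleared.

```python
import struct


def hf_to_bf8_rte(h: int) -> int:
    """Round a 16-bit HF pattern to an 8-bit BF8 (E5M2) pattern using RTE."""
    h &= 0xFFFF
    if (h & 0x7C00) == 0x7C00 and (h & 0x03FF) != 0:
        # NaN: keep the sign and force a quiet NaN (assumption).
        return ((h >> 8) | 0x02) & 0xFF
    lsb = (h >> 8) & 1           # lowest bit that survives, used to break ties
    rounded = h + 0x7F + lsb     # round to nearest, ties to even
    return (rounded >> 8) & 0xFF  # denorms are kept, overflow becomes infinity


def bf8_to_hf(b: int) -> int:
    """Widen a BF8 pattern to HF; exact, so no rounding is needed."""
    return (b & 0xFF) << 8


def f32_to_tf32_rte(bits: int) -> int:
    """Round a 32-bit float pattern to TF32 (10-bit mantissa) using RTE."""
    bits &= 0xFFFFFFFF
    sign = bits & 0x80000000
    exp = bits & 0x7F800000
    if exp == 0x7F800000:
        # Infinity passes through; NaN becomes a quiet NaN (assumption).
        return bits if (bits & 0x007FFFFF) == 0 else sign | 0x7FC00000
    if exp == 0:
        return sign              # zero or denorm: flush to signed zero (assumption)
    lsb = (bits >> 13) & 1
    rounded = bits + 0x0FFF + lsb
    return rounded & 0xFFFFE000  # clear the 13 discarded mantissa bits


def hf_bits(x: float) -> int:
    """Bit pattern of x as an IEEE half-precision value."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


print(hex(hf_to_bf8_rte(hf_bits(1.125))))   # tie, rounds down to even: 0x3c (1.0)
print(hex(hf_to_bf8_rte(hf_bits(1.375))))   # tie, rounds up to even: 0x3e (1.5)
print(hex(f32_to_tf32_rte(0x3F801000)))     # tie, rounds to even: 0x3f800000
```

Working on integer bit patterns keeps the rounding step explicit and avoids depending on how the host language rounds floating-point values.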


- **Exec_size(ub):** Execution size

  - Bit[2..0]: size of the region for source and destination operands

    - 0b000:  1 element (scalar)
    - 0b001:  2 elements
    - 0b010:  4 elements
    - 0b011:  8 elements
    - 0b100:  16 elements
    - 0b101:  32 elements
  - Bit[7..4]: execution mask (explicit control over the enabled channels)

    - 0b0000:  M1
    - 0b0001:  M2
    - 0b0010:  M3
    - 0b0011:  M4
    - 0b0100:  M5
    - 0b0101:  M6
    - 0b0110:  M7
    - 0b0111:  M8
    - 0b1000:  M1_NM
    - 0b1001:  M2_NM
    - 0b1010:  M3_NM
    - 0b1011:  M4_NM
    - 0b1100:  M5_NM
    - 0b1101:  M6_NM
    - 0b1110:  M7_NM
    - 0b1111:  M8_NM

- **Dst(vec_operand):** The destination operand. Operand class: general


- **Src0(vec_operand):** The first source operand. Operand class: general


#### Properties
- **Supported Types:** B, HF, UB. {PVC_XT+} F, UD
- **Source Modifier:** false
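
As a reading aid for the Exec_size byte layout listed above, the following sketch decodes the two fields. The function name `decode_exec_size` is hypothetical. Treating size codes above 0b101 as errors is an assumption based on the list stopping at 32 elements.

```python
# Execution mask names in encoding order: M1..M8, then M1_NM..M8_NM.
MASK_NAMES = [f"M{i}" for i in range(1, 9)] + [f"M{i}_NM" for i in range(1, 9)]


def decode_exec_size(byte: int) -> tuple:
    """Split an Exec_size byte into (element count, execution mask name)."""
    size_code = byte & 0x7           # Bit[2..0]
    if size_code > 0b101:
        raise ValueError(f"size code {size_code:#05b} is not listed")
    mask_code = (byte >> 4) & 0xF    # Bit[7..4]
    return 1 << size_code, MASK_NAMES[mask_code]


print(decode_exec_size(0x83))  # (8, 'M1_NM')
```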




## Text
```
FCVT (<exec_size>) <dst> <src0>
```

## Notes





- If Dst has HF type, Src0 must have UB type (which represents a BF8 value).
- If Dst has UB type (which represents a BF8 value), Src0 must have HF type, and NM (NoMask) mask control must be used.
- {PVC_XT+} If Dst has UD type (holding a TF32 value), Src0 must have F type. Src0 cannot have UD type (holding a TF32 value).
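
The operand constraints in these notes can be expressed as a small checker. This is a sketch only: `check_fcvt_operands` is a hypothetical helper, type and mask names are passed as plain strings, and any Dst type the notes do not mention (including B, which appears under Supported Types) is reported as not covered rather than guessed at.

```python
def check_fcvt_operands(dst: str, src0: str, mask: str,
                        pvc_xt_plus: bool = False) -> list:
    """Return a list of rule violations; an empty list means the notes are met."""
    errors = []
    if dst == "HF":
        if src0 != "UB":
            errors.append("Dst HF requires Src0 UB (BF8)")
    elif dst == "UB":
        if src0 != "HF":
            errors.append("Dst UB (BF8) requires Src0 HF")
        if not mask.endswith("_NM"):
            errors.append("Dst UB (BF8) requires a NoMask (NM) mask control")
    elif dst == "UD":
        if not pvc_xt_plus:
            errors.append("Dst UD (TF32) is only available on PVC_XT+")
        if src0 != "F":
            errors.append("Dst UD (TF32) requires Src0 F")
    else:
        errors.append(f"Dst type {dst} is not covered by the notes")
    return errors


print(check_fcvt_operands("UB", "HF", "M1"))     # reports the missing NoMask
print(check_fcvt_operands("UB", "HF", "M1_NM"))  # []
```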