External Hardware Documentation and Resources
=============================================

Information about hardware behavior comes from a mix of official and
reverse-engineered sources.

Command buffers
^^^^^^^^^^^^^^^

 * `NVIDIA open-gpu-doc repository`_ is official documentation from NVIDIA that
   has been released to the public. The majority of this documentation comes in
   the form of class headers which describe the class state registers.

 * `NVIDIA open-gpu-kernel-modules repository`_ is the open-source kernel mode
   driver that NVIDIA ships on Turing+ GPUs with GSP. The code here can provide
   examples of how to use some hardware features. If open-gpu-doc is missing a
   class header, sometimes there will be one here.

 * Reverse-engineered command names from `envytools`_ are available in Mesa
   under e.g. ``src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h``. These are no
   longer updated; NVK uses the open-gpu-doc headers instead.

 * `envyhooks`_ is the modern way to dump command sequences from the proprietary
   driver.

 * ``nv_push_dump`` is part of Mesa and can disassemble command sequences (build
   with ``-D tools=nouveau``, run ``src/nouveau/headers/nv_push_dump`` from the
   build dir).

 .. _NVIDIA open-gpu-doc repository: https://github.com/NVIDIA/open-gpu-doc
 .. _NVIDIA open-gpu-kernel-modules repository: https://github.com/NVIDIA/open-gpu-kernel-modules
 .. _envyhooks: https://gitlab.freedesktop.org/nouveau/envyhooks

Shader ISA
^^^^^^^^^^

 * `NVIDIA PTX documentation`_ is NVIDIA documentation for CUDA's
   intermediate representation. We don't use PTX directly, but this often has
   hints about how underlying hardware instructions work. For example, the PTX
   ``redux`` instruction is pretty much identical to the hardware instruction of
   the same name.

 * `CUDA Binary Utilities`_ is documentation for CUDA's disassembler,
   ``nvdisasm``. It includes a brief description of most hardware instructions.
   There's also an `older version`_ that covers older architectures (Kepler
   through Volta).

 * Kuter Dinel has reverse-engineered instruction encodings for the `Hopper
   ISA`_ and `Ada ISA`_ which are autogenerated from his `nv_isa_solver`_
   project.

 * `nv-shader-tools`_ has some additional tools for disassembling and fuzzing
   the hardware ISA.

 * Mel has dumped a `list of available instructions
   <list of avaiable instructions_>`_ and their opcodes on recent architectures
   by scraping nvdisasm error messages.

 * The `Volta whitepaper`_ section "Independent Thread Scheduling" has an
   overview of the control flow model used on Volta+ GPUs.

 * `Dissecting the NVidia Turing T4 GPU via Microbenchmarking`_ has
   reverse-engineered info about the Turing instruction encoding. See especially
   section "2.1 Control information" for an overview of compiler-inserted delays
   and waits on Maxwell and later.

 * `Analyzing Modern NVIDIA GPU cores`_ has additional reverse-engineered info
   about the semantics of compiler-inserted delays and waits.

 * `Control Flow Management in Modern GPUs`_ has more detail about control flow
   reconvergence on Volta+.

 * `maxas`_ has some reverse-engineered info on the Maxwell ISA.

 * `asfermi`_ has some reverse-engineered info on the older Fermi ISA.

 * Red Hat has some NDA'd documentation on instruction latencies from NVIDIA.
   Bother karolherbst or airlied on IRC if you're missing a latency class for an
   instruction on recent architectures.

 * Instruction behavior is tested using the hardware tests in
   ``src/nouveau/compiler/nak/hw_tests.rs`` and the corresponding ``Foldable``
   implementations in ``src/nouveau/compiler/nak/ir.rs`` (build with ``-D
   build-tests=true`` and run ``src/nouveau/compiler/nak hw_tests`` from the
   build dir).

 * NAK's instruction encodings are tested against nvdisasm using
   ``src/nouveau/compiler/nak/nvdisasm_tests.rs`` (build with ``-D
   build-tests=true`` and run ``src/nouveau/compiler/nak nvdisasm_tests`` from
   the build dir).

 * The old GL driver's compiler, under ``src/gallium/drivers/nouveau/codegen``,
   has some information. This is especially useful for graphics-only
   instructions, which are often not covered by other sources.

 * `Compiler Explorer`_ is a convenient tool to see what assembly NVIDIA
   generates for a given CUDA program.

 .. _NVIDIA PTX documentation: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html
 .. _CUDA Binary Utilities: https://docs.nvidia.com/cuda/cuda-binary-utilities/index.html#instruction-set-reference
 .. _older version: https://docs.nvidia.com/cuda/archive/11.8.0/cuda-binary-utilities/index.html#instruction-set-ref
 .. _Hopper ISA: https://kuterdinel.com/nv_isa/
 .. _Ada ISA: https://kuterdinel.com/nv_isa_sm89/
 .. _nv_isa_solver: https://github.com/kuterd/nv_isa_solver
 .. _nv-shader-tools: https://gitlab.freedesktop.org/nouveau/nv-shader-tools
 .. _list of avaiable instructions: https://gitlab.freedesktop.org/mhenning/re/-/tree/main/opclass?ref_type=heads
 .. _Volta whitepaper: https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf
 .. _Dissecting the NVidia Turing T4 GPU via Microbenchmarking: https://arxiv.org/pdf/1903.07486
 .. _Analyzing Modern NVIDIA GPU cores: https://arxiv.org/pdf/2503.20481
 .. _Control Flow Management in Modern GPUs: https://arxiv.org/pdf/2407.02944
 .. _maxas: https://github.com/NervanaSystems/maxas/wiki
 .. _asfermi: https://github.com/hyqneuron/asfermi/wiki
 .. _Compiler explorer: https://godbolt.org/z/1jrfhq5G7
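
The PTX ``redux`` note and the Compiler Explorer entry above can be combined
into a quick experiment. The following is a minimal sketch, not taken from
Mesa: the kernel name ``warp_sum`` and its parameters are made up for
illustration, and it assumes a CUDA toolkit that provides the
``__reduce_add_sync`` intrinsic and a target of ``sm_80`` or newer (for example
``-arch=sm_80``). Compiled that way, the PTX output is expected to contain a
``redux`` instruction, which can then be compared with the corresponding
instruction in the disassembly.

.. code-block:: cuda

   // Hypothetical kernel for illustration only. It assumes a launch with
   // exactly one full warp, e.g. warp_sum<<<1, 32>>>(in, out), so that every
   // lane named in the mask below actually executes the intrinsic.
   __global__ void warp_sum(const unsigned *in, unsigned *out)
   {
       // Each of the 32 lanes loads one value.
       unsigned value = in[threadIdx.x];

       // Warp-wide integer add across all lanes named in the mask.
       // This intrinsic requires compute capability 8.0 or higher.
       unsigned sum = __reduce_add_sync(0xffffffffu, value);

       // Every participating lane receives the same result; lane 0 stores it.
       if (threadIdx.x == 0)
           *out = sum;
   }

Only the kernel is shown because the generated code is the point of interest;
no host-side launch code is needed to inspect the compiler output.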

Misc
^^^^

 * `envytools`_ has reverse-engineered documentation for Maxwell and earlier
   hardware.
 * The NVIDIA architecture whitepapers give a basic overview of what has changed
   between hardware revisions. See e.g. the `Blackwell whitepaper`_.
 * The NVIDIA architecture tuning guides often describe what has changed in a
   hardware generation, typically with information about the memory subsystem
   or occupancy. See e.g. the `Blackwell tuning guide`_.
 * `The Nouveau wiki's CodeNames page`_ is useful for mapping NVIDIA marketing
   names to engineering names.
 * `Matching CUDA arch and CUDA gencode for various NVIDIA architectures`_ has a
   useful table comparing SM versions to engineering names.

 .. _envytools: https://envytools.readthedocs.io/en/latest/hw/index.html
 .. _Blackwell whitepaper: https://images.nvidia.com/aem-dam/Solutions/geforce/blackwell/nvidia-rtx-blackwell-gpu-architecture.pdf
 .. _Blackwell tuning guide: https://docs.nvidia.com/cuda/blackwell-tuning-guide/index.html
 .. _The Nouveau wiki's CodeNames page: https://nouveau.freedesktop.org/CodeNames.html
 .. _Matching CUDA arch and CUDA gencode for various NVIDIA architectures: https://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/