docs/nvk: Add a list of external hardware docs

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37828>
author: Mel Henning <mhenning@darkrefraction.com> 2025-10-10 17:45:56 -0400
committer: Marge Bot <marge-bot@fdo.invalid> 2025-10-29 20:08:38 +0000
commit: 0afd4bc831409a61546f0884bef53e6526611e13 (patch)
tree: 729d417464c101dd21a7fe429f10ef38bcee8d92 /docs
parent: 6da54821dab9aba6be3be1d6c5dba83dc806dfd7 (diff)
2 files changed, 140 insertions, 6 deletions
diff --git a/docs/drivers/nvk.rst b/docs/drivers/nvk.rst
index 494d69bf840..354dd955328 100644
--- a/docs/drivers/nvk.rst
+++ b/docs/drivers/nvk.rst
@@ -75,10 +75,10 @@ specific to NVK:
    poorly tested or completely broken.  This is intended for developer use
    only.
 
-Hardware Documentation
-----------------------
+Developer info
+--------------
 
-What little documentation we have can be found in the `NVIDIA open-gpu-doc
-repository <https://github.com/NVIDIA/open-gpu-doc>`__.  The majority of
-our documentation comes in the form of class headers which describe the
-class state registers.
+.. toctree::
+   :glob:
+
+   nvk/*
diff --git a/docs/drivers/nvk/external_hardware_docs.rst b/docs/drivers/nvk/external_hardware_docs.rst
new file mode 100644
index 00000000000..1ba2f451d26
--- /dev/null
+++ b/docs/drivers/nvk/external_hardware_docs.rst
@@ -0,0 +1,134 @@
+
+External Hardware Documentation and Resources
+=============================================
+
+Information about hardware behavior comes from a mix of official and
+reverse-engineered sources.
+
+Command buffers
+^^^^^^^^^^^^^^^
+
+ * `NVIDIA open-gpu-doc repository`_ is official documentation from NVIDIA that
+   has been released to the public. The majority of this documentation comes in
+   the form of class headers which describe the class state registers.
+
+ * `NVIDIA open-gpu-kernel-modules repository`_ is the open-source kernel mode
+   driver that NVIDIA ships on Turing+ GPUs with GSP. The code here can provide
+   examples of how to use some hardware features. If open-gpu-doc is missing a
+   class header, sometimes there will be one here.
+
+ * Reverse-engineered command names from `envytools`_ are available in Mesa
+   under e.g. ``src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h``. These are no
+   longer updated; NVK uses the open-gpu-doc headers instead.
+
+ * `envyhooks`_ is the modern way to dump command sequences from the proprietary
+   driver.
+
+ * ``nv_push_dump`` is part of Mesa and can disassemble command sequences (build
+   with ``-D tools=nouveau``, then run ``src/nouveau/headers/nv_push_dump`` from
+   the build dir).
+
+ .. _NVIDIA open-gpu-doc repository: https://github.com/NVIDIA/open-gpu-doc
+ .. _NVIDIA open-gpu-kernel-modules repository: https://github.com/NVIDIA/open-gpu-kernel-modules
+ .. _envyhooks: https://gitlab.freedesktop.org/nouveau/envyhooks
+
+Shader ISA
+^^^^^^^^^^
+
+ * `NVIDIA PTX documentation`_ is NVIDIA documentation for CUDA's
+   intermediate representation. We don't use PTX directly, but this often has
+   hints about how underlying hardware instructions work. For example, the PTX
+   ``redux`` instruction is pretty much identical to the hardware instruction of
+   the same name.
+
+ * `CUDA Binary Utilities`_ is documentation for CUDA's disassembler,
+   ``nvdisasm``. It includes a brief description of most hardware instructions.
+   There's also an `older version`_ that has older architectures (Kepler through
+   Volta).
+
+ * Kuter Dinel has reverse-engineered instruction encodings for the `Hopper
+   ISA`_ and `Ada ISA`_ which are autogenerated from his `nv_isa_solver`_
+   project.
+
+ * `nv-shader-tools`_ has some additional tools for disassembling and fuzzing
+   the hardware ISA.
+
+ * Mel has dumped a `list of available instructions <list of avaiable instructions_>`__
+   and their opcodes on recent architectures by scraping nvdisasm error messages.
+
+ * The `Volta whitepaper`_ section "Independent Thread Scheduling" has an
+   overview of the control flow model used on Volta+ GPUs.
+
+ * `Dissecting the NVidia Turing T4 GPU via Microbenchmarking`_ has
+   reverse-engineered info about the Turing instruction encoding. See especially
+   section "2.1 Control information" for an overview of compiler-inserted delays
+   and waits on Maxwell and later.
+
+ * `Analyzing Modern NVIDIA GPU cores`_ has additional reverse-engineered info
+   about the semantics of compiler-inserted delays and waits.
+
+ * `Control Flow Management in Modern GPUs`_ has more detail about control flow
+   reconvergence on Volta+.
+
+ * `maxas`_ has some reverse-engineered info on the Maxwell ISA.
+
+ * `asfermi`_ has some reverse-engineered info on the older Fermi ISA.
+
+ * Red Hat has some NDA'd documentation on instruction latencies from NVIDIA.
+   Bother karolherbst or airlied on IRC if you're missing a latency class for an
+   instruction on recent architectures.
+
+ * Instruction behavior is tested using the hardware tests in
+   ``src/nouveau/compiler/nak/hw_tests.rs`` and the corresponding ``Foldable``
+   implementations in ``src/nouveau/compiler/nak/ir.rs`` (build with ``-D
+   build-tests=true`` and run ``src/nouveau/compiler/nak hw_tests`` from the
+   build dir).
+
+ * NAK's instruction encodings are tested against nvdisasm using
+   ``src/nouveau/compiler/nak/nvdisasm_tests.rs`` (build with ``-D
+   build-tests=true`` and run ``src/nouveau/compiler/nak nvdisasm_tests`` from
+   the build dir).
+
+ * The old GL driver's compiler, under ``src/gallium/drivers/nouveau/codegen``,
+   has some information. This is especially useful for graphics-only
+   instructions, which are often not covered by other sources.
+
+ * `Compiler explorer`_ is a convenient tool to see what assembly NVIDIA
+   generates for a given CUDA program.
+
+ .. _NVIDIA PTX documentation: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html
+ .. _CUDA Binary Utilities: https://docs.nvidia.com/cuda/cuda-binary-utilities/index.html#instruction-set-reference
+ .. _older version: https://docs.nvidia.com/cuda/archive/11.8.0/cuda-binary-utilities/index.html#instruction-set-ref
+ .. _Hopper ISA: https://kuterdinel.com/nv_isa/
+ .. _Ada ISA: https://kuterdinel.com/nv_isa_sm89/
+ .. _nv_isa_solver: https://github.com/kuterd/nv_isa_solver
+ .. _nv-shader-tools: https://gitlab.freedesktop.org/nouveau/nv-shader-tools
+ .. _list of avaiable instructions: https://gitlab.freedesktop.org/mhenning/re/-/tree/main/opclass?ref_type=heads
+ .. _Volta whitepaper: https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf
+ .. _Dissecting the NVidia Turing T4 GPU via Microbenchmarking: https://arxiv.org/pdf/1903.07486
+ .. _Analyzing Modern NVIDIA GPU cores: https://arxiv.org/pdf/2503.20481
+ .. _Control Flow Management in Modern GPUs: https://arxiv.org/pdf/2407.02944
+ .. _maxas: https://github.com/NervanaSystems/maxas/wiki
+ .. _asfermi: https://github.com/hyqneuron/asfermi/wiki
+ .. _Compiler explorer: https://godbolt.org/z/1jrfhq5G7
+
+Misc
+^^^^
+
+ * `envytools`_ has reverse-engineered documentation for Maxwell and earlier
+   hardware.
+ * The NVIDIA architecture whitepapers give a basic overview of what has changed
+   between hardware revisions. See e.g. the `Blackwell whitepaper`_.
+ * The NVIDIA architecture tuning guides often mention how details of a hardware
+   generation have changed, often with information about the memory subsystem or
+   occupancy. See e.g. the `Blackwell tuning guide`_.
+ * `The Nouveau wiki's CodeNames page`_ is useful for mapping NVIDIA marketing
+   names to engineering names.
+ * `Matching CUDA arch and CUDA gencode for various NVIDIA architectures`_ has a
+   useful table comparing SM versions to engineering names.
+
+ .. _envytools: https://envytools.readthedocs.io/en/latest/hw/index.html
+ .. _Blackwell whitepaper: https://images.nvidia.com/aem-dam/Solutions/geforce/blackwell/nvidia-rtx-blackwell-gpu-architecture.pdf
+ .. _Blackwell tuning guide: https://docs.nvidia.com/cuda/blackwell-tuning-guide/index.html
+ .. _The Nouveau wiki's CodeNames page: https://nouveau.freedesktop.org/CodeNames.html
+ .. _Matching CUDA arch and CUDA gencode for various NVIDIA architectures: https://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/
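
Note on the last two Misc entries: the sketch below shows the kind of lookup those tables support, going from a CUDA SM version to an architecture family and the usual chip-name prefix. It is not part of this commit or of Mesa. The table contents and the names ``SM_ARCH`` and ``sm_to_arch`` are assumptions written for illustration only, and the table stops at Hopper, so check the linked pages for newer or more detailed data.

```python
# Hypothetical helper, not taken from Mesa or from the linked pages.
# Each row: (lowest SM, highest SM, architecture family, chip-name prefix).
SM_ARCH = [
    (50, 53, "Maxwell", "GM"),
    (60, 62, "Pascal", "GP"),
    (70, 72, "Volta", "GV"),
    (75, 75, "Turing", "TU"),
    (80, 87, "Ampere", "GA"),
    (89, 89, "Ada", "AD"),
    (90, 90, "Hopper", "GH"),
]


def sm_to_arch(sm):
    """Return (family, chip prefix) for an SM version such as 75 or 89."""
    for lowest, highest, family, prefix in SM_ARCH:
        if lowest <= sm <= highest:
            return family, prefix
    # Anything outside the table is reported rather than guessed.
    raise KeyError(f"SM version {sm} is not in this illustrative table")


if __name__ == "__main__":
    for sm in (52, 75, 89):
        family, prefix = sm_to_arch(sm)
        print(f"sm_{sm}: {family} ({prefix}xxx)")
```

The ranges are deliberately coarse: they group SM versions by family and do not identify individual chips.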