Updates in 2022.1
General
- Added support for the CUDA toolkit 11.6. 
- Added support for GA103 chips. 
- Added a new Range Replay mode to profile ranges of multiple, concurrent kernels. Range Replay is available in the NVIDIA Nsight Compute CLI and the non-interactive Profile activity.
- Added a new rule to detect non-fused floating-point instructions. 
- The Uncoalesced Memory access rules now show results in a dynamic table. 
- Unix Domain Sockets and Windows Named Pipes are now used for local connections between the host and target processes on x86_64 Linux and Windows, respectively.
- The NvRules API now supports querying action names using different function name bases (e.g. demangled). 
NVIDIA Nsight Compute
- The default report page is now chosen automatically when opening a report. 
- Added coverage for ECC (Error Correction Code) operations in the L2 Cache table of the Memory Analysis section. 
- Added a new L2 Evict Policies table to the Memory Analysis section. 
- The Occupancy Calculator now updates automatically when the input changes. 
- Added a new metric, Thread Instructions Executed, to the Source page.
- Added tooltips to the Register Dependency columns in the Source page to identify the associated register more conveniently. 
- Improved the selection of Sections and Sets in the Profile activity connection dialog. 
- NVLink utilization is shown in the NVLink Tables section. 
- NVLink links are colored according to the measured throughput. 
NVIDIA Nsight Compute CLI
- The --kernel-regex and --kernel-regex-base options are no longer supported. The alternative options are --kernel-name and --kernel-name-base, respectively, added in 2021.1.0.
- Added support for resolving CUDA source files in the --page source output with the new --resolve-source-file command line option.
- Added a new option, --target-processes-filter, to filter the processes being profiled by name.
- The CPU Stack Trace is shown in the NVIDIA Nsight Compute CLI output. 
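
For scripts that still pass the removed kernel filter options, the rename can be sketched as a small argument rewriter. This is an illustrative helper, not part of NVIDIA Nsight Compute: the function name migrate_args is hypothetical, and it assumes that --kernel-name accepts a regex:<expression> value as the equivalent of the old regex filter. Check the output of ncu --help for your installed version before relying on it.

```python
# Hypothetical helper: rewrite removed ncu kernel-filter options.
# Old option -> replacement option, as named in the release note above.
RENAMED = {
    "--kernel-regex": "--kernel-name",
    "--kernel-regex-base": "--kernel-name-base",
}


def migrate_args(args):
    """Return a copy of args with the removed options replaced.

    Assumption: --kernel-name accepts a "regex:<expression>" value, so the
    old regular expression is carried over with that prefix.
    """
    out = []
    i = 0
    while i < len(args):
        # Accept both "--opt value" and "--opt=value" spellings.
        name, sep, value = args[i].partition("=")
        if name in RENAMED:
            if not sep:
                # The value is the next token; fail loudly if it is missing.
                i += 1
                if i >= len(args):
                    raise ValueError(f"missing value for {name}")
                value = args[i]
            if name == "--kernel-regex":
                value = "regex:" + value
            out.extend([RENAMED[name], value])
        else:
            out.append(args[i])
        i += 1
    return out


old = ["ncu", "--kernel-regex", "gemm.*", "--kernel-regex-base", "demangled", "./app"]
print(migrate_args(old))
```

The rewritten list always uses the two-token form (option, then value), which keeps the output uniform regardless of how the original command line was written.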
Resolved Issues
- Fixed the calculation of aggregated average instruction execution metrics in non-SASS views on the Source page. 
- Fixed an issue where atomic instructions were counted as both loads and stores in the Memory Analysis tables.