File: updates-2020-3.html

<!DOCTYPE html>
<html class="writer-html5" lang="en" >
<head>
  <meta charset="utf-8" /><meta name="generator" content="Docutils 0.17.1: http://docutils.sourceforge.net/" />

  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
  <title>Updates in 2020.3 &mdash; NsightCompute 12.4 documentation</title>
      <link rel="stylesheet" href="../../_static/pygments.css" type="text/css" />
      <link rel="stylesheet" href="../../_static/css/theme.css" type="text/css" />
      <link rel="stylesheet" href="../../_static/design-style.b7bb847fb20b106c3d81b95245e65545.min.css" type="text/css" />
      <link rel="stylesheet" href="../../_static/omni-style.css" type="text/css" />
      <link rel="stylesheet" href="../../_static/api-styles.css" type="text/css" />
    <link rel="shortcut icon" href="../../_static/nsight-compute.ico"/>
  <!--[if lt IE 9]>
    <script src="../../_static/js/html5shiv.min.js"></script>
  <![endif]-->
  
        <script data-url_root="../../" id="documentation_options" src="../../_static/documentation_options.js"></script>
        <script src="../../_static/jquery.js"></script>
        <script src="../../_static/underscore.js"></script>
        <script src="../../_static/doctools.js"></script>
        <script src="../../_static/mermaid-init.js"></script>
        <script src="../../_static/design-tabs.js"></script>
        <script src="../../_static/version.js"></script>
        <script src="../../_static/social-media.js"></script>
    <script src="../../_static/js/theme.js"></script>
    <link rel="index" title="Index" href="../../genindex.html" />
    <link rel="search" title="Search" href="../../search.html" />
 


</head>

<body class="wy-body-for-nav"> 
  <div class="wy-grid-for-nav">
    <nav data-toggle="wy-nav-shift" class="wy-nav-side">
      <div class="wy-side-scroll">
        <div class="wy-side-nav-search" >


  <a href="../../index.html">
  <img src="../../_static/nsight-compute.png" class="logo" alt="Logo"/>
</a>

<div role="search">
  <form id="rtd-search-form" class="wy-form" action="../../search.html" method="get">
    <input type="text" name="q" placeholder="Search docs" />
    <input type="hidden" name="check_keywords" value="yes" />
    <input type="hidden" name="area" value="default" />
  </form>
</div>
        </div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
              <p class="caption" role="heading"><span class="caption-text">Nsight Compute</span></p>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../index.html">1. Release Notes</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../ProfilingGuide/index.html">2. Kernel Profiling Guide</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../NsightCompute/index.html">3. Nsight Compute</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../NsightComputeCli/index.html">4. Nsight Compute CLI</a></li>
</ul>
<p class="caption" role="heading"><span class="caption-text">Developer Interfaces</span></p>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../CustomizationGuide/index.html">1. Customization Guide</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../NvRulesAPI/index.html">2. NvRules API</a></li>
</ul>
<p class="caption" role="heading"><span class="caption-text">Training</span></p>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../Training/index.html">Training</a></li>
</ul>
<p class="caption" role="heading"><span class="caption-text">Release Information</span></p>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../Archives/index.html">Archives</a></li>
</ul>
<p class="caption" role="heading"><span class="caption-text">Copyright and Licenses</span></p>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../../CopyrightAndLicenses/index.html">Copyright and Licenses</a></li>
</ul>

        </div>
      </div>
    </nav>

    <section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"><nav class="wy-nav-top" aria-label="Mobile navigation menu" >
          <i data-toggle="wy-nav-top" class="fa fa-bars"></i>
          <a href="../../index.html">NsightCompute</a>
      </nav>

      <div class="wy-nav-content">
        <div class="rst-content">
          <div role="navigation" aria-label="Page navigation">
  <ul class="wy-breadcrumbs">


<li><a href="../../index.html" class="icon icon-home"></a> &raquo;</li>
<li>Updates in 2020.3</li>

      <li class="wy-breadcrumbs-aside">
      </li>
<li class="wy-breadcrumbs-aside">


  <span>v2024.1.1 |</span>



  <a href="https://developer.nvidia.com/nsight-compute-history" class="reference external">Archive</a>


  <span>&nbsp;</span>
</li>

  </ul>
  <hr/>
</div>
          <div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
           <div itemprop="articleBody">
             
  <section id="updates-in-2020-3">
<h1>Updates in 2020.3<a class="headerlink" href="#updates-in-2020-3" title="Permalink to this headline"></a></h1>
<p><strong>General</strong></p>
<ul class="simple">
<li><p>Added support for <em>derived metrics</em> in section files. Derived metrics can be used to create new metrics based on existing metrics and constants. See the <a class="reference external" href="../CustomizationGuide/index.html#section-derived-metrics">Customization Guide</a> for details.</p></li>
<li><p>Added a new <em>Import Source</em> (<code class="docutils literal notranslate"><span class="pre">--import-source</span></code>) option to the UI and command line to permanently import source files into the report, when available.</p></li>
<li><p>Added a new section that shows selected <em>NVLink</em> metrics on supported systems.</p></li>
<li><p>Added a new <code class="docutils literal notranslate"><span class="pre">launch__func_cache_config</span></code> metric to the <em>Launch Statistics</em> section.</p></li>
<li><p>Added new branch efficiency metrics to the <em>Source Counters</em> section, including <code class="docutils literal notranslate"><span class="pre">smsp__sass_average_branch_targets_threads_uniform.pct</span></code> to replace nvprof’s <code class="docutils literal notranslate"><span class="pre">branch_efficiency</span></code>, as well as instruction-level metrics <code class="docutils literal notranslate"><span class="pre">smsp__branch_targets_threads_divergent</span></code>, <code class="docutils literal notranslate"><span class="pre">smsp__branch_targets_threads_uniform</span></code> and <code class="docutils literal notranslate"><span class="pre">branch_inst_executed</span></code>.</p></li>
<li><p>A warning is shown if kernel replay starts staging GPU memory to CPU memory or the file system.</p></li>
<li><p>Section and rule files are deployed to a versioned directory in the user’s home directory to allow easier editing of those files, and to prevent modifying the base installation.</p></li>
<li><p>Removed support for NVLink (<code class="docutils literal notranslate"><span class="pre">nvl*</span></code>) metrics due to a potential application hang during data collection. The metrics will be added back in a future version of the driver/tool.</p></li>
</ul>
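<p>As an illustration of the branch efficiency metrics listed above, the following is a minimal sketch of how a uniformity percentage could be derived from the two instruction-level counters. The formula (uniform branch targets divided by all branch targets) is an assumption based on how nvprof describes <code class="docutils literal notranslate"><span class="pre">branch_efficiency</span></code>; the function name <code class="docutils literal notranslate"><span class="pre">branch_uniformity_pct</span></code> is hypothetical and not part of any NVIDIA tool or API.</p>

```python
def branch_uniformity_pct(uniform_targets, divergent_targets):
    """Percentage of branch targets taken uniformly by all active threads.

    Assumed formula: 100 * uniform / (uniform + divergent).
    The inputs stand in for per-kernel sums of
    smsp__branch_targets_threads_uniform and
    smsp__branch_targets_threads_divergent.
    """
    total = uniform_targets + divergent_targets
    if total == 0:
        # No branch targets were recorded, so the ratio is undefined.
        return None
    return 100.0 * uniform_targets / total


if __name__ == "__main__":
    # Hypothetical counter values, not measured data.
    print(branch_uniformity_pct(90, 10))
```

<p>A value near 100 would suggest that warps rarely diverge at branches; lower values point at the source lines worth inspecting on the <em>Source</em> page.</p>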
<p><strong>NVIDIA Nsight Compute</strong></p>
<ul class="simple">
<li><p>Added support for <em>Profile Series</em>. Series allow you to profile a kernel with a range of configurable parameters to analyze the performance of each combination.</p></li>
<li><p>Added a new <em>Allocations</em> view to the <em>Resources</em> tool window which shows the state of all current memory allocations.</p></li>
<li><p>Added a new <em>Memory Pools</em> view to the <em>Resources</em> tool window which shows the state of all current memory pools.</p></li>
<li><p>Added coverage of peer memory to the <em>Memory Chart</em>.</p></li>
<li><p>The <em>Source</em> page now shows the number of excessive sectors requested from L1 or L2, e.g. due to uncoalesced memory accesses.</p></li>
<li><p>The <em>Source</em> column on the <em>Source</em> page can now be scrolled horizontally.</p></li>
<li><p>The kernel duration <code class="docutils literal notranslate"><span class="pre">gpu__time_duration.sum</span></code> was added as a column on the <em>Summary</em> page.</p></li>
<li><p>Improved the performance of <em>application replay</em> when not all kernels in the application are profiled.</p></li>
</ul>
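<p>To make the notion of excessive sectors from uncoalesced accesses more concrete, the sketch below counts how many 32-byte sectors a single warp-wide load touches for a given element stride, and compares that with the minimum needed if the same bytes were contiguous. It is a simplified model under stated assumptions (32 threads per warp, 32-byte sectors, an aligned base address), not the tool's actual accounting; all function names are hypothetical.</p>

```python
SECTOR_BYTES = 32  # assumed sector size for L1/L2 transactions
WARP_SIZE = 32     # threads per warp


def sectors_requested(stride_elems, elem_bytes=4, base_addr=0):
    """Count distinct sectors touched when each lane loads one element."""
    sectors = set()
    for lane in range(WARP_SIZE):
        first = base_addr + lane * stride_elems * elem_bytes
        last = first + elem_bytes - 1
        # An element may straddle a sector boundary, so cover its full range.
        for sector in range(first // SECTOR_BYTES, last // SECTOR_BYTES + 1):
            sectors.add(sector)
    return len(sectors)


def ideal_sectors(elem_bytes=4):
    """Minimum sectors if the warp's bytes were packed contiguously."""
    total_bytes = WARP_SIZE * elem_bytes
    return (total_bytes + SECTOR_BYTES - 1) // SECTOR_BYTES  # ceiling division


def excessive_sectors(stride_elems, elem_bytes=4):
    """Sectors requested beyond the contiguous minimum."""
    return sectors_requested(stride_elems, elem_bytes) - ideal_sectors(elem_bytes)


if __name__ == "__main__":
    for stride in (1, 2, 8):
        print(stride, sectors_requested(stride), excessive_sectors(stride))
```

<p>In this model a unit-stride load of 4-byte elements needs four sectors and has no excess, while larger strides spread the same 128 bytes of useful data over more sectors.</p>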
<p><strong>NVIDIA Nsight Compute CLI</strong></p>
<ul class="simple">
<li><p>Added a new <code class="docutils literal notranslate"><span class="pre">--app-replay-match</span></code> option to select the mechanism used for matching kernel instances across application replay passes.</p></li>
<li><p>An error is shown if <code class="docutils literal notranslate"><span class="pre">--nvtx-include/exclude</span></code> are used without <code class="docutils literal notranslate"><span class="pre">--nvtx</span></code>.</p></li>
</ul>
<p><strong>Resolved Issues</strong></p>
<ul class="simple">
<li><p>The <em>Grid Size</em> column on the <em>Raw</em> page now shows the CUDA grid size like the <em>Launch Statistics</em> section, rather than the combined grid and block sizes.</p></li>
<li><p>The <em>Branch Resolving</em> warp stall reason was added to the PC sampling metric groups and the <em>Warp State Statistics</em> section.</p></li>
<li><p>The <em>API Stream</em> tool window shows kernel names according to the selected Function Name Mode.</p></li>
<li><p>Fixed an issue where an incorrect line could be shown after a heatmap selection on the <em>Source</em> page.</p></li>
<li><p>Fixed incorrect metric usage for system memory in the <em>Memory Chart</em>. Previously, all requested memory of L2 from system memory was reported instead of only the portion that missed in L2.</p></li>
</ul>
</section>


           </div>
          </div>
          <footer>

  <hr/>

  <div role="contentinfo">
    <p>&#169; Copyright 2018-2024, NVIDIA Corporation &amp; Affiliates. All rights reserved.
      <span class="lastupdated">Last updated on Mar 06, 2024.
      </span></p>
  </div>

   

</footer>
        </div>
      </div>
    </section>
  </div>
  <script>
      jQuery(function () {
          SphinxRtdTheme.Navigation.enable(true);
      });
  </script>
 



</body>
</html>