File: rsqrt.html

package info (click to toggle)
scipy 1.16.0-1exp7
  • links: PTS, VCS
  • area: main
  • in suites: experimental
  • size: 234,820 kB
  • sloc: cpp: 503,145; python: 344,611; ansic: 195,638; javascript: 89,566; fortran: 56,210; cs: 3,081; f90: 1,150; sh: 848; makefile: 785; pascal: 284; csh: 135; lisp: 134; xml: 56; perl: 51
file content (114 lines) | stat: -rw-r--r-- 14,158 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Reciprocal square root</title>
<link rel="stylesheet" href="../../math.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
<link rel="home" href="../../index.html" title="Math Toolkit 4.2.1">
<link rel="up" href="../powers.html" title="Basic Functions">
<link rel="prev" href="ct_pow.html" title="Compile Time Power of a Runtime Base">
<link rel="next" href="logaddexp.html" title="logaddexp and logsumexp">
<meta name="viewport" content="width=device-width, initial-scale=1">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<table cellpadding="2" width="100%"><tr>
<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../boost.png"></td>
<td align="center"><a href="../../../../../../index.html">Home</a></td>
<td align="center"><a href="../../../../../../libs/libraries.htm">Libraries</a></td>
<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
<td align="center"><a href="../../../../../../more/index.htm">More</a></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="ct_pow.html"><img src="../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../powers.html"><img src="../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="logaddexp.html"><img src="../../../../../../doc/src/images/next.png" alt="Next"></a>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="math_toolkit.powers.rsqrt"></a><a class="link" href="rsqrt.html" title="Reciprocal square root">Reciprocal square root</a>
</h3></div></div></div>
<h5>
<a name="math_toolkit.powers.rsqrt.h0"></a>
        <span class="phrase"><a name="math_toolkit.powers.rsqrt.synopsis"></a></span><a class="link" href="rsqrt.html#math_toolkit.powers.rsqrt.synopsis">Synopsis</a>
      </h5>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">math</span><span class="special">/</span><span class="identifier">special_functions</span><span class="special">/</span><span class="identifier">rsqrt</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>

<span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">math</span> <span class="special">{</span>

<span class="keyword">template</span><span class="special">&lt;</span><span class="keyword">class</span> <span class="identifier">Real</span><span class="special">&gt;</span>
<span class="identifier">Real</span> <span class="identifier">rsqrt</span><span class="special">(</span><span class="identifier">Real</span> <span class="keyword">const</span> <span class="special">&amp;</span> <span class="identifier">x</span><span class="special">);</span>

<span class="special">}</span> <span class="comment">// namespaces</span>
</pre>
<p>
        The function <code class="computeroutput"><span class="identifier">rsqrt</span></code> computes
        the reciprocal square root 1/√<span class="emphasis"><em>x</em></span>. Those in the game programming
        community might suspect this is a fast, low precision wrapper around the
        <a href="https://www.felixcloutier.com/x86/rsqrtss" target="_top">rsqrtss</a> instruction.
        This is not correct: We <span class="emphasis"><em>tried</em></span> this instruction, but
        found no performance benefit to using it. However, the <span class="emphasis"><em>trick</em></span>
        of computing a low precision reciprocal square root and then bootstrapping
        to higher precision via Newton's method <span class="emphasis"><em>does</em></span> work, but
        it only yields a performance benefit for quad and higher precision. We do
        of course allow you to use <code class="computeroutput"><span class="identifier">rsqrt</span></code>
        for <code class="computeroutput"><span class="keyword">float</span></code>, <code class="computeroutput"><span class="keyword">double</span></code>,
        and <code class="computeroutput"><span class="keyword">long</span> <span class="keyword">double</span></code>,
        but be aware there is no performance benefit to doing so. However, the savings
        for quad precision and higher are very significant.
      </p>
<p>
        The use is
      </p>
<pre class="programlisting"><span class="keyword">using</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">multiprecision</span><span class="special">::</span><span class="identifier">float128</span><span class="special">;</span>
<span class="identifier">float128</span> <span class="identifier">x</span> <span class="special">=</span> <span class="number">0.1</span><span class="identifier">Q</span><span class="special">;</span>
<span class="identifier">float128</span> <span class="identifier">y</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">math</span><span class="special">::</span><span class="identifier">rsqrt</span><span class="special">(</span><span class="identifier">x</span><span class="special">);</span>
</pre>
<p>
        The reciprocal square root of +∞ is zero, and the reciprocal square
        root of a NaN is a NaN.
      </p>
<p>
        <span class="inlinemediaobject"><object type="image/svg+xml" data="../../../graphs/rsqrt_quad_0_100.svg"></object></span>
      </p>
<p>
        Performance:
      </p>
<pre class="programlisting"><span class="identifier">Running</span> <span class="special">./</span><span class="identifier">reporting</span><span class="special">/</span><span class="identifier">performance</span><span class="special">/</span><span class="identifier">rsqrt_performance</span><span class="special">.</span><span class="identifier">x</span>
<span class="identifier">Run</span> <span class="identifier">on</span> <span class="special">(</span><span class="number">16</span> <span class="identifier">X</span> <span class="number">4300</span> <span class="identifier">MHz</span> <span class="identifier">CPU</span> <span class="identifier">s</span><span class="special">)</span>
<span class="identifier">CPU</span> <span class="identifier">Caches</span><span class="special">:</span>
  <span class="identifier">L1</span> <span class="identifier">Data</span> <span class="number">32</span> <span class="identifier">KiB</span> <span class="special">(</span><span class="identifier">x8</span><span class="special">)</span>
  <span class="identifier">L1</span> <span class="identifier">Instruction</span> <span class="number">32</span> <span class="identifier">KiB</span> <span class="special">(</span><span class="identifier">x8</span><span class="special">)</span>
  <span class="identifier">L2</span> <span class="identifier">Unified</span> <span class="number">1024</span> <span class="identifier">KiB</span> <span class="special">(</span><span class="identifier">x8</span><span class="special">)</span>
  <span class="identifier">L3</span> <span class="identifier">Unified</span> <span class="number">11264</span> <span class="identifier">KiB</span> <span class="special">(</span><span class="identifier">x1</span><span class="special">)</span>
<span class="identifier">Load</span> <span class="identifier">Average</span><span class="special">:</span> <span class="number">0.43</span><span class="special">,</span> <span class="number">0.49</span><span class="special">,</span> <span class="number">0.46</span>
<span class="special">----------------------------------------------------------------------------------</span>
<span class="identifier">Benchmark</span>                                        <span class="identifier">Time</span>             <span class="identifier">CPU</span>   <span class="identifier">Iterations</span>
<span class="special">----------------------------------------------------------------------------------</span>
<span class="identifier">Rsqrt</span><span class="special">&lt;</span><span class="keyword">float</span><span class="special">&gt;</span>                                  <span class="number">1.35</span> <span class="identifier">ns</span>         <span class="number">1.35</span> <span class="identifier">ns</span>    <span class="number">503364351</span>
<span class="identifier">Rsqrt</span><span class="special">&lt;</span><span class="keyword">double</span><span class="special">&gt;</span>                                 <span class="number">2.25</span> <span class="identifier">ns</span>         <span class="number">2.25</span> <span class="identifier">ns</span>    <span class="number">309753242</span>
<span class="identifier">Rsqrt</span><span class="special">&lt;</span><span class="keyword">long</span> <span class="keyword">double</span><span class="special">&gt;</span>                            <span class="number">2.68</span> <span class="identifier">ns</span>         <span class="number">2.68</span> <span class="identifier">ns</span>    <span class="number">261382652</span>
<span class="identifier">Rsqrt</span><span class="special">&lt;</span><span class="identifier">float128</span><span class="special">&gt;</span>                                <span class="number">182</span> <span class="identifier">ns</span>          <span class="number">182</span> <span class="identifier">ns</span>      <span class="number">3756956</span>
<span class="identifier">Rsqrt</span><span class="special">&lt;</span><span class="identifier">number</span><span class="special">&lt;</span><span class="identifier">mpfr_float_backend</span><span class="special">&lt;</span><span class="number">100</span><span class="special">&gt;&gt;&gt;</span>         <span class="number">299</span> <span class="identifier">ns</span>          <span class="number">299</span> <span class="identifier">ns</span>      <span class="number">2494027</span>
<span class="identifier">Rsqrt</span><span class="special">&lt;</span><span class="identifier">number</span><span class="special">&lt;</span><span class="identifier">mpfr_float_backend</span><span class="special">&lt;</span><span class="number">200</span><span class="special">&gt;&gt;&gt;</span>         <span class="number">412</span> <span class="identifier">ns</span>          <span class="number">412</span> <span class="identifier">ns</span>      <span class="number">1589284</span>
<span class="identifier">Rsqrt</span><span class="special">&lt;</span><span class="identifier">number</span><span class="special">&lt;</span><span class="identifier">mpfr_float_backend</span><span class="special">&lt;</span><span class="number">300</span><span class="special">&gt;&gt;&gt;</span>         <span class="number">617</span> <span class="identifier">ns</span>          <span class="number">617</span> <span class="identifier">ns</span>      <span class="number">1067473</span>
<span class="identifier">Rsqrt</span><span class="special">&lt;</span><span class="identifier">number</span><span class="special">&lt;</span><span class="identifier">mpfr_float_backend</span><span class="special">&lt;</span><span class="number">400</span><span class="special">&gt;&gt;&gt;</span>         <span class="number">812</span> <span class="identifier">ns</span>          <span class="number">812</span> <span class="identifier">ns</span>       <span class="number">830564</span>
<span class="identifier">Rsqrt</span><span class="special">&lt;</span><span class="identifier">number</span><span class="special">&lt;</span><span class="identifier">mpfr_float_backend</span><span class="special">&lt;</span><span class="number">1000</span><span class="special">&gt;&gt;&gt;</span>       <span class="number">3183</span> <span class="identifier">ns</span>         <span class="number">3183</span> <span class="identifier">ns</span>       <span class="number">221079</span>
<span class="identifier">Rsqrt</span><span class="special">&lt;</span><span class="identifier">cpp_bin_float_50</span><span class="special">&gt;</span>                       <span class="number">4321</span> <span class="identifier">ns</span>         <span class="number">4321</span> <span class="identifier">ns</span>       <span class="number">163243</span>
<span class="identifier">Rsqrt</span><span class="special">&lt;</span><span class="identifier">cpp_bin_float_100</span><span class="special">&gt;</span>                      <span class="number">9393</span> <span class="identifier">ns</span>         <span class="number">9393</span> <span class="identifier">ns</span>        <span class="number">72967</span>
</pre>
</div>
<div class="copyright-footer">Copyright © 2006-2021 Nikhar Agrawal, Anton Bikineev, Matthew Borland,
      Paul A. Bristow, Marco Guazzone, Christopher Kormanyos, Hubert Holin, Bruno
      Lalande, John Maddock, Evan Miller, Jeremy Murphy, Matthew Pulver, Johan Råde,
      Gautam Sewani, Benjamin Sobotta, Nicholas Thompson, Thijs van den Berg, Daryle
      Walker and Xiaogang Zhang<p>
        Distributed under the Boost Software License, Version 1.0. (See accompanying
        file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
      </p>
</div>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="ct_pow.html"><img src="../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../powers.html"><img src="../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="logaddexp.html"><img src="../../../../../../doc/src/images/next.png" alt="Next"></a>
</div>
</body>
</html>