File: manual-additional-algorithms.html

package info (click to toggle)
viennacl 1.7.1%2Bdfsg1-6
  • links: PTS, VCS
  • area: main
  • in suites: bookworm, bullseye, sid, trixie
  • size: 114,428 kB
  • sloc: sh: 454,206; cpp: 109,088; ansic: 2,103; perl: 104; makefile: 22
file content (303 lines) | stat: -rw-r--r-- 35,413 bytes parent folder | download | duplicates (3)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
<meta http-equiv="X-UA-Compatible" content="IE=9"/>
<meta name="generator" content="Doxygen 1.8.6"/>
<title>ViennaCL - The Vienna Computing Library: Additional Algorithms (Unstable)</title>
<link href="tabs.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="jquery.js"></script>
<script type="text/javascript" src="dynsections.js"></script>
<link href="navtree.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="resize.js"></script>
<script type="text/javascript" src="navtree.js"></script>
<script type="text/javascript">
  $(document).ready(initResizable);
  $(window).load(resizeHeight);
</script>
<link href="search/search.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="search/search.js"></script>
<script type="text/javascript">
  $(document).ready(function() { searchBox.OnSelectItem(0); });
</script>
<link href="doxygen.css" rel="stylesheet" type="text/css" />
</head>
<body>
<div id="top"><!-- do not remove this div, it is closed by doxygen! -->
<div id="titlearea">
<table cellspacing="0" cellpadding="0">
 <tbody>
 <tr style="height: 56px;">
  <td style="padding-left: 0.5em;">
   <div id="projectname">ViennaCL - The Vienna Computing Library
   &#160;<span id="projectnumber">1.7.1</span>
   </div>
   <div id="projectbrief">Free open-source GPU-accelerated linear algebra and solver library.</div>
  </td>
   <td>        <div id="MSearchBox" class="MSearchBoxInactive">
        <span class="left">
          <img id="MSearchSelect" src="search/mag_sel.png"
               onmouseover="return searchBox.OnSearchSelectShow()"
               onmouseout="return searchBox.OnSearchSelectHide()"
               alt=""/>
          <input type="text" id="MSearchField" value="Search" accesskey="S"
               onfocus="searchBox.OnSearchFieldFocus(true)" 
               onblur="searchBox.OnSearchFieldFocus(false)" 
               onkeyup="searchBox.OnSearchFieldChange(event)"/>
          </span><span class="right">
            <a id="MSearchClose" href="javascript:searchBox.CloseResultsWindow()"><img id="MSearchCloseImg" border="0" src="search/close.png" alt=""/></a>
          </span>
        </div>
</td>
 </tr>
 </tbody>
</table>
</div>
<!-- end header part -->
<!-- Generated by Doxygen 1.8.6 -->
<script type="text/javascript">
var searchBox = new SearchBox("searchBox", "search",false,'Search');
</script>
</div><!-- top -->
<div id="side-nav" class="ui-resizable side-nav-resizable">
  <div id="nav-tree">
    <div id="nav-tree-contents">
      <div id="nav-sync" class="sync"></div>
    </div>
  </div>
  <div id="splitbar" style="-moz-user-select:none;" 
       class="ui-resizable-handle">
  </div>
</div>
<script type="text/javascript">
$(document).ready(function(){initNavTree('manual-additional-algorithms.html','');});
</script>
<div id="doc-content">
<!-- window showing the filter options -->
<div id="MSearchSelectWindow"
     onmouseover="return searchBox.OnSearchSelectShow()"
     onmouseout="return searchBox.OnSearchSelectHide()"
     onkeydown="return searchBox.OnSearchSelectKey(event)">
<a class="SelectItem" href="javascript:void(0)" onclick="searchBox.OnSelectItem(0)"><span class="SelectionMark">&#160;</span>All</a><a class="SelectItem" href="javascript:void(0)" onclick="searchBox.OnSelectItem(1)"><span class="SelectionMark">&#160;</span>Classes</a><a class="SelectItem" href="javascript:void(0)" onclick="searchBox.OnSelectItem(2)"><span class="SelectionMark">&#160;</span>Namespaces</a><a class="SelectItem" href="javascript:void(0)" onclick="searchBox.OnSelectItem(3)"><span class="SelectionMark">&#160;</span>Files</a><a class="SelectItem" href="javascript:void(0)" onclick="searchBox.OnSelectItem(4)"><span class="SelectionMark">&#160;</span>Functions</a><a class="SelectItem" href="javascript:void(0)" onclick="searchBox.OnSelectItem(5)"><span class="SelectionMark">&#160;</span>Variables</a><a class="SelectItem" href="javascript:void(0)" onclick="searchBox.OnSelectItem(6)"><span class="SelectionMark">&#160;</span>Typedefs</a><a class="SelectItem" href="javascript:void(0)" onclick="searchBox.OnSelectItem(7)"><span class="SelectionMark">&#160;</span>Enumerations</a><a class="SelectItem" href="javascript:void(0)" onclick="searchBox.OnSelectItem(8)"><span class="SelectionMark">&#160;</span>Enumerator</a><a class="SelectItem" href="javascript:void(0)" onclick="searchBox.OnSelectItem(9)"><span class="SelectionMark">&#160;</span>Friends</a><a class="SelectItem" href="javascript:void(0)" onclick="searchBox.OnSelectItem(10)"><span class="SelectionMark">&#160;</span>Macros</a><a class="SelectItem" href="javascript:void(0)" onclick="searchBox.OnSelectItem(11)"><span class="SelectionMark">&#160;</span>Pages</a></div>

<!-- iframe showing the search results (closed by default) -->
<div id="MSearchResultsWindow">
<iframe src="javascript:void(0)" frameborder="0" 
        name="MSearchResults" id="MSearchResults">
</iframe>
</div>

<div class="header">
  <div class="headertitle">
<div class="title">Additional Algorithms (Unstable) </div>  </div>
</div><!--header-->
<div class="contents">
<div class="textblock"><p>The implementations of the algorithms described in this chapter are still not yet mature enough to be considered core-functionality, and/or are available with the OpenCL backend only.</p>
<dl class="section warning"><dt>Warning</dt><dd>The implementations of the algorithms described in this chapter are not considered mature yet.</dd></dl>
<h1><a class="anchor" id="manual-additional-algorithms-preconditioners"></a>
Additional Preconditioners</h1>
<p>In addition to the <a class="el" href="manual-algorithms.html#manual-algorithms-preconditioners">Preconditioners available in the ViennaCL core</a>, two more preconditioners are available with the OpenCL backend and are described in the following.</p>
<h2><a class="anchor" id="manual-additional-algorithms-preconditioners-spai"></a>
Sparse Approximate Inverses</h2>
<dl class="section note"><dt>Note</dt><dd>Sparse Approximate Inverse preconditioners are only available with the OpenCL backend and are experimental in ViennaCL. Interface changes as well as considerable performance improvements may be included in future releases!</dd>
<dd>
Sparse Approximate Inverse preconditioners depend on Boost.uBLAS.</dd></dl>
<p>An alternative construction of a preconditioner for a sparse system matrix <img class="formulaInl" alt="$ A $" src="form_1.png"/> is to compute a matrix <img class="formulaInl" alt="$ M $" src="form_2.png"/> with a prescribed sparsity pattern such that </p>
<p class="formulaDsp">
<img class="formulaDsp" alt="\[ \Vert AM - I \Vert_F \rightarrow \min \ , \]" src="form_3.png"/>
</p>
<p> where <img class="formulaInl" alt="$ \Vert \cdot \Vert_F $" src="form_4.png"/> denotes the Frobenius norm. This is the basic idea of sparse approximate inverse (SPAI) preconditioner. It becomes increasingly attractive because of their inherent high degree of parallelism, since the minimization problem can be solved independently for each column of <img class="formulaInl" alt="$ M $" src="form_2.png"/>. ViennaCL provides two preconditioners of this family: The first is the classical SPAI algorithm as described by Grote and Huckle <a class="el" href="citelist.html#CITEREF_grote:spai">[16]</a> , the second is the factored SPAI (FSPAI) for symmetric matrices as proposed by Huckle <a class="el" href="citelist.html#CITEREF_huckle:fspai">[17]</a> .</p>
<p>SPAI can be employed for a CPU matrix <code>A</code> of type <code>MatrixType</code> as follows: </p>
<div class="fragment"><div class="line"><span class="comment">// setup SPAI preconditioner, purely CPU-based</span></div>
<div class="line"><a class="code" href="classviennacl_1_1linalg_1_1spai__precond.html">viennacl::linalg::spai_precond&lt;MatrixType&gt;</a> spai_cpu(A, <a class="code" href="classviennacl_1_1linalg_1_1detail_1_1spai_1_1spai__tag.html">viennacl::linalg::spai_tag</a>(1e-3, 3, 5e-2));</div>
<div class="line"></div>
<div class="line"><span class="comment">//solve (e.g. using stab. Bi-conjugate gradient solver)</span></div>
<div class="line">vcl_result = <a class="code" href="namespaceviennacl_1_1linalg.html#a6e9b329b64ac782e6a5687ad2fc47a2a">viennacl::linalg::solve</a>(A, rhs,</div>
<div class="line">                                     <a class="code" href="classviennacl_1_1linalg_1_1bicgstab__tag.html">viennacl::linalg::bicgstab_tag</a>(),</div>
<div class="line">                                     spai_cpu);</div>
</div><!-- fragment --><p> The first parameter denotes the residual norm threshold for the full matrix, the second parameter the maximum number of pattern updates, and the third parameter is the threshold for the residual of each minimization problem.</p>
<p>For GPU-matrices, only parts of the setup phase are computed on the CPU, because compute-intensive tasks can be carried out on the GPU: </p>
<div class="fragment"><div class="line"><span class="comment">// setup SPAI preconditioner, GPU-assisted</span></div>
<div class="line"><a class="code" href="classviennacl_1_1linalg_1_1spai__precond.html">viennacl::linalg::spai_precond&lt;GPUMatrixType&gt;</a> spai_gpu(vcl_matrix, <a class="code" href="classviennacl_1_1linalg_1_1detail_1_1spai_1_1spai__tag.html">viennacl::linalg::spai_tag</a>(1e-3, 3, 5e-2));</div>
<div class="line"></div>
<div class="line"><span class="comment">//solve (e.g. using conjugate gradient solver)</span></div>
<div class="line">vcl_result = <a class="code" href="namespaceviennacl_1_1linalg.html#a6e9b329b64ac782e6a5687ad2fc47a2a">viennacl::linalg::solve</a>(vcl_matrix,</div>
<div class="line">                                     vcl_rhs,</div>
<div class="line">                                     <a class="code" href="classviennacl_1_1linalg_1_1bicgstab__tag.html">viennacl::linalg::bicgstab_tag</a>(),</div>
<div class="line">                                     spai_gpu);</div>
</div><!-- fragment --><p> The <code>GPUMatrixType</code> is typically a <code><a class="el" href="classviennacl_1_1compressed__matrix.html" title="A sparse square matrix in compressed sparse rows format. ">viennacl::compressed_matrix</a></code> type.</p>
<p>For symmetric matrices, FSPAI can be used with the conjugate gradient solver: </p>
<div class="fragment"><div class="line"><a class="code" href="classviennacl_1_1linalg_1_1fspai__precond.html">viennacl::linalg::fspai_precond&lt;MatrixType&gt;</a> fspai_cpu(A, <a class="code" href="classviennacl_1_1linalg_1_1detail_1_1spai_1_1fspai__tag.html">viennacl::linalg::fspai_tag</a>());</div>
<div class="line"></div>
<div class="line"><span class="comment">//solve (e.g. using stab. Bi-conjugate gradient solver)</span></div>
<div class="line">vcl_result = <a class="code" href="namespaceviennacl_1_1linalg.html#a6e9b329b64ac782e6a5687ad2fc47a2a">viennacl::linalg::solve</a>(A,</div>
<div class="line">                                     rhs,</div>
<div class="line">                                     <a class="code" href="classviennacl_1_1linalg_1_1cg__tag.html">viennacl::linalg::cg_tag</a>(),</div>
<div class="line">                                     fspai_cpu);</div>
</div><!-- fragment --><p> Our experience is that FSPAI is typically more efficient than SPAI when applied to the same matrix, both in computational effort and in terms of convergence acceleration of the iterative solvers.</p>
<dl class="section note"><dt>Note</dt><dd>At present, there is no GPU-accelerated FSPAI included in ViennaCL.</dd></dl>
<p>Note that FSPAI depends on the ordering of the unknowns, thus bandwidth reduction algorithms may be employed first, cf. <a class="el" href="manual-additional-algorithms.html#manual-additional-algorithms-bandwidth-reduction">Bandwidth Reduction</a>.</p>
<h1><a class="anchor" id="manual-additional-algorithms-eigenvalues"></a>
Additional Eigenvalue Routines</h1>
<p>Several routines for computing the eigenvalues of symmetric tridiagonal as well as dense matrices are provided with ViennaCL. The following routines are available for one or two compute backends, hence they are not considered to be part of the core. Also, use the implementations with care: Even though the routines pass the respective tests, they are not as extensively tested as core functionality.</p>
<h2><a class="anchor" id="manual-additional-algorithms-eigenvalues-bisection"></a>
Symmetric Tridiagonal Matrices: Bisection</h2>
<p>The bisection method for finding the eigenvalues of a symmetric tridiagonal matrix has been implemented within the CUDA SDK as an example. ViennaCL provides an extension of the implementation also suitable for eigenvalues with multiplicity higher than one. In addition, our implementation is not limited to CUDA and also available for OpenCL.</p>
<p>The interface in ViennaCL takes STL vectors as inputs, holding the diagonal and the sub/superdiagonal. The third argument is the vector to be overwritten with the computed eigenvalues. A sample code snippet is as follows: </p>
<div class="fragment"><div class="line">std::vector&lt;NumericT&gt; diagonal(mat_size);</div>
<div class="line">std::vector&lt;NumericT&gt; superdiagonal(mat_size);</div>
<div class="line">std::vector&lt;NumericT&gt; eigenvalues_bisect(mat_size);</div>
<div class="line"></div>
<div class="line"><span class="keywordtype">bool</span> bResult = <a class="code" href="namespaceviennacl_1_1linalg.html#a54d70c731aed90556e228b7f14ac3a52">viennacl::linalg::bisect</a>(diagonal, superdiagonal, eigenvalues_bisect);</div>
</div><!-- fragment --><p> The return value <code>bResult</code> is <code>false</code> if an error occurred, and <code>true</code> otherwise. Note that the sub/superdiagonal array <code>superdiagonal</code> is required to have an value of zero at index zero (indicating an element outside the matrix).</p>
<dl class="section note"><dt>Note</dt><dd>A fully working example is available in <code><a class="el" href="examples_2tutorial_2bisect_8cpp.html">examples/tutorial/bisect.cpp</a></code>.</dd></dl>
<h2><a class="anchor" id="manual-additional-algorithms-eigenvalues-tql2"></a>
Symmetric Tridiagonal Matrices: TQL2</h2>
<p>The bisection method allows for a fast computation of eigenvalues, but it does not compute eigenvectors directly. If eigenvectors are needed, the tql2-version of the QL algorithm as described in the Algol procedures can be used.</p>
<p>The interface to the routine <code><a class="el" href="namespaceviennacl_1_1linalg.html#a3fe3eb20d4dcda794259fd6a6d30afb6">tql2()</a></code> is as follows: The first argument is a dense ViennaCL matrix. The second and third argument are STL vectors representing the diagonal and the sub/superdiagonal, respectively. Thus, a minimal code example for a 20-by-20 tridiagonal matrix is as follows: </p>
<div class="fragment"><div class="line">std::size_t N = 20;</div>
<div class="line">std::vector&lt;NumericT&gt; d(N), e(N);</div>
<div class="line"><a class="code" href="classviennacl_1_1matrix.html">viennacl::matrix&lt;NumericT&gt;</a> Q = <a class="code" href="classviennacl_1_1identity__matrix.html">viennacl::identity_matrix&lt;NumericT&gt;</a>(N);</div>
<div class="line"></div>
<div class="line"><span class="comment">// fill d and e with data here</span></div>
<div class="line"></div>
<div class="line"><a class="code" href="namespaceviennacl_1_1linalg.html#a3fe3eb20d4dcda794259fd6a6d30afb6">viennacl::linalg::tql2</a>(Q, d, e);</div>
</div><!-- fragment --><p> Note that the sub/superdiagonal array <code>e</code> is required to have an value of zero at index zero (indicating an element outside the matrix).</p>
<dl class="section note"><dt>Note</dt><dd>A fully working example is available in <code><a class="el" href="tql2_8cpp.html">examples/tutorial/tql2.cpp</a></code>.</dd></dl>
<h2><a class="anchor" id="manual-additional-algorithms-eigenvalues-qr-method-symmetric"></a>
QR Method for Symmetric Dense Matrices</h2>
<p>Symmetric dense real-valued matrices have real-valued eigenvalues, hence the current lack of complex arithmetic in ViennaCL is not limiting. An implementation of the QR method for the symmetric case for computing eigenvalues and eigenvectors is provided.</p>
<p>The current implementation of the QR method for symmetric dense matrices in ViennaCL takes three parameters:</p>
<ul>
<li>The input matrix to find eigenvalues and eigenvectors from</li>
<li>A matrix in which the calculated eigenvectors will be stored</li>
<li>An STL vector in which the calculated eigenvalues will be stored</li>
</ul>
<p>A minimal example is as follows: </p>
<div class="fragment"><div class="line">std::size_t N = 20;</div>
<div class="line"><a class="code" href="classviennacl_1_1matrix.html">viennacl::matrix&lt;ScalarType&gt;</a> A_input(N,N);</div>
<div class="line"><a class="code" href="classviennacl_1_1matrix.html">viennacl::matrix&lt;ScalarType&gt;</a> Q(N, N);</div>
<div class="line">std::vector&lt;ScalarType&gt; eigenvalues(N);</div>
<div class="line"></div>
<div class="line"><a class="code" href="namespaceviennacl_1_1linalg.html#a51cbe06645207f9b3054e8725269cc71">viennacl::linalg::qr_method_sym</a>(A_input, Q, eigenvalues);</div>
</div><!-- fragment --><dl class="section note"><dt>Note</dt><dd>A fully working example is available in <code><a class="el" href="examples_2tutorial_2qr__method_8cpp.html">examples/tutorial/qr_method.cpp</a></code>.</dd></dl>
<h1><a class="anchor" id="manual-additional-algorithms-fft"></a>
Fast Fourier Transform</h1>
<p>Since there is no standardized complex type in OpenCL at the time of the release of ViennaCL, vectors need to be set up with real- and imaginary part before computing a fast Fourier transform (FFT). In order to store complex numbers <img class="formulaInl" alt="$ z_0 $" src="form_5.png"/>, <img class="formulaInl" alt="$ z_1 $" src="form_6.png"/>, etc. in a <code><a class="el" href="classviennacl_1_1vector.html">viennacl::vector</a></code>, say <code>v</code>, the real and imaginary parts are mapped to even and odd entries of <code>v</code> respectively: <code>v[0] = Real(z_0)</code>, <code>v[1] = Imag(z_0)</code>, <code>v[2] = Real(z_1)</code>, <code>v[3] = Imag(z_1)</code>, etc.</p>
<p>The FFT of <code>v</code> can then be computed either by writing to a second vector <code>output</code> or by directly writing the result to <code>v</code> </p>
<div class="fragment"><div class="line"><a class="code" href="fft__1d_8cpp.html#a56353420d188d1181ae49bf69283b497">viennacl::fft</a>(v, output);</div>
<div class="line">viennacl::inplace_fft(v);</div>
</div><!-- fragment --><p> Conversely, the inverse FFT is computed as </p>
<div class="fragment"><div class="line">viennacl::ifft(v, output);</div>
<div class="line">viennacl::inplace_ifft(v);</div>
</div><!-- fragment --><p>The second option for computing the FFT is with Bluestein algorithm. Currently, the implementation supports only input sizes less than <img class="formulaInl" alt="$ 2^{16} = 65536 $" src="form_7.png"/>. The Bluestein algorithm uses at least three-times more additional memory than another algorithms, but should be fast for any size of data. As with any efficient FFT algorithm, the sequential implementation has a complexity of <img class="formulaInl" alt="$ \mathcal{O}(n * \lg n) $" src="form_8.png"/>. To compute the FFT with the Bluestein algorithm from a complex vector <code>v</code> and store the result in a vector <code>output</code>, one uses the code </p>
<div class="fragment"><div class="line"><a class="code" href="namespaceviennacl_1_1linalg.html#a4da8975bfe53f8a05145c8b44410ab27">viennacl::linalg::bluestein</a>(v, output,batch_size);</div>
</div><!-- fragment --><dl class="section warning"><dt>Warning</dt><dd>In ViennaCL the FFT with complexity <img class="formulaInl" alt="$ N \log N $" src="form_9.png"/> is only computed for vectors with a size of a power of two. For other vector sizes, a standard discrete Fourier transform with complexity <img class="formulaInl" alt="$ N^2 $" src="form_10.png"/> is employed. This is subject to change in future versions.</dd></dl>
<p>Some of the FFT functions are also suitable for matrices and can be computed in 2D. The computation of an FFT for objects of type <code><a class="el" href="classviennacl_1_1matrix.html" title="A dense matrix class. ">viennacl::matrix</a></code>, say <code>mat</code>, require that even entries are real parts and odd entries are imaginary parts of complex numbers. In order to store complex numbers <img class="formulaInl" alt="$ z_0 $" src="form_5.png"/>, <img class="formulaInl" alt="$ z_1 $" src="form_6.png"/>, etc.~in <code>mat</code>: <code>mat(0,0) = Real(z_0)</code>, <code>mat(0,1) = Imag(z_0)</code>,<code>mat(0,2) = Real(z_0)</code>, <code>mat(0,3) = Imag(z_0)</code> etc.</p>
<p>Than user can compute FFT either by writing to a second matrix <code>output</code> or by directly writing the result to <code>mat</code> </p>
<div class="fragment"><div class="line"><a class="code" href="fft__1d_8cpp.html#a56353420d188d1181ae49bf69283b497">viennacl::fft</a>(mat, output);</div>
<div class="line">viennacl::inplace_fft(v);</div>
</div><!-- fragment --><dl class="section note"><dt>Note</dt><dd>In ViennaCL the FFT with complexity <img class="formulaInl" alt="$ N \log N $" src="form_9.png"/> is computed for matrices with a number of rows and columns a power of two only. For other matrix sizes, a standard discrete Fourier transform with complexity <img class="formulaInl" alt="$ N^2 $" src="form_10.png"/> is employed. This is subject to change in future versions.</dd></dl>
<p>There are two additional functions to calculate the convolution of two vectors. It expresses the amount of overlap of one function represented by vector <code>v</code> as it is shifted over another function represented by vector <code>u</code>. Formerly a convolution is defined by the integral </p>
<p class="formulaDsp">
<img class="formulaDsp" alt="\[ (v*u)(t) = \int_{-\infty}^\infty f(x)g(t-x) dx \]" src="form_11.png"/>
</p>
<p> Define <img class="formulaInl" alt="$ \mathcal{F} $" src="form_12.png"/> as the Fast Fourier Transform operator, there holds for a convolution of two infinite sequences <code>v</code> and <code>u</code> </p>
<p class="formulaDsp">
<img class="formulaDsp" alt="\[ \mathcal{F}\{v*u\} = \mathcal{F}\{u\} \mathcal{F}\{v\} \rightarrow v*u = \mathcal{F}^{-1}\{ \mathcal{F}\{v\} \mathcal{F}\{u\} \} \]" src="form_13.png"/>
</p>
<p> To compute the convolution of two complex vectors <code>v</code> and <code>u</code>, where the result will be stored in <code>output</code>, one can use </p>
<div class="fragment"><div class="line"><a class="code" href="fft__1d_8cpp.html#a74945fbb0428589ce1c9c4723865158e">viennacl::linalg::convolve</a>(v, u, output);</div>
</div><!-- fragment --><p> which does not modify the input. If a modification of the input vectors is acceptable, </p>
<div class="fragment"><div class="line"><a class="code" href="namespaceviennacl_1_1linalg.html#a7b596f98d234d40fe9460d0077d1fcc3">viennacl::linalg::convolve_i</a>(v, u, output);</div>
</div><!-- fragment --><p> can be used, reducing the overall memory requirements.</p>
<p>Multiplication of two complex vectors <code>u</code>, <code>v</code> where the result will be stored in <code>output</code>, is provided by </p>
<div class="fragment"><div class="line"><a class="code" href="namespaceviennacl_1_1linalg.html#ad89501246ebdf434e5181cfa83116cb1">viennacl::linalg::multiply_complex</a>(u, v, output);</div>
</div><!-- fragment --><p>For creating a complex vector <code>v</code> from a real vector <code>u</code> with suitable sizes (<code>size = <a class="el" href="namespaceviennacl_1_1traits.html#aa2344ea20469f55fbc15a8e9526494d0" title="Generic routine for obtaining the size of a vector (ViennaCL, uBLAS, etc.) ">v.size()</a> / 2</code>), use </p>
<div class="fragment"><div class="line"><a class="code" href="namespaceviennacl_1_1linalg.html#ab46594affa4b4f500660ea741cc1582a">viennacl::linalg::real_to_complex</a>(u, v, <a class="code" href="namespaceviennacl_1_1traits.html#aa2344ea20469f55fbc15a8e9526494d0">size</a>);</div>
</div><!-- fragment --><p> Conversely, computing a real vector <code>v</code> from a complex vector <code>u</code> with length <code>size = <a class="el" href="namespaceviennacl_1_1traits.html#aa2344ea20469f55fbc15a8e9526494d0" title="Generic routine for obtaining the size of a vector (ViennaCL, uBLAS, etc.) ">v.size()</a> / 2</code> is achieved through </p>
<div class="fragment"><div class="line"><a class="code" href="namespaceviennacl_1_1linalg.html#a45f767da9ec63fd7e37c1fae58dc2727">viennacl::linalg::complex_to_real</a>(u, v, <a class="code" href="namespaceviennacl_1_1traits.html#aa2344ea20469f55fbc15a8e9526494d0">size</a>);</div>
</div><!-- fragment --><p>To reverse a vector <code>v</code> inplace, use </p>
<div class="fragment"><div class="line"><a class="code" href="namespaceviennacl_1_1linalg.html#a65decb368dc002ff34334fa0918add2b">viennacl::linalg::reverse</a>(v);</div>
</div><!-- fragment --><h1><a class="anchor" id="manual-additional-algorithms-svd"></a>
Singular Value Decomposition</h1>
<p>The singular decomposition of a real-valued matrix <img class="formulaInl" alt="$ A \in \mathbb{R}^{M \times N} $" src="form_120.png"/> is a decomposition of the form </p>
<p class="formulaDsp">
<img class="formulaDsp" alt="\[ A = U \Sigma V^T \]" src="form_121.png"/>
</p>
<p> where <img class="formulaInl" alt="$ U \in \mathbb{R}^{M\times M} $" src="form_122.png"/> is a unitary matrix holding the left-singular vectors, <img class="formulaInl" alt="$ \Sigma \in \mathbb{R}^{M \times N} $" src="form_123.png"/> holds the singular values of <img class="formulaInl" alt="$ A $" src="form_1.png"/> on the diagonal, and <img class="formulaInl" alt="$ V \in \mathbb{R}^{N \times N} $" src="form_124.png"/> holds the right-singular vectors of <img class="formulaInl" alt="$ A $" src="form_1.png"/>. The singular vectors are stored column-wise in both <img class="formulaInl" alt="$ U $" src="form_25.png"/> and <img class="formulaInl" alt="$ V $" src="form_125.png"/>.</p>
<p>ViennaCL provides an implementation of the SVD method based on bidiagonalization and QR methods with shift. The API of the <code><a class="el" href="namespaceviennacl_1_1linalg.html#a7dd4119fc5dfef5c067af3b46bd86191" title="Computes the singular value decomposition of a matrix A. Experimental in 1.3.x. ">svd()</a></code> routine takes three parameters:</p>
<ul>
<li>The input matrix <img class="formulaInl" alt="$ A $" src="form_1.png"/>, which gets overwritten with <img class="formulaInl" alt="$ \Sigma $" src="form_126.png"/>,</li>
<li>The matrix <img class="formulaInl" alt="$ U $" src="form_25.png"/> for the left-singular vectors,</li>
<li>The matrix <img class="formulaInl" alt="$ V $" src="form_125.png"/> for the right-singular vectors.</li>
</ul>
<p>A minimal example is as follows: </p>
<div class="fragment"><div class="line">std::size_t M = 20;</div>
<div class="line">std::size_t N = 30;</div>
<div class="line"><a class="code" href="classviennacl_1_1matrix.html">viennacl::matrix&lt;NumericT&gt;</a> A(M, N), U(M, M), V(N, N);</div>
<div class="line"></div>
<div class="line"><span class="comment">// fill A with values here</span></div>
<div class="line"></div>
<div class="line"><a class="code" href="namespaceviennacl_1_1linalg.html#a7dd4119fc5dfef5c067af3b46bd86191">viennacl::linalg::svd</a>(A, U, V);</div>
</div><!-- fragment --><dl class="section note"><dt>Note</dt><dd>Have a look at <code><a class="el" href="svd_8cpp.html">tests/src/svd.cpp</a></code> for an example.</dd>
<dd>
There are known performance bottlenecks in the current implementation. Any contributions welcome!</dd></dl>
<h1><a class="anchor" id="manual-additional-algorithms-bandwidth-reduction"></a>
Bandwidth Reduction</h1>
<dl class="section note"><dt>Note</dt><dd>Bandwidth reduction algorithms are experimental in ViennaCL. Interface changes as well as considerable performance improvements may be included in future releases!</dd></dl>
<p>The bandwidth of a sparse matrix is defined as the maximum difference of the indices of nonzero entries in a row, taken over all rows. A low bandwidth may allows for the use of efficient banded matrix solvers instead of iterative solvers. Moreover, better cache utilization as well as lower fill-in in LU-factorization based algorithms can be expected.</p>
<p>For a given sparse matrix with large bandwidth, ViennaCL provides routines for renumbering the unknowns such that the reordered system matrix shows much smaller bandwidth. Typical applications stem from the discretization of partial differential equations by means of the finite element or the finite difference method. The algorithms employed are as follows:</p>
<ul>
<li>Classical Cuthill-McKee algorithm <a class="el" href="citelist.html#CITEREF_cuthill:reducing-bandwidth">[9]</a></li>
<li>Iterated Cuthill-McKee algorithm with different seeds <a class="el" href="citelist.html#CITEREF_cuthill:reducing-bandwidth">[9]</a></li>
<li>Gibbs-Poole-Stockmeyer algorithm <a class="el" href="citelist.html#CITEREF_lewis:gps-algorithm">[20]</a></li>
</ul>
<p>The iterated Cuthill-McKee algorithm applies the classical Cuthill-McKee algorithm to different starting nodes with small, but not necessarily minimal degree as root node into account. While this iterated application is more expensive in times of execution times, it may lead to better results than the classical Cuthill-McKee algorithm. A parameter <img class="formulaInl" alt="$ a \in [0,1] $" src="form_14.png"/> controls the number of nodes considered: All nodes with degree <img class="formulaInl" alt="$ d $" src="form_15.png"/> fulfilling </p>
<p class="formulaDsp">
<img class="formulaDsp" alt="\[ d_{\min} \leq d \leq d_{\min} + a(d_{\max} - d_{\min}) \]" src="form_16.png"/>
</p>
<p> are considered, where <img class="formulaInl" alt="$ d_{\min} $" src="form_17.png"/> and <img class="formulaInl" alt="$ d_{\max} $" src="form_18.png"/> are the miminum and maximum nodal degrees in the graph. A second parameter <code>gmax</code> specifies the number of additional root nodes considered.</p>
<p>The algorithms are called for a matrix <code>A</code> of a type compatible with <code>std::vector&lt; std::map&lt;int, double&gt; &gt;</code> by </p>
<div class="fragment"><div class="line">r = <a class="code" href="namespaceviennacl.html#af22338e452cee008bee5e38070ddc611">viennacl::reorder</a>(A, <a class="code" href="structviennacl_1_1cuthill__mckee__tag.html">viennacl::cuthill_mckee_tag</a>());</div>
<div class="line">r = <a class="code" href="namespaceviennacl.html#af22338e452cee008bee5e38070ddc611">viennacl::reorder</a>(A, <a class="code" href="classviennacl_1_1advanced__cuthill__mckee__tag.html">viennacl::advanced_cuthill_mckee_tag</a>(a, gmax));</div>
<div class="line">r = <a class="code" href="namespaceviennacl.html#af22338e452cee008bee5e38070ddc611">viennacl::reorder</a>(A, <a class="code" href="structviennacl_1_1gibbs__poole__stockmeyer__tag.html">viennacl::gibbs_poole_stockmeyer_tag</a>());</div>
</div><!-- fragment --><p> and return the permutation array. In ViennaCL, the user then needs to manually reorder the sparse matrix based on the permutation array. Example code can be found in <code><a class="el" href="bandwidth-reduction_8cpp.html">examples/tutorial/bandwidth-reduction.cpp</a></code>.</p>
<h1><a class="anchor" id="manual-additional-algorithms-nmf"></a>
Nonnegative Matrix Factorization</h1>
<p>In various fields such as text mining, a matrix <code>V</code> needs to be factored into factors <code>W</code> and <code>H</code> with the property that all three matrices have no negative elements, such that the function </p>
<p class="formulaDsp">
<img class="formulaDsp" alt="\[ f(W, H) = \Vert V - WH \Vert_{\mathrm{F}}^2 \]" src="form_19.png"/>
</p>
<p> is minimized. This can be achieved using the algorithm proposed by Lee and Seoung <a class="el" href="citelist.html#CITEREF_lee:nmf">[19]</a> with the following code: </p>
<div class="fragment"><div class="line"><span class="preprocessor">#include &quot;<a class="code" href="matrix_8hpp.html">viennacl/matrix.hpp</a>&quot;</span></div>
<div class="line"><span class="preprocessor">#include &quot;viennacl/linalg/nmf_operations.hpp&quot;</span></div>
<div class="line"></div>
<div class="line"><a class="code" href="classviennacl_1_1matrix.html">viennacl::matrix&lt;ScalarType&gt;</a> V(<a class="code" href="namespaceviennacl_1_1traits.html#aa756f5d6820722094cae0d8b9bb6d5e2">size1</a>, <a class="code" href="namespaceviennacl_1_1traits.html#a3658e7c29ac0f60a20cb5871f5b5fd98">size2</a>);</div>
<div class="line"><a class="code" href="classviennacl_1_1matrix.html">viennacl::matrix&lt;ScalarType&gt;</a> W(<a class="code" href="namespaceviennacl_1_1traits.html#aa756f5d6820722094cae0d8b9bb6d5e2">size1</a>, k);</div>
<div class="line"><a class="code" href="classviennacl_1_1matrix.html">viennacl::matrix&lt;ScalarType&gt;</a> H(k, <a class="code" href="namespaceviennacl_1_1traits.html#a3658e7c29ac0f60a20cb5871f5b5fd98">size2</a>);</div>
<div class="line"></div>
<div class="line"><a class="code" href="classviennacl_1_1linalg_1_1nmf__config.html">viennacl::linalg::nmf_config</a> conf;</div>
<div class="line"><a class="code" href="namespaceviennacl_1_1linalg.html#a1682d7a829cecf20106e407aed749a49">viennacl::linalg::nmf</a>(V, W, H, conf);</div>
</div><!-- fragment --><p>The fourth and last parameter to the function <code><a class="el" href="namespaceviennacl_1_1linalg_1_1cuda.html#ad9b529eaaf42d91af16200a3def8971c" title="The nonnegative matrix factorization (approximation) algorithm as suggested by Lee and Seung...">nmf()</a></code> is an object of type <code><a class="el" href="classviennacl_1_1linalg_1_1nmf__config.html" title="Configuration class for the nonnegative-matrix-factorization algorithm. Specify tolerances, maximum iteration counts, etc., here. ">viennacl::linalg::nmf_config</a></code> containing all necessary parameters of NMF routine:</p>
<ul>
<li><code>eps_</code>: Relative tolerance for convergence</li>
<li><code>stagnation_eps_</code>: Relative tolerance for the stagnation check</li>
<li><code>max_iters_</code>: Maximum number of iterations for the NMF algorithm</li>
<li><code>iters_</code>: The number of iterations of the last NMF run using this configuration object</li>
<li><code>print_relative_error_</code>: Flag specifying whether the relative tolerance should be printed in each iteration</li>
<li><code>check_after_steps_</code>: Number of steps after which the convergence of NMF should be checked (again)</li>
</ul>
<p>Multiple tests can be found in file <code>viennacl/test/src/nmf.cpp</code> and tutorial in file <code>viennacl/examples/tutorial/nmf.cpp</code> </p>
</div></div><!-- contents -->
</div><!-- doc-content -->
<!-- start footer part -->
<div id="nav-path" class="navpath"><!-- id is needed for treeview function! -->
  <ul>
    <li class="footer">Generated on Wed Jan 20 2016 22:32:44 for ViennaCL - The Vienna Computing Library by
    <a href="http://www.doxygen.org/index.html">
    <img class="footer" src="doxygen.png" alt="doxygen"/></a> 1.8.6 </li>
  </ul>
</div>
</body>
</html>