.. role:: ref(emphasis)

.. _futhark-cuda(1):

==============
futhark-cuda
==============

SYNOPSIS
========

futhark cuda [options...] <program.fut>

DESCRIPTION
===========


``futhark cuda`` translates a Futhark program to C code invoking CUDA
kernels, and either compiles that C code with a C compiler to an
executable binary program, or produces a ``.h`` and ``.c`` file that
can be linked with other code. The standard Futhark optimisation
pipeline is used.

``futhark cuda`` uses ``-lcuda -lcudart -lnvrtc`` to link.  If using
``--library``, you will need to do the same when linking the final
binary.
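
As a rough sketch of that workflow, the commands below build a library
and link it into a driver program.  The names ``prog.fut``, ``main.c``,
and ``main`` are hypothetical placeholders, and ``-lm`` is added on the
assumption that the generated code needs the C math library::

  # Generate prog.c and prog.h from a hypothetical prog.fut.
  futhark cuda --library prog.fut

  # Compile the generated code and a hypothetical driver, main.c.
  cc -O -std=c99 -c prog.c -o prog.o
  cc -O -std=c99 -c main.c -o main.o

  # Link with the same CUDA libraries that futhark cuda uses.
  cc main.o prog.o -o main -lcuda -lcudart -lnvrtc -lm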

The generated CUDA code can be called from multiple CPU threads, as it
brackets every API operation with ``cuCtxPushCurrent()`` and
``cuCtxPopCurrent()``.

OPTIONS
=======

Accepts the same options as :ref:`futhark-c(1)`.

ENVIRONMENT VARIABLES
=====================

``CC``

  The C compiler used to compile the program.  Defaults to ``cc`` if
  unset.

``CFLAGS``

  Space-separated list of options passed to the C compiler.  Defaults
  to ``-O -std=c99`` if unset.
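
For example, both variables can be set for a single invocation.  This is
only a sketch: it assumes ``clang`` is installed and uses a hypothetical
``prog.fut``.  Because the default flags apply only when ``CFLAGS`` is
unset, ``-std=c99`` is repeated here on the assumption that it is still
wanted::

  # Compile prog.fut with clang and a higher optimisation level.
  CC=clang CFLAGS="-O3 -std=c99" futhark cuda prog.fut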

EXECUTABLE OPTIONS
==================

Generated executables accept the same options as those generated by
:ref:`futhark-c(1)`. The ``-t`` option behaves as with
:ref:`futhark-opencl(1)`.

The following additional options are accepted.

-h, --help

  Print help text to standard output and exit.

--default-thread-block-size=INT

  The default size of thread blocks that are launched.  Capped to the
  hardware limit if necessary.

--default-num-thread-blocks=INT

  The default number of thread blocks that are launched.

--default-threshold=INT

  The default parallelism threshold used for comparisons when
  selecting between code versions generated by incremental flattening.
  Intuitively, the amount of parallelism needed to saturate the GPU.

--default-tile-size=INT

  The default tile size used when performing two-dimensional tiling
  (the thread block size will be the square of the tile size).

--dump-cuda=FILE

  Don't run the program, but instead dump the embedded CUDA kernels to
  the indicated file.  Useful if you want to see what is actually
  being executed.

--dump-ptx=FILE

  Don't run the program, but instead dump the PTX-compiled version of
  the embedded kernels to the indicated file.

--load-cuda=FILE

  Instead of using the embedded CUDA kernels, load them from the
  indicated file.

--load-ptx=FILE

  Load PTX code from the indicated file.

--nvrtc-option=OPT

  Add an additional build option to the string passed to NVRTC.  Refer
  to the CUDA documentation for which options are supported.  Be
  careful: some options can easily produce invalid results.
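
A few of these options might be combined as follows.  This is a sketch
only: ``prog`` stands for a hypothetical executable built with ``futhark
cuda prog.fut``, ``kernels.cu`` and ``input.txt`` are placeholder file
names, the numeric values are illustrative rather than recommended, and
the program is assumed to read its input from standard input::

  # Dump the embedded CUDA kernels for inspection; the program is not run.
  ./prog --dump-cuda=kernels.cu

  # Run with the (possibly hand-edited) kernels loaded from that file.
  ./prog --load-cuda=kernels.cu < input.txt

  # Run with explicit launch defaults.
  ./prog --default-thread-block-size=256 --default-num-thread-blocks=128 < input.txt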

ENVIRONMENT
===========

If run without ``--library``, ``futhark cuda`` will invoke a C
compiler to compile the generated C program into a binary.  This only
works if the C compiler can find the necessary CUDA libraries.  On
most systems, CUDA is installed in ``/usr/local/cuda``, which is
usually not part of the default compiler search path. You may need to
set the following environment variables before running ``futhark
cuda``::

  LIBRARY_PATH=/usr/local/cuda/lib64
  LD_LIBRARY_PATH=/usr/local/cuda/lib64
  CPATH=/usr/local/cuda/include

At runtime the generated program must be able to find the CUDA
installation directory, which is normally located at
``/usr/local/cuda``.  If you have CUDA installed elsewhere, set any of
the ``CUDA_HOME``, ``CUDA_ROOT``, or ``CUDA_PATH`` environment
variables to the proper directory.
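
One possible way to apply these settings in a POSIX shell is sketched
below.  The ``cuda_dir`` variable is a hypothetical helper introduced
for this example; it falls back to ``/usr/local/cuda`` when
``CUDA_HOME`` is not already set, and existing search path entries are
kept after the CUDA ones::

  # Assumption: CUDA lives in /usr/local/cuda unless CUDA_HOME says otherwise.
  cuda_dir="${CUDA_HOME:-/usr/local/cuda}"

  # Compile time: let the C compiler find the CUDA libraries and headers.
  export LIBRARY_PATH="$cuda_dir/lib64${LIBRARY_PATH:+:$LIBRARY_PATH}"
  export CPATH="$cuda_dir/include${CPATH:+:$CPATH}"

  # Run time: let the dynamic linker and the generated program find CUDA.
  export LD_LIBRARY_PATH="$cuda_dir/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
  export CUDA_HOME="$cuda_dir"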

SEE ALSO
========

:ref:`futhark-opencl(1)`