.. role:: ref(emphasis)

.. _futhark-opencl(1):

==============
futhark-opencl
==============

SYNOPSIS
========

futhark opencl [options...] <program.fut>

DESCRIPTION
===========


``futhark opencl`` translates a Futhark program to C code invoking
OpenCL kernels, and either compiles that C code with a C compiler to
an executable binary program, or produces a ``.h`` and ``.c`` file
that can be linked with other code. The standard Futhark optimisation
pipeline is used.

``futhark opencl`` uses ``-lOpenCL`` to link (``-framework OpenCL`` on
macOS).  If using ``--library``, you will need to do the same when
linking the final binary.
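
As an illustration, a build script that links a ``--library`` build could pick the linker flag per platform. This is a minimal sketch, not part of Futhark: the helper names ``opencl_link_flags`` and ``link_command`` and the file names ``main.c`` and ``prog.c`` are hypothetical.

```python
import sys


def opencl_link_flags(platform=None):
    """Return the OpenCL linker flags named above for a platform.

    `platform` takes the values of sys.platform; "darwin" means macOS.
    """
    if platform is None:
        platform = sys.platform
    if platform == "darwin":
        return ["-framework", "OpenCL"]
    return ["-lOpenCL"]


def link_command(cc, sources, output, platform=None):
    # Hypothetical helper: assemble a compiler invocation as an argument list.
    return [cc, *sources, "-o", output, *opencl_link_flags(platform)]
```

For example, ``link_command("cc", ["main.c", "prog.c"], "main", "linux")`` yields an argument list ending in ``-lOpenCL``, suitable for passing to ``subprocess.run``.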

The GPU terminology used is derived from CUDA nomenclature (e.g.
"thread block" instead of "workgroup"), but OpenCL nomenclature is
also supported for compatibility.

OPTIONS
=======

Accepts the same options as :ref:`futhark-c(1)`.

ENVIRONMENT VARIABLES
=====================

``CC``

  The C compiler used to compile the program.  Defaults to ``cc`` if
  unset.

``CFLAGS``

  Space-separated list of options passed to the C compiler.  Defaults
  to ``-O -std=c99`` if unset.
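
The defaulting rules above can be modelled in a few lines. This sketch only mirrors the documented behaviour and is not Futhark's own code. The function name ``c_compiler_settings`` is hypothetical, and treating an empty ``CFLAGS`` as "no flags" rather than "unset" is an assumption.

```python
import os


def c_compiler_settings(env=None):
    """Return (compiler, flags) following the documented defaults."""
    if env is None:
        env = os.environ
    # CC falls back to "cc" only when the variable is unset.
    cc = env.get("CC", "cc")
    # CFLAGS is a space-separated list; the default applies only when unset.
    cflags = env.get("CFLAGS", "-O -std=c99").split()
    return cc, cflags
```

Passing a plain dictionary instead of the real environment makes the rule easy to check, e.g. ``c_compiler_settings({"CC": "clang"})``.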

EXECUTABLE OPTIONS
==================

Generated executables accept the same options as those generated by
:ref:`futhark-c(1)`.  For the ``-t`` option, the time taken to perform
device setup or teardown, including writing the input or reading the
result, is not included in the measurement. In particular, this means
that timing starts after all kernels have been compiled and data has
been copied to the device buffers but before setting any kernel
arguments. Timing stops after the kernels are done running, but before
data has been read from the buffers or the buffers have been released.
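
The boundaries of the timed region can be summarised as a small model. The phase names below are labels chosen for this sketch, not identifiers from Futhark, and the relative order of kernel compilation and input copying is an assumption (the text only says both happen before timing starts).

```python
# (phase label, included in the -t measurement?)
PHASES = [
    ("compile kernels", False),
    ("copy input to device buffers", False),
    ("set kernel arguments", True),
    ("run kernels", True),
    ("read results from buffers", False),
    ("release buffers", False),
]


def timed_phases(phases=PHASES):
    """Return the labels of the phases that fall inside the measurement."""
    return [name for name, timed in phases if timed]


def reported_time(durations, phases=PHASES):
    """Sum the durations of only the timed phases.

    `durations` maps each phase label to a duration in any consistent unit.
    """
    return sum(durations[name] for name, timed in phases if timed)
```

With made-up durations where compilation dominates, ``reported_time`` returns only the cost of setting arguments and running kernels, which is why a slow first-run compile does not show up in the ``-t`` figure.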

The following additional options are accepted.

--build-option=OPT

  Add an additional build option to the string passed to
  ``clBuildProgram()``.  Refer to the OpenCL documentation for which
  options are supported.  Be careful: some options can easily
  produce invalid results.

--default-thread-block-size=INT, --default-group-size=INT

  The default size of thread blocks that are launched. Capped to the
  hardware limit if necessary.

--default-num-thread-blocks=INT, --default-num-groups=INT

  The default number of thread blocks that are launched.

--default-threshold=INT

  The default parallelism threshold used for comparisons when
  selecting between code versions generated by incremental flattening.
  Intuitively, the amount of parallelism needed to saturate the GPU.

--default-tile-size=INT

  The default tile size used when performing two-dimensional tiling
  (the thread block size will be the square of the tile size).

-d, --device=NAME

  Use the first OpenCL device whose name contains the given string.
  The special string ``#k``, where ``k`` is an integer, can be used to
  pick the *k*-th device, numbered from zero.  If used in conjunction
  with ``-p``, only the devices from matching platforms are
  considered.

--dump-opencl=FILE

  Don't run the program, but instead dump the embedded OpenCL program
  to the indicated file.  Useful if you want to see what is actually
  being executed.

--dump-opencl-binary=FILE

  Don't run the program, but instead dump the compiled version of
  the embedded OpenCL program to the indicated file.  On NVIDIA
  platforms, this will be PTX code.

--load-opencl=FILE

  Instead of using the embedded OpenCL program, load it from the
  indicated file.

--load-opencl-binary=FILE

  Load an OpenCL binary from the indicated file.

-p, --platform=NAME

  Use the first OpenCL platform whose name contains the given string.
  The special string ``#k``, where ``k`` is an integer, can be used to
  pick the *k*-th platform, numbered from zero.

--list-devices

  List all OpenCL devices and platforms available on the system.
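
The matching rules for ``-p`` and ``-d`` can be read as the following model. It is one interpretation of the descriptions above, not the runtime's actual code. The function names and the platform and device names in ``EXAMPLE_PLATFORMS`` are hypothetical, and taking the first candidate when no selector is given is an assumption of this sketch.

```python
# Hypothetical system: a list of (platform name, [device names]).
EXAMPLE_PLATFORMS = [
    ("Intel(R) OpenCL", ["Intel CPU"]),
    ("NVIDIA CUDA", ["GeForce A", "GeForce B"]),
]


def matches(selector, index, name):
    """True if `selector` picks the entry at position `index` called `name`."""
    if selector is None:
        return True
    if selector.startswith("#") and selector[1:].isdigit():
        # "#k" selects by zero-based position.
        return index == int(selector[1:])
    # Any other string is a substring match on the name.
    return selector in name


def select_device(platforms, platform=None, device=None):
    """Return (platform name, device name) for the chosen device, or None."""
    # With -p, only devices on matching platforms are considered.
    candidates = [
        (pname, dname)
        for pi, (pname, dnames) in enumerate(platforms)
        if matches(platform, pi, pname)
        for dname in dnames
    ]
    # With -d, take the first candidate that matches; "#k" counts
    # positions among the remaining candidates.
    for di, (pname, dname) in enumerate(candidates):
        if matches(device, di, dname):
            return pname, dname
    return None
```

Note how ``#1`` changes meaning once a platform filter is in place: on the example data it refers to ``GeForce A`` when all devices are considered, but to ``GeForce B`` when combined with ``platform="NVIDIA"``.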

SEE ALSO
========

:ref:`futhark-test(1)`, :ref:`futhark-cuda(1)`, :ref:`futhark-c(1)`