:orphan:

.. _faq:

==========================
Frequently Asked Questions
==========================

Does Theano support Python 3?
------------------------------

We support both Python 2 >= 2.7 and Python 3 >= 3.4.

Output slight numerical difference
----------------------------------

When you compare Theano's output across different Theano flags,
Theano versions, or devices (CPU and GPU), or against other software
like NumPy, you will sometimes see small numerical differences.

This is normal. Floating point numbers are approximations of real
numbers, so floating point addition is not associative: computing
a+(b+c) instead of (a+b)+c can give slightly different values. For
more details, see: `What Every Computer Scientist Should Know About
Floating-Point Arithmetic
<https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html>`_.
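
The effect can be reproduced in plain Python, without Theano. The
following minimal sketch shows two groupings of the same sum that differ
in the last bit, and a tolerance-based comparison that treats them as
equal (the values 0.1, 0.2 and 0.3 are only illustrative):

.. code-block:: python

    import math

    a, b, c = 0.1, 0.2, 0.3

    left = (a + b) + c   # 0.6000000000000001
    right = a + (b + c)  # 0.6

    # Exact comparison fails because the rounding errors differ.
    print(left == right)              # False

    # Comparing with a tolerance is the robust way to check results.
    print(math.isclose(left, right))  # True

When comparing arrays, ``numpy.allclose`` plays the same role as
``math.isclose``.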


Faster gcc optimization
-----------------------

You can enable faster gcc optimization with the ``cxxflags`` option.
This list of flags was suggested on the mailing list::

    -O3 -ffast-math -ftree-loop-distribution -funroll-loops -ftracer

Use it at your own risk. Some people have warned that the
``-ftree-loop-distribution`` optimization produced wrong results in the past.

We used to say that if the ``compiledir`` was not shared by multiple
computers, you could add the ``-march=native`` flag. We now recommend
removing this flag, as Theano applies the equivalent settings automatically
and safely, even if the ``compiledir`` is shared by multiple computers with
different CPUs. Theano asks g++ which flags ``-march=native`` expands to
and uses those flags directly.
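
As a rough sketch of how the flag list above could be passed to Theano,
the snippet below sets it through the ``THEANO_FLAGS`` environment
variable. It assumes the option's full configuration name is
``gcc.cxxflags``, and it overwrites any value ``THEANO_FLAGS`` already
had:

.. code-block:: python

    import os

    # Compiler flags suggested on the mailing list (use at your own risk).
    cxxflags = "-O3 -ffast-math -ftree-loop-distribution -funroll-loops -ftracer"

    # Theano reads THEANO_FLAGS when it is first imported, so set it beforehand.
    # Assumption: the option is named gcc.cxxflags in the configuration.
    os.environ["THEANO_FLAGS"] = "gcc.cxxflags=" + cxxflags

    # import theano  # generated C code would now be compiled with these flags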


.. _faster-theano-function-compilation:

Faster Theano Function Compilation
----------------------------------

Theano function compilation can be time-consuming. It can be sped up by setting
the flag ``mode=FAST_COMPILE``, which instructs Theano to skip most
optimizations and disables the generation of any C/CUDA code. This is useful
for quickly testing a simple idea.

If C/CUDA code is necessary, as when using a GPU, the flag
``optimizer=fast_compile`` can be used instead. It instructs Theano to
skip time-consuming optimizations but still generate C/CUDA code.

Similarly, the flag ``optimizer_excluding=inplace`` speeds up
compilation by excluding inplace optimizations, which replace
operations with versions that reuse memory where doing so does not
affect the correctness of the result. Such optimizations can be
time-consuming. However, using this flag increases memory usage,
because space must be allocated for results that would otherwise have
reused existing memory. In short, this flag trades memory usage for
compilation speed.

Alternatively, if the graph is big, the flag ``cycle_detection=fast``
speeds up compilation by removing some of the inplace
optimizations, which allows Theano to skip a time-consuming cycle
detection algorithm. If the graph is big enough, we suggest that you use
this flag instead of ``optimizer_excluding=inplace``. It results in a
computation time that is in between fast compile and fast run.
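
The compilation-speed flags described above can be summarized in a short
sketch. ``theano_flags`` is a hypothetical helper written for this
example (it is not part of Theano); it only builds the comma-separated
``key=value`` string that the ``THEANO_FLAGS`` environment variable
expects:

.. code-block:: python

    import os


    def theano_flags(**flags):
        """Join keyword arguments into a THEANO_FLAGS string (hypothetical helper)."""
        return ",".join("{}={}".format(key, value) for key, value in flags.items())


    # Quickest to compile: skips most optimizations, generates no C/CUDA code.
    quick_test = theano_flags(mode="FAST_COMPILE")

    # Still generates C/CUDA code (e.g. for a GPU), but skips slow optimizations.
    with_c_code = theano_flags(optimizer="fast_compile")

    # Drops the inplace optimizations: faster compilation, more memory used.
    no_inplace = theano_flags(optimizer_excluding="inplace")

    # For big graphs: fewer inplace optimizations, cheaper cycle detection.
    big_graph = theano_flags(cycle_detection="fast")

    # Pick one configuration before the first ``import theano``.
    os.environ["THEANO_FLAGS"] = big_graph

Which configuration is best depends on the graph, so it is worth timing
both compilation and execution before settling on one.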

The Theano flag ``reoptimize_unpickled_function`` controls whether an
unpickled Theano function reoptimizes its graph. Theano users can
use the standard Python pickle tools to save a compiled Theano
function. When pickling, the graphs from both before and after
optimization are saved, including shared variables. When the flag is
set to True, the graph is reoptimized when it is unpickled. Otherwise,
graph optimization is skipped and the optimized graph from the pickled
file is used directly. The default is False.

Faster Theano function
----------------------

You can set the Theano flag :attr:`allow_gc <config.allow_gc>` to ``False`` to get a speed-up by using
more memory. By default, Theano frees intermediate results as soon as they are no
longer needed, which prevents their memory from being reused. Disabling
garbage collection keeps the memory of all intermediate results so that it can be
reused during the next call to the same Theano function, provided the results have the
same shape. The shapes can change if the shapes of the inputs change.

.. note::

   With :attr:`preallocate <config.gpuarray.preallocate>`, this isn't
   very useful with GPU anymore.
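
The trade-off behind ``allow_gc`` can be illustrated with a NumPy
analogy. The class below is a toy written for this example, not Theano's
actual implementation: it stands in for a single graph node that either
frees its result buffer after each call or keeps it for reuse when the
next call has the same shape. It assumes both inputs are float arrays of
identical shape and dtype.

.. code-block:: python

    import numpy


    class BufferReusingAdd(object):
        """Toy stand-in for one graph node (not Theano's implementation)."""

        def __init__(self, allow_gc):
            self.allow_gc = allow_gc
            self.buffer = None  # memory kept from the previous call, if any

        def __call__(self, a, b):
            if self.buffer is not None and self.buffer.shape == a.shape:
                # Same shape as last time: write into the kept buffer.
                numpy.add(a, b, out=self.buffer)
                result = self.buffer
            else:
                # No usable buffer: allocate fresh memory for the result.
                result = a + b
            # With allow_gc the buffer is dropped; otherwise it is kept.
            self.buffer = None if self.allow_gc else result
            return result


    x = numpy.ones(3)
    keep = BufferReusingAdd(allow_gc=False)
    first, second = keep(x, x), keep(x, x)
    print(first is second)  # True: the second call reused the memory

Reusing the buffer saves an allocation per call but holds the memory
between calls, and a result returned earlier is overwritten by the next
call.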

.. _unsafe_optimization:

Unsafe optimization
===================


Some Theano optimizations assume that the user inputs are
valid. This means that if the user provides invalid values (like
incompatible shapes or indexing values that are out of bounds) and
the optimizations are applied, the user error will get lost. Most of the
time the inputs are indeed valid, so it is good
to apply the optimization, but losing the error is bad.
The newest Theano optimizations that make this assumption add an
assertion to the graph to preserve the user error message. Computing
these assertions can take some time. If you are sure everything is valid
in your graph and want the fastest possible Theano, you can enable an
optimization that removes those assertions with:
``optimizer_including=local_remove_all_assert``


Faster Small Theano function
----------------------------

.. note::

   For Theano 0.6 and up.

For Theano functions that don't do much work, like a regular logistic
regression, the overhead of checking the input can be significant. You
can disable it by setting ``f.trust_input`` to True.
Make sure the types of arguments you provide match those defined when
the function was compiled.

For example, replace the following

.. testcode:: faster

    import theano
    from theano import function

    x = theano.tensor.scalar('x')
    f = function([x], x + 1.)
    f(10.)

with

.. testcode:: faster

    import numpy
    import theano
    from theano import function

    x = theano.tensor.scalar('x')
    f = function([x], x + 1.)
    f.trust_input = True
    # x is a scalar, so pass a 0-d array of dtype floatX
    f(numpy.asarray(10., dtype=theano.config.floatX))

Also, for small Theano functions, you can remove more Python overhead by
making a Theano function that does not take any input. You can use shared
variables to achieve this. You can then call it with ``f.fn()`` or
``f.fn(n_calls=N)`` to speed it up. In the latter case, only the output of the
last call (out of N calls) is returned.

You can also use the ``C`` linker, which puts all nodes in the same C
compilation unit. This removes some overhead between nodes in the graph,
but requires that all nodes in the graph have a C implementation:

.. code-block:: python

    import theano
    from theano import function

    x = theano.tensor.scalar('x')
    f = function([x], (x + 1.) * 2, mode=theano.Mode(linker='c'))
    f(10.)

New GPU backend using libgpuarray
---------------------------------

The new Theano GPU backend (:ref:`gpuarray`) uses ``config.gpuarray.preallocate`` for GPU memory allocation.

Related Projects
----------------

We try to list other Theano-related projects on this `wiki page <https://github.com/Theano/Theano/wiki/Related-projects>`_.


"What are Theano's Limitations?"
--------------------------------

Theano offers a good amount of flexibility, but has some limitations too.
You must answer for yourself the following question: How can my algorithm be cleverly written
so as to make the most of what Theano can do?

Here is a list of some of the known limitations:

- *While*- or *for*-Loops within an expression graph are supported, but only via
  the :func:`theano.scan` op (which puts restrictions on how the loop body can
  interact with the rest of the graph).

- Neither *goto* nor *recursion* is supported or planned within expression graphs.
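
As an illustration of the loop support mentioned in the list above, here
is a minimal sketch that computes ``A**k`` elementwise with
:func:`theano.scan`. It follows the pattern used in Theano's scan
tutorial; the variable names and the function name ``power`` are chosen
for this example only.

.. code-block:: python

    import numpy
    import theano
    import theano.tensor as T

    k = T.iscalar("k")  # number of loop iterations
    A = T.vector("A")   # vector to raise to the k-th power

    # Each step multiplies the previous result by A, starting from ones.
    result, updates = theano.scan(
        fn=lambda prior_result, A: prior_result * A,
        outputs_info=T.ones_like(A),
        non_sequences=A,
        n_steps=k)

    # scan returns the result of every step; only the last one is needed.
    final_result = result[-1]

    power = theano.function(inputs=[A, k], outputs=final_result,
                            updates=updates)

    print(power(numpy.arange(4, dtype=theano.config.floatX), 3))

The loop body is an ordinary Python function, but it is traced once to
build a graph, so it can only use Theano operations on its symbolic
arguments.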