.. _Debugging:

=========
Debugging
=========

One of the main advantages of cross-platform Emscripten code is that the same source can be debugged either on the native platform or with the web browser's increasingly powerful toolset, including its debugger, profiler, and other tools.

Emscripten provides a lot of functionality and tools to aid debugging:

- :ref:`Compiler debug information flags <debugging-debug-information-g>` that allow you to preserve debug information in compiled code and even create source maps so that you can step through the native C++ source when debugging in the browser.
- :ref:`Debug mode <debugging-EMCC_DEBUG>`, which emits debug logs and stores intermediate build files for analysis.
- :ref:`Compiler settings <debugging-compilation-settings>` to enable runtime checking of memory accesses and common allocation errors.
- :ref:`debugging-manual-debugging` of Emscripten-generated code is also supported, which is in some ways even better than on native platforms.
- :ref:`debugging-autodebugger`, which automatically instruments LLVM bitcode to write out each store to memory.

This article describes the main tools and settings provided by Emscripten for debugging, along with a section explaining how to debug a number of :ref:`debugging-emscripten-specific-issues`.


.. _debugging-debug-information-g:

Debug information
=================

:ref:`Emcc <emccdoc>` strips out most of the debug information from :ref:`optimized builds <Optimizing-Code>` by default. Optimization levels :ref:`-O1 <emcc-O1>` and above remove LLVM debug information, and also disable runtime :ref:`ASSERTIONS <debugging-ASSERTIONS>` checks. From optimization level :ref:`-O2 <emcc-O2>` the code is minified by the :term:`Closure Compiler` and becomes virtually unreadable.

The *emcc* :ref:`-g flag <emcc-g>` can be used to preserve debug information in the compiled output. By default, this option includes Clang / LLVM debug information in a DWARF format in the generated WebAssembly code and preserves white-space, function names and variable names in the generated JavaScript code.

The flag can also be specified with one of five levels: :ref:`-g0 <emcc-g0>`, :ref:`-g1 <emcc-g1>`, :ref:`-g2 <emcc-g2>`, :ref:`-g3 <emcc-g3>` (default level), and :ref:`-g4 <emcc-g4>`. Each level builds on the last to provide progressively more debug information in the compiled output. See :ref:`Compiler debug information flags <debugging-debug-information-g>` for more details.

The :ref:`-g4 <emcc-g4>` option provides the most debug information — it generates source maps that allow you to view and debug the *C/C++ source code* in your browser's debugger on Firefox, Chrome or Safari!

.. note:: Some optimizations may be disabled when used in conjunction with the debug flags. For example, if you compile with ``-O3 -g4`` some of the normal ``-O3`` optimizations will be disabled in order to provide the requested debugging information.

.. _debugging-EMCC_DEBUG:

Debug mode (EMCC_DEBUG)
=======================

The ``EMCC_DEBUG`` environment variable can be set to enable Emscripten's debug mode:

.. code-block:: bash

  # Linux or macOS
  EMCC_DEBUG=1 emcc tests/hello_world.cpp -o hello.html

  # Windows
  set EMCC_DEBUG=1
  emcc tests/hello_world.cpp -o hello.html
  set EMCC_DEBUG=0

With ``EMCC_DEBUG=1`` set, :ref:`emcc <emccdoc>` emits debug output and generates intermediate files for the compiler's various stages. ``EMCC_DEBUG=2`` additionally generates intermediate files for each JavaScript optimizer pass.

The debug logs and intermediate files are output to
**TEMP_DIR/emscripten_temp**, where ``TEMP_DIR`` is the OS default temporary
directory (e.g. **/tmp** on UNIX).

The debug logs can be analyzed to profile the build and to review the changes that were made in each step.

.. note:: A more limited amount of debug information can also be enabled by specifying the :ref:`verbose output <debugging-emcc-v>` compiler flag (``emcc -v``).


.. _debugging-compilation-settings:

Compiler settings
==================

Emscripten has a number of compiler settings that can be useful for debugging. These are set using the :ref:`emcc -s <emcc-s-option-value>` option, and will override any optimization flags. For example:

.. code-block:: bash

  emcc -O1 -s ASSERTIONS=1 tests/hello_world.cpp

Some important settings are:

  -
    .. _debugging-ASSERTIONS:

    ``ASSERTIONS=1`` is used to enable runtime checks for common memory allocation errors (e.g. writing more memory than was allocated). It also defines how Emscripten should handle errors in program flow. The value can be set to ``ASSERTIONS=2`` in order to run additional tests.

    ``ASSERTIONS=1`` is enabled by default. Assertions are turned off for optimized code (:ref:`-O1 <emcc-O1>` and above).

  -
    .. _debugging-SAFE-HEAP:

    ``SAFE_HEAP=1`` adds additional memory access checks, and will give clear errors for problems like dereferencing 0 and memory alignment issues.

    You can also set ``SAFE_HEAP_LOG`` to log ``SAFE_HEAP`` operations.

  -
    .. _debugging-STACK_OVERFLOW_CHECK:

    Passing the ``STACK_OVERFLOW_CHECK=1`` linker flag adds a runtime magic
    token value at the end of the stack, which is checked in certain locations
    to verify that the user code does not accidentally write past the end of the
    stack. While overrunning the Emscripten stack is not a security issue for
    JavaScript (which is unaffected), writing past the stack causes memory
    corruption in global data and dynamically allocated memory sections in the
    Emscripten HEAP, which makes the application fail in unexpected ways. The
    value ``STACK_OVERFLOW_CHECK=2`` enables slightly more detailed stack guard
    checks, which can give a more precise callstack at the expense of some
    performance. Default value is 1 if ``ASSERTIONS=1`` is set, and disabled
    otherwise.

  -
    .. _debugging-DEMANGLE_SUPPORT:

    ``DEMANGLE_SUPPORT=1`` links in code to automatically demangle stack traces, that is, emit human-readable C++ function names instead of ``_ZN..`` ones.

A number of other useful debug settings are defined in `src/settings.js <https://github.com/emscripten-core/emscripten/blob/master/src/settings.js>`_. For more information, search that file for the keywords "check" and "debug".

.. _debugging-sanitizers:

Sanitizers
==========

Emscripten also supports some of Clang's sanitizers, such as :ref:`sanitizer_ubsan` and :ref:`sanitizer_asan`.

.. _debugging-emcc-v:

emcc verbose output
===================

Compiling with :ref:`emcc -v <emcc-verbose>` causes Emscripten to print
the sub-commands that it runs, and also passes ``-v`` to Clang.

.. _debugging-manual-debugging:

Manual print debugging
======================

You can also manually instrument the source code with ``printf()`` statements, then compile and run the code to investigate issues.

If you have a good idea of the problem line you can add ``print(new Error().stack)`` to the JavaScript to get a stack trace at that point. Also available is :js:func:`stackTrace`, which emits a stack trace and also tries to demangle C++ function names if ``DEMANGLE_SUPPORT`` is enabled (if you don't want or need C++ demangling in a specific stack trace, you can call :js:func:`jsStackTrace`).

Debug printouts can even execute arbitrary JavaScript. For example::

  function _addAndPrint($left, $right) {
    $left = $left | 0;
    $right = $right | 0;
    //---
    if ($left < $right) console.log('l<r at ' + stackTrace());
    //---
    _printAnInteger($left + $right | 0);
  }
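
The same kind of conditional printout can be added on the C++ side before compiling. The following is only a minimal sketch: ``addAndPrint`` is a hypothetical function that mirrors the JavaScript example above, and the message format is an arbitrary choice.

.. code-block:: cpp

  #include <cstdio>

  // Hypothetical function mirroring the JavaScript example above.
  int addAndPrint(int left, int right) {
    // Temporary instrumentation: report the suspicious case on stderr,
    // tagged with the source file and line of this statement.
    if (left < right) {
      std::fprintf(stderr, "%s:%d: l<r (left=%d, right=%d)\n",
                   __FILE__, __LINE__, left, right);
    }
    int sum = left + right;
    std::printf("%d\n", sum);  // normal program output
    return sum;
  }

Writing the diagnostic to ``stderr`` keeps it separate from the program's normal output, which makes it easier to filter when comparing runs.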


Handling C++ exceptions from JavaScript
=======================================

C++ exceptions are thrown from WebAssembly using exception pointers, which means
that try/catch/finally blocks in JavaScript will only receive a number, which
represents a pointer into linear memory. To get the exception message, you need
to write some code, compiled to WebAssembly, that extracts it from the
exception object. The example below defines a function that receives the
address of a ``std::exception``, casts it to a pointer of that type, and
returns the result of calling ``what()``.

.. code-block:: cpp

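  #include <cstdint>
  #include <exception>
  #include <string>

  #include <emscripten/bind.h>

  // Interpret the number caught in JavaScript as a pointer to a
  // std::exception and return the message reported by what().
  std::string getExceptionMessage(intptr_t exceptionPtr) {
    return std::string(reinterpret_cast<std::exception *>(exceptionPtr)->what());
  }

  // Expose the helper to JavaScript as Module.getExceptionMessage.
  EMSCRIPTEN_BINDINGS(Bindings) {
    emscripten::function("getExceptionMessage", &getExceptionMessage);
  }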

Once such a function has been created, exception handling code in JavaScript
can call it when receiving an exception from WebAssembly. Here the function is
used to log the thrown exception.

.. code-block:: javascript

  try {
    ... // some code that calls WebAssembly
  } catch (exception) {
    console.error(Module.getExceptionMessage(exception));
  } finally {
    ...
  }

Note that this example code only works for thrown statically allocated
exceptions. If your code throws other objects, such as strings or dynamically
allocated exceptions, the handling code will need to take that into account.
For example, if your code needs to handle both native C++ exceptions and
JavaScript exceptions you could use the following code to distinguish between
them:

.. code-block:: javascript

  function getExceptionMessage(exception) {
    return typeof exception === 'number'
      ? Module.getExceptionMessage(exception)
      : exception;
  }
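
A different approach, shown here only as a sketch and not taken from the code above, is to catch the exception on the C++ side and hand JavaScript a plain message. ``runAndDescribe`` is a hypothetical helper, and the sketch assumes that exception catching is enabled in the build (for example with ``-s DISABLE_EXCEPTION_CATCHING=0``).

.. code-block:: cpp

  #include <exception>
  #include <functional>
  #include <string>

  // Hypothetical helper: runs `work` and converts whatever it throws into
  // a message. An empty string means that nothing was thrown.
  std::string runAndDescribe(const std::function<void()>& work) {
    try {
      work();
      return "";
    } catch (const std::exception& e) {
      return e.what();  // standard exceptions carry their own message
    } catch (const char* message) {
      return message;   // code that throws string literals
    } catch (...) {
      return "unknown non-standard exception";
    }
  }

Because the conversion happens before control returns to JavaScript, the caller never has to interpret a raw pointer, at the cost of wrapping each entry point.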

.. _debugging-emscripten-specific-issues:

Emscripten-specific issues
==========================

Memory Alignment Issues
-----------------------

The :ref:`Emscripten memory representation <emscripten-memory-model>` is compatible with C and C++. However, when undefined behavior is involved you may see differences with native architectures, and also differences between Emscripten's output for asm.js and WebAssembly:

- In asm.js, loads and stores must be aligned, and performing a normal load or store on an unaligned address can fail silently (access the wrong address). If the compiler knows a load or store is unaligned, it can emulate it in a way that works but is slow.
- In WebAssembly, unaligned loads and stores will work. Each one is annotated with its expected alignment. If the actual alignment does not match, it will still work, but may be slow on some CPU architectures.

.. tip:: :ref:`SAFE_HEAP <debugging-SAFE-HEAP>` can be used to reveal memory alignment issues.

Generally it is best to avoid unaligned reads and writes — often they occur as the result of undefined behavior, as mentioned above. In some cases, however, they are unavoidable — for example if the code to be ported reads an ``int`` from a packed structure in some pre-existing data format. In that case, to make things work properly in asm.js, and be fast in WebAssembly, you must be sure that the compiler knows the load or store is unaligned. To do so you can:

- Manually read individual bytes and reconstruct the full value
- Use the :c:type:`emscripten_align* <emscripten_align1_short>` typedefs, which define unaligned versions of the basic types (``short``, ``int``, ``float``, ``double``). All operations on those types are not fully aligned (use the ``1`` variants in most cases, which mean no alignment whatsoever).
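
The first option above can be sketched as follows. Both helpers are hypothetical names, and the byte-by-byte version assumes the data format stores the value in little-endian order. The ``memcpy`` variant is a common C and C++ idiom for the same purpose: it copies the bytes into an aligned local variable in the host's byte order.

.. code-block:: cpp

  #include <cstdint>
  #include <cstring>

  // Hypothetical helper: rebuild a 32-bit value from four individual
  // bytes, assuming the data format is little-endian.
  std::uint32_t readU32LittleEndian(const unsigned char* bytes) {
    return static_cast<std::uint32_t>(bytes[0]) |
           (static_cast<std::uint32_t>(bytes[1]) << 8) |
           (static_cast<std::uint32_t>(bytes[2]) << 16) |
           (static_cast<std::uint32_t>(bytes[3]) << 24);
  }

  // Hypothetical helper: copy the bytes into an aligned local variable.
  // The result uses the host's byte order.
  std::int32_t readI32Unaligned(const unsigned char* bytes) {
    std::int32_t value;
    std::memcpy(&value, bytes, sizeof(value));
    return value;
  }

Neither helper ever dereferences an ``int`` pointer at a misaligned address, so the compiler has no opportunity to assume an alignment that does not hold.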


Function Pointer Issues
-----------------------

If you get an ``abort()`` from a function pointer call to ``nullFunc`` or ``b0`` or ``b1`` (possibly with an error message saying "incorrect function pointer"), the problem is that the function pointer was not found in the expected function pointer table when called.

.. note:: ``nullFunc`` is the function used to populate empty index entries in the function pointer tables (``b0`` and ``b1`` are shorter names used for ``nullFunc`` in more optimized builds).  A function pointer to an invalid index will call this function, which simply calls ``abort()``.

There are several possible causes:

- Your code is calling a function pointer that has been cast from another type (this is undefined behavior but it does happen in real-world code). In optimized Emscripten output, each function pointer type is stored in a separate table based on its original signature, so you *must* call a function pointer with that same signature to get the right behavior (see :ref:`portability-function-pointer-issues` in the code portability section for more information).
- Your code is calling a method on a ``NULL`` pointer or dereferencing 0. This sort of bug can be caused by any sort of coding error, but manifests as a function pointer error because the function can't be found in the expected table at runtime.
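
The first cause can be illustrated with a small sketch. All names here are hypothetical: ``Callback`` stands for the pointer type that some C-style API expects, and ``incrementCounter`` is a function whose signature differs from it. Rather than casting the function to ``Callback``, the sketch adds an adapter with exactly the expected signature.

.. code-block:: cpp

  // Callback type expected by a hypothetical C-style API.
  typedef void (*Callback)(void*);

  // Hypothetical API function that simply invokes the callback.
  void invokeCallback(Callback callback, void* userData) {
    callback(userData);
  }

  // The function we want to run returns int, so its signature differs
  // from Callback.
  int incrementCounter(int* counter) {
    return ++*counter;
  }

  // Risky (undefined behavior): calling through a cast pointer, e.g.
  //   invokeCallback(reinterpret_cast<Callback>(incrementCounter), &n);

  // Safer: an adapter whose signature matches Callback exactly.
  void incrementCounterAdapter(void* userData) {
    incrementCounter(static_cast<int*>(userData));
  }

Calling ``invokeCallback(incrementCounterAdapter, &n)`` then goes through a pointer of the type the API declares, so the lookup happens in the expected table.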

In order to debug these sorts of issues:

- Compile with ``-Werror``. This turns warnings into errors, which can be useful as some cases of undefined behavior would otherwise only produce warnings.
- Use ``-s ASSERTIONS=2`` to get some useful information about the function pointer being called, and its type.
- Look at the browser stack trace to see where the error occurs and which function should have been called.
- Build with :ref:`SAFE_HEAP=1 <debugging-SAFE-HEAP>`.
- :ref:`Sanitizers` can help here, in particular UBSan.

Another function pointer issue is when the wrong function is called. :ref:`SAFE_HEAP=1 <debugging-SAFE-HEAP>` can help with this as it detects some possible errors with function table accesses.


Infinite loops
--------------

Infinite loops cause your page to hang. After a period the browser will notify the user that the page is stuck and offer to halt or close it.

If your code hits an infinite loop, one easy way to find the problem code is to use a *JavaScript profiler*. In the Firefox profiler, if the code enters an infinite loop you will see a block of code doing the same thing repeatedly near the end of the profile.

.. note:: The :ref:`emscripten-runtime-environment-main-loop` may need to be re-coded if your application uses an infinite main loop.
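
As a rough sketch of the kind of re-coding the note above refers to, the body of an infinite loop can be moved into a function that performs one iteration and returns to the browser. ``AppState``, ``runOneFrame``, ``browserFrame`` and ``runApp`` are hypothetical names. ``emscripten_set_main_loop_arg`` and ``emscripten_cancel_main_loop`` are declared in ``emscripten.h``. The native branch keeps an ordinary loop so the same source can still be debugged natively.

.. code-block:: cpp

  #include <cstdio>
  #ifdef __EMSCRIPTEN__
  #include <emscripten.h>
  #endif

  // Hypothetical application state.
  struct AppState {
    int frame = 0;
    int maxFrames = 3;
  };

  // One iteration of the former loop body. Returns false when finished.
  bool runOneFrame(AppState& state) {
    if (state.frame >= state.maxFrames) return false;
    std::printf("frame %d\n", state.frame);
    ++state.frame;
    return true;
  }

  #ifdef __EMSCRIPTEN__
  // Called by the browser once per frame; stops the loop when done.
  static void browserFrame(void* arg) {
    if (!runOneFrame(*static_cast<AppState*>(arg))) {
      emscripten_cancel_main_loop();
    }
  }
  #endif

  void runApp(AppState& state) {
  #ifdef __EMSCRIPTEN__
    // fps = 0 lets the browser choose the rate; the final 1 simulates an
    // infinite loop, so code after this call is not reached.
    emscripten_set_main_loop_arg(browserFrame, &state, 0, 1);
  #else
    while (runOneFrame(state)) {
    }
  #endif
  }

Because each call to ``runOneFrame`` returns, the browser regains control between iterations and the page does not hang.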



.. _debugging-autodebugger:

AutoDebugger
============

The *AutoDebugger* is the 'nuclear option' for debugging Emscripten code.

.. warning:: This option is primarily intended for Emscripten core developers.

The *AutoDebugger* will rewrite the output so it prints out each store to memory. This is useful because you can compare the output for different compiler settings in order to detect regressions.

The *AutoDebugger* can potentially find **any** problem in the generated code, so it is strictly more powerful than the ``CHECK_*`` settings and ``SAFE_HEAP``. One use of the *AutoDebugger* is to quickly emit lots of logging output, which can then be reviewed for odd behavior. The *AutoDebugger* is also particularly useful for :ref:`debugging regressions <debugging-autodebugger-regressions>`.

The *AutoDebugger* has some limitations:

-  It generates a lot of output. Using *diff* can be very helpful for identifying changes.
-  It prints out simple numerical values rather than pointer addresses (because pointer addresses change between runs, and hence can't be compared). This is a limitation because sometimes inspection of addresses can show errors where the pointer address is 0 or impossibly large. It is possible to modify the tool to print out addresses as integers in ``tools/autodebugger.py``.

To run the *AutoDebugger*, compile with the environment variable ``EMCC_AUTODEBUG=1`` set. For example:

.. code-block:: bash

  # Linux or macOS
  EMCC_AUTODEBUG=1 emcc tests/hello_world.cpp -o hello.html

  # Windows
  set EMCC_AUTODEBUG=1
  emcc tests/hello_world.cpp -o hello.html
  set EMCC_AUTODEBUG=0


.. _debugging-autodebugger-regressions:

AutoDebugger Regression Workflow
---------------------------------

Use the following workflow to find regressions with the *AutoDebugger*:

- Compile the working code with ``EMCC_AUTODEBUG=1`` set in the environment.
- Compile the code using ``EMCC_AUTODEBUG=1`` in the environment again, but this time with the settings that cause the regression. Following this step we have one build before the regression and one after.
- Run both versions of the compiled code and save their output.
- Compare the output using a *diff* tool.

Any difference between the outputs is likely to be caused by the bug.

.. note::
    You may want to use ``-s DETERMINISTIC`` which will ensure that timing
    and other issues don't cause false positives.


Useful Links
============

- `Blogpost about reading compiler output <http://mozakai.blogspot.com/2014/06/looking-through-emscripten-output.html>`_.
- `GDC 2014: Getting started with asm.js and Emscripten <http://people.mozilla.org/~lwagner/gdc-pres/gdc-2014.html#/20>`_ (Debugging slides).

Need help?
==========

The :ref:`Emscripten Test Suite <emscripten-test-suite>` contains good examples of almost all functionality offered by Emscripten. If you have a problem, it is a good idea to search the suite to determine whether test code with similar behavior is able to run.

If you've tried the ideas here and you need more help, please :ref:`contact`.