These are a few debugging notes that may be helpful for anyone developing
Gumbo or trying to diagnose a tricky problem.  They will probably not be
necessary for normal clients of this library - Gumbo is relatively stable, and
its bugs tend to be rare and obscure.  However, they're handy to have as a
reference, and may also provide useful Google fodder to people searching for
these tools.

Standard disclaimer: I use all of these techniques on my Ubuntu 14.04 computer
with gcc 4.8.2, clang 3.4, and gtest 1.6.0, but make no warranty about them
working on other systems.  In particular, they're almost certain not to work on
Windows.

Debug output
============

Gumbo has a compile-time switch to dump lots of debug output onto stdout.
Compile with the GUMBO_DEBUG define enabled:

```bash
$ make CFLAGS='-DGUMBO_DEBUG'
```

Note that this spits *a lot* of debug information to the console and makes the
program run significantly slower, so it's usually helpful to isolate only the
specific HTML file or fragment that causes the bug.  It lets us trace the
operation of each of the tokenizer & parser's state machines in depth, though.
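
Because the debug build prints so much, it can help to send one run's output to
a file and search it afterwards.  The helper below is a minimal sketch of that
idea, not something shipped with Gumbo: `capture_debug` is a hypothetical name,
and the command passed to it is whatever binary was built with GUMBO_DEBUG.

```bash
# Sketch only: capture_debug is a hypothetical helper, not part of Gumbo.
# Usage: capture_debug LOGFILE COMMAND [ARGS...]
capture_debug() {
    logfile=$1
    shift
    rc=0
    # GUMBO_DEBUG output goes to stdout; stderr is captured as well so that
    # any assertion message lands in the same file.
    "$@" > "$logfile" 2>&1 || rc=$?
    echo "exit status $rc; output saved to $logfile" >&2
    return "$rc"
}
```

For example, `capture_debug debug.log ./my_program problem.html` (both names
are placeholders) leaves the whole trace in debug.log, ready for grep or less.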

Unit tests
==========

As mentioned in the README, Gumbo relies on [googletest][] for unit tests.
Unzip the gtest ZIP distribution inside the Gumbo root and rename it 'gtest'.
'make check' runs the tests, as normal.

```bash
$ make check
$ cat test-suite.log
```

If you need to debug a core dump, you'll probably want to run the test binary
directly:

```bash
$ ulimit -c unlimited
$ make check
$ .libs/lt-gumbo_test
$ gdb .libs/lt-gumbo_test core
```

The same goes for core dumps in other example binaries.

To run only a single unit test, pass the --gtest_filter='TestCaseName.TestName'
flag to the lt-gumbo_test binary; a '*' in the pattern acts as a wildcard.
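
A small wrapper saves retyping the binary path while narrowing down a failure.
This is only a sketch: `gumbo_test` and `gumbo_test_only` are hypothetical
helpers, and the `GUMBO_TEST_BIN` variable is an assumed override for the
binary location.  The two gtest flags it is used with, --gtest_filter and
--gtest_list_tests, are standard googletest options.

```bash
# Sketch only: hypothetical wrappers around the gtest binary.
# GUMBO_TEST_BIN is an assumed override; it defaults to the path used above.
gumbo_test() {
    "${GUMBO_TEST_BIN:-.libs/lt-gumbo_test}" "$@"
}

# Run only the tests matching a gtest filter pattern.
gumbo_test_only() {
    gumbo_test --gtest_filter="$1"
}
```

`gumbo_test --gtest_list_tests` prints the available test names, and
`gumbo_test_only 'SomeTestCase.*'` (a placeholder pattern) runs every test in
one test case.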

Assertions
==========

Gumbo relies pretty heavily on assertions.  By default they're compiled in and
checked at run-time; to compile them out, define NDEBUG when building:

```bash
$ make CFLAGS='-DNDEBUG'
```
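
When an assertion fails, assert() prints the failed expression to stderr and
calls abort(), so the process dies from SIGABRT.  The sketch below shows one
way to recognise that from a script.  `run_and_classify` is a hypothetical
helper, and the 134 and 139 exit statuses assume Linux signal numbers and a
shell such as bash or dash that reports death by signal N as 128+N.

```bash
# Sketch only: run_and_classify is a hypothetical helper.
# Usage: run_and_classify COMMAND [ARGS...]
run_and_classify() {
    rc=0
    "$@" || rc=$?
    case $rc in
        0)   echo "ok" ;;
        # 128 + 6 (SIGABRT): abort() was called, typically by a failed assert.
        134) echo "aborted (SIGABRT): likely a failed assertion" ;;
        # 128 + 11 (SIGSEGV).
        139) echo "segfault (SIGSEGV)" ;;
        *)   echo "failed with exit status $rc" ;;
    esac
    return "$rc"
}
```

Other code can call abort() too, so a 134 is a strong hint rather than proof;
the assertion message on stderr names the file and line that failed.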

ASAN
====

Google's [address-sanitizer][] is a helpful tool that lets you find memory
errors with relatively low overhead, low enough that you can often run it in
production.  Enabling it for C/C++ binaries is pretty standard and described on
the ASAN documentation pages.  It requires Clang >= 3.1 or GCC >= 4.8.

```bash
$ make \
    CFLAGS='-fsanitize=address -fno-omit-frame-pointer -fno-inline' \
    LDFLAGS='-fsanitize=address'
```

ASAN can also be used when Gumbo is compiled as a shared library and linked into
a scripting language via FFI, but this use-case is unsupported by the ASAN
authors.  To do it, use LD_PRELOAD to ensure the ASAN runtime support is
included in the process:

```bash
$ LD_PRELOAD=libasan.so.0 python -c 'import gumbo; gumbo.parse(problem_text)'
```
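
The libasan.so.0 name matches the gcc 4.8 setup described at the top of these
notes; other compiler versions ship the runtime under a different version
suffix.  One way to avoid hard-coding it is to ask the compiler where its
runtime lives.  This is a sketch under assumptions: `find_asan_runtime` is a
hypothetical helper, `CC` is an assumed override for the compiler, it relies on
gcc's `-print-file-name` option, and it assumes the file found is a shared
object or a symlink to one.

```bash
# Sketch only: find_asan_runtime is a hypothetical helper.
# CC is an assumed override for the compiler; it defaults to gcc.
find_asan_runtime() {
    lib=$("${CC:-gcc}" -print-file-name=libasan.so)
    case $lib in
        # An absolute path means the compiler located the library.
        /*) printf '%s\n' "$lib" ;;
        # Otherwise the bare name is echoed back: nothing was found.
        *)  echo "ASAN runtime not found via ${CC:-gcc}" >&2
            return 1 ;;
    esac
}
```

The result can stand in for the hard-coded library name, for instance by
writing `LD_PRELOAD=$(find_asan_runtime)` in front of the python command in the
LD_PRELOAD run.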

Getting clean stack traces from this requires the use of the llvm-symbolizer
binary, included with clang:

```bash
$ export ASAN_SYMBOLIZER_PATH=/usr/bin/llvm-symbolizer-3.4
$ export ASAN_OPTIONS=symbolize=1
$ LD_PRELOAD=libasan.so.0 python -c \
  'import gumbo; gumbo.parse(problem_text)' 2>&1 | head -100
$ killall llvm-symbolizer-3.4
$ killall llvm-symbolizer-3.4
$ killall llvm-symbolizer-3.4
```

This use case is even less officially supported than using it with dynamic
shared objects; on my machine, it led to a recursive ASAN error about a
use-after-free in llvm-symbolizer, effectively fork-bombing the machine.  Have
the killalls ready, and avoid letting the process run for too long (e.g. piping
it to 'less').
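
One way to keep such a run from getting out of hand is to give it a hard time
limit in addition to truncating its output.  The sketch below assumes the
`timeout` utility from GNU coreutils is installed; `run_bounded` is a
hypothetical helper, and it is no substitute for having the killalls ready.

```bash
# Sketch only: run_bounded is a hypothetical helper.
# Usage: run_bounded SECONDS COMMAND [ARGS...]
run_bounded() {
    limit=$1
    shift
    # Kill the command outright once the limit expires, and keep only the
    # first 100 lines of its combined stdout and stderr.
    timeout -s KILL "$limit" "$@" 2>&1 | head -n 100
}
```

With ASAN_SYMBOLIZER_PATH, ASAN_OPTIONS and LD_PRELOAD exported beforehand, the
python command can then be started as `run_bounded 30 python -c ...`, where the
30-second limit is an arbitrary choice.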

[googletest]: https://code.google.com/p/googletest/
[address-sanitizer]: https://code.google.com/p/address-sanitizer/