# Tips for debugging `mtail`


## Parser bugs

Run a test with `--logtostderr`, `--mtailDebug` set as high as 3, and `--parser_test_debug` enabled to see the resulting AST.

```
go test -run TestParserRoundTrip/decrement_counter --logtostderr --mtailDebug=3 --parser_test_debug
```

`mtailDebug` at 2 dumps the parser states being traversed, and 3 includes the lexer token stream as well.

## Improving parser syntax error messages

You can use the same debug output to improve error messages in the `%error` section of [`parser.y`](../internal/runtime/compiler/parser/parser.y) by comparing the "error recovery pops" messages with the state machine in the generated [`y.output`](../internal/runtime/compiler/parser/y.output).


```
go generate && go test -run TestParseInvalidPrograms/statement_with_no_effect --logtostderr --mtailDebug=3  --parser_test_debug
```

Error log from the test:
```
...
state-14 saw LSQUARE
error recovery pops state 14
error recovery pops state 102
error recovery pops state 46
error recovery pops state 14
error recovery pops state 2
error recovery pops state 0
```

This log says the lexer sent an LSQUARE token, and the parser was in state 14 when it saw it.  The snippet below from `y.output` shows that state 14 never expects an LSQUARE, and the remaining lines in the log show the state stack being popped. Read from the bottom of the stack upwards, the states are 0, 2, 14, 46, 102, 14.

Walking backwards from state 0 (`$start`), we can get a list of nonterminal names to put in the state machine match expression used in the `%error` directive, and fill in the gaps with our knowledge of the intermediate states in our parse tree.
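Reversing the popped states is mechanical, so it can be scripted. The following is a minimal sketch, assuming the recovery lines have the `error recovery pops state N` form shown in the log above; the function name `popped_states` is hypothetical and not part of mtail.

```shell
# popped_states: read a parser debug log on stdin and print the popped
# state numbers bottom-of-stack first (the reverse of the order logged).
# Hypothetical helper, not shipped with mtail.
popped_states() {
  awk '
    # The state number is the last field on each recovery line.
    /error recovery pops state [0-9]+/ { states[++n] = $NF }
    END {
      for (i = n; i >= 1; i--) printf "%s%s", states[i], (i > 1 ? " " : "\n")
    }
  '
}
```

Pipe the combined output of the test (`2>&1`) into `popped_states`. For the log above it prints `0 2 14 46 102 14`.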

`y.output`:
```
state 14
	conditional_statement:  logical_expr.compound_statement ELSE compound_statement 
	conditional_statement:  logical_expr.compound_statement 
	logical_expr:  logical_expr.logical_op opt_nl bitwise_expr 

	AND  shift 47
	OR  shift 48
	MATCH  shift 49
	NOT_MATCH  shift 50
	LCURLY  shift 46
	.  error

	compound_statement  goto 44
	logical_op  goto 45
```

The transition from state 14 to state 46 shifts an LCURLY token; following state 46, we find ourselves in `compound_statement`.
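Looking up each popped state in `y.output` by hand gets tedious for long stacks. The sketch below prints the block for a single state, assuming every block starts with a header line of the form `state N`, as in the snippet above; `show_state` is a hypothetical helper name, not something provided by mtail or goyacc.

```shell
# show_state N FILE: print the block for parser state N from a y.output
# file, stopping at the next "state M" header line.
# Hypothetical helper; assumes headers look like "state 14" on their own line.
show_state() {
  awk -v want="$1" '
    /^state [0-9]+[ \t]*$/ { printing = ($2 == want) }
    printing
  ' "$2"
}
```

For example, `show_state 46 internal/runtime/compiler/parser/y.output` shows which rules state 46 is in the middle of, which is how the nonterminal names for the `%error` directive are found.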

Add to `parser.y` the names of the states that ended up at the unexpected token, followed by the error message:
```
%error stmt_list stmt conditional_statement logical_expr compound_statement conditional_statement logical_expr LSQUARE : "unexpected indexing of an expression"
```

and instead of "syntax error", the parser now emits "unexpected indexing of an expression".


## Fuzzer crashes

Build the fuzzer locally with clang and libfuzzer:

```
make vm-fuzzer fuzz CXX=clang CXXFLAGS=-fsanitize=fuzzer,address LIB_FUZZING_ENGINE=
```

Then we can run the fuzzer with our example crash. Make sure the file name contains no unusual characters, because the upstream fuzz executor doesn't shell-escape its arguments.

```
./vm-fuzzer crash.mtail
```
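Because the arguments are not shell-escaped, it can help to check the file name before passing it on. This is a small sketch: the set of characters treated as safe here (letters, digits, `.`, `_`, `/` and `-`) is an assumption chosen for illustration, and `safe_name` is a hypothetical helper rather than part of the mtail build.

```shell
# safe_name PATH: succeed only if PATH is non-empty and made up solely of
# letters, digits, dot, underscore, slash and hyphen.
# Hypothetical helper; the allowed character set is an assumption.
safe_name() {
  case "$1" in
    "" | *[!A-Za-z0-9._/-]*) return 1 ;;
    *) return 0 ;;
  esac
}
```

A guarded run could then look like `safe_name crash.mtail && ./vm-fuzzer crash.mtail`; rename the file first if the check fails.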

If the crash is big, we can try to minimise it:

```
make fuzz-min CRASH=crash.mtail
```

Sometimes the minimiser will hit a local minimum that still looks big; for example, it doesn't know how to shrink variable names.

We can reformat the crash with [`cmd/mfmt`](../cmd/mfmt/main.go):

```
make mfmt
./mfmt --prog crash.mtail --write
```

so it's easier to read -- it'll be bigger because of the whitespace, and the minimiser should shrink it back to its original size if everything is working well.

The formatted mtail program should help make it obvious what's happening and let you manually attempt to rename or remove parts of the program yourself -- perhaps a whole variable declaration and usage doesn't need to exist, but the minimiser will take a long time to figure that out.

Once we have the smallest program, we can add it to the crash corpus in [`internal/runtime/fuzz/`](../internal/runtime/fuzz/); running `make fuzz` should then pick it up and fail on it straight away.

Or, variants of the program can be added to the various `*Invalid` tests in parts of the `vm` module, e.g. [`parser_test.go`](../internal/runtime/compiler/parser/parser_test.go) or [`checker_test.go`](../internal/runtime/compiler/checker/checker_test.go) depending on where in the compiler the defect is occurring.

If the crash is in `vm.go` then we can dump the program to see the AST, types, and bytecode it generates.

```
make mtail
./mtail --logtostderr --dump_ast_types --dump_bytecode --mtailDebug=3 --compile_only --progs crash.mtail
```


### Fuzzer crashes, part 2

Run the `fuzz-repro` target with the `CRASH` variable set; it'll do all of the above:

```
make fuzz-repro CRASH=bug/20720.mtail
```