# HangWatcher

HangWatcher is a mechanism for detecting hangs in Chrome, logging their
frequency and nature in UMA and uploading crash reports.

## Definition of a hang
In this document, a hang is any scope that does not complete
within a certain wall-time allowance. A scope is defined by the lifetime
of a `WatchHangsInScope` object. The time-out value can be different for
each individual scope.
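
The definition above can be sketched in standalone C++. This is only an illustration of "scope lifetime plus wall-time allowance", not Chrome's implementation: `ScopedAllowance`, `g_active_deadline` and `IsHangAt` are hypothetical names, the explicit `now` parameter exists only to make the sketch deterministic, and restoring the enclosing scope's deadline on exit is an assumption of the sketch.

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;

// Hypothetical per-thread slot: deadline of the innermost active scope.
// time_point::max() stands for "no monitored scope is active".
thread_local Clock::time_point g_active_deadline = Clock::time_point::max();

// Hypothetical stand-in for a monitored scope. The object's lifetime is the
// scope, and each scope carries its own allowance.
class ScopedAllowance {
 public:
  ScopedAllowance(Clock::duration allowance, Clock::time_point now)
      : enclosing_deadline_(g_active_deadline) {
    assert(allowance > Clock::duration::zero());
    g_active_deadline = now + allowance;
  }
  // Leaving the scope restores whatever deadline was active before it
  // (assumption of this sketch).
  ~ScopedAllowance() { g_active_deadline = enclosing_deadline_; }

  ScopedAllowance(const ScopedAllowance&) = delete;
  ScopedAllowance& operator=(const ScopedAllowance&) = delete;

 private:
  const Clock::time_point enclosing_deadline_;
};

// A hang, in the sense defined above: a scope is still active at `now` and
// its allowance has already run out.
bool IsHangAt(Clock::time_point now) {
  return now > g_active_deadline;
}
```

With an outer scope allowed 10s and an inner scope allowed 2s, both entered at the same instant, `IsHangAt` is true 3s in while the inner scope is active and false again once only the outer scope remains.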

### Example 1
A task on the IO thread blocks on a lock for 20s. No progress is made
because the OS deschedules the thread for as long as the lock remains
contended. This is a hang.

### Example 2
A small function that should execute relatively quickly spends 30s
burning CPU without making any outwardly visible progress. In this
case there is progress made by the thread in a sense, since the
[program counter](https://en.wikipedia.org/wiki/Program_counter)
is not static for the duration of the time-out. However, as far as
Chrome, and critically its user, is concerned, we are stuck and not
making progress. This is a hang.

### Example 3
A message pump is busy pumping millions of tasks and dispatches
them quickly. The task at the end of the queue has to wait for up
to 30s to get executed. This is not a hang. This is congestion.
See //content/scheduler/responsiveness for more details.

## Design

Hangs are monitored by one thread per process. This is a thread in
the OS sense. It is not based on `base::Thread` and does not use
the task posting APIs.

Other threads that want to be monitored register with this watcher
thread. This can be done at thread creation or at any other time.

Monitored threads do not have any responsibilities apart from
marking the entering and leaving of monitored scopes. This is
done using a `WatchHangsInScope` object that is instantiated
on the stack, at the beginning of the scope.

### Example:

```
void FooBar() {
  // The scope is considered hung if it has not completed within 5 seconds.
  WatchHangsInScope scope(base::Seconds(5));
  DoWork();
}
```


The HangWatcher thread periodically traverses the list of
registered threads and verifies that they are not hung
within a monitored scope.

```
+-------------+       +-----------------+                       +-----------------+
| HangWatcher |       | WatchedThread1  |                       | WatchedThread2  |
+-------------+       +-----------------+                       +-----------------+
       |                       |                                         |
       | Init()                |                                         |
       |-------                |                                         |
       |      |                |                                         |
       |<------                |                                         |
       |                       |                                         |
       |            Register() |                                         |
       |<----------------------|                                         |
       |                       |                                         |
       |                       |                              Register() |
       |<----------------------------------------------------------------|
       |                       |                                         |
       |                       |                                         | SetDeadline()
       |                       |                                         |--------------
       |                       |                                         |             |
       |                       |                                         |<-------------
       |                       |                                         |
       |                       |                                         | ClearDeadline()
       |                       |                                         |----------------
       |                       |                                         |               |
       |                       |                                         |<---------------
       |                       |                                         |
       | Monitor()             |                                         |
       |---------------------->|                                         |
       |                       | ------------------------\               |
       |                       |-| No deadline, no hang. |               |
       |                       | |-----------------------|               |
       |                       |                                         |
       | Monitor()             |                                         |
       |---------------------------------------------------------------->|
       |                       |                                         | ------------------------\
       |                       |                                         |-| No deadline, no hang. |
       |                       |                                         | |-----------------------|
       |                       |                                         |
       |                       | SetDeadline()                           |
       |                       |--------------                           |
       |                       |             |                           |
       |                       |<-------------                           |
       |                       |                                         |
       | Monitor()             |                                         |
       |---------------------->| -------------------------------\        |
       |                       |-| Live expired deadline. Hang! |        |
       |                       | |------------------------------|        |
       |                       |                                         |
       | RecordHang()          |                                         |
       |-------------          |                                         |
       |            |          |                                         |
       |<------------          |                                         |
       |                       |                                         |
```
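
The sequence in the diagram can be approximated with a small standalone model. It is a sketch under stated assumptions rather than the actual HangWatcher code: `HangWatcherSketch`, `WatchedThreadState`, the `kNoDeadline` sentinel and the millisecond timestamps passed in by the caller are all hypothetical, and a real watcher would run `Monitor()` periodically from its own thread instead of being called by hand.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <limits>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Sentinel meaning "this thread is not inside a monitored scope".
constexpr int64_t kNoDeadline = std::numeric_limits<int64_t>::max();

// Hypothetical per-thread record shared between a watched thread (writer)
// and the watcher (reader).
struct WatchedThreadState {
  explicit WatchedThreadState(std::string thread_name)
      : name(std::move(thread_name)) {}

  void SetDeadline(int64_t deadline_ms) { deadline.store(deadline_ms); }
  void ClearDeadline() { deadline.store(kNoDeadline); }

  const std::string name;
  std::atomic<int64_t> deadline{kNoDeadline};
};

class HangWatcherSketch {
 public:
  // Called by a thread that wants to be monitored.
  WatchedThreadState* Register(std::string thread_name) {
    assert(!thread_name.empty());
    std::lock_guard<std::mutex> lock(lock_);
    watched_.push_back(
        std::make_unique<WatchedThreadState>(std::move(thread_name)));
    return watched_.back().get();
  }

  // One monitoring pass: returns the names of threads whose deadline is set
  // and already in the past ("live expired deadline").
  std::vector<std::string> Monitor(int64_t now_ms) {
    std::lock_guard<std::mutex> lock(lock_);
    std::vector<std::string> hung;
    for (const auto& state : watched_) {
      const int64_t deadline = state->deadline.load();
      if (deadline != kNoDeadline && now_ms > deadline) {
        hung.push_back(state->name);
      }
    }
    return hung;
  }

 private:
  std::mutex lock_;
  std::vector<std::unique_ptr<WatchedThreadState>> watched_;
};
```

In this model the watched threads only write their own deadline, which matches the idea that monitored threads have no responsibility beyond marking scope entry and exit. All detection work happens in `Monitor()`.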

## Protections against non-actionable reports

### Ignoring normal long running code

There are cases where code is expected to take a long time to complete.
It's possible to keep such cases from triggering the detection of a hang.
Invoking `HangWatcher::InvalidateActiveExpectations()` from within a
scope ensures that no hangs are logged while execution is within it.

### Example:

```
void RunTask(Task task) {
  // In general, tasks shouldn't hang.
  WatchHangsInScope scope(base::Seconds(5));

  std::move(task.task).Run();  // Calls `TaskKnownToBeVeryLong`.
}

void TaskKnownToBeVeryLong() {
  // This particular function is known to take a long time. Never report it as a
  // hang.
  HangWatcher::InvalidateActiveExpectations();

  BlockWaitingForUserInput();
}
```
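
One way to picture the effect of invalidation is the standalone model below. It is a hedged sketch, not the real mechanism: `ExpectationState`, `ScopedExpectation`, `InvalidateActiveExpectationsSketch` and `ShouldReportHang` are hypothetical names, timestamps are plain milliseconds supplied by the caller, and the rule that a scope entered after the call is watched normally is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-thread expectation: a deadline plus an "ignore" flag.
struct ExpectationState {
  bool has_deadline = false;
  int64_t deadline_ms = 0;
  bool invalidated = false;
};

thread_local ExpectationState g_expectation;

// Hypothetical monitored scope that saves and restores the previous state.
class ScopedExpectation {
 public:
  ScopedExpectation(int64_t now_ms, int64_t timeout_ms)
      : saved_(g_expectation) {
    assert(timeout_ms > 0);
    g_expectation.has_deadline = true;
    g_expectation.deadline_ms = now_ms + timeout_ms;
    // Assumption: a freshly entered scope starts out watched.
    g_expectation.invalidated = false;
  }
  ~ScopedExpectation() { g_expectation = saved_; }

  ScopedExpectation(const ScopedExpectation&) = delete;
  ScopedExpectation& operator=(const ScopedExpectation&) = delete;

 private:
  const ExpectationState saved_;
};

// Marks the currently active expectation as not worth reporting.
void InvalidateActiveExpectationsSketch() {
  g_expectation.invalidated = true;
}

// What the watcher would ask during a monitoring pass.
bool ShouldReportHang(int64_t now_ms) {
  return g_expectation.has_deadline && !g_expectation.invalidated &&
         now_ms > g_expectation.deadline_ms;
}
```

In this model an expired 5s scope stops being reportable as soon as the invalidation call runs inside it, and a later, unrelated scope is reported normally if it expires.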

### Protections against wrongfully blaming code

TODO

### Ignoring system suspend

TODO