= ceccomp(1)
dbgbgtf <dudududumaxver@outlook.com>; RocketDev <ma2014119@outlook.com>
{VERSION}, {TAG_TIME}
:doctype: manpage
:docdatetime: {TAG_TIME}
:manmanual: Ceccomp Manual
:mansource: ceccomp {VERSION}
:imagesdir: images/
:stylesheet: boot-slate.css

== NAME

ceccomp - A tool to analyze seccomp filters

== SYNOPSIS

    usage: ceccomp <asm|disasm|emu|trace|probe|version|help> [FILE] [-q|--quiet]
                   [-f|--format FMT] [-a|--arch ARCH] [-p|--pid PID]
                   [-o|--output FILE] [-c|--color WHEN] ...

== CONCEPT

The kernel uses BPF filters to restrict which syscalls a process may make; a filter
is installed via the `seccomp` or `prctl` syscall. For example, below is a simple
filter that blocks the `execve` syscall, shown in hex format:

    1: 20 00 00 00 00 00 00 00     $A = $syscall_nr
    2: 15 00 00 01 3b 00 00 00     if ($A != execve) goto 4
    3: 06 00 00 00 00 00 00 00     return KILL
    4: 06 00 00 00 00 00 ff 7f     return ALLOW

The part presented in hex is what the kernel receives, and `ceccomp` disassembles
it back into human-readable text: the *lineno* on the left and the *statement* on
the right.
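
To make the hex column less opaque, here is a minimal Python sketch that splits
each 8-byte line into the `code`, `jt`, `jf` and `k` fields of the kernel's
`struct sock_filter`. It assumes a little-endian layout, as on x86_64; the
`decode` helper is a name made up for this illustration and is not part of
`ceccomp`.

```python
import struct

# struct sock_filter { __u16 code; __u8 jt; __u8 jf; __u32 k; }
# Assumption: little-endian layout, matching the x86_64 dump above.
SOCK_FILTER = struct.Struct("<HBBI")

def decode(hex_line):
    """Split one 8-byte hex line into its (code, jt, jf, k) fields."""
    return SOCK_FILTER.unpack(bytes.fromhex(hex_line))

rows = [
    "20 00 00 00 00 00 00 00",
    "15 00 00 01 3b 00 00 00",
    "06 00 00 00 00 00 00 00",
    "06 00 00 00 00 00 ff 7f",
]
decoded = [decode(row) for row in rows]
for lineno, (code, jt, jf, k) in enumerate(decoded, start=1):
    print(f"{lineno}: code=0x{code:02x} jt={jt} jf={jf} k=0x{k:08x}")
```

On line 2, `k` is `0x3b` (59, `execve` on x86_64) and `jf` is 1, which is why a
failed comparison skips one instruction and lands on line 4.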

IMPORTANT: From here on, I'll use _TEXT_ as shorthand for human-readable BPF text
and _RAW_ as shorthand for the raw BPF format; please keep that in mind.

== DESCRIPTION

`ceccomp` has 5 main functions. It is basically a C version of `seccomp-tools`;
however, there are some breaking changes you need to know about, which are
highlighted in each subcommand section.

=== asm - ASSEMBLE

    ceccomp asm [-c WHEN] [-a ARCH] [-f FMT] [TEXT]

Assemble _TEXT_ to _RAW_. Use it to embed hand-written filter rules into C code
or to see the raw code behind some _TEXT_.

WHEN::
Determines when to display warnings and errors in color. If the value is _auto_,
ceccomp will display color when the output target is a "tty". Can be _auto_, _never_ or
_always_. The default value is _auto_.

ARCH::
Set to any architecture libseccomp supports. It is used to determine
the actual syscall number behind a name (for example, on x86_64, you can write
`"execve"` instead of `59` as in the basic example above). If not set, your system
arch is taken via `uname`. The default value on your system is {ARCH}.

FMT::
Determines how `ceccomp` produces binary-format bpf code. Can be _hexfmt_,
_hexline_ or _raw_. You can find sample output in the <<EXAMPLES>> section.
The default value is _hexline_.

TEXT::
Takes an optional filename that determines which file containing _TEXT_ is
assembled. Reads from _stdin_ if not set.

Please check out the <<TEXT GRAMMAR REFERENCE>> section to see how to write a rule
by hand. Some examples are displayed in the <<EXAMPLES>> section.

|===
|Command|Difference

|`seccomp-tools asm`
|Uses its own grammar to assemble, which is a bit script-like

|`ceccomp asm`
|You can feed `disasm` output straight to `asm`, so there is no new grammar to learn;
takes `stdin` as input by default
|===

=== disasm - DISASSEMBLE

    ceccomp disasm [-c WHEN] [-a ARCH] [RAW]

Disassemble _RAW_ to _TEXT_. Use it to see what a filter does when you cannot
access the filter via `trace` and have to extract it manually.

WHEN::
Argument description can be found in <<asm - ASSEMBLE>> section. `disasm` may print
more text in color including syntax highlighting for _TEXT_.

ARCH::
Set to any architecture libseccomp supports. It is used to determine
how a filtered syscall number in the _RAW_ filter is translated to a syscall name (for
example, on x86_64, the number `0x3b` is translated to `execve` if it is compared with
syscall_nr; see the basic example above). The default value on your system is {ARCH}.

NOTE: ceccomp tries to resolve a syscall number under an arch ONLY IF the arch can be
determined at that line. For a foreign arch (not equal to the arch you set), the foreign
arch is prepended to the syscall name. You may notice that in some cases seccomp-tools is
able to resolve the name while ceccomp is not; that may be intended, as the arch is not
determined.

|===
|Command|Difference

|`seccomp-tools disasm`
|Disassembles in its own format; never checks whether the filter is valid

|`ceccomp disasm`
|Disassembles in ceccomp format, and takes `stdin` as input by default; checks arch strictly
and always displays the foreign arch name
|===

=== emu - EMULATE

    ceccomp emu [-c WHEN] [-a ARCH] [-q] [TEXT] SYSCALL_NAME/SYSCALL_NR [ARGS[0] ARGS[1] ... ARGS[5] PC]

Emulate what will happen if `syscall(SYSCALL_NR, ARGS[0], ARGS[1], ..., ARGS[5])`
is called from `PC` under the rules described in _TEXT_. Use it to see the result
without actually running the syscall in a program, or when you don't want to examine
the filter rules manually. This subcommand can also be used to examine a filter
automatically.

WHEN::
Argument description can be found in <<asm - ASSEMBLE>> section. `emu` may print
more text in color including syntax highlighting for _TEXT_ and skipped statements.

SYSCALL_NAME/SYSCALL_NR::
If you set *SYSCALL_NAME* (like `execve`), it is first translated to *SYSCALL_NR*
under *ARCH*. Otherwise, set *SYSCALL_NR* directly (like `59`). The nr is then
tested against the bpf filter to see the result of that syscall. This
argument is NOT optional.

ARGS[0-5] and PC::
Register values when calling syscall. For example,
on x86_64, these are equivalent to `rdi`, `rsi`, `rdx`, `r10`, `r8`, `r9` and
`rip`. Their default value is 0.

ARCH::
Argument description can be found in <<asm - ASSEMBLE>> section.

TEXT::
Takes an optional filename that determines which file containing the _TEXT_ rule
is tested. Reads from _stdin_ if not set.

|===
|Command|Difference

|`seccomp-tools emu`
|Takes _RAW_ as input

|`ceccomp emu`
|Takes _TEXT_ as input and reads from `stdin` by default; setting *PC* is
possible
|===
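
To give a feel for what emulation means, here is a toy Python interpreter that
runs the four-instruction filter from the CONCEPT section against one syscall.
It is only a sketch: it handles just the three opcodes that filter uses, works
on decoded `(code, jt, jf, k)` tuples rather than on _TEXT_, and assumes
x86_64. The `emulate` function is a hypothetical helper and does not reflect
how `ceccomp emu` is implemented.

```python
import struct

AUDIT_ARCH_X86_64 = 0xC000003E  # assumption: the filter targets x86_64

def emulate(prog, nr, args=(0,) * 6, pc=0, arch=AUDIT_ARCH_X86_64):
    """Run a list of (code, jt, jf, k) tuples against one syscall."""
    # struct seccomp_data: int nr; u32 arch; u64 instruction_pointer; u64 args[6]
    data = struct.pack("<iIQ6Q", nr, arch, pc, *args)
    a = 0  # accumulator register A
    i = 0  # index of the current instruction
    while i < len(prog):
        code, jt, jf, k = prog[i]
        if code == 0x20:    # BPF_LD | BPF_W | BPF_ABS: A = 32-bit word at offset k
            a = struct.unpack_from("<I", data, k)[0]
        elif code == 0x15:  # BPF_JMP | BPF_JEQ | BPF_K: skip jt or jf instructions
            i += jt if a == k else jf
        elif code == 0x06:  # BPF_RET | BPF_K: return the constant k
            return k
        else:
            raise NotImplementedError(f"opcode 0x{code:02x} is outside this sketch")
        i += 1
    raise ValueError("filter ended without a return")

FILTER = [
    (0x20, 0, 0, 0),           # $A = $syscall_nr
    (0x15, 0, 1, 59),          # if ($A != execve) goto 4
    (0x06, 0, 0, 0x00000000),  # return KILL
    (0x06, 0, 0, 0x7FFF0000),  # return ALLOW
]

print(hex(emulate(FILTER, 59)))  # execve on x86_64
print(hex(emulate(FILTER, 0)))   # read on x86_64
```

With this filter, syscall 59 reaches the `return KILL` line, while any other
number falls through to `return ALLOW`.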

=== trace - TRACE FILTER IN RUNTIME

    ceccomp trace [-c WHEN] [-o FILE] PROGRAM [program-args]
                  [-c WHEN] -p PID

The first form captures the filters *PROGRAM* loads at runtime by tracing it;
the second form extracts seccomp filters from *PID*. Once the filters are fetched,
they are printed in _TEXT_. You can only choose one of the two forms above.
Use this when running the program is the simplest way to fetch bpf filters,
or when a program with seccomp filters installed is waiting for input.

WHEN::
Argument description can be found in <<asm - ASSEMBLE>> section. `trace` may print
more text in color including syntax highlighting for _TEXT_.

FILE::
May be useful when *PROGRAM* produces quite a lot of output on _stderr_.
`ceccomp` lets the user close _stdin_ and _stdout_ to limit *PROGRAM*
input and output, so `ceccomp` uses _stderr_ to print messages while running
*PROGRAM*. Set *FILE* if you want _TEXT_ written to some other file.

PROGRAM::
Set to the program you want to run, and *program-args* are its
arguments just like running shell command `exec PROGRAM program-args`.

PID::
Set to the pid you want to inspect. *PID* conflicts with *PROGRAM*;
a single command can either run a program dynamically or examine a pid, not both.

NOTE: To extract filters from *PID*, `CAP_SYS_ADMIN` is needed and
`CAP_SYS_PTRACE` may also be needed. The easiest way to acquire them is
to call `ceccomp` with `sudo`.

NOTE: Multi-process tracing was introduced in _version 3.1_; when a tracee is
forking/resolving/exiting, an extra INFO message is printed. You can discard
it by running a command like `ceccomp trace -o $(tty) PROG 2>/dev/null`.

|===
|Command|Difference

|`seccomp-tools dump`
|Setting the output format is possible; each filter can be output to a different
file; kills *PROGRAM* once *LIMIT* filters have been loaded; wraps *PROGRAM*
in `sh -c`

|`ceccomp trace`
|All filters are output to a single file; never kills *PROGRAM*; *PROGRAM* is
launched directly, so `./` is not needed; explicitly prints when forking
|===

=== probe - TEST COMMON SYSCALLS INSTANTLY

    ceccomp probe [-c WHEN] [-o FILE] PROGRAM [program-args]

Run *PROGRAM* with *program-args* to capture the *FIRST* seccomp filter, and then
kill all children. Use it when a quick check against a program is needed
to detect potential seccomp rule issues.

All argument descriptions can be found in <<trace - TRACE FILTER IN RUNTIME>> section.

The output of this subcommand is the emulation result for common syscalls
like `execve`, `open` and so on. If the filter itself is not capable of
blocking syscalls, you can see that at a glance.

Typical output for this subcommand is shown below; a more detailed example
can be found in the <<EXAMPLES>> section.

    open      -> ALLOW
    read      -> ALLOW
    write     -> ALLOW
    execve    -> KILL
    execveat  -> KILL
    mmap      -> ALLOW
    mprotect  -> ALLOW
    openat    -> ALLOW
    sendfile  -> ALLOW
    ptrace    -> ERRNO(1)
    fork      -> ALLOW

NOTE: `seccomp-tools` does not have this subcommand.

== TEXT GRAMMAR REFERENCE

A valid _TEXT_ needs to contain only a *statement* like `$A = $arch`, but adding
an extra *lineno* may help you a lot. *lineno* starts from 1 and is always
in base 10.

BPF ops which are not described below are banned by the kernel.

=== Optional Wrapper

`ceccomp disasm` displays a lot of things, but most of them are optional
for asm.

    Line  CODE  JT   JF      K
    ---------------------------------
    0001: 0x06 0x00 0x00 0x7fff0000 return ALLOW
    ---------------------------------

Only the *statement*, `return ALLOW`, is needed.

NOTE: There are some slight differences between `ceccomp disasm` and
`seccomp-tools disasm`; below is a general example of the latter. Some
statements are also different, so don't pipe seccomp-tools output to ceccomp
blindly.

    line  CODE  JT   JF      K
    =================================
    0000: 0x06 0x00 0x00 0x7fff0000  return ALLOW

=== Assignment

`A` can be set to seccomp attributes directly, but `X` cannot be assigned
seccomp attributes directly due to a kernel limitation.

    $A = $arch
    $A = $syscall_nr

To assign one of the 64-bit fields to `A`, a `low_` or `high_` prefix is needed.

    $A = $low_pc
    $A = $high_pc
    $A = $low_args[0]
    $A = $high_args[0]
    ...
    $A = $low_args[5]
    $A = $high_args[5]
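
The prefixes exist because a BPF load reads one 32-bit word, while `pc` and
`args` are 64-bit fields of `struct seccomp_data`. As a rough Python sketch,
the attribute names above can be mapped to byte offsets in that struct as
follows. It assumes a little-endian target, where the low half comes first;
the `field_offsets` helper is hypothetical and only illustrates the layout.

```python
import struct

# struct seccomp_data { int nr; __u32 arch; __u64 instruction_pointer; __u64 args[6]; }
SECCOMP_DATA = struct.Struct("<iIQ6Q")

def field_offsets():
    """Map attribute names to byte offsets, assuming a little-endian target."""
    offsets = {"syscall_nr": 0, "arch": 4, "low_pc": 8, "high_pc": 12}
    for i in range(6):
        base = 16 + 8 * i                      # args[i] occupies one 64-bit slot
        offsets[f"low_args[{i}]"] = base       # low half first on little-endian
        offsets[f"high_args[{i}]"] = base + 4  # high half follows
    return offsets

OFFSETS = field_offsets()
print(OFFSETS["low_args[0]"], OFFSETS["high_args[5]"], SECCOMP_DATA.size)
```

On a big-endian target the two halves of each 64-bit field would swap places.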

A special attribute is `sizeof(struct seccomp_data)`, which can be assigned to
`A` or `X` directly.

    $A = $scmp_data_len
    $X = $scmp_data_len

Temporary memory slots are 32-bit; to access them, you can use a hex or dec index.
Both `A` and `X` are assignable. Immediate values assigned to `A` or `X` can be
written in any number format as long as the prefix, such as "0x" or "0b", implies
the correct base.

    $X = $mem[0]
    $A = $mem[0xf]
    $A = $mem[15] # both hex and dec index are OK
    $A = 0
    $X = 0x3b
    $A = 0b111
    $X = 0777

You can also assign `X` to `A` or the reverse. Assigning `X` or `A` to
temporary memory is fine as well.

    $A = $X
    $X = $A
    $mem[3] = $X
    $mem[0x4] = $A

=== Arithmetic Operations

Various operations can be applied to `A`.

    $A += 30
    $A -= 4
    $A *= 9
    $A /= 1
    $A &= 7
    $A >>= 6

The right-hand operand can be `X`.

    $A &= $X
    $A |= $X
    $A ^= $X
    $A <<= $X

And there is a way to negate `A`.

    $A = -$A
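
Keep in mind that `A` and `X` are 32-bit unsigned registers in classic BPF, so
results wrap around. The Python sketch below mimics that with a mask; the
helper names are made up for illustration and cover only three of the
operations.

```python
MASK32 = 0xFFFFFFFF  # classic BPF registers are 32 bits wide

def add(a, k):
    """$A += k with 32-bit wraparound."""
    return (a + k) & MASK32

def neg(a):
    """$A = -$A in 32-bit two's complement."""
    return (-a) & MASK32

def shl(a, x):
    """$A <<= x, dropping bits shifted past bit 31."""
    return (a << x) & MASK32

print(hex(neg(1)))            # 0xffffffff
print(add(0xFFFFFFFF, 30))    # 29
print(hex(shl(0x80000000, 1)))  # 0x0
```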

=== Jump Downwards If ...

Unconditional jump:

    goto 3

Jump if:

    if ($A == execve) goto 3
    if ($A != 1234) goto 4
    if ($A & $X) goto 5
    if !($A & 7) goto 6
    if ($A <= $X) goto 7

If true jump to ... if false jump to...:
    
    if ($A > $X) goto 3, else goto 4
    if ($A >= 4567) goto 5, else goto 6
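
In _RAW_, a conditional jump does not store the target *lineno*; it stores how
many instructions to skip, in the 8-bit `jt` and `jf` fields. A small Python
sketch of that conversion, using a hypothetical `relative_offset` helper that
is not taken from `ceccomp`:

```python
def relative_offset(lineno, target):
    """Convert `goto target` on line `lineno` into a jt/jf skip count."""
    offset = target - (lineno + 1)  # offsets count from the next instruction
    if offset < 0:
        raise ValueError("a seccomp filter can only jump downwards")
    if offset > 255:
        raise ValueError("jt/jf are 8-bit fields, so at most 255 lines can be skipped")
    return offset

# Line 2 of the CONCEPT example: if ($A != execve) goto 4
print(relative_offset(2, 4))  # 1, matching the jf byte 01 in that dump
```

An unconditional `goto` keeps its offset in the 32-bit `k` field instead, so
the 255 limit applies to conditional jumps only.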

ONLY in conditions can you replace a number with a syscall name. In the example
above, `0x3b` is replaced by `execve`. All syscall names are
resolved to syscall numbers under your selected arch. If you want to resolve
a syscall name in a foreign arch (not equal to your selected arch), please
prepend the arch and a dot. For example, if your arch is x86_64 and you are
writing _aarch64_ rules, write:

    if ($A == aarch64.read) goto 5

Note that if you manually set arch to _aarch64_ with `-a aarch64`,
you can omit `aarch64.` in statement.

=== Return Code

Return value of register `A`:

    return $A

Or return an immediate value, with an extra field in `()`. The actions
`TRACE`, `TRAP` and `ERRNO` accept an extra field; without `()`, they are
treated as `action(0)`:

    return KILL
    return KILL_PROCESS
    return TRAP(123)
    return ERRNO(0)
    return TRACE
    return TRACE(3)
    return LOG
    return NOTIFY
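
In the 32-bit value the kernel sees, the action sits in the upper bits and the
extra field in the low 16 bits. The Python sketch below shows that encoding
with the constants from `<linux/seccomp.h>`. It assumes that `KILL` means
`SECCOMP_RET_KILL_THREAD` (consistent with the all-zero `return KILL` in the
CONCEPT example) and that `NOTIFY` means `SECCOMP_RET_USER_NOTIF`; `encode` is
a hypothetical helper.

```python
# Action values from <linux/seccomp.h>
ACTIONS = {
    "KILL": 0x00000000,          # assumed to be SECCOMP_RET_KILL_THREAD
    "KILL_PROCESS": 0x80000000,
    "TRAP": 0x00030000,
    "ERRNO": 0x00050000,
    "NOTIFY": 0x7FC00000,        # assumed to be SECCOMP_RET_USER_NOTIF
    "TRACE": 0x7FF00000,
    "LOG": 0x7FFC0000,
    "ALLOW": 0x7FFF0000,
}
SECCOMP_RET_DATA = 0x0000FFFF    # low 16 bits carry the extra field

def encode(action, data=0):
    """Combine an action name and its optional extra field into one value."""
    return ACTIONS[action] | (data & SECCOMP_RET_DATA)

print(hex(encode("ERRNO", 1)))  # 0x50001
print(hex(encode("TRACE")))     # 0x7ff00000, same as TRACE(0)
print(hex(encode("ALLOW")))     # 0x7fff0000
```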

== EXAMPLES

ifdef::backend-manpage[]
The manpage cannot display images, so please check out the HTML version of
this page to see the examples.
endif::[]

ifndef::backend-manpage[]
=== asm example
image::asm.png[]
=== disasm example
image::disasm.png[]
=== emu example
image::emu.png[]
image::emu_quiet.png[]
=== trace example
Running program:

image::trace.png[]

If `-o FILE` is set:

image::output_trick.png[]

Pid mode:

image::trace_pid.png[]

Completion for pid mode is available under zsh:

image::trace_completion.png[]

=== probe example
image::probe.png[]
endif::[]

== REPO

Visit https://github.com/dbgbgtf1/Ceccomp to find the code.
Pull Requests and Issues are welcome!

Copyright (C) 2025-present, distributed under GPLv3 or later.