= ceccomp(1)
dbgbgtf <dudududumaxver@outlook.com>; RocketDev <ma2014119@outlook.com>
:doctype: manpage
:docdatetime: {TAG_TIME}
:manmanual: Ceccomp Manual
:mansource: ceccomp {VERSION}
:manversion: {VERSION}
:imagesdir: images/
:stylesheet: boot-slate.css

== NAME

ceccomp - A tool to analyze seccomp filters

== SYNOPSIS

    usage: ceccomp <asm|disasm|emu|trace|probe|version|help> [FILE] [-q|--quiet]
                   [-f|--format FMT] [-a|--arch ARCH] [-p|--pid PID] [-s|--seize]
                   [-o|--output FILE] [-c|--color WHEN] ...

== CONCEPT

The kernel uses BPF filters to restrict syscalls; a filter is installed via the
`seccomp` or `prctl` syscall. For example, below is a simple filter that blocks
the `execve` syscall, shown in hex format:

    1: 20 00 00 00 00 00 00 00     $A = $syscall_nr
    2: 15 00 00 01 3b 00 00 00     if ($A != execve) goto 4
    3: 06 00 00 00 00 00 00 00     return KILL
    4: 06 00 00 00 00 00 ff 7f     return ALLOW

The hex part is what the kernel receives; `ceccomp` disassembles it back into
human-readable text, shown here as the *lineno* on the left and the
*statement* on the right.

IMPORTANT: From here on, _TEXT_ is short for the human-readable BPF text and
_RAW_ is short for the raw BPF format; please keep that in mind.
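
To make the _RAW_ layout concrete, here is a minimal Python sketch (not part of
ceccomp) that splits the four 8-byte instructions above into the fields of the
kernel's `struct sock_filter`: a 16-bit `code`, 8-bit `jt` and `jf`, and a
32-bit `k`. It assumes a little-endian target such as x86_64, and the helper
name `decode` is made up for this illustration.

```python
import struct

# The four instructions from the example above, as the kernel receives them.
RAW = bytes.fromhex(
    "2000000000000000"  # $A = $syscall_nr
    "150000013b000000"  # if ($A != execve) goto 4
    "0600000000000000"  # return KILL
    "060000000000ff7f"  # return ALLOW
)

def decode(raw):
    """Split raw bytes into (code, jt, jf, k) tuples, assuming little-endian."""
    # struct sock_filter is u16 code, u8 jt, u8 jf, u32 k: 8 bytes in total.
    return [struct.unpack_from("<HBBI", raw, off) for off in range(0, len(raw), 8)]

for lineno, (code, jt, jf, k) in enumerate(decode(RAW), start=1):
    print(f"{lineno}: code=0x{code:02x} jt={jt} jf={jf} k=0x{k:08x}")
```

On line 2, `jf` is 1: when the comparison with `0x3b` fails, one instruction is
skipped, which is why the statement reads as a jump to line 4.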

== DESCRIPTION

`ceccomp` has 5 main functions. It is basically a C version of `seccomp-tools`,
but there are some breaking differences you need to know, which are
highlighted in each subcommand section.

=== asm - ASSEMBLE

    ceccomp asm [-c WHEN] [-a ARCH] [-f FMT] [TEXT]

Assemble _TEXT_ to _RAW_. Use it to embed hand-written filter rules into C code
or to see the raw code behind some _TEXT_.

WHEN::
Determines when to display warnings and errors in color. If the value is _auto_,
ceccomp will display color when the output target is a "tty". Can be _auto_, _never_ or
_always_. The default value is _auto_.

ARCH::
Set to any architecture libseccomp supports. It determines the actual syscall
number behind a name (for example, on x86_64, you can write
`"execve"` instead of `59` as in the basic example above). If not set, your
system arch, as reported by `uname`, is used. The default value on your system is {ARCH}.

NOTE: Since _version 4.0_, endianness is considered. If the endianness of the target
*ARCH* differs from the machine endianness, the CODE and K fields of each filter
are byte-reversed before output.

FMT::
Determines how `ceccomp` produces binary-format BPF code. Can be _hexfmt_,
_hexline_ or _raw_. You can find sample output in the <<EXAMPLES>> section.
The default value is _hexline_.

TEXT::
An optional filename selecting the file that contains the _TEXT_ to
assemble. Reads from _stdin_ if not set. `-` is treated as _stdin_.

IMPORTANT: The assembly syntax changed greatly in _version 4.0_;
please check out the grammar reference below!

Please check out the <<TEXT GRAMMAR REFERENCE>> section to see how to write a rule by
hand. Some examples are shown in the <<EXAMPLES>> section.

|===
|Command|Difference

|`seccomp-tools asm`
|Uses its own, somewhat script-like grammar to assemble; can assemble invalid _TEXT_
that the kernel will reject

|`ceccomp asm`
|`disasm` output can be fed straight to `asm`, so there is no new grammar to learn;
takes `stdin` as input by default
|===

=== disasm - DISASSEMBLE

    ceccomp disasm [-c WHEN] [-a ARCH] [RAW]

Disassemble _RAW_ to _TEXT_. Use it to see what a filter does when you cannot
access the filter via `trace` and have to extract it manually.

WHEN::
The argument description can be found in the <<asm - ASSEMBLE>> section. `disasm` may print
more text in color, including syntax highlighting for _TEXT_.

ARCH::
Set to any architecture libseccomp supports. It determines how a syscall
number in the _RAW_ filter is translated to a syscall name (for example,
on x86_64, the number `0x3b` is translated to `execve` when it is compared against
syscall_nr; see the basic example above). The default value on your system is {ARCH}.

RAW::
A binary file with raw BPF code. Takes _stdin_ as input if not set. `-` is treated as _stdin_.
The file is arch-dependent, so it may not be portable across archs.

NOTE: Since _version 4.0_, endianness is considered. If the endianness of the target
*ARCH* differs from the machine endianness, the CODE and K fields of each filter
are byte-reversed before decoding.
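
The byte reversal mentioned in the note can be pictured with a small Python
sketch. It is an illustration of the idea only, not ceccomp's implementation:
it assumes the input records are little-endian and re-encodes them as
big-endian, and the function name `swap_endianness` is hypothetical.

```python
import struct

def swap_endianness(raw):
    """Re-encode little-endian sock_filter records as big-endian."""
    out = bytearray()
    for off in range(0, len(raw), 8):
        code, jt, jf, k = struct.unpack_from("<HBBI", raw, off)
        # Only the multi-byte fields (CODE and K) change; JT and JF are single bytes.
        out += struct.pack(">HBBI", code, jt, jf, k)
    return bytes(out)

little = bytes.fromhex("150000013b000000")  # line 2 of the basic example
print(swap_endianness(little).hex())
```

`JT` and `JF` keep their positions because each is a single byte; only `CODE`
(2 bytes) and `K` (4 bytes) have a byte order to reverse.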

NOTE: ceccomp resolves a syscall number to a name ONLY IF the arch can be determined
at that line. For a foreign arch (one not equal to the arch you set), the arch name
is prepended to the syscall name. In some cases seccomp-tools can resolve a name while
ceccomp cannot; this may be intended, because the arch is not determined at that point.

|===
|Command|Difference

|`seccomp-tools disasm`
|Disassembles in its own format; never checks whether the filter is valid

|`ceccomp disasm`
|Disassembles in ceccomp format, and takes `stdin` as input by default; checks arch strictly
and always displays the foreign arch name
|===

=== emu - EMULATE

    ceccomp emu [-c WHEN] [-a ARCH] [-q] TEXT SYSCALL_NAME/SYSCALL_NR [ARGS[0] ARGS[1] ... ARGS[5] PC]

Emulate what would happen if `syscall(SYSCALL_NR, ARGS[0], ARGS[1], ..., ARGS[5])`
were called from `PC` under the rules described in _TEXT_. Use it to see the result
without actually running the syscall in a program, or when you don't want to examine
the filter rules manually. This subcommand can also be used to examine a filter automatically.

WHEN::
The argument description can be found in the <<asm - ASSEMBLE>> section. `emu` may print
more text in color, including syntax highlighting for _TEXT_ and skipped statements.

SYSCALL_NAME/SYSCALL_NR::
If you set *SYSCALL_NAME* (like `execve`), it is first translated to *SYSCALL_NR*
under *ARCH*. Otherwise, set *SYSCALL_NR* directly (like `59`). The nr is then
tested against the BPF filter to see the result of that syscall. This
argument is NOT optional.

ARGS[0-5] and PC::
Register values when calling syscall. For example,
on x86_64, these are equivalent to `rdi`, `rsi`, `rdx`, `r10`, `r8`, `r9` and
`rip`. Their default value is 0.

ARCH::
The argument description can be found in the <<asm - ASSEMBLE>> section.

TEXT::
The filename of the file containing the _TEXT_ rules to test.
Note that the filename CAN NOT be omitted, as ceccomp cannot tell whether a
positional argument is a syscall or a filename. Use `-` to refer to _stdin_.

-q, --quiet::
Only print the eval result of the filter. For example, if last statement emulated
is `return KILL`, then `KILL` is printed.
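
To show what emulation means, here is a minimal Python sketch of a classic-BPF
interpreter. It is not how `ceccomp emu` is implemented: for brevity it works on
the _RAW_ bytes of the basic example in the CONCEPT section rather than on
_TEXT_, it knows only the three opcodes that example uses, and it assumes
little-endian encoding. The names `emulate` and `ACTIONS` are hypothetical.

```python
import struct

# Action values taken from the filter in the CONCEPT section.
ACTIONS = {0x00000000: "KILL", 0x7FFF0000: "ALLOW"}

def emulate(raw, nr):
    """Run a tiny classic-BPF subset against syscall number nr."""
    insns = [struct.unpack_from("<HBBI", raw, off) for off in range(0, len(raw), 8)]
    a = 0   # the accumulator, $A
    pc = 0  # index of the instruction being executed
    while pc < len(insns):
        code, jt, jf, k = insns[pc]
        if code == 0x20 and k == 0:  # load the word at offset 0: $A = $syscall_nr
            a = nr
        elif code == 0x15:           # jump if equal: jt/jf are relative skips
            pc += jt if a == k else jf
        elif code == 0x06:           # return the immediate K as the action
            return ACTIONS.get(k, hex(k))
        else:
            raise ValueError(f"opcode 0x{code:02x} is outside this sketch")
        pc += 1
    raise ValueError("filter ended without a return")

RAW = bytes.fromhex("2000000000000000" "150000013b000000"
                    "0600000000000000" "060000000000ff7f")
print(emulate(RAW, 59))  # execve on x86_64
print(emulate(RAW, 0))   # read on x86_64
```

The value returned here corresponds to what `-q` prints: only the final action.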

|===
|Command|Difference

|`seccomp-tools emu`
|Takes _RAW_ as input

|`ceccomp emu`
|Takes _TEXT_ as input (use `-` for `stdin`); setting *PC* is
possible
|===

=== trace - TRACE FILTER IN RUNTIME

    ceccomp trace [-c WHEN] [-q] [-o FILE] PROGRAM [program-args]
                  [-c WHEN] [-q] -p PID [-s]

The first form traces *PROGRAM* and captures the filters it loads at runtime;
the second form extracts the seccomp filters from *PID*, or traces *PID* to capture
subsequent seccomp filters. Once filters are fetched, they are printed in _TEXT_.
You can only choose one of the two forms above. Use this when running the
program is the simplest way to fetch BPF filters, or when a program with seccomp
filters installed is waiting for input.

WHEN::
The argument description can be found in the <<asm - ASSEMBLE>> section. `trace` may print
more text in color, including syntax highlighting for _TEXT_.

FILE::
May be useful when *PROGRAM* produces quite a lot of output on _stderr_.
`ceccomp` lets the user close _stdin_ and _stdout_ to limit *PROGRAM*
input and output, so `ceccomp` prints its messages to _stderr_ while running *PROGRAM*.
Set *FILE* if you want the _TEXT_ written to some other file. `-` is treated as _stdout_.

PROGRAM::
Set to the program you want to run; *program-args* are its
arguments, just like running the shell command `exec PROGRAM program-args`.

PID::
Set to the pid you want to inspect. *PID* conflicts with *PROGRAM*:
in one command you can either run a program dynamically or examine a pid.
Without the `-s` flag, trace pid mode tries to extract the seccomp filter from *PID* via
`ptrace(PTRACE_SECCOMP_GET_FILTER)`, which may not be available on some systems.

-s, --seize::
*ONLY AVAILABLE FOR TRACE PID MODE.* Setting this flag overrides the trace pid
behavior: ceccomp attaches to *PID* and keeps tracing for seccomp filter loading, like
trace prog mode. _This flag was introduced in version 4.0._

-q, --quiet::
Set to suppress the extra *[INFO]* prints when process forking, exiting or
seccomp filter loading is detected. _This flag was introduced in version 4.0._

NOTE: To extract filters from *PID*, `CAP_SYS_ADMIN` is needed (without the `-s` flag)
and `CAP_SYS_PTRACE` may also be needed; the easiest way to acquire
them is to run `ceccomp` with `sudo`.

NOTE: Since _version 3.1_, multiple processes are traced, and an extra INFO
message is printed whenever a tracee is forking/resolving/exiting. You can discard
it by running a command like `ceccomp trace -q PROG 2>/dev/null`.

|===
|Command|Difference

|`seccomp-tools dump`
|The output format can be set; each filter can be written to a different
file; kills *PROGRAM* once *LIMIT* filters have been loaded; wraps *PROGRAM*
in `sh -c`

|`ceccomp trace`
|All filters are written to a single file; never kills *PROGRAM*; *PROGRAM* is
launched directly, so `./` is not needed; explicitly reports forking;
can attach to a pid to capture seccomp filters dynamically
|===

=== probe - TEST COMMON SYSCALLS INSTANTLY

    ceccomp probe [-c WHEN] [-o FILE] [-q] PROGRAM [program-args]

Run *PROGRAM* with *program-args* to capture the *FIRST* seccomp filter, and then
kill all children. Use it when a quick check of a program is needed,
to detect potential seccomp rule issues.

All argument descriptions can be found in the <<trace - TRACE FILTER IN RUNTIME>> section.

The output of this subcommand is the emulation result for common syscalls
like `execve`, `open` and so on. If the filter fails to block these
syscalls, you can tell at a glance.

A typical output of this subcommand is shown below; a more detailed example
can be found in the <<EXAMPLES>> section.

    open      -> ALLOW
    read      -> ALLOW
    write     -> ALLOW
    execve    -> KILL
    execveat  -> KILL
    mmap      -> ALLOW
    mprotect  -> ALLOW
    openat    -> ALLOW
    sendfile  -> ALLOW
    ptrace    -> ERRNO(1)
    fork      -> ALLOW

NOTE: `seccomp-tools` does not have this subcommand.
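
If you want to post-process this output in a script, a minimal Python sketch
could look like the following. It assumes every result line has exactly the
`name -> ACTION` shape of the sample above and that color is disabled (for
instance with `-c never`); the function name `parse_probe` is hypothetical and
the real output may contain additional lines.

```python
SAMPLE = """\
open      -> ALLOW
execve    -> KILL
ptrace    -> ERRNO(1)
"""

def parse_probe(output):
    """Map each syscall name to the action string printed after '->'."""
    results = {}
    for line in output.splitlines():
        name, sep, action = line.partition("->")
        if sep:  # skip lines that are not of the form 'name -> ACTION'
            results[name.strip()] = action.strip()
    return results

# Keep only the syscalls the filter does not simply allow.
blocked = {name: act for name, act in parse_probe(SAMPLE).items() if act != "ALLOW"}
print(blocked)
```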

== TEXT GRAMMAR REFERENCE

IMPORTANT: The grammar changed greatly in _version 4.0_, as we
refactored the lexer for better human readability. The wrapper lines are now
prefixed by `#`, since they are comments. And _line_ is replaced by _label_, so
the lexer now relies on label declarations, instead of line numbers in the
_TEXT_ file, to decide where to jump.

A valid _TEXT_ format is described in an EBNF-like declaration here:
https://github.com/dbgbgtf1/Ceccomp/issues/17#issuecomment-3610531705.
If you have no interest in EBNF, please keep reading for
examples.

BPF ops that are not described below are banned by the kernel.

=== Comment and Label

`ceccomp disasm` displays a lot of things, but some of them are optional
for asm.

    #Label  CODE  JT   JF      K
    #---------------------------------
     L0001: 0x06 0x00 0x00 0x7fff0000 return ALLOW
    #---------------------------------

Any text after `#` is discarded by asm, as in many scripting languages.

Empty lines are accepted.

A label declaration is an identifier at the beginning of a line, suffixed
by `:`, like `L0001:`. An identifier is a string that starts with a letter and
contains only alphanumeric characters and underscores `_`. A label is only necessary
if it is the destination of a `goto`; the redundant labels added by disasm are there for
readability. E.g. in `if ($A == 0) goto somewhere`, `somewhere` is a label and
must be declared after the statement. A label declaration can stand on a line of its own
or be put in front of a statement.
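
The identifier rule can be written as a regular expression. The Python sketch
below is one reading of the rule, not ceccomp's lexer: it assumes that "alpha"
means an ASCII letter, and the names `IDENT` and `is_label_decl` are made up
for the illustration.

```python
import re

# Assumption: an identifier starts with an ASCII letter, followed by
# letters, digits or underscores. The real lexer may differ in details.
IDENT = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

def is_label_decl(token):
    """True if token looks like a label declaration such as 'L0001:'."""
    return token.endswith(":") and IDENT.fullmatch(token[:-1]) is not None

for token in ["L0001:", "forbid:", "0001:", "bad-name:", "L0001"]:
    print(token, is_label_decl(token))
```

Under this reading, `0001:` is not a label declaration because it starts with
a digit, and `L0001` without the trailing `:` is only a reference to a label.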

The `CODE`, `JT`, `JF` and `K` values generated by disasm are discarded by asm;
asm only parses the effective statement after `K`.

NOTE: There are some slight differences between `ceccomp disasm` and
`seccomp-tools disasm`; a general example of seccomp-tools output is shown
below. Some statements are also different, so don't pipe seccomp-tools output
to ceccomp blindly.

    line  CODE  JT   JF      K
    =================================
    0000: 0x06 0x00 0x00 0x7fff0000  return ALLOW

=== Assignment

`A` can be set to seccomp attributes directly, but `X` cannot, due to a
kernel limitation.

    $A = $arch
    $A = $syscall_nr

To load one of the 64-bit fields into `A`, a `low_` or `high_` prefix is needed.

    $A = $low_pc
    $A = $high_pc
    $A = $low_args[0]
    $A = $high_args[0]
    ...
    $A = $low_args[5]
    $A = $high_args[5]
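
The prefixes exist because BPF loads 32-bit words while `pc` and `args` are
64-bit fields of the kernel's `struct seccomp_data`. The Python sketch below
shows how one 64-bit value splits into two words and where those words sit on a
little-endian arch. That `$low_args[0]` and `$high_args[0]` map to exactly these
two words is an assumption drawn from the names; `split64` and `word_offsets`
are hypothetical helpers.

```python
import struct

# struct seccomp_data: int nr; __u32 arch; __u64 instruction_pointer; __u64 args[6];
LAYOUT = struct.Struct("<iIQ6Q")  # little-endian, as on x86_64

def split64(value):
    """Return the (low, high) 32-bit halves of a 64-bit field."""
    return value & 0xFFFFFFFF, value >> 32

def word_offsets(field_offset):
    """Byte offsets of the low and high words on a little-endian arch."""
    return field_offset, field_offset + 4

ARGS0_OFFSET = 16  # nr (4 bytes) + arch (4 bytes) + instruction_pointer (8 bytes)
print(LAYOUT.size)
print([hex(half) for half in split64(0x1122334455667788)])
print(word_offsets(ARGS0_OFFSET))
```

A filter that wants to compare a full 64-bit argument therefore needs two
loads and two comparisons, one per half.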

A special attribute is `sizeof(struct seccomp_data)`, that can be assigned to
`A` or `X` directly.

    $A = $scmp_data_len
    $X = $scmp_data_len

Temporary memory slots are 32-bit; to access them, use a hex or decimal index.
Both `A` and `X` can be assigned from them. Immediate values assigned to `A` or `X`
can be written in any number format, as long as you indicate the base with a prefix such as "0x" or "0b".

    $X = $mem[0]
    $A = $mem[0xf]
    $A = $mem[15] # both hex and dec index are OK
    $A = 0
    $X = 0x3b
    $A = 0b1111
    $A = 0333

You can also assign `X` to `A`, or the other way round. Assigning `X` or `A` to
temporary memory is fine as well.

    $A = $X
    $X = $A
    $mem[3] = $X
    $mem[0x4] = $A

=== Arithmetic Operations

Various operations can be applied to `A`.

    $A += 30
    $A -= 4
    $A *= 9
    $A /= 1
    $A &= 7
    $A >>= 6

The right-hand operand can also be `X`.

    $A &= $X
    $A |= $X
    $A ^= $X
    $A <<= $X

And there is a way to negate `A`.

    $A = -$A
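
Because `A` and `X` are 32-bit unsigned registers in classic BPF, every result
wraps around modulo 2^32. The Python sketch below mimics that behavior. It is an
illustration only: it ignores edge cases such as division by zero or oversized
shift counts, and the function name `alu` is hypothetical.

```python
MASK = 0xFFFFFFFF  # $A and $X hold 32-bit unsigned values

def alu(op, a, operand=0):
    """Apply one arithmetic statement to $A and return the new 32-bit value."""
    ops = {
        "+=": lambda: a + operand,
        "-=": lambda: a - operand,
        "*=": lambda: a * operand,
        "/=": lambda: a // operand,
        "&=": lambda: a & operand,
        "|=": lambda: a | operand,
        "^=": lambda: a ^ operand,
        "<<=": lambda: a << operand,
        ">>=": lambda: a >> operand,
        "neg": lambda: -a,  # $A = -$A
    }
    return ops[op]() & MASK  # truncate to 32 bits, as the register would

print(hex(alu("-=", 4, 30)))           # subtraction wraps below zero
print(hex(alu("neg", 1)))              # negation is two's complement
print(hex(alu("<<=", 0x80000001, 1)))  # the top bit is shifted out
```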

=== Jump Downwards If ...

Unconditional jump:

    goto L3

Jump if:

    if ($A == execve) goto L3
    if ($A != 1234) goto L4
    if ($A & $X) goto L5
    if !($A & 7) goto L6
    if ($A <= $X) goto L7

Jump to one label if true, and to another if false:

    if ($A > $X) goto L3, else goto L4
    if ($A >= 4567) goto L5, else goto L6

ONLY in conditions, you CAN replace a number with a syscall name or arch name.
In the example above, `0x3b` is replaced by `execve`. All syscall names are
resolved to syscall numbers under your selected arch. If you want to resolve
a syscall name under a foreign arch (one not equal to your selected arch), please
prepend the arch and a dot. For example, if your arch is x86_64 and you are writing
_aarch64_ rules, write:

    if ($A == aarch64.read) goto L5

Note that if you manually set the arch to _aarch64_ with `-a aarch64`,
you can omit `aarch64.` in the statement.
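
Jumps only go downwards because the `JT` and `JF` fields of a BPF instruction
are unsigned 8-bit counts of instructions to skip. The Python sketch below shows
how labels could be turned into such counts. It is a simplified model, not
ceccomp's assembler: each statement is reduced to a hypothetical
`(label, true_target, false_target)` tuple, and `resolve` is a made-up name.

```python
def resolve(statements):
    """Turn (label, true_target, false_target) rows into (jt, jf) offsets.

    A target of None means "fall through to the next statement".
    """
    index = {label: i for i, (label, _, _) in enumerate(statements) if label}
    offsets = []
    for i, (_, on_true, on_false) in enumerate(statements):
        pair = []
        for target in (on_true, on_false):
            if target is None:
                pair.append(0)
                continue
            skip = index[target] - i - 1  # instructions skipped by the jump
            if not 0 <= skip <= 255:      # jt/jf are unsigned 8-bit fields
                raise ValueError(f"cannot jump from statement {i} to {target}")
            pair.append(skip)
        offsets.append(tuple(pair))
    return offsets

PROGRAM = [
    (None, None, None),      # $A = $syscall_nr
    (None, "forbid", None),  # if ($A == execve) goto forbid
    (None, "forbid", None),  # if ($A == execveat) goto forbid
    (None, None, None),      # return ALLOW
    ("forbid", None, None),  # forbid: return KILL
]
print(resolve(PROGRAM))
```

A label declared before the statement that jumps to it would give a negative
count, which cannot be encoded; the sketch rejects it with a `ValueError`.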

=== Return Code

Return the value of register `A`:

    return $A

Or return an immediate value, with an extra field in `()`. The actions
`TRACE`, `TRAP` and `ERRNO` accept an extra field; without `()`, they are
treated as `action(0)`.

    return 0x13371337
    return KILL
    return KILL_PROCESS
    return TRAP(123)
    return ERRNO(0)
    return TRACE
    return TRACE(3)
    return LOG
    return NOTIFY
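
In the kernel, a return value is a 32-bit number whose upper bits select the
action and whose low 16 bits carry the extra field. The Python sketch below
builds such values from the constants in `<linux/seccomp.h>`. Mapping the
ceccomp names `KILL` and `NOTIFY` to `SECCOMP_RET_KILL_THREAD` and
`SECCOMP_RET_USER_NOTIF` is an assumption, and `encode_return` is a hypothetical
helper.

```python
# Action constants from <linux/seccomp.h>.
ACTIONS = {
    "KILL_PROCESS": 0x80000000,
    "KILL":         0x00000000,  # assumed to be SECCOMP_RET_KILL_THREAD
    "TRAP":         0x00030000,
    "ERRNO":        0x00050000,
    "NOTIFY":       0x7FC00000,  # assumed to be SECCOMP_RET_USER_NOTIF
    "TRACE":        0x7FF00000,
    "LOG":          0x7FFC0000,
    "ALLOW":        0x7FFF0000,
}
DATA_MASK = 0x0000FFFF  # SECCOMP_RET_DATA: the low 16 bits

def encode_return(action, data=0):
    """Combine an action with its optional 16-bit extra field."""
    return ACTIONS[action] | (data & DATA_MASK)

print(hex(encode_return("ERRNO", 1)))
print(hex(encode_return("TRAP", 123)))
print(hex(encode_return("ALLOW")))
```

The `0x7fff0000` printed for `ALLOW` is the same `K` value that appears in the
disassembly examples earlier in this page.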

=== Short Example

The following _TEXT_ is valid input for asm; it blocks `execve` and `execveat`
for amd64 syscalls:

    $A = $syscall_nr
    if ($A == execve) goto forbid
    if ($A == execveat) goto forbid
    return ALLOW
    forbid: return KILL

== RESTRICTIONS

Ceccomp asm puts some restrictions on _TEXT_ for better performance.

1. `'\0'` must not be found in _TEXT_ since it's a text file.
2. A line must be shorter than 384 *bytes*.
3. A _TEXT_ file must have fewer than 4096 lines.
4. A _TEXT_ file must be smaller than 1 MiB.

And for both asm and disasm, the number of effective statements (those that can be
encoded into or decoded from BPF) must be less than or equal to 1024; this is enforced by the kernel.
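
If you generate _TEXT_ with a script, you may want to check the four text-level
restrictions listed above before handing the file to asm. The Python sketch
below is one way to do that. Whether the line length counts the trailing
newline is not stated above, so this sketch assumes it does not; `check_text`
is a hypothetical helper, not part of ceccomp.

```python
MAX_LINE_BYTES = 384
MAX_LINES = 4096
MAX_FILE_BYTES = 1 << 20  # 1 MiB

def check_text(data):
    """Return a list of violated restrictions for the bytes of a TEXT file."""
    problems = []
    if b"\0" in data:
        problems.append("contains a NUL byte")
    lines = data.splitlines()  # line terminators are not counted (assumption)
    if any(len(line) >= MAX_LINE_BYTES for line in lines):
        problems.append("a line is 384 bytes or longer")
    if len(lines) >= MAX_LINES:
        problems.append("4096 lines or more")
    if len(data) >= MAX_FILE_BYTES:
        problems.append("1 MiB or larger")
    return problems

print(check_text(b"$A = $syscall_nr\nreturn ALLOW\n"))
print(check_text(b"return ALLOW\0"))
```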

A fun fact about ceccomp asm: any basic ANSI color sequence in a _TEXT_ file,
e.g., `\x1b[31m`, is discarded during processing.

== EXAMPLES

ifdef::backend-manpage[]
The manpage cannot display images, so please check out the HTML version of
this page to see the examples.
endif::[]

ifndef::backend-manpage[]
=== asm example
image::asm.webp[]
=== disasm example
image::disasm.webp[]
=== emu example
image::emu.webp[]
=== trace example
Running program:

image::trace.webp[]

With `-o FILE` set:

image::output_trick.webp[]

Trace pid mode:

image::trace_pid.webp[]

Seize pid mode:

image::trace_seize.webp[]

Completion for pid mode is available under zsh:

image::trace_completion.webp[]

=== probe example
image::probe.webp[]
endif::[]

== REPO

Visit https://github.com/dbgbgtf1/Ceccomp to find the code.
Pull Requests and Issues are welcome!

Copyright (C) 2025-present, distributed under GPLv3 or later.