File: README.md

package info (click to toggle)
bubblewrap 0.3.3-2
  • links: PTS, VCS
  • area: main
  • in suites: bullseye, sid
  • size: 848 kB
  • sloc: ansic: 3,090; sh: 2,258; xml: 353; perl: 100; makefile: 66; python: 28
file content (180 lines) | stat: -rw-r--r-- 8,350 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
Bubblewrap
==========

Many container runtime tools like `systemd-nspawn`, `docker`,
etc. focus on providing infrastructure for system administrators and
orchestration tools (e.g. Kubernetes) to run containers.

These tools are not suitable to give to unprivileged users, because it
is trivial to turn such access into to a fully privileged root shell
on the host.

User namespaces
---------------

There is an effort in the Linux kernel called
[user namespaces](https://www.google.com/search?q=user+namespaces+site%3Ahttps%3A%2F%2Flwn.net)
which attempts to allow unprivileged users to use container features.
While significant progress has been made, there are
[still concerns](https://lwn.net/Articles/673597/) about it, and
it is not available to unprivileged users in several production distributions
such as CentOS/Red Hat Enterprise Linux 7, Debian Jessie, etc.

See for example
[CVE-2016-3135](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2016-3135)
which is a local root vulnerability introduced by userns.
[This March 2016 post](https://lkml.org/lkml/2016/3/9/555) has some
more discussion.

Bubblewrap could be viewed as setuid implementation of a *subset* of
user namespaces.  Emphasis on subset - specifically relevant to the
above CVE, bubblewrap does not allow control over iptables.

The original bubblewrap code existed before user namespaces - it inherits code from
[xdg-app helper](https://cgit.freedesktop.org/xdg-app/xdg-app/tree/common/xdg-app-helper.c)
which in turn distantly derives from
[linux-user-chroot](https://git.gnome.org/browse/linux-user-chroot).

Security
--------

The maintainers of this tool believe that it does not, even when used
in combination with typical software installed on that distribution,
allow privilege escalation.  It may increase the ability of a logged
in user to perform denial of service attacks, however.

In particular, bubblewrap uses `PR_SET_NO_NEW_PRIVS` to turn off
setuid binaries, which is the [traditional way](https://en.wikipedia.org/wiki/Chroot#Limitations) to get out of things
like chroots.

Users
-----

This program can be shared by all container tools which perform
non-root operation, such as:

 - [Flatpak](http://www.flatpak.org)
 - [rpm-ostree unprivileged](https://github.com/projectatomic/rpm-ostree/pull/209)
 - [bwrap-oci](https://github.com/projectatomic/bwrap-oci)

We would also like to see this be available in Kubernetes/OpenShift
clusters.  Having the ability for unprivileged users to use container
features would make it significantly easier to do interactive
debugging scenarios and the like.

Usage
-----

bubblewrap works by creating a new, completely empty, mount
namespace where the root is on a tmpfs that is invisible from the
host, and will be automatically cleaned up when the last process
exits. You can then use commandline options to construct the root
filesystem and process environment and command to run in the
namespace.

There's a larger [demo script](./demos/bubblewrap-shell.sh) in the
source code, but here's a trimmed down version which runs
a new shell reusing the host's `/usr`.

```
bwrap --ro-bind /usr /usr --symlink usr/lib64 /lib64 --proc /proc --dev /dev --unshare-pid bash
```

This is an incomplete example, but useful for purposes of
illustration.  More often, rather than creating a container using the
host's filesystem tree, you want to target a chroot.  There, rather
than creating the symlink `lib64 -> usr/lib64` in the tmpfs, you might
have already created it in the target rootfs.

Sandboxing
----------

The goal of bubblewrap is to run an application in a sandbox, where it
has restricted access to parts of the operating system or user data
such as the home directory.

bubblewrap always creates a new mount namespace, and the user can specify
exactly what parts of the filesystem should be visible in the sandbox.
Any such directories you specify mounted `nodev` by default, and can be made readonly.

Additionally you can use these kernel features:

User namespaces ([CLONE_NEWUSER](http://linux.die.net/man/2/clone)): This hides all but the current uid and gid from the
sandbox. You can also change what the value of uid/gid should be in the sandbox.

IPC namespaces ([CLONE_NEWIPC](http://linux.die.net/man/2/clone)): The sandbox will get its own copy of all the
different forms of IPCs, like SysV shared memory and semaphores.

PID namespaces ([CLONE_NEWPID](http://linux.die.net/man/2/clone)): The sandbox will not see any processes outside the sandbox. Additionally, bubblewrap will run a trivial pid1 inside your container to handle the requirements of reaping children in the sandbox. This avoids what is known now as the [Docker pid 1 problem](https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem/).


Network namespaces ([CLONE_NEWNET](http://linux.die.net/man/2/clone)): The sandbox will not see the network. Instead it will have its own network namespace with only a loopback device.

UTS namespace ([CLONE_NEWUTS](http://linux.die.net/man/2/clone)): The sandbox will have its own hostname.

Seccomp filters: You can pass in seccomp filters that limit which syscalls can be done in the sandbox. For more information, see [Seccomp](https://en.wikipedia.org/wiki/Seccomp).

Related project comparison: Firejail
------------------------------------

[Firejail](https://github.com/netblue30/firejail/tree/master/src/firejail)
is similar to Flatpak before bubblewrap was split out in that it combines
a setuid tool with a lot of desktop-specific sandboxing features.  For
example, Firejail knows about Pulseaudio, whereas bubblewrap does not.

The bubblewrap authors believe it's much easier to audit a small
setuid program, and keep features such as Pulseaudio filtering as an
unprivileged process, as now occurs in Flatpak.

Also, @cgwalters thinks trying to
[whitelist file paths](https://github.com/netblue30/firejail/blob/37a5a3545ef6d8d03dad8bbd888f53e13274c9e5/src/firejail/fs_whitelist.c#L176)
is a bad idea given the myriad ways users have to manipulate paths,
and the myriad ways in which system administrators may configure a
system.  The bubblewrap approach is to only retain a few specific
Linux capabilities such as `CAP_SYS_ADMIN`, but to always access the
filesystem as the invoking uid.  This entirely closes
[TOCTTOU attacks](https://cwe.mitre.org/data/definitions/367.html) and
such.

Related project comparison: Sandstorm.io
----------------------------------------

[Sandstorm.io](https://sandstorm.io/) requires unprivileged user
namespaces to set up its sandbox, though it could easily be adapted
to operate in a setuid mode as well. @cgwalters believes their code is
fairly good, but it could still make sense to unify on bubblewrap.
However, @kentonv (of Sandstorm) feels that while this makes sense
in principle, the switching cost outweighs the practical benefits for
now. This decision could be re-evaluated in the future, but it is not
being actively pursued today.

Related project comparison: runc/binctr
----------------------------------------

[runC](https://github.com/opencontainers/runc) is currently working on
supporting [rootless containers](https://github.com/opencontainers/runc/pull/774),
without needing `setuid` or any other privileges during installation of
runC (using unprivileged user namespaces rather than `setuid`),
creation, and management of containers. However, the standard mode of
using runC is similar to [systemd nspawn](https://www.freedesktop.org/software/systemd/man/systemd-nspawn.html)
in that it is tooling intended to be invoked by root.

The bubblewrap authors believe that runc and systemd-nspawn are not
designed to be made setuid, and are distant from supporting such a mode.
However with rootless containers, runC will be able to fulfill certain usecases
that bubblewrap supports (with the added benefit of being a standardised and
complete OCI runtime).

[binctr](https://github.com/jfrazelle/binctr) is just a wrapper for
runC, so inherits all of its design tradeoffs.

What's with the name?!
----------------------

The name bubblewrap was chosen to convey that this
tool runs as the parent of the application (so wraps it in some sense) and creates
a protective layer (the sandbox) around it.

![](bubblewrap.jpg)

(Bubblewrap cat by [dancing_stupidity](https://www.flickr.com/photos/27549668@N03/))