Linux kernel patch from the Openwall Project
----------------------------------------------
==========
Overview
==========
This patch is a collection of security-related features for the Linux
kernel, all configurable via the new 'Security options' configuration
section. In addition to the new features, some versions of the patch
contain various security fixes. The number of such fixes changes from
version to version, as some become obsolete (for example, when the same
problem gets fixed in a new kernel release), while other security issues
are discovered.
Non-executable user stack area
--------------------------------
Most buffer overflow exploits are based on overwriting a function's return
address on the stack to point to some arbitrary code, which is also put
onto the stack. If the stack area is non-executable, buffer overflow
vulnerabilities become harder to exploit.
Another way to exploit a buffer overflow is to point the return address to
a function in libc, usually system(). This patch also changes the default
address that shared libraries are mmap()'ed at to make it always contain a
zero byte. In many exploits that have to do with ASCIIZ strings, the copy
stops at that byte, which makes it impossible to specify any more data
(parameters to the function, or more copies of the return address when
filling with a pattern).
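To illustrate the zero byte argument, here is a minimal C sketch (not part
of the patch). The helper names and both addresses are made up for the
example; the only claim is that a C string copy stops at the first zero
byte, so nothing placed after such an address gets copied.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: store a 32-bit address in little-endian (x86)
 * byte order, the way it would appear inside an overflow payload. */
void put_addr_le(unsigned char *out, unsigned long addr)
{
    int i;

    for (i = 0; i < 4; i++)
        out[i] = (unsigned char)((addr >> (8 * i)) & 0xff);
}

/* Hypothetical helper: number of payload bytes that an ASCIIZ copy such
 * as strcpy() would transfer, i.e. everything before the first zero. */
size_t asciiz_copied(const unsigned char *payload, size_t len)
{
    const unsigned char *nul = memchr(payload, 0, len);

    return nul ? (size_t)(nul - payload) : len;
}
```

With the made-up address 0x00123456 stored after 4 filler bytes in a 12 byte
payload, asciiz_copied() reports 7: the copy ends at the address's top byte
and the 4 bytes that follow are lost. With 0x40123456 it reports all 12.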
However, note that this patch is by no means a complete solution; it just
adds an extra layer of security. Many buffer overflow vulnerabilities
will remain exploitable in a more complicated way, and some will even remain
unaffected by the patch. The reason for using such a patch is to protect
against some of the buffer overflow vulnerabilities that are yet unknown.
Also, note that some buffer overflows can be used for denial of service
attacks (usually in non-respawning daemons and network clients). A patch
like this cannot do anything against that.
It is important that you fix vulnerabilities as soon as they become known,
even if you're using the patch. The same applies to other features of the
patch (discussed below) and their corresponding vulnerabilities.
Restricted links in /tmp
--------------------------
I've also added a link-in-+t restriction, originally for Linux 2.0 only,
by Andrew Tridgell. I've updated it to prevent the use of a hard link in
an attack instead, by not allowing regular users to create hard links to
files they don't own, unless they could read and write the file (due to
group permissions). This is usually the desired behavior anyway, since
otherwise users couldn't remove such links they've just created in a +t
directory (unfortunately, this is still possible for group-writable files)
and because of disk quotas.
Unfortunately, this may break existing applications.
Restricted FIFOs in /tmp
--------------------------
In addition to restricting links, you might also want to restrict writes
into untrusted FIFOs (named pipes), to make data spoofing attacks harder.
Enabling this option disallows writing into FIFOs not owned by the user in
+t directories, unless the owner is the same as that of the directory or
the FIFO is opened without the O_CREAT flag.
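The rule above can be written down as a small predicate. This is only a
userspace reading of the wording in this README, not the kernel code; the
function name and parameters are hypothetical.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>

#ifndef S_ISVTX
#define S_ISVTX 01000   /* the +t (sticky) bit, in case it is not exposed */
#endif

/* Returns 1 if the open would be allowed under the rule as described in
 * the text, 0 if it would be refused. */
int fifo_open_allowed(mode_t dir_mode, uid_t dir_uid, uid_t fifo_uid,
                      uid_t user, int open_flags)
{
    if (!(dir_mode & S_ISVTX))
        return 1;               /* not a +t directory: no restriction */
    if (fifo_uid == user)
        return 1;               /* the user's own FIFO */
    if (fifo_uid == dir_uid)
        return 1;               /* FIFO owned by the directory owner */
    if (!(open_flags & O_CREAT))
        return 1;               /* opened without O_CREAT */
    return 0;
}
```

For example, with a root-owned directory of mode 01777, a FIFO owned by UID
1000 and a caller with UID 1001, an open with O_WRONLY | O_CREAT is refused,
while the same open without O_CREAT is allowed.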
Restricted /proc
------------------
This was originally a patch by route that only changed the permissions on
some directories in /proc, so you had to be root to access them. Then
there were similar patches by others. I found them all quite unusable for
my purposes, on a system where I wanted several admins to be able to see
all the processes, etc., without having to su root (or use sudo) each time.
So I had to create my own patch that I include here.
This option restricts the permissions on /proc so that non-root users can
see their own processes only, and nothing about active network connections,
unless they're in a special group. This group's id is specified via the
gid= mount option, and is 0 by default. (Note: if you're using identd, you
will need to edit the inetd.conf line to run identd as this special group.)
Also, this disables dmesg(8) for the users. You might want to use this
on an ISP shell server where privacy is an issue. Note that these extra
restrictions can be trivially bypassed with physical access (without having
to reboot).
When using this part of the patch, most programs (ps, top, who) work as
desired -- they only show the processes of this user (unless root or in
the special group, or running with the relevant capabilities on 2.2), and
don't complain they can't access others. However, there's a known problem
with w(1) in recent versions of procps, so you should apply the included
patch to procps if this applies to you.
Special handling of fd 0, 1, and 2
------------------------------------
File descriptors 0, 1, and 2 have a special meaning for the C library and
lots of programs. Thus, they're often referenced by number. Still, it is
normally possible to execute a program with one or more of these fd's
closed, and any open(2) calls it might do will happily provide these fd
numbers. The program (or the libraries it is linked with) will continue
using the fd's for their usual purposes, in reality accessing files the
program has just opened. If such a program is installed SUID and/or SGID,
then we might have a security problem.
Enable this option to ensure that fd's 0, 1, and 2 are always open on
startup of a SUID/SGID binary. If any of the fd's is closed, "/dev/null"
will be opened for it (the device itself; you don't need to have /dev in
the filesystem for that to work, such as in a chroot). This part of the
patch is by Pavel Kankovsky; I've only ported it to Linux 2.2 (any errors
are mine, of course).
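For comparison, a program can apply a similar defence to itself in
userspace. The sketch below is not the patch's code, and the function name
is hypothetical. Unlike the kernel option, it needs /dev/null to exist in
the filesystem, and it only helps if it runs before anything else opens a
file.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Make sure fds 0, 1 and 2 are open, pointing any closed one at
 * /dev/null. Returns the number of descriptors that had to be filled
 * in, or -1 on error. */
int ensure_std_fds(void)
{
    int fd, filled = 0;

    for (fd = 0; fd <= 2; fd++) {
        int newfd;

        if (fcntl(fd, F_GETFD) != -1 || errno != EBADF)
            continue;           /* this descriptor is already open */

        /* open() returns the lowest free descriptor; all lower ones
         * are known to be open by now, so this should be fd itself. */
        newfd = open("/dev/null", O_RDWR);
        if (newfd == -1)
            return -1;
        if (newfd != fd) {
            close(newfd);
            return -1;
        }
        filled++;
    }
    return filled;
}
```

A SUID/SGID program would call ensure_std_fds() first thing in main() and
refuse to continue if it returns -1.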
Enforce RLIMIT_NPROC on execve(2)
-----------------------------------
Linux lets you set a limit on how many processes a user can have, via a
setrlimit(2) call with RLIMIT_NPROC. Unfortunately, this limit is only
looked at when a new process is created on fork(2). If a process changes
its UID, it might exceed the limit for its new UID.
This is not a security issue by itself, as changing the UID is a privileged
operation. However, there are privileged programs that want to switch to a
user's context, including setting up some resource limits. The only fork(2)
required (if at all) is done before switching the UID, and thus doesn't
result in a check against RLIMIT_NPROC.
Enable this option to enforce RLIMIT_NPROC on execve(2) calls. (The Linux
2.0 version of this patch only checks the limit for processes that have
their "dumpable" flag reset, such as due to a UID change, to reduce the
performance impact.)
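As an illustration of the setrlimit(2) call mentioned above, here is a
minimal sketch of how a privileged program might set the limit before it
switches to the user's UID and calls execve(2). The function name is
hypothetical and this is not code from the patch.

```c
#include <sys/resource.h>

/* Set both the soft and the hard RLIMIT_NPROC to max_procs.
 * Returns 0 on success, -1 on error. */
int limit_nproc(rlim_t max_procs)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NPROC, &rl) != 0)
        return -1;

    /* An unprivileged caller cannot raise the hard limit, so clamp
     * the request to the current hard limit. */
    if (rl.rlim_max != RLIM_INFINITY && max_procs > rl.rlim_max)
        max_procs = rl.rlim_max;

    rl.rlim_cur = max_procs;
    rl.rlim_max = max_procs;
    return setrlimit(RLIMIT_NPROC, &rl);
}
```

The limit is inherited across the UID switch and execve(2). A program doing
this should also check the return value of setuid(2) itself, since that call
can fail.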
Note that there's at least one good reason I am not enforcing the limit
right after setuid(2) calls: some programs don't expect setuid(2) to fail
when running as root.
Destroy shared memory segments not in use
-------------------------------------------
Linux lets you set resource limits, including on how much memory a process
can consume, via setrlimit(2). Unfortunately, shared memory segments are
allowed to exist without association with any process, and thus might not
be counted against any resource limits.
This option automatically destroys shared memory segments when their attach
count becomes zero after a detach or a process termination. It will also
destroy segments that were created, but never attached to, on exit from the
process. (In case you're curious, the only use left for IPC_RMID is to
immediately destroy an unattached segment.)
Of course, this breaks the way things are defined, so some applications
might stop working. In particular, expect most commercial databases to
break. Apache and PostgreSQL are known to work, though. :-)
Note that this feature will do you no good unless you also configure your
resource limits (in particular, RLIMIT_AS and RLIMIT_NPROC). Most systems
don't need this.
Privileged IP aliases (Linux 2.0 only)
----------------------------------------
It is sometimes desirable not to let regular users put their services on
some of the IP addresses configured on the system. For example, this is
the case when providing web hosting services with shell and/or CGI access,
so that one user can't abuse the other domains hosted on the same system.
When this option is enabled, only root can bind sockets to addresses of
privileged aliased interfaces: those with slot numbers of the first half
of the allowed range. The default limit is also expanded to 2048 aliases,
so that the familiar slot numbers of 0 to 1023 become privileged.
================
How to install
================
Make sure you have the original kernel sources (as can be obtained from
ftp.kernel.org) installed in /usr/src/linux. Apply the patch:
	cd /usr/src/linux
	patch -p1 < PATCH-FILE
where PATCH-FILE is the full path and name of the linux-*-ow*.diff file.
In kernel configuration, go to the new 'Security options' section. Read
help for the suboptions, and configure them.
If desired, edit /etc/fstab to specify the group id for accessing /proc.
Also, make sure you have no extra procfs mount commands in the startup
scripts, as these might override your fstab settings; this is the case for
some distributions, including Red Hat. (Note that you won't be able to
specify the GID by remounting /proc on a running system. This is because
filesystem-specific options are not supported at that stage.)
Build the kernel and reboot.
You may also want to add the following line to your /etc/syslog.conf to
log [security] alerts separately:
	kern.alert					/var/log/alert
Additionally, you may do something like this (assuming the log file will
be empty most of the time):
	> /var/log/alert
	chown root.staff /var/log/alert
	chmod 640 /var/log/alert
	echo "less -XEU /var/log/alert" >> ~non-root/.bash_profile
Ensure that the non-executable stack part of the patch is working, using
stacktest.c for that purpose -- running './stacktest -e' should segfault,
and a message should get logged to /var/log/alert (if you've followed the
syslogd configuration described above). If you've enabled the support for
GCC trampolines, try running './stacktest -t'; it should succeed. If you
have trampoline call emulation enabled on Linux 2.0, you should also try
'./stacktest -b'; the simulated exploit attempt should fail even after a
trampoline call in the same process has succeeded.
If you enabled the link-in-+t restriction, you can also try to create a
symlink in /tmp (as a non-root user) pointing to a file that user has no
read access to, then switch to some other user that has the read access
(for example, root) and try to read the file via the link (such as with
'cat /tmp/link'). This should fail, and a message should get logged.
Now, you can try to create a hard link as a non-root user to a file that
user doesn't own. This should also fail.
========
F.A.Q.
========
Q: Where can I find new versions of the patch?
Q: I only have the patch for Linux 2.0, where do I get a version for
Linux 2.2 (or vice versa)?
A: http://www.openwall.com/linux/
Q: Will you be updating the patch for the new kernel version 2.0.x?
A: I will likely support 2.0.39, then we'll see.
Q: What about 2.2.x?
A: I will definitely support future 2.2.x versions of the kernel.
Q: What about 2.4.x?
A: I will likely start supporting these kernels several (possibly 10)
revisions after 2.4.0. My advice is to use 2.2 on "production" systems
until then.
Q: Is the patch x86-specific?
A: Only the non-executable stack feature of the patch is x86-specific.
The patch has been tested and is used on other architectures as well.
In fact, I've released some minor updates of the patch after testing
them on Alpha only in the past.
Q: Are there any issues with the patch on SMP boxes?
A: None that I am currently aware of. I've been running all versions
of the patch since 2.0.33 on SMP.
Q: Why don't they make it into the standard kernel?
A: This is not a trivial question to answer. First, some parts of older
versions of the patch (or equivalent, but different, fixes) are in fact
in the kernel now. This is the reason the patch for 2.0.36 was smaller
than it used to be in the 2.0.33 days. Now the patch for 2.2.13 is once
again smaller than its last 2.2.12 version. :-) So, security problems in
the kernel itself are typically getting fixed. It is, however, true that
the security "hardening" features of the patch are not getting in. One
of the reasons for this is that those features could result in a false
sense of security. Someone could then decide against fixing a hole on a
system they administer or in software they maintain just because of these
kernel features. If such things happen, the security is in fact relaxed,
not improved. The rlimit restrictions I have here are temporary hacks,
to be replaced with a real solution (beancounters), so I'm not trying to
get them into the kernel. Finally, there are some features here that I
think could get included (don't know of a good reason against doing so),
such as the fd 0-2 fix.
Q: I've applied the patch, and now my kernel doesn't compile?
A: Are you sure you've applied the patch exactly as shown in this README?
Please, try again with a known-clean source tree.
Q: Will GCC-compiled programs that use trampolines work with the non-exec
stack part of the patch?
A: Yes, if you enable the support.
Q: When I'm trying to use 'print f()' in gdb on a Linux 2.2 system with
your patch, my program crashes. What's going on?
A: The changes made in Linux 2.2 didn't let me port my old workaround for
this from the Linux 2.0 version of the patch. You'll have to use chstk.c
on the program you're debugging in order to use this feature of gdb.
Q: What does GCC use trampolines for?
A: Trampolines are needed to fully support one of the GNU C extensions,
nested functions. When a nested function is called from outside of the
one it was declared in (that is, via a function pointer), something needs
to provide the stack frame. The bad thing is that GCC puts trampolines
onto the stack (as they're generated at runtime). You can find an example
in stacktest.c, included with this patch.
Q: How do you distinguish a trampoline call from an exploit attempt?
A: Since most buffer overflow exploits overwrite the return address, the
instruction to pass control onto the stack has to be a RET. When calling
a trampoline, the instruction is a CALL. Note that in some cases such
autodetection can be fooled by RET'ing to a CALL instruction and making
this CALL pass control onto the stack (in reality, this also requires a
register to be set to the address, and only works this way on Linux 2.0).
Read help for the 'Autodetect GCC trampolines' configuration option.
Q: What about glibc and non-executable stack?
A: You have to enable trampoline autodetection when using glibc 2.0.x, or
the system won't even boot. If you're running Linux 2.0, you will likely
also want to enable trampoline call emulation to avoid running privileged
processes with executable stack.
Q: I've just compiled glibc on my system, but "make check" fails while
trying to load testobj1.so. What's going on? Will the newly compiled
glibc work with your patch in the kernel?
Q: What's the deal with glibc-2.1.3-dl-open.diff?
A: The non-executable stack part of the patch changes the default address
shared libraries are mmap()ed at. Unfortunately, some parts of (at least
some versions of) glibc depend on this address being above ELF sections.
This is a bug, and glibc maintainers are now aware of it. The good thing
is that the problem is only likely to show up with the little used ORIGIN
feature, and only when the dynamic linker is run as a standalone program.
It is thus highly unlikely that this will cause anything other than "make
check" to break. You can, however, use the workaround included with this
patch.
Q: What does the procps-2.0.6-ow1.diff patch do? Is it required for the
kernel patch to work?
A: This procps patch updates the stale utmp entry check, so that w(1) in
procps 2.0.x up to 2.0.6 works correctly on systems with the restricted
/proc option. If you don't experience any problems with w(1), you don't
need to install the procps patch.
Q: What is chstk.c for?
A: The patch adds an extra flag to ELF and a.out headers, which controls
whether the program will be allowed to execute code on the stack or not,
and chstk.c is what you should use to manage the flag. You might find it
useful if you choose to disable the GCC trampolines autodetection. BTW,
setting the flag also restores the original address shared libraries are
mmap()'ed at, just in case some program depends on that.
Q: What if an attacker uses chstk.c on a buffer overflow exploit?
A: Nothing changes. It's the vulnerable program being exploited that needs
executable stack, not the exploit. The attacker would need write access
to this program's binary to use chstk.c successfully.
Q: Do I have to reboot with an unpatched kernel to try out a new overflow
exploit to see if I'm vulnerable?
A: No, you can use chstk.c on the vulnerable program to temporarily allow
it to execute code on the stack. Just don't forget to reset the flag back
when you're done. Also, be careful with relying on such tests: typically,
they can't prove that you're not vulnerable, they can only sometimes prove
the opposite. Note that setting the flag on Linux 2.2 systems will change
the default stack location to be 8 MB lower than where many exploits expect
it to be.
Q: Are any applications known to require executable stack?
A: Yes. JDK 1.3 and XFree86 4.0.1 (only with the commercial nVidia GeForce
drivers) have been reported to not work unless chstk'ed.
Q: Why did you modify signal handler return code?
A: Originally the kernel put some code onto the stack to return from signal
handlers. Now signal handler returns are done via the GPF handler instead
(an invalid magic return address is put onto the stack).
Q: What should I do if a program needs to follow a symlink in a +t directory for
its normal operation (without introducing a security hole)?
A: Usually such a link needs to be created only once, so create it as root
(or the directory owner, if it's not root). Such links are followed even
when the patch is enabled.
Q: What will happen if someone does:
	ln -s /etc/passwd ~/link
	ln -s ~/link /tmp/link
and the vulnerable program runs as root and writes to /tmp/link?
A: The patch is not looking at the target of the symlink in /tmp, it only
checks if the symlink itself is owned by the user that vulnerable program
is running as, and doesn't follow the link if not (like in this example).
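The check described in this answer can be sketched as a predicate. This is
a userspace reading of the README, not the kernel code; the function name
is hypothetical, and the directory owner exception is an assumption taken
from the previous answer.

```c
#include <sys/stat.h>
#include <sys/types.h>

#ifndef S_ISVTX
#define S_ISVTX 01000   /* the +t (sticky) bit, in case it is not exposed */
#endif

/* Returns 1 if a symlink would be followed, 0 if not. Only the owner
 * of the link itself is looked at, never the target of the link. */
int symlink_follow_allowed(mode_t dir_mode, uid_t dir_uid,
                           uid_t link_uid, uid_t user)
{
    if (!(dir_mode & S_ISVTX))
        return 1;               /* not a +t directory: no restriction */
    if (link_uid == user)
        return 1;               /* link owned by the user following it */
    if (link_uid == dir_uid)
        return 1;               /* assumed: link owned by directory owner */
    return 0;
}
```

In the example above, /tmp/link is owned by the attacker and the program
runs as root, so with /tmp owned by root and mode 01777 the predicate
returns 0, whatever ~/link points to.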
Q: Is there some performance impact of using the patch?
A: Well, normally the only thing affected is signal handler returns. I
didn't want to modify the sigreturn syscall, so there is some extra code to
set up its stack frame. I don't think this has a noticeable effect on the
performance (and my benchmarks prove that): saved context checks and other
signal handling stuff are taking much more time. Executing code on the
stack was not fast anyway. Also, programs using GCC trampolines will run
slower if trampoline calls are emulated. However, I don't know of any
program that uses trampolines where the performance is critical (it would be
a stupid thing to do anyway).
--
Solar Designer <solar@openwall.com>