<!--startcut ==========================================================-->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<title>Cleaning Up Your /tmp...The Safe Way Issue 18</title>
</HEAD>
<BODY BGCOLOR="#EEE1CC" TEXT="#000000" LINK="#0000FF" VLINK="#0020F0"
ALINK="#FF0000">
<!--endcut ============================================================-->
<H4>
"Linux Gazette...<I>making Linux just a little more fun!</I>"
</H4>
<P> <HR> <P>
<!--===================================================================-->
<center>
<H2>Cleaning Up Your /tmp...The Safe Way</H2>
<H4>By Guy Geens,
<a href="mailto:ggeens@iname.com">ggeens@iname.com</a></H4>
</center>
<P><HR><P>
<H3>Introduction</H3>
<P>Removing the temporary files left over in your <TT>/tmp</TT> directory
is not as easy as it looks, at least not on a multi-user system that's
connected to a network.</P>
<P>If you do it the wrong way, you can leave your system open to attacks
that could compromise your system's integrity.</P>
<H3>What's eating my disk space?</H3>
<P>So, you have your Linux box set up. Finally, you have installed everything
you want, and you can have some fun! But wait. Free disk space is slowly
going down.</P>
<P>So, you start looking for where this disk space is going. Basically,
you will find the following disk hogs:</P>
<UL>
<LI>Formatted man pages in <TT>/var/catman</TT>; </LI>
<LI>The <TT>/tmp</TT> and <TT>/var/tmp</TT> hierarchies.</LI>
</UL>
<P>Of course, there are others, but in this article, I'll concentrate on
these three, because you normally don't lose data when you erase the contents.
At the most, you will have to wait while the files are regenerated.</P>
<H3>The quick and dirty solution</H3>
<P>Digging through a few man pages, you come up with something like this:
</P>
<PRE>find /var/catman -type f -atime +7 -print | xargs -- rm -f --</PRE>
<P>This will remove all formatted man pages that have not been read for
more than 7 days. (Without the plus sign, <TT>-atime 7</TT> would match only
files last read exactly 7 days ago.) The <TT>find</TT> command makes a list of
these files, and sends them to <TT>xargs</TT>. <TT>xargs</TT> puts these files
on the command line, and calls <TT>rm -f</TT> to delete them. The double dashes are there so
that any files starting with a minus will not be misinterpreted as options.</P>
<P>(Actually, in this case, <TT>find</TT> prints out full path names, which are
guaranteed to start with a /. But it's better to be safe than sorry.)</P>
<P>This will work fine, and you can place this in your crontab file or
one of your start-up scripts.</P>
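<P>As a rough illustration of the selection step, here is a minimal Python
sketch of what <TT>find -type f -atime +N</TT> does. The function name
<TT>stale_files</TT> and its parameters are my own for illustration; it only
lists candidates and never deletes anything.</P>

```python
import os
import stat
import time

SECONDS_PER_DAY = 86400


def stale_files(top, days, now=None):
    """Yield regular files under `top` last accessed more than `days` whole days ago."""
    now = time.time() if now is None else now
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symbolic links
            except OSError:
                continue  # the file vanished while we were walking
            if not stat.S_ISREG(st.st_mode):
                continue  # same filter as "-type f"
            # find counts age in whole 24-hour periods and drops the fraction
            age_days = int((now - st.st_atime) // SECONDS_PER_DAY)
            if age_days > days:  # same comparison as "-atime +days"
                yield path
```

<P>Calling <TT>list(stale_files("/var/catman", 7))</TT> would give the list
that the one-liner hands to <TT>xargs</TT>.</P>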
<P>Note that I used <TT>/var/catman</TT> in the previous example. You might
be thinking ``So, why not use it for <TT>/tmp</TT>?'' There is a good reason
not to. Let me start by elaborating on the difference between the <TT>/var/catman</TT>
and <TT>/tmp</TT> directories. (The situation for <TT>/var/tmp</TT> is
the same as for <TT>/tmp</TT>, so you can replace all instances of <TT>/tmp</TT>
with <TT>/var/tmp</TT> in the following text.)</P>
<H4>Why /var/catman is easy</H4>
<P>If you look at the files in <TT>/var/catman</TT>, you will notice that
all the files are owned by the same user (normally <TT>man</TT>). This
user is also the only one who has write permissions on the directories.
That is because the only program that ever writes to this directory tree
is <TT>man</TT>. Let's look at <TT>/usr/bin/man</TT>: </P>
<PRE>-rwsr-sr-x 1 man man 29716 Apr 8 22:14 /usr/bin/man*</PRE>
<P>(Notice the two letters `s' in the first column.)</P>
<P>The program runs setuid <TT>man</TT>, i.e., it takes on the identity and privileges
of this `user'. (It also takes on the group privileges, but that is not really
important in our discussion.) <TT>man</TT> is not a real user: nobody will
ever log in with this identity. Because of the setuid bit, <TT>man</TT> (the program) can write
to directories a normal user cannot write to.</P>
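<P>If you want to check these bits from a program instead of reading the
<TT>ls</TT> output, a small Python sketch could look like this. The helper
name <TT>special_bits</TT> and the constant <TT>MAN_MODE</TT> are made up for
the example; the mode value is simply the one from the listing above.</P>

```python
import stat


def special_bits(mode):
    """Return the names of the special permission bits set in an st_mode value."""
    names = []
    if mode & stat.S_ISUID:
        names.append("setuid")  # the 's' in the owner's execute position
    if mode & stat.S_ISGID:
        names.append("setgid")  # the 's' in the group's execute position
    if mode & stat.S_ISVTX:
        names.append("sticky")  # the 't' in the last position
    return names


# Mode of /usr/bin/man as shown in the listing: a regular file with rwsr-sr-x
MAN_MODE = stat.S_IFREG | 0o6755
```

<P>On a live system you would pass <TT>os.stat("/usr/bin/man").st_mode</TT>
instead of the constant; <TT>stat.filemode()</TT> turns such a value back
into the familiar ten-character string.</P>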
<P>Because you know all files in the directory tree are generated by one
program, it is easy to maintain.</P>
<H4>And now /tmp</H4>
<P>In <TT>/tmp</TT>, we have a totally different situation. First of all,
the file permissions:</P>
<PRE>drwxrwxrwt 10 root root 3072 May 18 21:09 /tmp/</PRE>
<P>We can see that <B>everyone</B> can write to this directory: everyone
can create, rename or delete files and directories here.</P>
<P>There is one limitation: the `sticky bit' is switched on. (Notice the
t at the end of the first column.) This means a user can only delete or
rename files he owns himself. This prevents users from pestering each other
by removing one another's temporary files.</P>
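<P>The rule can be written down as a small decision function. This is only a
model for illustration, not kernel code: the name <TT>may_remove</TT> and its
parameters are hypothetical, and it assumes the user already has write and
search permission on the directory. It also spells out two cases the text
above glosses over: root and the owner of the directory may remove any file
in it.</P>

```python
import stat


def may_remove(user_uid, dir_mode, dir_uid, file_uid):
    """Model of the sticky-bit rule for deleting or renaming a directory entry.

    Assumes the user already has write and search permission on the directory.
    """
    if user_uid == 0:
        return True  # root is not restricted by the sticky bit
    if not dir_mode & stat.S_ISVTX:
        return True  # no sticky bit: write permission on the directory is enough
    # Sticky directory: only the file's owner or the directory's owner may remove it
    return user_uid in (file_uid, dir_uid)
```

<P>For <TT>/tmp</TT> (mode <TT>drwxrwxrwt</TT>, owned by root), user 1000 may
remove his own files but not those of user 1001.</P>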
<P>If you were to use the simple script above, there are security risks
involved. Let me repeat the one-line script from above, this time applied
to <TT>/tmp</TT>:</P>
<PRE>find /tmp -type f -atime +7 -print | xargs -- rm -f --</PRE>
<P>Suppose there is a file <TT>/tmp/dir/file</TT>, and it is older than
7 days. </P>
<P>By the time <TT>find</TT> passes this filename to <TT>xargs</TT>, the
directory might have been renamed to something else, and there might even
be another directory <TT>/tmp/dir</TT>.</P>
<P>(And I haven't even mentioned the possibility of embedded
newlines in file names. That, at least, is easily fixed by using <TT>-print0</TT>
instead of <TT>-print</TT>, together with <TT>xargs -0</TT>.)</P>
<P>All this could lead to the wrong file being deleted, either
intentionally or by accident. By clever use of symbolic links, an
attacker can exploit this weakness to delete important system
files.</P>
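<P>To make the race concrete, here is a small Python sketch that replays it
inside a scratch directory. Everything in it is made up for the demonstration
(the function name <TT>demonstrate_race</TT>, the directory names
<TT>tmp</TT>, <TT>important</TT> and <TT>moved</TT>); it never touches the
real <TT>/tmp</TT>, and it needs a system with symbolic links.</P>

```python
import os


def demonstrate_race(base):
    """Replay the rename/symlink race under `base`.

    Returns (victim_still_exists, old_file_still_exists).
    """
    tmp = os.path.join(base, "tmp")
    victim_dir = os.path.join(base, "important")
    os.makedirs(os.path.join(tmp, "dir"))
    os.makedirs(victim_dir)
    old_file = os.path.join(tmp, "dir", "file")
    victim = os.path.join(victim_dir, "file")
    for path in (old_file, victim):
        with open(path, "w") as handle:
            handle.write("data\n")

    # Step 1: the cleaner records a full path name, as find -print does.
    recorded = old_file

    # Step 2: before the delete runs, the attacker moves the directory away
    # and puts a symbolic link with the same name in its place.
    os.rename(os.path.join(tmp, "dir"), os.path.join(tmp, "moved"))
    os.symlink(victim_dir, os.path.join(tmp, "dir"))

    # Step 3: the cleaner deletes by the recorded path, as rm does.
    # The path now resolves through the symlink to the victim's file.
    os.unlink(recorded)

    return os.path.exists(victim), os.path.exists(os.path.join(tmp, "moved", "file"))
```

<P>Run against a fresh scratch directory, the function reports that the file
in <TT>important</TT> is gone while the old temporary file is still there.</P>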
<P>For an in-depth discussion of the problem, see the Bugtraq mailing
list archives (thread ``<a
href="http://www.geek-girl.com/bugtraq/1996_2/0054.html">[linux-security]
Things NOT to put in root's crontab</a>'').</P>
<P>This problem is inherently linked to <TT>find</TT>'s algorithm: there can be
a long time between the moment when <TT>find</TT> generates a filename internally
and the moment it is passed on to the next program. This is because <TT>find</TT> recurses
into subdirectories before it tests the files in a particular directory.</P>
<H4>So how do we get around this?</H4>
<P>A first idea might be:</P>
<PRE>find ... -exec rm {} \;</PRE>
<P>but unfortunately, this suffers from the same problem, as the <TT>-exec</TT>
clause also passes on the full pathname.</P>
<P>In order to solve the problem, I wrote this <A HREF="cleantmp.html">Perl
script</A>, which I named <TT>cleantmp</TT>.</P>
<P>I will explain how it works, and why it is safer than the aforementioned
scripts.</P>
<P>First, I declare that I'm using the File::Find module. After this statement,
I can call the &find subroutine.</P>
<PRE>use File::Find;
</PRE>
<P>Then I do a chroot to <TT>/tmp</TT>. This changes the root directory for
the script to <TT>/tmp</TT>, which makes sure the script can't access
any files outside of this hierarchy.</P>
<P>A chroot is only allowed when the user is root. I check for this
case, so that the script can be tested as a normal user.</P>
<PRE># Security measure: chroot to /tmp
$tmpdir = '/tmp/';
chdir ($tmpdir) || die "$tmpdir not accessible: $!";
if (chroot($tmpdir)) { # chroot() fails when not run by root
    ($prefix = $tmpdir) =~ s,/+$,,; # strip trailing slashes
    $root = '/';
    $test = 0;
} else {
    # Not run by root - test only
    $prefix = '';
    $root = $tmpdir;
    $test = 1;
}</PRE>
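<P>For readers who don't speak Perl, the same idea might look roughly like
this in Python. It is a sketch, not a translation of <TT>cleantmp</TT>: the
function name <TT>enter_jail</TT> is invented, and the <TT>chroot</TT>
parameter exists only so that the fallback can be tried out without being
root. It assumes a Unix system, where <TT>os.chroot</TT> is available.</P>

```python
import os


def enter_jail(tmpdir, chroot=os.chroot):
    """Change into `tmpdir` and try to make it the root directory.

    Returns (root, test_mode): the directory to start scanning from and a
    flag telling whether we fell back to test-only mode.
    """
    os.chdir(tmpdir)  # raises OSError if the directory is not accessible
    try:
        chroot(tmpdir)
    except PermissionError:
        # Not run by root: stay outside the jail and only pretend to delete
        return tmpdir, True
    # Inside the jail, the old tmpdir is now visible as "/"
    return "/", False
```

<P>A caller would use the returned flag the way the Perl script uses
<TT>$test</TT>: when it is true, nothing is actually removed.</P>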
<P>Then we come to these lines:</P>
<PRE>&find(\&do_files, $root);
&find(\&do_dirs, $root);</PRE>
<P>Here, I let the find subroutine recurse through all the subdirectories
of /tmp. The functions do_files and do_dirs are called for each file found.
There are two passes over the directory tree: one for files, and one for
directories. </P>
<P>Now we have the function <TT>do_files</TT>.</P>
<PRE>sub do_files {
    (($dev,$ino,$mode,$nlink,$uid,$gid) = lstat($_)) &&
    (-f _ || -l _ ) &&
    (int(-A _) > 3) &&
    ! /^\.X.*lock$/ &&
    &removefile ($_) && push @list, $File::Find::name;
}</PRE>
<P>Basically, this is the output of the find2perl program, with a few
changes.</P>
<P>This routine is called with $_ set to the name of the file under inspection,
and the current directory is the one in which it resides. Now let's see
what it does. (In case you don't know Perl: the && operator short-circuits,
just like in C.)</P>
<OL>
<LI>The first line gets the file's parameters from the kernel; </LI>
<LI>If that succeeds, we check if it is a regular file or a symbolic link
(as opposed to a directory or a special file); </LI>
<LI>Then, we test if the file is old enough to be deleted (older than 3
days); </LI>
<LI>The fourth line makes sure X's lockfiles (of the form <TT>/tmp/.X0-lock</TT>)
are not removed; </LI>
<LI>The last line will remove the file, and keep a listing of all deleted
files. </LI>
</OL>
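<P>The first four tests in the list above can be sketched in Python as a
single predicate. This is an approximation for illustration only: the name
<TT>is_removable</TT> and its parameters are mine, and where Perl's
<TT>-A</TT> measures age from the moment the script started, the sketch takes
the reference time as an argument.</P>

```python
import os
import re
import stat
import time

SECONDS_PER_DAY = 86400
# Same pattern as in do_files: names such as .X0-lock
X_LOCK = re.compile(r"^\.X.*lock$")


def is_removable(path, now=None, max_age_days=3):
    """Return True if `path` passes the same tests as do_files."""
    try:
        st = os.lstat(path)  # 1. get the file's parameters, without following links
    except OSError:
        return False
    # 2. only regular files and symbolic links
    if not (stat.S_ISREG(st.st_mode) or stat.S_ISLNK(st.st_mode)):
        return False
    # 3. whole days since the last access must exceed max_age_days
    now = time.time() if now is None else now
    if int((now - st.st_atime) / SECONDS_PER_DAY) <= max_age_days:
        return False
    # 4. leave X's lock files alone
    return X_LOCK.match(os.path.basename(path)) is None
```

<P>The fifth step, the actual removal and the bookkeeping of deleted names,
is deliberately left out of this sketch.</P>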
<P>The removefile subroutine merely tests if the $test flag is set, and
if not, deletes the file. </P>
<P>The do_dirs subroutine is very similar to this one, and I won't go into
the details. </P>
<H4>A few remarks</H4>
<P>I use the access time to determine the file's age. The reason for
this is simple. I sometimes unpack archives into my /tmp directory.
When it creates files, tar gives them the date they had in the archive
as the modification time. One of my earlier scripts tested
the mtime. But then, one day I was browsing an unpacked archive at the very
moment cron started to clean up. (Hey?? Where did my files go?)
</P>
<P>As I said before, the script checks for some special files (and
also directories in do_dirs). This is because they are important for
the system. If you have a separate /tmp partition, and have quota
installed on it, you should also check for quota's support files -
quota.user and quota.group.</P>
<P>The script also generates a list of all deleted files and directories.
If you don't want this output, redirect it to <TT>/dev/null</TT>. </P>
<H3>Why this is safe</H3>
<P>The main difference from the find constructions I have shown before
is this: the file to be deleted is not referenced by its full pathname.
If the directory is renamed while the script is scanning it, this doesn't
have any effect: the script won't notice, and it will still delete the right files.
</P>
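<P>A related technique, shown here as a Python sketch, is to hold an open
handle on each directory and remove names relative to that handle. Note that
this is not what <TT>cleantmp</TT> does (File::Find changes into each
directory instead), and the function name <TT>clean_tree</TT> and its
parameters are invented for the example. It relies on <TT>os.fwalk</TT>,
which is only available on Unix.</P>

```python
import os
import stat
import time

SECONDS_PER_DAY = 86400


def clean_tree(top, max_age_days=3, now=None, dry_run=True):
    """Remove old files under `top`, addressing each one relative to its directory.

    Returns the list of path names that were (or, in a dry run, would be) removed.
    """
    now = time.time() if now is None else now
    removed = []
    # os.fwalk yields an open file descriptor for every directory it visits
    for dirpath, _dirnames, filenames, dirfd in os.fwalk(top, follow_symlinks=False):
        for name in filenames:
            try:
                # Look the name up relative to the open directory, not by full path
                st = os.stat(name, dir_fd=dirfd, follow_symlinks=False)
            except OSError:
                continue  # the entry disappeared in the meantime
            if not (stat.S_ISREG(st.st_mode) or stat.S_ISLNK(st.st_mode)):
                continue
            if int((now - st.st_atime) / SECONDS_PER_DAY) <= max_age_days:
                continue
            if not dry_run:
                # Renaming the directory cannot redirect this call: dirfd
                # still refers to the directory that was opened
                os.unlink(name, dir_fd=dirfd)
            removed.append(os.path.join(dirpath, name))
    return removed
```

<P>The full path name is only built for the report; the delete itself never
uses it. With the default <TT>dry_run=True</TT> the function only reports
what it would remove, much like the script's test mode.</P>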
<P>I have been thinking about weaknesses, and I couldn't find one. Now
I'm giving this to you for inspection. I'm convinced that there are no
hidden security risks, but if you do find one, <a
href="mailto:ggeens@iname.com">let me know</a>.</P>
<!--===================================================================-->
<P> <hr> <P>
<center><H5>Copyright © 1997, Guy Geens<BR>
Published in Issue 18 of the Linux Gazette, June 1997</H5></center>
<!--===================================================================-->
<!--startcut ==========================================================-->
</BODY>
</HTML>
<!--endcut ============================================================-->