File: UserModeLinux-HOWTO-13.html

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
 <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.21">
 <TITLE>User Mode Linux HOWTO : What to do when UML doesn't work</TITLE>
 <LINK HREF="UserModeLinux-HOWTO-14.html" REL=next>
 <LINK HREF="UserModeLinux-HOWTO-12.html" REL=previous>
 <LINK HREF="UserModeLinux-HOWTO.html#toc13" REL=contents>
</HEAD>
<BODY>
<A HREF="UserModeLinux-HOWTO-14.html">Next</A>
<A HREF="UserModeLinux-HOWTO-12.html">Previous</A>
<A HREF="UserModeLinux-HOWTO.html#toc13">Contents</A>
<HR>
<H2><A NAME="faq"></A> <A NAME="s13">13.</A> <A HREF="UserModeLinux-HOWTO.html#toc13">What to do when UML doesn't work</A></H2>

<P> </P>

<H2><A NAME="ss13.1">13.1</A> <A HREF="UserModeLinux-HOWTO.html#toc13.1">Strange compilation errors when you build from source</A>
</H2>

<P>As of test11, &quot;ARCH=um&quot; must be in the environment or on the make
command line for every step of the UML build: the cleaning targets
(clean, distclean, or mrproper), the configuration targets (config,
menuconfig, or xconfig), dep, and linux.  If you forget it for any of
them, the i386 build seems to contaminate the UML build.  If this
happens, start from scratch with
<BLOCKQUOTE><CODE>
<PRE>
host% make mrproper ARCH=um
</PRE>
</CODE></BLOCKQUOTE>

and repeat the build process with ARCH=um on all the steps.</P>
<P> 
See 
<A HREF="UserModeLinux-HOWTO-2.html#compile">Compiling the kernel and modules</A>  for 
more details.</P>
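<P>One way to avoid forgetting ARCH=um on a step is to route every make
invocation through a small wrapper.  This is only a minimal sketch: the
name uml_make is made up for this example, and the MAKE override is
there so that the wrapper can be tried without a kernel tree.
<BLOCKQUOTE><CODE>
<PRE>
# Hypothetical helper: run make with ARCH=um always on the command line.
# MAKE may be overridden (for example MAKE=echo) to preview the command.
uml_make() {
    "${MAKE:-make}" ARCH=um "$@"
}

# Typical use from the top of the UML source pool:
#   uml_make mrproper
#   uml_make menuconfig
#   uml_make dep
#   uml_make linux
</PRE>
</CODE></BLOCKQUOTE>

Exporting ARCH=um in the shell works as well, but then it only lasts
as long as that shell does.</P>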
<P> 
Another cause of strange compilation errors is building UML in
/usr/src/linux.  If you do this, the first thing you need to do is
clean up the mess you made.  The /usr/src/linux/include/asm link will
now point to asm-um.  Make it point back to asm-i386.  Then move your
UML pool someplace else and build it there.  Also see section 13.4
below, which describes a more specific set of symptoms.</P>


<H2><A NAME="ss13.2">13.2</A> <A HREF="UserModeLinux-HOWTO.html#toc13.2">UML hangs on boot after mounting devfs</A>
</H2>

<P>The boot looks like this:
<BLOCKQUOTE><CODE>
<PRE>
VFS: Mounted root (ext2 filesystem) readonly.
Mounted devfs on /dev
</PRE>
</CODE></BLOCKQUOTE>

You're probably running a recent distribution on an old machine.  I
saw this with the RH7.1 filesystem running on a Pentium.  The shared
library loader, ld.so, was executing an instruction (cmove) which the
Pentium didn't support; the conditional move instructions were only
added with the Pentium Pro.  If you run UML under the debugger, you'll
see that the hang is caused by that one instruction generating an
endless stream of SIGILLs.</P>
<P> 
The fix is to boot UML on an older filesystem.</P>
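<P>To check whether the host CPU is affected, you can look for the
cmov flag in /proc/cpuinfo.  The following is a rough sketch; the
helper name has_cmov is made up for this example.
<BLOCKQUOTE><CODE>
<PRE>
# Hypothetical helper: report whether a cpuinfo file lists the cmov flag.
# Defaults to the host's /proc/cpuinfo.
has_cmov() {
    if grep -qw cmov "${1:-/proc/cpuinfo}"; then
        echo "cmov supported"
    else
        echo "cmov missing"
    fi
}

has_cmov
</PRE>
</CODE></BLOCKQUOTE>

If the flag is missing, binaries built to use conditional moves will
not run on that host.</P>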


<H2><A NAME="ss13.3">13.3</A> <A HREF="UserModeLinux-HOWTO.html#toc13.3">A variety of panics and hangs with /tmp on a reiserfs  filesystem</A>
</H2>

<P>I saw this on reiserfs 3.5.21 and it seems to be fixed in 3.5.27.
Panics preceded by 
<BLOCKQUOTE><CODE>
<PRE>
Detaching pid nnnn
</PRE>
</CODE></BLOCKQUOTE>
 are
diagnostic of this problem.  This is a reiserfs bug which causes a
thread to occasionally read stale data from a mmapped page shared with
another thread.  The fix is to upgrade the filesystem or to have /tmp be
an ext2 filesystem.</P>


<H2><A NAME="ss13.4">13.4</A> <A HREF="UserModeLinux-HOWTO.html#toc13.4">The compile fails with errors about conflicting types for 'open', 'dup', and 'waitpid'</A>
</H2>

<P>This happens when you build in /usr/src/linux.  The UML build makes
the include/asm link point to include/asm-um.  /usr/include/asm points
to /usr/src/linux/include/asm, so when that link gets moved, files
which need to include the asm-i386 versions of headers get the
incompatible asm-um versions.  The fix is to move the include/asm link
back to include/asm-i386 and to do UML builds someplace else.</P>
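<P>Moving the link back can be sketched as follows.  The helper name
fix_asm_link is made up for this example, and it assumes that the link
is a relative one inside the tree's include directory.  Changing
/usr/src/linux will normally need root.
<BLOCKQUOTE><CODE>
<PRE>
# Hypothetical helper: make include/asm in a kernel tree point at asm-i386.
# The tree defaults to /usr/src/linux; pass another path to try it elsewhere.
fix_asm_link() {
    tree="${1:-/usr/src/linux}"
    # -n replaces the link itself instead of descending into its target
    ln -sfn asm-i386 "$tree/include/asm"
    # show where the link points now
    readlink "$tree/include/asm"
}

# Typical use, as root:
#   fix_asm_link /usr/src/linux
</PRE>
</CODE></BLOCKQUOTE>
</P>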


<H2><A NAME="ss13.5">13.5</A> <A HREF="UserModeLinux-HOWTO.html#toc13.5">UML doesn't work when /tmp is an NFS filesystem</A>
</H2>

<P>This seems to be a similar situation to the reiserfs problem above.  Some
versions of NFS seem not to handle mmap correctly, which UML depends on.
The workaround is to have /tmp be a non-NFS directory.</P>
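<P>If you're not sure what /tmp lives on, the host can tell you.  This
is a small sketch that assumes a df which understands -T, as GNU df
does; the helper name fs_type is made up for this example.
<BLOCKQUOTE><CODE>
<PRE>
# Hypothetical helper: print the filesystem type holding a directory
# (default /tmp), for example ext2, reiserfs, or nfs.
fs_type() {
    df -PT "${1:-/tmp}" | awk 'NR==2 { print $2 }'
}

fs_type /tmp
</PRE>
</CODE></BLOCKQUOTE>
</P>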



<H2><A NAME="ss13.6">13.6</A> <A HREF="UserModeLinux-HOWTO.html#toc13.6">UML hangs on boot when compiled with gprof support</A>
</H2>

<P>If you build UML with gprof support and, early in the boot, it does this
<BLOCKQUOTE><CODE>
<PRE>
kernel BUG at page_alloc.c:100!
</PRE>
</CODE></BLOCKQUOTE>

you have a buggy gcc.  You can work around the problem by removing
UM_FASTCALL from CFLAGS in arch/um/Makefile-i386.  This will open up
another bug, but that one is fairly hard to reproduce.</P>


<H2><A NAME="ss13.7">13.7</A> <A HREF="UserModeLinux-HOWTO.html#toc13.7">syslogd dies with a SIGTERM on startup</A>
</H2>

<P>The exact boot error depends on the distribution that you're booting,
but Debian produces this:
<BLOCKQUOTE><CODE>
<PRE>
/etc/rc2.d/S10sysklogd: line 49:    93 Terminated
start-stop-daemon --start --quiet --exec /sbin/syslogd -- $SYSLOGD
</PRE>
</CODE></BLOCKQUOTE>

This is a syslogd bug.  There's a race between a parent process
installing a signal handler and its child sending the signal.  See
<A HREF="http://www.geocrawler.com/lists/3/SourceForge/709/0/6612801">this uml-devel post</A>  for the details.</P>


<H2><A NAME="ss13.8">13.8</A> <A HREF="UserModeLinux-HOWTO.html#toc13.8">TUN/TAP networking doesn't work on a 2.4 host</A>
</H2>

<P>There are a couple of problems, which were 
<A HREF="http://www.geocrawler.com/lists/3/SourceForge/597/0/">pointed out</A>  by 
<A HREF="timro at trkr dot net">Tim Robinson</A>:
<UL>
<LI>It doesn't work on hosts running 2.4.7 (or thereabouts) or earlier.  The fix 
is to upgrade to something more recent and then read the next item.
</LI>
<LI>If you see 
<BLOCKQUOTE><CODE>
<PRE>
File descriptor in bad state
</PRE>
</CODE></BLOCKQUOTE>
 when you
bring up the device inside UML, you have a header mismatch between the
original kernel and the upgraded one.  Make /usr/src/linux point at
the new headers.  This will only be a problem if you build uml_net
yourself.
</LI>
</UL>
</P>


<H2><A NAME="ss13.9">13.9</A> <A HREF="UserModeLinux-HOWTO.html#toc13.9">You can network to the host but not to other machines on the net</A>
</H2>

<P>If you can connect to the host, and the host can connect to UML, but you 
cannot connect to any other machines, then you may need to enable IP 
Masquerading on the host.  This usually happens only when you use 
private IP addresses (192.168.x.x or 10.x.x.x) for host/UML
networking, rather than the public address space that your host
is connected to.  UML does not enable IP Masquerading, so you will
need to create a static rule to enable it:
<BLOCKQUOTE><CODE>
<PRE>
host% iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
</PRE>
</CODE></BLOCKQUOTE>

Replace eth0 with the interface that you use to talk to the
rest of the world.</P>
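<P>Masquerading only helps if the host forwards packets at all, so it
is worth checking that IPv4 forwarding is switched on.  The following
is a sketch; the helper name forwarding_state and the 192.168.0.0/24
subnet in the comments are assumptions made for this example.
<BLOCKQUOTE><CODE>
<PRE>
# Hypothetical helper: report whether IPv4 forwarding is switched on.
# Reads /proc/sys/net/ipv4/ip_forward unless another file is given.
forwarding_state() {
    if [ "$(cat "${1:-/proc/sys/net/ipv4/ip_forward}")" = "1" ]; then
        echo "forwarding on"
    else
        echo "forwarding off"
    fi
}

forwarding_state

# If it is off, turn it on as root:
#   sysctl -w net.ipv4.ip_forward=1
# To masquerade only the UML subnet (an assumed example), add -s:
#   iptables -t nat -A POSTROUTING -s 192.168.0.0/24 -o eth0 -j MASQUERADE
</PRE>
</CODE></BLOCKQUOTE>
</P>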
<P> 
Documentation on IP Masquerading, and SNAT, can be found at 
<A HREF="http://www.netfilter.org"> www.netfilter.org </A> .</P>
<P> 
If you can reach the local net, but not the outside Internet, then
that is usually a routing problem.  The UML needs a default route:
<BLOCKQUOTE><CODE>
<PRE>
UML# route add default gw &lt;gateway IP&gt;
</PRE>
</CODE></BLOCKQUOTE>

The gateway IP can be any machine on the local net that knows how to
reach the outside world.  Usually, this is the host or the local
network's gateway.</P>
<P> 
Occasionally, we hear from someone who can reach some machines, but
not others on the same net, or who can reach some ports on other
machines, but not others.  These are usually caused by strange
firewalling somewhere between the UML and the other box.  You track
this down by running tcpdump on every interface the packets travel
over and see where they disappear.  When you find a machine that
takes the packets in, but does not send them onward, that's the
culprit.</P>


<H2><A NAME="ss13.10">13.10</A> <A HREF="UserModeLinux-HOWTO.html#toc13.10">I have no root and I want to scream</A>
</H2>

<P>Thanks to Birgit Wahlich for telling me about this strange one.  It
turns out that there's a limit of six environment variables on the
kernel command line.  When that limit is reached or exceeded, argument
processing stops, which means that the 'root=' argument that UML
usually adds is not seen.  As a result, the filesystem code has no idea
what the root device is, and it panics.</P>
<P> 
The fix is to put less stuff on the command line.  Glomming all your
setup variables into one is probably the best way to go.</P>
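<P>As a sketch of what glomming might look like, the variables can
travel as one comma-separated list and be unpacked by a startup script
inside UML.  The variable name SETUP, its contents, and the helper name
split_setup are all made up for this example.
<BLOCKQUOTE><CODE>
<PRE>
# Hypothetical helper: split a comma-separated list of name=value pairs,
# such as ip=192.168.0.2,gw=192.168.0.1, into one pair per line.
split_setup() {
    echo "$1" | tr ',' '\n'
}

# On the host, pass everything as one variable (names are assumed examples):
#   ./linux ubd0=root_fs SETUP=ip=192.168.0.2,gw=192.168.0.1,name=uml1
# Inside UML, a startup script can then unpack it:
split_setup "${SETUP:-ip=192.168.0.2,gw=192.168.0.1,name=uml1}"
</PRE>
</CODE></BLOCKQUOTE>

That way the whole setup counts as a single environment variable
against the limit.</P>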


<H2><A NAME="ss13.11">13.11</A> <A HREF="UserModeLinux-HOWTO.html#toc13.11">UML build conflict between ptrace.h and ucontext.h</A>
</H2>

<P>On some older systems, /usr/include/asm/ptrace.h and 
/usr/include/sys/ucontext.h define the same names.  So, when they're
included together, the defines from one completely mess up the parsing
of the other, producing errors like:
<BLOCKQUOTE><CODE>
<PRE>
/usr/include/sys/ucontext.h:47: parse error before `10'
</PRE>
</CODE></BLOCKQUOTE>

plus a pile of warnings.</P>
<P> 
This is a libc botch, which has since been fixed, and I don't see any
way around it besides upgrading.</P>


<H2><A NAME="ss13.12">13.12</A> <A HREF="UserModeLinux-HOWTO.html#toc13.12">The UML BogoMips is exactly half the host's BogoMips</A>
</H2>

<P>On i386 kernels, there are two ways of running the loop that is used
to calculate the BogoMips rating: using the TSC if it's there, or using
a one-instruction loop.  The TSC method produces twice the BogoMips of
the loop.  UML uses the loop, since it has nothing resembling a TSC, and
will get almost exactly the same BogoMips as a host using the loop.
A host with a TSC, however, will report double the loop BogoMips, and
therefore double the UML BogoMips.</P>
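<P>To compare the two numbers, read the value from /proc/cpuinfo on
the host and again inside UML.  A small sketch, with the helper name
bogomips made up for this example:
<BLOCKQUOTE><CODE>
<PRE>
# Hypothetical helper: print the first BogoMips value from a cpuinfo file
# (default /proc/cpuinfo); run it on the host and inside UML to compare.
bogomips() {
    awk -F: 'tolower($1) ~ /bogomips/ { gsub(/ /, "", $2); print $2; exit }' "${1:-/proc/cpuinfo}"
}

bogomips
</PRE>
</CODE></BLOCKQUOTE>
</P>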


<H2><A NAME="ss13.13">13.13</A> <A HREF="UserModeLinux-HOWTO.html#toc13.13">When you run UML, it immediately segfaults</A>
</H2>

<P>If the host is configured with the 2G/2G address space split, that's
why.  See 
<A HREF="UserModeLinux-HOWTO-4.html#2G-2G">UML on 2G/2G hosts</A>  for
the details on getting UML to run on your host.</P>


<H2><A NAME="ss13.14">13.14</A> <A HREF="UserModeLinux-HOWTO.html#toc13.14">xterms appear, then immediately disappear</A>
</H2>

<P>If you're running an up-to-date kernel with an old release of 
uml_utilities, the port-helper program will not work properly, so 
xterms will exit straight after they appear. The solution is to
upgrade to the latest release of uml_utilities.  Usually this problem
occurs when you have installed a packaged release of UML then
compiled your own development kernel without upgrading the
uml_utilities from the source distribution.</P>


<H2><A NAME="ss13.15">13.15</A> <A HREF="UserModeLinux-HOWTO.html#toc13.15">cannot set up thread-local storage</A>
</H2>


<P>This problem is fixed by the skas-hold-own-ldt patch that went into
2.6.15-rc1.</P>
<P> </P>
<P>The boot looks like this:
<BLOCKQUOTE><CODE>
<PRE>
cannot set up thread-local storage: cannot set up LDT for thread-local storage
Kernel panic - not syncing: Attempted to kill init!
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>Your UML kernel doesn't support the Native POSIX Thread Library (NPTL),
and the binaries you're running are dynamically linked against it.  Try
running in SKAS3 mode first.  You might be able to avoid the kernel
panic by setting the  
<A HREF="http://people.redhat.com/drepper/assumekernel.html"> LD_ASSUME_KERNEL</A>  environment variable on the command line:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
./linux init=/bin/sh LD_ASSUME_KERNEL=2.4.1
</PRE>
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P> 
Many commands are very restrictive about what is preserved in the
environment when starting child processes, so relying on
LD_ASSUME_KERNEL to be globally set for all processes in the whole
system is generally not a good idea.  It's very hard to
guarantee. Thus it's better to move the NPTL libraries away:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
# mount root_fs mnt-uml/ -o loop
# mv mnt-uml/lib/tls mnt-uml/lib/tls.away
# umount mnt-uml
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>If you're running Debian, you might prefer to use dpkg-divert:
<BLOCKQUOTE><CODE>
<PRE>
# export LD_ASSUME_KERNEL=2.4.1
# mount root_fs mnt-uml/ -o loop
# chroot mnt-uml
# mkdir /lib/tls.off
# cd /lib/tls
# for f in *;
  do
       dpkg-divert --local --rename --divert /lib/tls.off/$f --add /lib/tls/$f;
  done
# exit
# umount mnt-uml
# unset LD_ASSUME_KERNEL
</PRE>
</CODE></BLOCKQUOTE>
</P>
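<P>To see which thread library a glibc-based system is using, before or
after moving /lib/tls away, you can ask getconf.  This is a sketch which
assumes a glibc whose getconf knows GNU_LIBPTHREAD_VERSION; the helper
name thread_lib is made up for this example.
<BLOCKQUOTE><CODE>
<PRE>
# Hypothetical helper: print the thread library glibc reports, which is
# a string naming NPTL or linuxthreads, or "unknown" if getconf
# cannot answer.
thread_lib() {
    getconf GNU_LIBPTHREAD_VERSION || echo unknown
}

thread_lib
</PRE>
</CODE></BLOCKQUOTE>

Run it inside the chroot or inside UML, not on the host, since the
answer depends on the libraries in that filesystem.</P>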


<H2><A NAME="ss13.16">13.16</A> <A HREF="UserModeLinux-HOWTO.html#toc13.16">Process segfaults with a modern (NPTL-using) filesystem</A>
</H2>

<P>These appear to be fixed with the tls patches from Blaisorblade that
are currently in my  
<A HREF="http://user-mode-linux.sourceforge.net/patches.html">patchset</A> .  You can apply the entire
patchset, or you can move /lib/tls in the image away, as described
above.</P>


<H2><A NAME="ss13.17">13.17</A> <A HREF="UserModeLinux-HOWTO.html#toc13.17">Any other panic, hang, or strange behavior</A>
</H2>

<P>If you're seeing truly strange behavior, such as hangs or panics that
happen in random places, or you try running the debugger to see what's
happening and it acts strangely, then it could be a problem in the
host kernel.  If you're not running a stock Linus or -ac kernel, then
try that.  An early version of the preemption patch and a 2.4.10 SuSE
kernel have caused very strange problems in UML.</P>
<P> 
Otherwise, let me know about it.  Send a message to one of the UML
mailing lists, either the developer list (user-mode-linux-devel at
lists dot sourceforge dot net) or the user list (user-mode-linux-user
at lists dot sourceforge dot net), whichever you prefer.  Don't assume
that everyone knows about it and that a fix is imminent.</P>
<P> 
If you want to be super-helpful, read 
<A HREF="UserModeLinux-HOWTO-14.html#trouble">Diagnosing Problems</A> 
and follow the instructions contained therein.</P>







<HR>
<A HREF="UserModeLinux-HOWTO-14.html">Next</A>
<A HREF="UserModeLinux-HOWTO-12.html">Previous</A>
<A HREF="UserModeLinux-HOWTO.html#toc13">Contents</A>
</BODY>
</HTML>