<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
<META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.21">
<TITLE>User Mode Linux HOWTO : What to do when UML doesn't work</TITLE>
<LINK HREF="UserModeLinux-HOWTO-14.html" REL=next>
<LINK HREF="UserModeLinux-HOWTO-12.html" REL=previous>
<LINK HREF="UserModeLinux-HOWTO.html#toc13" REL=contents>
</HEAD>
<BODY>
<A HREF="UserModeLinux-HOWTO-14.html">Next</A>
<A HREF="UserModeLinux-HOWTO-12.html">Previous</A>
<A HREF="UserModeLinux-HOWTO.html#toc13">Contents</A>
<HR>
<H2><A NAME="faq"></A> <A NAME="s13">13.</A> <A HREF="UserModeLinux-HOWTO.html#toc13">What to do when UML doesn't work</A></H2>
<P> </P>
<H2><A NAME="ss13.1">13.1</A> <A HREF="UserModeLinux-HOWTO.html#toc13.1">Child nnnnn exited with signal 11</A>
</H2>
<P>This appears just after
<BLOCKQUOTE><CODE>
<PRE>
VFS: Mounted root (ext2 filesystem) readonly.
Mounted devfs on /dev
</PRE>
</CODE></BLOCKQUOTE>
There are two known causes for this. The first is old, and isn't the
problem unless you're running a seriously old host kernel:
<UL>
<LI>It's caused by a bug in the host kernel introduced in 2.4.0-test8 and
backed out in test9. The fix is to run some kernel other than test8
as the host.
</LI>
</UL>
The other cause, which is far more likely these days:
<UL>
<LI>You're running a recent distribution on an old machine. I saw this
with the RH7.1 filesystem running on a Pentium. The shared library
loader, ld.so, was executing an instruction (cmove) which the Pentium
doesn't support. The conditional move instructions were only added
with the Pentium Pro. If you run UML under the debugger, you'll see
that the hang is caused by that one instruction producing an infinite
stream of SIGILLs.
</LI>
</UL>
I have another report of this which appears not to be caused by either
of the above.</P>
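<P>One way to check for the second cause is to see whether the host CPU
advertises the cmov feature. The following is only a minimal sketch,
assuming an x86 Linux host whose /proc/cpuinfo has a "flags" line. The
function name cpu_has_flag is made up for illustration.</P>

```shell
# Report whether a CPU feature flag (e.g. cmov) appears on the "flags"
# line of a cpuinfo-style file. Assumes the x86 /proc/cpuinfo layout.
# Usage: cpu_has_flag FLAG [FILE]   (FILE defaults to /proc/cpuinfo)
cpu_has_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    # The flags line is a space-separated list; -w matches whole words only.
    grep -E '^flags[[:space:]]*:' "$file" | grep -qw -- "$flag"
}

# Example call (left commented out so the sketch has no side effects):
# cpu_has_flag cmov || echo "host CPU does not report cmov"
```

<P>If the flag is missing, the SIGILL explanation above becomes the
likely suspect.</P>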
<H2><A NAME="ss13.2">13.2</A> <A HREF="UserModeLinux-HOWTO.html#toc13.2">Segfault in padzero</A>
</H2>
<P>You run UML under the kernel debugger and this appears in the debugger
window:
<BLOCKQUOTE><CODE>
<PRE>
Program received signal SIGSEGV, Segmentation fault.
0x10035830 in padzero (elf_bss=1073765049)
at /ext1/usermode/linux/include/asm/arch/string.h:418
418 __asm__ __volatile__(
</PRE>
</CODE></BLOCKQUOTE>
This is the normal faulting in of init. To keep gdb from stopping every
time a page is faulted in, do this in the debugger window:
<BLOCKQUOTE><CODE>
<PRE>
handle SIGSEGV pass nostop noprint
</PRE>
</CODE></BLOCKQUOTE>
This isn't necessary as of test10, since the kernel debugger no longer
sees SIGSEGV.</P>
<H2><A NAME="ss13.3">13.3</A> <A HREF="UserModeLinux-HOWTO.html#toc13.3">Out of pty's in getmaster</A>
</H2>
<P>When UML boots up, it panics like this:
<BLOCKQUOTE><CODE>
<PRE>
Initializing stdio console driver
Initializing software serial port version 0
Kernel panic: Out of pty's in getmaster
</PRE>
</CODE></BLOCKQUOTE>
Either your system is out of pseudo-terminals, in which case you need
to figure out why and fix it, or you're running devfs with no
old-style tty-pty pairs. Make a few, and this panic will go away.</P>
<P>
As of test10, this problem doesn't cause a panic. The serial line
driver just fails to initialize itself.</P>
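<P>To see whether the host has any old-style pty masters at all, you can
count them. This is a rough sketch, assuming the conventional BSD-style
names (/dev/ptyp0 and so on). The function name count_bsd_ptys is made up
for illustration.</P>

```shell
# Count old-style pty master nodes (ptyp0, ptyp1, ...) in a directory.
# Usage: count_bsd_ptys [DIR]   (DIR defaults to /dev)
count_bsd_ptys() {
    dir=${1:-/dev}
    n=0
    for f in "$dir"/pty[a-z][0-9a-f]; do
        # If nothing matches, the pattern stays literal and fails this test.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Example call (commented out):
# count_bsd_ptys
# If it prints 0, a pair can be created as root. The major numbers below
# are taken from the kernel's devices.txt; check them on your host first:
# mknod /dev/ptyp0 c 2 0 && mknod /dev/ttyp0 c 3 0
```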
<H2><A NAME="ss13.4">13.4</A> <A HREF="UserModeLinux-HOWTO.html#toc13.4">Can't set up the umn device : "Failed to set slip line discipline"</A>
</H2>
<P>On newer host kernels, the security on the slip device was tightened
so that you need to be root in order to set the slip line discipline
on a terminal. On recent versions of UML this isn't a problem since
the uml_net helper sets up the slip device.</P>
<H2><A NAME="ss13.5">13.5</A> <A HREF="UserModeLinux-HOWTO.html#toc13.5">Stack overflowed onto current_task page</A>
</H2>
<P>This panic was introduced in test9 to try to catch a real stack
overflow bug. It actually caught a lot of cases which weren't bugs.
It's fixed in test10 by the stack being twice as big and there being a
guard page between the stack and the task structure. This panic is
probably only seen on fairly recent 2.4.0 host kernels. So, a
workaround would be to run a 2.2 or a not-too-recent 2.3/2.4 kernel as
the host.</P>
<H2><A NAME="ss13.6">13.6</A> <A HREF="UserModeLinux-HOWTO.html#toc13.6">Strange compilation errors when you build from source</A>
</H2>
<P>As of test11, it is necessary to have "ARCH=um" in the environment or
on the make command line for every step in building UML: clean,
distclean, or mrproper; config, menuconfig, or xconfig; dep; and
linux. If you forget it for any of them, the i386 build seems to
contaminate the UML build. If this happens, start from scratch with
<BLOCKQUOTE><CODE>
<PRE>
make mrproper ARCH=um
</PRE>
</CODE></BLOCKQUOTE>
and repeat the build process
with ARCH=um on all the steps.</P>
<P>
See
<A HREF="UserModeLinux-HOWTO-2.html#compile">Compiling the kernel and modules</A> for
more details.</P>
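<P>One way to avoid forgetting ARCH=um is to wrap make so that the
setting is always on the command line. This is a sketch only: the
function names uml_make and uml_build are made up, and menuconfig is just
one of the configuration targets mentioned above.</P>

```shell
# Run make with ARCH=um forced onto the command line.
# MAKE can be overridden (e.g. MAKE=echo) to preview the command.
uml_make() {
    ${MAKE:-make} ARCH=um "$@"
}

# Run the build steps named in the text in order, stopping at the
# first one that fails.
uml_build() {
    for target in mrproper menuconfig dep linux; do
        uml_make "$target" || return 1
    done
}

# Example calls (commented out):
# MAKE=echo uml_build    # print the four make invocations without running them
# uml_build              # run them for real, from the top of the UML tree
```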
<P>
Another cause of strange compilation errors is building UML in
/usr/src/linux. If you do this, the first thing you need to do is
clean up the mess you made. The /usr/src/linux/include/asm link will
now point to asm-um. Make it point back to asm-i386. Then, move your
UML pool someplace else and build it there. Also see the section below
on conflicting types for 'open', 'dup', and 'waitpid', where a more
specific set of symptoms is described.</P>
<H2><A NAME="ss13.7">13.7</A> <A HREF="UserModeLinux-HOWTO.html#toc13.7">UML hangs on boot after mounting devfs</A>
</H2>
<P>If you have the debugger running, it will always show
copy_mount_options on the stack. This was originally put down to a
bogus compiler, and the workaround on systems that have a kgcc was to
redo the UML build with "CC=kgcc" on the make command line.</P>
<P>
It turned out to be a UML bug, not a compiler bug, and has since been fixed.</P>
<H2><A NAME="ss13.8">13.8</A> <A HREF="UserModeLinux-HOWTO.html#toc13.8">A variety of panics and hangs with /tmp on a reiserfs filesystem</A>
</H2>
<P>I saw this on reiserfs 3.5.21 and it seems to be fixed in 3.5.27.
Panics preceded by
<BLOCKQUOTE><CODE>
<PRE>
Detaching pid nnnn
</PRE>
</CODE></BLOCKQUOTE>
are
diagnostic of this problem. This is a reiserfs bug which causes a
thread to occasionally read stale data from a mmapped page shared with
another thread. The fix is to upgrade the filesystem or to have /tmp be
an ext2 filesystem.</P>
<H2><A NAME="ss13.9">13.9</A> <A HREF="UserModeLinux-HOWTO.html#toc13.9">The compile fails with errors about conflicting types for 'open', 'dup', and 'waitpid'</A>
</H2>
<P>This happens when you build in /usr/src/linux. The UML build makes
the include/asm link point to include/asm-um. /usr/include/asm points
to /usr/src/linux/include/asm, so when that link gets moved, files
which need to include the asm-i386 versions of headers get the
incompatible asm-um versions. The fix is to move the include/asm link
back to include/asm-i386 and to do UML builds someplace else.</P>
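<P>The state of the link can be checked before and after the cleanup.
The following is a small sketch. The function names are made up, and it
assumes a readlink command is available on the host.</P>

```shell
# Print what the include/asm symlink of a kernel tree points to.
# Usage: asm_link_target [TREE]   (TREE defaults to /usr/src/linux)
asm_link_target() {
    tree=${1:-/usr/src/linux}
    readlink "$tree/include/asm"
}

# Succeed only if the link points at the i386 headers.
asm_link_is_i386() {
    case "$(asm_link_target "$1")" in
        asm-i386|*/asm-i386) return 0 ;;
        *) return 1 ;;
    esac
}

# Example calls (commented out):
# asm_link_is_i386 || echo "include/asm points to: $(asm_link_target)"
# To repoint the link, as root (the -n option assumes GNU ln):
# ln -sfn asm-i386 /usr/src/linux/include/asm
```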
<H2><A NAME="ss13.10">13.10</A> <A HREF="UserModeLinux-HOWTO.html#toc13.10">UML doesn't work when /tmp is an NFS filesystem</A>
</H2>
<P>This seems to be a similar situation to the reiserfs problem above. Some
versions of NFS seem not to handle mmap correctly, which UML depends on.
The workaround is to have /tmp be a non-NFS directory.</P>
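<P>For both the reiserfs and the NFS cases, it helps to know what kind of
filesystem /tmp actually lives on. This is a sketch that reads a
/proc/mounts-style table. The function name fs_type_of is made up, and
mount points containing spaces are not handled.</P>

```shell
# Print the filesystem type of the mount that covers a given path,
# using a /proc/mounts-style table (device, mount point, type, ...).
# Usage: fs_type_of PATH [MOUNTS_FILE]
fs_type_of() {
    target=$1
    mounts=${2:-/proc/mounts}
    awk -v t="$target" '
        {
            mp = $2
            # A mount covers the path if it is the path itself, the root,
            # or a directory prefix of the path.
            if (mp == t || mp == "/" || index(t, mp "/") == 1) {
                # Prefer the longest mount point; on ties the later entry wins.
                if (length(mp) >= best) { best = length(mp); type = $3 }
            }
        }
        END { print type }
    ' "$mounts"
}

# Example call (commented out):
# fs_type_of /tmp    # prints e.g. ext2, reiserfs, or nfs
```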
<H2><A NAME="ss13.11">13.11</A> <A HREF="UserModeLinux-HOWTO.html#toc13.11">UML hangs on boot when compiled with gprof support</A>
</H2>
<P>If you build UML with gprof support and, early in the boot, it does this
<BLOCKQUOTE><CODE>
<PRE>
kernel BUG at page_alloc.c:100!
</PRE>
</CODE></BLOCKQUOTE>
you have a buggy gcc. You can work around the problem by removing
UM_FASTCALL from CFLAGS in arch/um/Makefile-i386. This will open up
another bug, but that one is fairly hard to reproduce.</P>
<H2><A NAME="ss13.12">13.12</A> <A HREF="UserModeLinux-HOWTO.html#toc13.12">syslogd dies with a SIGTERM on startup</A>
</H2>
<P>The exact boot error depends on the distribution that you're booting,
but Debian produces this:
<BLOCKQUOTE><CODE>
<PRE>
/etc/rc2.d/S10sysklogd: line 49: 93 Terminated
start-stop-daemon --start --quiet --exec /sbin/syslogd -- $SYSLOGD
</PRE>
</CODE></BLOCKQUOTE>
This is a syslogd bug. There's a race between a parent process
installing a signal handler and its child sending the signal. See
<A HREF="http://www.geocrawler.com/lists/3/SourceForge/709/0/6612801">this uml-devel post</A> for the details.</P>
<H2><A NAME="ss13.13">13.13</A> <A HREF="UserModeLinux-HOWTO.html#toc13.13">TUN/TAP networking doesn't work on a 2.4 host</A>
</H2>
<P>There are a couple of problems which were
<A HREF="http://www.geocrawler.com/lists/3/SourceForge/597/0/">pointed out</A> by
Tim Robinson (timro at trkr dot net):
<UL>
<LI>It doesn't work on hosts running 2.4.7 (or thereabouts) or earlier. The fix
is to upgrade to something more recent and then read the next item.
</LI>
<LI>If you see
<BLOCKQUOTE><CODE>
<PRE>
File descriptor in bad state
</PRE>
</CODE></BLOCKQUOTE>
when you
bring up the device inside UML, you have a header mismatch between the
original kernel and the upgraded one. Make /usr/src/linux point at
the new headers. This will only be a problem if you build uml_net
yourself.
</LI>
</UL>
</P>
<H2><A NAME="ss13.14">13.14</A> <A HREF="UserModeLinux-HOWTO.html#toc13.14">You can network to the host but not to other machines on the net</A>
</H2>
<P>This is because of routing that's automatically set up, but which is
wrong for UML. You need to delete the network route and replace it
with a host route to the host IP. See the bottom of
the
<A HREF="http://user-mode-linux.sourceforge.net/networking.html#routing">networking page</A> for details.</P>
<P>
This has been fixed by UML setting up proxy arp differently so that
things work with the network route and the host route isn't needed.</P>
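<P>For older versions that still need the manual fix, the route change
described above can be sketched as follows. The function only prints
the commands. The function name, the eth0 default, and any addresses
you pass in are placeholders, and the commands assume the net-tools
route utility inside UML.</P>

```shell
# Print (do not run) the route commands that replace the automatic
# network route with a host route to the host.
# Usage: uml_host_route_cmds NETWORK NETMASK HOST_IP [DEVICE]
uml_host_route_cmds() {
    net=$1 mask=$2 host=$3 dev=${4:-eth0}
    echo "route del -net $net netmask $mask dev $dev"
    echo "route add -host $host dev $dev"
}

# Example call with made-up addresses (commented out). Review the
# output, then run the printed commands as root inside UML:
# uml_host_route_cmds 192.168.0.0 255.255.255.0 192.168.0.4
```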
<H2><A NAME="ss13.15">13.15</A> <A HREF="UserModeLinux-HOWTO.html#toc13.15">I have no root and I want to scream</A>
</H2>
<P>Thanks to Birgit Wahlich for telling me about this strange one. It
turns out that there's a limit of six environment variables on the
kernel command line. When that limit is reached or exceeded, argument
processing stops, which means that the 'root=' argument that UML
usually adds is not seen. The filesystem code then has no idea what the
root device is, so it panics.</P>
<P>
The fix is to put less stuff on the command line. Glomming all your
setup variables into one is probably the best way to go.</P>
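<P>To get a rough idea of how close a command line is to the limit, you
can count its NAME=value words. This sketch gives an upper bound only,
on the assumption that options the kernel itself recognizes are
consumed rather than passed on as environment variables. The function
name count_assignments is made up.</P>

```shell
# Count command-line words of the form NAME=value.
# Usage: count_assignments ARG...
count_assignments() {
    n=0
    for arg in "$@"; do
        case $arg in
            [A-Za-z_]*=*) n=$((n + 1)) ;;
        esac
    done
    echo "$n"
}

# Example call with hypothetical arguments (commented out):
# count_assignments ubd0=root_fs HOSTNAME=uml1 IP=192.168.0.250 single
```

<P>If the count gets near six, fold several variables into one, as
suggested above.</P>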
<H2><A NAME="ss13.16">13.16</A> <A HREF="UserModeLinux-HOWTO.html#toc13.16">UML build conflict between ptrace.h and ucontext.h</A>
</H2>
<P>On some older systems, /usr/include/asm/ptrace.h and
/usr/include/sys/ucontext.h define the same names. So, when they're
included together, the defines from one completely mess up the parsing
of the other, producing errors like:
<BLOCKQUOTE><CODE>
<PRE>
/usr/include/sys/ucontext.h:47: parse error before
`10'
</PRE>
</CODE></BLOCKQUOTE>
plus a pile of warnings.</P>
<P>
This is a libc botch, which has since been fixed, and I don't see any
way around it besides upgrading.</P>
<H2><A NAME="ss13.17">13.17</A> <A HREF="UserModeLinux-HOWTO.html#toc13.17">The UML BogoMips is exactly half the host's BogoMips</A>
</H2>
<P>On i386 kernels, there are two ways of running the loop that is used
to calculate the BogoMips rating: using the TSC if it's there, or using
a one-instruction loop. The TSC method produces twice the BogoMips of
the loop. UML uses the loop, since it has nothing resembling a TSC, and
will get almost exactly the same BogoMips as a host using the loop.
A host with a TSC, however, reports double the loop BogoMips, and
therefore double the UML BogoMips.</P>
<H2><A NAME="ss13.18">13.18</A> <A HREF="UserModeLinux-HOWTO.html#toc13.18">When you run UML, it immediately segfaults</A>
</H2>
<P>If the host is configured with the 2G/2G address space split, that's
why. See
<A HREF="UserModeLinux-HOWTO-4.html#2G-2G">UML on 2G/2G hosts</A> for
the details on getting UML to run on your host.</P>
<H2><A NAME="ss13.19">13.19</A> <A HREF="UserModeLinux-HOWTO.html#toc13.19">Any other panic, hang, or strange behavior</A>
</H2>
<P>If you're seeing truly strange behavior, such as hangs or panics that
happen in random places, or you try running the debugger to see what's
happening and it acts strangely, then it could be a problem in the
host kernel. If you're not running a stock Linus or -ac kernel, then
try that. An early version of the preemption patch and a 2.4.10 SuSE
kernel have caused very strange problems in UML.</P>
<P>
Otherwise, let me know about it. Send a message to one of the UML
mailing lists, either the developer list, user-mode-linux-devel at
lists dot sourceforge dot net (subscription info), or the user list,
user-mode-linux-user at lists dot sourceforge dot net (subscription
info), whichever you prefer. Don't assume that everyone knows about it
and that a fix is imminent.</P>
<P>
If you want to be super-helpful, read
<A HREF="UserModeLinux-HOWTO-14.html#trouble">Diagnosing Problems</A>
and follow the instructions contained therein.</P>
<HR>
<A HREF="UserModeLinux-HOWTO-14.html">Next</A>
<A HREF="UserModeLinux-HOWTO-12.html">Previous</A>
<A HREF="UserModeLinux-HOWTO.html#toc13">Contents</A>
</BODY>
</HTML>