
Dear Diary

This page contains information about what's currently happening with the project. I may update it once in a while if I feel like it.

17 Feb 2002

I submitted a paper proposal for OLS yesterday. This time, I decided to rant and rave about how the host kernel needs to be fixed to better support virtual machines. We'll see how that goes. At least it's different from the standard UML song and dance I've been flogging for the last couple of years.

I redid the UML signal delivery code in order to fix the pthreads hang with the newer pthreads library. I prototyped this in a small standalone process in Incheon Airport (Seoul's international airport) on my way back from Brisbane and just got around to integrating it into UML. It massively cleaned up a bunch of code and opens the way for getting rid of some other problems that have existed for a while.

11 Feb 2002
My flight leaves tonight, so I spent the day wandering around central Brisbane. Ben LaHaise happened to take the same CityCat (one of the catamarans that ferry people up and down the Brisbane River) down to the city as I did. He tried to blow up the boat with his umbrella, but, fortunately, an alert crew member stopped him.

Back to the Uni at the end of the afternoon, collect my bags, call a taxi to the airport, and wait for the flight to Seoul. Then it's another 14 hours to JFK and another couple back to CT and then I will be hopelessly confused about what time it is for a couple of days.

10 Feb 2002
LCA is over as of yesterday, and I'm heading home tomorrow.

The slides from my talk (with notes) are available here. It was pretty well attended and it seemed to be received fairly well. LCA always seems to do some innovative things, one of which is to rerun talks that a large number of people regretted missing. The three talks that were chosen this year were a virtual reality talk that a huge number of people wanted to see, Andrew van der Stock's talk on code auditing, and one other that I can't remember. This was good because I was one of the huge number of people that wanted to see the virtual reality talk. However, it turned out that my talk was the number four vote-getter, and this turned out to be relevant when Andrew was nowhere to be seen when the reruns were about to happen.

So, I was happily debugging the ppc UML build with Anton Blanchard when one of the organizers ran up and asked me if I could do mine again. I did so, except I skipped over some of the heavier parts of the talk to leave a good bit of time for a demo at the end.

This went well, except for the panics I got when I tried to have two UMLs mount the same filesystem. I demoed three UMLs running (one Debian, two Slackwares) with most of them (plus the host, I think) displaying on the X server of one of the Slackware UMLs. I showed various other aspects of UML like what it looks like from the host side.

1 Feb 2002
I'm off to Australia tomorrow for LCA 2002. This caps a fairly productive week of UML bug hunting:
  • mistral and blinky started seeing a panic in fork. mistral figured out how to reproduce it and tracked it down to the point where it had something to do with kernel threads reference counting their mm's. This was enough information for me to fix the bug, by giving kernel threads NULL mm's.
  • While I was at ISTS giving a talk, I spent some free time looking at the pthreads problem that people have been seeing for a while. It turns out to be a problem with UML signal delivery. I had assumed that the registers going into a signal handler didn't matter (except for the IP and SP, of course) and that only the stack frame mattered. So, the process registers at the start of a signal delivery are initialized with a set that was captured from a UML thread at boot time. This works fine usually, except that recent pthreads libraries store some thread-private data in %gs, and the %gs value has to be preserved in signal handlers. So, the UML signal delivery mechanism needs to be reworked again.
  • The UML IO hangs that were reported this week were tracked down and found to be a bug in the host's handling of SIGIO. It turns out to be possible, on an SMP host, for SIGIO to be queued to a process after that process has returned from the fcntl that registered a different process as the SIGIO recipient. This breaks UML badly because SIGIOs end up queued, but not delivered, to a process which is out of context and sleeping.
30 Jan 2002
I gave a talk at ISTS yesterday on the UML security work that I did last week. It's available both as html-ized slides and as the original Star Office presentation. These are intended to provide a starting point for anyone wanting to probe this for exploitable holes as well as anyone who's curious about what was done.

I released 2.4.17-10 today. It contains a pile of bug fixes and a bunch of changes which allow a UML patch to come close to compiling in both 2.4 and 2.5 pools. I reverted a change which is causing problems on SMP hosts. I decided that using sockets rather than pts devices to communicate between the IO thread and UML was a good idea because sockets are lighter weight and they're pretty much guaranteed to be supported on the host, whereas there are lots of systems without pts devices. However, there is some difference in how SIGIO is delivered which causes UML to lose interrupts once in a while. The effect is that it seems to hang on boot, but can be made to continue by banging on the keyboard.

MTD is in the configuration now. So, UML supports MTD devices and creating JFFS2 filesystems and mounting them seems to work, although there are some nasty-looking error messages along the way. They don't seem obviously related to UML though.

25 Jan 2002
The security work is largely done. The exception is the lcall prevention fix that's needed on the host. So, I released 2.4.17-9 today. It also contains a number of fixes and patches from other people, the largest being the latest set of James McMechan's ubd changes.
23 Jan 2002
I released 2.4.17-8 yesterday (and 2.4.17-7 earlier this week without dignifying it with a diary entry). I spent a fair amount of time tracking down some old debugging problems. The strace recipe on the debugging page hasn't worked for a while, so I mostly fixed it. strace still doesn't see system calls from new processes until they receive a signal.

I also figured out what was happening with using gdb under ddd as an external debugger. ddd periodically calls wait on gdb, and when UML attaches to it, gdb gets reparented away from ddd. wait starts returning ECHILD, and ddd reacts by shutting down gdb's input, and gdb, in turn, exits. To fix this, I think I'm going to have to have a 'gdb-parent=' switch that will make UML attach to the parent and fake normal return values from wait.

In other news, there have been a couple of articles about UML recently. NewsForge ran one yesterday. This is a followup on the article last week about Linux virtual machines which completely failed to mention UML. Bill Stearns also noticed this article on using UML as the basis of a honeynet.

I've started finishing off the security work needed to make UML a secure root jail. The 'jail' switch now checks for config options which would make the UML inherently insecure and refuses to run if any of them are enabled. Currently, the proscribed options are CONFIG_MODULES, CONFIG_HOSTFS, and CONFIG_SMP. CONFIG_MODULES is fairly obvious. If modules are enabled, then root can insert any code at all into UML, and a nasty root would insert code that execs a shell or something on the host. CONFIG_HOSTFS is forbidden to prevent accidentally providing access to the host filesystem. This is a bit dubious because hostfs is not inherently insecure, and I may relax this one at some point. CONFIG_SMP is non-obvious. Lennert Buytenhek noticed the relationship between SMP and security. 'jail' is implemented by unprotecting kernel memory (by making it writable) on entering the UML kernel, and write-protecting it on kernel exit. If a process were to have two threads, one busy-waiting in userspace, and the other sleeping in the kernel, kernel memory would be writable because the sleeping thread would be in the kernel. So, the spinning thread would wait for that to happen and write on whatever part of kernel memory would let it escape. This will be fixed when UML gets a separate address space for the kernel.

I've stared at /proc and /dev to find devices that provide access to kernel memory. The only two that I spotted were /dev/mem and /dev/kmem. These are disabled with a trick that someone on #kernelnewbies told me about. Access to them is controlled by CAP_SYS_RAWIO, so removing that capability from the bounding capability set makes it impossible for any process to ever get it. So, no process can ever open those devices. Even better, in my limited testing, nothing seems to break badly as a result.

/proc/kcore looks suspicious, but it's a readonly file that seems to spit out a memory image wrapped in an ELF header, so it's OK, security-wise.

18 Jan 2002
2.4.17-6 is out. There are a lot of driver cleanups and bug fixes in this patch. The IRQ hang is fixed. The default console and serial line channel initialization strings are now configurable.

In other news, NewsForge ran an article about Linux virtual machines which covered everything relevant, including some things that weren't virtual machines, except for UML. They got some complaints about that, and later that day, I got a piece of email from the guy who wrote the article wanting to write a followup about UML. So, it looks like UML will be getting another nice bit of publicity.

13 Jan 2002
I made another patch against the 2.5.2-pre tree again today. This one is against 2.5.2-pre11. It has been sent off to Linus so he can drop it in his bit bucket.

In another development, it turns out that the O(1) scheduler breaks UML by holding IRQs disabled across context switches. This results in SIGIO (i.e. from disk IO completions) being trapped in a process that has gone out of context and can't be woken up until something else notices that the IO has completed, and of course it won't because the SIGIO has been delivered to the wrong process.

11 Jan 2002
After spending five days tracking down a swap corruption bug, I discovered that Rodrigo de Castro had explained to me exactly what it was about a month ago. Unfortunately, at the time, I didn't know enough about the swap code to decide whether he was making sense. Of course he was, and I discovered that at the end of the great bug hunt.

So, with that fix, Lennert's latest SMP changes, and a bunch of smaller stuff, I'm releasing 2.4.17-5 today.

Linus saw fit to silently drop UML into the bit bucket again, so I'll make another patch soon and send it in.

5 Jan 2002
I released the 2.4.17-3 and 2.4.17-4 patches this week. The biggest change has been the merging of Lennert's SMP fixes.

I made another attempt to get UML into the Linus tree. This patch is against 2.5.2-pre9 and is the 2.4.17-4 patch. We'll see how well this attempt fares.

30 Dec 2001
I announced the full 2.4.17 release and the 2.4.17-2 patch today. The patch is largely changes to allow UML to calculate current from the stack. This is to make life easier for Lennert, who's trying to get SMP working. It also contains a bunch of fixes for bugs that crop up when host devices get closed from under UML consoles or serial lines.

Iain Young is making a decent stab at a UML/sparc64 port. He's got the boilerplate filled in (albeit with some skeptical comments about its correctness...) and he's trying to get the whole thing to compile.

Linus released 2.5.2-pre4 today with no sign of UML in the changelog. I grabbed the patch to make sure it wasn't there. It wasn't. Grrr. I'll give him another patch and then I'll spam him with it again.

28 Dec 2001
OK, so the diary has taken a bit of a holiday break. I released the 2.4.17 UML patch yesterday. The full release will be forthcoming. I also ported that patch into 2.5.1, created the 2.5.1 patch, and sent it to Linus. Hopefully, he will put it in without my having to resend it too many times. I sent a little note off to LKML announcing this, which got some favorable reaction, both on and off the list. Alan, being his usual taciturn self, sent a reply which read, in its entirety, "Cool".

This release had a lot of accumulated stuff in it. The biggest item is the port channel, which lets you attach any number of UML consoles and serial lines to a host port, at which point you can access them by telnetting to that port. I also redid the context switching mechanism after thinking of a much simpler way of doing it. This should also be much faster since it doesn't involve signals flying around, so context switches are now invisible to the tracing thread.

8 Dec 2001
I released 2.4.15-3 today. It contains some fixes to previous patches and a lot of changes to the gdb support. gdb now sees ^C immediately, rather than an arbitrary amount of time after it's typed. I also cleaned up that code quite a bit. This knocks a couple of items off the todo list. It also sets me up to fix the gdb shell hang, but I'll let these changes gel for a bit before dealing with that.
7 Dec 2001
I got the Sysadmin Disaster of the Month contest going about a week later than I should have. The thing that happened a week earlier was the publication of an article that I wrote for O'Reilly on using UML to simulate and recover from disasters.

This month's disaster is a trashed root superblock. It involves booting UML, zeroing out the superblock, and figuring out how to fix the filesystem. I had some trouble coming up with a good example to use for the contest. So, if I have similar troubles at the end of the month, I might just trash a filesystem, make it available for download, and the contest will be to figure out what's wrong with it and fix it.

4 Dec 2001
Back from Linux-Kongress. Back on US/Eastern. I think I never left it, which made life (the staying awake part of it) difficult in Holland.

As for highlights of the conference, we have:

  • I and the UML project seem to have some name recognition. Everyone I talked to seemed to have heard of both me and UML, which is very cool.
  • The talk went reasonably well. I gave the same talk as I did at ALS. I had forgotten that it was somewhat tailored for ALS (i.e. I tried to avoid talking about stuff that I had talked about at the previous ALS), and it would have been somewhat different if I had prepared a new talk for Linux-Kongress.
  • I met a pile of cool people, like
    • Lennert Buytenhek (who, in recognition of his contributions to both projects, was embarrassed by both me and Rusty by being asked to stand up for some audience appreciation, which must be a new Linux-Kongress record)
    • Roman Zippel (who told me about a couple of ubd driver bugs, one of which I knew of (a subtle rounding error), and one of which I didn't (Greg Lonnon and I went to some trouble to put the COW header in network byte order, but forgot to do the same for the block bitmap (grrrr)))
    • Bruce Walker (who's in charge of the Compaq SSI project, and who asked me clustering questions during my talk, at which point, I (correctly) guessed who he worked for and why he was asking)
    • Fabio Olive Leite (a Conectivite who gave me a nice Conectiva filesystem image a while back, and whose name I unaccented to get it through my XSL processor)
    • Philipp Reisner (who threatened an Alpha UML port a while back, but gave up, so he's less cool than the others :-)
  • The organizers took the speakers back to Amsterdam after the conference to spend the day bumming around the city. We broke up into small groups and went our separate ways. Our group spent much of our time in two smallish cafe-type places just talking about random stuff.
  • The train trip from Schiphol airport (Amsterdam) to Enschede (near the German border) was interesting for a number of people. It was exactly as described for me (2+ hours, direct train, no problem), but some track maintenance in Amsterdam's central station caused subsequent trains to be cancelled, so other people had nightmares involving 5 trains and a bus totalling 5 hours.
26 Nov 2001
I released 2.4.14-6 today in the interest of clearing the decks for the 2.4.15 release. Before leaving for Thanksgiving, I had redone the mconsole protocol to be packet-oriented. This allows a lot more flexibility in what can be done with the protocol. As a result, you'll need the new mconsole client for this release.

While down in CT, I redid the host channel support. It is all much cleaner now, and makes it a lot simpler to knock a bunch of related items off the todo list.

I decided to knock off the 2.4.15 patch today as well. It went cleanly, aside from a ptrace cleanup and a new way of generating /proc/cpuinfo which I had to support. I also put in the file corruption fix after forgetting to and discovering that a boot/halt caused fsck to complain a little.

18 Nov 2001
While I'm waiting for the meteors to arrive, I'm chasing and stomping UML bugs. I cleaned up and released the proxy arp fixes that I did on planes and in airports on my way to Oakland. Before, uml_net would blindly add an arp entry to eth0 and nothing else. This is wrong if there is no eth0, and it's also wrong if eth0 doesn't connect to the local net or if there are other interfaces also attached to the local net. uml_net now looks at the routing table and puts an arp entry on every interface that talks to the local net.

I also noticed that slip support wasn't up to date, so I modernized it and cleaned up the code while I was at it. You can now change the IP address of a slip-based interface and the host configuration will be updated just like the other transports.

I added some RT signal support. SA_SIGINFO is now supported, which will hopefully fix some of the strange process behaviors that have cropped up lately. If this fix doesn't do it, I chased down another bug which was causing rt_sigsuspend and sigsuspend to return incorrect values. This was causing the libc sigsuspend to hang, and its process with it. This fixes the pthread_create hang that Greg Lonnon noticed, plus the gdb hang, I think. I haven't checked that yet.

Those fixes are in 2.4.14-3um which I just released. You'll need the latest utilities in order to use the network, since I bumped the uml_net version again.

14 Nov 2001
OK, I'm back from ALS. My talk was on the first day, and it was reasonably well attended considering the somewhat dismal overall number of attendees. It was a half-hour talk, so about an hour beforehand, I took my OLS slides, threw out more than half of them, and updated the rest. That worked out reasonably well, but 30 minutes makes for a very short talk without much detail.

Daniel Phillips' talk was the last of the conference, and was somewhat interesting. He had a pile of raw data that he needed to turn into slides, and all of the KDE presentation tools blew up on him in one way or another. So, in the break before his slot, he grabbed Stephen Tweedie, and the two of them, plus me and another guy, went off to a local dim-sum place. Daniel and Stephen sweated over Star Office on Stephen's laptop making slides. In the event, they turned out rather well.

I left just after Daniel's talk, so I missed out on some of the socializing that afternoon and evening. It turned out that Daniel and Larry McVoy were talking about his clustering ideas (MetaLinux, or ML), and it occurred to them that UML was not only a good simulation tool for ML, but that it actually implements a good part of what Larry has in mind. I found out about this later, and had a long talk with Larry on Monday, in which he explained his plans. I had heard various mumblings about it, and saw a slide show that Larry has, and remained unenlightened. It turns out that, as far as I can tell, the only way to find out what he's thinking is to have him explain it in person. Anyway, I became enlightened after our chat, and it looks like this could be a whole new area that UML could branch into.

In actual development news, I fully released 2.4.14 today. En route to Oakland, I fixed uml_net so it's smarter about doing proxy arp. It figures out what devices are connected to the local net and only sets proxy arp on those. As a side-effect, if the host is totally isolated, then you don't get scary-looking error messages when it tries to set proxy arp on eth0 and it turns out not to exist.

This happened to me at OLS when I tried to demo it after my talk. I got this nasty message which convinced me that the network all of a sudden didn't work, and I was all apologetic and had no idea what happened. In reality, the network was fine, and I could have demoed it if I had retained a bit more presence of mind.

This isn't in the 2.4.14 release because I'm not happy about the cleanliness of the change. I'll probably clean it up for the next 2.4.14 patch.

On a much-delayed train from San Francisco to Mountain View (a supposedly 1:13 hour trip that in reality required about 1:50 and two trains), I also figured out why you can't talk to eth1 from the host if you configure both an eth0 and eth1. It turned out to be the same bug that other people had noticed causing dropped packets. I was checking errno incorrectly. I had code that did this:

    n = read(...);
    if(errno == EAGAIN) return(0);

forgetting that successful system calls don't necessarily set errno to zero. So, the eth1 read was succeeding, but errno was still EAGAIN from the eth0 read.

In other news, beware of kernels built with gcc 3.0.2. I got a complaint from Jens Axboe today about UML leaving all kinds of not-quite-zombie processes lying around. I looked at it a bit and guessed that the host kernel was messed up somehow. He looked at that, decided I was right, and that the culprit was the latest gcc. The interesting thing was that, until he ran UML on that kernel, it looked just fine to him.

6 Nov 2001
In preparation for fixing the problem of the console driver losing output, I ported the SIGIO handler to use poll instead of select. This was mostly what 2.4.13-4 was. I later discovered a bug in it, which is fixed in -5.
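
The diary does not show the handler itself, so the following is only a sketch of the poll call pattern, assuming a small fixed set of descriptors. The name poll_readable and the MAX_FDS limit are hypothetical.

```c
#include <poll.h>

#define MAX_FDS 16

/*
 * Hypothetical helper: check up to MAX_FDS descriptors for input.
 * ready[i] is set to 1 if fds[i] is readable (or hung up or in error, so
 * that the caller's read will notice), else 0.
 * Returns the number of descriptors flagged, or -1 on error.
 */
int poll_readable(const int *fds, int nfds, int *ready, int timeout_ms)
{
    struct pollfd pfds[MAX_FDS];
    int i, n, count = 0;

    if (nfds < 0 || nfds > MAX_FDS)
        return -1;

    for (i = 0; i < nfds; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }

    n = poll(pfds, (nfds_t)nfds, timeout_ms);
    if (n < 0)
        return -1;

    for (i = 0; i < nfds; i++) {
        ready[i] = (pfds[i].revents & (POLLIN | POLLHUP | POLLERR)) != 0;
        count += ready[i];
    }
    return count;
}
```

Unlike select, poll takes an array of descriptors rather than a bitmask, so it is not limited by FD_SETSIZE and reports the result per descriptor in revents.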

I then decided to fix UML's inability to be interrupted and backgrounded. All UML processes are in the same process group, with all of them stopped except for the one that's actually running. When UML is backgrounded, the shell sends a SIGCONT to the process group, which wakes up every UML process, which is very bad.

I did some failed experiments with setpgrp/setsid and friends, and discovered that a separate process group wouldn't work because then those threads can't write to the terminal because they're in the wrong process group.

So, I decided that out-of-context processes should be asleep rather than stopped. This required redoing the task switching code. They were stopped because the tracing thread intercepted a signal from them when they went out of context and never continued them. Having them sleep would require that the tracing thread stop doing that and that the threads involved in a context switch arrange the transfer themselves.

So, what is now done is that non-running processes are asleep in sigsuspend, and they are woken up by the going-out-of-context process sending a SIGTERM. Races are avoided by having the SIGTERM sent inside a section of code that has blocked SIGTERM. SIGTERM is re-enabled atomically with the sleep with sigsuspend.

So, that plus the poll fix is the contents of -5.

3 Nov 2001
Time to patch-bomb Alan again. I sent in ten patches to get the ac tree current with CVS. Here they are:
2 Nov 2001
That last patch went into -ac6, so the ac UML builds and works again. The next job is to get the ac tree up to date.

I released a new utilities tarball today. uml_net should now do proxy arp correctly. uml_mconsole is now able to take a command on its command line and execute it, rather than being strictly a command line tool.

30 Oct 2001
I decided to make the -ac UML build again, so I made this patch and sent it off to Alan. The rest of the updates will be forthcoming.
29 Oct 2001
Today is 2.4.13-3 day. I decided to remove the code in fix_range which unmaps pages whose ptes say they're not present. That basically caused it to try to uselessly unmap all of its unused address space. So, I did that and it uncovered a bug. It turns out that swapped-out pages weren't marked as needing to be remapped. Everything worked a lot better with that fixed, and context switching should be a bit faster now.
28 Oct 2001
I released 2.4.13-2 today. This contains the fix for the process segfaults and the gdb problems people have been having. It also turns on morlock's context switch optimization which I disabled until I figured out the segfaults.
26 Oct 2001
I finished releasing 2.4.13 today.

After some prodding from Greg Lonnon, and after he did some investigation, I figured out what the problem with gdb inside UML is. The signal handlers don't save their registers in the thread struct. This means that when a SIGTRAP from a breakpoint comes in and it gets forwarded to gdb, when it gets the registers to find out what the ip is, it gets an old, bogus value. So, it doesn't recognize that as a breakpoint and complains about a spurious SIGTRAP instead.

25 Oct 2001
I spent the last few days chasing a process segfault problem. I finally tracked it down today. It turns out that my rewrite of the process signal delivery code was broken in the case of a signal being delivered from an interrupt handler rather than a system call. It grabs the process registers from the thread structure, saves them away on the stack, and then restores them to the process when the handler finishes.

However, interrupts don't save their registers in the thread structure, so those registers represent the last system call, which has already finished. And restoring those causes great confusion in the process.

20 Oct 2001
I released 2.4.12-3um and 2.4.12-4um over the last week. -3 fixed a couple of problems with -2, and -4 adds some miscellaneous fixes to that. The major ones are that physical memory protection is optional (controlled by the 'jail' switch) and that the network driver backends now collect uml_net commands and output and nicely printk them instead of having the output just dumped to the terminal. To support this, uml_net now hangs on to the commands it runs and the output they produce and sends them back to UML. This required that the uml_net interface version be incremented, so it's now at 3. The new drivers require the new uml_net, so if you grab the UML patch, get the latest utilities tarball too.
13 Oct 2001
I released 2.4.12-2um today. It's almost entirely changes sent in by other people, dominated by Adam Heath's cleanups. There were also some ppc fixes from Chris Emerson, and small fixes from other people.

I also released a new utilities tarball. The one change was to uml_net, which does proxy arp in a different, and apparently more robust way than it used to.

11 Oct 2001
Linus released 2.4.11 and 2.4.12 two days apart. I had the 2.4.11 patch uploaded, and had started releasing packages when 2.4.12 came out. So, 2.4.12 is out there and I'm doing the packages again.
8 Oct 2001
I should have mentioned the latest -ac patches already since they've been in Alan's tree for a few days, but I didn't, so here they are.

In other news, with the help of Paul, I tracked down an ancient console driver bug that held on to a struct tty after it had been freed and subsequently caused panics.

I released 2.4.10-7um today with that fix and some other minor changes.

5 Oct 2001
Paul Larson found a test case for the signal problems that was reproducible for me. So, with that in hand, I tracked down the bug, and released 2.4.10-6um.

The bug turned out to be a result of moving where state is saved before a signal is delivered to a process. The process registers and some other things need to be saved on the process stack so they can be restored later. The way it used to work is that

  • handle_signal would figure out what the interrupted system call eventually returns
  • that value is passed up the stack and stored in the process registers stored in its task structure
  • the process would be sent a signal so it starts running on its process stack
  • the UML signal handler copies the register state from the task structure to its own stack
  • it calls the process signal handler
  • and restores the registers back to the task struct
What I implemented did this
  • handle_signal figures out what the interrupted system call eventually returns and constructs the process stack frame, copying the registers from the task struct onto the stack
  • that value is passed up the stack and stored in the process registers stored in its task structure
The bug is that the second step happened too late. The registers saved on the stack hold a bogus return value, and it's that value which the system call eventually returns.
3 Oct 2001
I decided to profile a stretch of UML thrashing. So, I took the 2.4.10-2ac UML (which I updated to the latest stuff, and which I'll be sending to Alan shortly), gave it 128M of memory and 1G of swap, and let a 'make -j' kernel build run for a couple of hours. These are the results. All of the system calls show up as <spontaneous>. Somehow it wasn't linked against the profiling libc. I'll try to figure out why not.

Some highlights:

  • Protecting kernel memory from userspace seems to be expensive - mprotect is the top item on the list.
  • wait4 is number two, which I don't entirely understand. That's the tracing thread. It sleeps in wait, and wakes up when there's something that needs doing, so I don't understand why it shows up, unless it's somehow being charged for all of the context switching that UML causes.
  • Then we have other low-level VM things, fix_range and flush_tlb_kernel_vm. These manually walk address spaces to update them. These two are unnecessarily inefficient and can probably be knocked far down the list pretty easily.
  • Finally, we get into generic kernel things, which show clear signs of heavy swapping - page_launder, swap_out_pmd, do_anonymous_page.
  • The first system call which shows up is sys_brk, way down the list, followed distantly by sys_read and sys_close.
  • There were 312175 system calls total. The most frequently called were sys_brk, sys_read, sys_open, sys_newfstat, and sys_stat64, not unexpected for a kernel build.
  • kmalloc was called most often from load_elf_binary, select_bits_alloc, and load_elf_interp. __get_free_pages was called most often from handle_mm_fault, pipe_poll, and do_fork. It called _alloc_pages, which was called most frequently from read_swap_cache_async, do_anonymous_page, and do_wp_page.
1 Oct 2001
I discovered a new way of breaking UML. A 'make -j' kernel build drives the load above 150, and on 2.4.10 causes essentially a livelock. I eventually regained control by sending SIGILL to all the processes from the host. Plus, I got all kinds of interesting illegal instruction and bus error deaths. These were absent on -ac2, probably because that wasn't a totally up-to-date UML, so it was missing the most recent bugs that I added. I'm going to track those bugs down by updating UML in the -ac tree bit by bit and seeing which bit causes these nasty little problems.

Speaking of -ac2, it needs a little fix to ptrace in order to build.

25 Sep 2001
Today I redid the signal delivery code. Now all the saving and restoring of state happens in kernelspace rather than on the process stack like before. This allows the task structure to be protected from processes. Since that was the only hole in the protection of physical memory, that is now fully protected against being changed from userspace.
24 Sep 2001
I made a .deb and an RPM in preparation for releasing 2.4.10, and Jacques Nilo reported that yesterday's fix wasn't enough. I had forgotten about an instance of MAP_SHARED | MAP_ANONYMOUS. So, I fixed that, and that is 2.4.10-3um. And that is the basis of the official 2.4.10 UML release.
24 Sep 2001
I thought of an easy fix for the stack capturing problem that prevented UML from booting on 2.2 hosts. Basically, a new process is created which stops itself, and when that happens, the parent grabs a copy of the stack and uses it to create a context for future threads to run in. On 2.2, the parent used ptrace to extract the contents of the stack from the child word by word. I looked at that code and decided it would be much easier to map the stack MAP_SHARED so it would be shared between parent and child and the parent could just memcpy it to a safe place rather than ptracing it out.

What I forgot was that, while 2.4 supports MAP_SHARED | MAP_ANONYMOUS, 2.2 doesn't. So, on 2.2 hosts, UML wouldn't even begin to boot.

The easy solution was to go back to MAP_PRIVATE | MAP_ANONYMOUS, but clone the new process with CLONE_VM, making it a thread, which allows the parent to copy the stack directly, since they're both in the same address space.
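The shared-mapping approach can be tried outside UML. The following is a minimal Python sketch, not UML's code: it assumes a host that supports MAP_SHARED | MAP_ANONYMOUS, uses fork() as a stand-in for the real process creation, and the function name is made up for illustration.

```python
import mmap
import os


def capture_child_buffer(payload: bytes) -> bytes:
    """Let a child fill a shared anonymous mapping, then copy it in the parent."""
    if len(payload) > mmap.PAGESIZE:
        raise ValueError("payload must fit in one page")
    # MAP_SHARED | MAP_ANONYMOUS: the pages stay shared across fork().
    region = mmap.mmap(
        -1,
        mmap.PAGESIZE,
        flags=mmap.MAP_SHARED | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    )
    pid = os.fork()
    if pid == 0:
        # Child: stands in for the new process writing to its stack.
        try:
            region[: len(payload)] = payload
        finally:
            os._exit(0)
    # Parent: wait for the child, then copy the bytes out (the memcpy step).
    os.waitpid(pid, 0)
    copied = bytes(region[: len(payload)])
    region.close()
    return copied
```

With a private mapping and a plain fork(), the parent would not see the child's writes, which is why the MAP_PRIVATE version needs the child to share the parent's address space.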

This fix makes 2.4.10 usable, so I've released another patch and updated CVS.

23 Sep 2001
Linus released 2.4.10 today, so I updated UML as well. I decided not to base this on the latest UML patch, since that is not entirely healthy at the moment. My sigaltstack fixes broke UML totally on 2.2 hosts. So, 2.4.10-1um is 2.4.9-8um updated to 2.4.10.

CVS is not updated, but will be once I have the sigaltstack thing fixed and that pool updated to 2.4.10.

22 Sep 2001
The last set of -ac patches went into -ac14. That brings Alan's tree reasonably up-to-date.

I released 2.4.9-8um yesterday and 2.4.9-9um today. -8 was some bug fixes and cleanup. -9 was fixing sigaltstack and doing a lot of cleanup and rearrangement of the signal delivery code. This sets me up to redo the entire signal delivery mechanism so I can finish protecting all of the kernel's physical memory from userspace.

18 Sep 2001
That last batch of patches went into -ac12. So, the next batch is off to him, plus one from Andrea Arcangeli which fixes a declaration which is needed to compile UML successfully.

Once these are in, the -ac tree is almost up to date. It'll be one CVS release behind, which is OK because there are some tweaks I want to make to the address space reorg. So, I'll get that right and send it in rather than sending it in two pieces.

15 Sep 2001
I released 2.4.9-6um last night. It contains the already-mentioned COW header changes. It also occurred to me that I can fix the mlockall bug by sticking UML at the top of the address space where it's supposed to be anyway. So, I went ahead and did that. This allowed me to get rid of the vmas that UML needed to stick in each mm to prevent mmap from reallocating areas of virtual memory that UML is living in. This, plus the fact that these vmas had no ptes, caused mlockall to cause major damage to UML by trying to unmap it. Putting UML above TASK_SIZE causes it to be ignored by mmap, and the problem just disappears. This also let me get rid of the nasty address space reservation code that was needed in order to prevent libc from mapping stuff in where UML wanted to put stuff.

In other news, I'm back in the ac patch business. First is a patch that I've been sitting on all week which defines hz_to_std and allows UML to build again. Then, we have

These are now all off to Alan. I've got some more which will wait till those have gone in, in order to minimize conflicts:
14 Sep 2001
Greg Lonnon and I have been fiddling with the COW file header format. I had already discovered that blindly copying the backing file path provided by the user into the header is a problem when it is a relative path. That COW file won't be usable by a UML run in a different level of the directory hierarchy because, from there, the relative path stored in the header doesn't refer to the backing file. The fix is to write an absolute pathname into the header.

Greg had a couple of other good ideas which we thought should be implemented earlier rather than later:

  • The header should be able to hold a MAXPATHLEN-sized backing file name rather than the current 256 bytes.
  • It should be in network byte order. This will allow COW files to be moved between big-endian and little-endian hosts. Whether the underlying filesystem can be mounted in UML after the move depends on whether the filesystem has its metadata byte-swapped correctly. But, at least the COW header won't prevent it from working.
These two are not backward compatible, so we bumped the COW header version and made these changes in the version 2 header. The driver can read both V1 and V2 headers but it will only write V2 headers.

The absolute pathname change is in 2.4.9-5um since it was small and backward compatible. The other two will be introduced in 2.4.9-6um.
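The three header changes (absolute path, MAXPATHLEN-sized name field, network byte order) can be sketched together. This is a Python illustration only; the field layout, the magic number, and the MAXPATHLEN value of 4096 are assumptions for the example and are not the actual COW header format.

```python
import os
import struct

MAXPATHLEN = 4096           # assumed value; PATH_MAX on typical Linux hosts
EXAMPLE_MAGIC = 0x12345678  # made-up magic number for this sketch
HEADER_VERSION = 2
# "!" selects network (big-endian) byte order with no padding.
HEADER_FORMAT = "!II%ds" % MAXPATHLEN


def pack_cow_header(backing_file: str) -> bytes:
    """Build a header that stores the backing file as an absolute path."""
    path = os.fsencode(os.path.abspath(backing_file))
    if len(path) >= MAXPATHLEN:
        raise ValueError("backing file path too long")
    # The "s" field is padded with NUL bytes up to MAXPATHLEN.
    return struct.pack(HEADER_FORMAT, EXAMPLE_MAGIC, HEADER_VERSION, path)


def unpack_cow_header(header: bytes):
    """Return (magic, version, backing_file) from a packed header."""
    magic, version, raw = struct.unpack(HEADER_FORMAT, header)
    return magic, version, os.fsdecode(raw.rstrip(b"\0"))
```

Because the byte order is fixed by the format string rather than by the host, the same bytes unpack identically on big-endian and little-endian machines.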

The uml-user list had a couple of interesting posts from UML users today:

  • Martin Volf did a Slackware 8.0 installation inside UML and wrote a page describing how he did it.
  • Tim Robinson had some problems with the TUN/TAP transport and posted a nice diagnosis of them.
10 Sep 2001
Been playing with the tools and website lately. I added a bunch of new features to the mconsole client (and promptly had to fix it), and fixed uml_net building on 2.2.

I also restructured the web site build somewhat to make it more manageable.

6 Sep 2001
I tracked down the process segfault problem. It was caused by a newly forked child inheriting some pages that were swapped out, but hadn't been unmapped. The code that it ran on its first quantum didn't update its address space correctly, so those pages remained mapped.

Having chased that problem down, I'm releasing 2.4.9-4um with that fix plus Chris Emerson's latest ppc changes.

1 Sep 2001
After much ado, I revamped the UML download page. It essentially replaces the Sourceforge project download page. I did this in order to be able to let people select the mirror they want to download from and to be able to put explanatory information on the same page as the download link. If it is missing stuff that you'd like to see, regardless of whether it's on the SF download page, I'd like to know about it.

There are a couple things that aren't working right now - the 'Changelog's don't link to anything, and most of the SourceForge root filesystem links don't work. I'm in the process of copying the filesystems over to SF to fix this.

It's now pretty trivial for me to add mirrors, so if you have a box available (particularly if it's in a part of the world not well-covered by the UML global mirror system), let me know.

30 Aug 2001
Thanks to what looks like an all-night debugging session on the part of Yon Uriarte, the TUN/TAP backend now works. You'll need the latest uml_net for this. It wasn't setting IFF_NO_PI, which was causing extra cruft to be stuck on the front of the packet, which probably required the broken nastiness I had to add to the driver. Adding that and backing out all the skbuff fiddling made everything work a lot better.
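For reference, the "extra cruft" is the 4-byte packet information header (struct tun_pi: 16-bit flags, then a 16-bit protocol in network byte order) that the kernel prepends unless IFF_NO_PI is set. A small Python sketch of the flag word and of what a reader has to strip when the flag is missing; the function names are made up for illustration and no device is actually opened here.

```python
import struct

# Flag values from <linux/if_tun.h>.
IFF_TAP = 0x0002
IFF_NO_PI = 0x1000

PACKET_INFO_LEN = 4  # struct tun_pi: 16-bit flags + 16-bit protocol


def tap_flags(no_packet_info: bool) -> int:
    """Flags to request for a TAP device, with or without IFF_NO_PI."""
    flags = IFF_TAP
    if no_packet_info:
        flags |= IFF_NO_PI
    return flags


def split_packet_info(data: bytes):
    """Split what read() returns from a device opened WITHOUT IFF_NO_PI."""
    if len(data) < PACKET_INFO_LEN:
        raise ValueError("short read")
    (pi_flags,) = struct.unpack("=H", data[:2])   # host byte order
    (proto,) = struct.unpack("!H", data[2:4])     # network byte order
    return pi_flags, proto, data[PACKET_INFO_LEN:]
```

With IFF_NO_PI set, read() returns the bare frame and no splitting is needed.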

So, I released 2.4.9-3um with the fixed driver in it, plus new entries in config.release, defconfig, and Configure.help.

28 Aug 2001
I implemented a TUN/TAP backend for the network driver. It involved more work than I expected. A lot of it was due to restructuring other code in order to keep the code relatively clean.

I haven't done any stressing or timing of it, but I did happen to notice that pings over TUN/TAP are about 10x faster than pings over ethertap. The absence of the helper handling each packet on the way to the kernel is no doubt a big piece of that. At some point, I'll do some bandwidth measurements against ethertap to see how much better it is. Hopefully a lot.

26 Aug 2001
UML development took a bit of a break while I got busy with other stuff.

In UML news, I started work on my ALS paper, got a first draft ready, and sent it off for review. I also did a bunch of web site work. I've been letting things fall behind for lack of time to deal with them, so I decided to swallow my pride and start asking for help. This necessarily involves describing what needs doing, so I wrote most of it up, and the results are here, here, here, here, and here.

I also made a pass over the site, fixing a bunch of hopelessly outdated and wrong things, and probably leaving some things which are only moderately outdated and wrong.

16 Aug 2001
I've been having fun playing with crashme. It's a great little tool. It generates buffers full of random data and then executes them. It runs differently on UML than on the host, which it shouldn't. The problems I've tracked down so far are signal handling bugs. UML wasn't handling write faults correctly when the accessed memory was readonly, and it wasn't properly segfaulting processes to which signals couldn't be delivered (because their stack pointers were garbage). This last was the bug I was chasing a couple days ago. There are still problems. The first process (crashme +2000 666 100) runs just as it does on the host, but the next one (crashme +2000 667 100) doesn't. On the host, the segfault handler somehow gets bus errors in libc, which I don't understand, and that doesn't happen under UML.

On IRC yesterday, Lennert Buytenhek clued me in on how to reliably segfault processes and crash UML. He was running 8 "du /". That didn't work for me, but 16 of them does. The segfaults are on pages that are mapped in but shouldn't be (their ptes say that they should be mapped out, and somehow that didn't happen). So, those pages were presumably allocated for something else, and contain garbage from the perspective of the process that should have unmapped them, and so it segfaults.

The panic looks like memory corruption. I turned on slab debugging, and it looks like that makes the panic go away.

Well, Linus released 2.4.9 today, so it's time for me to go into my UML release routine. When I did the obligatory kernel build on 2.4.9, one of the crashme fixes turned out to be bogus. It was doing the segfault-during-signal-delivery check too early, so it caught fixable segfaults that happened because the stack needed extending or was readonly.

14 Aug 2001
2.4.8-2um is out as of yesterday. I made the freshmeat announcement of 2.4.8 this morning.

I chased the crashme bug a little. Somehow, a signal is marked as being pending, but it's never actually delivered and reset. So, no further signals can be delivered to that process from then on. This makes it unkillable and unstoppable.

I also took the first step towards making UML secure against nasty users. UML physical memory, except for the task structure and kernel stack, is protected from userspace access. I still need to protect the task structure and kernel virtual memory. The task structure is a bit tricky because of the signal delivery code. It runs on the process stack and is considered to be userspace code. However, it needs to be able to modify the task structure to restore state that it saved before the signal delivery. So, if the task structure isn't writable, this isn't possible. Further thought on the subject is necessary.

13 Aug 2001
Well, Linus released 2.4.8 just as I was heading up north for a weekend of camping and climbing mountains. He does this on purpose. He released 2.4.3 when I had just arrived in San Jose for the Kernel Summit.

Anyway, this was a relatively simple patch. It just dropped in and worked, except that hostfs was already broken. My calculation of the stat64 inode field was wrong. It looked at the kernel version to decide what was in the userspace headers. I discovered the error of my ways when I booted up a Debian UML to produce the 2.4.8 .deb. This is a 2.2 filesystem (with .st_ino in stat64) with a 2.4 kernel (which implied .__st_ino in stat64). hostfs did not build. I changed the Makefile to just grep the appropriate header instead.

So, this fix will be the substance of 2.4.8-2um.

9 Aug 2001
The remaining differences between my pool and the ac tree are a couple of patches that didn't go in for some reason, cleanups of printks and some includes.

Daemonizing UML does work. I just checked it, and the only case where it does something strange is if you background it without nohupping it and log out. The tracing thread dies from the SIGHUP, but all the other threads survive.

I released 2.4.7-5um today. It contains a few recent patches from other people. I figured out how to turn -fno-common back on. I tried all kinds of linker tricks to throw errno.o out of the binary. Then I discovered that the linking that had already taken place had destroyed any notion of what objects anything originally came from. So, instead, I added -Derrno=kernel_errno to all the kernelspace gcc lines, which translates all the kernel uses of errno to kernel_errno, and leaves the libc errno alone. That's actually a better solution than throwing out one of the errnos because that would leave open the possibility that the kernel and userspace uses of libc could step on each other. Now that they're using different symbols, that's not a problem.

7 Aug 2001
Yesterday's patches are off to Alan.

In other news, daemonizing UML seems to be broken again. Grrr. That seems to break now and then for no apparent reason.

ac9 is out with my patches in it. So, time to make the final diff between Alan's stuff and mine to get him totally caught up with me.

6 Aug 2001
Yesterday's patches are in ac8. So, two more patches will bring the ac tree completely up to date:
  • A network driver update which adds the ability for the drivers to tell the helper about any IP address changes. This allows the host configuration (routing and proxy arp) to stay in sync with the interface address changing inside UML. If you're in the habit of getting UML from the ac tree, you'll need the latest uml_net in order to use the network when this patch goes in because it makes an incompatible change in the helper interface.
  • Another batch of (surprise!) miscellaneous fixes, including some cleanup of stack permission setting, the apparently gratuitous locals that are needed to persuade -pg to work properly, a couple of symbol exports for GFS, a fix that ensures that the pid file contains the correct pid, and yet another squashed warning.
With these in, I'll be able to diff the ac tree against mine to see what divergences there are. I know there are some, because I occasionally see patches fail to apply because of context conflicts which shouldn't be there. So, there will be one more patch to clean those up, and Alan will be completely in sync with me.
5 Aug 2001
Patch time again. This time I'm making them up ahead of ac7 coming out. So, we have
  • A hostfs update, which brings the ac tree completely up-to-date. Normally, I bundle a couple of cvs updates into a small number of patches and send them off to Alan. With hostfs, I decided to give him the latest stuff, since there have been a bunch of changes spread over a number of cvs updates. This is fairly easy since hostfs is a completely self-contained piece of code.
  • A network driver update, which fixes a crash and makes net devices pluggable via the mconsole. There's some restructuring and cleanup in this patch. Also, mconsole actions move into keventd context from softirq context. This is because alloc_netdevice does a GFP_KERNEL kmalloc, which has to be done in process context.
  • Yet another batch of miscellaneous fixes, including renaming CONFIG_IOMEM to CONFIG_MMAPPER, some cleanup in the ubd driver, and removal of a number of warnings.
  • The complete merge of the ppc port, which reorganizes the headers somewhat. For some headers, there is now a header.h, which is a symlink to header-$(SUBARCH).h, which includes header-generic.h and is allowed to do whatever it wants before and after. This provides the flexibility needed to do things like undef stuff after the include and rename things beforehand.
These all will bring Alan up to my 2.4.7 release, except for hostfs, which will be completely up to date. Since I'm up to 2.4.7-4um, and 2.4.7-2um was just a hostfs fix, I might be able to bring the ac tree up to date with one more set of patches.

Alan released ac7 this afternoon, as I prophesied, so those patches are off to him. I'll be looking for them in ac8.

I failed to resist temptation. I looked at the diffs between the ac tree once those patches are in and my current stuff and I noticed a big wad of documentation. So, I rolled that up and sent it to Alan.

4 Aug 2001
OK, I'm back in the business of sending Alan patches. I sent in a small patch which fixes the things that broke when 2.4.7 came out. So, UML now builds and works in the -ac tree again. It made 2.4.7-ac6 an hour or so after I sent it over.

Also, in the interest of getting the ac tree more caught up with my CVS, I sent Alan a batch of fixes which bring him up to 2.4.6-4um:

  • umid fixes from Henrik Nordstrom which create a directory based on the umid rather than having that be the pid file. The pid file and the mconsole socket are now in that directory.
  • Another batch of small fixes - a Makefile fix, mconsole cleanups and an update to create the socket in the umid directory.
  • Some config changes, also from Henrik Nordstrom. These change the network config names to be more explicitly UML-specific. The config.in is also cleaned up so that it resembles the i386 config more closely.
  • Greg Lonnon's example iomem driver, plus a couple of generic UML fixes that were needed in order to make it work.
  • A uaccess fix which required a surprising amount of surgery. The copy_{to,from}_user macros previously regarded a fault location of 0 as meaning that the copy had succeeded without faulting. When the address passed into the kernel was NULL, this of course broke badly. It had a very interesting side-effect in the case I saw. After running the command that exercised the bug, every command on the system started failing to start because libc was corrupted. This was something of a head-scratcher. I eventually figured out that I was causing the command to open NULL, the fault went undetected, and the buffer that was supposed to have had the filename copied into it had the filename of libc in it from a previous use. So, libc was opened for writing with fairly severe results.
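The uaccess bug in the last item is a classic in-band sentinel problem, and a toy model makes it easy to see. The Python sketch below is purely illustrative: user memory is modelled as a dict, and all names are hypothetical rather than taken from the UML source.

```python
EFAULT = 14  # errno value on Linux


class Fault(Exception):
    def __init__(self, address):
        super().__init__(hex(address))
        self.address = address


def read_user(memory: dict, address: int) -> bytes:
    """Toy user memory: a dict of address -> bytes; anything else faults."""
    if address not in memory:
        raise Fault(address)
    return memory[address]


def copy_sentinel(memory, buf: bytearray, address: int) -> int:
    """Old scheme: a recorded fault address of 0 means 'no fault'."""
    fault_address = 0
    try:
        data = read_user(memory, address)
        buf[: len(data)] = data
    except Fault as fault:
        fault_address = fault.address
    return -EFAULT if fault_address != 0 else 0


def copy_flagged(memory, buf: bytearray, address: int) -> int:
    """Fixed scheme: record whether a fault happened separately from where."""
    faulted = False
    try:
        data = read_user(memory, address)
        buf[: len(data)] = data
    except Fault:
        faulted = True
    return -EFAULT if faulted else 0
```

Passing address 0 to copy_sentinel reports success while leaving whatever was already in buf untouched, which mirrors the stale libc filename described above; copy_flagged returns -EFAULT for the same call.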
3 Aug 2001
The deb build problem turned out to be me accidentally redefining VERSION in the upper layers of the build process. That value overrode a VERSION in the kernel build, which resulted in a totally bogus KERNELRELEASE, which confused a macro which tested it badly enough that it broke the build. Simple to fix once I figured it out.

I discovered another hostfs bug on my way back from OLS. ls didn't work and I found two bugs as a result. The easy one was that hostfs_readdir was filling in the directory inode rather than the file inode for every directory entry it passed back to vfs. This was fixed by having read_dir pass the inode back up so it could be used to fill in the entry properly.

The more interesting one is that there was a source-incompatible change made in the stat64 struct between 2.2 and 2.4. The st_ino field changed its name to __st_ino and a new st_ino field was added at the end. The inode appears in the same place (the st_ino/__st_ino field) making it binary compatible. So, after changing to use the 2.4 field name (and breaking hostfs on 2.2), I changed the hostfs build to figure out what name to use and pass that in on the compile line to hostfs_user.c.
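The build-time detection amounts to looking at the installed header and choosing a field name. A rough Python sketch of that decision follows; the real build does this in the Makefile, the macro name STAT64_INO_FIELD is hypothetical, and scanning the whole header text instead of only the stat64 definition is a simplification.

```python
import re


def stat64_inode_field(header_text: str) -> str:
    """Return the stat64 member name to use for the inode number."""
    # \b stops "st_ino" from matching inside "__st_ino", so a header that
    # has only the 2.2-style name never reports the 2.4-style one.
    if re.search(r"\b__st_ino\b", header_text):
        return "__st_ino"
    if re.search(r"\bst_ino\b", header_text):
        return "st_ino"
    raise ValueError("no inode field found")


def compile_flag(header_text: str) -> str:
    """Build the -D option handed to the compiler."""
    # STAT64_INO_FIELD is a hypothetical macro name for this sketch.
    return "-DSTAT64_INO_FIELD=" + stat64_inode_field(header_text)
```

Keying the choice off the userspace header rather than the kernel version is what keeps a 2.2 filesystem with a 2.4 kernel building correctly.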

In other news, we (me, Rodrigo de Castro, and Livio Baldini Soares) have decided that -pg support in gcc is broken in multiple ways. rcastro and livio complained a couple weeks ago about UML's gprof support not working. I finally had a look at it, and found that it was broken, but not in the way they described.

UML crashed in a very inconvenient place, and when I finally got in there enough to figure out what was happening, it turned out that mcount was segfaulting when it dereferenced ebp because ebp was NULL. The reason for that turned out to be that in some procedures, mcount is absolutely the first thing they do. Everything else calls mcount after the new stack frame has been set up and ebp has a valid value in it (the old esp). When the procedure is the main procedure for a thread, then ebp turns out to be NULL.

The difference between the two sets of procedures seems to be that the good ones have local variables and the bad ones don't. So, to work around this bug, I added a useless, but non-optimizable, local to the affected trampolines.

Having done that, rcastro and livio were still complaining about UML crashing. So, I looked at it with rcastro using gdbbot (and livio did so later and discovered the same thing). -pg was trashing edx for some reason. A constant (which varies from procedure to procedure) is dumped into it. This suggests that it's used for the profiling bookkeeping somehow, but looking at the assembly, we don't see how. mcount carefully pushes it and restores it, which is not typical of something that is going to be used for something. The problem is that FASTCALL procedures (which are regparm(3)) pass arguments in eax, edx, and ecx. So, dumping this constant into edx trashes the second argument to the procedure. A workaround for this bug would seem to be to disable FASTCALL (and I guess that gprof support stopped working when I enabled FASTCALL to fix a different bug).

I released 2.4.7-4um today. The main new thing is that you can change the IP address of an ethertap eth0 device and the host configuration will change to match. This required a bit of infrastructure which I wanted for other reasons. The uml_net interface is now versioned, which I've been meaning to do for a while. uml_net now goes away cleanly when UML is killed messily. Before, it would hang around, occupying the tap device, and when UML was rerun, the new uml_net would emit non-intuitive error messages.

I also made hostfs build and run again on 2.2 with a bit of Makefile hackery.

28 Jul 2001
That hostfs problem turned out to be different than I thought. Livio Soares started chasing the problem and found that the hostfs_user close_file didn't actually close anything. It took a pointer to a file descriptor and closed the pointer (or at least tried to) rather than the descriptor that it pointed to. Fixing that made hostfs behave a lot better.

Having fixed that, I finished the page cache work for UML and it can now successfully do the deb build through hostfs without getting the md5sum mismatches it was getting before. Having said that, I've started seeing a compilation problem when building UML through hostfs that I don't get on the host.

On to OLS. My talk was in the second slot of the first day, which was nice. It's good to get your talk over early so you can do the rest of the conference without worrying about it. It went pretty well. I had hoped to fit a demo in at the end, but the talk basically went the full 90 minutes. So I did a real short watch-it-boot-up demo afterwards while most of the crowd was filing out of the room.

There was a talk on porting Linux to the i-series IBM boxes (aka AS-400) which was fairly interesting. They ported Linux/ppc to a hypervisor running on OS-400, making it fairly similar to the UML port, being a port to an OS rather than to bare hardware. Dave Boucher, who gave the talk, made a number of comments comparing it to UML, which was nice. He also grabbed me during lunch today to quiz me about the COW ubd driver. It turns out that he can't do that so easily because OS-400 doesn't have sparse files, so he can't drop blocks down in the same location in the COW file as in the backing file because that would allocate space. I suggested a block directory instead of a bitmap at the beginning of the COW file and dropping changed blocks down sequentially, but he seemed unconvinced for some reason.
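The block-directory idea can be shown with a small in-memory model. This Python sketch is an illustration of the suggestion only, not of any existing driver; the class name and the 512-byte block size are assumptions.

```python
BLOCK_SIZE = 512  # assumed block size for the sketch


class DirectoryCow:
    """COW store that appends changed blocks and indexes them in a directory."""

    def __init__(self, backing: bytes):
        self.backing = backing
        self.directory = {}      # backing block number -> slot in self.data
        self.data = bytearray()  # changed blocks, packed in write order

    def write_block(self, block: int, payload: bytes) -> None:
        if len(payload) != BLOCK_SIZE:
            raise ValueError("payload must be exactly one block")
        slot = self.directory.get(block)
        if slot is None:
            # First write to this block: append it instead of seeking to
            # block * BLOCK_SIZE, so no hole (and no sparse file) is needed.
            self.directory[block] = len(self.data) // BLOCK_SIZE
            self.data.extend(payload)
        else:
            start = slot * BLOCK_SIZE
            self.data[start:start + BLOCK_SIZE] = payload

    def read_block(self, block: int) -> bytes:
        slot = self.directory.get(block)
        if slot is None:
            start = block * BLOCK_SIZE
            return self.backing[start:start + BLOCK_SIZE]
        start = slot * BLOCK_SIZE
        return bytes(self.data[start:start + BLOCK_SIZE])
```

The data area grows by one block per newly changed block, wherever that block sits in the backing file, at the cost of a directory lookup on every read.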

A number of people told me either they or people they knew were using UML for various things. The FreeS/WAN project as a whole seems extremely interested in UML for running tests on their stuff over a virtual network. A PPPoE maintainer complained about the ethertap transport not being intuitively obvious on 2.4. And there were a bunch of other people who were less specific about what their interest in UML was who were either using it or were intending to.

In other news, I discovered that the mcast network transport didn't work when the box had no ethernet card in it. Being at OLS, I showed this to Harald Welte and we stared at the code a bit, then asked Andi Kleen about it. The underlying problem turned out to be that there was no route to any multicast address because there was no interface on the system that supported multicast. The fix seems to be to add multicast support to the loopback device, preferably, and if that's not possible for some reason, to the dummy device.

22 Jul 2001
2.4.7-1um is released. A change which made kernel threads synchronize with the parent at startup caused a hang at boot. The cause was a long-standing bug which caused initdata not to be shared between processes. Andrea noticed the problem as well, and found the fix.

That bug was fixed and I released everything. I released it with a fairly big hostfs problem that I didn't notice until the middle of the release process. I changed how it opens and closes files, with the result that it closes them later than it used to. So, it isn't too hard to get hostfs very confused by running UML out of file descriptors.

21 Jul 2001
2.4.7 appeared yesterday. I'm looking it over to see what's new. One interesting thing is that Alan is sending over some bits of UML which change the generic kernel. These don't affect anything besides UML, so they're harmless. On the other hand, they eliminate some generic files from my patch, which is nice. It makes the UML patch appear purer.
16 Jul 2001
Yesterday's patches are in 2.4.6-ac5. So, time to send in another batch. This will get him up to my 2.4.6-2um. Today's batch contains another batch of random fixes and Greg Lonnon's ubd COW patch. See this page for more information on the ubd COW driver.

I also checked in all the userspace stuff, including the deb builder, recent changes to the tools, and the website, which I hadn't checked in for quite a while.

15 Jul 2001
Those two pesky patches finally made it to Alan OK and were included in 2.4.6-ac4. This gets the ac tree up to 2.4.5-8um. The next batch will bring him up to 2.4.5-10um. It includes a bunch of miscellaneous fixes, the first merge of the iomem patch, and an mconsole update which makes gdb and the ubd driver hot-pluggable and runs mconsole stuff from a tasklet rather than inside the interrupt.

With some more symlink abuse, I merged the last of Chris Emerson's ppc port patch.

14 Jul 2001
Two of the three patches I sent to Alan were broken again. However, I figured out why. My devious little mail reader was breaking lines when it sent out the mail, which was way too late for me to eyeball it to make sure it wasn't messing up. Turning off this behavior results in much better patches at the other end of the line.

I played with the UML .deb builder and got a workable .deb out of it. I think I figured out why the process gets a checksum error at the very end - it's on hostfs, and hostfs reads through the page cache, but doesn't write through it. I'll have to check with a filesystem guru on this, but it sounds right to me. Putting the process on a normal block device results in good checksums.

13 Jul 2001
I put out a couple more patches. Highlights include
9 Jul 2001
Thanks to Simon Blake, I tracked down an interesting bug last night. The UML build turns off __i386__ in order to throw out some very hardware-specific code that UML definitely doesn't want. This also turns off the i386 definition of FASTCALL, which invokes an in-register parameter passing convention that gcc supports. This wouldn't be a problem, except that UML borrows code from the i386 port which assumes that this convention is being used.

In the case that I was looking at, rw_down_write_failed() was getting its semaphore address from the wrong place and using a random userspace address as its semaphore. This could cause all kinds of interesting side-effects, like kernel corruption from two threads using two different random addresses as the same semaphore or process memory corruption from the kernel writing semaphore stuff into its memory. Hopefully, this fix will eliminate some of the strange crashes that people occasionally see with UML.

Here is a more detailed description of the bug and its side-effects.

I sent the two broken patches (the mconsole and 64-bit patches, see the 30 Jun 2001 entry for descriptions) to Alan again. Hopefully they aren't broken this time. Also, I fixed a few build problems that turned up lately.

7 Jul 2001
I integrated Greg Lonnon's ubd COW patch today. It allows multiple UMLs to share a filesystem read-write by storing the changes in a private file. This private file can be considered to overlay the read-only shared file. All writes go into the private file, and reads come from the private file if it has a valid block and from the shared file if not.
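The read and write rules above fit in a few lines. What follows is a minimal in-memory Python model of the overlay, assuming a 512-byte block size and a per-block validity bitmap; it is not the ubd driver's code, and the class name is made up.

```python
BLOCK_SIZE = 512  # assumed block size for the sketch


class BitmapCow:
    """Private overlay on a shared, read-only backing image."""

    def __init__(self, backing: bytes):
        self.backing = backing  # shared, never modified
        nblocks = len(backing) // BLOCK_SIZE
        self.valid = [False] * nblocks          # one bit per block
        self.private = bytearray(len(backing))  # same layout as backing

    def write_block(self, block: int, payload: bytes) -> None:
        if len(payload) != BLOCK_SIZE:
            raise ValueError("payload must be exactly one block")
        start = block * BLOCK_SIZE
        # All writes go to the private file and mark the block valid.
        self.private[start:start + BLOCK_SIZE] = payload
        self.valid[block] = True

    def read_block(self, block: int) -> bytes:
        start = block * BLOCK_SIZE
        # Read the private copy if it exists, else fall through to backing.
        source = self.private if self.valid[block] else self.backing
        return bytes(source[start:start + BLOCK_SIZE])
```

Two overlays built on the same backing image see each other's blocks only through the unmodified backing data, which is what lets several UMLs share one filesystem image.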

This allows a huge savings in disk space for people running many UMLs with large filesystems. It probably will help performance, since the caching requirements on the host are similarly reduced.

4 Jul 2001
Two of the last three patches I sent Alan somehow got corrupted. I suspect that what happened was I added spaces accidentally while reading the patch in my mail composition window by trying to page it with the space bar, then messed up the patch when deleting the spaces. So, I'll send them in again.

Rik van Riel has been visiting for the last couple of days. He was in Boston for Usenix, and was visiting EMC and MCLX (where a number of my former coworkers from DEC now work) after the show. Since I live a couple of hours north of Boston, I invited him up. In doing so, I acquired the responsibility of getting him to Logan airport at the same time that 2M people were going into Boston to see the 4th of July fireworks and concert. I ended up putting him on a bus that ran from outside the city straight to the airport. I haven't heard anything from him since, so I suppose that's good news.

2.4.6 was released last night. It turns out to be a piece of cake. The well-known softirq fix is the only thing that needed changing. I stuck that in, and it built and ran through my tests without a problem. The patch is released, and I'll probably finish the rest tomorrow.

30 Jun 2001
I sent Alan patches which will bring him up to 2.4.5-8um, which include:
  • a collection of small fixes, including ^S/^Q support for the console, some ubd driver cleanups, and the TASK_UNINTERRUPTIBLE fix
  • Lennert's reimplementation of the 64-bit file support - the first try used libc's magic support for popping the 64-bit interfaces under the 32-bit names. That broke UML modules badly. This version explicitly uses the 64-bit interfaces and seems a lot healthier.
  • Lennert's management console patch. This version has support for getting the kernel version, halting and rebooting the system, and turning the debugger on and off.

Last night, the f00f bug was bugging me, so I fixed it. It turned out that the tracing thread was routing SIGILL and SIGBUS incorrectly. Fixing that causes f00f to SIGILL properly.

29 Jun 2001
I found and fixed the TASK_UNINTERRUPTIBLE hang last night. It turned out to be caused by an interrupted write in the block driver. The driver didn't check the return value, so it didn't notice that an IO request it sent to the IO thread didn't go anywhere. That shut down the disk IO system, which ultimately resulted in the whole system being deadlocked waiting for IO that was never going to happen.
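
For readers who haven't been bitten by this class of bug, here is a minimal sketch in C of a write that can't silently lose a request. It is not the ubd driver's code; the name write_fully is made up for the example, and it only assumes the standard POSIX behaviour that write() can fail with EINTR or write fewer bytes than asked.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Write all len bytes of buf to fd, retrying after signals and short writes.
 * Returns 0 on success, -1 on a real error (errno is left set by write()).
 * write_fully is a hypothetical helper, not a UML function. */
int write_fully(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before anything was written: retry */
            return -1;      /* genuine failure: the caller must see this */
        }
        p += n;             /* short write: advance and send the rest */
        len -= (size_t) n;
    }
    return 0;
}
```

The important part is less the retry loop than the fact that the caller gets a return value it has to look at. A request that never reached the other end is then an error, not a silent hang.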

That, plus a few other things, are checked in as 2.4.5-11um.

In other news, Bill Stearns, who's always looking for more devious things to inflict on UML, happened across the Linux Test Project and decided to run it on UML. UML did pretty well. There were three failures, two of which also fail on the host. The other is the f00f test, which causes UML to hang. I applied the obvious fix of relaying SIGILL from UML to the process. That fixed the hang, but after a long pause, the test's SIGILL handler apparently gets called twice.

26 Jun 2001
Those last two patches made it into ac19. Time to start thinking about bringing the ac tree up to -7um.

I finally got Greg Lonnon's iomem patch into UML. This allows a process outside UML to communicate with one inside (or with a UML driver) through a mmapped file.
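
The host-side half of that idea can be sketched with plain POSIX calls. This is only an illustration of sharing memory through a mapped file, not the iomem patch itself; map_shared_file is a made-up name, and how UML attaches the file on its side is not shown.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map len bytes of the file at path as shared memory, creating and sizing
 * the file if necessary. Returns NULL on failure.
 * map_shared_file is a hypothetical helper for this example. */
void *map_shared_file(const char *path, size_t len)
{
    void *mem;
    int fd;

    assert(path != NULL && len > 0);
    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return NULL;
    if (ftruncate(fd, (off_t) len) < 0) {
        close(fd);
        return NULL;
    }
    /* MAP_SHARED is what makes stores visible to every other mapper. */
    mem = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);              /* the mapping outlives the descriptor */
    return mem == MAP_FAILED ? NULL : mem;
}
```

Two processes (or, as in the test for this sketch, two mappings in one process) that map the same file this way see each other's stores, which is all a simple emulated device needs.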

I've also been chasing the TASK_UNINTERRUPTIBLE hang that a few people have been seeing. It happens most easily under UML, apparently. I'm using a recipe discovered by mistral to reproduce it (two infinite loops each diffing two kernel pools). The longest it's taken to reproduce is about 30 minutes. It hung on boot once. The others have been in the 5-10 minute range.

I had a long chat with Al Viro last night with him telling me what he wanted to see from gdb and me providing it. He ended up being puzzled about what was happening. Following a suggestion from Daniel Phillips, I've started instrumenting buffer_heads and pages to see what happened to the ones involved in the hang.

25 Jun 2001
It's -ac patch time again. I boiled the -5um to -6um changes down to two patches:
  • a miscellaneous fixes patch which adds some IP address sanity checking to the ethertap backend, fixes a couple of process signal delivery races, cleans up the associated thread data a little, fixes a swap bug (which caused swapped-out pages to never be unmapped from their processes), and gets rid of the last vestige of the mm_changes code.
  • a timer patch which attempts to eliminate missing clock ticks by never disabling the timer and keeps track of ticks which happen when it's not safe to call the timer IRQ. This improves things, but it doesn't eliminate missing ticks under load.
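
The tick bookkeeping in the timer patch can be pictured roughly like this. It is a toy model in C, not the patch: all of the names are invented for the example, and it says nothing about whatever still loses ticks under load.

```c
#include <assert.h>

static int timer_irq_safe = 1;      /* may the timer IRQ be delivered now? */
static unsigned long missed_ticks;  /* ticks seen while it was not safe */
static unsigned long jiffies_seen;  /* ticks actually delivered */

/* Stand-in for the real timer IRQ handler. */
static void deliver_tick(void)
{
    jiffies_seen++;
}

/* Called on every host timer signal; the timer itself is never disabled. */
void timer_signal(void)
{
    if (timer_irq_safe)
        deliver_tick();
    else
        missed_ticks++;             /* remember it instead of losing it */
}

/* Enter a region where the timer IRQ must not be called. */
void timer_block(void)
{
    timer_irq_safe = 0;
}

/* Re-enable delivery and replay every tick that arrived in the meantime. */
void timer_unblock(void)
{
    assert(!timer_irq_safe);
    timer_irq_safe = 1;
    while (missed_ticks > 0) {
        missed_ticks--;
        deliver_tick();
    }
}

unsigned long ticks_delivered(void)
{
    return jiffies_seen;
}
```

A tick that arrives between timer_block() and timer_unblock() is counted and replayed later rather than dropped.
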
22 Jun 2001
Some time around ac16 or ac17, someone added a call to linux_booted_ok() which the ports have to implement. So, I sent the patch to Alan today.

And a short bit later, I got a reply saying not to bother. The linux_booted_ok thing was a temporary test that's going to be removed. So, it won't appear in my pool, but if you absolutely want to run the ac16/ac17 UML, apply that patch.

I spent the better part of the afternoon in IRC trying to figure out the hang that mistral is seeing. No joy, but I did learn more about the problem. I'll attack it again later.

21 Jun 2001
gdbbot got its first test yesterday when I looked at the problem that Chris Emerson is having with UML/ppc hanging during boot. I didn't find the problem, but was able to check that signal delivery (which was what I thought was broken) was working fine. The next step will be to do a post-mortem on the hang.
20 Jun 2001
I wrote an IRC gateway for gdb. This allows a gdb (like the UML kernel debugger) to be controlled from an IRC channel. The intent is that if someone sees a bug that I can't reproduce, but want to look at, that person's UML gdb can be attached to an IRC channel where I can poke around and see what's going on.

I also integrated Lennert's management console patch. This is a very low-level interface to the kernel (like the i386 SysRq interface). The main use for it right now is to hot-plug devices. At this point, only the ubd driver and gdb support this. So, you can add and remove block devices from your UML without having to reboot it. You can switch gdb in and out the same way. I will do the consoles, serial lines, and network interfaces at some point as well.

15 Jun 2001
The two patches I sent to Alan yesterday are in 2.4.5-ac15. Alan horribly mangled Harald Welte's name, unfortunately.

Today was a patch bashing day. I merged in a good number of the patches in my queue.

Today was also the (extended) deadline for abstracts for ALS2001. So, I sent one in. This is the most explicit that I've been so far about my future development plans for UML. So, if you want to see how weird things are going to get, read all about it here.

14 Jun 2001
IBM put out a Linux security whitepaper in which UML gets a pretty lame mention (down towards the bottom, there's some prose which is basically lifted from my site). Thanks to Bill Stearns for spotting it.

I'm finally getting around to sending off the latest stuff to Alan. The ac tree is now two cvs updates behind. The first set will be the -5um update, which is basically

  • the mcast transport plus some other network cleanup
  • some random fixes, including an updated defconfig, making the console xterms go away when the machine shuts down, making a read-only hostfs really read-only, hooking up a couple of new system calls, and allowing UML to boot on hosts with a 2G/2G address space split
12 Jun 2001
Banged out a bunch of bugs. I started booting UML with 24 megs and plenty of swap, and running a whole bunch of stuff on it to overload it and put it heavily into swap. This turned up a couple of signal delivery races and a swapping bug. The signal races would cause various strange behavior. Mostly what I saw was hangs with an infinite sequence of sigreturns. The swap bug caused pages not to be unmapped when they were swapped out. Obviously, this is very bad. With the help of rcastro, I fiddled my page table macros to fix this. I'm still seeing process segfaults. It looks like pages are being swapped out and swapped back in with the wrong data.
8 Jun 2001
I fixed a bunch of buglets, like the console xterms not going away and readonly hostfs not being readonly, merged Harald Welte's mcast network transport, did a few other things, and checked it all into CVS. I also checked in the tools, so everything ought to be up-to-date and consistent at this point.

I also have the .deb build procedure working, I think. The uncertainty is due to the fact that I think there's a hostfs data corruption problem. My development box runs Red Hat, and I couldn't find RPMs for the Debian tools, so I just installed them in my Debian filesystem (apt-get rocks, BTW :-), mounted the source pool inside a Debian UML via hostfs, and ran the Debian build procedure there. The problem is that the gzipped source tarball has its md5sum recorded at the beginning of the build and checked again at the end, and they don't match. I also ran md5sum three times in a row on that file while the builder was running, and got three different answers. So, it looks like I have some debugging to do there.

3 Jun 2001
True to yesterday's promise, I sent Alan three more patches
2 Jun 2001
From the changelog, it looks like yesterday's patches are in ac7. Time to start generating more...
1 Jun 2001
Another day, another set of patches for Alan. Today, the lucky winners are
30 May 2001
Alan put yesterday's patch into ac5, so UML should build and run again. Thanks to Arjan van de Ven for telling me about that.
29 May 2001
It turns out that I messed up the patches somewhat. So, this is the patch for ac4. With it, UML will build and run again.
28 May 2001
Alan apparently put all 10 of yesterday's patches in 2.4.5-ac3 (but seems not to have dignified the added comment with a changelog entry).

I wrote up the new networking. Check it out here.

27 May 2001
I got 2.4.5 merged in, and mostly released. I'll probably finish it and announce it tomorrow.

I also synced up with Alan by sending him a whole pack of patches, to wit:

26 May 2001
I got the networking cleaned up enough that I'm happy for the general public to use it. There are three host transports, ethertap, the routing daemon, and slip. You can have the helper do the host setup for you or not. If you do, then getting the network running is a matter of a command line switch, ifconfiging the device, and setting routes inside UML. This is a huge usability improvement over the previous situation.

This is all checked in, and I'm currently building 2.4.5, which I'll release in the next day or two.

18 May 2001
I fixed the slip interface, cleaned out some unused code which had become a portability problem, and fixed the fix for the crash caused by someone typing at the console too soon. It is all checked in to CVS.
17 May 2001
I grabbed 2.4.4-ac11 to see if Henrik's patch was in there, and it was. So I don't have to worry about it any more. I guess it made ac9, but Henrik didn't get credit for it in the ac changelog.

In other news, the ethertap interface is working reasonably well. It couldn't do HTTP until I figured out that the MTU on the host tap device needed to be 16 bytes less than the UML eth0 MTU. The helper is now more helpful. In order to talk to the rest of the world through it, you basically just have to ifconfig the device inside UML and add a route to the outside world, and you're done. Much better than what we had before.

13 May 2001
Five of yesterday's six patches made it into 2.4.4-ac9. The lonely exception was Henrik's hostfs blocksize fix.
12 May 2001
I decided to clean out my patch backlog a bit. So, I merged and sent to Alan the following patches:
11 May 2001
Chris Emerson got UML/ppc booting to a shell prompt! His uml-devel post is here. This is the first UML port, and it showed me how to make UML portable. There aren't really all that many non-portable things in UML, so a port doesn't take all that much code. Based on his work, I'm going to write up a UML porting guide, which will be found here when it's done. If that link is dead, keep trying until I have something to put there.

In other news, I fiddled the ethertap driver backend so that the read hang has gone. With some help from Bill Stearns, I also figured out how to talk to the rest of my network through the ethertap device.

9 May 2001
I got the ethertap backend to the network driver working today and I submitted it to CVS. I haven't been able to get it to talk to anything but the host over the tap device, but it communicates with the host just fine.
4 May 2001
My 2.4.4 fixes, except for Andrew Morton's exitcall fix, are in 2.4.4-ac3.

I wrote and submitted my OLS paper yesterday, two days late. It's also posted on this site, as TeX and HTML.

On the network driver front, I've got the unified front-end plus the slip back-end working. I've started working on the ethertap back-end. After that will come the socket and TUN/TAP back-ends. This stuff is in CVS, but I haven't updated the patch because the ethernet driver is broken, and I don't want a bunch of complaints from people who grabbed the latest patch without knowing what was in it.

Update: Andrew's patch made it into 2.4.4-ac5. I was beginning to wonder. That cleans out my pending ac patches.

28 Apr 2001
Linus released 2.4.4 yesterday, so I released the 2.4.4 UML today. No major changes - I dropped in the semaphore changes I was keeping in my ac tree, and I added an mm argument to pgd_alloc(). When I built the RPM (which uses a different configuration), I noticed that hostfs didn't compile any more, and UML didn't compile with CONFIG_PT_PROXY turned off. These fixes aren't in CVS or the patch yet.

Those changes, plus Andrew Morton's exitcall fix, are off to Alan.

27 Apr 2001
The last batch of patches I sent to Alan made it into ac14.

I started looking at the two network drivers today. I think it won't be too hard to merge them. They're pretty similar, since they're both derived from the same code base, and the differences seem to be orthogonal. They don't seem to have done the same things in fundamentally different ways. I posted my impressions for the devel list to comment on.

Andrew Morton looked at the shutdown crash that people started seeing lately and figured out that it was caused by /proc being unregistered before some other code, which removes its own proc entries when it is unregistered, had run. He sent in a patch which reversed the __exitcall order, and Henrik Nordstrom reported that it fixed the crash for him.

22 Apr 2001
The fixes I made on Thursday were broken. The initrd fix introduced a name clash with a function in hostfs, and the sleep fix made sleep always hang. I didn't notice because I was fixated on getting UML to boot from an initrd image, and that wasn't obviously showing the problem.

Anyhow, I made the fixes, submitted them to CVS, updated the patch, and sent fixes off to Alan. I hadn't sent in Thursday's changes to Alan, so the patches are the real thing, not just patches to the patches.

The patches sent off to Alan today add initrd support, fix the sleep bug, and make UML build and work with the generic rw semaphores.

18 Apr 2001
The patches I sent in a couple days ago are all in ac10 by the looks of Alan's change log.

I figured out how initrd support is supposed to work, and implemented the necessary stuff in UML. I booted a RH initrd image far enough to convince myself that it works.

I also figured out the sleep hang. It turns out to be a race between the registration of the timer irq and the first time the timer interrupt calls do_IRQ. The timer was enabled before the registration, so if an interrupt happened in that window, do_IRQ would bail out early, leaving the irq permanently marked as in progress and pending. This locked out all future timer interrupts from going through the irq system, so counters would never be decremented, and sleeps would never wake up.

17 Apr 2001
The UML build fixes made ac6. However, the rw semaphores in ac7 broke UML again. I sent Alan the fix for that yesterday, and it made ac9 later in the afternoon. I also sent a few other patches which fix the gcov and gprof support, add support for external debuggers, and clean up the umn driver a bit.
12 Apr 2001
They didn't make -ac5. Oh well.

I had a chat on #kernelnewbies with Rodrigo de Castro, who's using UML for his compressed caching project. He understands swapping better than I, and told me why my new pte bits were breaking it. So, I fixed it, and swapping now seems to work.

11 Apr 2001
Sent Alan the patches necessary for UML to build and run in his tree. I got back a reply which said in its entirety, "ok", which I think is good. Maybe they will make -ac5.
10 Apr 2001
UML is now in 2.4.3-ac4. I was on IRC with Alan and a bunch of other hackers when he merged it. He looked like he was going to start asking a bunch of embarrassing questions about my locking, but he was concerned only about one thing, and that was a special case that didn't need locking.

Too bad it doesn't build. The patch that Alan merged was against the Linus 2.4.3 tree, which differs in a few respects from the current -ac tree.

8 Apr 2001
Released 2.4.3 a week or so late. Blame Linus for releasing it the night before the kernel summit officially started. We were all in San Jose and not able to react.
4 Apr 2001
A couple more summit tidbits that I forgot to mention in my last entry:
  • Willy is thinking about using UML as a testbed for NUMA support. He wants to fire up a number of virtual machines and have them hook themselves together so they can access each other's memory through device files. This would allow people who don't have access to the fancy hardware to develop and debug Linux support for these boxes.
  • UML may appear in the -ac trees at some point. Alan wanted to include it, but I had sounded fairly negative towards that in the past. What I don't want just yet is for UML to hit the Linus tree. Alan said he doesn't send stuff to Linus if the author doesn't want it sent, which is fine by me.
2 Apr 2001
Back from the kernel summit. I wanted to get a feel for whether four things that I wanted from the host kernel were reasonable. I got two OKs and two dings. That's fine, since the OKs were the important ones. Here's the run-down:
  • Userspace manipulation of address spaces: I want to be able to create, populate, release, and switch between mm_structs. This will speed up UML context switches, and greatly clean up that code. I asked Linus, and he said OK to the fairly static things that I want to do. Apparently, there are serious complications when fiddling with the address space of another process, but that's not what I want to do.
  • System call interception via signals: In order to avoid the context switching between threads involved in virtualizing a system call, I want to have a process intercept its own system calls by having the host kernel deliver a signal whenever it makes a system call. The handler would be the current syscall_handler, which would read the arguments from its sigcontext_struct. This would change a system call virtualization from four context switches to a signal delivery and return. I infer Alan's OK on this from my describing it in his presence and him not objecting.
  • Notification when a UML thread sleeps in the kernel due to a page fault: For the sake of cleanliness and completeness, I want to be able to have UML know when a thread is sleeping in the kernel and be able to call schedule when that happens. This would let UML do as much work as possible given its state of memory residence. Alan rejected this on the grounds that UML would be the only sane user of this mechanism.
  • Full kernel preemption: This was implicitly rejected as a UML need by Alan's rejection of the previous item. If UML is to call schedule whenever it sleeps, the whole kernel needs to be preemptible because the swapped-out page might be a kernel page. This doesn't at all mean that preemption isn't going to happen. Rather, it means that UML doesn't have a particular need for it.
Other tidbits:
  • A number of people consider UML a very neat hack, including Ben LaHaise, Andrea Arcangeli, and Eric Raymond.
  • Alan turns out to be a UML user. For the last month or so, he's been booting his kernels as UML kernels before booting them as native kernels. This is in part because recovery from a totally messed up kernel is a lot easier with UML than with a native kernel. UML is also his ptrace test case. It apparently does things with ptrace that nothing else tries.
  • Al Viro is thinking about porting UML to Plan 9. He asked me about what it would take. He had thought through the ptrace requirement, and I told him about the mmap requirement, which is the next hurdle. Plan 9 apparently doesn't have mmap. He's going to think about how to do that.

On the trip over, I did some debugging, and I also threw in some patches. There is now a "umid=<name>" switch for providing a virtual machine with an identifier. This causes a pid file to be created using that name, which is something that makes controlling multiple UMLs through a nice UI a lot easier. This file will be replaced with a socket to the machine console that Lennert is working on.

I also implemented __exitcall, which declares a procedure which is to be called on machine shutdown. This was prompted by the need to remove the pid files when the virtual machine goes away. I also converted other existing cleanups to use this mechanism.
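
As a rough userspace picture of how the pid file and the shutdown hook fit together, here is a sketch in C. It is an analogy, not UML's implementation: the registration table, the function names, and the "<dir>/<umid>.pid" file layout are all assumptions made for the example, and the calls here simply run in registration order.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_EXITCALLS 16

typedef void (*exitcall_t)(void);

static exitcall_t exitcalls[MAX_EXITCALLS];
static int n_exitcalls;
static char pid_file[256];

/* Register fn to be run at shutdown. Returns 0, or -1 if the table is full.
 * (Hypothetical stand-in for declaring a procedure with __exitcall.) */
int register_exitcall(exitcall_t fn)
{
    assert(fn != NULL);
    if (n_exitcalls == MAX_EXITCALLS)
        return -1;
    exitcalls[n_exitcalls++] = fn;
    return 0;
}

/* Run every registered cleanup once, in registration order. */
void run_exitcalls(void)
{
    int i;

    for (i = 0; i < n_exitcalls; i++)
        exitcalls[i]();
    n_exitcalls = 0;
}

static void remove_pid_file(void)
{
    unlink(pid_file);
}

/* Write our pid to <dir>/<umid>.pid (an assumed layout) and arrange for the
 * file to be removed at shutdown. Returns 0 on success, -1 on failure. */
int create_pid_file(const char *dir, const char *umid)
{
    FILE *fp;
    int n;

    assert(dir != NULL && umid != NULL);
    n = snprintf(pid_file, sizeof(pid_file), "%s/%s.pid", dir, umid);
    if (n < 0 || (size_t) n >= sizeof(pid_file))
        return -1;
    fp = fopen(pid_file, "w");
    if (fp == NULL)
        return -1;
    fprintf(fp, "%ld\n", (long) getpid());
    fclose(fp);
    return register_exitcall(remove_pid_file);
}
```

The point of routing the cleanup through one mechanism is that the shutdown path only has to call run_exitcalls() and doesn't need to know what was set up earlier.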

25 Mar 2001
I checked in a bunch of changes again. Henrik Nordstrom provoked me into making it possible to use hostfs as a root directory by sending me a patch that did it, but which was wrong (IMHO). He did it in the same way that nfsroot and initrd support are done, which is by adding a special block of code to fs/super.c inside CONFIG_HOSTFS_ROOT. That works fine, but I didn't want to annoy Al Viro. What I did instead was to add a second registration of hostfs as a device (not a virtual) filesystem and change the ubd driver to support being given a directory rather than a file or block device. What happens is that when a read request comes in to the ubd driver, it is guaranteed to be a request for the superblock. The driver constructs a fake superblock with the directory name in it. hostfs recognizes that and claims the mount as its own. After that, it goes back to being a normal virtual filesystem and doesn't bother the block driver again. The involvement of the ubd driver is a bit of a kludge, but it works well on the command line, and I can't think of anything better besides some kind of general support for virtual root filesystems that covers nfs and initrd as well as hostfs.
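
The fake superblock handshake can be sketched in a few lines of C. This shows the shape of the trick only: the "HOSTFS:" marker, the block size, and both function names are invented for the example and are not what UML actually puts in the block.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FAKE_SB_MAGIC "HOSTFS:"     /* hypothetical marker, not UML's */
#define FAKE_SB_SIZE 1024           /* assumed block size for the sketch */

/* Block driver side: fill the "superblock" with the marker and the host
 * directory. Returns 0, or -1 if the path does not fit. */
int make_fake_superblock(char sb[FAKE_SB_SIZE], const char *dir)
{
    size_t magic_len = strlen(FAKE_SB_MAGIC);

    assert(sb != NULL && dir != NULL);
    if (magic_len + strlen(dir) + 1 > FAKE_SB_SIZE)
        return -1;
    memset(sb, 0, FAKE_SB_SIZE);
    memcpy(sb, FAKE_SB_MAGIC, magic_len);
    strcpy(sb + magic_len, dir);
    return 0;
}

/* Filesystem side: return the directory if this block is ours, else NULL so
 * that the next filesystem type can try the mount. */
const char *hostfs_probe(const char sb[FAKE_SB_SIZE])
{
    size_t magic_len = strlen(FAKE_SB_MAGIC);

    assert(sb != NULL);
    if (memcmp(sb, FAKE_SB_MAGIC, magic_len) != 0)
        return NULL;
    return sb + magic_len;
}
```

Once the probe succeeds, the filesystem has the directory name it needs and never has to read from the "device" again.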

There were also a number of patches from other people: Lennert Buytenhek's modify_ldt patch, a bunch from Greg Lonnon, one from Gordon McNutt, and a buffer overrun patch from Henrik Nordstrom.

23 Mar 2001
I spent a few days fixing the infinite recursive context switch bug. That was a lot more complicated than I expected. The fix involved replacing the shadow page tables that represent the mappings on the host for each process with bits in the pte that say whether it is up-to-date or not. These bits are set in the little functions that change ptes and cleared in fix_range after it's updated the mappings. Since the process page tables are per-mm and not per-process, a mapping that was changed in a multi-threaded process would only be updated for one of the threads. This meant that UML processes that share a memory context also need to share a memory context on the host. This in turn complicated exec, since it now needs to create a new host process in order to get out of a shared UML address space. I implemented this a few times, and on about the third try, I got something that works.

So, aside from eliminating a nasty bug, this also makes the modify_ldt fix more useful, since it now should work properly without any extra code, and opens the way to more efficient context switching between threads, since they won't have to go through the remapping that processes need.
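
Here is a toy model in C of the pte-bit scheme described above. The bit values, the eight-entry table, and the remap counter are assumptions for illustration; only the names fix_range and pte come from the real code, and the real fix_range does a good deal more than this.

```c
#include <assert.h>
#include <stddef.h>

#define PTE_PRESENT 0x1UL
#define PTE_NEWPAGE 0x2UL   /* hypothetical spare bit: host mapping is stale */
#define NPTES 8

typedef unsigned long pte_t;

static int host_remaps;     /* counts simulated host remapping operations */

/* Every pte update goes through here, so staleness is always recorded. */
void set_pte(pte_t *pte, unsigned long frame)
{
    assert(pte != NULL);
    *pte = (frame << 2) | PTE_PRESENT | PTE_NEWPAGE;
}

/* Bring host mappings for ptes[start..end) up to date, touching only the
 * entries that changed, and clear their stale bits. */
void fix_range(pte_t *ptes, size_t start, size_t end)
{
    size_t i;

    assert(ptes != NULL && start <= end && end <= NPTES);
    for (i = start; i < end; i++) {
        if (!(ptes[i] & PTE_NEWPAGE))
            continue;
        host_remaps++;              /* stand-in for remapping on the host */
        ptes[i] &= ~PTE_NEWPAGE;
    }
}

int remap_count(void)
{
    return host_remaps;
}
```

Because the state lives in the ptes themselves, bringing the host up to date needs no memory allocation, which is what removes the opening for a recursive context switch under low memory.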

18 Mar 2001
Lots of bugs have been fixed. I got a little list of hostfs complaints from Al Viro, which I think I fixed. hostfs is now pretty solid. I fixed the naming problem which cropped up if you held a file open, then moved its directory and accessed that file by its new name. You'd get 'file not found'. This is because I stored the full host pathname in the inode, and when you changed its name while holding it open by the old name, the inode continued to contain the old bogus name. This was fixed by having anything that needs a filename walk the dentry tree back up to the root, constructing the current filename. The other major problem was that readdir didn't work, resulting in missing files when a directory was copied. These are fixed. What remains is to get rid of some interfaces which will complain about not being implemented.
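
The filename fix amounts to computing the path on demand instead of caching it. A minimal sketch in C, using a made-up struct dnode in place of the kernel's dentry and a made-up build_path:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a dentry: just a name and a parent link. */
struct dnode {
    const char *name;           /* component name; "" for the root */
    struct dnode *parent;       /* NULL for the root */
};

/* Build the current path of d into buf by walking parent links up to the
 * root, filling the buffer from the end. Returns a pointer into buf, or
 * NULL if the path does not fit. */
char *build_path(const struct dnode *d, char *buf, size_t size)
{
    char *p;

    assert(d != NULL && buf != NULL && size > 1);
    p = buf + size - 1;
    *p = '\0';
    for (; d->parent != NULL; d = d->parent) {
        size_t len = strlen(d->name);

        if ((size_t) (p - buf) < len + 1)
            return NULL;        /* no room for "/name" */
        p -= len;
        memcpy(p, d->name, len);
        *--p = '/';
    }
    if (*p == '\0')             /* d was the root itself */
        *--p = '/';
    return p;
}
```

Since nothing is stored, renaming a directory while a file under it is open can't leave a stale name behind: the next walk picks up whatever the tree says now.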

The signal delivery race is fixed. That induced me to clean up a lot of old, crufty code in the kernel entry and exit paths. That's sensitive code, and a few bugs in it caused some very selective and very strange behavior.

I've put together an RPM for UML just in time for the April Linux Magazine to hit the streets with my article in it. This is good, because the article claims that RPMs are available, which they weren't at the time that I wrote it. This also goes some way towards simplifying the network mess. The RPM installs the umn_helper, which lets the umn device run without any help from the user. It also installs the eth tools, which are otherwise hard to find unless you pull them from cvs.

25 Feb 2001
CVS update today. I fixed a few bugs and cleaned up a bunch of things.

I've started keeping an up-to-date TODO list. This will help me not forget anything important. I post it to the -devel list occasionally to prompt people to send in whatever gripes they have.

24 Feb 2001
I'm releasing 2.4.2 today. It has a number of bug fixes and no significant functionality changes.

A number of bugs have cropped up lately. The most significant is a race when a process signal handler returns. There is a narrow window in which an interrupt can cause a crash. The fix is to implement sigreturn like the other arches and run almost all of the kernel code on the kernel stack rather than the process stack as I'm doing now.

8 Feb 2001
I managed to reproduce a number of panics and fixed all but one of them. The key was hitting UML with a high-concurrency ab run with requests that fire off perl scripts which make MySQL requests, with not too much memory, so that it is at least starting to swap.

This reproduced two bugs. One was caused by a failed memory allocation in the middle of setting up a tracing thread request. The failure caused a schedule, which caused a switch request, which blew away the first, partially-set-up request. When the process was rescheduled, its request was garbage, confusing the tracing thread into detaching it. This was fixed by moving the allocation to before the request started being set up.

The other one, which isn't fixed yet, is caused by the shadow page tables maintained by arch/um/kernel/tlb.c. It occasionally needs to allocate a page table when it sets up ptes for a new range of memory. However, if the context switch that it's dealing with was forced by low memory, then that allocation will fail, causing a recursive context switch, and recursion continues until either the stack guard page is hit, or, in the case of a kernel thread, the task structure is polluted. I'm going to fix this by following a suggestion by prumpf, which is to use some spare bits in the pte rather than a separate page table to figure out what parts of the address space need updating.

And, panic number three, which is also fixed, was caused by a faulty notion of when a thread is in kernel space. The old way was to look at whether the thread is being traced. That fails when a breakpoint was put in a signal handler before it requested that tracing be turned off. The fix is to look at the current stack pointer. However, that causes problems when a signal is being delivered to a process. In this case, there is kernel code running on the process stack. So, a flag was added to the thread structure when this is happening.
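
Put as code, the new test looks something like the following. This is a sketch with invented field and function names, assuming the kernel stack is a single contiguous range; it is not the actual UML check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-thread data for the example. */
struct thread {
    unsigned long kstack_base;  /* lowest address of the kernel stack */
    unsigned long kstack_size;
    int delivering_signal;      /* kernel code is running on the process stack */
};

/* Decide whether the thread is executing kernel code, given its current
 * stack pointer. Returns 1 for kernel space, 0 for userspace. */
int in_kernel_space(const struct thread *t, unsigned long sp)
{
    assert(t != NULL);
    if (t->delivering_signal)
        return 1;               /* the stack pointer would mislead us here */
    return sp >= t->kstack_base && sp - t->kstack_base < t->kstack_size;
}
```

The flag covers exactly the window in which the stack pointer alone gives the wrong answer.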

29 Jan 2001
Back from Sydney. The talk went pretty well, it was well attended, and there was a fair amount of interest in UML there. Rik van Riel and I wandered around Sydney until the following Friday.

The slides from my talk are available here.

While I was in .au, my OLS paper proposal was accepted, so it looks like I'll be doing my song and dance in Ottawa this summer.

3 Jan 2001
Updated the web site with a couple pages describing hostfs and the new console/serial line input specification.

The hostfs memory corruption problems are fixed. slab debug found them for me. They turned out to be two string buffer overrun bugs. I'll release a patch with the fixes pretty soon.

1 Jan 2001
I released the uml patch for 2.4.0-prerelease. You can find it here. I'm going to make the full release tomorrow, hopefully after fixing the hostfs crash and getting socket inputs to work.

Today is the deadline for a first draft of my Linux Magazine article and for OLS paper proposals. I sent them both in last night. We'll see what happens.

27 Dec 2000
hostfs now pretty much works. I built UML from inside itself on hostfs. I fixed some bugs in the write code, added enough mmap support to run binaries from hostfs, and implemented statfs. However, there is some as-yet unexplained memory corruption going on.

In an attempt to reproduce the MySQL problems that a couple people are seeing, I moved some of my work, which is heavy on MySQL and perl, into UML. I've seen no problems, which is disappointing because I'm not any closer to finding the bug, but also nice because it shows that it's possible to do real work inside it.

I've also been banging on the ethernet driver trying to reproduce the server buffer overflow that I saw earlier. No dice there either.

The swapoff bug is now fixed. It turned out to be a bad idea to give kernel threads both a non-NULL mm and active_mm. That code has been that way for ages. I have no idea when or why it became a bad thing.

9 Dec 2000
hostfs is now almost all working. mknod doesn't work, and you can't run binaries out of a hostfs filesystem.

I also fixed that pesky linking failure that people have been seeing sporadically for a while. I noticed that profiling was turned on in the latest case that showed up in my inbox. I did a profiling build of my own and lo! it failed to link. Since I could reproduce it, I was out of excuses for not fixing it, and so I did. You can see a full explanation of the problem here.

Those changes plus a couple of smaller ones are now in CVS. They aren't in the latest patch because the SourceForge upload system has been seriously b0rked. I'll update the patch when I can.

7 Dec 2000
I fixed the known bugs in the block driver. The
dd if=/dev/ubd/0 of=/dev/null
hang was due to the driver returning to the block layer rather than continuing to process the queue when it found an out-of-range I/O request. The dbench corruption was due to the elevator rearranging the request queue while a request was in flight. When that request finished, the interrupt handler was supposed to retire it by removing it from the head of the queue. The problem is that the elevator put some other request at the head, and that request was retired without ever being done. Meanwhile, the original request was pushed back in the queue somewhere, and it got done twice.
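
One way to make a completion handler immune to that kind of reordering is to remember which request is in flight and retire that one, wherever it now sits. The C sketch below shows the idea on a toy singly-linked queue; the structures and names are made up, and this is not necessarily how the ubd driver was fixed.

```c
#include <assert.h>
#include <stddef.h>

struct request {
    int sector;
    int done;
    struct request *next;
};

struct queue {
    struct request *head;
    struct request *in_flight;  /* the request handed to the IO thread */
};

/* Start IO on the head of the queue and remember which request it was. */
void start_io(struct queue *q)
{
    assert(q != NULL && q->head != NULL && q->in_flight == NULL);
    q->in_flight = q->head;
}

/* Completion handler: unlink the request that actually finished, wherever
 * the elevator has moved it, instead of assuming it is still the head. */
struct request *retire(struct queue *q)
{
    struct request **pp, *done;

    assert(q != NULL && q->in_flight != NULL);
    done = q->in_flight;
    for (pp = &q->head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == done) {
            *pp = done->next;
            break;
        }
    }
    done->next = NULL;
    done->done = 1;
    q->in_flight = NULL;
    return done;
}
```

With this, a request that the elevator slips in ahead of the in-flight one stays queued and undone, and the finished request can't be run a second time.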

Dan Aloni has started the Windows port. He got most of the kernel to compile. There are a number of undefined symbols from files that don't compile yet. Overall, though, it's looking pretty good.

30 Nov 2000
I updated the site a little. The major changes involved the "ARCH=um" build change. The compilation page is now very explicit about that and there's a FAQ entry for it.

I fixed up the block driver a little. In the past, if you did

dd if=/dev/ubd/0 of=/dev/null
when it ran off the end of the device, it could apparently hang. This is fixed. The problem with dbench is not fixed, but I made the driver's synchronous mode accessible from the command line with the "ubd=sync" switch. In synchronous mode, the driver has no problems with dbench.
18 Nov 2000
I've got hostfs starting to work reasonably. ls now works, you can cd around and cat things. You can't write anything, create files, or execute them yet.
17 Nov 2000
After a bit of a hiatus, I did a CVS update. A number of buglets relating to running UML as a daemon were fixed. The build was cleaned up - I had hard-coded "gcc" instead of "$(CC)" in my Makefiles, the top-level Makefile is now able to do native and user-mode builds, and I cleaned up the drivers and fs Makefiles so that they let Rules.mk do all the hard work.

I'm also back on hostfs. I fixed the mm problems that it uncovered. It can now do ls on the top-level directory.

1 Nov 2000
Linus finally released the final test10 yesterday, so I made my release last night, with a freshmeat announcement this morning. The stack overflow problems in test9 are fixed by doubling the stack size. There is also an inaccessible page between the two stack pages and the task structure, so there shouldn't be any task structure corruption.

There were a number of other fixes. At the last minute, I found and fixed a nasty race which resulted in the kernel tracing its own system calls, resulting in some nasty stack corruption which made it hard to figure out what happened. UML can now run when its main console is not a terminal (i.e. /dev/null). That didn't work because it flipped the terminal between raw and cooked mode, complaining via printk if the ioctls failed. That led to an infinite recursion of printk error messages which ultimately resulted in a segfault. I also made it possible to mount host devices again. That was broken when I made the block driver check IO requests against the device size so it could report errors for out of bounds IO. It turns out not to be possible to get the size of the media behind a block special file, as far as I can tell. So, as far as the block driver was concerned, block devices had zero size, and all IO was out of bounds.

I also started work on the hostfs filesystem. This is a virtual filesystem which provides access to the host filesystem. The theory is straightforward - vfs calls are converted into the equivalent system calls on the host - but this uncovered a subtle memory management bug. If a libc routine which mallocs memory is called, and the break is increased, that extra memory only exists in that process. If the kernel in another process tries using that memory (or tries calling malloc at all), it will fault. What needs to happen is for the context switching code to see if malloc has increased the size of the data segment and map the new memory into the newly running process. This also raises some SMP issues because when the new memory is mapped in, the other processors will need to be told about it so they can also map it. The same is true of the kernel's virtual memory.
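The "vfs calls become host system calls" idea can be sketched in a few lines. This is only an illustration in Python: the HostFS class, its method names, and the path check are hypothetical, and it covers just the read-only operations (listing and reading), not the memory management problem described above.

```python
import os
import tempfile


class HostFS:
    """Read-only view of a host directory (all names here are hypothetical)."""

    def __init__(self, host_root):
        self.host_root = os.path.realpath(host_root)

    def _host_path(self, guest_path):
        # Translate a guest path to a host path, refusing to leave host_root.
        path = os.path.realpath(
            os.path.join(self.host_root, guest_path.lstrip("/")))
        if os.path.commonpath([path, self.host_root]) != self.host_root:
            raise PermissionError(guest_path)
        return path

    def readdir(self, guest_path):
        # "ls" becomes a directory read on the host.
        return sorted(os.listdir(self._host_path(guest_path)))

    def read(self, guest_path, offset, length):
        # "cat" becomes open + pread + close on the host.
        fd = os.open(self._host_path(guest_path), os.O_RDONLY)
        try:
            return os.pread(fd, length, offset)
        finally:
            os.close(fd)


# Small demonstration against a throwaway host directory.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "hello.txt"), "w") as f:
        f.write("hello")
    fs = HostFS(root)
    listing = fs.readdir("/")
    data = fs.read("/hello.txt", 1, 3)
```

Each guest operation maps onto one or two host calls, which is why the translation itself is the easy part.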

20 Oct 2000
At long last, I added a page for related projects and other interesting links.

In other news, it turns out that Michael Vines wrote a Linux executable runner for Windows that does what a UML port to Windows would have to do. He has GPL-ed it and made it available for anyone who wants to incorporate it into a UML Windows port. See the todo page for a link to his stuff.

17 Oct 2000
Back from ALS. The talk went pretty well. I'll put the slides up on the site at some point.

I fixed the stack overflow problems that people were seeing. The stack is now two pages long, with an inaccessible third page protecting the task structure, which is on the fourth. Now, any stack overflows will segfault rather than polluting the task structure, making them a lot easier to debug. This is in CVS along with a few other changes.
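The four-page layout can be modelled with a little arithmetic. This sketch assumes 4096-byte pages, a stack that grows downward, and the task structure on the lowest page; the ordering and all names are assumptions for illustration.

```python
PAGE_SIZE = 4096

# Assumed order, lowest address first: task structure, guard, two stack pages.
TASK_PAGE = 0
GUARD_PAGE = 1
STACK_PAGES = (2, 3)


def region(offset):
    """Classify a byte offset from the base of the four-page block."""
    page = offset // PAGE_SIZE
    if page == TASK_PAGE:
        return "task"
    if page == GUARD_PAGE:
        return "guard"
    if page in STACK_PAGES:
        return "stack"
    raise ValueError("offset outside the four-page block")


def overflow_hits(bytes_past_stack):
    """What a write that many bytes below the lowest stack byte lands on."""
    lowest_stack_byte = STACK_PAGES[0] * PAGE_SIZE
    return region(lowest_stack_byte - bytes_past_stack)
```

Any overflow of up to one page lands on the inaccessible guard page and faults immediately. A single write that jumps more than a page past the stack would still reach the task structure in this model.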

2 Oct 2000
Bill Stearns decided to go overboard on root_fs production. He's been fiddling with the mkrootfs script so that it can handle distros other than Red Hat 6.x. He's done Red Hat 7.0, Mandrake, and Immunix. These are all now available from the project download page. Caldera, Conectiva, and SuSE are in the works.
26 Sep 2000

SGI released a new version of XFS for test5 and I tried to apply it to my test8 um pool, the idea being that I could play with xfs in userspace. The patch went in ok, with some rejects that were not too hard to figure out. After some work, I got it to build. It didn't boot, though. There were some changes in ll_blk_rw.c that I didn't understand, and it looks like they are what resulted in the block device getting a NULL buffer to do I/O into.

So, maybe I'll give XFS another try when SGI gets it slightly more up-to-date.

25 Sep 2000

I found out why the kernel debugging interface doesn't handle breakpoints very well. Setting breakpoints results in process segfaults, floating point exceptions, and other strange behavior. It turns out that do_syscall stored the current register state in the thread structure while determining whether the process was doing a system call. If the process hit a breakpoint in the kernel instead, then that overwrote the state that was stored when the system call was called. When the system call returned, that bogus state was restored, and the process was essentially teleported back into the kernel just after the breakpoint, leading to all kinds of strange behavior. With that problem fixed, things work much better. The kernel debugger seems to be basically healthy, and works just like on a normal process.
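The overwritten-state bug can be reproduced in a toy model. Everything here is hypothetical (a single saved-register slot, made-up addresses, and function names), and the "checked" variant is just one plausible way to avoid the clobbering, not necessarily the fix that went into UML.

```python
USER_IP = 0x08048000    # made-up addresses, for illustration only
KERNEL_IP = 0x10020000


class Thread:
    def __init__(self):
        # One slot holding the state to restore when the syscall returns.
        self.saved_regs = None


def on_stop_buggy(thread, regs, is_syscall):
    # Saves first, decides later: a breakpoint stop clobbers the slot.
    thread.saved_regs = dict(regs)
    return is_syscall


def on_stop_checked(thread, regs, is_syscall):
    # Decide first, and only save when the stop really is a system call.
    if is_syscall:
        thread.saved_regs = dict(regs)
    return is_syscall


def run(on_stop):
    """Syscall entry from user space, then a breakpoint inside the kernel."""
    thread = Thread()
    on_stop(thread, {"ip": USER_IP}, is_syscall=True)
    on_stop(thread, {"ip": KERNEL_IP}, is_syscall=False)
    return thread.saved_regs["ip"]  # where the process resumes
```

With the buggy handler the process resumes at the kernel address after the system call returns, which is the "teleported back into the kernel" behavior; with the checked handler it resumes in user space.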

While I was fixing breakpoints, I decided to see why gdb inside a virtual machine crashes it whenever it sets a breakpoint. There turn out to be a number of problems. First, SIGTRAP wasn't being delivered to the debuggee when it hit a breakpoint. This made it hard for gdb to find out that the breakpoint had been hit, and to remove it temporarily so the debuggee could get by it. Then, it turned out that PTRACE_SINGLESTEP wasn't implemented. This is used by gdb to execute the instruction which had the breakpoint and stop on the next one. There were one or two other buglets, but now that they are fixed, gdb seems happy with breakpoints.

23 Sep 2000
So, I've been a little lax. Here's what's happened in the last few weeks: two bugs were fixed, the reboot bug and the shell segfault bug. That's it.
1 Sep 2000

I realized that I am starting to lose track of bugs and functionality requests, so I dusted off the project's bug tracking system and put everything that I know of in it. I'm also using the patch manager to store the fixes. The idea is that I'll put fixes there and close them when I make a release that contains the fix.

I fixed a context-switching bug noticed by Lennert Buytenhek. The problem turned out to be a race while updating the address space of the process being restarted. If the interrupt handler needed data from the kernel's vm area, and that area hadn't yet been updated, then the kernel would crash. The fix was to disable signals during that period of the context switch.

31 Aug 2000

I put in Andrea's LFS patch. While I was in there, I cleaned that code up somewhat. That is some of the oldest code still remaining, and it really needed some work. I also put in the fix for the crash caused by a module creating a kernel thread. No word yet on whether it's the right fix, though.

Also, Laurent Bonnaud volunteered to update the ancient filesystem in the Debian package to potato. This is very cool. It is something that I've been wanting to do for a long time.

25 Aug 2000

I'm releasing test7 today.

My ALS2000 paper is now available as HTML and TeX.

I redid the RH mkrootfs script. It now prompts for the info it needs. It also works for RH6.1 and probably RH6.2, although I didn't test that.

23 Aug 2000

Finished my ALS paper and sent it in. That's a load off my mind. Made some more CVS checkins. This makes the various debugging options configurable, although I haven't tested the gprof and gcov configurations. I also added some compatibility code to make the Debian install happier. It now recognizes that uml can have disks, but the disk recognizer gets stuck in state 'D' for reasons I haven't figured out yet.

Linus just released test7, so I am building it right now (right now as I'm writing this, and not right now as you're reading it, because I've got no idea when you're reading this). A quick check of the patch shows no changes that I need to worry about, so this looks like a drop-it-in-and-it-just-works patch. We'll see...

Things look good. I booted it up, and ran a few things, and they all worked. So, I'll run the stress testers on it tomorrow, and if that checks out, I'll release it.

21 Aug 2000
I've made the ptrace proxy, gprof, and gcov support configurable. Also started playing with the Debian 2.2 install. It starts up ok, which the 2.1 install doesn't. It looks like I'll have to fake some /proc/ide entries before it will deign to admit that the virtual machine has disks. Right now, it's punting me into the diskless install.
17 Aug 2000
Fixed a network driver bug which caused a crash when ab was run against it. This might also fix the ping flood problem. The fix is in cvs. It will appear for real in test7.
15 Aug 2000
I found out what was causing uml not to boot. It turned out to be a casting bug which was making the compiler do pointer arithmetic rather than integer arithmetic. This was a long-standing bug, and test6 changed things so it got hit more heavily. So, assuming that it's not too badly broken now, I'll release test6 for real.
8 Aug 2000
Revamped the website. It will be put up as soon as Linus releases test6 and I've integrated the changes in. This is because this site talks about stuff which isn't really going to be released until then.
7 Aug 2000

Checked in changes which make the new debugging interface more or less work. I also added a 'debug' command line switch which starts the kernel in the debugger, so you have control of it from the start.

There are some problems with it. Commands attached to breakpoints cause segfaults for some reason. It also can't step across a context switch.

I also put in Rusty's patches. They completely revamp the config mechanism. For some reason, there also seems to be very complete networking/netfilter coverage.

5 Aug 2000
The ptrace proxy is more or less working. I've checked it in to CVS and announced it on my devel list.
4 Aug 2000

Got a couple of patches from Rusty. I'm apparently going to graduate to a complete port once I've applied one of them :-) It is nice. It gives me what looks like a complete config process rather than the one I've kludged together. He also sent in enough exports to allow his stuff to be modular inside a UML.

I'm integrating in Lars Brinkhoff's ptrace proxy. It's partially working - enough that I can attach to the running thread, poke around it, set breakpoints, etc. This works without needing to detach it from the uml tracing thread. I can't ^C gdb and have it stop the kernel wherever it happens to be. It also doesn't seem to be following threads as one goes to sleep and another starts up. Once these work, this will be a huge improvement in uml kernel debugging.

2 Aug 2000

Bill Stearns pointed out a reproducible way of crashing the kernel yesterday. It turns out that irq_save/irq_restore were completely wrong. irq_save was enabling signals when it should have been disabling them. This could explain a lot of the problems I saw in test4.
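Since UML interrupts are host signals, the intended save/restore behavior can be sketched with the host's signal mask. This is a Python illustration using signal.pthread_sigmask; the choice of SIGALRM and SIGIO as the "interrupt" signals and the helper names are assumptions, not the kernel's actual code.

```python
import signal

# Assumption for this sketch: these host signals play the role of interrupts.
IRQ_SIGNALS = {signal.SIGALRM, signal.SIGIO}


def blocked_signals():
    """Return the set of signals currently blocked in this thread."""
    return signal.pthread_sigmask(signal.SIG_BLOCK, set())


def irq_save():
    """Disable "interrupts" by blocking the signals; return the old mask.

    Using SIG_UNBLOCK here would reproduce the bug described above.
    """
    return signal.pthread_sigmask(signal.SIG_BLOCK, IRQ_SIGNALS)


def irq_restore(old_mask):
    """Put back exactly the mask that irq_save() returned."""
    signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


before = blocked_signals()
flags = irq_save()
inside = blocked_signals()      # critical section: signals are held off
irq_restore(flags)
after = blocked_signals()
```

Restoring the saved mask, rather than unconditionally unblocking, keeps nested save/restore pairs correct: an inner restore leaves the signals blocked if the outer save had already blocked them.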

I checked the fix into CVS today.

28 Jul 2000

Linus released test5 last night, so I'm putting out the user-mode version today. There's nothing new in it. The virtual ethernet is in the patch, but not enabled in the binary kernels and off by default in defconfig.

The stress testing of this kernel produced no strange happenings. Maybe the segfaults and other stuff in the last release weren't my fault (heh).

27 Jul 2000
Started integrating Jim Leu's virtual ethernet driver. It basically works, but it misbehaves a fair bit. It's unclear whether that's the kernel's fault or the driver's.
18 Jul 2000

I/O, I/O, off to OLS I go

I'm leaving OLS a bit early because I've got a hiking weekend coming up - Carter Dome on Saturday and possibly Moriah on Sunday. So, no one had better expect any work from me until at least next week...

Plus I'm getting Kije next week.

17 Jul 2000

Announced test4 on freshmeat today.

Also discovered a few more problems which I didn't see on test2 or test3:

  • the occasional process segfault which I've mentioned already
  • a devfs segfault - I did this by displaying an xterm out to the host; when I logged out, the kernel panicked with memory corruption in devfs
  • X clients sometimes can't display against a local X server - strace says that they're stuck in select
  • strace also displayed their read masks as '[?]', which doesn't seem right
Patches for these will be forthcoming when I find fixes.

    14 Jul 2000

    Back from SF. Not only did Linus release test3 on Tuesday (as I discovered when I was checking things out just before leaving for the airport), but he also released test4 yesterday. So, it looks like I'm going to be skipping test3 and going straight to test4.

    The test3 changes are pretty minor, but enough to prevent the um test2 patch from going in. task.priority changed to task.nice, there were some minor locking changes, devfs_mk_dir lost a parameter, and kernel/timer.c doesn't compile because of that field change. With those things fixed, the kernel boots fine.

    I also decided to get rid of the

                    
    pid 16 (mount) - segv changing ip to 0x10025ff2 for address 0x8064000
    
                  
    messages that appear at boot time. They are debugging messages to convince me that the new uaccess macros are working. But they looked abnormal and worried people so they are now gone. Don't worry, be happy.

    The test3 kernel runs my stress tests (lmbench and a kernel build) fine, so I'm checking it in to CVS and announcing it on the devel list.

    On to test4. The timer.c bug got fixed. Otherwise, the patch went in cleanly. It compiles and the resulting kernel boots cleanly. Unfortunately, lmbench segfaults. I put in some debugging code, and lmbench stops segfaulting. On to the kernel build. That works fine. I try a couple more lmbench runs. They work fine. Oh well.

    I'll consider this releasable. Maybe someone else can find a better way to reproduce the problem. Check this stuff into CVS, and out goes the announcement.

    I also updated all of the downloadable stuff and announced it.

    3 Jul 2000
    I fixed the double panic bug. That was caused by a stacksize limit that was not a multiple of 4 meg. The reason that matters is that check_range (in arch/um/kernel/tlb.c), which is used to remap address spaces during a context switch, assumes that remappable areas and non-remappable areas are under different pgdirs, which represent 4 meg apiece. Non-remappable areas are areas of address space which don't belong to the process. Kernel text, data, and physical and virtual memory, plus the original stack, fall into this category. They are represented by vmas in the process mm_struct, but don't have page table entries. If check_range runs into one of these areas in the course of looking at something else, the lack of ptes for it will cause it to be unmapped. Since the process stack is placed just outside the stacksize limit, if that limit is (say) less than 4 meg, when check_range checks it for remapping, it will also run into the main stack provided by the host kernel and unmap it. The panic happened when a process tried to change its name, which is stored in that initial stack.

    If you see this problem, you can change your stacksize limit to a multiple of 4 meg, or apply this patch to the kernel.
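For the workaround, the arithmetic is just rounding the limit up to the area covered by one pgdir entry. A small Python sketch, assuming the 4 MB pgdir size mentioned above; the function names are made up for illustration.

```python
PGDIR_SIZE = 4 * 1024 * 1024   # address space covered by one pgdir entry


def is_pgdir_aligned(stack_limit):
    """True if the stack size limit is a whole number of pgdir areas."""
    return stack_limit % PGDIR_SIZE == 0


def round_up_to_pgdir(stack_limit):
    """Smallest multiple of PGDIR_SIZE that is >= stack_limit."""
    return -(-stack_limit // PGDIR_SIZE) * PGDIR_SIZE


def limit_in_kbytes(stack_limit):
    """The same limit expressed in kilobytes."""
    return stack_limit // 1024
```

For example, a 6 MB limit rounds up to 8 MB, which is 8192 KB. The kilobyte form is handy because shells commonly take the stack limit in kilobytes.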

    Those two fixes are now checked into CVS. Here's the devel list post describing the changes.

    2 Jul 2000
    UML doesn't run on recent 2.3/2.4 kernels and I figured out why. The signal frame size increased due to some extra x86 state that needed to be saved. UML is responsible for making sure that there is enough stack available when it asks the host kernel to send a signal to one of its processes. To do this, it pokes the stack (by reading and writing a word) a little below the current stack pointer. If there is nothing mapped there, the seg fault handler will map a page in and all will be well. The offset that it used to poke was a hard-coded 512 bytes, which I got by looking at the amount of stack state the syscall handler needed (312 bytes) and adding a bit. However, it turns out that the new stack frames are much bigger than that, so the 512 bytes wasn't enough. Fixing this makes UML run on new kernels. If you are seeing this problem, apply this patch to the 2.4.0-test2 pool.
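Why a too-small poke offset fails can be shown with a toy model of mapped stack pages. This Python sketch assumes 4096-byte pages and a downward-growing stack; the 1024-byte frame size is a made-up stand-in for the larger new frames, whose real size is not given here, and all names are hypothetical.

```python
PAGE_SIZE = 4096


def page_of(addr):
    return addr // PAGE_SIZE


def poke(mapped_pages, sp, poke_offset):
    """Touch one word below sp; the fault handler maps that page if needed."""
    mapped_pages.add(page_of(sp - poke_offset))


def frame_fits(mapped_pages, sp, frame_size):
    """True if every page the frame [sp - frame_size, sp) touches is mapped."""
    first = page_of(sp - frame_size)
    last = page_of(sp - 1)
    return all(page in mapped_pages for page in range(first, last + 1))


def deliver(sp, frame_size, poke_offset):
    """Poke the stack, then check whether the signal frame can be written."""
    mapped = {page_of(sp)}          # only the current stack page is mapped
    poke(mapped, sp, poke_offset)
    return frame_fits(mapped, sp, frame_size)


SP = 8 * PAGE_SIZE + 600            # stack pointer 600 bytes into a page
old_frame_ok = deliver(SP, frame_size=312, poke_offset=512)
new_frame_ok = deliver(SP, frame_size=1024, poke_offset=512)
fixed_ok = deliver(SP, frame_size=1024, poke_offset=1024)
```

In this model the 512-byte poke stays on the already-mapped page, so a frame that reaches into the page below finds nothing mapped there. Poking at least as far down as the frame extends avoids that.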

    I'm also chasing a bug which causes a panic like this:

                    
    Kernel panic: Double fault on 0xbffff874 - panicing because it wasn't
    fixed the first time
    
                  