I’ve been working on fos at MIT CSAIL in recent months. fos is a factored operating system, in which the parts of the OS communicate by sending messages to each other, rather than through shared memory with locks, traps, and so forth. The idea of fos is to make an OS for manycore chips that is more scalable than existing systems. It also permits system services to be elastic, growing and shrinking with demand, and it permits the OS to span more than one box, if you want.
The fos messaging system has several implementations. When you haven’t previously sent a message to a particular remote mailbox, you send it to the microkernel, which delivers it. If you keep on sending messages to the same place, then the system creates a shared page between the source and destination address spaces, and messages can flow in user mode, which is faster. Messages that cross machine boundaries are handled by TCP/IP between proxy servers on each end.
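To make the three paths concrete, here is a toy sketch of the selection logic in Python. It is not fos code: the names (`Sender`, `pick_transport`) and the promotion threshold are made up for illustration, since all I said above is that the shared page appears "if you keep on sending" to the same place.

```python
from dataclasses import dataclass, field

# Hypothetical threshold; the real trigger for creating a shared page is not stated here.
PROMOTE_AFTER = 3


@dataclass
class Sender:
    """Toy model of one sending address space (illustrative only, not the fos API)."""
    local_machine: str
    # mailbox name -> number of messages already sent via the microkernel
    sends: dict = field(default_factory=dict)

    def pick_transport(self, mailbox: str, machine: str) -> str:
        """Return the path a message to `mailbox` on `machine` would take."""
        if machine != self.local_machine:
            # Crossing a machine boundary: TCP/IP between proxy servers.
            return "tcp_proxy"
        count = self.sends.get(mailbox, 0)
        if count >= PROMOTE_AFTER:
            # Frequent destination: user-mode shared page, no kernel involved.
            return "shared_page"
        # New or infrequent destination: the microkernel delivers the message.
        self.sends[mailbox] = count + 1
        return "microkernel"


sender = Sender(local_machine="boxA")
for _ in range(5):
    print(sender.pick_transport("netstack", "boxA"))
print(sender.pick_transport("fileserver", "boxB"))
```

The point of the sketch is only that the caller sees one "send" operation while the transport underneath changes with the destination and the traffic pattern.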
I’ve been making the messaging system a bit more object oriented, so that in particular you can have multiple implementations of the user space shared message transport, with different properties. After I got this to pass the regression tests, I checked it in and went on to other stuff.
Charles Gruenwald, one of the grad students, started using my code in the network stack, as part of a project to eliminate multiple copies of messages (I added iovec support, which makes it easier to prepend headers to messages), and his tests were hanging. Charles was kind enough to give me a repeatable test case, so I was able to find two bugs. (And yes, I need to fix the regression tests so that they would have found these!)
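For readers who haven't met iovecs: the idea is to hand the transport a list of buffers instead of one contiguous buffer, so a header can be prepended without first copying header and payload together. The sketch below shows the general technique using Python's `os.writev`, which wraps the POSIX `writev` call. It is not the fos interface, and the 8-byte header layout and the `send_with_header` name are assumptions made up for the example.

```python
import os
import struct


def send_with_header(fd: int, msg_type: int, payload: bytes) -> int:
    """Write a header followed by the payload in a single gather write.

    Returns the number of bytes written.
    """
    # Hypothetical header: message type and payload length, two unsigned
    # 32-bit integers in network byte order.
    header = struct.pack("!II", msg_type, len(payload))
    # os.writev takes a sequence of buffers, like the iovec array given to
    # writev(2). The caller never builds a combined header+payload buffer.
    return os.writev(fd, [header, payload])


# Demonstrate on a pipe, standing in for a real message channel.
read_fd, write_fd = os.pipe()
written = send_with_header(write_fd, 1, b"payload bytes")
received = os.read(read_fd, written)
os.close(read_fd)
os.close(write_fd)

msg_type, length = struct.unpack("!II", received[:8])
print(msg_type, length, received[8:])
```

The receiver sees one contiguous message, while the sender kept the header and payload in separate buffers, which is what makes layering protocol headers cheap.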
Fine.
Next, Chris Johnson, another one of the grad students, picked up changes from Charles (and me), and his test program for memcached started to break.
All the above is just the setup. Chris and I spent about two days tracking this down…
Memcached is a multithreaded application that listens for IP connections, stores data, and gives it back later. It is used by some large scale websites like facebook.com to cache results that would be expensive to recompute.
When a client sends a data object to memcached for storage, memcached replies on the TCP connection with “STORED\r\n”. On occasion, this 8 character message would get back to Chris’s client as “