It has finally gotten cold here. Right now it is about 17F outside. Until now we had been getting by with just the heating zones for the kitchen/family room and the master bedroom turned on.
A few days ago, the boys had trouble getting to sleep while we were watching TV, because the noise from the set was keeping them up. Alex closed the door. The next morning, I noticed it was 55F in their room. Well, I reasoned, the heating zone up there is not turned on, and with the door shut, warm air from the rest of the house can’t get in so easily. I turned on the heat. The next night Alex happened to close the door again, and in the morning it was 52F. That isn’t so good.
Friday we had the neighbors over for dinner so I turned on the dining room heat. A couple hours later I went to check on it and it wasn’t any warmer.
This is our heating system, a gas fired hot water system. The “boiler” is the green box on the lower left. It heats water to 160F or so. From there, there are 9 heating zones. The horizontal pipe manifold in the front is the return path to the boiler. The vertical pipes with yellow shutoffs are the returns for each zone. The supply manifold is behind, along with the pumps and so forth. One zone heats water in the blue tank for domestic hot water faucets and showers. The other zones have circulating pumps that feed tubing that zigzags under the floors. This is called radiant heating.
Each zone typically has a manifold like this one that routes hot water through synthetic rubber tubes that are stapled to the underside of the floors, and insulated below that to direct their heat upwards. This lets you walk around on warm floors and actually get by with colder air temperatures. Our oldest daughter was in the habit of leaving the next day’s clothes on the floor covered with a blanket, so they would be prewarmed in the morning. Notice that one tube is turned off. That one runs underneath the kitchen pantry, which we try to keep colder.
In the main system photo, on the left, you can see electronics boxes on the wall. Here’s a closeup.
Each zone has a thermostat, which comes into one of these boxes. This is a three channel box, with three 24 volt thermostats coming in on brown wires at the top, and red wiring for three 120 volt zone circulator pumps at the bottom. The box also signals the main boiler that heat is being called by at least one zone. Each zone has a plug in relay, one of which I have unplugged.
The circulator pumps look like this:
So there is a central gas water heater, which feeds a number of zones. Each zone has a water circulation pump, controlled by a thermostat. The pump feeds hot water through rubber tubes on the underside of the floors.
Individual zones have failed before. I have fixed them by replacing the circulator pump. You can get these anywhere.
The hardest part about replacing these is the electrical wiring, which is hardwired with wirenuts in the green box attached to the pump. First, turn off the power. I did this by physically pulling the relay for the appropriate zone. Then I measured the pump current using a clamp-on ammeter. Then I measured the voltage. Only then did I unscrew the wirenuts protecting the wires, and, without touching the bare wires, touch the ends to ground. Then brush the wire with the back of your hand only. If the wire is live, the electricity will contract your arm muscles, pulling your hand away. If you can’t think of at least four ways to make sure the wires are not live, hire someone to do this for you. Really. There are old electricians, and there are bold electricians. There are no old, bold electricians. I am an old electrician.
Our system has shutoff valves immediately on both sides of the pump. By turning those off, you can swap out the pump without draining all the water out of the system. As you can see in the picture, the pump is held in place by flanges at the inlet (bottom) and outlet (top). Each flange has two stainless steel bolts, so they won’t rust. In a burst of cleverness or good design, the nuts on these bolts are 11/16 and the bolts themselves are 5/8, so you can take them apart with only one set of wrenches. Here’s the pump I removed.
Note the corrosion inside the pump. I put the new pump in place and turned this zone back on, and now the dining room was getting heat. While I was down there, I took a look at this thing.
This is an air removal valve. It is installed on top of the boiler, along with a pressure relief valve. On some intuition, I lifted the pressure relief valve toggle, and air came out, followed by water. That is not good. The water for a heating system like this comes from town water, which has dissolved gas in it. Typically this will be air, although in the Marcellus Shale areas it can be natural gas (in those areas, you can set your sink on fire). Air is bad for forced hot water systems. It corrodes the inside of the pipes, and water pumps usually won’t pump air. If the radiant tubes get full of air, they will not heat. By the way, these pipes are so rusty because some years ago the boiler was overheating to the point that the relief valve was opening, getting water everywhere. This was because the temperature sensor had come unstuck from the pipe it was measuring. Fixed by a clever plumber with a stainless pipe clamp. As collateral damage from rapid cycling, I had to get a new gas valve too. Separate story.
After waiting a few minutes, I tried the relief valve again and got more air. This meant that the air removal valve wasn’t working, and probably some of my zones weren’t working because of air-bound pumps or bubbles in the pipes. You might be wondering how the valve knows to let out air, but not water. Inside the cylinder is a float. When there is water inside, the float rises and closes the outlet port. When there is air inside, the float falls, opening the outlet port and letting out the air. It is pretty simple. I called a plumber friend to see if he could fix this and he said “if you can replace a zone pump, you can replace this valve too.” Basically, you turn off the system, close all the valves to minimize the amount of water that will come out, depressurize the system, and work fast. A new valve was $13 at Home Depot. The fact they had 10 in stock suggests they do go bad. Unfortunately I failed to depressurize the system as well as I thought, and I got a 3 foot high gusher of 130F water. Be careful! Heating systems run at around 10 psi. The pressure comes partly from town water pressure through a pressure regulator, and partly from the expansion of hot water. There is an expansion tank to reduce that effect.
The next day, I tried the pressure relief valve again and got water immediately. Probably this means the new valve is working.
Each zone has a temperature gauge. You can see that the two on the right are low, and the two on the left in this picture are not. The right hand zone had the pump I replaced. The next one was not turned on. The temperature gauges are there because you don’t want to run 160F water through these radiant tubes. The floors will get too hot and the tubes won’t last very long. Instead, each zone has a check valve and a mixing valve.
The check valve keeps the loop from flowing backwards, or generally keeps it from circulating by gravity. Cold water is slightly denser than hot water, so the water on the colder side of the loop will fall, pulling hot water around the loop even without the pump running. The spring in the check valve is enough to stop gravity circulation.
The mixing valve has a green adjusting knob. This valve mixes hot water from the boiler with cooler water from the return leg of the zone, and serves to adjust the temperature of the water in each zone. Some water recirculates, with some hot water added.
When I turned on the zone second from the right, it did not work. The temperature gauge stayed put at 80F (heat conducted through the copper pipes). I used my ammeter to confirm the pump was drawing power. I turned off the valves for all the other zones, so that this one would have more water. Didn’t work.
There are three reasons why a hot water zone might not work: the pump is not spinning, the pump is trying to pump air, or the pipes are clogged. I had just replaced a pump to fix a zone, but was there a second bad pump? Or something else?
I have an infrared non-contact thermometer, and I used it to measure the pump housing temperatures. The working pumps were all around 125F; the non-working pump was around 175F. That might mean that it was stalled, and not spinning, or that it was pumping air, and not being cooled by the water. I had one more spare pump, but I was getting suspicious.
I got to wondering if the pump I removed was really broken. I knew that these Taco 007-F5 pumps have a replaceable cartridge inside, but since the cartridge costs almost as much as a new pump I had never bothered with it. I decided to take apart the pump I removed to see what it looks like.
The pump housing is on the left. The impeller attached to the replaceable cartridge is in the center, and the motor proper is on the right. The impeller wasn’t jammed, but I wanted to know if it was working at all. I cut the cord off a broken lamp and used it to wire up the pump.
I was careful not to touch the pump while it was plugged in, because you will notice there is no ground. The impeller worked fine. Probably there was never anything wrong with the pump. While I had it set up like this, I measured 0.7 Amps of current when running, which is what it should be. I then held on to the (plastic) impeller and turned it on. When stalled, the motor draw rose to 1.25 Amps. I now had a way to tell if a motor was stalled or spinning! The suspect zone was drawing 0.79 Amps, which probably means it was spinning, and the high temperature meant there was no water inside.
Around this time, Win called to ask me to go pick up firewood. While waiting I explained all this to Cathy. She has a PhD in Chemical Engineering, and has forgotten more about pipes and fluid flow than I will ever learn. She says “are the pumps self-priming?” Priming is the process of getting water into the pump so that it has something to pump. A self-priming pump will pump air well enough to pull water up the pipe from a lower reservoir. A non-self-priming pump will not. These pumps are not self-priming. They depend on something else to get started. Cathy says “are the pumps below the reservoir level?” No, they are above the boiler. Cathy says “I would design such a thing with the pumps below the reservoir level, so they prime automatically”. Um, OK, but how does that help me? Cathy says “Turn off the top valve, take off the top flange and pour water in the top.” Doh.
I didn’t quite do that, because I remembered the geyser I got taking off the air vent. If I could let air out the top, water might flow in from below. All I did was loosen the bolts on the top flange a little. After about 10 seconds, I started getting water drops out of the joint, so I tightened the bolts and turned on the pump. Success! After a few minutes, the temperature gauge started to rise.
So probably my problems were caused by too much air in the system all along.
On the way to buy a new air vent, I stopped at Win’s house to check his air vent, but we couldn’t find it! Either it is hidden away pretty well, which seems like a bad idea, or there isn’t one, which also seems like a bad plan. We’re puzzled, but he has heat. And now, so do I!
UPDATE 12/15/2013
One heating zone still doesn’t work. The temperature gauge near the pump rises to 100, and the nearby pipes are warm, but the pipes upstairs (this is a second floor zone) are cold. I replaced the cartridge of the pump with the one I took apart the other day, and it spins, but there is no change. The pump is drawing current consistent with spinning. I loosen the top flange above the pump and water comes out. These symptoms are consistent with the pump spinning, and having water, but there is no flow all the way around the loop.
I took a detour to the Taco website and looked at the pump performance curves for the 007-F5, which are at http://www.taco-hvac.com/uploads/FileLibrary/00_pumps_family_curves.pdf. A pump has a certain ability to push water uphill. The weight of water above the pump more or less pushes back on the pressure generated by the pump. This height of water is called the “head”. A pump will pump more water against a lower head, and as the head is larger, the pump will deliver less and less water. Above a certain head it won’t work at all. According to the performance curves for my circulating pumps, their flow rate will drop to 0 at 10 feet of head. From the pump location to the distribution manifold in the wall behind the closet in the upstairs bedroom is about 18 feet. This pump cannot work if the pipe is not completely full of water. If both the supply pipe to the upstairs and the return pipe coming back are full of water, then because water is incompressible, the suction of the water falling down the return pipe will balance the weight of water in the supply pipe. If the pipe is full of air, as it likely is, then this pump is not powerful enough to lift water to the top.
The solution to this is to “purge” the air out of the pipes, by using some external source of pressure to push water into the supply end until all the air is pushed out of the return end. For this to work, the return end must be opened up to atmosphere, otherwise there’s no place for the air to go. (It will likely just get squeezed by the pressure, but there is no route for it to get to, for example, the air vent.) I think you need a pretty high flow rate to do this, because the return pipe is 3/4 inch, and without a high flow rate, the air bubbles will float up against the downwards flow of water.
Some systems have air vents at the high points. Mine do not. This would help, because water would flow up both the supply and return pipes, lifted by the 10PSI system pressure. Since it only takes 7.8 psi pressure to lift water 18 feet, this would completely fill the pipes. Of course there would be a potentially leaky air vent inside the walls upstairs, to cause trouble in some future year. I don’t know if the lack of vents is sloppy installation or if one is supposed to use some other method of purging.
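For the record, that 7.8 psi figure is just the standard water-column conversion of about 0.433 psi per foot of height. A one-liner to check the arithmetic (nothing system-specific here):

awk 'BEGIN { printf "%.1f psi to lift water 18 feet\n", 18 * 0.433 }'   # prints 7.8, comfortably below the 10 psi system pressure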
My system installation has no obvious (to me anyway) purge arrangements. To purge, you shut off valves on the boiler, put a hose from a valve on the return side into a bucket of water, and turn on external water on the supply side. When the hose stops bubbling air, you are good to go.
In my system, makeup water comes from the house cold water pipes, through a backflow preventer and a pressure reduction valve to the hot water manifold. The return pipes from the zones flow to the boiler return manifold and then to the boiler. There is no master return shutoff, and no purge tap on the return manifold. There is a drain tap on the boiler itself, and there is a tap between the boiler and a valve that can isolate the boiler from the hot water supply manifold. The pressure regulator has a little lever on the top that, according to its user manual, will open the regulator and let more water through for purging.
I could close the valve to isolate the boiler from the supply manifold, but then the purge water has to run all the way through the boiler to get to the outlet hose. I would lose all the hot water in the boiler.
But I have a missing pump! Years ago, I borrowed the pump from the zone that heats the study, and never put it back. I closed all the supply zone valves except the bad zone, and closed all the return valves except the bad zone and the study zone. I closed the main boiler output valve. At this point, the only path through the system was from the makeup water regulator, through the broken zone, to the return manifold, backwards into the study zone return pipe, through the cold side of the study zone mixing valve, and out the bottom flange of the not-present study pump.
I put a bucket under it and opened the bottom study zone pump valve. Water came out, but after a few gallons, I only got a trickle. I can hear hissing when I open the regulator toggle, but I suspect there is not enough flow to do effective purging. The setup is complicated, so I am not completely sure. In any case, this didn’t fix the not-working zone.
Next step: test the pressure regulator flow by closing all valves except makeup water and the tap that is connected to the boiler outlet manifold. That will let me see the flow supplied by the regulator. I found an old backflow valve and regulator set on the floor. Evidently it was replaced at some point. The old one had a pretty clogged looking inlet screen, so perhaps that is the trouble with the current one as well. That wouldn’t affect normal operations because you don’t need makeup water unless there is a leak.
Bike Safety
I just wanted to mention a few things that would help me survive the week.
I am eagerly awaiting a paved Wayland Rail Trail from the town Library through to Weston, but in the meantime, I bike along route 20. The problem is that few roads in Wayland are bike friendly, but you can help!
(About that rail trail, please see Wayland Rail Trail and check out the Minuteman Bike Trail from Lexington to Alewife or the Charles River Bike Path )
For my fellow residents:
- Take a look at the street in front of your house or other property.
- Keep the shoulders clear of debris, sand, leaves, sticks, broken glass, etc.
- Try and deal with the poison ivy that loves the edges of roads. I am so allergic to that stuff that I don’t dare ride right at the edge.
- If you have a sidewalk, please keep it clear. In addition to the debris, it is hard to navigate around those mailboxes and trash cans.
For our public works folks:
- When we do have sidewalks, they tend to be pretty awful, and unusable for bicycles. The paving isn’t up to street standards, and is broken by roots, holes, etc.
- The sidewalks tend to fill up with leaves, fallen branches, and so forth, which make them unusable.
- Guy wires cross from utility poles at just the right height to clothesline a tall guy like me. Of course they are invisible at dusk!
- Many road corners lack curb cuts, so you can’t actually get on or off the sidewalk anyway.
Without sidewalks, I have to ride in the street. That is fine, but…
- The shoulders are, um, badly paved: potholes, jagged gaps in the top paving, bumpy drains
- The shoulders collect sand, which is like ice for bicycles; you can’t steer on sand.
- On Route 20, there is an unfortunate amount of broken glass.
Maybe we could street sweep more than once a year?
And that paving on Pelham Island Road is nasty, but that is a topic for a different letter.
For Drivers:
Most drivers are actually pretty awesome around bicyclists. Thank you! However:
- Look at that right side mirror once in a while. When you are caught in traffic, I will be passing you at my astounding 12 miles an hour or whatever. I’ll be coming up on your right.
- Don’t keep so far to the right that there isn’t room for me! The lanes are actually fairly wide and the shoulder is often very narrow.
For my part, I signal, I don’t run red lights, and I really try to watch where I am going and to be aware of my surroundings, but not every cyclist (especially the kids) will follow the rules. Treat them with suspicion and when possible, give extra space when passing a cyclist, just in case they have no idea you are there and swerve to miss a stick or pothole.
-Larry
BIOS vs GPT
This might be the 1000th blog posting on this general topic, but for some reason, the complexity of booting grows year over year, sort of like the tax code.
Back in 2009, Win and I built three low power servers, using Intel D945GCLF2 mini-ITX motherboards with Atom 330 processors. We put mirrored 1.5 Terabyte drives in them, and 2 GB of ram, and they have performed very well as pretty low power home servers. We ran the then-current Ubuntu, and only sporadically ran apt-get update and apt-get upgrade.
Fast forward to this summer. We wanted to upgrade the OS’s, but they had gotten so far behind that apt-get update wouldn’t work. It was clearly necessary to reinstall. Now one of these machines is our compound mail server, and another runs mythtv and various other services. The third one was pretty idle, just hosting about a terabyte of SiCortex archives. In a previous blog post I wrote about the month elapsed time it took me to back up that machine.
This post is about the adventure of installing Ubuntu 12.04 LTS on it. (LTS is long term support, so that in principle, we will not have to do this again until 2017. I hope so!)
Previously, SMART tools were telling us that the 2009 era desktop 1.5T drives were going bad, so I bought a couple of 3T WD Red NAS drives, like the ones in our Drobo 5N. Alex (my 14 year old) and I took apart the machine and replaced the drives, with no problem.
I followed directions from the web on how to download an ISO and burn it to a USB drive using MacOS tools. This is pretty straightforward, but not obvious. First you have to convert the iso to a dmg, then use dd to copy it to the raw device:
hdiutil convert -format UDRW -o ubuntu-12.04.3-server-amd64.img ubuntu-12.04.3-server-amd64.iso
# Run diskutil list, then plug in a blank USB key larger than the image size,
# and run diskutil list again to find the drive device (in my case /dev/disk2)
sudo dd if=ubuntu-12.04.3-server-amd64.img.dmg of=/dev/disk2 bs=1m   # notice the .dmg extension that MacOS insists on adding
diskutil eject /dev/disk2   # (or whatever yours is)
Now in my basement, the two servers I have are plugged into a USB/VGA monitor and keyboard switch, and it is fairly slow to react when the video signal comes and goes. In fact it is so slow that you miss the opportunity to type “F2” to enter the BIOS to set the boot order. So I had to plug in the monitor and keyboard directly, in order to enable USB booting. At least it HAS USB booting, because these machines do not have optical drives, since they have only two SATA ports.
Anyway, I was able to boot the Ubuntu installer. Now even at this late date, it is not really well supported to install onto a software RAID environment. It works, but you have to read web pages full of advice, and run the partitioner in manual mode.
May I take a moment to rant? PLEASE DATE YOUR WEB PAGES. It is astonishing how many sources of potentially valuable information fail to mention the date or versions of software they apply to.
I found various pieces of advice, plus my recollection of how I did this in 2009, and configured root, swap, and /data as software RAID 1 (mirrored disks). Ubuntu ran the installer, and… would not reboot. “No bootable drives found”.
During the install, there was an anomaly, in that attempts to set the “bootable” flag on the root filesystem partitions failed, and when I tried it using parted running in rescue mode, it would set the bootable flag, but clear the “physical volume for RAID” flag.
I tried 12.04. I tried 13.04. I tried 13.04 in single drive (no RAID). These did not work. The single drive attempt taught me that the problem wasn’t the RAID configuration at all.
During this process, I began to learn about GPT, or GUID partition tables.
Disks larger than 2T can’t work with MBR (Master Boot Record) style partition tables, because their partition entries hold 32-bit sector counts, and 2^32 sectors of 512 bytes is 2 TiB. Instead, there is a newer GPT (GUID Partition Table) scheme that uses 64 bit numbers.
Modern computers also have something called UEFI instead of BIOS, and UEFI knows about GPT partition tables.
The Ubuntu installer knows that large disks must use GPT, and uses it.
Grub2 knows this is a problem, and requires the existence of a small partition flagged bios_grub, as a place to stash its code, since GPT does not have the blank space after the sector 0 boot code that exists in the MBR world (which grub uses to stash code).
So Ubuntu creates the GPT, the automatic partitioning step creates the correct mini-partition for grub to use, and it seems to realize that grub should be installed on both drives when using an MD filesystem for root (it used the command line grub-install /dev/sda /dev/sdb). Evidently the grub install puts a first stage loader in sector 0, and the second stage loader in the bios_grub partition.
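If you ever had to create that little partition by hand, it would look roughly like this (a sketch only; the installer did this for me, and the device, partition number, and sizes here are just examples):

parted /dev/sda mkpart grubcode 1MiB 2MiB    # a tiny partition near the start of the disk
parted /dev/sda set 1 bios_grub on           # flag it so grub-install will stash its second stage there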
Many web pages say you have to set the “bootable” flag on the MD root, but parted will not let you do this, because in GPT, setting a “bootable” flag is forbidden by the spec. It’s not clear it would work anyway, because when you set it, the “physical volume for RAID” flag is turned off.
The 2009 Atom motherboards do not have a UEFI compatible BIOS, and are expecting an MBR. When they don’t find one, they give up. If they would just load the code in sector 0 and jump to it it would work. I considered doing a bios update, but it wasn’t clear the 2010 release is different in this respect.
So the trick is to use fdisk to create an MBR with a null partition. This is just enough to get past the Atom BIOS’ squeamishness and have it execute the grub loader, which then works fine using the GPT. I got this final trick from http://mjg59.dreamwidth.org/8035.html whose final text is
boot off a live CD and run fdisk against the boot disk. It’ll give a bunch of scary warnings. Ignore them. Hit “a”, then “1”, then “w” to write it to disk. Things ought to work then.
The sequence of steps that worked is:
- Run the installer
- Choose manual disk partitioning
- Choose "automatically partition" /dev/sda. This will create a 1 MB bios_grub partition and a 2 GB swap, and make the rest root
- Delete the root partition
- Create a 100 GB partition from the beginning of the free space
- Mark it "physical volume for RAID", with a comment that it is for root
- Use the rest of the free space (2.9T) to make a partition, mark it "physical volume for RAID", with a comment that it is for /data
- Change the type of the swap partition to "physical volume for RAID"
- Repeat the above steps for /dev/sdb
- Run "configure software RAID"
- Create an MD volume, using RAID 1 (mirrored), 2 drives, 0 spares
- Choose the two swap partitions
- Mark the resulting MD partition as swap
- Create an MD volume, RAID 1, 2 drives, 0 spares
- Select the two 100 GB partitions
- Mark the result for use as EXT4, to be mounted on /
- Create an MD volume, RAID 1, 2 drives, 0 spares
- Select the two 2.9T partitions
- Mark the result for use as EXT4, to be mounted on /data (I considered BTRFS, but the most recent comments I could find still seem to regard it as flakey)
- Save and finish installing Ubuntu
- Pretend to be surprised when it won't boot: "No bootable disks found"
- Reboot from the installer USB, choose Rescue Mode
- Step through it. Do not mount any file systems; ask for a shell in the installer environment
- When you get a prompt: fdisk /dev/sda, then a, 1, w
- Then fdisk /dev/sdb, then a, 1, w
- ^d and reboot. Done

Now I have a working Ubuntu 12.04 server with mirrored 3T drives.
Big Data
I propose a definition of Big Data. Big Data is stuff that you cannot process within the MTBF of your tools.
Here’s the story about making a backup of a 1.1 Terabyte filesystem with several million files.
A few years ago, Win and I built a set of home servers out of mini-ITX motherboards with Atom processors and dual 1.5 Terabyte drives. We built three: one for Win’s house, that serves as the compound IMAP server and such like, one for my house, which mostly has data and a duplicate DHCP server and such like, and one, called sector9, which has the master copy of the various open source SiCortex archives.
These machines are so dusty that it is no longer possible to run apt-get update, and so we’re planning to just reinstall more modern releases. In order to do that, it is only prudent to have a couple of backups.
In the case of sector9, it has a pair of 1.5 T drives set up as RAID 1 (mirrored). We also have a 1.5T drive in an external USB case as a backup device. The original data is still on a 1T external drive, but with the addition of this and that, the size of sector9’s data had grown to 1.1T.
I decided to make a new backup. We have a new Drobo5N NAS device, with 3 3T drives, set up for single redundancy, giving it 6T of storage. Using 1.1T for this would be just fine.
There have been any number of problems.
Idea 1 – mount the Drobo on sector9 and use cp -a or rsync to copy the data
The Drobo supports only AFP (Apple Filesharing Protocol) and CIFS (Windows file sharing). I could mount the Drobo on sector9 using Samba, except that sector9 doesn’t already have Samba, and apt-get won’t work due to the age of the thing.
Idea 2 – mount the Drobo on my Macbook using AFP, and mount sector9 on the Macbook using NFS.
Weirdly, I had never installed the necessary packages on sector9 to export filesystems using NFS.
Idea 3 – mount the Drobo on my Macbook using AFP and use rsync to copy files from sector9.
This works, for a while. The first attempt ran at about 3 MB/second, and copied about 700,000 files before hanging, for some unknown reason. I got it unwedged somehow, but not trusting the state of everything, rebooted the Macbook before trying again.
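The command itself was nothing exotic; every attempt was some variant of this (the paths are illustrative, not my real mount points):

rsync -av sector9:/data/ /Volumes/Drobo/sector9-backup/   # -a preserves times, permissions, and symlinks; -v so you can see where it dies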
The second time, rsync took a couple of hours to figure out where it was, and resumed copying, but only survived a little while longer before hanging again. The Drobo became completely unresponsive. Turning it off and on did not fix it.
I called Drobo tech support, and they were knowledgeable and helpful. After a long sequence of steps, involving unplugging the drives, and restarting the Drobo without the mSata SSD plugged in, we were able to telnet to its management port, but the Drobo Desktop management application still didn’t work. That was in turn resolved by uninstalling and reinstalling Drobo Desktop (on a Mac! Isn’t this disease limited to PCs?)
At this point, Drobo tech support asked me to use the Drobo Desktop feature to download the Drobo diagnostic logs and send them in….but the diagnostic log download hung. Since the Drobo was otherwise operational, we didn’t pursue it at the time. (A week later, I got a followup email asking me if I was still having trouble, and this time the diagnostic download worked, but the logs didn’t show any reason for the original hang.)
By the way, while talking to Drobo tech support, I discovered a wealth of websites that offer extra plugins for Drobos (which run some variant of linux or bsd). They include an NFS server, but using it kind of voids your tech support, so I didn’t.
A third attempt to use rsync ran for a while before mysteriously failing as well. It was clear to me that while rsync will synchronize two filesystems, it might never finish if it has to check its work from the beginning and doesn’t last long enough to finish.
I was also growing nervous about the second problem with the Drobo, that it uses NTFS, not a linux filesystem. As such, it was not setting directory dates, and was spitting warnings about symbolic links. Symbolic links are supposed to work on the Drobo. In fact, I could use ln -s in a Macbook shell just fine, but what shows up in a directory listing is subtly different than what shows up in a small rsync of linux symbolic links.
Idea 4: Mount the drobo on qadgop (my other server, which does happen to have Samba installed) and use rsync.
This again failed to work for symbolic links, and a variety of attempts to change the linux smb.conf file in ways suggested by the Internet didn’t fix it. There were suggestions to root the Drobo and edit its configuration files, but again, that made me nervous.
At this point, my problems are twofold:
- How to move the bits to the Drobo
- How to convince myself that any eventual backup was actually correct.
I decided to create some end-to-end check data, by using find and md5sum to create a file of file checksums.
First, I got to wondering how healthy the disk drives on sector9 actually were, so I decided to try SMART. Naturally, the SMART tools for linux were not installed on sector9, but I was able to download the tarball and compile them from sources. Alarmingly, SMART told me that for various reasons I didn’t understand, both drives were likely to fail within 24 hours. It told me the external USB drive was fine. Did it really hold a current backup? The date on the masking tape on the drive said 5/2012 or so, about a year old.
I started find jobs running on both the internal drives and the external:
find . -type f -exec md5sum {} \; >s9.md5
find . -type f -exec md5sum {} \; >s9backup.md5
These jobs actually ran to completion in about 24 hours each. I now had two files, like this:
root@sector9:~# ls -l *.md5
-rw-r--r-- 1 root root 457871770 2013-07-08 01:24 s9backup.md5
-rw-r--r-- 1 root root 457871770 2013-07-07 21:39 s9.md5
root@sector9:~# wc s9.md5
3405297 6811036 457871770 s9.md5
This was encouraging: the files were the same length. But diffing 450 MB files is not for the faint of heart, especially since find doesn’t enumerate them in the same order. I had to sort each file, then diff the sorted files. This took a while, but in fact the sector9 filesystem and its backup were identical. I resolved to use this technique to check any eventual Drobo backup. It also relieved my worries that the internal drives might fail at any moment. I also learned that the sector9 filesystem had about 3.4 million files on it.
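The comparison itself was just sort and diff, something like this (a sketch using the filenames above):

sort s9.md5 > s9.sorted
sort s9backup.md5 > s9backup.sorted
diff s9.sorted s9backup.sorted    # no output means every checksum and path matched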
Idea 5: Create a container file on the Drobo, with an ext2 filesystem inside, and use that to hold the files.
This would solve the problem of putting symbolic links on the Drobo filesystem (even though it is supposed to work!) It would also fix the problem of NTFS not supporting directory timestamps or linux special files. I was pretty sure there would be root filesystem images in the sector9 data for the SiCortex machine and for its embedded processors, and I would need special files.
But how to create the container file? I wanted a 1.2 Terabyte filesystem, slightly bigger than the actual data used on sector9.
According to the InterWebs, you use dd(1), like this:
dd if=/dev/zero of=container.file bs=1M seek=1153433 count=0
I tried it:
dd if=/dev/zero of=container.file bs=1M seek=1153433
It seemed to take a long time, so I thought probably it was creating a real file, instead of a sparse file, and went to bed.
The next morning it was still running.
That afternoon, I began getting emails from the Drobo that I should add more drives, as it was nearly full, then actually full. Oops. I had left off the count=0.
Luckily, deleting a 5 Terabyte file is much faster than creating one! I tried again, and the dd command with count=0 ran very quickly.
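On a local linux filesystem you can confirm that you really got a sparse file (over a network mount the numbers may not mean much), for example:

ls -ls container.file    # the first column is blocks actually allocated, which should be near zero
du -h container.file     # same story: almost no space used, despite the enormous apparent size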
I thought that MacOS could create the filesystem, but I couldn’t figure out how. I am not sure that MacOS even has something like the linux loop device, and I couldn’t figure out how to get DiskUtility to create a unix filesystem in an image file.
I mounted the Drobo on qadgop, using Samba, and then used the linux loop device to give device level access to the container file, and I was able to mkfs an ext2 filesystem on it.
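Roughly, the steps on qadgop looked like this (a sketch from memory; the share name, mount points, and loop device are examples):

mount -t cifs //drobo/backup /mnt/drobo -o username=guest    # Samba mount of the Drobo share
losetup /dev/loop0 /mnt/drobo/container.file                 # expose the container file as a block device
mkfs -t ext2 /dev/loop0                                      # build an ext2 filesystem inside it
mount /dev/loop0 /mnt/container                              # now it mounts like any other disk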
Idea 6: Mount the container file on MacOS and use rsync to write files into it.
I couldn’t figure out how to mount it! Again, MacOS seems to lack the loop device. I tried using DiskUtility to pretend my container file was a DVD image, but it seems to have hardwired the notion that DVDs must have ISO filesystems.
Idea 7: Mount the Drobo on linux, loop mount the container, USB mount the sector9 backup drive.
This worked, sort of. I was able to use rsync to copy a million files or so before rsync died. Restarting it got substantially further, and a third run appeared to finish.
The series of rsyncs took several days to run. Sometimes they would run at about 3 MB/s, and sometimes at about 7 MB/s. No idea why. The Drobo will accept data at 11 MB/sec using AFP, so perhaps this was due to slow performance of the USB drive. The whole copy took close to 83 hours, as calculated from 1.1 T at 3 MB/sec.
Unfortunately, df said the container filesystem was 100% full and the final rsync had errors “previously reported” but scrolled off the screen. I am pretty sure the 100% is a red herring, because linux likes to reserve 10% of space for root, and the container file was sized to be more than 90% full.
I reran the rsync, under a script(1) to get a log file, and found many errors of the form “can’t mkdir <something or other>”.
Next, I tried mkdir by hand, and it hung. Oops. ps said it was stalled in state D, which I know to be disk wait. In other words, the ext2 filesystem was damaged. By use of kill -9 and waiting, I was able to unmount the loop device and the Drobo, and remount the Drobo.
Next, I tried using fsck to check the container filesystem image.
fsck takes hours to check a 1.2T filesystem. Eventually, it started asking me about random problems and could I authorize it to fix them? After typing “y” a few hundred times, I gave up, killed the fsck, and restarted it as fsck -p to automatically fix problems. Recall that I don’t actually care if it is perfect, because I can rerun rsync and check the final results using my md5 checksum data.
The second attempt to run fsck didn’t work either:
root@qadgop:~# fsck -a /dev/loop0
fsck 1.41.4 (27-Jan-2009)
/dev/loop0 contains a file system with errors, check forced.
/dev/loop0: Directory inode 54583727, block 0, offset 0: directory corrupted
/dev/loop0: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
(i.e., without -a or -p options)
Hoping that the fsck -a had fixed most of the problems, I ran it a third time, again without -a, but I wound up typing ‘y’ a few hundred more times. fsck took about 300 minutes of CPU time on the Atom to do this work and left 37 MB worth of files and directories in /lost+found.
With the container filesystem repaired, I started a fourth rsync, which actually finished, transferring another 93 MB.
Next step – are the files really all there and all the same? I’ll run the find -exec md5sum to find out.
Um. Well. What does this mean?
root@qadgop:~# wc drobos9.md5 s9.md5
3526171 7052801 503407914 drobos9.md5
3405297 6811036 457871770 s9.md5
The target has 3.5 million files, while the source has 3.4 million files! That doesn’t seem right. An hour of running “du” and comparing the top few levels of directories shows that while rerunning rsync to finish interrupted copies works, you really have to use the same command lines. I had what appeared to be a complete copy one level below a partial copy. After deleting the extra directories, and using fgrep and sed to rewrite the path names in the file of checksums, I was finally able to do a diff of the sorted md5sum files:
Out of 3.4 million files, there were 8 items like this:
1402503c1402502
< 51664d59ab77b53254b0f22fb8fdb3a8 ./sicortex-archive/stash/97/sha1_97e18c8e2261b09e21b0febd75f61635d7631662_64088060.bin
---
> 127cc574dcb262f4e9e13f9e1363944e ./sicortex-archive/stash/97/sha1_97e18c8e2261b09e21b0febd75f61635d7631662_64088060.bin
and one like this:
> 8d9364556a7891de1c9a9352e8306476 ./downloads.sicortex.com/dreamhost/ftp.downloads.sicortex.com/ISO/V3.1/.SiCortex_Linux_V3.1_S_Disk1_of_2.iso.eNLXKu
The second one is easier to explain: it is a partially completed rsync, so I deleted it. The other 8 appear to be files that were copied incorrectly! I should have checked the lengths, because these could be copies that failed due to running out of space, but I just reran rsync on those 8 files in --checksum mode.
Final result: 1.1 Terabytes and 3.4 million files copied. Elapsed time, about a month.
What did I learn?
- Drobo seems like a good idea, but systems that ever need tech support intervention make me nervous. My remaining worry about it is proprietary hardware. I don’t have the PC ecosystem to supply spare parts. Perhaps the right idea is to get two.
- Use linux filesystems to hold linux files. It isn’t just that linux filenames can differ only in capitalization; it is also the need to hold special files and symlinks. Container files and loop mounting work fine.
- Keep machines updated. We let these get so far behind that we could no longer install new packages.
- A meta-rsync would be nice, that could use auxiliary data to manage restarts.
- Filesystems really should have end-to-end checksums. ZFS and BTRFS seem like good ideas.
- SMB, or CIFS, or the Drobo, or AFP are not good at metadata operations; writing large numbers of individual files on the Drobo failed no matter how I tried it. SMB read/write access to a single big file seems to be perfectly reliable.
wget
I am struggling here to decide whether the Bradley Manning prosecutors are disingenuous or just stupid.
I am reacting here to Cory Doctorow’s report that the government’s lawyers accuse Manning of using that criminal spy tool wget.
Notes from the Ducking Stool
I am hoping for stupid, because if they are suggesting to the jury facts they know not to be true, then that is a violation of ethics, their oaths of office, and any concept of justice.
Oh, and wget is exactly what I used, the last time I downloaded files from the NSA.
Really.
A while back, the back issues of the NSA internal newsletter Cryptolog were declassified so I downloaded the complete set. I think the kids are puzzled about why I never mind having to wait in the car for them to finish something or other, but it is because I am never without a large collection of fascinating stuff.
Here’s how I got them, after scraping the URLs out of the agency’s HTML:
wget http://www.nsa.gov/public_info/_files/cryptologs/cryptolog_01.pdf
wget http://www.nsa.gov/public_info/_files/cryptologs/cryptolog_02.pdf
wget http://www.nsa.gov/public_info/_files/cryptologs/cryptolog_03.pdf
wget http://www.nsa.gov/public_info/_files/cryptologs/cryptolog_04.pdf
. . .
wget http://www.nsa.gov/public_info/_files/cryptologs/cryptolog_132.pdf
wget http://www.nsa.gov/public_info/_files/cryptologs/cryptolog_133.pdf
wget http://www.nsa.gov/public_info/_files/cryptologs/cryptolog_134.pdf
wget http://www.nsa.gov/public_info/_files/cryptologs/cryptolog_135.pdf
wget http://www.nsa.gov/public_info/_files/cryptologs/cryptolog_136.pdf
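(A loop would have done the same job, assuming the numbering really is contiguous from 01 through 136 as above:)

for i in $(seq 1 136); do
  wget http://www.nsa.gov/public_info/_files/cryptologs/cryptolog_$(printf '%02d' $i).pdf
done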
Email Disaster Recovery and Travel adventures
Cathy is off to China for a few weeks. She wanted email access, but not with her usual laptop.
She uses Windows Vista on a plasticy HP laptop from, well, the Vista era. It is quite heavy, and these days quite flaky. It has a tendency to shut down, although not for any obvious reason, other maybe than age, and being Vista running on a plasticy HP laptop.
I set up the iPad, but Cathy wanted a more familiar experience, and needed IE in order to talk to a secure webmail site, so we dusted off an Asus EEE netbook running Windows XP.
I spent a few hours trying to clear off several years of accumulated crapware, such as three different search toolbars attached to Internet Explorer, then gave up and re-installed XP from the recovery partition. 123 Windows Updates later, it seemed fine, but still wouldn’t talk to the webmail site. It turns out that Asus thoughtfully installed the open source local proxy server Privoxy, with no way to uninstall it. If you run the Privoxy uninstall, it leaves you with no web access at all. I finally found Interwebs advice to also uninstall the Asus parental controls software, and that fixed it.
Next, I installed Thunderbird, and set it up to work with Cathy’s account on the family compound IMAP server. I wanted it to work offline, in case of spotty WiFi access in China, so I “unsubscribed” from most of the IMAP folders and let it download. Now Cathy’s inbox has 34,000 messages in it, and I got to thinking “what about privacy?” After all, governments, especially the United States, claim the right to search all electronic devices at the border, and it is also commonly understood that any electronic device you bring to China can be pwned before you come back.
Then I found a setting that tells Thunderbird to download only the last so many days for offline use. Great! But it had already downloaded all 6 years of back traffic. Adjacent, there is a setting for “delete mail more than 20 days (or whatever) old.”
You know what happens next! I turned that on, and Thunderbird started deleting all Cathy’s mail, both locally and on the server. Now there is, farther down the page, fine print that explains this will happen, but I didn’t read it.
Parenthetically, this is an awful design. It really looks like a control associated with how much mail to keep for offline use, but it is not. It is a dangerous, unguarded, unconfirmed command that does irreversible damage.
I thought this was taking too long, but by the time I figured it out, it was way too late.
So, how to recover?
I have been keeping parallel feeds from Cathy’s email, but only since March or so, since I’ve been experimenting with various spam suppression schemes.
I had made a copy of Cathy’s .maildir on the server, but it was from 2011.
But wait! Cathy’s laptop was configured for offline use, and had been turned off. Yes! I opened the lid and turned off WiFi as quickly as possible, before it had a chance to sync. (Actually, the HP has a mechanical switch to turn off WiFi, but I didn’t know that.) I then changed the username/password on her laptop Thunderbird to stop further syncing.
Next, since the horse was well out of the barn, I made a snapshot of the server .maildir, and of the HP laptop’s Thunderbird profile directories. Now, whatever I did, right or wrong, I could do again.
Time for research!
What I wanted to do seemed clear: restore the off-line saved copies of the mail from the HP laptop to the IMAP server. This is not a well-travelled path, but there is some online advice:
http://www.fahim-kawsar.net/blog/2011/01/09/gmail-disaster-recovery-syncing-mail-app-to-gmail-imap/
https://support.mozillamessaging.com/en-US/kb/profiles
The general idea is:
- Disconnect from the network
- Make copies of everything
- While running in offline mode, copy messages from the cached IMAP folders to “Local” folders
- Reconnect to the network and sync with the server. This will destroy the cached IMAP folders, but not the new Local copies
- Copy from the Local folders back to the IMAP server folders
Seems simple, but in my case, there were any number of issues:
- Not all server folders were “subscribed” by Thunderbird, and I didn’t know which ones were
- The deletion was interrupted at some point
- I didn’t want duplicated messages after recovery
- INBOX was 10.3 GB (!)
- The Thunderbird profile altogether was 23 GB (!)
- The HP laptop was flakey
- Cathy’s about to leave town, and needs last minute access to working email
One thing at a time.
Tools
I found out about “MozBackup” and used it to create a backup copy of the HP laptop’s profile directory.
MozBackup
MozBackup creates a zip file of the contents of a Thunderbird profile directory, and can restore them to a different profile on a different computer, making configuration changes as appropriate. This is much better than hand-editing the various Thunderbird configuration files.
Hardware problems
As I mentioned, the HP laptop is sort of flakey. I succeeded in copying the Thunderbird profile directory, but 23 GB worth of copying takes a long time on a single 5400 rpm laptop disk. I tried copying to a Mybook NAS device, but it was even slower. What eventually worked, not well, but adequately, was copying to a 250GB USB drive.
I decided to leave the HP out of it, and to do the recovery on the netbook, the only other Windows box available. I was able to create a second profile on the netbook, and restore the saved profile to it, slowly, but I realized Cathy would leave town before I finished all the steps, taking the netbook with her. Back to the HP.
First I tried just copying the mbox and .msf files from the IMAPMail subfolder to Local Folders. This seemed to work, but Thunderbird got very confused about it. It said there were 114,000 messages in Inbox, rather than 34,000. This shortcut is a dead end.
I created a new profile on the HP, and restored the backup using MozBackup (which took 2 hours), and started it in offline mode. I then tried to “select-all” in Inbox to copy them to a local folder. Um. No. I couldn’t even get control back. Thunderbird really cannot select 34000 messages and do anything.
Because I was uncertain about the state of the data, I restored the backup again (another 2 hours).
This time, I decided to break up Inbox into year folders, each holding about 7000 messages. The first one worked, but then the HP did an unexpected shutdown during the second, and when it came back, Inbox was empty! The Inbox mbox file had been deleted.
I did another restore, and managed to create backup files for 2012 and 2011 messages, before it crashed again. (And Inbox was gone AGAIN)
The technique seemed likely to eventually work, but it would drive me crazy. Or crazier.
I was now accumulating saved Local Folder files representing 3 specific years of Inbox. I still had to finish the rest, deal with Sent Mail, and audit about 50 other subfolders to see if they needed to be recovered.
I wasn’t too worried about all the archived subfolders, since they hadn’t changed in ages and were well represented by my 2011 copy of Cathy’s server .maildir
Digression
What about server backups? Embarrassing story here! Back in 2009, Win and I built some nice mini-ITX Atom based servers with dual 1.5T disks run in mirrored mode for home servers. Win’s machine runs the IMAP server, and mine mostly has data storage. Each machine has the mirrored disks for reliability and a 1.5T USB drive for backup. The backups are irregularly kept up to date, and in the IMAP machine’s case, not recently.
About 6 months ago, I got a family pack of CrashPlan for cloud backup, and I use it for my Macbook and for my (non IMAP) server, but we had never gotten around to setting up CrashPlan for either Cathy’s laptop or the IMAP server.
A few months ago, we got a Drobo 5N, and set it up with 3 3T disks, for 6T usable storage, but we haven’t gotten it working for backup either. (I am writing another post about that.)
So, no useful server backups for Cathy’s mail.
Well now what?
I have a nice Macbook Pro; unfortunately, the 500 GB SSD has 470 GB of data on it, not enough free space for one copy of Cathy’s cached mail, let alone two. I thought about freeing up space, and copied a 160 GB Aperture photo library to two other systems, but it made me nervous to delete it from the Macbook.
I then tried using Mac Thunderbird to set up a profile on that 250 GB external USB drive, but it didn’t work because the FAT filesystem couldn’t handle Mac Thunderbird’s need for fancy filesystem features like ACLs, but this triggered an idea!
First, I was nervous about using Mac Thunderbird to work on backup data from a PC. I know that Thunderbird profile directories are supposed to be cross-platform, but the config files like profile.ini and prefs.js are littered with PC pathnames.
Second, the USB drive is slow, even if it worked.
Up until recently, I’ve been using a 500 GB external Firewire drive for TimeMachine backups of the Macbook. It still was full of Time Machine data, but I’ve switched to using a 1T partition on the Drobo for TimeMachine. I also have the CrashPlan backup. So I reformatted the Firewire Drive to HFS, and plugged it in as extra storage.
Also on the Macbook is VMWare Fusion, and one of my VMs is a 25 GB instance running XP Pro.
I realized I should be able to move the VM to the Firewire drive, and expand its storage by another 50 GB or so to have room to work on the 23 GB Thunderbird data.
To the Bat Cave!
It turns out to be straightforward to copy a VMWare image to another place, and then run the copy. Rather than expand the 25GB primary disk, I just added a second virtual drive and used XP Disk management to format it as drive E. I also used VMWare sharing to share access to the underlying Mac filesystem on the Firewire drive.
- Copy VMWare image of XP to the Firewire drive
- Copy MozBackup save file of the cached IMAP data and the various Local Files folders to the drive
- Create second disk image for XP
- Run XP under VMWare Fusion on the Macbook, using the Firewire drive for backing store
- Install Thunderbird and MozBackup
- Use Mozbackup to restore Cathy’s cached local copies of her mail from the flakey HP laptop
- Copy the Local Files mailbox files for 2013, 2012, and 2011 into place.
- Use XP Thunderbird running under VMWare to copy the rest of the cached IMAP data into Local Folders.
- By hand, compare message counts of all 50 or so other IMAP folders in the cached copy with those still on the server, and determine they were still correct.
- Go online, letting Thunderbird sync with the server, deleting all the locally cached IMAP data.
- Create IMAP folders for 2007 through 2013, plus Sent Mail and copy the roughly 40000 emails back to the server.
Notes
During all of this, new mail continued to arrive into the IMAP server, and be accessible by the instance of Thunderbird on the netbook.
A copy of Cloudmark Desktop One was running on the Macbook, using Mac Thunderbird to do spam processing of arriving email in Cathy’s IMAP account.
My psyche is scarred, but I did manage to recover from a monstrous mistake.
Lessons
- RAID IS NOT BACKUP
The IMAP server was reliable, but it didn’t have backups that were useful for recovery.
- Don’t think you understand what a complex email client is going to do
Don’t experiment with the only copy of something! I should have made a copy of the IMAP .maildir in a new account, and then futzed with the netbook thunderbird to get the offline use storage the way I wanted.
- Quantity has a quality all its own.
This quote is usually about massive armies, but in this case, the very large mail store (23 GB) made even the simplest operations slow, and some things (like selecting all in a folder with 34,000 messages) impossible. I had to go through a lot of extra work because various machines didn’t have enough free storage, and had other headaches because the MTBF of the HP laptop was less than the time to complete tasks.
-Larry
Hypervisor Hijinks
At my office, we have a rack full of Tilera 64-core servers, 120 of them. We use them for some interesting video processing applications, but that is beside the point. Having 7680 of something running can magnify small failure rates to the point that they are worth tracking down. Something that might take a year of runtime to show up can show up once an hour on a system like this.
Some of the things we see tend, with some slight statistical flavor, to occur more frequently on some nodes than on others. That just might make you think that we have some bad hardware. Could be. We got to wondering whether running the systems at slightly higher core voltages would make a difference, and indeed, one can configure such a thing, but basically you have to reprogram the flash bootloaders on 120 nodes. The easiest thing to do was to change both the frequency and the voltage, which isn’t the best thing to do, but it was easy. The net effect was to reduce the number of already infrequent faults on the nodes where they occurred, but to cause, maybe, a different sort of infrequent fault on a different set of nodes.
Yow. That is NOT what we wanted.
We were talking about this, and I said about the stupidest thing I’ve said in a long time. It was, approximately:
I think I can add some new hypervisor calls that will let us change the core voltage and clock frequency from user mode.
This is just a little like rewiring the engines of an airplane while flying, but if it were possible, we could explore the infrequent fault landscape much more quickly.
But really, how hard could it be?
Tilera, to their great credit, supplies a complete Multicore Development Environment which includes the linux kernel sources and the hypervisor sources.
The Tilera version of Linux has a fairly stock kernel which runs on top of a hypervisor that manages physical chip resources and such things as TLB refills. There is also a hypervisor “public” API, which is really not that public; it is available to the OS kernel. The Tilera chip has 4 protection rings. The hypervisor runs in kernel mode. The OS runs in supervisor mode, and user programs can run in the other two. The hypervisor API has things like load this page table context, or flush this TLB entry, and so forth.
As part of the boot sequence, one of the things the hypervisor does is to set the core voltage and clock frequency according to a little table it has. The voltage and frequency are set together, and the controls are not accessible to the Linux kernel or to applications. Now it is obviously possible to change the values while running, because that is what the boot code does. What I needed to do was to add some code to the hypervisor to get and set the voltage and frequency separately, while paying attention to the rules implicit in the table. There are minimum and maximum voltages and frequencies beyond which the chip will stop working, and there are likely values that will cause permanent damage. There is also a relation between the two – generally higher frequencies will require higher voltages. Consequently it is not OK to set the frequency too high for the current voltage, or to set the voltage too low for the current frequency.
Fine. Now I have subroutine calls inside the hypervisor. In order to make them available to a user mode program running under Linux, I have to add hypervisor calls for the new functions, and then add something like a loadable kernel module to Linux to call them and to make the functionality available to user programs.
The kernel piece is sort of straightforward. One can write a loadable kernel module that implements something called sysfs. These are little text files in a directory like /sys/kernel/tilera/ with names like “frequency” and “voltage”. Through the magic of sysfs, when an application writes a text string into one of these files, a piece of code in the kernel module gets called with the string. When an application reads one of these files, the kernel module gets called to provide the text.
Now, with the kernel module at the top, and the new subroutines in the hypervisor at the bottom, all I need to do is wire them together by adding new hypervisor calls.
Hypervisor calls made by Linux go through hypervisor glue. The glue area starts at 0x10000 above the base of the text area, and each possible call has 0x20 bytes of instructions available.
Sometimes, as with “nanosleep”, the call is implemented entirely inline in those 0x20 bytes. Mostly, though, the code in the glue area just loads a register with a call number and does a software interrupt.
The code that builds the glue area is in hv/tilepro/glue.S.
For example, the nanosleep code is
hv_nanosleep:
        /* Each time through the loop we consume three cycles and
         * therefore four nanoseconds, assuming a 750 MHz clock rate.
         *
         * TODO: reading a slow SPR would be the lowest-power way
         * to stall for a finite length of time, but the exact delay
         * for each SPR is not yet finalized.
         */
        {
         sadb_u r1, r0, r0
         addi r0, r0, -4
        }
        {
         add r1, r1, r1    /* force a stall */
         bgzt r0, hv_nanosleep
        }
        jrp lr
        fnop
while most others are
GENERIC_SWINT2(set_caching)
or the like, where GENERIC_SWINT2 is a macro:
#define GENERIC_SWINT2(name) \
        .align ALIGN ; \
hv_##name: \
        moveli TREG_SYSCALL_NR_NAME, HV_SYS_##name ; \
        swint2 ; \
        jrp lr ; \
        fnop
The glue.S source code is written in a positional way, like
GENERIC_SWINT2(get_rtc)
GENERIC_SWINT2(set_rtc)
GENERIC_SWINT2(flush_asid)
GENERIC_SWINT2(flush_page)
so the actual address of the linkage area for a particular call like flush_page depends on the exact sequence of items in glue.S. If you get them out of order or leave a hole, then the linkage addresses of everything later will be wrong. So to add a hypercall, you add items immediately after the last GENERIC_SWINT2 or ILLEGAL_SWINT2.
In the case of the set_voltage calls we have:
ILLEGAL_SWINT2(get_ipi_pte)
GENERIC_SWINT2(get_voltage)
GENERIC_SWINT2(set_voltage)
GENERIC_SWINT2(get_frequency)
GENERIC_SWINT2(set_frequency)
With this fixed point, we work in both directions, down into the hypervisor to add the call and up into linux to add something to call it.
Looking back at the GENERIC_SWINT2 macro, it loads a register with the value of a symbol like HV_SYS_##name, where name is the argument to GENERIC_SWINT2. This uses the C preprocessor token-pasting operator ##, which concatenates tokens. So
GENERIC_SWINT2(get_voltage)
expects a symbol named HV_SYS_get_voltage. IMPORTANT NOTE – the value of this symbol has nothing to do with the hypervisor linkage area; it is only used in the swint2 implementation. The HV_SYS_xxx symbols are defined in hv/tilepro/syscall.h and are used by glue.S to build the code in the hypervisor linkage area, and also used by hv/tilepro/intvec.S to build the swint2 handler.
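For orientation only, the syscall.h side might look roughly like this. The 58 for get_voltage is consistent with the moveli instruction in the objdump output further down; the other numbers are placeholders I am guessing at, not the real values.

/* Hypothetical sketch of the additions to hv/tilepro/syscall.h.
 * 58 matches the objdump output below; the rest are guesses. */
#define HV_SYS_get_voltage    58
#define HV_SYS_set_voltage    59
#define HV_SYS_get_frequency  60
#define HV_SYS_set_frequency  61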
In hv/tilepro/intvec.S we have things like
syscall HV_SYS_flush_all,   syscall_flush_all
syscall HV_SYS_get_voltage, syscall_get_voltage
in an area called the syscall_table with the comment
// System call table.  Note that the entries must be ordered by their
// system call numbers (as defined in syscall.h), but it's OK if some numbers
// are skipped, or if some syscalls exist but aren't present in the table.
where syscall is a Tilera assembler macro:
.macro syscall number routine
        .org syscall_table + ((number) * 4)
        .word routine
.endm
And indeed, the use of .org makes sure that the offset of the entry in the syscall table matches the syscall number. The second argument is the symbol, elsewhere in the hypervisor sources, of the code that implements the function.
In the case of syscall_get_voltage, the code is in hv/tilepro/hw_config.c:
int syscall_get_voltage(void)
{
        return(whatever);
}
So at this point, if something in the linux kernel manages to transfer control to text + 0x10000 + whatever the offset of the code in glue.S is, then a swint2 with argument HV_SYS_get_voltage will be made, which will transfer control in hypervisor mode to the swint2 handler, which will make a function call to syscall_get_voltage in the hypervisor.
But what is the offset in glue.S?
It is whatever you get incrementally by assembling glue.S, but in practice, it had better match the values given in the “public hypervisor interface” which is defined in hv/include/hv/hypervisor.h
hv/include/hv/hypervisor.h has things like
/** hv_flush_all */
#define HV_DISPATCH_FLUSH_ALL            55

#if CHIP_HAS_IPI()
/** hv_get_ipi_pte */
#define HV_DISPATCH_GET_IPI_PTE          56
#endif

/* added by QRC */
/** hv_get_voltage */
#define HV_DISPATCH_GET_VOLTAGE          57
and these numbers are similar to, but not identical to, those in syscall.h. Do not confuse them!
Once you add the entries to hypervisor.h, it is a good idea to check them against what is actually in the glue.o file. You can use tile-objdump for this:
tile-objdump -D glue.o
which generates:
...
00000700 <hv_get_ipi_pte>:
     700:  1fe6b7e070165000  { moveli r0, -810 }
     708:  081606e070165000  { jrp lr }
     710:  400b880070166000  { nop ; nop }
     718:  400b880070166000  { nop ; nop }

00000720 <hv_get_voltage>:
     720:  1801d7e570165000  { moveli r10, 58 }
     728:  400ba00070166000  { swint2 }
     730:  081606e070165000  { jrp lr }
     738:  400b280070165000  { fnop }
...
and if you divide hex 720 by hex 20 you get the linkage number.
I use bc for this sort of mixed-base calculating:
stewart$ bc
bc 1.06
Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
ibase=16
720/20
57
^D
stewart$
and we see that we got it right: the linkage number for get_voltage is indeed 57.
Now let’s turn to Linux. The architecture-dependent stuff for Tilera is in src/sys/linux/arch/tile.
The idea is to build a kernel module that will implement a sysfs interface to the new voltage and frequency calls.
The module get and set routines will call hv_set_voltage and hv_get_voltage.
The hypervisor call linkage is done by linker magic, via a file arch/tile/kernel/hvglue.lds, which is a linker script. In other words, the kernel has no definitions for these hv_ symbols; they are defined at link time by the linker script. For each hv call, it has a line like
hv_get_voltage = TEXT_OFFSET + 0x10720;
and you will recognize our friend 0x720 as the offset of this call in the hypervisor linkage area. Unfortunately, this doesn’t help with a separately compiled module, because a module doesn’t have a way to use such a script (when I tried it, TEXT_OFFSET was undefined; presumably that is defined in the kernel’s main linker script).
So to make a hypervisor call from a loadable module, you need a trampoline. I put them in arch/tile/kernel/qrc_extra.c, like this
int qrc_hv_get_voltage(void)
{
        int v;
        printk("Calling hv_get_voltage()\n");
        v = hv_get_voltage();
        printk("hv_get_voltage returned %d\n", v);
        return(v);
}
EXPORT_SYMBOL(qrc_hv_get_voltage);
The EXPORT_SYMBOL is needed to let modules use the function.
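Putting the pieces together, a minimal sketch of what the sysfs module might look like is below. This is not the actual module: the error handling, the voltage units, and the write-side trampoline qrc_hv_set_voltage are assumptions on my part (only qrc_hv_get_voltage appears above), and the frequency file is omitted for brevity.

/* Hypothetical sketch of the sysfs module, assuming the qrc_hv_* trampolines
 * are built into the kernel and exported as described above. */
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/kobject.h>
#include <linux/module.h>
#include <linux/sysfs.h>

extern int qrc_hv_get_voltage(void);
extern int qrc_hv_set_voltage(int mv);   /* assumed counterpart trampoline */

/* Reads of /sys/kernel/tilera/voltage land here. */
static ssize_t voltage_show(struct kobject *kobj,
                            struct kobj_attribute *attr, char *buf)
{
        return sprintf(buf, "%d\n", qrc_hv_get_voltage());
}

/* Writes of a text string land here; parse it and call down. */
static ssize_t voltage_store(struct kobject *kobj,
                             struct kobj_attribute *attr,
                             const char *buf, size_t count)
{
        int mv;
        if (kstrtoint(buf, 10, &mv))
                return -EINVAL;
        qrc_hv_set_voltage(mv);
        return count;
}

static struct kobj_attribute voltage_attr =
        __ATTR(voltage, 0644, voltage_show, voltage_store);

static struct kobject *tilera_kobj;

static int __init qrc_sysfs_init(void)
{
        tilera_kobj = kobject_create_and_add("tilera", kernel_kobj);
        if (!tilera_kobj)
                return -ENOMEM;
        return sysfs_create_file(tilera_kobj, &voltage_attr.attr);
}

static void __exit qrc_sysfs_exit(void)
{
        kobject_put(tilera_kobj);
}

module_init(qrc_sysfs_init);
module_exit(qrc_sysfs_exit);
MODULE_LICENSE("GPL");

Once something like that loads, reading /sys/kernel/tilera/voltage goes down through the trampoline, the hypervisor glue, and the swint2 handler, and writing a number to the file goes the other way.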
But where did hvglue.lds come from? It turns out it is not in any Makefile, but rather is made by a perl script in sys/hv/mkgluesyms.pl, except that this perl script optionally writes assembler or linker script output and I had to modify it to select the right branch. The modified version is mkgluesymslinux.pl and is invoked like this:
perl ../../hv/mkgluesymslinux.pl ../../hv/include/hv/hypervisor.h >hvglue.lds
The hints for this come from the sys/bogux/Makefile which does something similar for the bogux example supervisor.
linux/arch/tile/include/hv/hypervisor.h is a near copy of sys/hv/include/hv/hypervisor.h, but they are not automatically kept in sync.
Somehow I think that adding hypervisor calls is not a frequently exercised path.
To recap, you need to:
- have the crazy idea to add hypervisor calls to change the chip voltage at runtime
- edit hypervisor.h to choose the next available hv call number
- edit glue.S to add, in just the right place, a macro call which will have the right offset in the file to match the hv call number
- edit syscall.h to create a similar number for the SWINT2 interrupt dispatch table
- edit intvec.S to add the new entry to the SWINT2 dispatch table
- create the subroutine to actually be called from the dispatch table
- run the magic perl script to transform hypervisor.h into an architecture dependent linker script to define the symbols for the new hv calls in the linux kernel
- add trampolines for the hv calls in the linux kernel so you can call them from a loadable module.
- write a kernel module to create sysfs files and turn reads and writes into calls on the new trampolines
- write blog entry about the above
How are non-engineers supposed to cope?
The Central Vac
Today the central vacuum system stuck ON.
The hose was not plugged in, and toggling the kick-plate outlet in the kitchen did not fix it. That accounted for all the external controls.
The way this works is there is a big cylinder in the basement with the dust collection bin and large fan motor to pull air from the outlets, through the bin, and outside the house. This is a great way to do vacuuming, because all the dusty air gets exhausted outside.
The control for the fan motor is low voltage that comes to two pins at each outlet. When you plug in the hose, the pins are extended through the hose by spiral wires that connect to a switch at the handle. You can also activate the fan by shorting the pins in the outlet with a coin. Each outlet has a cover held closed by a spring. You open the cover to insert the hose. The covers generally keep all the outlets sealed except the one with the hose plugged in.
The outlets are all piped together with 1 1/2 inch PVC pipe to the inlet of the central unit. The contact pins at all the outlets are connected in parallel, so shorting any of them turns on the motor.
We also have a kickplate outlet in the kitchen – turn it on and sweep stuff into it. The switch for that is activated by a lever that also uncovers the vacuum pipe.
I ran around the house to make sure nothing was shorting the terminals in the outlets.
Next, I went to the cellar to look at the central unit. Unplugging it made it stop (good!) but plugging it back in made it start again. That was not good.
I noticed that the control wires were connected to the unit via quick connects, so I unplugged them. The unit was still ON, which meant the fault was inside the central unit.
I stood on a chair and (eventually) figured out that the top comes off; it is like a cookie tin lid. Inside the top were the fan motor (hot!) and a small circuit board with a transformer, some diodes, and a black block with quick connect terminals. Both the AC power and the motor wires went to the block. I imagine that the transformer and the diodes produce low voltage DC for the control circuit, and the block is a relay activated by the low voltage.
Relays can stick ON if their contacts wear and get pitted, or there could be a short somewhere applying power to the relay coil.
I blew the dust off the circuit board, and gave the block a whack with a stick.
That fixed it.
I just don’t see what a non-engineer would do in this situation, except let the thing run until the thermal overload tripped in the fan motor (I hope it has one!) and call a service person. Even if the service folks know how to fix it without replacing the whole unit, it is going to cost $80 to $100 for a service call.
I don’t have any special home-vacuum-system powers, but I have a general idea how it works, and enough comfort with electricity that I don’t mind taking the covers off things. This time it worked out well.
The Dishwasher
For completeness, I should relate the story of our Kitchenaid dishwasher. One day something went wrong with the control panel, so I took it apart. It wasn’t working, and I thought I couldn’t make it much worse. I was wrong about that.
I didn’t really know the correct disassembly sequence, and I took off one too many screws. The door was open flat, and taking off the last screw let the control panel fall off, tearing a kapton flex PC board cable in two. The flex cable connected the panel to some other circuit board. I spent a couple of days carefully trying to splice the cable by scraping off the insulation and soldering jumpers to the exposed traces, but I couldn’t get the jumpers to stick. New parts would have cost about $300, and the dishwasher wasn’t that new. We eventually just bought a new Miele and that was the Right Thing To Do, because the Miele is like a zillion times better. It has built in hard-water softeners, and doesn’t etch the glasses, and doesn’t melt plastic in the lower tray, and is generally awesome.
So OK, sometimes you can fix it yourself, and sometimes you should really just call an expert. How are you supposed to know which is the case?
The Garage Door Opener
Every few years, the opener would stop working. It would whirr, but not open the door. The first time this happened, I took it apart. Now, you should be really careful around garage door openers, because there is quite a lot of energy stored in the springs, but if you don’t mess with the springs, the rest of it is just gears and motors and stuff.
On mine, the cover comes off without disconnecting anything. Inside there is a motor which turns a worm gear, which turns a regular gear, which turns a chain sprocket, which engages a chain, which carries a traveller, which attaches to the top of the door. The door is mostly counterbalanced by the springs. With the cover off, you could see that the (plastic) worm gear had worn away the plastic main gear, so the motor would just spin. The worm also drives a screw that carries along some contacts which set the full-open and full-closed travel, stopping and reversing the motor. The “travel” adjustments just move the fixed contacts so the moving contacts hit them earlier or later.
An internet search located gear kits for 1/3 or 1/4 the price of a new motor, and I was able to fix it.
Last time the opener stopped working, however, the symptoms were different – no whirring. The safety sensors appeared to be operational, because their pilot lights would blink when you blocked the light beam. I suspected the controller circuit board had failed. A replacement for that would be about 1/2 the cost of a new motor unit, and I wasn’t positive that was the trouble, so I just replaced the whole thing. The new one was nicely compatible with the old tracks, springs, and sensors.
A few weeks later, my neighbor’s opener failed in the whirring mode, so we swiped the gears from my old motor unit with the bad circuit board and fixed it for free.
Takeaways
Don’t be afraid to take things apart, at least if you have a reasonable expectation that you are not going to make it worse.
Or – Good judgement comes from experience, but experience comes from bad judgement. (Mulla Nasrudin)
… and just maybe, go ahead and get service contracts for complicated things with expensive repair parts, like that Macbook Pro or HE washing machine, particularly when the most-likely-to-fail part is electronic in nature.
So I usually get AppleCare, and we have a service contract for the new minivan and for the washing machine, but not for the clothes dryer, since it doesn’t appear to have any electronics inside. When the dryer did break, I was able to fix it by replacing the clock switch myself.
But how are non-engineers supposed to cope?
What I do
I used Splasho’s “Up-Goer Five Text Editor” to write what I do, using only the 1000 most common words in English.
In my work I tell computers what to do. I write orders for computers that tell them first to do this, and then to do that, and then to do this again.
Sometimes the orders tell the computer to listen for other orders from people. Then the orders tell the computer how to do what the people want, and then the orders tell the computer to show the people what the answer is.
I used to build computers. I would take one part, and another part, and many more parts, and put them together in just the right way so the computer would work right. Computers are all the same, they listen for an order, then do what it says, then listen for another order. We use them because they do this thing very very very very fast.
Equal Protection of the Law
I’ve been casting about for a way to follow up on my outrage at the government’s treatment of Aaron Swartz.
I wonder if the government’s conduct represents a violation of the equal protection clause of the constitution.
The 14th amendment says
…nor shall any State deprive any person of life, liberty, or property, without due process of law; nor deny to any person within its jurisdiction the equal protection of the laws.
Evidently this doesn’t apply to the federal government as written, but in Bolling v. Sharpe in 1954, the Supreme Court got to the same point via the Due Process clause of the 5th amendment.
I think all governments, state, federal, and local, are bound to provide equal protection.
In the Swartz case, we have the following mess:
- Congress writes vague laws
- Congress fails to update those laws as technology and society evolve
- Prosecutors use their discretion to decide who to charge
- Prosecutors use pre-trial plea bargaining to avoid the scrutiny of the courts
It would be nice to have a case before the Supreme Court, leading to a clear ruling that equal protection applies to the actions of prosecutors. I suspect that would also give us proportional responses to crimes, although I am not sure about that.
In the medium term, Congress needs to act. I’d suggest a law repealing all laws more than 20 years old. Sunset provisions need to be in all laws. The ones that make ongoing sense can be reauthorized, but it will take a new vote every time. (Maybe laws forbidding action by the government should be allowed to stand indefinitely, while laws forbidding action by the people will have limited terms.)
In the short term, we need action by the executive branch, to provide equal protection, control of pre-trial behavior of prosecutors, and accountability of both prosecutors and law enforcement.