Home Networking Troubleshooting

Sometimes a technological scramble is triggered by the most mundane events.  In this case, the season finale of “X Factor”.

Last night, there was a special church choir rehearsal for the Christmas Eve services, and all seven of Win’s and my kids went.  Since the rehearsal would overlap the broadcast finale of X Factor, Erica asked Win to record it.  Maybe the appearance of One Direction had something to do with it as well.

We used to have ReplayTVs to solve things like this, and cable TV to deliver the bits, but the conversion to digital TV and the crazy anti-customer behavior of Comcast have changed all that.  We don’t get cable, and the TV is hooked up to an antenna.  We’ve also got a SiliconDust HDHomeRun network tuner connected to the antenna on my front porch, so we can watch TV on any computer as well.  Win has the copy of EyeTV that came with the HDHomeRun, and he planned to record the show.

About an hour before air time, he called to ask me about video artifacts and bad audio.  I said I’d take a look.

I used hdhomerunner (a now-lost Google Code project to develop an open source HDHomeRun control program) and directed the video to VLC running on my MacBook Pro.  Indeed, the video was blocky and the audio spotty.

I power cycled the HDHomeRun, replaced the Ethernet cable, and plugged it into a different port on the 16-port gigE switch.  No change.  I looked for firmware upgrades, and found the device running 4-year-old firmware.  The upgrade went smoothly, but there was no change in video quality.

After sitting and swiveling back and forth for a while, I went back downstairs and plugged the device into the 100 Mbps switch instead of the 1000 Mbps switch.  I had some vague memory that the negotiation doesn’t always work right.  This fixed the problem and I was able to watch good video and audio with VLC.

Win called back to report his video was still breaking up.  This suggested some other networking problem between the houses.

Background.  Win and I are neighbors, and we have a conduit between the houses with a couple of outdoor-rated Cat 5 cables and a 6-strand multimode fiber cable.  One pair of fibers is connected to 1000BASE-SX media converters at the two ends and plugged into the house gigE switches.

I remembered once setting up netperf on the home servers, and indeed it was still installed.  Win’s house to mine reported 918 Mbps, but mine to Win’s reported 16 Mbps!  At this point, there wasn’t much time to debug the networking, and X Factor was about to start.
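For anyone without netperf handy, the idea of a one-direction throughput test can be sketched in a few lines of Python.  This is not how netperf itself is implemented; the function names (serve_once, send_for, loopback_demo) are made up for illustration, and the demo only runs over loopback.  To compare directions between two machines, one would run the receiver on one end and the sender on the other, then swap.

```python
import socket
import threading
import time

CHUNK = 64 * 1024  # bytes handed to each send/recv call


def serve_once(listener, result):
    """Accept one connection and count bytes until the sender closes."""
    conn, _addr = listener.accept()
    received = 0
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:
                break
            received += len(data)
    result["bytes"] = received


def send_for(host, port, seconds):
    """Send zero-filled chunks for roughly `seconds`; return sender-side Mbps.

    This measures how fast the sender can hand data to the kernel, so very
    short runs will overstate what the link actually carries.
    """
    payload = bytes(CHUNK)
    sent = 0
    with socket.create_connection((host, port)) as sock:
        start = time.monotonic()
        while time.monotonic() - start < seconds:
            sock.sendall(payload)
            sent += len(payload)
        elapsed = time.monotonic() - start
    return sent * 8 / elapsed / 1e6


def loopback_demo(seconds=0.5):
    """Run receiver and sender in one process over 127.0.0.1."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    result = {}
    receiver = threading.Thread(
        target=serve_once, args=(listener, result), daemon=True
    )
    receiver.start()
    mbps = send_for("127.0.0.1", port, seconds)
    receiver.join(timeout=5)
    listener.close()
    return mbps, result.get("bytes", 0)
```

Calling loopback_demo() returns the sender-side rate in Mbps and the number of bytes the receiver counted.  A large gap between the two directions across a real link is the kind of asymmetry netperf showed here.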

I remembered that VLC can record an input video stream, and set that up to record the program on my MacBook.  (I had 45 GB free on disk, and the program was running at 2 megabytes/second, so it would take about 14 GB for the two hours.  No doubt there is a way to transcode, but not enough time to learn how to do it!)
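The disk space estimate is simple arithmetic, sketched here in Python.  The function name recording_size_gb is just for illustration, and it assumes decimal units (1 GB = 1000 MB) and a constant stream rate.

```python
def recording_size_gb(rate_mb_per_s, hours):
    """Estimate recording size in decimal GB for a constant-rate stream."""
    seconds = hours * 3600
    return rate_mb_per_s * seconds / 1000


size = recording_size_gb(rate_mb_per_s=2, hours=2)
free_gb = 45
print(f"{size:.1f} GB needed, {free_gb} GB free")  # 14.4 GB needed, 45 GB free
```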

The VLC recording froze once, at about the one hour point, but I only missed a couple of minutes.  I copied the files to an external USB drive for sneakernet delivery.

This morning, Win and I started taking a look at the networking.  First, we got netperf running on our respective MacBook and iMac, in order to figure out whether the problem was the link or one of the home servers.  I was able to talk both ways to my server at about 600 Mbps, and Win to his at about 95 Mbps.  Win’s results are explained by a Fast Ethernet hop somewhere, but all these rates are way above the pitiful 16 Mbps across the fiber.

Next Win wiggled his connectors, dropping the path to about 6 Mbps.  We swapped the transmit and receive fibers at both ends, and the direction of the problem did not change.  It was looking more and more like a bad media converter.

I was staring at the wiring in my basement, wondering if we could use the copper link as backup while waiting for parts.  It never worked very well, but we did use it to cross-connect our DMZs before the firewalls at one point.  I found the cable, and found it plugged into the Ethernet switch on the back of my FiOS router – with LINK active!  Huh?  What was it plugged into at Win’s end?  He reported it plugged into a small switch, but that it wasn’t easy to tell what else was plugged in.

As an experiment, we unplugged the copper link and … Win lost Internet access.  Evidently (a) his routes were set to use the Serissa business FiOS rather than his home Comcast, and (b) the traffic was going over this moldy waterlogged Cat 5 instead of our supposedly shiny gigabit fiber.  Now the gears were turning.  If we did have a loop in the switch topology, then it was entirely possible that one direction between the houses would use the fiber while the other direction would use the copper.  I don’t know much about how these cheap switches figure out things like that.  We tried unplugging the fiber, forcing all traffic onto copper, but the netperf results were much worse.  ping seemed to work, and ping -s 1000 gave fairly good results, but ping -s 1500 had a lot of trouble.  That would explain why, generally, ping and ssh (mostly small packets) seemed to work but netperf (full-size packets) gave bad results.
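A size sweep like this can be scripted.  The sketch below is only an illustration: it assumes a ping that accepts -c (count) and -s (payload size in bytes), as the macOS and Linux versions do, and the helper names parse_loss, loss_at_size and sweep are made up.  The host name in the usage comment is hypothetical.

```python
import re
import subprocess

# Matches the summary line of macOS and Linux ping, e.g. "0.0% packet loss".
LOSS_RE = re.compile(r"([\d.]+)% packet loss")


def parse_loss(ping_output):
    """Return the packet loss percentage from ping's summary, or None."""
    match = LOSS_RE.search(ping_output)
    return float(match.group(1)) if match else None


def loss_at_size(host, size, count=20):
    """Send `count` pings with a `size`-byte payload and return the loss %."""
    proc = subprocess.run(
        ["ping", "-c", str(count), "-s", str(size), host],
        capture_output=True,
        text=True,
    )
    return parse_loss(proc.stdout)


def sweep(host, sizes=(56, 500, 1000, 1472, 1500)):
    """Map payload size to loss %.

    On a standard 1500-byte MTU, payloads above 1472 bytes get fragmented
    (20 bytes of IP header plus 8 bytes of ICMP header).
    """
    return {size: loss_at_size(host, size) for size in sizes}


# Hypothetical usage:
# print(sweep("wins-server.local"))
```

Loss that climbs with payload size points at a marginal link that gets small packets through but mangles large ones.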

We unplugged the copper and plugged the fiber back in, and after a few seconds, the asymmetrical performance resumed.  I’ve placed an order for another media converter, and we’ll see if that fixes it.  At least they now cost half as much as when we got the first pair!

So, there was a lot going on here.

The HDHomeRun was plugged into a gigabit switch, and working poorly.  Changing to Fast Ethernet fixed that.

The topology loop was routing off-site traffic over a poor copper link, but it was working well enough that we didn’t notice.

The media converter is probably bad, working well in one direction but not the other, and that probably explains the poor video quality at Win’s end.

And Erica gets to watch One Direction.

How are just plain folks supposed to figure this stuff out?

UPDATE

The new media converter arrived… and didn’t fix the problem.  Well, we have a spare now!  The actual problem was a bad 8-port switch in Win’s basement, which we belatedly figured out once we had ruled out the fiber.  We could have tested the link standalone by plugging computers into both ends, but we didn’t think of it.  Does gigE need crossover cables to do that?  Or does the magic echo cancellation make crossover cables unnecessary?