Update: 12/30, 10 AM – problem appears fixed. Will call to find out what it was.
So on 12/22, Win noticed that email to Cathy wasn’t being delivered. She’s using an IMAP server here at Serissa Galactic HQ, and our mail gateway, hosted on a virtual machine at Rackspace, normally delivers her mail to the IMAP server.
By two days ago, we figured out that in fact we can’t establish TCP connections between the mail gateway and systems at Serissa that happen to use a particular one of our 5 static IP addresses. The others work fine.
This is just weird, but the VZ-supplied Actiontec MI424 router is, well, just weird . . . but the problem isn’t the router. After several hours of trying to configure various port forwarding and static NAT setups in the router, I called Verizon tech support. After about 2 hours of phone hell, I got through to a fellow who was, well, clueful. It turns out you can set up screen sharing with them, and jointly click around in the router configuration screens. The support rep eventually agreed with me that the problem existed, but at midnight December 23, there was not much help to be had from Actiontec. He suggested connecting our system upstream of the router with a switch, or using a different router if we had one.
I did not know that this is how Verizon FIOS static IP works, but it makes a lot of sense. There is an ethernet between the optical network terminal (ONT) and the Verizon-supplied router, but you don’t have to use their router. I unplugged the router and plugged in my macbook. I said to Win “OK, I’m on the Internet… wait. I am ON the internet!” I have actually never before been directly connected, not through a firewall, since Arpanet days. Cool.
We have five IP addresses, and with the macbook running tcpdump, it was easy to see what wasn’t happening. With the macbook configured with our .10 address, we would attempt to open a TCP connection to our cloud system, but never got any replies. Attempts to open connections from the cloud end never showed up. By running netstat on the cloud end, we could see connections in “SYN_RCVD” state, but not getting ESTABLISHED. Packets were going out, but not coming in. Incidentally, and strangely, ping, traceroute, and other ICMP stuff worked fine.
By changing the macbook IP address from .10 to .11 (another of our static IPs), it worked fine.
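The .10 versus .11 comparison can be repeated from a script, without tcpdump, by binding the outgoing socket to a chosen local address before connecting. What follows is only a minimal sketch: the function name `probe`, the addresses, and the port are illustrative assumptions (the addresses are from the documentation ranges, not our real ones), and the source address must actually be configured on the machine running it.

```python
import socket


def probe(src_ip, dst_host, dst_port, timeout=5.0):
    """Try one TCP handshake from src_ip and report how it ended."""
    try:
        # source_address binds the local end before the SYN is sent,
        # so the connection leaves from the chosen static IP.
        with socket.create_connection(
            (dst_host, dst_port), timeout=timeout, source_address=(src_ip, 0)
        ):
            return "established"
    except socket.timeout:
        # No SYN/ACK came back in time: the symptom described above.
        return "timeout"
    except ConnectionRefusedError:
        # An RST came back, so packets do flow in both directions.
        return "refused"
    except OSError as exc:
        return f"error: {exc}"


if __name__ == "__main__":
    # Hypothetical addresses; substitute your own static IPs and server.
    for src in ("192.0.2.10", "192.0.2.11"):
        print(src, probe(src, "mail.example.com", 25))
```

A "timeout" from one source address and "established" from another, against the same destination, is the pattern we saw.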
This was enough evidence to open a trouble ticket at Verizon. We were told that they would get back to us in 24 hours…NOT.
In the meantime, we changed our IMAP server to static NAT on a working IP address, and changed the port forwarding for inbound SMTP to match. Now future email could be delivered, but 400-odd messages were stuck in the queue. Win figured out how to add new Postfix rules to rerun the queue and translate the address, and we cleared the backlog.
Win also noticed that we can’t talk to www.dropbox.com, which may be hosted by Rackspace as well. The IP address is on a different class B, but it isn’t much different. Same symptom. We can’t talk to dropbox via our .10 address, but we can via .11 or other.
Christmas evening, after about 48 hours of silence from Verizon, I tried to get the trouble ticket status. This is quite difficult. There is evidently no online way to do it; you have to go through phone hell. After a few tries, I again got a skillful and helpful tech. He told me that the ticket was assigned to the network techs, but there were no comments indicating anyone was working on it. However, he searched around and found an outage report saying, roughly, that Massachusetts business FIOS static IP customers can’t talk to certain websites, and this outage report now had 75 trouble tickets linked to it. He said he couldn’t tell me about other customers, but did mention trouble contacting www.experian.com, so I tried it. We can’t talk to www.experian.com from .10 but we can from .11.
Our trouble ticket is now number 76, but there is no clue about who or when anyone might work on the problem. Evidently other folks are much worse off than us, with their credit card processing machines unable to talk to the processors.
I will call back tomorrow or Monday to see what is going on.
I find this fascinating, but now fairly stress-free since Cathy’s email has been delivered. What could cause reliable lossage of TCP connection setup, between stable, but seemingly random addresses? Works fine for ICMP, but fails every time for SYN packets. fios-10.serissa.com fails, but fios-11.serissa.com works. www.experian.com fails, but www.google.com works. Maybe a corrupted hash table somewhere? It seems like a very subtle and mysterious kind of thing.
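To make the hash table guess concrete, here is a toy model. It is purely hypothetical and says nothing about Verizon's or Alternet's actual equipment: it assumes some device picks one of 16 slots from a hash of source, destination, and protocol, and that one slot is broken. The slot count, the bad slot number, the hash function, and the host names are all made up for illustration.

```python
import hashlib

BUCKETS = 16    # hypothetical number of table slots (or parallel links)
BAD_BUCKET = 5  # hypothetical: the one slot that silently drops traffic


def bucket(src, dst, proto):
    """Map a flow to a slot; the same flow always lands in the same slot."""
    key = f"{src}|{dst}|{proto}".encode()
    return hashlib.sha256(key).digest()[0] % BUCKETS


def delivered(src, dst, proto="tcp"):
    """A flow gets through unless it hashes to the broken slot."""
    return bucket(src, dst, proto) != BAD_BUCKET


if __name__ == "__main__":
    # Print which (source, destination) pairs survive for TCP and ICMP.
    for src in ("fios-10", "fios-11"):
        for dst in ("site-a", "site-b", "site-c", "site-d"):
            print(src, dst, delivered(src, dst, "tcp"), delivered(src, dst, "icmp"))
```

In this model roughly one flow in 16 fails, the failures are perfectly repeatable, and because the protocol is part of the key, a pair can fail for TCP while ICMP between the same two hosts gets through. That is at least consistent with the symptoms, though it proves nothing.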
Oh. This blog is hosted by our cloud system, so I can’t talk to it via FIOS. I’ve changed my laptop’s default route to use Win’s Comcast DSL instead, which works fine. More proof that having a gigabit fiber between our houses is just a good idea.
One of the many problems with the Internet is that most people are at the mercy of their ISP. The ISP controls the last mile and you have no real alternative. Serissa happens to have both FIOS and Comcast links, but that isn’t as useful as you might think. Inbound traffic knows about one or the other, and failover is manual and tedious. I think we need an ASN so we can just let BGP deal with this, but that solution doesn’t scale well.
Update 12/26/2010 9 PM
We’ve found that our other IP addresses also don’t work … to different sets of sites. For example, .11 can’t reach www.patternreview.com.
I called Verizon at 888 244 8880 to report this and to find out ticket status. I was on hold for 35 minutes and reached a fairly clueless agent this time. He couldn’t get any information out of the network technician group, which probably means that no one is working on the problem. He was able to pull up the group outage report RIEH032H87.
I asked why I couldn’t get online status, and he says because my trouble ticket is linked to a group ticket, I can’t see status anymore. That seems unlikely.
I’ve created a #fios hashtag on Twitter, just for fun.
Update Monday 12/27/2010 11 PM
I called Verizon again to find out if there is any progress. Evidently the problem has been passed up from the network technicians to IP Engineering, and the NOC. This seems good. However, according to the rep I talked to, they are looking into a theory that traceroutes along affected paths are showing the trouble outside the Verizon network.
That doesn’t match what I see. As an example, from our .10 IP address, we cannot reach www.stewart.org (this blog). However, traceroute works. From our .11 IP address, we cannot reach www.patternreview.com (never mind), but traceroute works. From .10, patternreview works fine, and from .11, stewart.org works fine.
Here’s (part of) the trace for .11 to patternreview.com
Here’s part of the trace for .10 to www.stewart.org
4 so-7-2-0-0.BOS-BB-RTR2.verizon-gni.net (220.127.116.11) 9.101 ms 9.121 ms 9.028 ms
5 0.so-0-2-0.XL4.BOS4.ALTER.NET (18.104.22.168) 18.682 ms 18.757 ms 21.011 ms
6 0.xe-4-1-0.XL4.NYC4.ALTER.NET (22.214.171.124) 21.096 ms 19.589 ms 19.308 ms
The only common elements there are Verizon and the fact that both paths go into Alternet.
Both traceroutes work all the way to the destinations; it is just TCP SYN/ACK packets that don’t come back.
I’ve heard a theory that someone is blacklisting fios addresses. Until yesterday, we never used .11 for outbound connections, so I am skeptical.
In other news, we got about 14 inches of snow here. The kids are happy.
Update Tuesday 12/28/2010
Today’s wait on 888-244-8880 was 28 minutes. Verizon needs better music on hold.
The representative today said the problem affects 71.x.x.x addresses (true) because, when the 71 addresses were assigned to Verizon, website admins were notified to unblock them, but sometimes they didn’t.
This is a fairly pathetic claim. We’ve had the addresses for 5 years, they worked fine until a week ago, I control a machine I can’t talk to from one of my addresses, and ICMP traffic works fine, just not TCP.
It sounds like Verizon still has a theory about websites blacklisting Verizon addresses. I think it is much more likely that some fancy router in the broken paths has a bad memory module. My guess about which one is based on the rather small differences between traceroutes of working paths and non-working paths. All of the non-working paths I know about pass from Verizon to Alternet in New York, for example, before branching off into other networks. Try rebooting
6 0.ae1.BR2.NYC4.ALTER.NET (126.96.36.199) 26.290 ms 24.964 ms 24.691 ms
and see if that helps…
Update 12/29 at 11 PM
I called Verizon again. As expected, there was a 35 minute wait on hold, and the representative said “they are still working on it”. I asked for a supervisor and got very little more. There are now 120 tickets linked to the group outage (up from 75), but there have been no comments added to the log since 12/27. I suggested that this certainly gave me the impression Verizon didn’t take the problem very seriously.
17 thoughts on “Verizon FIOS Static IP routing outage”
Short of a routable ASN you could get a range of mobile IPv6 addresses with a reliable upstream home which would provide your ISP address as the in-care-of. When one ISP goes down, you update the in-care-of to the other ISP. Also, this would give you the ability to move individual hosts (per mobile IP) around anywhere you want on the internet. Of course, this does require IPv6 support from your ISP or sending all of your traffic to a tunnel broker.
Similar problems here. Not sure where you are physically located, I’m in eastern Massachusetts.
I have Fios business service with static IP. I first noticed the problem because a blog I have hosted at that address was receiving huge amounts of spam; it turns out the akismet spam filtering service was unreachable from the address (well, two of the akismet IPs are failing, two are working).
I also can’t get to wordpress.org (188.8.131.52), although I can get to wordpress.com just fine. Another example is american.redcross.org (184.108.40.206) , although redcross.org itself is okay.
I’m about to call Verizon now. I had resisted the urge to escalate, in hopes that it would go away on its own, but it’s been about 5 days now, and it hasn’t been resolved.
I don’t have another WAN IP to test, unfortunately, but I suspect I’d see the same thing you have been. My best guess is that someone (not Verizon unfortunately) probably has a borked routing table. It’s possible our bits are getting to the destination, but responses are being dropped on the way back, even. The worst-case scenario is that our IPs have been added to a blacklist deep inside some particular network and neither Verizon nor the broken destination sites have any knowledge or access to even fix it. At that point, I might just start begging Verizon for a new IP! blech.
Email me privately if you’d like, I’d be anxious to exchange notes, but I’ll spare you the bits, and me the privacy, and not post all my pings and traceroutes here.
Verizon’s outage report is RIEH032H87 so anyone having trouble should get verizon to link their trouble ticket to that outage report.
I doubt this is just a blacklist, because why would icmp continue to work?
Same problem – found your blog via a search for verizon fios static ma; but oddly the ticket number didn’t come up.
We are static fios in Eastern MA with 5 IPs and noticed Wednesday afternoon that some sites, including adobe.com, were down for browsing (although I could ping it). Realized we could browse some additional addresses via another IP as well. Heard from another customer with similar issues, so I stopped my attempt to switch permanently to another static IP. I’ve been able to use NAT for a few addresses to allow browsing; but I have to configure it for each host.
Also seems that blackberry phones aren’t retrieving our email, though we don’t have BES so I think there is a service of theirs that is manipulating Outlook Web Access and perhaps that service can’t reach my server.
As a quick test now I was able to browse experian but not patternreview.
We had some trouble in the last month where supposedly a host had a subnet blocked and we also found ourselves on the sorbs blacklist since it thought our range was dynamic hosts. Had to get an rdns entry to get excluded. Been a rough month for our Internet connection!
I’ve discovered tcpping, a connectivity tester for TCP rather than ICMP.
Stewart, thanks for the information.
Having similar issues since last week. Am in Eastern Mass as well on static IP’s. Spoke to Verizon today. They were aware of the issue. From what the tech said, they are having trouble getting it resolved. In fact they asked if I could contact the owners of some of the websites I could not access and ask them to contact their host provider/ISP and report the issue as well. Does not inspire a lot of confidence that this will be quickly fixed.
Will, I suspect you may be right about the blacklisting. I also have FiOS at home with dynamic IP. I compared tracert results and they seem to take the same hops. With the static IP’s it’s timing out at teliasonera-gw.customer.alter.net.
I have read on some other forums that some gamers are complaining about very high latency from FiOS when hopping through alter.net.
Thanks for the update. We’ve had the 71.* addresses a few years and never had an issue. We have seen the same problem passing through alter.net. What’s interesting is we have 5 static ip’s. I’ve never used one of them. Today went ahead and switched the ip of our router to the “new” ip and can now get to the sites we could not get to before.
I am a sysadmin for a hosting provider. We have several customers who use FIOS in Massachusetts. I can say for certain we are not blocking these 71. IP addresses. What I am seeing is consistent with what the original poster mentioned regarding the SYN/ACK packets. He said he wasn’t receiving them. Running a Wireshark trace at our edge router, what I am seeing is that we receive the SYN packet from the FIOS customer, my server sends the SYN/ACK packet, and then I never receive the final ACK packet; this makes sense if we assume that the FIOS client never receives the SYN/ACK packet.
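The server-side picture described here (SYN arrives, SYN/ACK goes out, the final ACK never comes) leaves the connection half-open, which is the same state the original post reports seeing in netstat. The following is a rough sketch of picking those entries out of netstat-style text. The function name `half_open_peers` and the sample lines are illustrative assumptions, with made-up addresses from the documentation ranges; it assumes the common six-column layout with the foreign address in the fifth column and the state last. Linux prints the state as SYN_RECV, while BSD and macOS print SYN_RCVD, so both are accepted.

```python
HALF_OPEN_STATES = {"SYN_RECV", "SYN_RCVD"}  # Linux and BSD/macOS spellings


def half_open_peers(netstat_text):
    """Return the foreign addresses of TCP connections stuck half-open."""
    peers = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, foreign, state.
        if (
            len(fields) >= 6
            and fields[0].startswith("tcp")
            and fields[-1] in HALF_OPEN_STATES
        ):
            peers.append(fields[4])
    return peers


# Made-up sample in the usual netstat layout, not captured output.
SAMPLE = """\
tcp        0      0 203.0.113.5:25      198.51.100.10:51234  SYN_RECV
tcp        0      0 203.0.113.5:25      198.51.100.11:40022  ESTABLISHED
tcp        0      0 0.0.0.0:25          0.0.0.0:*            LISTEN
"""

if __name__ == "__main__":
    print(half_open_peers(SAMPLE))
```

A peer that keeps reappearing in this list while other clients reach ESTABLISHED is a hint that the SYN/ACK is being lost on the way back.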
We are in a unique scenario in that we have carrier redundancy. We have Cox and Verizon. When we are on Cox, FIOS customers route over verizon-gni.net to level3.net to alter.net to us. Under this scenario, the problem exists and FIOS customers are able to hit some of our sites, but not others.
When we fail over to Verizon, FIOS customers route over verizon-gni.net to alter.net to us. Under this scenario there are no problems whatsoever.
We are concerned that when we fail back to Cox that trouble for our FIOS customers will resume.
If any FIOS customer is experiencing the issue and you are able to get a knowledgeable network engineer on the phone (one who understands what you mean when you explain that the SYN/ACK packet originating from the hosting provider is not being returned to you) and Verizon would like to engage a hosting provider in troubleshooting this problem, please contact me:
I don’t want to talk to a frontline tech. These guys will not understand the problem, and even if they do they will not have the tools to troubleshoot or fix the problem. But if you can get a high level network engineer who needs a hosting provider to assist in troubleshooting the issue, I’d love to help.
Thanks. Passed your information over to Verizon FiOS. Hopefully they will contact you. I don’t think they have a grip on the problem yet.
I’ve been dealing with this same problem (no inbound e-mail and unable to connect to certain websites) at work for more than a week and have been making daily calls to Verizon for status updates. Today’s call confirmed that it seems only IPs starting with 71. are affected. Armed with this info, Verizon is working with the backbone providers to try to resolve things.
If Verizon hasn’t made significant progress and doesn’t have a reasonable ETA when I call tomorrow, we’ll contemplate having them assign us a different static IP. I’m working with our network support vendor to figure out how much extra work that would require (reconfiguring VPN clients, firewall, etc.).
I’ve been having the exact same problem as everyone is reporting. I’m in the Boston area and there are certain websites that time out consistently for me: http://www.instapaper.com (this one is killing me!), http://www.pinestreetinn.org, http://tvbythenumbers.zap2it.com and so on. I just verified that experian.com does not work for me either.
This started a couple of days before Christmas and restarting my router hasn’t helped. I’ve been unable to contact Verizon support as they say they have had to drop everything to handle issues caused by the blizzard.
I’ll try turning my router off over night as someone else suggested, but I’m not holding out much hope.
I confirmed with Verizon today that their latest notes show the issue affecting all “low-70s” IP blocks, at least 70.* – 72.* (I’m 72.70.*, and having the routing issues).
Seems like it’s been ongoing for about 8 days, although they didn’t recognize it as a general problem until 5 days ago. Front-line tech support didn’t seem to have any faith that they had identified the root cause, and couldn’t provide any estimated time to fix.
As of 7:00 AM today this issue has been resolved for us. Inbound e-mail is flowing freely and I don’t have any trouble accessing web sites.
Mine is fixed too. I called billing and got a 9 day credit by referencing the outage ticket number RIEH032H87. Not much, but something.
It is working for me today too. I thought it was because I left my router off all night, but it sounds like it has been fixed for everybody. Phew!
By my calculation, the highest rate you would pay for FiOS Static Service as a standalone product without any term agreements would be:
Static Service of 150Mbps/35Mbps [Highest Speed] = $239.99
Static IP Block of 125 IPs [Maximum/ONT] = $190.00
Total Price for FiOS Static w/125 IPs = $429.99
Total Estimate for taxes @ 15% average = $64.49
Total Price with estimated taxes = $494.48
Total Number of websites in 2009 = 234000000
Total Number of website adds in 2009 = 47000000
Percentage of website adds in 2009 (relative to the prior-year total) = 25%
Number of expected website adds in 2010 = 58500000
Total Number of expected websites for 2010 = 292500000
Average Number of sites each user could not access = 3
Total Number of days of issue = 9
Days in December = 31
(Total price with estimated taxes)/(31 days)*(9 days of problem) = $143.56 = “Total Price for days out of service”
(Average number of sites not-accessible)/(Total number of expected websites for 2010) =
1.02564103 x 10 to -8 = “Total sites not accessible for whole web”
Total Credit due to 3 of 292500000 sites not working for 9 days is…
$143.56/1.02564103 x 10 to -8 =
Frustration for 9 days of not being able to access 3 sites without a backup internet provider to circumvent the problem =
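For anyone who wants to re-run the arithmetic in this comment, here is a small sketch using only the input figures the comment states. The function name `outage_figures` and the dictionary keys are made up for illustration; the tax figure is taken as quoted rather than recomputed, and the 25% growth is assumed to be measured against the prior-year total.

```python
def outage_figures():
    """Recompute the comment's intermediate figures from its stated inputs."""
    service = 239.99            # monthly service price, as quoted
    ip_block = 190.00           # 125-address static block, as quoted
    taxes = 64.49               # the comment's 15% estimate, as quoted
    total = service + ip_block + taxes

    days_in_month = 31
    days_out = 9
    prorated = total / days_in_month * days_out

    sites_2009 = 234_000_000
    adds_2009 = 47_000_000
    # Growth relative to the prior-year total (sites before the 2009 adds).
    growth = adds_2009 / (sites_2009 - adds_2009)
    expected_2010 = sites_2009 * 1.25  # the comment's rounded 25% growth

    unreachable = 3
    fraction = unreachable / expected_2010

    return {
        "total": round(total, 2),
        "prorated": round(prorated, 2),
        "growth": growth,
        "expected_2010": expected_2010,
        "fraction": fraction,
    }


if __name__ == "__main__":
    for name, value in outage_figures().items():
        print(name, value)
```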
Terms of Service for FiOS Static service:
Section 6 specifically.
This problem seems to have arisen again. Is anyone with Verizon experiencing trouble again? Specifically, if you are on Verizon, are you able to hit: