The case of the blocked web pages

LES Tyre

One of my fears when I set up the network in Tyre last year was that I would be called out for emergency repair trips. It’s an hour and a quarter each way on a good day, double that if you hit the traffic wrong. And, for those who don’t know Lebanese traffic, hitting it wrong often involves an unhealthy rise in blood pressure.

Anyhow, I had mentally prepared for, at worst, one callout a month. Twelve months later, not one single callout. No emergencies. No “we need you here now” phone calls. The few times there were problems, I’d talk Dave (their resident computer expert) through them over the phone or get him to set up a reverse ssh tunnel so I could fix them from here.

Last week, that twelve-month streak was finally broken. It started off with a phone call.

“Jonathan, none of our computers can get on the web. I can ssh with no problems, IMAP and POP3 work fine, but web pages only load sporadically, if at all.”

I talked Dave through checking the school’s squid proxy and then checked what happened when they bypassed their proxy. Still nothing.

“Ok, Dave, it’s obviously a problem with your ISP. Call them up and get them to fix it.”

The next day, Dave calls me again.

“The guy from the ISP was just here. He had no problems at all until he put his laptop behind the proxy. So he says it’s the proxy.”

Ok, that’s reasonable enough. Just to test, I have Dave bypass the proxy with his laptop (running Ubuntu), and, sure enough, the web works fine. For a couple of minutes. And then, again, nothing.

“Dave, if we’re bypassing the proxy, and you’re still not getting any web pages, it must be the ISP. Here’s what we’re going to do. We’re going to completely shut the proxy down and bypass it for everyone. That’s not going to fix the problem, but at least they can’t blame the proxy.”

The next day, I get a call again. “Jonathan, the technician came, and it’s definitely not them. He connected his laptop straight to the ISP using PPPoE, bypassing the router, and everything worked. He then went through the router, and, again, everything worked. He browsed for 15 minutes, with no problems at all. And here’s the crazy thing. All of the Macs and Windows machines are working fine. It’s only the Linux machines that aren’t working.”

Well, that sucks. The school runs Fedora on all of its desktops, the servers run CentOS, and Dave runs Ubuntu on his computer. And none of them can access the web.

At this point, I’m out of ideas, so I get in my car and head on down to Tyre. Of course, Dave has a meeting up here in Beirut, but he clears everything with the school secretary, and I’m given access to the router.

The first thing I do is plug my laptop into the network and start browsing the web. Five minutes later, when Google has still failed to load, I finally accept that, yes, there is actually a problem browsing the web.

My next step is to try swapping in another router. Even after setting the username, password, and MAC address, the new router just won’t connect. I remember what Dave said about the technician plugging straight into the Internet ethernet cable and making the connection using PPPoE. So I plug my laptop straight into the cable, set up PPPoE in NetworkManager (which is insanely easy), and, boom, I’m in, bypassing the router.

I check my emails (using Evolution, connecting over IMAP). Looks great. I open Google. Not so great. I then test a Windows computer that’s sitting on the desk. Instant web access.

At this point, a bulb finally lights in my brain. Most of the ISPs in this country use transparent caching proxies, as bandwidth is expensive for them too. Could this have to do with their ISP’s proxy?

I set up my computer to use our server in the States as a proxy. All of a sudden, my web access is working perfectly. It’s the ISP’s proxy. There’s obviously something wrong with how it’s parsing any requests that come from Linux computers.
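For anyone who wants to repeat that comparison, here is a minimal sketch of the test in Python: fetch a page directly, fetch it again through an explicit upstream proxy, and compare. The proxy address and test URL are placeholders I've made up for illustration, not the actual servers involved, and the diagnosis strings are just my reading of the results.

```python
import http.client
import urllib.error
import urllib.request


def can_fetch(url, proxy=None, timeout=10):
    """Return True if url loads, optionally through an explicit HTTP proxy."""
    # An empty dict makes ProxyHandler ignore any system-wide proxy settings,
    # so the "direct" test really is direct (apart from transparent proxies).
    proxies = {"http": proxy} if proxy else {}
    opener = urllib.request.build_opener(urllib.request.ProxyHandler(proxies))
    try:
        with opener.open(url, timeout=timeout) as response:
            response.read(1024)
            return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False


def diagnose(direct_ok, proxied_ok):
    """Turn the two test results into a rough diagnosis."""
    if direct_ok:
        return "direct web access works"
    if proxied_ok:
        return "something on the direct path is breaking HTTP"
    return "no web access either way"


if __name__ == "__main__":
    test_url = "http://www.example.com/"          # placeholder test page
    upstream = "http://proxy.example.com:3128"    # hypothetical remote proxy
    print(diagnose(can_fetch(test_url), can_fetch(test_url, proxy=upstream)))
```

One caveat: a transparent proxy usually intercepts traffic on port 80, so the remote proxy needs to listen on some other port for this test to mean anything.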

I then realize that the Mac and Windows computers started working after we shut down the school’s proxy… which was running under Linux. Ouch.

When Dave returns from Beirut, we sit down and talk through the problem. The first step is for me to turn the school proxy back on, and set it to use the US server as a parent proxy. Now, all web traffic is getting routed through the US server, which may not be efficient, but at least works. The next step is for the school to switch ISPs, and we’re still waiting on that process to finish.
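For the curious, chaining squid to a parent proxy comes down to a couple of directives in squid.conf. This is a rough sketch rather than the school's actual configuration: the hostname and port are placeholders, and I've wrapped the lines in a small Python helper only so they're easy to generate and check.

```python
def parent_proxy_directives(host, port):
    """Return squid.conf lines that send every request via a parent proxy.

    host and port are placeholders for whatever upstream proxy is available.
    """
    return [
        # Declare the upstream proxy as a parent; 0 disables the ICP port,
        # and no-query tells squid not to send it ICP queries.
        f"cache_peer {host} parent {port} 0 no-query default",
        # Never fetch directly from origin servers; always use the parent.
        "never_direct allow all",
    ]


if __name__ == "__main__":
    print("\n".join(parent_proxy_directives("proxy.example.com", 3128)))
```

Squid needs a reload or restart to pick up the change.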

As for me, I’m still a bit shell shocked. We live in 2010 and an ISP is using a transparent proxy solution that doesn’t work with Linux? My best guess is that we’re looking at some weirdness in how it’s parsing TCP packets… but how?

If anyone ever works out what the explanation is, I’d sure love to hear it.

Update (10/02/2010): A big thank you to all who offered suggestions in the comments. We went down to Tyre for a visit today, and while we were down there, I switched the school’s proxy back to a direct connection to the web so I could test some of the suggestions. Of course, the web started working correctly immediately. Obviously the ISP fixed whatever it was that they broke (which is good), but they haven’t explained what went wrong to the school (which isn’t so good).

Anyhow, if I come up against this again, I’ll at least have some things to try. Thanks again.


Tuesday, Sep 28, 2010

perhaps this?: TCP window scaling and broken routers

Tuesday, Sep 28, 2010

Might it be that the ISP’s proxy is discriminating based on the HTTP User-Agent string that the browsers and/or the squid proxy are sending?

Tuesday, Sep 28, 2010

Reduce the proxy network interface MTU by 60-90 bytes.

Jonathan Dieter
Tuesday, Sep 28, 2010

I did wonder that and installed the User-Agent Switcher on Firefox. It didn’t fix anything. And then I realized that Windows computers running through the school’s Linux proxy were also blocked, so it is probably a problem much lower down the stack.

Jonathan Dieter
Tuesday, Sep 28, 2010

I’ll check it out when we’re down in Tyre again for a visit (should be within the next week or so). However, given that TCP window scaling is enabled on Macs and the Macs were having no problems connecting, I’m not convinced this is the problem. I will try shutting it off, though, and see what happens. Thanks for the pointer.
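For reference, here is a small sketch of how I'd check the current setting on a Linux box. It assumes the standard /proc path for the sysctl; the function name is just something I made up for the example.

```python
from pathlib import Path

# Standard Linux location of the net.ipv4.tcp_window_scaling sysctl.
WINDOW_SCALING = Path("/proc/sys/net/ipv4/tcp_window_scaling")


def window_scaling_enabled(path=WINDOW_SCALING):
    """Return True if the kernel reports TCP window scaling as enabled."""
    return Path(path).read_text().strip() == "1"


if __name__ == "__main__":
    state = "enabled" if window_scaling_enabled() else "disabled"
    print(f"TCP window scaling is {state}")
```

Turning it off for a test means writing 0 to the same file as root, or running sysctl -w net.ipv4.tcp_window_scaling=0.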

Jonathan Dieter
Tuesday, Sep 28, 2010

I’ll go ahead and give that a try when I’m down in Tyre again for a visit (should be within a week).
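If I have the arithmetic right, this is roughly what the MTU suggestion works out to. The sketch assumes the proxy's interface starts at the standard Ethernet MTU of 1500 and that we're talking about plain IPv4 and TCP headers with no options; I haven't checked what the school's interfaces are actually set to.

```python
ETHERNET_MTU = 1500   # assumed starting MTU on the proxy's interface
PPPOE_OVERHEAD = 8    # bytes PPPoE adds, so the PPPoE link itself tops out at 1492
IPV4_HEADER = 20      # bytes, without IP options
TCP_HEADER = 20       # bytes, without TCP options


def mss_for_mtu(mtu):
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


if __name__ == "__main__":
    print("PPPoE link MTU:", ETHERNET_MTU - PPPOE_OVERHEAD)
    for reduction in (0, 60, 90):
        mtu = ETHERNET_MTU - reduction
        print(f"MTU {mtu} -> MSS {mss_for_mtu(mtu)}")
```

So the suggested range puts the interface MTU somewhere between 1410 and 1440, comfortably under the 1492 that the PPPoE link can carry.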

Tuesday, Sep 28, 2010

I had an issue running a Windows program in WINE that kept failing on TCP connections. I tried different MTU sizes with no result. I ran across this article which instructs how to optimize TCP by increasing the buffer size:

Modified the sysctl.conf and the problem disappeared.

I never really gave much thought to the differences between Windows and Linux TCP handling, but it seems they are very different animals.

Jonathan Dieter
Wednesday, Sep 29, 2010

Great link! We’re planning on doing a visit to Tyre this weekend, so I’ll be running their system through a bunch of permutations of settings. Thanks for the pointer.