The case of the blocked web pages

LES Tyre

One of my fears when I set up the network in Tyre last year was that I would be called out for emergency repair trips. It’s an hour and a quarter each way on a good day, double that if you hit the traffic wrong. And, for those who don’t know Lebanese traffic, hitting it wrong often involves an unhealthy rise in blood pressure.

Anyhow, I had mentally prepared for, at worst, one callout a month. Twelve months later, not one single callout. No emergencies. No “we need you here now” phone calls. The few times there were problems, I’d talk Dave (their resident computer expert) through them over the phone or get him to set up a reverse ssh tunnel so I could fix them from here.
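For anyone curious, the reverse ssh tunnel trick is a one-liner. This is just a sketch; the hostnames and ports are placeholders, not our real setup:

```
# Dave runs this on the school server; it forwards port 2222 on my
# machine back to the school's sshd
ssh -N -R 2222:localhost:22 jonathan@myhost.example.com

# On my end, the school's sshd is now reachable through the tunnel
ssh -p 2222 root@localhost
```

The nice part is that it works even when the school is behind NAT, since the connection is initiated from their side.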

Last week, that twelve month streak was finally broken. It started off with a phone call.

“Jonathan, none of our computers can get on the web. I can ssh with no problems, IMAP and POP3 work fine, but web pages only load sporadically, if at all.”

I talked Dave through checking the school’s squid proxy and then checked what happened when they bypassed their proxy. Still nothing.
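Checking squid over the phone mostly means confirming the daemon is up, watching the access log, and sanity-checking the config; roughly this, assuming the Fedora/CentOS default paths:

```
service squid status                # is the daemon running at all?
tail -f /var/log/squid/access.log   # are requests arriving, and with what status?
squid -k parse                      # check squid.conf for syntax errors
```

In this case squid looked perfectly healthy, which is what pushed us to test with the proxy bypassed.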

“Ok, Dave, it’s obviously a problem with your ISP. Call them up and get them to fix it.”

The next day, Dave calls me again.

“The guy from the ISP was just here. He had no problems at all until he put his laptop behind the proxy. So he says it’s the proxy.”

Ok, that’s reasonable enough. Just to test, I have Dave bypass the proxy with his laptop (running Ubuntu), and, sure enough, the web works fine. For a couple of minutes. And then, again, nothing.

“Dave, if we’re bypassing the proxy, and you’re still not getting any web pages, it must be the ISP. Here’s what we’re going to do. We’re going to completely shut the proxy down and bypass it for everyone. That’s not going to fix the problem, but at least they can’t blame the proxy.”

The next day, I get a call again. “Jonathan, the technician came, and it’s definitely not them. He connected his laptop straight to the ISP using PPPoE, bypassing the router, and everything worked. He then went through the router, and, again, everything worked. He browsed for 15 minutes, with no problems at all. And here’s the crazy thing. All of the Macs and Windows machines are working fine. It’s only the Linux machines that aren’t working.”

Well, that sucks. The school runs Fedora on all of its desktops, the servers run CentOS, and Dave runs Ubuntu on his computer. And none of them can access the web.

At this point, I’m out of ideas, so I get in my car and head on down to Tyre. Of course, Dave has a meeting up here in Beirut, but he clears everything with the school secretary, and I’m given access to the router.

The first thing I do is plug my laptop into the network and start browsing the web. Five minutes later, when Google has still failed to load, I finally accept that, yes, there is actually a problem browsing the web.

My next step is to try swapping in another router. Even after setting the username, password, and MAC address, the new router just won’t connect. I remember what Dave said about the technician plugging straight into the Internet ethernet cable and making the connection using PPPoE. So I plug my laptop straight into the cable, set up PPPoE in NetworkManager (which is insanely easy), and, boom, I’m in, bypassing the router.
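In the NetworkManager GUI this is just the DSL tab in the connection editor. For the command-line inclined, a current nmcli can do the equivalent; the interface and credentials below are placeholders:

```
# Create a PPPoE connection bound to the wired interface
nmcli connection add type pppoe ifname eth0 con-name isp-dsl \
    pppoe.username someuser@isp pppoe.password secret

nmcli connection up isp-dsl
```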

I check my emails (using Evolution, connecting over IMAP). Looks great. I open Google. Not so great. I then test a Windows computer that’s sitting on the desk. Instant web access.

At this point, a bulb finally lights in my brain. Most of the ISPs in this country use transparent caching proxies, as bandwidth is expensive for them too. Could this have something to do with their ISP’s proxy?

I set up my computer to use our server in the States as a proxy. All of a sudden, my web access is working perfectly. It’s the ISP’s proxy. There’s obviously something wrong with how it’s parsing any requests that come from Linux computers.
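The quickest way to A/B test this from a terminal is to fetch the same page directly and then through the remote proxy; the US server’s hostname and port here are placeholders:

```
# Direct: the request passes through the ISP's transparent proxy
curl -sI http://www.google.com/ | head -n 1

# Via our US server, sidestepping the ISP's cache entirely
http_proxy=http://us-server.example.com:3128 \
    curl -sI http://www.google.com/ | head -n 1
```

Direct requests hang; proxied requests come back instantly. Hard to argue with that.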

I then realize that the Mac and Windows computers started working after we shut down the school’s proxy… which was running under Linux. Ouch.

When Dave returns from Beirut, we sit down and talk through the problem. The first step is for me to turn the school proxy back on and set it to use the US server as a parent proxy. Now all web traffic is routed through the US server, which may not be efficient, but at least it works. The next step is for the school to switch ISPs, and we’re still waiting on that process to finish.
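In squid, chaining to a parent proxy is only a couple of lines in squid.conf; the hostname and port are placeholders:

```
# Forward everything to the US server instead of going direct
cache_peer us-server.example.com parent 3128 0 no-query default
never_direct allow all
```

The never_direct rule is the important part: without it, squid will still try to fetch some requests itself, which puts us right back behind the ISP’s broken proxy.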

As for me, I’m still a bit shell-shocked. We live in 2010, and an ISP is using a transparent proxy solution that doesn’t work with Linux? My best guess is that we’re looking at some weirdness in how it’s parsing TCP packets… but how?

If anyone ever works out what the explanation is, I’d sure love to hear it.

Update (10/02/2010): A big thank you to all who offered suggestions in the comments. We went down to Tyre for a visit today, and while we were down there, I switched the school’s proxy back to a direct connection to the web so I could test some of the suggestions. Of course, the web started working correctly immediately. Obviously the ISP fixed whatever it was that they broke (which is good), but they haven’t explained what went wrong to the school (which isn’t so good).

Anyhow, if I come up against this again, I’ll at least have some things to try. Thanks again.

Better Building

As I mentioned in my last post, I’m setting up the computer system in our sister school in Ain Zhalta, up in the mountains, and last summer I set up the computer system in our sister school down in Tyre.

This includes both servers and workstations, and, being the lazy sysadmin that I am, I prefer not to reinvent the wheel for each place. My method last summer was to build rpms for most of the school-specific configuration settings, which allows me to make small changes and have them pulled in automatically.

The one problem I’ve hit is that there are some packages that have to be different between the two (and now three) schools. For example, the package lesbg-gdm-gconf contains the gconf settings so our login says “Welcome to the LES Loueizeh computer system”. Somehow, I don’t think Tyre or Ain Zhalta will appreciate having that showing on their welcome screen. Each school also has a different logo, and, again, the other schools don’t want our logo on their backgrounds.
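For the curious, a package like lesbg-gdm-gconf mostly just writes system-wide gconf defaults at install time. A hypothetical sketch of the %post scriptlet, assuming the stock GDM simple-greeter banner keys:

```
%post
# Write directly into the system-wide gconf defaults database
gconftool-2 --direct \
    --config-source xml:readwrite:/etc/gconf/gconf.xml.defaults \
    --type bool --set /apps/gdm/simple-greeter/banner_message_enable true
gconftool-2 --direct \
    --config-source xml:readwrite:/etc/gconf/gconf.xml.defaults \
    --type string --set /apps/gdm/simple-greeter/banner_message_text \
    "Welcome to the LES Loueizeh computer system"
```

Swap the string (and the background package’s logo file) and you have the per-school variants, which is exactly why these packages can’t be shared.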

So, what I really need is a way of organizing my rpms so that the common ones get passed to all the schools while the per-school ones only get passed to their school. Hmm. Think, think, what software is available in Fedora that could do that…

Enter koji. I had already set up a koji build system to help track down the disappearing deltarpms bug (yes, the bug is still there, but that’s for another day), and the hardest part was getting the SSL certs right.

I set up a koji instance on our dedicated server (now yum upgraded to Fedora 13, see this post for more details) by following these instructions, and now have a nice centralized build system for our schools at http://koji.lesbg.com.

The beauty of koji is that it handles inheritance. For Fedora 13, I’ve created one parent tag, dist-f13, and three child tags: dist-f13-lesbg, dist-f13-lest, and dist-f13-lesaz. All of the common packages are built to the dist-f13 tag, while the school-specific packages are built to their respective tags. Every night, I generate three repositories (lesbg, lest, and lesaz), and each repository has the correct rpms for that school. What could be easier than that?
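Setting that layout up is only a handful of koji commands; something along these lines, where the build tag name is illustrative (it needs the Fedora external repos attached):

```
# Parent tag plus three children that inherit from it
koji add-tag dist-f13
koji add-tag --parent dist-f13 dist-f13-lesbg
koji add-tag --parent dist-f13 dist-f13-lest
koji add-tag --parent dist-f13 dist-f13-lesaz

# One build target per school, sharing a common build tag, so that
# `koji build dist-f13-lest foo.src.rpm` lands in that school's tag
koji add-target dist-f13-lesbg dist-f13-build dist-f13-lesbg
koji add-target dist-f13-lest  dist-f13-build dist-f13-lest
koji add-target dist-f13-lesaz dist-f13-build dist-f13-lesaz
```

Because the children inherit from dist-f13, a repo generated from any child tag automatically picks up all the common packages plus only that school’s overrides.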

There are a few caveats, though. First, our dedicated server is slow. It’s an old Celeron with a whole 1GB of RAM (through HiVelocity), so I’ve had to compromise on a few little things. For one, we run both the x86_64 and i386 Fedora distributions, but our server is i386 only. This means that, at least for the moment, all of our packages have to be noarch.

Second, a normal part of the build process is to merge the upstream Fedora repositories with the local packages after each build (so the result can be used to build the next package). On our server, this takes almost two hours. So I’ve modified it so the build repository doesn’t include the local packages, and that mess is now gone. The downside is that I can’t have BuildRequires on any local packages, but, seeing as they’re all supposed to be configuration anyway, that hasn’t been a problem yet (and I don’t expect that it ever will be).

Anyhow, aside from some small glitches that seem to stem more from the slow hardware than from koji itself, koji has done the trick and done it nicely. With our current setup, I can now add another organization with a minimal amount of fuss, and that’s just what I was looking for! Thanks, koji devs!

Gears credit: Gears gears cogs bits n pieces by Elsie esq. Used under CC BY