I hate NFS

On our network we have about 100 client computers, most of which are running Fedora 11. We have two real servers running CentOS 5.4, which use DRBD to keep the virtual machine data in sync between them and Red Hat’s cluster tools to start and stop the virtual machines.

We have five virtual machines running on the two real machines, only one of which matters for this post: our fileserver.

Under our old configuration, /networld was mounted on one of the real servers, and then shared to our clients using NFS. Our virtual machine, fileserver, then mounted /networld over NFS and shared it using Samba for our few remaining Windows machines (obviously, a non-optimal solution).

[Image: old configuration]

There were a couple of drawbacks to this configuration:

  1. I had to turn a number of services on and off whenever the clustered storage service moved from storage-server01 to storage-server02.
  2. Samba refused to share an NFSv4-mounted /networld, and, when it was mounted using NFSv3, the locking daemon would crash at random intervals (I suspect a race condition, as it mainly happened when storage-server0x was under high load).

My solution was to pass the DRBD disks containing /networld to fileserver, and allow fileserver to share /networld using both NFS and Samba, which seemed a far less hacky solution.

[Image: current configuration]

I knew there would be a slight hit in performance, but since I’m using virtio to pass the hard drives to the virtual machine, I expected a maximum of 10-15% degradation.

Or not. I don’t have any hard numbers, but once we have a full class logging in, the system slows to a crawl. My guess would be that our Linux clients are running at a half to a third of the speed they had under our old configuration.

The load values on fileserver sit at about 1 during idle times and get pumped all the way up to 20-40 during breaks and computer lessons.
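One way to turn those load values into hard numbers would be to log them over a school day. The following is a minimal sketch in Python, assuming a Linux server that exposes /proc/loadavg. The function names and the idea of dividing by the number of virtual CPUs are illustrative assumptions, not part of the setup described here.

```python
def parse_loadavg(text):
    """Parse the contents of /proc/loadavg.

    The first three whitespace-separated fields are the 1-, 5- and
    15-minute load averages; the remaining fields are ignored here.
    """
    fields = text.split()
    if len(fields) < 3:
        raise ValueError("unexpected /proc/loadavg format: %r" % text)
    one, five, fifteen = (float(value) for value in fields[:3])
    return one, five, fifteen


def read_loadavg(path="/proc/loadavg"):
    """Read and parse the load averages from a Linux system."""
    with open(path) as handle:
        return parse_loadavg(handle.read())


def load_per_cpu(load, cpus):
    """Scale a load average by the CPU count (hypothetical helper).

    A value well above 1.0 suggests more waiting tasks than CPUs.
    """
    if cpus <= 0:
        raise ValueError("cpus must be positive")
    return load / cpus
```

Calling read_loadavg() once a minute from a small loop or a cron job and appending the result to a file would give a record to compare before and after a tuning change. Keep in mind that Linux load averages count tasks blocked in uninterruptible I/O waits as well as runnable ones, so a high value on a file server may reflect storage waits rather than CPU work.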

So now I’m stuck. I really don’t want to go back to the old configuration, but I can’t leave the system as slow as it is. I’ve done some NFS tuning based on miscellaneous sites found via Google, and tomorrow will be the big test, but, to be honest, I’m not real hopeful.

(To top it off, I spent three hours Friday after school tracking down this bug after updating fileserver to CentOS 5.4 from 5.3. I’m almost ready to switch fileserver over to Fedora.)

PXE and gPXE

[Image: boot menu]

So we’ve been using PXE booting on our network for the last couple of years and it has made life much easier. We use pxelinux and vesamenu.c32 to show a pretty boot menu (specific to the system’s IP address). School computers see only “Boot from hard drive” and “Administrative tools” (and choose the first option after five seconds). An unknown computer (or a new computer) will get the menu on the right. Administrative tools is password-locked so students/teachers can’t reinstall the operating system.
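For readers wondering how a menu can be specific to an IP address: pxelinux looks in its pxelinux.cfg/ directory for a config file named after the client, trying the hardware address first, then the IPv4 address written as eight uppercase hex digits, then that hex string with one digit dropped at a time, and finally a file called default. The sketch below, in Python, only computes that list of names. The function name is made up for illustration, the client UUID lookup that pxelinux may try first is left out, and the post does not say exactly which files this network uses.

```python
import ipaddress


def pxelinux_config_candidates(ip, mac=None):
    """List the config file names pxelinux tries, in order.

    ip  -- the client's IPv4 address as a dotted string
    mac -- optional Ethernet MAC address, e.g. "88:99:AA:BB:CC:DD"
    """
    names = []
    if mac is not None:
        # "01" is the ARP hardware type for Ethernet; the MAC follows
        # in lowercase hex with dashes as separators.
        names.append("01-" + mac.lower().replace(":", "-"))

    # The IPv4 address as eight uppercase hex digits.
    hex_ip = "%08X" % int(ipaddress.IPv4Address(ip))

    # Full address first, then ever shorter prefixes, so one file
    # can cover a whole block of addresses.
    for length in range(len(hex_ip), 0, -1):
        names.append(hex_ip[:length])

    names.append("default")
    return names
```

With this scheme, a file named after a short hex prefix covers a whole range of known machines, while anything that matches no prefix falls through to default, which is one plausible way to give unknown computers a different menu.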

This system is incredibly efficient when it comes to imaging a new system (I go to “Administrative tools”, type the password, and then choose “Image system” and walk away).

We have run into several problems, though. The main one was that several of our newer Intel motherboards have odd BIOS bugs that can occasionally be tripped by PXE booting. My favorite was the systems that completely froze when you chose “Boot from hard drive”. We bought twenty of them. My theory is that pxelinux is overwriting the RAM that contains the BIOS code for booting off of the hard drive. And then there were the computers that would hang when trying to access “Administrative tools”. The worst part is that I couldn’t replace the network boot ROM with something from Etherboot because I couldn’t find any tools that would allow me to modify Intel BIOSes this way.

It finally got to the point where half of our computers had PXE booting turned off by default, which made life fun when I needed to re-image one of those systems. Enter gPXE. gPXE is an open source network bootloader, and is the successor to Etherboot.

I was playing around with it one day and came across this page. I realized that I could cunningly have PXE load the gPXE boot ROM over the network, and then replace the manufacturer’s PXE code in memory with gPXE code, which seems to be far more robust.

So, now all of our systems can boot off of the network, and then boot from the hard drive without hanging. All of our systems can access “Administrative tools”. And we even get a few bonuses (data gets loaded over HTTP rather than TFTP).