I hate virtual machines (was I hate NFS)

(Please note that you’ll probably want to read the previous post before this one.)

So, I set up a new virtual machine running Fedora rather than CentOS 5.4 and migrated the services over to it. We did see an improvement, but just not enough. I went into the computer room during break, and several students had gray screens for Firefox and OpenOffice.org.

So I’ve switched us back over to the original configuration (running NFS off of the real servers). I have to admit that I’m quite curious as to what the load will be tomorrow when everyone logs in.

Thanks to those who commented on my last post. The general consensus seems to be that this just isn’t the best area to use a virtual machine.

EDIT:

We’ve been running the new system for a few days now and it’s much more responsive. Logins never take longer than 30 seconds, and none of the students are getting gray windows. Load during breaks now ranges from 7 to 20. I’d still love to see a much lower load, but at least we’re back to a reasonably fast system.

PXE and gPXE

Boot menu

So we’ve been using PXE booting on our network for the last couple of years, and it has made life much easier. We use pxelinux and vesamenu.c32 to show a pretty boot menu that is specific to the system’s IP address. School computers see only “Boot from hard drive” and “Administrative tools” (and the first option is chosen automatically after five seconds). An unknown or new computer gets the menu on the right. Administrative tools is password-locked so students and teachers can’t reinstall the operating system.
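For anyone wondering how a menu can be specific to an IP address: pxelinux looks in its pxelinux.cfg directory for a config file named after the client, trying the most specific name first and falling back to "default". The sketch below lists the names it would try for a given client. The function name and the example address are mine, not something from our setup, and I've left out the UUID-based lookup that pxelinux tries before anything else.

```python
import ipaddress


def pxelinux_config_candidates(ip, mac=None):
    """Return the pxelinux.cfg file names tried for a client, in order.

    Illustrative helper (hypothetical name); the UUID lookup that
    pxelinux attempts first is omitted.
    """
    candidates = []
    if mac is not None:
        # Ethernet clients: ARP type "01", then the MAC in lowercase
        # hex with dashes instead of colons.
        candidates.append("01-" + mac.lower().replace(":", "-"))

    # The IPv4 address as eight uppercase hex digits.
    hex_ip = "%08X" % int(ipaddress.IPv4Address(ip))

    # pxelinux drops one trailing hex digit at a time, so a shorter
    # name can cover a whole block of addresses.
    for length in range(8, 0, -1):
        candidates.append(hex_ip[:length])

    # Everything else (unknown or new computers) ends up here.
    candidates.append("default")
    return candidates


if __name__ == "__main__":
    for name in pxelinux_config_candidates("192.168.1.10"):
        print(name)
```

In a layout like this, the restricted menu for known machines could live in a file named for their address prefix, and the full menu for unknown machines in "default".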

This system is incredibly efficient when it comes to imaging a new system (I go to “Administrative tools”, type the password, and then choose “Image system” and walk away).

We have run into several problems, though. The main one was that several of our newer Intel motherboards have odd BIOS bugs that can occasionally be tripped by PXE booting. My favorites were the systems that completely froze when you chose “Boot from hard drive”. We bought twenty of them. My theory is that pxelinux is overwriting the RAM that contains the BIOS code for booting off of the hard drive. And then there were the computers that would hang when trying to access “Administrative tools”. The worst part is that I couldn’t replace the network boot ROM with something from Etherboot because I couldn’t find any tools that would allow me to modify Intel BIOSes this way.

It finally got to the point where half of our computers had PXE booting turned off by default, which made life fun when I needed to re-image one of those systems. Enter gPXE. gPXE is an open-source network bootloader, and is the successor to Etherboot.

I was playing around with it one day and came across this page. I realized that I could cunningly have PXE load the gPXE boot ROM over the network, and then replace the manufacturer’s PXE code in memory with gPXE code, which seems to be far more robust.
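One detail a setup like this generally has to handle: once gPXE is running, it asks DHCP for a boot file again, so the server has to answer differently the second time or the machine just keeps loading gPXE forever. gPXE identifies itself with the DHCP user class "gPXE", which is what the decision hinges on. Here is a rough sketch of that logic in Python. The function name, the image name, and the URL are all placeholders for illustration, not our actual configuration, and in practice the rule would live in the DHCP server's config rather than in a script.

```python
def boot_filename(user_class,
                  gpxe_image="undionly.kpxe",
                  next_stage="http://boot.example.com/pxelinux.0"):
    """Pick the boot file to hand to a DHCP client.

    Hypothetical helper: gpxe_image and next_stage are assumed names,
    not values taken from a real network.
    """
    if user_class == "gPXE":
        # Second request: gPXE is already running, so send it on to
        # the real boot menu (which it can fetch over HTTP).
        return next_stage
    # First request: the manufacturer's PXE ROM is asking, so give it
    # the gPXE image to load over TFTP.
    return gpxe_image


if __name__ == "__main__":
    print(boot_filename(None))    # what the BIOS PXE ROM is told to load
    print(boot_filename("gPXE"))  # what gPXE is told to load next
```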

So, now all of our systems can boot off of the network, and then boot from the hard drive without hanging. All of our systems can access Administrative tools. And we even get a few bonuses (data gets loaded over HTTP rather than TFTP).