GlusterFS Madness

Background

As mentioned in Btrfs on the server, we have been using btrfs as our primary filesystem for our servers for the last year and a half or so, and, for the most part, it’s been great. There have only been a few times that we’ve needed the snapshots that btrfs gives us for free, but when we did, we really needed them.

At the end of the last school year, we had a bit of a problem with the servers and came close to losing most of our shared data, despite using DRBD as a network mirror. In response to that, we set up a backup server which has the sole job of rsyncing the data from our primary servers nightly. The backup server is also using btrfs and doing nightly snapshots, so one of the major use-cases behind putting btrfs on our file servers has become redundant.
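For what it's worth, the backup job itself is nothing exotic. A minimal sketch of the idea looks something like this, assuming the backup lives on a btrfs subvolume; the hostnames, paths, and snapshot location are placeholders rather than our actual layout:

    #!/bin/sh
    # Nightly backup sketch: pull data from the primary servers, then take a
    # read-only btrfs snapshot so each night's state is kept. All hostnames
    # and paths here are placeholders.
    set -e
    rsync -aAXH --delete server1:/srv/users/  /backup/users/
    rsync -aAXH --delete server2:/srv/shared/ /backup/shared/
    btrfs subvolume snapshot -r /backup "/backup-snapshots/$(date +%F)"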

The one major problem we’ve had with our file servers is that, as the number of systems on the network has increased, our user data server hasn’t been able to keep up with the load. The configuration caching filesystem (CCFS) I wrote has helped, but even with CCFS, the server was regularly hitting a load average of 10 during breaks and occasionally climbing as high as 20.

Switching to GlusterFS

With all this in mind, I decided to do some experimenting with GlusterFS. While we may have had high load on our user data server, our local mirror and shared data servers both had consistently low loads, and I was hoping that GlusterFS would let me spread the load across the three servers.

The initial testing was very promising. When using GlusterFS over ext4 partitions with SSD journaling on just one server, the speed was only a bit below NFS over btrfs over DRBD, and, given the distributed nature of GlusterFS, I expected that adding more servers would scale the speed up roughly linearly.
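For reference, putting the ext4 journal on an SSD just means creating an external journal device and pointing the new filesystem at it; something along these lines, with placeholder device names:

    # Create a dedicated journal device on an SSD partition, then build the
    # data filesystem with its journal on that device (placeholder devices).
    mke2fs -O journal_dev /dev/sda5
    mkfs.ext4 -J device=/dev/sda5 /dev/sdb1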

So I went ahead and broke the DRBD mirroring for our eight 2TB drives and used the four secondary DRBD drives to set up a production GlusterFS volume. Our data was migrated over, and we used GlusterFS for a week without any problems. Last Friday, we declared the transition to GlusterFS a success, wiped the four remaining DRBD drives, and added them to the GlusterFS volume.
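The GlusterFS side of this was pretty straightforward; roughly the following, though the volume name, servers, and brick paths are placeholders rather than our real layout:

    # Create a plain distributed volume out of the four freed drives, one
    # brick per drive (names are illustrative), and start it.
    gluster volume create userdata \
        server1:/bricks/users-a server1:/bricks/users-b \
        server2:/bricks/users-a server2:/bricks/users-b
    gluster volume start userdata

    # A week later, the four wiped DRBD drives went in as extra bricks.
    gluster volume add-brick userdata \
        server1:/bricks/users-c server1:/bricks/users-d \
        server2:/bricks/users-c server2:/bricks/users-d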

I started the rebalance process for our GlusterFS volume Friday after school, and it continued to rebalance over the weekend and through Monday. On Monday night, one of the servers crashed. I went over to the school to power cycle the server, and, when it came back up, continued the rebalance.
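The rebalance itself is just a couple of gluster commands to kick it off and keep an eye on it (same placeholder volume name as above):

    # Spread the existing files onto the newly added bricks, then check on
    # the progress; this is what was running over the weekend.
    gluster volume rebalance userdata start
    gluster volume rebalance userdata status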

Disaster!

Tuesday morning, when I checked on the server, I realized that, as a result of the crash, the rebalance wasn’t working the way it should. Files were being removed from the original drives but not being moved to the new drives, so we were losing files all over the place.

After an emergency meeting with the principal (who used to be the school’s sysadmin before becoming principal), we decided to ditch GlusterFS and go back to NFS over ext4 over DRBD. We copied over the files from the GlusterFS partitions, and then filled in the gaps from our backup server. Twenty-four sleepless hours later, the user data was back up, and the shared data was up twenty-four sleepless hours after that.
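The recovery itself was mostly rsync: pull whatever had survived off each brick first, then fill in anything missing from the backup without clobbering what had already been recovered. A rough sketch, with placeholder paths and hostnames:

    # On a plain distributed volume the bricks hold ordinary files, so copy
    # the survivors off each brick, then fill the gaps from the backup
    # server without overwriting anything already recovered.
    for brick in /bricks/users-*; do
        rsync -aAX "$brick"/ /srv/users/
    done
    rsync -aAX --ignore-existing backupserver:/backup/users/ /srv/users/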

Lessons learned

  1. Keep good backups. Our backups allowed us to restore almost all of the files that the GlusterFS rebalance had deleted. The only files lost were the ones created on Monday.
  2. Be conservative about what you put into production. I’m really not good at this. I like to try new things and to experiment with new ideas. The problem is that I can sometimes put things into production without enough testing, and this is one result.
  3. Have a fallback plan. In this case, our fallback was to wipe the server and restore all the data from the backup. It didn’t quite come to that, as we were able to recover most of the data off of GlusterFS, but the plan was there if we had needed it.
  4. Avoid GlusterFS. Okay, maybe this isn’t what I should have learned, but I’ve already had one bad experience with GlusterFS a couple of years ago where its performance just wasn’t up to scratch. For software that’s supposedly at a 3.x.x release, it still seems very beta-quality.

The irony of this whole experience is that by switching the server filesystems from btrfs to ext4 with SSD journals, the load on our user data server has dropped to below 1.0. If I’d just made that switch, I could have avoided two days of downtime and a few sleepless nights.

Nuclear explosion credit – Licorne by Pierre J. Used under the CC-BY-NC 2.0 license.

Goodbye GDM (for the moment)

Our school system has been running Fedora on our desktops since early 2008. During that time, our login screen has been managed by GDM and our desktop session has been GNOME. It doesn’t look like our desktop session is going to change any time soon, as we transitioned to GNOME Shell in Fedora 13 and the students and teachers have overwhelmingly preferred it to GNOME 2.

At our school we have a couple of IT policies that affect our login sessions. All lab computers that aren’t logged in have some form of screensaver running (not a black screen), as it helps students identify at a glance which computers are on and which aren’t. It also helps IT see which computers need to be checked. Logged-in computers should never have a screensaver running, and screen locking is disabled, as we have far more users than computers. Some may argue that these policies should be amended, but, for the moment, they are what they are.

In older versions of Fedora, gnome-screensaver was set to run in GDM, with the floating Fedora bubbles coming on after a minute of inactivity. The screensaver was inhibited during the login session (I experimented with changing the gconf settings so it didn’t come on for 100 hours and other such nonsense, but inhibiting the screensaver was the only way I found that worked reliably over long periods of time).
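For the curious, the sort of thing I was fiddling with looked roughly like this; these are GNOME 2-era GConf keys, the values are only illustrative, and it was the inhibit approach that actually stuck:

    # The "don't come on for 100 hours" experiment (idle_delay is in minutes)...
    gconftool-2 --type int --set /apps/gnome-screensaver/idle_delay 6000
    # ...versus just holding an inhibitor open for the whole login session
    # (the inhibit lasts as long as the command keeps running).
    gnome-screensaver-command --inhibit --reason "logged-in session" &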

With Fedora 16 we now have a much more beautiful new version of GDM, but, unfortunately, the gnome-screensaver that comes with it no longer allows you to actually show a screensaver. I decided to try using xscreensaver instead, but it cannot run in GDM; it keeps complaining that something else is grabbing the keyboard, and I can only assume that something is GDM. Finally, I can’t even write a simple screensaver program in Python, as it seems I can’t run a full-screen app over the GDM screen.

Add to all that the fact that we have 1000+ students in the school who are able to log into any lab computer, and that GDM lists every user who has ever logged into the computer, which could theoretically be all 1000 of them. Urgh!

So for our Fedora 16 system, I’ve switched over to lxdm. A quick configuration change to tell it to boot gnome-shell as its default session (and some hacks so it doesn’t try to remember what language the last user used to log in) and it was set. Xscreensaver runs just fine over it and we now have some pretty pictures of Lebanon and the school in a carousel as our login screensaver.
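The lxdm change amounts to a one-line edit (the last-language hack isn’t shown here, and the exact key names may vary between lxdm versions):

    # /etc/lxdm/lxdm.conf
    [base]
    # Launch a GNOME (gnome-shell) session by default instead of LXDE.
    session=/usr/bin/gnome-session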

It looks like the screensaver functionality will get merged straight into gnome-shell, and, if it does, we may be able to have extensions that actually implement the screensaver. If that happens, and if GDM re-acquires the ability to not show the user list, we’ll switch back to GDM. Until then, we’ll stick with lxdm.

Now I just need to work out how to inhibit gnome-screensaver during login as gnome-screensaver --inhibit no longer works. I’m sure there was a good reason for removing that code, but for the life of me I can’t work out what it was…