Courgette vs. deltarpm comparison

I keep getting questions about whether deltarpm should use the courgette algorithm, so I thought I would put them to rest, at least for now:

firefox-3.5.4-1.fc12.i686.rpm – 15M
firefox-3.5.6-1.fc12.i686.rpm – 15M
firefox-3.5.4-1.fc12_3.5.6-1.fc12.i686.drpm (rpm-only deltarpm) – 434K
firefox-3.5.4-1.fc12_3.5.6-1.fc12.i686.courgette.bz2 (delta of rpm cpios) – 426K

Please note that this is not a reflection of how courgette would work if it could use its disassembly algorithm on Linux binaries. The problem is that the disassembly algorithm only works with Windows binaries right now. Until courgette is able to do its disassembly-foo on Linux binaries, there will be no real benefit to using courgette in deltarpm.

Method

For deltarpm:

$ makedeltarpm -r firefox-3.5.4-1.fc12.i686.rpm \
  firefox-3.5.6-1.fc12.i686.rpm \
  firefox-3.5.4-1.fc12_3.5.6-1.fc12.i686.drpm

For courgette:

$ rpm2cpio firefox-3.5.4-1.fc12.i686.rpm > \
  firefox-3.5.4-1.fc12.i686.cpio
$ rpm2cpio firefox-3.5.6-1.fc12.i686.rpm > \
  firefox-3.5.6-1.fc12.i686.cpio
$ courgette -gen firefox-3.5.4-1.fc12.i686.cpio \
  firefox-3.5.6-1.fc12.i686.cpio \
  firefox-3.5.4-1.fc12_3.5.6-1.fc12.i686.courgette
$ bzip2 firefox-3.5.4-1.fc12_3.5.6-1.fc12.i686.courgette

Note: I believe the 8K difference in file size is because the courgette delta doesn’t contain any of the rpm metadata.
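To put those sizes in perspective, here is a quick back-of-the-envelope calculation. It is only a sketch: the sizes are the rounded figures from the listing above, 15M is treated as 15 * 1024K (an approximation, since I didn't record exact byte counts), and pct is just a helper name made up for this example.

```shell
# Sizes in K as listed above; 15M is approximated as 15 * 1024K.
full=$((15 * 1024))
drpm=434
courgette=426

# Print what percentage of the full download a delta represents (one decimal place).
pct() { awk -v d="$1" -v f="$2" 'BEGIN { printf "%.1f\n", 100 * d / f }'; }

echo "deltarpm:   $(pct "$drpm" "$full")% of the full rpm"
echo "courgette:  $(pct "$courgette" "$full")% of the full rpm"
echo "difference: $((drpm - courgette))K"
```

With these rounded inputs both deltas come out at roughly 2.8% of the full download, so the 8K gap is lost in the noise.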

Deltarpm problems (Part I)

In my last post, I talked a bit about how deltarpm’s delta algorithm actually works, especially in relation to binary files.

Today, I want to look at one of the real-world problems that has a large effect on the size of some of our deltas.

Background

Our problem stems from how RPM deals with multiple arches. All binary files in an RPM are labeled with a “color”, that is, the architecture the RPM is built for. When two different arches of the same package are installed and binary files conflict, the binary files whose color matches the system color are the ones that are kept.
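If you want to see the colors for yourself, something like the following should work. Treat it as a rough sketch: the numeric values (0 for uncolored, 1 for 32-bit ELF, 2 for 64-bit ELF) are what I understand rpm to use, and color_name and list_colors are names invented for this example.

```shell
# Map an rpm file color value to a readable label.
# Assumed values: 0 = uncolored, 1 = 32-bit ELF, 2 = 64-bit ELF.
color_name() {
    case "$1" in
        0) echo "none" ;;
        1) echo "elf32" ;;
        2) echo "elf64" ;;
        *) echo "other" ;;
    esac
}

# List every file of an installed package together with its color.
list_colors() {
    rpm -q --qf '[%{FILECOLORS} %{FILENAMES}\n]' "$1" |
    while read -r color path; do
        printf '%-6s %s\n' "$(color_name "$color")" "$path"
    done
}

# Usage: list_colors samba-common.i686
```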

For example, let’s imagine that you install samba-common.x86_64 on an x86_64 Fedora installation. You then install samba-common.i686 because some program requires 32-bit libnetapi.so. This isn’t a problem because the 64-bit version of libnetapi.so is in /usr/lib64 while the 32-bit version is in /usr/lib.

But there’s no such thing as /usr/bin64, so what about the files installed into /usr/bin by samba-common? Because our system “color” is x86_64, the 32-bit /usr/bin executables from samba-common.i686 never actually get installed. If you run the file command on /usr/bin/pdbedit, it will tell you that it’s a 64-bit binary.
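The file command is the easy way to check this. As a sketch of the detail it relies on: every ELF file starts with the four magic bytes 0x7f 'E' 'L' 'F', followed by a class byte that is 1 for 32-bit and 2 for 64-bit. The elf_class helper below is a made-up name for illustration.

```shell
# Print 32 or 64 depending on the class byte (offset 4) of an ELF file.
elf_class() {
    # Read the first five bytes as unsigned decimals: 127 69 76 70 <class>
    set -- $(od -An -tu1 -N5 "$1")
    [ "$1 $2 $3 $4" = "127 69 76 70" ] || { echo "not-elf"; return 1; }
    case "$5" in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo "unknown"; return 1 ;;
    esac
}

# Usage: elf_class /usr/bin/pdbedit
```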

The Problem

While this is (generally) what you want on your system, it leaves us in a bit of a pickle for deltarpms. One of the requirements of a deltarpm is that it must build back perfectly into the original rpm.

Let’s imagine that we now want to upgrade to a newer version of samba-common.i686, where the only change is a minor typo fix in the documentation. (Why you would want to upgrade for that is beyond the scope of this article.)

At first glance, this is an excellent place for a deltarpm. A small change in the documentation should result in a very small deltarpm. But when we try to apply our deltarpm to the currently installed samba-common.i686, we run into a major problem:

The currently installed 64-bit /usr/bin/pdbedit doesn’t match the 32-bit /usr/bin/pdbedit we need for our new rpm.

Bam! The delta fails, and yum-presto proceeds to download the full 14MB samba-common.i686 package.

The Solution

We became aware of this problem a few years back when 32-bit packages were common in 64-bit installs (this was back in the day when OpenOffice.org was still 32-bit only). We ended up pushing a patch into deltarpm that forces all colored executables not installed into multilib directories to be included in the deltarpm. This way, we never use the locally installed executables and the whole color problem disappears.
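The rule can be sketched roughly as below. This is not the actual deltarpm patch, just an illustration of the decision it makes: must_include_in_delta is a hypothetical helper, and the list of multilib directories is my assumption rather than something lifted from the deltarpm source.

```shell
# Decide whether a file has to be shipped in full inside the deltarpm.
# Returns 0 (true) if it must be included, 1 if the installed copy can be reused.
# Hypothetical helper; the directory list below is an assumption.
must_include_in_delta() {
    path=$1
    color=$2
    # Uncolored files (docs, configs, scripts) are the same on every arch.
    [ "$color" -eq 0 ] && return 1
    case "$path" in
        # Multilib directories: each arch has its own copy, so no conflict.
        /lib/*|/lib64/*|/usr/lib/*|/usr/lib64/*) return 1 ;;
        # Anything else that is colored (e.g. /usr/bin) may have been
        # replaced by another arch's file, so always include it.
        *) return 0 ;;
    esac
}

# Usage: must_include_in_delta /usr/bin/pdbedit 1 && echo "include in delta"
```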

The Tradeoff

But this leaves us with another problem. We end up losing all delta savings on any binaries in /usr/bin. For samba-common, that’s 27MB (uncompressed) of binaries that we have to redownload every time we get an update.
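To get a feel for how much of a given package falls under this rule, you could total up the sizes of its colored files that live outside the multilib directories. This is a sketch under assumptions: undeltaed_bytes is a made-up name, and the set of multilib directories (/lib, /lib64, /usr/lib, /usr/lib64) is my guess at what counts.

```shell
# Sum the sizes of colored files outside multilib directories.
# Reads lines of the form "<color> <size-in-bytes> <path>" on stdin.
undeltaed_bytes() {
    awk '$1 != 0 && $3 !~ "^(/usr)?/lib(64)?/" { total += $2 }
         END { print total + 0 }'
}

# Usage:
#   rpm -q --qf '[%{FILECOLORS} %{FILESIZES} %{FILENAMES}\n]' samba-common |
#       undeltaed_bytes
```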

Which is exactly the problem that deltarpm is supposed to solve.

Broken bicycle credit: unwanted & undesired … just let it rot by notsogoodphotography on Flickr. Used under a Creative Commons Attribution 2.0 license.