Deltarpm problems (Part I)

In my last post, I talked a bit about how deltarpm’s delta algorithm actually works, especially in relation to binary files.

Today, I want to look at one of the real-world problems that has a large effect on the size of some of our deltas.

Background

Our problem stems from how RPM deals with multiple arches. All binary files in an RPM are labeled with a “color”, that is, the architecture the RPM is built for. When two different arches of the same package are installed and binary files conflict, the binary files whose color matches the system color are the ones that are kept.

For example, let’s imagine that you install samba-common.x86_64 on an x86_64 Fedora installation. You then install samba-common.i686 because some program requires 32-bit libnetapi.so. This isn’t a problem because the 64-bit version of libnetapi.so is in /usr/lib64 while the 32-bit version is in /usr/lib.

But there’s no such thing as /usr/bin64, so what about the files installed into /usr/bin by samba-common? Because our system “color” is x86_64, the 32-bit /usr/bin executables from samba-common.i686 never actually get installed. If you run the file command on /usr/bin/pdbedit, it will tell you that it’s a 64-bit binary.
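To make the rule concrete, here is a toy model of that decision in Python. It is only a sketch of the behaviour described above, not RPM's actual code: the `PackagedFile` type, the `file_kept_on_disk` function and the `"elf32"`/`"elf64"` labels are all names I made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackagedFile:
    package: str  # owning package, e.g. "samba-common.i686"
    path: str     # install path
    color: str    # "elf32", "elf64", or "none" (labels invented for this sketch)


def file_kept_on_disk(first, second, system_color):
    """Return whichever of two files sharing a path survives installation."""
    if first.path != second.path:
        raise ValueError("these files do not conflict")
    for candidate in (first, second):
        if candidate.color == system_color:
            return candidate
    # Neither color matches the system: this toy model just keeps the later one.
    return second


pdbedit_64 = PackagedFile("samba-common.x86_64", "/usr/bin/pdbedit", "elf64")
pdbedit_32 = PackagedFile("samba-common.i686", "/usr/bin/pdbedit", "elf32")

kept = file_kept_on_disk(pdbedit_64, pdbedit_32, system_color="elf64")
print(kept.package)  # samba-common.x86_64
```

The order in which the two packages are installed doesn't matter here; the file whose color matches the system wins either way.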

The Problem

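While this is (generally) what you want on your system, it leaves us in a bit of a pickle for deltarpms. One of the requirements of a deltarpm is that, when it is combined with the files already installed on your system, it must rebuild an rpm that is bit-for-bit identical to the one you would otherwise have downloaded.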

Now let’s imagine that we want to upgrade to a newer version of samba-common.i686, where the only change is a minor typo fix in the documentation. (Why you would want to upgrade for that is beyond the scope of this article.)

At first glance, this is an excellent place for a deltarpm. A small change in the documentation should result in a very small deltarpm. But when we try to apply our deltarpm to the currently installed samba-common.i686, we run into a major problem:

The currently installed 64-bit /usr/bin/pdbedit doesn’t match the 32-bit /usr/bin/pdbedit we need for our new rpm.

Bam! The delta fails, and yum-presto proceeds to download the full 14MB samba-common.i686 package.
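Here is a rough Python sketch of why the rebuild can't succeed. It is not how deltarpm itself is implemented: the `delta_applies` helper, the use of SHA-256 and the fake file contents are all assumptions made for illustration. The only point is that the bytes on disk are not the bytes the delta was built against.

```python
import hashlib


def digest(data):
    return hashlib.sha256(data).hexdigest()


def delta_applies(installed, reused_paths, expected_digest):
    """True if the locally installed files reproduce the data the delta expects."""
    rebuilt = b"".join(installed[path] for path in reused_paths)
    return digest(rebuilt) == expected_digest


# Made-up stand-ins for the real file contents.
pdbedit_32 = b"\x7fELF\x01 32-bit pdbedit"
pdbedit_64 = b"\x7fELF\x02 64-bit pdbedit"

# The delta was generated against the 32-bit package contents...
expected = digest(pdbedit_32)
# ...but the system color means the 64-bit binary is what is on disk.
on_disk = {"/usr/bin/pdbedit": pdbedit_64}

if not delta_applies(on_disk, ["/usr/bin/pdbedit"], expected):
    print("delta failed, falling back to the full rpm")
```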

The Solution

We became aware of this problem a few years back when 32-bit packages were common in 64-bit installs (this was back in the day when OpenOffice.org was still 32-bit only). We ended up pushing a patch into deltarpm that forces all colored executables not installed into multilib directories to be included in the deltarpm. This way, we never use the locally installed executables and the whole color problem disappears.
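The rule the patch applies can be sketched roughly like this. This is a Python illustration rather than the patch itself: the `MULTILIB_DIRS` list, the color labels and the `must_ship_in_full` name are my own assumptions for the example.

```python
import posixpath

# Assumed set of multilib directories, for illustration only.
MULTILIB_DIRS = ("/lib", "/lib64", "/usr/lib", "/usr/lib64")


def must_ship_in_full(path, color):
    """True if the file has to be carried inside the deltarpm itself."""
    if color == "none":
        # Uncolored files (docs, config, scripts) are not affected.
        return False
    directory = posixpath.dirname(path)
    in_multilib = any(
        directory == d or directory.startswith(d + "/") for d in MULTILIB_DIRS
    )
    # Colored files outside multilib directories may have been replaced by
    # the other arch's copy, so the installed version can't be trusted.
    return not in_multilib


for path, color in [
    ("/usr/bin/pdbedit", "elf32"),
    ("/usr/lib/libnetapi.so", "elf32"),
    ("/usr/share/doc/samba-common/README", "none"),
]:
    print(path, must_ship_in_full(path, color))
```

With these inputs, only /usr/bin/pdbedit is flagged; the library and the documentation can still be rebuilt from what is already installed.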

The Tradeoff

But this leaves us with another problem. We end up losing all delta savings on any binaries in /usr/bin. For samba-common, that’s 27MB (uncompressed) of binaries that we have to redownload every time we get an update.

Which is exactly the problem that deltarpm is supposed to solve.

Broken bicycle credit: unwanted & undesired … just let it rot by notsogoodphotography on Flickr. Used under a Creative Commons Attribution 2.0 license.


Comments

Victor Bogado da Silva Lins
Wednesday, Nov 18, 2009

I always thought that delta rpms worked by packaging a delta of each separate file in the package. If it were done like that, this would be a non-problem: the deltas for x86 binaries that were never installed on an x86_64 system would not be necessary, since those binaries weren’t needed before.

So if the drpm package contains:

/usr/lib/libfoo.so.delta
/usr/bin/bar.delta (x86 colored)

and the system contains:

/usr/lib/libfoo.so
/usr/bin/bar (x86_64 colored)

the package install would apply the libfoo.so.delta, but would skip the bar.delta since their colors do not match. This would be consistent, I believe, with the end result of installing the updated x86 package without deltas.

I never thought very hard about these problems before, so I am probably missing some detail.

Jonathan Dieter
Thursday, Nov 19, 2009

What actually happens is that we do a delta of the rpm cpio archives. This gives us high delta ratios even when file names have changed but the contents are the same.

The other problem is that we have to rebuild the new rpm so that it is bit-for-bit identical to the rpm that would have been downloaded. If we don’t, the package signatures don’t work.

bogado
Thursday, Nov 19, 2009

Just out of curiosity, are cpio streams compressed, or does rpm compress them afterwards (like a tar | gz)?

Jonathan Dieter
Tuesday, Nov 24, 2009

Not sure if I understand what you’re asking, but deltarpm works with the uncompressed cpio archive. Obviously, the archive is compressed inside of the rpm.