What is zchunk?

Over the past few months, I’ve been working on zchunk, a compression format that is designed to allow for good compression, but, more importantly, the ability to download only the differences between an old version of the file and a new version.

The concept is similar to both zsync and casync, but it has some important differences. Let’s first look at how downloading a zchunk file works.

Downloading a zchunk file

A zchunk file is basically a bunch of zstd-compressed “chunks” concatenated together with a header specifying the location, size and checksum of each chunk. Let’s take an example with only a few chunks:

Note that the file has three chunks, labeled A, B and C, each with a unique checksum. These checksums are stored in the header.

Now let’s imagine that we want to download a new version of the file:

Note that the new file has two chunks that are identical to the original file and one new chunk. The header in the new file contains the checksums of chunks A, C and D. We start by downloading just the header of the new file:

We then compare the chunk checksums in the old file’s header with the chunk checksums in the newly downloaded header and copy any matching chunks from the old file:

We finish by downloading any remaining chunks, reducing the number of http requests by combining the range requests, and then inserting the downloaded chunks into the appropriate places in the final file:

When we’re finished, we have a file that is byte-for-byte identical to the new file on the server:
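The comparison logic described above can be sketched in a few lines. This is a minimal Python illustration, not zchunk’s actual C implementation; the function names are my own, and SHA-256 stands in for whatever checksum the real header stores.

```python
import hashlib

def checksum(data: bytes) -> str:
    """Identify a chunk by its hash (stand-in for the header's checksums)."""
    return hashlib.sha256(data).hexdigest()

def plan_rebuild(old_chunks, new_checksums):
    """Compare the new header's checksums against the chunks we already
    have locally: matching chunks are copied, the rest are downloaded."""
    have = {checksum(c): c for c in old_chunks}
    plan = []
    for cs in new_checksums:  # checksums in new-file order
        if cs in have:
            plan.append(("copy", have[cs]))
        else:
            plan.append(("download", cs))
    return plan

# Old file has chunks A, B and C; the new file is A, C, D.
A, B, C, D = b"chunk-A", b"chunk-B", b"chunk-C", b"chunk-D"
plan = plan_rebuild([A, B, C], [checksum(A), checksum(C), checksum(D)])
```

Only chunk D ends up in the download list, so a single small range request fetches everything the old file can’t provide.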

Background

What inspired this format is the ridiculous amount of metadata you download every time you check for updates in Fedora. Most of the data from one day’s updates is exactly the same in the next day’s updates, but you’ll still find yourself downloading over 20MB of metadata.

When I first took a look at this problem, there were two potential solutions: casync and zsync.

casync

At first glance, casync looked like it provided exactly what we need, but real-world testing showed a couple of problems. Because casync puts each chunk into a separate file, we downloaded hundreds (and sometimes thousands) of individual files just to rebuild the original metadata file. The process of initiating each http request is expensive, and, in my testing, downloading only the changed chunks took much longer than just downloading the full file in the first place.

The more I looked at casync, the more obvious it became that it’s designed for a different use-case (delivering full filesystem images), and, while close, wasn’t quite what I needed.

zsync

zsync approaches the problem in a completely different way: it requires you to use an rsyncable compression format (gzip --rsyncable is suggested), splits the compressed file into chunks, and stores the chunk locations in a separate index file. Unfortunately, it also sends a separate http request for each chunk that it downloads.

Add to that the fact that zsync is unmaintained and somewhat buggy, and I didn’t really feel like it was the best option. I did find out later that OpenSUSE uses zsync for their metadata, but they put all the new records at the end of their metadata files, which reduces the number of ranges (and, therefore, the number of http requests).

zchunk

After looking at the drawbacks of both formats, I decided to create a completely new file format, with one major design difference and one major implementation difference compared to both casync and zsync.

Unlike both casync and zsync, zchunk files are completely self-contained. For zsync, you need the archive and its separate index, while casync requires that each chunk be stored as a separate file alongside the index. Casync’s method fits its use-case, and zsync’s method works, given that it’s meant to extend what you can do with already-created files, though it’s hobbled by the fact that you have to intentionally use special flags to create compressed files that delta well.

The downside of having a separate index is that you have to make sure the index stays with the file it’s pointing to, and, since we’re creating a new format, there wasn’t much point in keeping the index separate.

The implementation difference is zchunk’s ability to combine multiple range requests into a single http request, a rarely used feature that is nevertheless part of the http spec. Zsync could theoretically add this feature, but casync can’t, because it downloads each chunk as a separate file.

Zchunk will automatically combine its range requests into the largest number that the server will handle (the nginx default is 256 range requests in a single http request, while Apache’s default is to support unlimited range requests), send them as one http request, and then split the response into the correct chunks.
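As a rough illustration of that batching, here’s a Python sketch that folds a list of byte ranges into as few `Range` headers as a server’s limit allows. The function name and batching strategy are my own; the real library also has to split the server’s `multipart/byteranges` response back into individual chunks, which is omitted here.

```python
def batch_range_headers(ranges, max_per_request=256):
    """Combine (start, end) byte ranges into as few HTTP Range header
    values as the server will accept; 256 matches the nginx default."""
    headers = []
    for i in range(0, len(ranges), max_per_request):
        batch = ranges[i:i + max_per_request]
        headers.append("bytes=" + ",".join(f"{s}-{e}" for s, e in batch))
    return headers

# Three missing chunks, but a server that only accepts two ranges per
# request, so the download is split across two http requests.
hdrs = batch_range_headers([(0, 99), (200, 349), (500, 999)], max_per_request=2)
```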

The zchunk format is also designed to store optional GPG signatures, so zchunk files can be signed and verified without needing to store the signature in a separate file.

What still needs work in zchunk

  • The C API for downloading needs to be finalized. I’m leaning towards not actually providing an API for downloading, but rather providing functions to generate the range requests and providing a callback that re-assembles the downloaded data into the correct chunks
  • Full test cases need to be written
  • GPG signature integration needs to be written
  • Python extensions need to be written

What’s needed to get zchunk-enabled Fedora repositories

  • I’ve written patches for createrepo_c that allow it to generate zchunk metadata, but it needs some work to make sure there are test cases for all the code
  • I’ve written a patch for libsolv that allows it to read zchunk files, but I still need to submit it for review
  • I’ve started on the work to get librepo to download zchunk metadata, but I’m not finished yet

Introducing zchunk

Introducing zchunk, a new file format that’s highly delta-able (is that even a word?), while still maintaining good compression. This format has been heavily influenced by both zsync and casync, but attempts to address the weaknesses of both formats (for at least some use-cases). I’ll cover the background behind this in a later post.

Like casync and zsync, zchunk works by dividing a file into independently compressed “chunks”. Using only standard web protocols, the zchunk utility zckdl downloads any new chunks for your file while re-using any duplicate chunks from a specified file on your filesystem.

Zchunk is a completely new compression format, and it uses a new extension, .zck. By default, it uses zstd internally, but, because it compresses each chunk separately, a zchunk file cannot be decompressed using the zstd utilities. A zchunk file can be decompressed using the unzck utility and compressed using the zck utility.
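To show why per-chunk compression is what makes partial downloads possible, here’s a Python sketch. zstd isn’t in Python’s standard library, so zlib stands in for it, and the real header is reduced to a simple list of frame sizes; the names are mine, not zchunk’s.

```python
import zlib

def compress_chunks(chunks):
    """Compress each chunk independently (zchunk uses zstd per chunk;
    zlib stands in here). Each frame decompresses on its own, which is
    what lets a client fetch and decode only the chunks it's missing."""
    frames = [zlib.compress(c) for c in chunks]
    sizes = [len(f) for f in frames]  # the real header stores locations/sizes
    return b"".join(frames), sizes

blob, sizes = compress_chunks([b"chunk-A", b"chunk-B"])

# Knowing each frame's offset and size, one chunk can be pulled out of
# the blob and decompressed without touching its neighbours.
first = zlib.decompress(blob[:sizes[0]])
```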

Zchunk also supports the use of a common dictionary to help increase compression. Since chunks may be quite small yet contain data repeated across chunks, you can use a zstd dictionary to encode the most common sequences. The dictionary must be the same for every version of your file, otherwise the chunks won’t match. For our test case, Fedora’s update metadata, using a dictionary reduces the size of the file by almost 40%.
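zstd’s dictionary support isn’t available in Python’s standard library, but zlib’s preset-dictionary parameter (`zdict`) illustrates the same idea. The sample dictionary and chunk contents below are invented for the example; they are not what Fedora’s metadata actually contains.

```python
import zlib

# A shared dictionary holding byte sequences common to every chunk.
# It must be identical for every version of the file, or the
# compressed chunks won't match between versions.
zdict = b'<update type="security" version='

def compress_with_dict(chunk: bytes) -> bytes:
    co = zlib.compressobj(zdict=zdict)
    return co.compress(chunk) + co.flush()

def decompress_with_dict(data: bytes) -> bytes:
    do = zlib.decompressobj(zdict=zdict)
    return do.decompress(data)

chunk = b'<update type="security" version="1.2.3">'
small = compress_with_dict(chunk)
plain = zlib.compress(chunk)
# The dictionary-aware stream is smaller: the shared prefix is encoded
# as a back-reference into the dictionary instead of literal bytes.
```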

So what’s the final damage? In testing, a zchunk file with an average chunk size of a few kilobytes and a 100KB dictionary ends up roughly 23% larger than a zstd file using the same compression level, but almost 10% smaller than the equivalent gzip file. Obviously, results will vary based on chunk size, but zchunk generally beats gzip in size while providing efficient deltas via both rsync and standard http.

The zchunk file format should be considered fixed in that any further changes will be backwards-compatible. The API for creating and decompressing a .zck file can be considered essentially finished, while the API for downloading a .zck file still needs some work.

Future features include embedded signatures, separate streams, and proper utilities.

zchunk-0.4.0 is available for download, and, if you’re running Fedora or RHEL, there’s a COPR that also includes zchunk-enabled createrepo_c (Don’t get too excited, as there’s no code yet in dnf/librepo to download the .zck metadata).

Development is currently on GitHub.

Updated 05/03/2018 to point to new repository location