On Zipping

In the last thrilling installment on mission creep I talked about how I had been distracted by a nearly finished Bandcamp collection downloader. Well, just to shave the yak further, I thought a nice feature would be to unzip the downloaded files into place.

My use case is getting the music onto my NAS. At the moment I have to download the zips to my local machine, unzip them manually and upload them by hand, which is a pain.

So I thought: Apple has Compression.framework to handle decompressing zip files from a buffered stream, so I could do it as the file downloads, meaning I don’t need any intermediate storage.

I tried a few experiments but quickly realised that it’s not zip files that Compression.framework can decompress but zlib streams. That is, it handles the compressed data inside the zip file, but not the container format wrapped around that data.

Took a look and found a load of Swift packages that can handle the container format, but none that could handle it in a buffered fashion and the only one that hinted that it might be possible - Zip Foundation - seemed to require a full file to be present so it could jump around a bit.

Why? Well, it seems that the Zip container format isn’t very conducive to streaming. Each entry has a header that describes the data, but the format may optionally put the size of the data in an extra section AFTER the data (the spec calls this a data descriptor), which means working out how long the data will be is trickier. The format does have an archive catalogue (the central directory) which lists every entry along with its sizes and its offset in the file, so the way to handle it is to read that catalogue first and then jump to each entry. Except they decided to put this catalogue at the end of the file.
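To make that layout concrete, here is a rough sketch (not the code that’s going into Ohia) of reading the fixed 30-byte part of an entry header, using the field offsets from the Zip specification. The `LocalFileHeader` type and its property names are made up for illustration. The interesting bit for streaming is flag bit 3: when it is set, the sizes in the header are zero and the real ones only turn up after the data.

```swift
import Foundation

/// Fixed-size part of a zip entry header ("local file header" in the spec).
/// Hypothetical type for illustration only.
struct LocalFileHeader {
    static let signature: UInt32 = 0x04034b50   // bytes "PK", 0x03, 0x04 on disk
    static let fixedSize = 30

    let flags: UInt16
    let compressionMethod: UInt16               // 0 = stored, 8 = deflate
    let compressedSize: UInt32
    let uncompressedSize: UInt32
    let nameLength: Int
    let extraLength: Int

    /// Flag bit 3: sizes and CRC are not in this header, they follow the data.
    var usesDataDescriptor: Bool { (flags & 0x0008) != 0 }

    /// Offset from the start of the header to the first byte of entry data.
    var dataOffset: Int { Self.fixedSize + nameLength + extraLength }

    /// Returns nil if there are too few bytes or the signature does not match.
    init?(_ bytes: [UInt8]) {
        guard bytes.count >= Self.fixedSize else { return nil }
        // All multi-byte fields in a zip file are little-endian.
        func u16(_ o: Int) -> UInt16 { UInt16(bytes[o]) | (UInt16(bytes[o + 1]) << 8) }
        func u32(_ o: Int) -> UInt32 { UInt32(u16(o)) | (UInt32(u16(o + 2)) << 16) }
        guard u32(0) == Self.signature else { return nil }
        flags = u16(6)
        compressionMethod = u16(8)
        compressedSize = u32(18)
        uncompressedSize = u32(22)
        nameLength = Int(u16(26))
        extraLength = Int(u16(28))
    }
}
```

If `usesDataDescriptor` comes back true, a streaming reader can’t tell from the header alone where the entry ends, which is exactly the awkward case described above.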

Which kinda sucks for trying to do the decompression on the fly.

Anyway, I spent Christmas break reading the Zip format specification and implemented a very very simple unarchiver that takes the async data stream from URLSession and unpacks it as it goes.

The twist is that the files from Bandcamp don’t appear to contain any compressed data, so I don’t even need Compression.framework. It’s just a case of finding the data in the archive and writing it out to disk.
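As a rough illustration of how little is needed in that case, here is a minimal sketch of a chunk-fed reader. It is not the Ohia code: `StoredEntryStream` and `feed` are hypothetical names, and it assumes every entry is stored (method 0) with its size in the header, so no data descriptors. It also keeps each entry in memory until it is complete, where a real unarchiver would want to write to disk as the bytes arrive.

```swift
import Foundation

/// Hypothetical incremental reader: feed it chunks as they arrive and it
/// returns every entry that has become complete. Stored entries only.
struct StoredEntryStream {
    private var buffer: [UInt8] = []
    /// True once something other than a readable stored entry has been seen.
    private(set) var finished = false

    mutating func feed(_ chunk: [UInt8]) -> [(name: String, data: [UInt8])] {
        buffer += chunk
        var entries: [(name: String, data: [UInt8])] = []
        while !finished, let entry = nextEntry() {
            entries.append(entry)
        }
        return entries
    }

    private mutating func nextEntry() -> (name: String, data: [UInt8])? {
        guard buffer.count >= 4 else { return nil }      // wait for more bytes
        let b = buffer
        func u16(_ o: Int) -> Int { Int(b[o]) | (Int(b[o + 1]) << 8) }
        func u32(_ o: Int) -> Int { u16(o) | (u16(o + 2) << 16) }

        // Anything other than an entry header (e.g. the catalogue at the
        // end of the file) means there are no more entries to read.
        guard u32(0) == 0x04034b50 else { finished = true; return nil }
        guard buffer.count >= 30 else { return nil }

        // This sketch gives up on compressed data or sizes that follow the data.
        guard u16(8) == 0, (u16(6) & 0x0008) == 0 else { finished = true; return nil }

        let size = u32(18), nameLength = u16(26), extraLength = u16(28)
        let dataStart = 30 + nameLength + extraLength
        let end = dataStart + size
        guard buffer.count >= end else { return nil }    // entry not complete yet

        let name = String(decoding: buffer[30..<30 + nameLength], as: UTF8.self)
        let data = Array(buffer[dataStart..<end])
        buffer.removeFirst(end)
        return (name, data)
    }
}
```

Each chunk from the download would go into `feed`, and whatever comes back can be written straight to its destination.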

The code is nearly ready to go into Ohia and maybe it’ll be finished soon, cos I’ve got a ZX Spectrum Next I want to play with.