iain's development activities. May contain z80, Cocoa, or whatever.

20 May 2024

Preparing for a release

After successfully downloading the ½TB that is my Bandcamp collection, I think Ohia is nearly ready for release, but trying to decide how to monetize the app is always weird. Since the last time I actually charged for an app, subscriptions have taken off, and the freemium / pro model still exists.

Started thinking about how people would use Ohia: They’ll install it, download their Bandcamp collection in a single large download and then never use it again, or maybe use it occasionally to download anything new they’ve purchased.

That sort of usage doesn’t seem to support a subscription model. I’m sure there are some scummy companies out there for which “use once, subscribe forever” is the perfect goal, but I’d feel dirty trying that.

Likewise, I don’t know how many people would care about possible “pro” features I could add: scheduled downloading for people who might be on limited bandwidth or metered connections, and maybe automatic downloading of updates. But I don’t think Ohia will be an app that people leave running in the background. It feels more like an app that you run when you’ve added lots of things to your Bandcamp collection.

Limited downloads for free, pay to unlock unlimited downloads? Lock away the automatic zip decompression?

I dunno. I’m a developer, not a market…eer?

I think I’ll probably just put a single price on it and see what happens. Now the next thing to do is guess what that price should be. macOS users do seem willing to spend money on software though, so that’s nice, I guess.

18 April 2024

Improving Zip unpacking performance

Spoilers: You Probably Don’t Want To Use URLSession.bytes(from:delegate:)

Back when I started writing my on-the-fly zip unpacker, the app was all nicely asynchronous and I wanted the zip extractor to be all nice and async too. The data comes from the URLSession download, and the only option available to get the data in an async / await context is URLSession.bytes(from:delegate:), so that’s what I went with.

I knew it would be slow because it switches async contexts for every byte (though the discussion on the Swift forums about it possibly being slow suggested that they weren’t too worried about it being too much slower), so I wrote a fairly naive unpacker using it. On my test zip (2.6GB in size) it took about 700 seconds, with the CPU at 100% the whole time. Yeah, it’s slow, but it worked, so I moved on to other things.

Over the last week I’ve wanted to make it faster. I profiled it in Instruments and a lot of the time was spent searching for header information in the unpacked data.

Quick aside about the zip format: there’s a header to say what type the next chunk of data is, and if it’s a file there’s a second header giving the name, the compression type, and the data length, followed by the file data. The thing is, sometimes that data length is 0, and in that case there’s an extra chunk AFTER the file data that records how long the data was. Then, at the end of the file, there’s a directory with offsets to all the data-size chunks throughout the file. I guess it’s a way to do things, but it’s not very useful if the goal is to unpack the file as it’s being downloaded.
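Concretely, the local file header at the start of each entry looks roughly like this. This is a minimal parsing sketch based on the public zip specification (APPNOTE.TXT), not Ohia’s actual code; the field offsets are the spec’s, everything else is illustrative.

```swift
import Foundation

// Sketch of parsing a zip local file header from a byte buffer.
// All multi-byte fields are little-endian, per the zip spec.
struct LocalFileHeader {
    static let signature: UInt32 = 0x04034b50  // "PK\x03\x04"

    let flags: UInt16
    let compressionMethod: UInt16  // 0 = stored, 8 = deflate
    let compressedSize: UInt32     // may be 0 when sizesFollowData
    let fileNameLength: UInt16

    // Bit 3 of the flags means the sizes live in a
    // "data descriptor" chunk AFTER the file data.
    var sizesFollowData: Bool { flags & 0x0008 != 0 }

    init?(parsing bytes: [UInt8]) {
        guard bytes.count >= 30 else { return nil }  // fixed header is 30 bytes
        func u16(_ o: Int) -> UInt16 {
            UInt16(bytes[o]) | UInt16(bytes[o + 1]) << 8
        }
        func u32(_ o: Int) -> UInt32 {
            UInt32(bytes[o]) | UInt32(bytes[o + 1]) << 8
                | UInt32(bytes[o + 2]) << 16 | UInt32(bytes[o + 3]) << 24
        }
        guard u32(0) == Self.signature else { return nil }
        flags = u16(6)
        compressionMethod = u16(8)
        compressedSize = u32(18)
        fileNameLength = u16(26)
    }
}
```

The file name and any “extra” fields follow the fixed 30 bytes, and then the file data itself.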

With this in mind, my first method ignored the data length and scanned the file data for the 4-byte header that indicated the data-length chunk was starting, using a nice ring buffer and stuff. It was pretty neat, but slow.
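The scanning idea can be sketched like this. This is a simplified stand-in (a flat array rather than the ring buffer), and the helper name is made up; the descriptor signature bytes come from the zip spec.

```swift
// Sketch of the "slow path": scan the file data for the 4-byte
// data-descriptor signature ("PK\x07\x08") because the header didn't
// say how long the data is. Hypothetical helper, not the app's code.
func indexOfDataDescriptor(in bytes: [UInt8]) -> Int? {
    let signature: [UInt8] = [0x50, 0x4b, 0x07, 0x08]
    guard bytes.count >= 4 else { return nil }
    // Compare each 4-byte window against the signature.
    for start in 0...(bytes.count - 4) {
        if bytes[start] == signature[0],
           bytes[start + 1] == signature[1],
           bytes[start + 2] == signature[2],
           bytes[start + 3] == signature[3] {
            return start
        }
    }
    return nil
}
```

It also shows why this path is inherently shaky: the signature bytes can legitimately appear inside the file data, so every byte has to be checked.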

So to avoid all the searching, I implemented a fast path for when the zip file did contain the data length in the file header before the data. That told me how many bytes to copy out, and it was all good. Timing it, it came out at ~300 seconds. Twice as fast, nice.

But there were still big chunks of time that Instruments didn’t really seem able to explain very well, and I got the feeling that maybe they were coming from the use of URLSession.AsyncBytes as my AsyncSequence.

I wrote a small wrapper that turned the URLSession.dataTask(with:) delegate method into an AsyncStream<Data> that just returns the buffers of data as they are downloaded.

public final class AsyncDownloader: NSObject {
    typealias ThrowingContinuation = AsyncThrowingStream<Data, any Error>.Continuation

    private lazy var session: URLSession = {
        let configuration = URLSessionConfiguration.default
        configuration.waitsForConnectivity = true
        return URLSession(configuration: configuration,
                          delegate: self,
                          delegateQueue: nil)
    }()

    private var taskToContinuation: [URLSessionDataTask: ThrowingContinuation] = [:]

    public func buffer(from url: URL) -> AsyncThrowingStream<Data, Error> {
        AsyncThrowingStream<Data, Error> { continuation in
            let dataTask = session.dataTask(with: url)
            taskToContinuation[dataTask] = continuation
            dataTask.resume()
        }
    }
}

extension AsyncDownloader: URLSessionDataDelegate {
    public func urlSession(_ session: URLSession,
                           dataTask: URLSessionDataTask,
                           didReceive data: Data) {
        if let continuation = taskToContinuation[dataTask] {
            continuation.yield(data)
        }
    }

    public func urlSession(_ session: URLSession,
                           task: URLSessionTask,
                           didCompleteWithError error: Error?) {
        guard let dataTask = task as? URLSessionDataTask else {
            fatalError("Unknown task in session")
        }

        if let continuation = taskToContinuation[dataTask] {
            if let error {
                continuation.finish(throwing: error)
            } else {
                continuation.finish()
            }
            taskToContinuation[dataTask] = nil
        }
    }
}

This meant the unpacker had to be rewritten to process buffers instead of individual bytes, which, honestly, simplified the code enormously. I only rewrote the fast path because it turns out that all of the zip files I care about contain the data length in the initial file header anyway, but for completeness I’ll probably port the slow path too sometime.
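The consumer side of that rewrite looks roughly like this. This is a self-contained sketch: `simulatedDownload` stands in for `AsyncDownloader.buffer(from:)` so the example doesn’t need a network, and `collect` stands in for the unpacker loop.

```swift
import Foundation

// Stand-in for AsyncDownloader.buffer(from:): yields whole Data
// buffers rather than individual bytes.
func simulatedDownload(chunks: [Data]) -> AsyncThrowingStream<Data, Error> {
    AsyncThrowingStream { continuation in
        for chunk in chunks {
            continuation.yield(chunk)
        }
        continuation.finish()
    }
}

// Stand-in for the unpacker: one suspension point per buffer,
// not one per byte, which is where the speedup comes from.
func collect(_ stream: AsyncThrowingStream<Data, Error>) async throws -> Data {
    var assembled = Data()
    for try await chunk in stream {
        assembled.append(chunk)
    }
    return assembled
}
```

A real unpacker would parse headers out of `chunk` as it arrives instead of just appending, but the shape of the loop is the same.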

Ok, so, how much faster was URLSession.dataTask(with:) over URLSession.bytes(from:delegate:)?

It unpacked the whole 2.6GB file, and wrote it out to disk, in 1004ms.

URLSession.bytes(from:delegate:) is not just slow. It is incredibly slow. You probably shouldn’t use it (or the related functions on URL and FileHandle either).

30 March 2024

Fighting with sendable

Now that I’ve enabled strict concurrency checking in preparation for Swift 6, like everyone else I’ve been fighting with Sendability warnings, especially when trying to return data from a task.

Update 2:

Matt Massicotte linked me to his Concurrency Recipes, which has a way to do what I want without needing the closure to be Sendable, and it works much more nicely: create an async wrapper around the synchronous code that generates the results, using a CheckedContinuation, and inside that continuation run the generator on the DispatchQueue.

func getResults() async -> [String] {
    await withCheckedContinuation { continuation in
        DispatchQueue.global().async { [weak self] in
            // ... generate the results synchronously here ...
            let result: [String] = []
            continuation.resume(returning: result)
        }
    }
}

Now it can just be called with

func doSomething(resultsClosure: @escaping ([String]) -> Void) {
    Task { [weak self] in
        if let results = await self?.getResults() {
            resultsClosure(results)
        }
    }
}

because the resultsClosure is only called inside the MainActor isolation.

Original post, with the more complicated way:

func doSomething(resultsClosure: @escaping ([String]) -> Void) {
    Task.detached {
        let results: [String] = []
        resultsClosure(results)
    }
}

Capture of 'resultsClosure' with non-sendable type '([String]) -> Void' in a `@Sendable` closure

The problem, I think, is that the closure isn’t sendable because [String] isn’t sendable because it could, in theory (though not in practice here), be mutated by the closure.

So I created a custom struct wrapping the array as an immutable property:

struct SendableResults: Sendable {
    public let results: [String]
}

func doSomething(resultsClosure: @escaping (SendableResults) -> Void) {
    Task.detached {
        let results = SendableResults(results: [])
        resultsClosure(results)
    }
}

And the closure needs to be explicitly marked as Sendable:

func doSomething(resultsClosure: @Sendable @escaping (SendableResults) -> Void)

Now this requires the closure to run inside the detached Task, and I’m calling a MainActor-isolated function inside it, so I get an error.

Call to main actor-isolated instance method 'setResults' in a synchronous nonisolated context

So I need to add an await to the setResults call:

resultsGenerator.doSomething { [weak self] in
    let results = $0.results
    await self?.setResults(results)
}

But that’s made the closure async, so I need to declare it as such and await it when calling it.

func doSomething(resultsClosure: @Sendable @escaping (SendableResults) async -> Void) {
    Task.detached {
        let results = SendableResults(results: [])
        await resultsClosure(results)
    }
}

I feel like there’s probably some way I could mark it all as @unchecked Sendable to shut the compiler up, but in my mind that’s the same as force-unwrapping a variable because “I know it’ll never be nil” - it’s a hack that will come back to bite me later.

Update 1:

As Matt Massicotte pointed out on Mastodon, I don’t need to wrap [String] inside SendableResults; the warning was just about the closure needing the @Sendable annotation.
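So the original version only needed the annotation added. A minimal compilable sketch (the example strings are just illustrative):

```swift
import Foundation

// [String] is already Sendable, so it's only the closure itself
// that needs the @Sendable annotation for the detached Task.
func doSomething(resultsClosure: @Sendable @escaping ([String]) -> Void) {
    Task.detached {
        let results: [String] = ["a", "b"]  // placeholder results
        resultsClosure(results)
    }
}
```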

21 March 2024

Xcode placeholder trick

A trick I found a few months back but always forget to write down. A lot of things are like that: by the time I have time to write something, I’ve forgotten the things I wanted to write down.

Anyway, Xcode placeholders.

Xcode’s placeholder system can be used when you need to copy lines and change one little detail, like in this extremely contrived example:

array.append("string 1")
array.append("a different string")
array.append("string number 3")
array.append("string 4")

You can type

array.append("<#string#>")

and then copy and paste it 4 times; each time you paste, the <#string#> part will be selected, ready for you to enter the new bit.

3 January 2024

On Zipping

In the last thrilling installment on mission creep I talked about how I had been distracted by a nearly finished Bandcamp collection downloader. Well, just to shave the yak further, I thought a nice feature would be to unzip the downloaded files into place.

My use case is downloading the music straight to my NAS; currently I have to download the zips to my local machine, unzip them manually, and upload them by hand, which is a pain.

So I thought: Apple has Compression.framework to handle decompressing zip files from a buffered stream, so I could do it as the file downloads, meaning I wouldn’t need any intermediate storage.

I tried a few experiments but quickly realised that it’s not zip files that Compression.framework can decompress but zlib streams - that is, it handles the compressed data inside the zip file, but not the container format wrapped around it.

I took a look and found a load of Swift packages that can handle the container format, but none that could handle it in a buffered fashion, and the only one that hinted it might be possible - Zip Foundation - seemed to require a full file to be present so it could jump around a bit.

Why? Well, it seems the Zip container format isn’t very conducive to streaming. It has an entry header that describes the data, but it can optionally put the size of the data in an extra section AFTER the data, which makes working out how long the data will be trickier. The format does have an archive catalogue listing the offsets of all these extra sections, so the way to handle it is to read that catalogue first and then parse all the extra sections. Except they decided to put this catalogue at the end of the file.

Which kinda sucks for trying to do the decompression on the fly.
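For contrast, this is roughly how a non-streaming unzipper finds that catalogue: scan backwards from the end of the file for the “end of central directory” record’s signature. A minimal sketch (the function name is made up; the signature and 22-byte minimum record size are from the zip spec):

```swift
// Locate the "end of central directory" record, which sits at the
// very end of the archive (possibly followed by a comment).
func endOfCentralDirectoryOffset(in bytes: [UInt8]) -> Int? {
    let signature: [UInt8] = [0x50, 0x4b, 0x05, 0x06]  // "PK\x05\x06"
    guard bytes.count >= 22 else { return nil }  // minimum record size
    // A variable-length comment can follow the record, so search
    // backwards from the last possible position.
    for start in stride(from: bytes.count - 22, through: 0, by: -1) {
        if bytes[start] == signature[0],
           bytes[start + 1] == signature[1],
           bytes[start + 2] == signature[2],
           bytes[start + 3] == signature[3] {
            return start
        }
    }
    return nil
}
```

Needing the last bytes of the file before you can read anything else is exactly what makes this layout hostile to unpacking during a download.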

Anyway, I spent Christmas break reading the Zip format specification and implemented a very very simple unarchiver that takes the async data stream from URLSession and unpacks it as it goes.

The twist is that the files from Bandcamp don’t appear to contain any compressed data, so I don’t even need Compression.framework; it’s just a case of finding the data in the archive and writing it out to disk.

The code is nearly ready to go into Ohia, and maybe it’ll be finished soon, cos I’ve got a ZX Spectrum Next I want to play with.