iain's development activities. May contain z80, Cocoa, or whatever.

11 January 2023

Slow few days. Have the even clip code done and am now adjusting it to work for the odd clips as well. Most of the last few days has been spent agonising about how to improve the speed and keep the code maintainable. For example, the code for drawing the even and odd maps is identical except for three subroutine calls, and I was trying to find the best way to avoid duplicating it without adding too much extra time to each frame. I tried jump tables and vector tables, and the thing that finally worked was using self-modifying code to change the addresses that get called every frame. But that added about 60 t-states to each frame compared with just having the same code duplicated twice.
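As a rough illustration of the self-modifying approach, the sketch below patches the 16-bit operand of a single call instruction. Every label here (useEvenRoutines, useOddRoutines, tileCall, drawEvenTile, drawOddTile) is made up for the example, and the real code presumably patches three calls rather than one.

    useEvenRoutines:
    ld hl, drawEvenTile
    ld (tileCall + 1), hl   ; overwrite the address bytes that follow the call opcode
    ret

    useOddRoutines:
    ld hl, drawOddTile
    ld (tileCall + 1), hl   ; same patch, pointing at the odd routine instead
    ret

    drawMap:
    ; shared set-up would go here
    tileCall:
    call drawEvenTile       ; operand is replaced at runtime by the routines above
    ; shared tail would go here
    ret

    drawEvenTile:
    ret                     ; stub standing in for the real even routine
    drawOddTile:
    ret                     ; stub standing in for the real odd routine

The patch works because call is a one byte opcode followed by a two byte little-endian address, so tileCall + 1 is where the address lives.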

I had resigned myself to just maintaining two copies of slightly complex code, but then I realised that if I can move the memory page switching code to just switch pages once at the start of each frame then I’ll save 100 t-states per tile, which for the level I’m using to test works out as a saving of about 15,000 t-states, so I shouldn’t worry too much about adding 60 t-states per frame.
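At 100 t-states per tile, a saving of about 15,000 t-states implies roughly 150 tiles per frame, so the extra 60 t-states is around 0.4% of what is saved. A minimal sketch of switching pages once per frame might look like the following. It assumes the tile data is paged into upper memory through HMPR (port 251), that page 4 holds it, and that this code and the stack sit outside the paged area. The page number and the labels are hypothetical.

    drawFrame:
    in a, (251)             ; read the current HMPR value
    push af                 ; keep it so the paging can be restored afterwards
    ld a, 4                 ; hypothetical page number holding the tile data
    out (251), a            ; switch page once, before any tile is drawn
    call drawAllTiles       ; per-tile code no longer touches the paging
    pop af
    out (251), a            ; restore the original paging
    ret

    drawAllTiles:
    ret                     ; stub standing in for the real tile loop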

6 January 2023

Tile overdraw is now working and drawing is much faster now that we don’t need to blank the whole play area each frame. It’s just too slow at the moment to run without double buffering, especially once the clipped tiles get drawn. Going to go back to those soon, to get scrolling “finished”.

Currently the screen is drawn between lines 44 and 148, but the draw routine is synced to the vblank interrupt. Technically I could move it down to the bottom of the screen to give more drawing time, but I quite like having it centred in the screen. The Sam Coupé does have per-line interrupts, so I could set a line interrupt at line 148 and sync the drawing to that. But ultimately, any extra time gained there would end up being lost once game logic and sprites were introduced, so we’d end up back at square one and have to double buffer anyway.
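For reference, a line interrupt at line 148 could be set up roughly as below. This is a sketch that assumes the usual Sam Coupé port layout (writing port 249 sets the LINE register, reading it returns the STATUS register with active-low source bits). The labels and the lineReached flag are hypothetical.

    setLineInterrupt:
    ld a, 148
    out (249), a            ; LINE register: request an interrupt at scan line 148
    ret

    ; Intended to be called from the interrupt handler.
    checkLineInterrupt:
    in a, (249)             ; STATUS register: a 0 bit marks the interrupt source
    rra                     ; move bit 0 (line interrupt) into carry
    ret c                   ; carry set means this was not a line interrupt
    ld a, 1
    ld (lineReached), a     ; flag for the main loop to start drawing
    ret

    lineReached:
    db 0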

5 January 2023

Still working on tile overdraw to erase any tile. Ended up spending time playing with the less used jp conditionals like m, po, and pe, whose meanings I will never remember. p jumps when the sign flag is clear (the result is zero or positive) and m jumps when it is set (the result is negative). So after a dec they tell us whether a was 0 beforehand, like in a decrement loop.

    dec a
    jp p, noReset
    ld a, 7
    noReset:

if you want to loop from 7 -> 0 and then reset back to 7. The alternative would be to check for 0 before the dec which is larger and slower and is going to need multiple jumps.
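For comparison, one way the check-before-dec alternative might be written is sketched below. The labels are made up, and the byte and t-state counts in the comments are nominal Z80 figures.

    or a                    ; 1 byte, 4 t-states: set the zero flag from a
    jr z, reset             ; 2 bytes, 12 if taken, 7 if not
    dec a                   ; 1 byte, 4
    jr done                 ; 2 bytes, 12
    reset:
    ld a, 7                 ; 2 bytes, 7
    done:

That comes to 8 bytes and two jumps, costing 27 t-states on the common path and 23 on the reset path. The sign flag version is 6 bytes and one jump, costing 14 t-states when no reset is needed and 21 when it is. The sign flag trick does rely on a never starting above 128, since a dec from any higher value also leaves the sign flag set.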

4 January 2023

Tiles are 8x8. In the zeroOffset and evenOffset cases we just draw the 8 pixels of width, but because tiles are shifted 1 nibble right in the oddOffset case we end up drawing 10 pixels, with the right hand pixel being set to the background colour. It occurred to me that if we drew the 10 pixels in zeroOffset and evenOffset as well, with the rightmost two pixels set to the background colour, that would erase any previous content, removing the need for a clearPlayarea call. A quick test suggests it works and greatly increases the scrolling speed, although there’s still something wrong with it.
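A nibble per pixel means two pixels per byte, so an 8 pixel row is 4 bytes and a 10 pixel row is 5. A minimal sketch of drawing one row with the extra erasing byte might look like this. It assumes hl points at the 4 source bytes, de points at the screen, and the background colour is 0. The routine name is hypothetical and the real drawing code may well be structured differently.

    drawTileRowWithErase:
    ldi                     ; copy pixels 0-1 (ldi also decrements bc)
    ldi                     ; pixels 2-3
    ldi                     ; pixels 4-5
    ldi                     ; pixels 6-7
    xor a                   ; a = 0, two background coloured pixels
    ld (de), a              ; pixels 8-9 overwrite whatever was left from the last frame
    ret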

3 January 2023

Got the non-clipped version of the odd offset map finished, and it wasn’t as hard or as expensive as I’d expected. Maybe 80 t-states more per tile than the even version. Not the worst penalty, although in a screen full of tiles it’d add up.

Going to work on the left and right clipped tiles next.

Every time the map is drawn to screen the code needs to select which of the three methods gets called: zeroOffset, oddOffset, or evenOffset (which need to be called when the scroll offset is 0; 1, 3, 5, 7; and 2, 4, 6 respectively). Two options would be to use a 16 byte vector table with the addresses for each value, or just do an if/else if/else block.

(all code here is done in my head, so might have some errors)

The vector table has a constant cost of 71 t-states

    ld h, 0
    ld l, a
    add hl, hl
    ld de, mapOffsetVectorTable
    add hl, de

    ld a, (hl)
    inc hl
    ld h, (hl)
    ld l, a
    jp (hl)

and for longer if/switch statements it might be worth it, but for only 3 options the if block is quicker.

    bit 0, a
    jr nz, oddOffset
    cp 0
    jr nz, evenOffset
    jr zeroOffset

With this slightly odd ordering, the timing is

  • 0: 41 t-states
  • 1, 3, 5, 7: 20 t-states
  • 2, 4, 6: 34 t-states
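Those totals can be checked against the nominal Z80 timings for each instruction, as in the annotated copy below. The three stub routines at the end are placeholders so the fragment stands alone.

    bit 0, a                ; 8
    jr nz, oddOffset        ; 12 if taken, 7 if not    odd:  8 + 12 = 20
    cp 0                    ; 7
    jr nz, evenOffset       ; 12 if taken, 7 if not    even: 8 + 7 + 7 + 12 = 34
    jr zeroOffset           ; 12                       zero: 8 + 7 + 7 + 7 + 12 = 41

    zeroOffset:
    ret                     ; stub
    oddOffset:
    ret                     ; stub
    evenOffset:
    ret                     ; stub

Swapping cp 0 for or a (4 t-states rather than 7) would test the same zero condition and trim 3 t-states from the even and zero paths.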

The thinking behind this ordering is that zeroOffset takes the fewest t-states to run and is only called once every 8 frames, while oddOffset takes the most but is called 50% of the time. So selecting oddOffset should be cheapest and selecting zeroOffset should cost the most.
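For comparison, the table used by the vector approach would need one 2 byte address per scroll offset, which is where the 16 bytes come from. A possible layout is sketched below, assuming an assembler that accepts dw for 16-bit words. The stubs are placeholders.

    mapOffsetVectorTable:
    dw zeroOffset           ; offset 0
    dw oddOffset            ; offset 1
    dw evenOffset           ; offset 2
    dw oddOffset            ; offset 3
    dw evenOffset           ; offset 4
    dw oddOffset            ; offset 5
    dw evenOffset           ; offset 6
    dw oddOffset            ; offset 7

    zeroOffset:
    ret                     ; stub
    oddOffset:
    ret                     ; stub
    evenOffset:
    ret                     ; stub

If each offset comes up equally often, the if block averages (41 + 4 × 20 + 3 × 34) / 8 = 223 / 8, or about 28 t-states per frame, against a flat 71 for the table.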