Decompress the next piece of the nametable into a buffer in RAM at $0100, the stack page. (You can put it elsewhere, but with the buffer on the stack, PLA / STA $2007 pairs are great for a little unrolling.) Do the decompression for the next screen update outside of vblank, then copy the buffer to the PPU during vblank. It's not too bad at all. Honestly, this compression is simple enough that you can probably do all of it in vblank if you can't or don't want to use a buffer. I'd say it's worth buffering, though. Here's a decompressor that writes straight to the PPU, using the byte format (MMMM.RRRR), where M is the metatile number and R is the repeat count for it. You'll need to add a bit to it, and it probably doesn't work as-is since I JUST wrote it, but it may help others see how it can be done and what needs to happen.
Code:
MetatileBlock array format: UPPER RIGHT TILE, BOTTOM RIGHT TILE, UPPER LEFT TILE, BOTTOM LEFT TILE. ;4 bytes per metatile, repeated 16 times for 16 metatiles.
ZPNTData: 2 bytes zeropage ;Zeropage pointer to nametable stream. Expected to be set up by main level loader.
Metatile: 1 byte RAM. ;Metatile currently in use. Expected to be set up before the first run.
DCStepsLeft: 1 byte RAM. ;Steps left in the current repeat run. Expected to be set up before the first real run, too.
ColStepsLeft: 1 byte RAM. ;Steps left for this column, in metatiles. Set up every time the code is run.
DecompressStripToPPU:
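; (Set the PPU address via $2006 to the next screen location to update here, and set the PPU to +32 increment mode via bit 2 of $2000 so writes run down a column for horizontal scrolling.)
LDA #15 ;Number of metatiles to put up in this column.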
STA ColStepsLeft ;Store that number.
LDY #$00 ;Clear Y because we need it for a plain [] read for the nametable data pointer.
.ContinueMetatileProcess:
LDA Metatile ;Get current decompression Metatile.
ASL A
ASL A ;Multiply by 4, since each metatile has 4 tile bytes.
TAX ;X = Metatile * 4, the index into the metatile data.
; (determine left or right metatile block here. Add 2 to X if left. Keep same if not.)
.WriteMetatile:
LDA MetatileBlock,X ;Get tile upper.
STA $2007 ;Put to screen upper tile.
LDA MetatileBlock+1,X ;Get lower tile.
STA $2007 ;Put to screen lower tile.
DEC DCStepsLeft ;One less decompressed tile
BPL .TestColumnDoneMain ;Run not finished yet, so no new metatile needs to be loaded.
LDA [ZPNTData],Y ;Get new byte.
INY ;Point to next stream byte.
PHA ;Save A for high bits later.
AND #$0F ;Get Repeat count.
STA DCStepsLeft ;Store to steps left.
PLA ;Recover A.
LSR A ;Put top 4 bits into bottom 4.
LSR A
AND #%00111100 ;Keep the metatile bits (now metatile * 4) and clear the bottom 2 bits, which hold the top 2 bits of the repeat count.
TAX ;Put new metatile to X, we're basically reloading the metatile array pointer (X) to the new metatile data needed.
LSR A
LSR A ;Finish putting top 4 bits to bottom 4.
STA Metatile ;Store the metatile.
; (Determine left or right metatile block here. Should default to +2 if moving right or +0 if moving left, since we know we're on a new column edge.)
.TestColumnDoneMain:
DEC ColStepsLeft ;Is the entire column of metatiles written?
BNE .WriteMetatile ;If no, loop another metatile!
.ExitDecompressor:
; (If we just wrote the edge of a column [which is determined by the "camera"], add the Y value to the current ZPNTData pointer so it points to the upcoming metatiles. If not, leave the pointer alone, since the next strip uses the same metatiles, just their other side. Metatile and DCStepsLeft would presumably also have to be put back to their values from the start of the column in that case.)
RTS ;Or fall through into whatever code follows; this doesn't have to be a subroutine.
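For checking level data on the PC side, the (MMMM.RRRR) format can be modeled in a few lines of Python. This is only a rough sketch and not part of the NES code: the function names and the sample stream are made up for illustration, and it assumes a repeat count of R means R + 1 copies of the metatile, which is how the DEC/BPL test above reads.

```python
def unpack(byte):
    """Split one stream byte (MMMM.RRRR) into (metatile, repeat)."""
    return byte >> 4, byte & 0x0F


def expand_runs(stream):
    """Yield metatile numbers, ASSUMING repeat R means R + 1 copies."""
    for byte in stream:
        metatile, repeat = unpack(byte)
        for _ in range(repeat + 1):
            yield metatile


def columns(stream, height=15):
    """Group the expanded metatiles into columns of `height` metatiles."""
    flat = list(expand_runs(stream))
    return [flat[i:i + height] for i in range(0, len(flat), height)]


# Made-up sample data: three runs that fill exactly two 15-metatile columns.
stream = bytes([0x3E, 0x12, 0x5B])
for number, column in enumerate(columns(stream)):
    print(number, column)
```

Note that a run is allowed to cross a column boundary in this model, the same as in the assembly, where DCStepsLeft carries over between calls.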
It should be trivial to change this code to decompress to a buffer instead, if you wanted to. I've never written a game that scrolls during gameplay, but I've messed with the scrolling registers a ton. If anyone else (preferably someone who has written a scrolling game) has any comments on this decompressor, please speak up. I'd love to know where I can improve my programming. (That said, in this case I'd probably use a buffer instead of running the code above in vblank.)
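To show what the buffered version would have to produce, here is a small Python model of building one 30-tile strip from a column of 15 metatiles. Treat it as a sketch under assumptions: METATILE_BLOCK and its tile numbers are invented, build_strip is a hypothetical name, and the tuple order follows the array format given above (upper right, bottom right, upper left, bottom left). On the NES, the resulting bytes would sit in the RAM buffer until vblank.

```python
# Hypothetical metatile table in the order used in the post:
# (upper right, bottom right, upper left, bottom left).
METATILE_BLOCK = [
    (0x01, 0x11, 0x00, 0x10),  # metatile 0 (made-up tile numbers)
    (0x03, 0x13, 0x02, 0x12),  # metatile 1
]


def build_strip(column, left_side, table=METATILE_BLOCK):
    """Return the tile numbers for one 8-pixel-wide strip, top to bottom."""
    base = 2 if left_side else 0  # same +2 offset the 6502 code adds to X
    strip = bytearray()
    for metatile in column:
        entry = table[metatile]
        strip.append(entry[base])      # upper tile of this metatile
        strip.append(entry[base + 1])  # lower tile of this metatile
    return bytes(strip)


# One column of 15 metatiles gives 30 tiles, a full nametable column.
column = [0] * 14 + [1]
left = build_strip(column, left_side=True)
right = build_strip(column, left_side=False)
print(left.hex(), right.hex())
```

Both strips come from the same column of metatiles, which is why the stream pointer only has to move on after the second strip has been built.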