Donut NES CHR compression

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
Donut NES CHR compression
by on (#217823)
Donut is the sequel to my previous codec Bagel.

(2018-02-15 Edit)
New versions with a slightly tweaked format compared to last year's:
6502 decompressor for ca65, via GitHub.
Compressors written in C and Python are also available there.

A written narrative, a dedicated webpage, and Windows exe builds may appear in the next year or so.
Re: Donut NES CHR compression
by on (#229449)
The up-to-date version is on GitHub, for anyone looking:

https://github.com/jroatch/donut-nes
Re: Donut NES CHR compression
by on (#233829)
I'm really fascinated by this. So far I've been using lz4 for all of my compression needs. Simple and ubiquitous. On the other hand, it doesn't do a great job compressing character tables compared to name/attribute data. My thoughts about how to improve the situation haven't gone much further than "uh... preprocess with XOR maybe?"
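
A sketch of that XOR idea, for what it's worth (my own guess at it, with the plane stride left as a parameter; nothing Donut-specific):

Code:
#include <stddef.h>
#include <stdint.h>

/* Hypothetical XOR-delta preprocess: replace each byte with the XOR of
 * itself and the byte one stride back (8 = one bit plane, 16 = one tile),
 * so repeated rows/planes become runs of zeros that LZ4 matches well. */
void xor_delta_encode(uint8_t *data, size_t len, size_t stride) {
    for (size_t i = len; i-- > stride; )
        data[i] ^= data[i - stride];
}

/* Decoding reverses it front to back. */
void xor_delta_decode(uint8_t *data, size_t len, size_t stride) {
    for (size_t i = stride; i < len; i++)
        data[i] ^= data[i - stride];
}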

Donut gives slightly better compression than straight lz4, using only ~20 bytes more code. If I had 3 full character tables, it would even pay off to have both compressors for different types of data. Additionally, lz4 really requires turning off the PPU while decompressing. Donut, on the other hand, seems to work with a small CPU-side buffer and looks relatively easy to put in a coroutine, pausing for the NMI when the buffer gets full. Maybe it's even fast enough to stream a couple tiles per frame?
Re: Donut NES CHR compression
by on (#233834)
Thanks,
Working with some form of coroutines was part of my thinking when designing this (and the codec it descended from).

But in practice there is one complication I've found with its current design: the requirement to retain the buffer for the "Reuse/Duplicate previous block" feature. In my usage I found that it gets used largely to code a sequence of solid color blocks, since a solid color block takes 2 bytes but a duplicate block takes only 1 byte.

I'm contemplating making a breaking change where I get rid of "Duplicate blocks" and make solid color blocks only 1 byte. That would remove the requirement to save the 64-byte buffer, making it easier for further routines to transform the uncompressed data.
(The C encoder would be easier to program as well.)
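
To put numbers on it with the byte costs above (a throwaway sketch, not encoder code):

Code:
#include <stdio.h>

/* Cost in bytes of a run of n identical solid color blocks. */
static int cost_with_dupes(int n)  { return 2 + (n - 1); } /* 1 solid + dupes */
static int cost_1byte_solid(int n) { return n; }           /* proposed format */

int main(void) {
    int n;
    for (n = 1; n <= 8; n *= 2)
        printf("%d blocks: with dupes = %d, 1-byte solids = %d\n",
               n, cost_with_dupes(n), cost_1byte_solid(n));
    return 0;
}

A run of solid blocks always comes out one byte smaller, and the buffer no longer has to survive between blocks.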

I'm not sure about using it directly in vblank, as each (non-duplicate) 64-byte block takes 1293~7247 cycles.
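
Some rough numbers on that (NTSC figures from memory, not from any spec I have in front of me):

Code:
#include <stdio.h>

/* Approximate NTSC timing: ~29780 CPU cycles per frame, ~2273 in vblank. */
#define CYCLES_PER_FRAME 29780
#define VBLANK_CYCLES     2273

int main(void) {
    printf("best case:  %.2f vblanks per block\n", 1293.0 / VBLANK_CYCLES);
    printf("worst case: %.2f vblanks per block\n", 7247.0 / VBLANK_CYCLES);
    printf("worst case: %.0f%% of a frame\n", 100.0 * 7247 / CYCLES_PER_FRAME);
    return 0;
}

So a worst-case block overflows vblank about three times over, but is only about a quarter of a frame if decoded in the main loop, which is what makes a buffer-and-copy approach attractive.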

------
I'm also glad to hear about the viability of lz4 on the NES.
Re: Donut NES CHR compression
by on (#233835)
Duplicates were important for the Action 53 data set. Games with CHR ROM often have duplicate "X" tiles to mark unused tiles, and there was an intent to interleave different 4K or 8K CHR banks in case a game uses $2000 bit 4 (background pattern table base address) or CNROM to animate a subset of the tiles.

Would it be better to make a format option to allow or disallow dupes?
Re: Donut NES CHR compression
by on (#233841)
And that was precisely the reason I held on to it for as long as I did.

But even the way I'm doing it now (comparing matches of 64 bytes) is subpar compared to pb53 (matches of 16 bytes).
I can do better, and I have plenty of codec header space to do it with (all header bytes >= 0xc0).

And even if I don't come up with better compression schemes, I have the option of just turning those X tiles into solid black ones.

Edit: didn't see the second half of your post

I have some ideas about how to handle duplicates between pages, and I think taking the duplicate block feature out of the core decoder will help with how that will work.
Re: Donut NES CHR compression
by on (#233845)
JRoatch wrote:
the requirement to retain the buffer for the "Reuse/Duplicate previous block" feature


Not an issue if you reserve the buffer bytes exclusively for Donut though, right?

JRoatch wrote:
I'm not sure about using it directly in vblank as each (non duplicate) 64 byte block takes 1293~7247 cycles


Nah. I mean fill the buffer, yield, copy it to the PPU during NMI, then resume during the next frame. It would require modifications, but the source isn't that complicated. This is basically impossible with lz4 without extra work RAM.
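
In cc65-flavored C, the shape I have in mind is roughly this (donut_decode_block is a hypothetical one-block wrapper around the real ca65 routine):

Code:
#include <stdint.h>

static uint8_t chr_buf[64];           /* the 64-byte staging buffer */
static volatile uint8_t buf_full = 0; /* set here, cleared by the NMI */

/* Hypothetical wrapper: decode one block from *src into dst,
   advancing *src past the compressed data. */
extern void donut_decode_block(const uint8_t **src, uint8_t *dst);

void stream_blocks(const uint8_t *src, uint8_t count) {
    while (count--) {
        donut_decode_block(&src, chr_buf); /* runs during the visible frame */
        buf_full = 1;
        while (buf_full) { }  /* yield: wait for the NMI to drain the buffer */
    }
}

/* NMI side (pseudocode): if buf_full is set, copy chr_buf's 64 bytes to
   $2007 -- easily done within vblank -- then clear buf_full. */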

JRoatch wrote:
I'm contemplating making a breaking change


I'm sort of curious about that. Donut isn't really an archival format. Why worry too much about breaking changes? Speaking personally, since my tools are a build dependency, I keep them in the same repository as my project (or in a git submodule, etc).
Re: Donut NES CHR compression
by on (#233847)
slembcke wrote:
Donut isn't really an archival format. Why worry too much about breaking changes?

True.
It's from a mental habit I picked up from reading too many data format documents.
Re: Donut NES CHR compression
by on (#233849)
slembcke wrote:
JRoatch wrote:
the requirement to retain the buffer for the "Reuse/Duplicate previous block" feature

Not an issue if you reserve the buffer bytes exclusively for Donut though right?

It's 64 out of 2048 bytes that the rest of your program can't use while a decompression is running in the background.
Re: Donut NES CHR compression
by on (#233851)
Sure, but you could say that about anything that uses RAM. If streaming to CHR RAM is considered important, 64 bytes of RAM might be considered a reasonable cost. It's not like it's being forced onto anybody. At this point I simply think it would be interesting if it could be done. :p
Re: Donut NES CHR compression
by on (#234458)
Anyway, I went ahead and made some breaking changes to the basic block format, and edited the first post to point to the new locations.

I also managed to complete the C version of the encoder, and I really like the 1200x speedup compared to the Python version.