Hi guys.
I've been thinking for the last few days about what a good compression scheme for NES graphics would be.
I coded a simple implementation of an RLE encoder, and testing it against some rather random data resulted in pretty bad compression, about 10%. I then tried a simple LZ encoder with a 256-byte window, and I managed to compress the same data to little more than half its original size.
However, these LZ and RLE implementations are generic compression schemes acting on bytes. I can't stop asking myself whether an algorithm that was aware of the NES graphics format could perform any better. Anyway, it's not likely that we'll find many rows of repeated pixels, so RLE is pretty much out. Should LZ be applied at the pixel level instead of the byte level? The pointers to previous data would be as big as before, but maybe longer matches would compensate for that. I don't know. Maybe compressing each 1-bit plane separately would work better?
What have you guys learnt from your own experiences with compression of NES CHR data?
The inherent NES graphics scheme already represents the data in a quite compact form. The use of 4-color tiles, a color palette, and multiple sub-palettes selected by attribute bits covering 16x16-pixel areas are all ways of reducing data. Some games using CHR RAM probably go further and represent fonts in a 1-bit-per-pixel format. As a guess, run-length encoding and Huffman might help a bit at the pixel level (i.e. working on 2-bit entities). At any other level you're adding complexity by giving the compressor groups of pixels to work on, which might hide some patterns. What kind and volume of graphic data are you aiming to compress?
Although this is something that may be useful for many future things, my aim right now is to fit 8 KB of CHR data into 2 KB. It's for that minigame compo thing.
I haven't decided on the compression scheme yet, but whichever it may be, I'll try to draw art that's somewhat "flat" for this game, to make compression more feasible.
I have some serious doubts that moving to pixel-oriented compression will help. In the LZ implementation I'm using now, 1 byte tells the length of the run (and whether it is a compressed run or an uncompressed string) and, if compressed, another byte points to the data to copy. Two bytes are necessary to describe compression. This is not much when the primary unit is a byte, but it may be pretty heavy if the primary unit is a NES pixel (2 bits). I don't know if I'd get many more matches, in which case it might be worth it.
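For illustration, the decoding side could look something like this (Python sketch; the exact bit layout here, with the top bit flagging a match and the low 7 bits carrying the length, is just one possible arrangement):

Code:
# Sketch of a decoder for the token format described above. Assumptions:
# top bit of the control byte = 1 for a back-reference, 0 for literals;
# low 7 bits = length; a back-reference is followed by a 1-byte offset
# counted backwards from the current output position.
def lz_decode(data):
    out = bytearray()
    i = 0
    while i < len(data):
        control = data[i]
        i += 1
        length = control & 0x7F
        if control & 0x80:              # compressed run: copy from history
            offset = data[i]
            i += 1
            start = len(out) - offset
            for j in range(length):     # byte-by-byte so overlapping copies work
                out.append(out[start + j])
        else:                           # uncompressed string: copy literals
            out.extend(data[i:i + length])
            i += length
    return bytes(out)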
I don't know about combining two methods of compression (Huffman after the other one), 'cause decompression would be very complicated. You'd either have to go through the first stage of decompression and, when all data is available, go through the 2nd stage (which would require a lot of memory), or run both algorithms in parallel, and that may be complicated, deciding when to switch to the other and all.
I'm kinda lost on this one...
If PAQ or WinRK can't shrink 8k down to 2k, nothing will.
Speaking of which, PAQ shrinks the 8k of Super Mario Bros.'s graphics down to 4.4k, while 7-Zip and WinRAR got it down to 4.7k.
The 6502-compatible Pucrunch got it down to 5k, and aPLib (which needs its decompressor ported) got it down to about the same size.
Since Pucrunch and aPLib are relatively fast to decompress (aPLib is faster), and both are LZ-based, it looks like LZ is the way to go, if your game is Super Mario Bros.
Yeah, compressing them down to 2k will be no picnic. I'll probably go with my LZ approach, but since it uses a small 256-byte window I'll try to keep the tiles I find similar close to each other, so they will compress better. I could do some hardcore compression by hand, instead of letting some program find what is similar, but that'd be slow and I doubt I'd benefit much from it.
If you use cartridge ram, you could do it windowless, making it decompress the whole thing into ram at once, using anything seen before as a history.
Yeah, but having access to the whole data may not be that great. The pointers to previous data would have to be larger than 1 byte, and that may have some impact on the compression ratio. On the other hand, repeated data would be more likely to occur, so the increased compression might make up for the bigger pointers.
Anyway, I've decided on something, at least for the background, which doesn't have to be so detailed. I'll use variable tile resolution. Groups of tiles will have a header indicating the vertical and horizontal resolution of the tiles that follow. This will allow me to have detailed tiles when needed, but very compact ones when resolution isn't needed. With a healthy detailed/simple ratio I could reach a great compression ratio.
I'm still figuring things out, but it will most likely be something like I described above.
I think two levels of compression is doable, as long as one of the levels is very lightly compressed.
Typically, to encode text, DTE compression can easily bring the size down to 60% of what it was just by packing letter pairs into a single byte (since you won't use all 8 bits for a single letter). You won't need RAM, since a text is fully accessible even when DTE-compressed (you just need a few more pointers instead of a real buffer). This isn't really serious compression, so you could add a "real" compression on top of it, such as LZ.
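A rough sketch of the idea in Python (the pair table and the choice of byte values $80 and up are just for illustration):

Code:
# Minimal DTE sketch: byte values below 0x80 are plain ASCII, values from
# 0x80 up index a table of common letter pairs built from the game's script.
PAIRS = [b"th", b"he", b"in", b"er", b"an"]   # hypothetical pair table

def dte_decode(data):
    out = bytearray()
    for b in data:
        if b >= 0x80:
            out.extend(PAIRS[b - 0x80])       # expand a pair code
        else:
            out.append(b)                     # plain character
    return out.decode("ascii")

def dte_encode(text):
    data = text.encode("ascii")
    out = bytearray()
    i = 0
    while i < len(data):
        for idx, pair in enumerate(PAIRS):
            if data[i:i + 2] == pair:         # greedy match against the table
                out.append(0x80 + idx)
                i += 2
                break
        else:
            out.append(data[i])
            i += 1
    return bytes(out)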
The same could happen with mixing RLE and LZ compression. RLE isn't a "serious" compression algorithm either.
Another point is that you have to define your own algorithm based on what your data looks like. Typically, if you have a metatile-based map, you'll probably use only 5 bits for the metatile index, leaving 3 bits for a possible repeat counter; this is "packed" RLE compression.
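In Python terms, something like this (the exact bit split is just an example, and it assumes at most 32 distinct metatiles):

Code:
# "Packed" RLE sketch: 5-bit metatile index in the low bits, 3-bit repeat
# count in the top bits (0 = appears once, 7 = appears 8 times).
def pack_row(metatiles):
    out = bytearray()
    i = 0
    while i < len(metatiles):
        tile = metatiles[i]
        run = 1
        while run < 8 and i + run < len(metatiles) and metatiles[i + run] == tile:
            run += 1
        out.append(((run - 1) << 5) | (tile & 0x1F))
        i += run
    return bytes(out)

def unpack_row(data):
    out = []
    for b in data:
        out.extend([b & 0x1F] * ((b >> 5) + 1))
    return out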
I don't intend to compress the level maps (at least not with general-purpose compression schemes); I'll be using indirection (screens composed of blocks that are composed of metatiles) to produce long levels with minimum size.
I'll be compressing CHR data only. I think the "variable resolution" is a very good idea. Sure, things may look a bit blocky here and there, but that's better than having a ridiculously small number of tiles to use. I'd happily sacrifice some resolution in favour of variety. And it's not like it will be too blocky; many tiles in NES games are like that already. The background does not need that much resolution. Sprites are a very different matter, though; they can get unrecognizable pretty fast.
You could use vector graphics like the old King's Quest series (before they remade it to use bitmap graphics). Devote a bunch of tiles to a background area and render the graphics at the beginning of the level.
How about using an algorithm like Scale2x to smooth low-resolution tiles? I've implemented it in Z80 before for monochrome graphics, and it took 271 bytes. No clue what the 6502 size would be.
Then again, 271 bytes is pretty big for a routine.
Dwedit wrote:
How about using an algorithm like Scale2x to smooth low-resolution tiles?
That is a good idea! Graphics would definitely be reduced to 1/4 of their size. I think this may be useful only for the background, though.
Quote:
I've implemented it in Z80 before for monochrome graphics, and it took 271 bytes.
What algorithm exactly did you use? Any pages describing it?
http://scale2x.sourceforge.net/algorithm.html
A major problem is that you have to arrange the tiles, and decide how to handle edge cases. If you do it per 8x8 tile, the results will suck. You need to group large objects together and isolate them for best results.
You can see the release thread (with Z80 source and screenshots) here:
http://www.unitedti.org/index.php?showtopic=2068
You did that using those exact rules? Your sprites look very good after scaling. I guess it works better on black and white images than with color, no?
And you're right, using it on individual tiles will suck. Maybe I could define the objects as a whole, and have the NES scale them and then convert to individual tiles. That'd be cool.
If you make your objects strictly rectangular, you can define an object with a single byte (the dimensions as nibbles), followed by raw tile data.
So a 16x16 object, which would resize to 32x32, would take 65 bytes. Although, would the overhead of unpacking the nibbles be worth the one byte saved per object? Probably yes.
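A quick sketch of that header in Python (nibble order is arbitrary here, and the dimensions are assumed to be counted in tiles):

Code:
# One-byte object header: width in tiles in the high nibble, height in tiles
# in the low nibble, followed by raw CHR tile data (16 bytes per tile).
def encode_object(width_tiles, height_tiles, tile_data):
    assert 1 <= width_tiles <= 15 and 1 <= height_tiles <= 15
    assert len(tile_data) == width_tiles * height_tiles * 16
    return bytes([(width_tiles << 4) | height_tiles]) + tile_data

def decode_object(data):
    header = data[0]
    w, h = header >> 4, header & 0x0F
    return w, h, data[1:1 + w * h * 16]

# Example: a 2x2-tile (16x16 pixel) object is 1 + 4 * 16 = 65 bytes, as above.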
How much space might the scale2x unpacker take in PRG?
As blargg has stated, the NES tile format is already "compressed". Running any other algorithm on top of that won't yield a whole lot of bang for the buck; it's comparable to using .zip on an .mp3.
That being said, I have two suggestions:
* Procedural tiles. Write 6502 code to render to tile data (lines, text, etc).
* Multiple bitplanes. Store 2 1-bit images per tile. Tricky, but worth it.
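For the second suggestion, here's a rough sketch of packing two 8x8 1-bit images into one 16-byte CHR tile (the standard layout is 8 bytes of plane 0 followed by 8 bytes of plane 1); which image shows up is then a matter of which sub-palette you assign:

Code:
def pack_tile(plane0_image, plane1_image):
    # Each argument is an 8x8 1-bit image as 8 row bytes (MSB = leftmost pixel).
    assert len(plane0_image) == 8 and len(plane1_image) == 8
    # CHR tile layout: 8 bytes of plane 0, then 8 bytes of plane 1.
    return bytes(plane0_image) + bytes(plane1_image)

def unpack_tile(chr_tile):
    return chr_tile[:8], chr_tile[8:16]

# A sub-palette where entries 1 and 3 share one color and entries 0 and 2
# share another shows only the plane-0 image; pairing entries 2 and 3 against
# 0 and 1 shows the plane-1 image instead.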
Yeah, they are already very compressed in the sense of color, but not really in the spatial sense. NES graphics usually have a lot of flat areas, and that is not exploited by the use of a lower bit depth.
I'm working on a variation of the Haar wavelet transform that is supposed to make images more compressible. It does not compress the image by itself, but a simple RLE step after the transform is enough to achieve pretty good compression.
My transform turns this:
Into this:
You have to agree that the second image is much more compressible, even with RLE. Maybe a zig-zag RLE, like in JPEG. The transformation was applied to the image only once; some images may compress better after several transformations.
The result after the transform is an image with the exact same size (byte-wise and dimension-wise) as the original, so in the worst case I'll have no compression at all. But it's likely that I'll get some compression every time. At least my data will not expand.
If I try to leave some flat areas or reduce detail a bit whenever possible, when the image goes through this filter it will be highly compressible.
And the decompression (reverse transform, actually) is very simple, and could be implemented well in 6502 assembly.
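For reference, the plain reversible integer version of the Haar step looks like this in Python. My variation is built around the same idea (pairs become averages and differences, so flat areas turn into runs of zeros for the RLE step), but it also has to keep the result the same size as the input, which this textbook form doesn't do on its own:

Code:
# Textbook reversible integer Haar step on one row (assumes an even number
# of samples). Averages go in the first half, differences in the second.
def haar_forward(row):
    averages, differences = [], []
    for a, b in zip(row[0::2], row[1::2]):
        averages.append((a + b) // 2)
        differences.append(a - b)
    return averages + differences

def haar_inverse(coeffs):
    half = len(coeffs) // 2
    row = []
    for s, d in zip(coeffs[:half], coeffs[half:]):
        a = s + (d + 1) // 2      # recover the first sample of the pair
        row.extend([a, a - d])    # then the second from the difference
    return row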
Your image above is trying to fake more colors by using the alternating vertical line pattern. It'd be more easily RLE compressible if you used more bits per pixel, representing patterns as another color. For example, you could use 4 bits per pixel to represent the even and odd column colors. 0000 would mean even and odd columns use color 0, 1001 would mean that even columns use color 10 and odd columns use color 01. Combine this with doing things at half resolution and it should compare to your wavelet example, but be simpler to implement.
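Something like this, as a sketch (the nibble order is picked arbitrarily):

Code:
# Store the image at half horizontal resolution, 4 bits per entry: the high
# 2 bits give the even column's color, the low 2 bits the odd column's.
# An alternating-column pattern of colors 2 and 1 then becomes a long run of
# the single value 0b1001, which plain RLE handles well.
def encode_column_pairs(row):          # row = list of 2-bit pixel values
    return [(row[i] << 2) | row[i + 1] for i in range(0, len(row), 2)]

def decode_column_pairs(packed):
    row = []
    for v in packed:
        row.extend([(v >> 2) & 0x3, v & 0x3])
    return row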
This shows how it's important to avoid introducing any artifacts into the data that aren't in the original. Represent as close as possible the essence of the source graphics. If you can whittle down the mental idea of what your source graphics really are, you can find a compact representation.
blargg wrote:
Your image above is trying to fake more colors by using the alternating vertical line pattern.
But that was an isolated example. I doubt that there will be many of these in the game. The top part is grass tiles (it just doesn't have the correct palette yet), and grass is usually drawn with vertical stripes, but I won't be using many "patterns to simulate colors" throughout the game.
Quote:
It'd be more easily RLE compressible if you used more bits per pixel, representing patterns as another color. For example, you could use 4 bits per pixel to represent the even and odd column colors. 0000 would mean even and odd columns use color 0, 1001 would mean that even columns use color 10 and odd columns use color 01. Combine this with doing things at half resolution and it should compare to your wavelet example, but be simpler to implement.
Yes, but I don't want to actually use lower resolution. I've been thinking A LOT about this lately. The wavelet way is the best I've come up with so far; it is very adaptive, so you can combine high and low resolution together. And it succeeded in removing the patterns. They were in fact turned into flat areas, which will greatly increase compression performance.
I don't think the pattern thing will help me in this case, but I did like the idea: being able to represent the tiles at lower resolution by also defining patterns as colors, so that the reduced resolution does not look so crappy. I liked it.
Quote:
This shows how it's important to avoid introducing any artifacts into the data that aren't in the original. Represent as close as possible the essence of the source graphics. If you can whittle down the mental idea of what your source graphics really are, you can find a compact representation.
But these lines were supposed to be in the original, it's grass. =)
I know the graphics are rushed but... it's still grass.
tepples wrote:
How much space might the scale2x unpacker take in PRG?
Did you check the page explaining it? It seems pretty simple; there are only a few simple rules to follow. Accessing individual pixels would be the real pain, in my opinion. Maybe the scaling should be handled in 8bpp, and then converted back to the NES format when uploading to CHR RAM. If done on small objects one at a time, it wouldn't be a memory (RAM) problem.
Also, checking the pixels near the borders may complicate it a bit, since the normal routine would be trying to check some nonexistent pixels, and you must trap that.
Can't say how much it would take, but I don't think it is minigame compo material. Dwedit's 1bpp implementation of it in Z80 assembly took 271 bytes (without the boundary checks?)...
I haven't had much time to work on this, so it took me a while. I came up with a scheme to hold all the graphics in only 2432 (2048 + 256 + 128) bytes. It is kind of a crazy idea, but it seems very promising.
If anyone is interested, it goes something like this:
Blocks of 16x16 pixels are individually encoded, which means there are 128 of them to fill the whole 512 tiles of the pattern tables. All graphics are drawn with rectangles (from 1x1 to 16x16 pixels in size, with any possible combination in between). Each block has a background color, which is used to clear it. Each block also has a rectangle count, which tells how many rectangles are used in that block. Each rectangle has a starting position in the block, and its width and height.
2048 bytes are used to hold the positions and sizes of 1024 rectangles, to be used across all 128 blocks. Which means the rectangles/block ratio is 8. That may seem little at first, but with the simple graphics this game requires it will be OK. Most background blocks can easily stay within this average, but most sprites will go a bit over it. The main character uses twice the average. Diagonal detail is practically forbidden, as it is hard to represent with rectangles. Maybe a future version will incorporate other shapes? The algorithm can't get too complicated, though, as the goal here is to save space.
128 bytes hold the background color and the rectangle count for each block. 256 bytes hold the colors of each of the 1024 rectangles. These bytes are interleaved with level map data (which doesn't use the top bit), so they don't have much impact.
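For the curious, the decoding boils down to something like this (Python sketch; the exact bit layouts here are assumptions, since only the byte counts are fixed above):

Code:
# Assumed layouts: per block, one byte with the 2-bit background color in the
# top bits and the rectangle count in the low 6 bits; per rectangle, one byte
# with x/y in nibbles, one byte with (width-1)/(height-1) in nibbles, and a
# 2-bit color packed four to a byte.
def decode_blocks(block_info, rect_geometry, rect_colors, num_blocks=128):
    blocks = []
    rect = 0                                       # running index into the rectangle pool
    for b in range(num_blocks):
        bg = block_info[b] >> 6
        count = block_info[b] & 0x3F
        pixels = [[bg] * 16 for _ in range(16)]    # clear block to background color
        for _ in range(count):
            pos, size = rect_geometry[2 * rect], rect_geometry[2 * rect + 1]
            x, y = pos >> 4, pos & 0x0F
            w, h = (size >> 4) + 1, (size & 0x0F) + 1
            color = (rect_colors[rect // 4] >> ((rect % 4) * 2)) & 0x3
            for row in range(y, min(y + h, 16)):
                for col in range(x, min(x + w, 16)):
                    pixels[row][col] = color
            rect += 1
        blocks.append(pixels)
    return blocks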
A preliminary version of the decoding algorithm is ready and working (although I should prepare some more meaningful test data; I'm using just a couple of random rectangles for now), and it takes between 2 and 3 seconds to decode all blocks. Which means that is the time to wait before anything shows up on the screen. I'll try to improve the algorithm a bit, but I don't think I'll get much of a speed increase. Do you think this is too much time? Are 3 seconds enough for people to think something is not right?
I thought about flashing the screen or something, but I'd rather use something that doesn't mess with $2006 (no palette changes), since I want to render all tiles at once, setting $2006 only once, at the beginning. I don't want to interrupt that. Maybe the color (de)emphasis bits? Anyway, I'll think of a simple effect to let the player know the game is alive.
tokumaru wrote:
Maybe the color (de)emphasis bits? Anyway, I'll think of a simple effect to let the player know the game is alive.
You'd probably be better off using the monochrome mode bit. Turns 0F black into 00 grey -- a very noticeable difference visually. The emphasis bits are much more subtle.
Disch wrote:
You'd probably be better off using the monochrome mode bit. Turns 0F black into 00 grey -- a very noticeable difference visually.
hum... but then I'd be flashing 2 colors only... kinda boring! =)
Quote:
The emphasis bits are much more subtle.
I wish I could do some sort of smoother effect, like cycling the emphasis bits, giving an idea of animation, rather than flashing. But I guess you are right, it will be much too subtle.
Heh, I didn't follow the whole thread, but do you really think that a *LOSSY* graphics compression could be good on a system with such low resolution as the NES? If you're talking about a 1kb contest or something, okay, but for a 'real' NES game, wouldn't that just limit NES graphics even more than they are already limited?
You are 100% right. In the past few days I figured, what good would it be to limit myself even more... for a silly contest? So I decided I'd not lower the quality of a game just so it would fit in a smaller space.
I'm not good at making decisions. And when I do make them, I seem to pick the wrong ones... =)
Quote:
I'm not good at making decisions. And when I do make them, I seem to pick the wrong ones... =)
Hey, I think your scheme was perfectly suited for one of those contests. It'd be cool to have a game's graphics fit just right with a compact decompressor.
In the case of the contest, maybe, but this isn't real NES development, since your size is limited to a ridiculous amount of data.
In a normal NES game, having lossy compression would be a real shame, since the graphics themselves are already stored in a "compact" way, with 2x2-tile attributes and 2-plane 8x8 tiles.
Bregalad wrote:
this isn't real NES development, since your size is limited to a ridiculous amount of data.
Unless you want to clone WarioWare on the NES. Imagine putting two hundred minigames into a 512 KB Game Pak.
The idea was OK, but it was not optimal. The size of the decompression code was around a full page (256 bytes), if I remember correctly. And 18 bits per colored rectangle (2 for color, 8 for position and 8 for dimensions) wasn't that good either.
It is a strange battle to fight: if you have little space, that means you need compression. To better compress the data, the compression scheme becomes more complex. As it becomes more complex, the decompression code gets bigger and starts eating ROM space. So, as the data shrinks, the decompressor grows, and we have to find a healthy balance between the two. And that's not always so simple.
I plan on doing more work on compression schemes, especially for graphics, but for now this is all I have, and it will remain like this since I'm not using it for my current project. If anyone wants to see the code as is, let me know and I'll look for it.
Compression should only be used with big, big chunks of data, like level maps or text in an RPG. Small things (such as a standard tileset) don't need to be compressed, because it slows down processing time and game development time, and doesn't save that much space.
Forgive me, but GFX compression IS very important. The graphics used by a level can eat as much space as its map, depending on the game. I'm designing a game where all (about 12) level maps TOGETHER use only 8KB (they are somewhat compressed). That's the same as 1 full tileset (512 tiles).
If you are using CHR-ROM, of course, that's not even a discussion; you won't be compressing the tiles. But if you were to use CHR-RAM, you wouldn't want all those tiles stealing huge chunks of your PRG/DATA space.
On other gaming systems, such as the SNES or the Megadrive/Genesis, which must use RAM for tilesets, games almost always compress their tiles.
I sometimes consider GFX compression more important than level map compression, as it has less impact on performance. You'll most likely decompress only once, before the level starts, while levels must be partially decoded all the time, as the player moves, due to memory restrictions.
I'm in favour of encoding the levels in more compact ways, yes, but not with general compression, 'cause that makes it harder to have random access to parts of the map without huge tables of pointers, which kind of goes against the purpose of compression.
The SNES and Megadrive/Genesis have 4BPP tiles, while the NES has 2BPP tiles. The SNES holds up to 1024 BG tiles and 1024 sprite tiles in a single tileset, while the NES has only 256 of each. The space taken by the tiles loaded in a single BG or sprite tileset is 4kb for the NES and 32kb for the SNES, so 8 times larger.
So, the space taken by SNES graphics is much larger, and compressing them becomes more significant.
If you have a large, large amount of tiles in a game, yeah, it's meaningful to compress them, but not for a small game where you only have a few tilesets.
You often have more maps than tilesets (sounds logical, right?), so you'd normally compress maps over tilesets.
Though, one big advantage of compressing tilesets over maps is that you don't have to buffer the whole chunk to RAM; you can just write it to $2007 directly in the decompression routine. This of course goes to hell if you want to animate tiles.
tokumaru wrote:
If you have little space, that means you need compression. To better compress the data, the compression scheme becomes more complex. As it becomes more complex, the decompression code gets bigger and starts eating ROM space. So, as the data shrinks, the decompressor grows, and we have to find a healthy balance between the two. And that's not always so simple.
In fact, it's not even computable. See Kolmogorov complexity.
tokumaru wrote:
Forgive me, but GFX compression IS very important.
If so, why are there only three North American NES games using TGROM (MMC3 + CHR RAM), two using TQROM (MMC3 + CHR ROM + CHR RAM), and no other MMC3 boards that have CHR RAM?
tokumaru wrote:
The graphics used by a level can eat as much space as its map, depending on the game. I'm designing a game where all (about 12) level maps TOGETHER use only 8KB (they are somewhat compressed). That's the same as 1 full tileset (512 tiles).
Super Mario Bros. used about 8 KB for maps, but it shared the tileset among all maps.
tokumaru wrote:
You'll most likely decompress [tiles] only once, before the level starts, while levels must be partially decoded all the time, as the player moves, due to memory restrictions.
Not if half the tiles on the screen are animating, as is the case in Super Mario Bros. 2 or Super Mario Bros. 3, as Bregalad points out. And not if you have one continuous map for the whole overworld, as is the case in a few notable RPGs.
Bregalad wrote:
So, the space taken by SNES graphics is much larger
But the cost in dollars per megabit of replicating a program was greater in the NES era.
I've always thought about separating the bitplanes and compressing them separately with some bit-level RLE. I never tested it to see whether it would even work, but if anyone has time... maybe? Because IMO getting graphics down even by an easy 20% would be nice.
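Roughly, what I have in mind is something like this (Python sketch; the run-length cap of 255 is just for simplicity):

Code:
# Split CHR data into its two bitplanes (each 16-byte tile is 8 bytes of
# plane 0 followed by 8 bytes of plane 1), then run-length encode the bit
# stream of each plane separately.
def split_planes(chr_data):
    plane0, plane1 = bytearray(), bytearray()
    for t in range(0, len(chr_data), 16):
        plane0.extend(chr_data[t:t + 8])
        plane1.extend(chr_data[t + 8:t + 16])
    return plane0, plane1

def rle_bits(plane):
    bits = []
    for byte in plane:
        for i in range(7, -1, -1):            # MSB first, as in CHR rows
            bits.append((byte >> i) & 1)
    runs = []                                 # list of (bit value, run length)
    i = 0
    while i < len(bits):
        j = i
        while j < len(bits) and bits[j] == bits[i] and j - i < 255:
            j += 1
        runs.append((bits[i], j - i))
        i = j
    return runs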
You need to handle repeated patterns in order to get significant compression on the NES; RLE will never have outstanding performance, especially when the graphics are broken up into 8x8-pixel squares (there will never be really long runs of repeated bits).
Did we really need to necrobump this thread? There have been other threads where better techniques have come up.
Not to mention I made a thread about a tool I made that allows you to apply a dozen different compression techniques to the same data and compare the results...
Though all of my algorithms are less efficient than tokumaru's, as his is optimized for graphics.