As I've said before, this game has masterpiece compression for gfx, so I was lucky enough to expand the ROM itself and overwrite the decompression code so it could load plain data. As a result, I've got
this. As you can see, this patch changes the font loading part, so the uncompressed font is now at 0x4D in the ROM, and I began to repaint it, of course... And BANG! Game crashes! I couldn't believe it - I thought it was impossible. Arright - just open the ROM in Tile Molester or something and delete the blank red (second color) tile (I mean, make it black) - you'll see. So, as far as I can say, the problem is somewhere near this code:
Code:
$E6CF:2C 02 20 BIT $2002 = #$20
$E6D2:50 FB BVC $E6CF
$E6D4:A9 81 LDA #$81; this command is never executed
Strange thing, because the only difference is in the GFX data loaded...
That code is waiting for a sprite 0 hit. If you changed the graphics then the right pixels may no longer be overlapping so the hit never happens. You will have to check out the original game to see which pixels can't be changed.
What's so special about the compression used in this game?
Quote:
That code is waiting for a sprite 0 hit.
D'oh! I've read about it a hundred times! Thanks a lot!
Quote:
What's so special about the compression used in this game?
Dunno... I've never seen anything like this before. I tried to ask 'bout it
here. There I've also included the IDA listing. I just couldn't figure out what it was by myself, so I've also asked at a couple of forums - and nothing... Plus, even if I could write an unpacker, it would be much harder (impossible? =)) to write the actual packer.
griever wrote:
Dunno... I've never seen anything like this before. I tried to ask 'bout it
here.
Yeah, I remember that...
Quote:
There I've also included the IDA listing. I just couldn't figure out what it was by myself, so I've also asked at a couple of forums - and nothing... Plus, even if I could write an unpacker, it would be much harder (impossible? =)) to write the actual packer.
Compression is something that really interests me... but I don't have the patience to trace the decompression code... it seems like too much work! =) But, just for curiosity, since you seem to think the compression in this game is very efficient, can you give me an idea of how good it is? Do you have any compression ratios to share?
Quote:
Do you have any compression ratios to share?
Well... As far as I know from Code/Data Logger:
start screen GFX: Comp - 0x584 bytes; UnComp - 0x9D0 bytes; Ratio - 1.78
font GFX: Comp - 0x13F bytes; UnComp - 0x290 bytes; Ratio - 2.06
Don't know... Is it good or bad?
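Those ratios are easy to double-check (ratio = uncompressed size / compressed size); a quick sketch:

```python
# Sanity-checking the ratios above: ratio = uncompressed / compressed.
start_screen = 0x9D0 / 0x584   # 2512 / 1412 bytes
font = 0x290 / 0x13F           # 656 / 319 bytes
print(round(start_screen, 2))  # 1.78
print(round(font, 2))          # 2.06
```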
I spent some time yesterday trying to figure something of this compression scheme out... but that's just too boring! I saw they had a function to load 1 bit from the input data, and another to read 2 bits. And depending on the values read they'd read something from a table... but I got bored easily, and thought I was just wasting precious time I could be spending on my game.
griever wrote:
start screen GFX: Comp - 0x584 bytes; UnComp - 0x9D0 bytes; Ratio - 1.78
I think
my LZ compressor did better than that most of the time.
EDIT: I think I was wrong about this. It usually did worse than that. Of course, it really depends on the data...
Quote:
font GFX: Comp - 0x13F bytes; UnComp - 0x290 bytes; Ratio - 2.06
Fonts should be easy to compress because usually they use only one of the bitplanes. I'm not sure about Bee52, but with such a good compression ratio I'd guess this is the case.
EDIT: I just noticed the link is dead. If you are interested I can upload again, as soon as I find the exact file...
Quote:
I saw they had a function to load 1 bit from the input data, and another to read 2 bits. And depending on the values read they'd read something from a table...
That's the VERY beginning of the hell =)
Quote:
I think my LZ compressor did better than that most of the time.
Maybe. BTW, how about the Kirby's Adventure tile map compression scheme? Have you seen it? Well, it's easy
Quote:
It uses seven different kinds of compression techniques, which include the common LZ-Copy, RLE, and uncompressed data methods. The remaining types are simply variations of LZ-Copy and RLE.
But quite effective (sorry, but I don't have ratios). And I've never seen a working packer for this game.
Quote:
I just noticed the link is dead. If you are interested I can upload again, as soon as I find the exact file...
Why not? It's interesting...
Fact: RLE is nothing more than the special case of LZ with a 1-byte window.
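A quick sketch of that fact (function name is mine, just for illustration): an LZ match with offset 1 repeatedly copies the byte just written, which is exactly RLE.

```python
def lz_copy(out: bytearray, offset: int, length: int) -> None:
    """Standard LZ77-style overlapping copy, done byte by byte."""
    for _ in range(length):
        out.append(out[-offset])

out = bytearray(b"A")
lz_copy(out, offset=1, length=5)   # a 1-byte window: pure RLE
print(out)                         # bytearray(b'AAAAAA')
```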
griever wrote:
That's the VERY beginning of the hell =)
Yeah, I gave up because it was getting hard to keep track of all the branches... maybe looking at a log of the executed instructions would make it easier to understand... as we could see exactly what happens for each bit read...
Quote:
Maybe. BTW, how about the Kirby's Adventure tile map compression scheme? Have you seen it? Well, it's easy
Quote:
It uses seven different kinds of compression techniques, which include the common LZ-Copy, RLE, and uncompressed data methods. The remaining types are simply variations of LZ-Copy and RLE.
Sounds like what's used in SMW and many other SNES games (and even the Pokemon games on the Game Boy, I think)... is this it? This was pretty well documented. This seems hard to encode efficiently, as you'd have to try every type of compression for each byte and see what works best.
The compression schemes used in the Sonic games are also very interesting (the Mega Drive ones, as the SMS ones just seem to use simple RLE). They have 3 or 4 different formats, each used for a different type of data (level maps, graphics, music, etc). The most interesting one is some sort of LZSS optimized to encode small runs of repeated bytes, IIRC. Gotta check that out again.
Quote:
Why not? It's interesting...
I'll upload it to a more permanent place in a few minutes... The compression ratio for the CHR I tried was about 1.5 to 1.6... The decompressor written in 6502 assembly will output the data to $2007, but I should probably make a version that outputs to regular RAM.
tepples wrote:
Fact: RLE is nothing more than the special case of LZ with a 1-byte window.
You are right. The cool thing about LZ is that, being a step up from RLE, it has built-in RLE support.
Get the LZSS compressor/decompressor
here. The host does not allow ZIP files, so change the extension from JPG to ZIP.
I tested it a bit, and I'm not so sure it performs that well anymore... This is just a generic compression algorithm, so it'd be no surprise if it didn't do very well with NES tiles.
I'm still thinking of a way to make better use of the 2D redundancy present in the tiles... generic compression algorithms are 1D, and will ignore that.
If I have the time I'll take another look at that Bee52 code.
Quote:
Get the LZSS compressor/decompressor here.
Arright, thanks a lot.
Quote:
If I have the time I'll take another look at that Bee52 code.
OK, also you can try
this (sorry for the shitty colors, but I have net problems, so I just had to compress the image).
griever wrote:
OK, also you can try
this (sorry for the shitty colors, but I have net problems, so I just had to compress the image).
What is this? I only see a very tall image with a white-ish rectangle at the top...
Quote:
What is this?
Ah, sorry... As I said, I have net problems =)
Here it is
griever wrote:
Cool! Is everything here? This sure makes things easier!
Quote:
Is everything here?
That's the whole Decompress subroutine, but some little subroutines (like 'load_New_gfx_byte') are not displayed.
Bump! For a good reason. I figured the compression out. It was a lot of work, I've been working on it for like, 4 days (with breaks for real life things, of course). It isn't so difficult once you get past all the bit-reading and spaghetti branching and try to describe it in plain text.
Here are my notes on the compression format used for tiles in the game "Bee 52" by Codemasters. This game was used during the whole reverse engineering process, but apparently the format is used in some of their other games as well. I checked the game "Big Nose Freaks Out" and it appears to contain the exact same decompression logic; it was just assembled to a different location, and the variables are at different locations as well.
[Overview]
I don't really know what to call this compression scheme. From one angle it looks like it's dictionary-based, because it defines common pixel sequences that can be reused, but it also offers the possibility to deviate from the pre-defined sequences by using variable-length codes (longer codes for values that deviate more), something typical of entropy encoding. So all I can say for sure is that it's a hybrid of some sort, unlike anything I've seen before.
The compression scheme is based on pixels (2-bit values) and how likely it is for a color to be followed by another. The compressed stream contains information on how to configure a set of tables that will be used during decompression, until a new set of tables is defined, which can happen before the decompression of any tile. The tables have to be built before any tiles are decompressed, and after that they are reconfigured only when necessary (in theory, it's possible to not set up the tables, but I doubt anyone would decompress several blocks of data using the same statistics).
The first byte in the compressed stream is the tile count, which means that a maximum of 256 tiles are allowed per compressed block. The rest of the stream contains only 1-bit and 2-bit values. Following the tile count is a bit that selects between updating the tables (0) or decoding a tile (1).
[Updating Tables]
There are 4 tables, which I'll call A, B, C and D, each one is 4 bytes long (because there are 4 possible pixel values). Table A defines what happens after each of the 4 possible pixel values is used (index 0 tells what happens when color 0 is used, index 1 tells what happens when color 1 is used, and so on). Each pixel, after drawn, is used as an index into this table. The possible values for this table are:
%00: The color will always be followed by itself. The same pixel will be used until the end of the row.
%01: The color is always followed by a single specific color, which will be stored in table B.
%10: The color can be followed by 2 other colors, in tables B or C (depends on a bit from the stream).
%11: The color can be followed by all others, and they may be in tables B, C or D (depends on bit(s) from the stream).
Tables B, C and D are used to hold up to 3 colors (depends on the mode selected) that might follow the original one. The colors that follow the original ones are represented differently depending on the original color. The following is a list of the codes of each color depending on the original one:
original %00:
%01 = %1
%10 = %01
%11 = %00
original %01:
%00 = %1
%10 = %01
%11 = %00
original %10:
%00 = %1
%01 = %01
%11 = %00
original %11:
%00 = %1
%01 = %01
%10 = %00
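The pattern in the lists above is regular: for each original color, the three remaining colors, taken in ascending order, get the codes %1, %01 and %00. A small sketch (function name is mine):

```python
def follower_codes(original: int) -> dict:
    """Variable-length codes for the colors that may follow `original`."""
    # The three colors other than `original`, in ascending order, get the
    # codes "1", "01" and "00", matching the tables listed above.
    others = [c for c in range(4) if c != original]
    return dict(zip(others, ["1", "01", "00"]))

print(follower_codes(0b00))   # {1: '1', 2: '01', 3: '00'}
print(follower_codes(0b11))   # {0: '1', 1: '01', 2: '00'}
```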
The tables are updated quite frequently, it seems. The initial tileset (for the Codemasters logo) contains 157 tiles, and the tables are modified 31 times during their decoding. So, on average, each set of tables was used for about 5 tiles. I wonder what criteria an encoder might use to pick the best times to reload the tables.
[Decoding Tiles]
The tiles are processed one row at a time. A bit from the stream selects between drawing a new row (0) or using the previous row again. Reusing rows is great for vertical redundancy. Rows are drawn from left to right, and 2 bits are read from the stream in order to draw the first pixel. After this pixel is drawn, it's used as an index into table A. If the code read from the table is %00 (means that the color is always followed by itself), the same pixel will be repeated for the whole row. If it's not %00, a bit from the stream selects between repeating the previous pixel (1) or processing a new one (0). Repeating the previous pixel using a single bit allows for runs of the same color to be represented in about half their size (a single bit per pixel is used instead of 2).
Code %01 means that the color in question is always followed by a single specific color (unless the previously mentioned bit causes it to repeat, which is always an option no matter the color), to be fetched from table B (using the old pixel as an index). Code %10 means that the color in question can be followed by 2 other colors. A bit from the stream selects between fetching the new pixel from table B (0) or C (1). Code %11 means that the color in question can be followed by any of the other 3 colors. A 1 from the stream means that table B should be used. A 0 followed by another 0 selects table C, while a 0 followed by a 1 selects table D. The conclusion is that a color that is followed by fewer colors will be encoded with fewer bits.
The decompressor always checks for the end of the row and for the end of the tile, so that it can decide between continuing to process the current one or starting to process the next one.
[Sample Data]
The following is a very simple example of a valid compressed block of tiles. It doesn't use most of the format's features, but serves as a way to illustrate a bit better how it works. The following is valid data for drawing 4 plain tiles, one of each color:
$04: block contains 4 tiles;
0: update decompression tables;
00: pixels of color 3 should be repeated for the whole row;
00: pixels of color 2 should be repeated for the whole row;
00: pixels of color 1 should be repeated for the whole row;
00: pixels of color 0 should be repeated for the whole row;
0: new row;
00: first pixel is color 0;
1111111: repeat row 7 times;
1: decode tile;
0: new row;
01: first pixel is color 1;
1111111: repeat row 7 times;
1: decode tile;
0: new row;
10: first pixel is color 2;
1111111: repeat row 7 times;
1: decode tile;
0: new row;
11: first pixel is color 3;
1111111: repeat row 7 times;
Here are the bytes that are formed by the bits above, in case anyone wants to try it out for themselves (you can patch the game, replacing the original data with this, or simply put this somewhere in RAM and have the game decompress it from there):
$04
%00000000 ;$00
%00001111 ;$0F
%11110011 ;$F3
%11111110 ;$FE
%10111111 ;$BF
%11011111 ;$DF
%11110000 ;$F0
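The sample can be checked with a small Python model of the decoder described above. It assumes bits are read MSB-first within each byte (which is what makes the bytes line up with the bit listing), and it only models mode %00, which is all the sample uses; modes %01-%11 would additionally read follower colors into tables B/C/D. All names here are mine, not from the game:

```python
def bit_reader(data):
    """Yield bits MSB-first from a byte sequence."""
    for byte in data:
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1

def read_bits(bits, n):
    value = 0
    for _ in range(n):
        value = (value << 1) | next(bits)
    return value

def decompress(data):
    tile_count = data[0]                # first byte: number of tiles
    bits = bit_reader(data[1:])
    mode = [0] * 4                      # table A: one 2-bit mode per color
    tiles = []
    row = [0] * 8
    for _ in range(tile_count):
        # selector bit: 0 = update tables (then decode), 1 = decode directly
        if next(bits) == 0:
            for color in (3, 2, 1, 0):  # order taken from the sample listing
                mode[color] = read_bits(bits, 2)
                # modes 1-3 would also read follower colors for B/C/D here
        tile = []
        for _ in range(8):
            if next(bits) == 0:         # 0 = draw a new row, 1 = repeat last
                pixel = read_bits(bits, 2)
                row = [pixel]
                while len(row) < 8:
                    if mode[pixel] != 0:
                        raise NotImplementedError("only mode %00 modeled")
                    row.append(pixel)   # mode %00: repeat to end of row
            tile.append(list(row))
        tiles.append(tile)
    return tiles

sample = bytes([0x04, 0x00, 0x0F, 0xF3, 0xFE, 0xBF, 0xDF, 0xF0])
tiles = decompress(sample)
print(len(tiles))   # 4
print(all(p == i for i, t in enumerate(tiles) for r in t for p in r))  # True
```

Running this on the sample bytes yields 4 tiles, each filled entirely with one color (0 through 3), exactly as the bit listing describes.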
[Conclusion]
This compression scheme yields a respectable amount of compression. I can't say what part of it is most effective (in fact, to me it seems that in many cases representing complex rows of pixels with it might cause a great deal of expansion!), but the truth is that I couldn't achieve comparable compression ratios with any other scheme I tried.
I'm not sure if the decompressor is as well coded as it could be. I found it very hard to follow, and by reading my description of the format it seems like it didn't need to be so messy. On the other hand, the messy code might even have been an attempt to protect the format.
Anyway, I decided to study this because I really wanted to know what kind of tile compression would always beat my best one by about 8%. I'm not sure if I'll work on this any further (like coding an encoder), as I wouldn't feel well using something I didn't invent (or at least figured out myself, even if it was invented before). I will probably consider applying some of the ideas to my own format, but that's all.
Ah! This compression scheme already drove me to despair! What a great job, Tokumaru. Thanks a lot - I have no idea how you unscrambled this mess, but that's impressive. Need to comment out my graph, I guess.
Thanks again!
griever wrote:
I have no idea how you unscrambled this mess
I'm not used to reverse engineering code but I was so curious that I decided to go through with it.
I basically rewrote the program several times, each time increasing the level of abstraction. The first time was basically comments on the raw assembly program. From there I replaced longer sequences of code with smaller sentences describing what was done. Then I wrote some indented pseudo-code in order to follow it better. Lastly, I made a schematic of what was done for each bit read, and it finally became clear what sequence of bits would do what. During this process, the function of every variable became clear as well.
It was a lot of work, and I can safely say I'm not looking forward to doing anything like this any time soon! =)
I'm really curious about how an optimum encoder for this scheme would work though... but I shouldn't think about this too much, or I'll end up trying when I don't really have the time! O_o
As I understand your description, it's more of a Markov-chain algorithm than anything else.
Each color is associated with a mode number in table A, which equals the number of distinct colors that can follow this color on a scanline in this block of tiles. When you compress tiles, you count the number of distinct colors that can appear to the right of this color, disregarding duplicate rows and pixels that are the same. The most common color goes in table B.
For each non-repeated row:
mode = A[color]
- mode == 0: Repeat this color to end of line.
- mode == 1: Read a bit for each pixel to determine repeat or B:
  1: color
  0: B[color]
- mode == 2: Read a bit to determine repeat or B/C, then possibly another bit to determine B or C:
  1: color
  00: B[color]
  01: C[color]
- mode == 3: Read a bit to determine repeat or B/C/D, then one or two bits to determine which table:
  1: color
  01: B[color]
  000: C[color]
  001: D[color]
But in fact, for mode 3, tables C and D could have been computed from B, where C[i] is the lower-numbered color that isn't i or B[i], and D[i] is the remaining color. That would save 8 bits per block of tiles (no need to store D) but take slightly longer to decode.
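That derivation is easy to sketch (function name is mine): C[i] is the lowest color that is neither i nor B[i], and D[i] is the one color left over.

```python
def derive_c_d(b):
    """Rebuild tables C and D from table B, per the rule above."""
    c, d = [0] * 4, [0] * 4
    for i in range(4):
        # the two colors that are neither i nor B[i], in ascending order
        rest = [x for x in range(4) if x != i and x != b[i]]
        c[i], d[i] = rest[0], rest[1]
    return c, d

print(derive_c_d([1, 0, 0, 0]))   # ([2, 2, 1, 1], [3, 3, 3, 2])
```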
To determine where to place the block boundaries, you could just punt and use a Pucrunch-style algorithm to represent encoding as a search over the graph of all possible bitstreams for the most efficient path from 0 to the decoded string.
Your understanding of the compression is correct.
I can understand how to find the optimum blocks in LZ compression, because the matches have a well defined length (meaning that if you select the longest ones you'll end up with fewer blocks and better compression), but in Bee 52's compression the blocks can be as long as the whole stream.
To me it's a kind of paradox: in order to see how well a group of tiles compresses, you must have the tables ready. However, in order to build the tables you must gather statistics from the group of tiles that will be compressed with them. See? Since you don't know how long the block is, you can't build the optimum tables for that block, and without the tables you can't verify how well the block compresses.
Also, defining a new set of tables has a cost itself, which should be considered when setting a new boundary.
I'm sure there is an answer, because the data is successfully compressed with this scheme, I'm just not seeing it.
EDIT: I guess it would make sense to do something like this:
1. Try to start a new block at every single tile, with all blocks ending at the last tile, storing the cost of encoding each of these blocks;
2. Looking from the last block backwards, select the one with the best compression ratio to be the last block;
3. Repeat the process, with the new last tile being the one before the previously selected block;
4. After all the best blocks have been selected, output them in the correct order.
Did you have something like this in mind, tepples?
I had in mind something more like the
A* algorithm. After each tile, there are two paths: start a new block here or don't start a new block here. Then g(x) is the number of bits from the tiles encoded so far, and h(x) is an underestimate of the remaining data size, such as 1 bit per remaining pixel.
EDIT: I just thought up a new way to encode tables B and C for colors that use mode 2: have the two bits represent the one color that isn't i, B[i], or C[i]. Now we're down to two bytes for a new block: table A, and the information needed to generate tables B, C, and D. The more tightly the codec packs the tables, the less costly starting a new block can be.
Mode 0 can only be used for a color that appears exclusively at the right side of a tile. Most often, this would be the background color for a large area. But without large areas of repeated rows or mode 0 pixels, I can't see how this codec could pack a CHR smaller than about 60 percent. Breaking the 50 percent barrier would take some work.
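Both the backwards selection described earlier and a graph search of this kind boil down to a shortest-path problem over tile boundaries. A generic dynamic-programming sketch, assuming a hypothetical cost(start, end) function that returns the bit cost of encoding tiles start..end-1 as one block (including its table overhead):

```python
def best_blocks(n_tiles, cost):
    """Minimum-cost partition of n_tiles tiles into contiguous blocks."""
    INF = float("inf")
    best = [0] + [INF] * n_tiles   # best[i]: min cost of the first i tiles
    cut = [0] * (n_tiles + 1)      # cut[i]: start of the block ending at i
    for end in range(1, n_tiles + 1):
        for start in range(end):
            c = best[start] + cost(start, end)
            if c < best[end]:
                best[end], cut[end] = c, start
    blocks, i = [], n_tiles        # walk the cuts back to recover the blocks
    while i > 0:
        blocks.append((cut[i], i))
        i = cut[i]
    return best[n_tiles], blocks[::-1]

# Toy cost: a fixed 10-bit table overhead plus a quadratic per-block term,
# so there is a real trade-off between few big blocks and many small ones.
total, blocks = best_blocks(6, lambda s, e: 10 + (e - s) ** 2)
print(total, blocks)   # 38 [(0, 3), (3, 6)]
```

With a real cost function (bits to encode the tiles plus bits to encode the tables) this finds the optimal block boundaries exactly, at the price of evaluating cost for every (start, end) pair; A* with an admissible estimate would prune that search.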
tepples wrote:
Now we're down to two bytes for a new block: table A, and the information needed to generate tables B, C, and D.
Why 2 bytes? I think it will often be less. 1 byte is always needed, because we need to define how many colors can follow each color, so 2 bits x 4 colors = 1 byte. After that, it varies depending on whether colors are followed by any other or not, and if the followers will be represented in 1 or 2 bits.
Like you cleverly observed, in all modes only one color must be defined and the others (if any) can be discovered based on it.
.1 color follows the original one: the color in question is specified;
.2 colors follow the original one: the color that never follows the original one is specified, so the remaining 2 are used (the order doesn't matter, as both will be encoded using the same number of bits, but the encoder and the decoder must agree on an order);
.3 colors follow the original one: the color that will be encoded with fewer bits is specified, and the remaining 2 are used in whatever order the encoder and decoder agreed on (it also doesn't matter, because both will be encoded using the same number of bits);
It's really great that it's possible to encode the tables in 2 bytes or less. I believe it will very often be less, because there are always only 3 colors to pick from (the original can't be repeated), and one of them can be specified with a single bit.
I'm feeling awfully tempted to make an encoder for this. Seeing how compact the tables can be makes me want to see how much better than the original compression this can be.
Hello, guys. I've been away from the scene for some time, but my programming skills have significantly increased.
Could you tell me whether an unpacker/packer for this compression scheme has been written yet? Have any more efficient algorithms been found for tile compression?
griever wrote:
Hello, guys. I've been away from the scene for some time, but my programming skills have significantly increased.
Could you tell me whether an unpacker/packer for this compression scheme has been written yet? Have any more efficient algorithms been found for tile compression?
In general, see
"Tile compression" on the wiki. Tokumaru made an (incompatible)
modification to this codec that makes the frequency tables smaller, allowing a compressor to better adapt to changing statistics, and Bisqwit made an
encoder for this revised format.
Thank you, Tepples!
I guess I still have a small chance to go down in history and write my own codec, as a tool compatible with the original Codemasters format does not exist.
This could help me do a couple of small translations into my mother tongue.
OK,
here is my tool for the original compression scheme. I have also updated the list of Codemasters games which used this algorithm.
My tool generates optimal chunks and always does better than the original compressor. But, of course, tokumaru's and Bisqwit's tools have higher compression ratios, as they are not tied to the original packing scheme.