RLE tool for NES

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
View original topic
RLE tool for NES
by on (#88304)
Too tired to type it all up, all information is in this download somewhere:

http://www.mediafire.com/?glfsch484xu185y

by on (#88311)
Why aren't you using zero page addressing for all the RAM you have?

For instance, CompressFlag is at $0301. This means it takes 4 cycles each time that cmp is needed. If it was on the zero page it would take 3 cycles. Let's say there are 18 "runs" of data. You lose 18 cycles for absolutely no reason. You're not tight on RAM, so why? The same with BIT DecompressSettings. Every time you do that, you lose a cycle because DecompressSettings is not on the zero page.

And in case you do decide to move those things to zero page, unfortunately nesasm doesn't give you the gains automatically.

Code:
;You have to put < before each zero page RAM label that you want to use zero page addressing for
;So do
BIT <DecompressSettings
;rather than
BIT DecompressSettings
;And when that byte is on the zero page, it will be faster.
CMP <CompressFlag
INC <DecompressOutput
;etc

Unless you are tight on zero page RAM you should do the above. It not only makes things faster, it also means your code takes up less space.

Generally, you have to choose between making your code faster or making it smaller. If you want small code, you have the right idea. But if you want fast code, avoid doing anything in a loop that doesn't need to be done.

Will the value of DecompressSettings EVER change during this subroutine? It doesn't look like it. So checking it is a waste of time for EVERY SINGLE run.

If I were you, I would check DecompressSettings immediately after you jsr to the subroutine, and then branch to a loop that never increments if the minus bit is clear, and always does if the bit is set. This means you have to duplicate some code, but it also means you only check DecompressSettings ONCE per routine.

In fact, you could ignore DecompressSettings entirely, and just make two separate routines. Think about it this way: The user has to set beforehand which one they're using. So why not save the writes required to do that, and just have them jsr to a different routine?

This plan saves the cycles for the write (6 cycles), and 4 cycles for every time BIT DecompressSettings would have happened.

This is a little strange in your C code.

You know you can just do
Code:
while(1)

Right?
Or even just
Code:
while(AlwaysOne)

There's no need for
Code:
while(AlwaysOne == 1)


Still, not bad for a generic solution. I'll mess with it a bit more and see if I can find anything else.

by on (#88312)
Yeah, I thought about everything you said before. I didn't make two routines because it just seemed like a waste, but it will make it faster and honestly I can make a bunch of programs and the user can select. I'll separate the two routines, it does make sense. And I don't know how everyone else uses their ZP, but I only use it for pointers and not much else, I put them in normal RAM because it seemed file, but yes adding the ZP addressing labels in and then moving them to ZP would work too. And I didn't even tink about the INC ZP for the other addresses, yikes! I'll fix that for this version, and then later I'll write the other decompressed programs possibly tonight, or tomorrow or sometimes soon. It should be cake, but I've been programming this for 2 days and don't want to mess with it anymore, heh. :) Thanks for the feedback! :D And I guess I could have just done that for the while loop, but I felt safer just comparing it to a number, in case I had a case where it needed set to 0 and not have a break to exit it in an if inside it.

ETA: Took ZP variables and put them into ZP mode and then moved more to ZP (All except a scratchpad byte for Y in the middle of the run decompression program, move it to ZP if you wish) and that's about it. I'll get those seperated routines done later. Anyone see anything else that can be improved on? :D

*Use the newer version. Details in first post.*

by on (#88313)
FWIW, I used a similar methodology in the FF2e intro code for Neo Demiforce. ROM space was extremely limited, and the title/intro screen (raw) was too large, so I RLE'd it. I did the compression of it by hand (yes really!), and wrote the decompression routine. It saved something like 200 bytes, which was enough.
LZMPi
by on (#88314)
I recently needed to compress something (though my need went away before I actually used the decompressor), and I stumbled upon LZMPi compress, an LZ variant. It was written for the Commodore 64, but I ported it for the NES.

It needs 64 bytes of RAM for history buffer tables ($6C0..$6FF), 35 bytes for run-time generated code (because NES's program memory is read-only), and 10…12 bytes of memory for other variables ($D0..$FF).

Source code:
http://bisqwit.iki.fi/src/lzmpi_nes.zip
Includes a compressor in C++, a decompressor in C++, and a decompressor in 6502 assembler.
The assembler program can be assembled with nescom, but I have not tested yet whether it actually runs.
The decompressor, with default settings, takes up 0x1F9 bytes of ROM space (505 decimal).

by on (#88315)
After an hour, I'm back to this tab. :D Lots of reading on that LZ-type compression, seems simple but damn effective for stuff like text in games, really nice to see. I'll get those 6502 decompressing programs written later, but I may dabble with that LZ-type compression some too, that seems like something else to learn for sure. :D Good stuff. If anybody else has sourced compression tools they'd like to share, post them here, I don't mind. I think a topic dedicated to just compression tools for games is a good idea, if anyone has more post them. :)

by on (#88316)
So when do I start porting Aplib to the 6502? I've already gotten it working on the ARM and Z80.

by on (#88317)
If you want to, go for it? I don't want anybody porting the tools for no reason or because there's a thread like this, just maybe post them if you have them done.

by on (#88318)
aPLib is already ported to 6502.

by on (#88369)
New update posted soon in first post.

by on (#88447)
Wooow.
All this talk about compression made me think I should eventually do my ultimate compression tools project discussed here.

The problem with compression is not only the complexity of the compression itself, but the complexity to choose which compression to use.
Various algorithms use various RAM amounts, are more or less CPU intensive, compress more or less the data and of course the size of the decompression code itself have to be taken into account because if it is too long it will kill the purpose of compressing data ! This is especially a problem with the Huffman tree in Huffman compression.

Another thing to take into account is if you want to compress different source of data in the same game using the same algorithm, so that you only store the decompression code once.

by on (#88457)
Bregalad wrote:
Another thing to take into account is if you want to compress different source of data in the same game using the same algorithm, so that you only store the decompression code once.

Unless the decompression code needs to access data in multiple banks, and your mapper uses 32 KiB bankswitching, and you don't have enough available RAM in $0300-$07FF to copy the code into RAM every time. Then you need to repeat the code in every bank that has compressed data. I've almost run into this with one of my multicart menu projects; I ended up moving things around so that I could load about half a kilobyte of decompression code into RAM.