Using the stack to avoid using a register

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
by on (#85037)
Maybe I'm late to the party, but I only just thought of this.

Code:
   lda [dress4],y;Get 16x16's number
   
   sta attribpalette,x
   inx
   
   dec reserved1;If this is the last 16x16 tile to update
   beq scrollxpalettes.end;branch


can become
Code:
   lda [dress4],y;Get 16x16's number
   
   pha
   
   dex;X now holds the count. If this is the last 16x16 tile to update
   beq scrollxpalettes.end;branch

It saves 7 cycles per iteration if attribpalette is not in zero page: 3 cycles because dec reserved1 becomes dex now that X is free to hold the counter, 2 cycles for dropping the inx, and 2 cycles because pha is faster than sta absolute,x.
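A rough tally, using the standard 6502 timings and assuming reserved1 is in zero page (the lda and beq are left out because they cost the same in both versions):

Code:
   ;Original
   sta attribpalette,x;5 cycles (absolute,x store)
   inx;2 cycles
   dec reserved1;5 cycles (zero page)
   ;= 12 cycles

   ;Stack version
   pha;3 cycles
   dex;2 cycles
   ;= 5 cycles, so 7 saved each time through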

Naturally you have to save the stack pointer before you start and restore it when you're done, and go through the data backwards if you also want to read it back from the stack, but... just. Ugh. Don't know why I hadn't thought of it sooner.
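A minimal sketch of the save and restore, not tested in a real project. saved_sp is a hypothetical zero page variable, and $01BF is just a made-up address for the top of the buffer:

Code:
   tsx;X = real stack pointer
   stx saved_sp;saved_sp is a hypothetical zero page variable
   ldx #$BF;Hypothetical: buffer tops out at $01BF
   txs;Pushes now land in the buffer

   ;...loop goes here. Each pha stores at $0100+S and then
   ;decrements S, so the buffer fills from high to low.
   ;No jsr/rts in here, those would go through the buffer too.

   ldx saved_sp
   txs;Real stack pointer is back before any rts

Since pha works downward, the first byte pushed ends up at the highest address, and pla hands the bytes back last-pushed-first.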

I guess I'm posting in case anyone else never thought of it. I am thoroughly upset, because I'd need to rewrite everything to take advantage of this and there are so many places in my code where it's useful.

by on (#85038)
I guess that the interesting thing about using part of page 1 for update buffers is that you can access them using stack operations. That helps a lot, because decoding map data for example requires significant use of the index registers, so having both of them available is certainly helpful.

You have to be careful though, in case an interrupt happens while you're writing to these buffers... For example, if you are about to write the last byte of a row of tiles and an NMI happens, the CPU will push 3 bytes (the return address and the status flags) onto the buffer, possibly corrupting 2 bytes of whatever comes after it (or before it, depending on how you look at it).
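To make that concrete, here is roughly what happens, assuming a buffer filled downward with pha whose last slot is at a made-up address of $0180:

Code:
   ;S = $80, about to do the final pha into $0180
   ;NMI fires here:
   ;   $0180 <- return address high byte (this slot gets overwritten by the final pha anyway)
   ;   $017F <- return address low byte (outside the buffer, clobbered)
   ;   $017E <- status flags (outside the buffer, clobbered)
   ;   ...plus anything the NMI handler itself pushes
   ;rti leaves S = $80 again, then the final pha fills $0180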

by on (#85051)
I get what you're saying. That's only the case if the buffer is very near overflowing into something useful, right?

For instance, say I'm using $0100-$017F of the stack page for attribute byte mirrors and I allocate $0180-$01C0 to some buffer? I'm glad you said something because, you're right, if the interrupt happened while I was writing there it would have messed up some attributes, and it would have taken me a while to catch why.

I now plan to use something like $0184-$01C4 and of course save and restore the stack pointer very carefully. I should have enough space for what I'm doing, with wiggle room between the two areas for the return bytes left on the stack. Is there anything else I should know?