I've been busy trimming the fat from my 8-way scrolling engine for The Legends of Owlia, and I made at least one really interesting discovery that lets the engine perform no worse than a 2-way scrolling engine. I searched for prior discussion of this but couldn't find any.
The scroller in its current form is capable of uploading both a row and a column (nametable and attribute table data for each) to the PPU in a single frame. This covers the case where the camera follows the hero along a perfect diagonal, where a row and a column both line up with the edges of the screen on the same frame, every 8 pixels moved diagonally (every 16 for attributes).
However, if the camera follows the hero along any of the other possible diagonal lines, where the row and column boundaries are out of phase with each other, the row and column updates alternate instead of landing on the same frame. That halves the CPU time needed to decode the next frame's update, and also frees up a lot of time in an otherwise very tightly packed vblank.
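To make the difference between the two cases concrete, here's a rough sketch in Python (not the engine's actual code). It assumes the camera moves exactly 1 pixel per frame on each axis and only looks at 8-pixel nametable tile boundaries, ignoring attributes. The names `TILE` and `count_updates` are made up for this illustration.

```python
TILE = 8  # nametable tile size in pixels


def count_updates(x0, y0, dx, dy, frames):
    """Count frames that need both a row and a column versus only one of them.

    Assumes the camera moves dx, dy pixels per frame and that a new column
    (or row) is needed whenever x (or y) enters a different 8-pixel tile.
    """
    both = single = 0
    x, y = x0, y0
    for _ in range(frames):
        nx, ny = x + dx, y + dy
        need_col = nx // TILE != x // TILE
        need_row = ny // TILE != y // TILE
        if need_col and need_row:
            both += 1
        elif need_col or need_row:
            single += 1
        x, y = nx, ny
    return both, single


# Perfect diagonal: x and y share the same sub-tile offset.
print(count_updates(0, 0, 1, 1, 64))
# Any other diagonal: y starts 4 pixels out of phase with x.
print(count_updates(0, 4, 1, 1, 64))
```

Under these assumptions the first call reports every update as a double (row plus column on the same frame), while the second reports the same total amount of work spread out as single updates.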
What I discovered is that if you detect you're starting to move along one of the perfect diagonals (call them "forbidden diagonals," since that's where the row and column updates coincide), you only need to correct it on the very first frame, by maybe a pixel or two, and then you can continue scrolling diagonally. I find that this "bump" is unnoticeable because it is so minor. That's a small price to pay for eliminating worst-case performance!
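Here's one way the first-frame correction could look, again as a hedged Python sketch rather than the engine's real code. The assumptions are mine: the camera moves 1 pixel per frame on each axis, the "bump" is done by holding y back for one frame (the post doesn't say which axis or direction gets corrected), and `frames_to_crossing` and `scroll` are hypothetical names.

```python
TILE = 8  # nametable tile size in pixels


def frames_to_crossing(pos, step):
    """Frames until pos enters a new tile, moving step (+1 or -1) px per frame."""
    sub = pos % TILE
    return TILE - sub if step > 0 else sub + 1


def scroll(x, y, dx, dy, frames, nudge=True):
    """Scroll diagonally; return (x, y, frames needing a row AND a column)."""
    double_updates = 0
    first = True
    for _ in range(frames):
        step_y = dy
        # Assumed correction: on the first diagonal frame only, skip the
        # y step if both axes would reach a tile boundary on the same frame.
        if nudge and first and frames_to_crossing(x, dx) == frames_to_crossing(y, dy):
            step_y = 0
        first = False
        nx, ny = x + dx, y + step_y
        need_col = nx // TILE != x // TILE
        need_row = ny // TILE != y // TILE
        if need_col and need_row:
            double_updates += 1
        x, y = nx, ny
    return x, y, double_updates


print(scroll(0, 0, 1, 1, 64, nudge=False))  # forbidden diagonal, uncorrected
print(scroll(0, 0, 1, 1, 64, nudge=True))   # same move with the 1-pixel bump
```

In this model the camera ends up one pixel short on y, and after that the row and column boundaries stay out of phase for as long as the diagonal move continues. A real engine with variable scroll speeds would need a more careful check than comparing the two countdowns.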
I'll be happy to upload a before-and-after video, if anyone is interested, using the monochrome bit trick to show the CPU time used by the scrolling portion of my engine. I'm guessing I'm not the first to exploit this "trick," if you can call it that, but I thought I'd share anyway. 8-way scrolling used to seem so intimidating, but now I find it doesn't need to be any worse performance-wise than 2-way scrolling! I was quite shocked when I realized this.