Can't seem to beat the vblank clock

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
Can't seem to beat the vblank clock
by on (#137271)
I am attempting to update the nametable in 75 places and upload the sprite OAM, all before vblank runs out, but it looks like too large a task. What is the best way to do it? Should I split the task up over a few frames?
Re: Can't seem to beat the vblank clock
by on (#137272)
Post some code? Post what you're trying to do? (e.g. is this for scrolling new tiles in during gameplay, maybe scrolling + status bar update, displaying a static title screen, something else? What I think is the best way depends on that.)

If it's a during-gameplay thing, yes, it might be wise to split the updates across frames. If your game only scrolls in one direction (and your camera moves less than 8 pixels a frame), you'd only need to update 32 tiles + 8 attribute bytes max (assuming no status bar or destructible levels).

My game can update a row and a column (so at least 78 bytes) as well as the sprites in a single frame. I use a stack-based approach with unrolled loops for Y attributes. Here's a post that contains an old version of the NMI routine I used to do that. But even that wouldn't be fast enough to update 75 randomish places, and it's probably overkill unless you're scrolling in two directions at once. If that's not helpful, could you be more specific about your goal?
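For reference, the 32 + 8 and 78-byte figures can be reproduced with a little arithmetic. This is only a sketch in Python, not NES code; it assumes the standard 32x30-tile nametable with one attribute byte per 4x4 tiles, and the helper name strip_bytes is made up for illustration.

```python
# Sketch: bytes needed to refresh one full row or column of a nametable.
# Assumes a 32x30-tile nametable and one attribute byte per 4x4-tile area.
NAMETABLE_COLS = 32
NAMETABLE_ROWS = 30
TILES_PER_ATTR = 4


def strip_bytes(tile_count: int) -> int:
    """Tile bytes plus the attribute bytes that the strip touches."""
    attr_bytes = -(-tile_count // TILES_PER_ATTR)  # ceiling division
    return tile_count + attr_bytes


if __name__ == "__main__":
    row = strip_bytes(NAMETABLE_COLS)     # 32 tiles + 8 attribute bytes
    column = strip_bytes(NAMETABLE_ROWS)  # 30 tiles + 8 attribute bytes
    print(row, column, row + column)
```

The address writes to $2006 are not counted here, so the real number of register writes is a little higher.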
Re: Can't seem to beat the vblank clock
by on (#137273)
Any address change adds another 2 bytes you need to write (to $2006), so worst case (75 tiles, each at its own address) you're looking to push 75 × 3 = 225 bytes to the PPU, which is quite a lot to do in one frame. If it's organized in strips, though, it should be easier.
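To see how much the layout matters, here is a small Python sketch of that byte count. The function name ppu_bytes and the "3 strips of 25" case are assumptions chosen for illustration, not something from the original question.

```python
# Sketch: register writes needed for a set of nametable updates, assuming
# every contiguous run of tiles needs exactly one address change.
ADDRESS_BYTES = 2  # each address change is two writes to $2006


def ppu_bytes(tile_count: int, run_count: int) -> int:
    """One data byte per tile plus two address bytes per contiguous run."""
    if not 0 < run_count <= tile_count:
        raise ValueError("need between 1 and tile_count runs")
    return tile_count + ADDRESS_BYTES * run_count


if __name__ == "__main__":
    print(ppu_bytes(75, 75))  # 75 scattered tiles, one address change each
    print(ppu_bytes(75, 3))   # the same 75 tiles laid out as 3 strips of 25
```

Scattered, the 75 tiles cost 225 writes; grouped into three strips they cost 81.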
Re: Can't seem to beat the vblank clock
by on (#137274)
Like most things on platforms with limited power such as the NES, there isn't a global answer that solves all cases. Each case has its own optimal solution, which is why we really need to know what you're trying to accomplish with this.

By far the simplest solution would be to split the update across multiple VBlanks, but depending on the effect you're trying to pull off, it might not look good. Background animations, for example, might look weird if updated in "waves". Bucky O'Hare does update large background effects progressively, though, and it doesn't look so bad.
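The splitting idea amounts to a queue that is drained a little each frame. The Python sketch below only models the scheduling; the 80-byte per-frame budget is an arbitrary assumption, and drain_frame and schedule are hypothetical names. On the NES the equivalent would be an update buffer that the NMI handler consumes.

```python
from collections import deque

# Sketch: spread queued single-tile updates over several frames.
BYTES_PER_UPDATE = 3  # 2 address bytes to $2006 + 1 data byte to $2007


def drain_frame(queue: deque, byte_budget: int) -> list:
    """Pop as many queued updates as fit in one frame's byte budget."""
    if byte_budget < BYTES_PER_UPDATE:
        raise ValueError("budget too small for even one update")
    batch = []
    while queue and (len(batch) + 1) * BYTES_PER_UPDATE <= byte_budget:
        batch.append(queue.popleft())
    return batch


def schedule(updates, byte_budget: int) -> list:
    """Split all updates into per-frame batches, preserving order."""
    queue = deque(updates)
    frames = []
    while queue:
        frames.append(drain_frame(queue, byte_budget))
    return frames


if __name__ == "__main__":
    frames = schedule(range(75), byte_budget=80)
    print([len(batch) for batch in frames])
```

With that assumed budget, 75 scattered updates take three frames (26, 26 and 23 updates).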

If you absolutely need everything to change at the same time, you have to look into optimizing the loop that does the copying. At 75 iterations, every cycle you save per iteration adds up to a significant chunk of CPU time. If you can't optimize the loop any further, you should consider unrolling it, either partially (e.g. instead of copying 1 byte per iteration, copy 15) or completely (a big chain of copy instructions with no loop at all), to reduce the overhead of maintaining a counter and branching back to repeat the loop.
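A rough cycle model shows what unrolling buys. This Python sketch assumes a copy of lda abs,X / sta $2007 (4 + 4 cycles), a simple loop tail of inx / cpx #imm / bne, and a partially unrolled tail of txa / clc / adc #imm / tax / cpx #imm / bne. Those instruction choices are assumptions for illustration; page-crossing penalties and address setup are ignored.

```python
# Sketch: 6502 cycle cost of copying n bytes to the PPU, looped vs. unrolled.
COPY = 4 + 4                             # lda abs,X + sta $2007
SIMPLE_OVERHEAD = 2 + 2 + 3              # inx, cpx #imm, bne (taken)
STRIDE_OVERHEAD = 2 + 2 + 2 + 2 + 2 + 3  # txa, clc, adc #imm, tax, cpx #imm, bne


def loop_cycles(n_bytes: int, unroll: int, overhead: int) -> int:
    """Cycles for a loop that copies `unroll` bytes per iteration."""
    if unroll <= 0 or n_bytes % unroll:
        raise ValueError("n_bytes must be a positive multiple of unroll")
    iterations = n_bytes // unroll
    # The final bne falls through, costing 2 cycles instead of 3.
    return n_bytes * COPY + iterations * overhead - 1


def unrolled_cycles(n_bytes: int) -> int:
    """Cycles for a fully unrolled chain of copies with no loop at all."""
    return n_bytes * COPY


if __name__ == "__main__":
    print(loop_cycles(75, 1, SIMPLE_OVERHEAD))   # one byte per iteration
    print(loop_cycles(75, 15, STRIDE_OVERHEAD))  # 15 bytes per iteration
    print(unrolled_cycles(75))                   # no loop
```

Under these assumptions the three variants cost 1124, 664 and 600 cycles, so even a partial unroll recovers most of the loop overhead.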
Re: Can't seem to beat the vblank clock
by on (#137298)
In my NES port of Chu Chu Rocket, I was doing 192 tile updates in a single vblank (plus a couple of extra scanlines). This was 8 strips of 24 tiles with unrolled code doing each strip.

My code looks something like this:

Code:
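; Assumed on entry: X = offset of the first strip in nameTableBuffer,
; Y = index of the first row in the address tables, PPU address increment = 1.
DrawNtLoopTop:
   lda ntRowHighAddress,Y   ; set the PPU address for this strip, high byte first
   sta PPUADDR
   lda ntRowLowAddress,Y
   sta PPUADDR
   iny                      ; next row in the address tables
DrawNt24: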
   lda nameTableBuffer+0,X
   sta PPUDATA
   lda nameTableBuffer+1,X
   sta PPUDATA
   lda nameTableBuffer+2,X
   sta PPUDATA
   lda nameTableBuffer+3,X
   sta PPUDATA
   lda nameTableBuffer+4,X
   sta PPUDATA
   lda nameTableBuffer+5,X
   sta PPUDATA
   lda nameTableBuffer+6,X
   sta PPUDATA
   lda nameTableBuffer+7,X
   sta PPUDATA
   lda nameTableBuffer+8,X
   sta PPUDATA
   lda nameTableBuffer+9,X
   sta PPUDATA
   lda nameTableBuffer+10,X
   sta PPUDATA
   lda nameTableBuffer+11,X
   sta PPUDATA
   lda nameTableBuffer+12,X
   sta PPUDATA
   lda nameTableBuffer+13,X
   sta PPUDATA
   lda nameTableBuffer+14,X
   sta PPUDATA
   lda nameTableBuffer+15,X
   sta PPUDATA
   lda nameTableBuffer+16,X
   sta PPUDATA
   lda nameTableBuffer+17,X
   sta PPUDATA
   lda nameTableBuffer+18,X
   sta PPUDATA
   lda nameTableBuffer+19,X
   sta PPUDATA
   lda nameTableBuffer+20,X
   sta PPUDATA
   lda nameTableBuffer+21,X
   sta PPUDATA
   lda nameTableBuffer+22,X
   sta PPUDATA
   lda nameTableBuffer+23,X
   sta PPUDATA

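   txa                      ; advance X to the next 24-byte strip
   clc
   adc #24
   tax
   cpx drawLimit            ; stop once X reaches drawLimit
   bcs DrawNtTopReturn
   jmp DrawNtLoopTop        ; jmp because the loop body is too long for a branch
DrawNtTopReturn:
   rts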
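As a rough check on the timing of a routine shaped like this, the Python sketch below adds up standard 6502 cycle counts for 8 strips. It assumes drawLimit lives in zero page, no page-crossing penalties, an NTSC vblank of 20 scanlines (341 PPU dots each, 3 dots per CPU cycle), and a 513-cycle OAM DMA. NMI entry and register saving are not counted, and whether the original routine shared its vblank with a sprite DMA is not stated.

```python
# Sketch: cycle estimate for an unrolled 24-tile strip routine on NTSC.
SETUP = 4 + 4 + 4 + 4 + 2      # lda abs,Y / sta / lda abs,Y / sta / iny
COPY = 24 * (4 + 4)            # 24 pairs of lda abs,X + sta PPUDATA
ADVANCE = 2 + 2 + 2 + 2 + 3    # txa, clc, adc #24, tax, cpx drawLimit (zero page)
LOOP_BACK = 2 + 3              # bcs not taken, jmp
EXIT = 3 + 6                   # bcs taken, rts

VBLANK_CYCLES = 20 * 341 // 3  # about 2273 CPU cycles
OAM_DMA_CYCLES = 513           # can be 514 depending on alignment


def routine_cycles(strips: int) -> int:
    """Total cycles for `strips` passes through the loop, including the rts."""
    if strips < 1:
        raise ValueError("need at least one strip")
    per_strip = SETUP + COPY + ADVANCE
    return strips * per_strip + (strips - 1) * LOOP_BACK + EXIT


if __name__ == "__main__":
    used = routine_cycles(8)
    print(used, VBLANK_CYCLES, used + OAM_DMA_CYCLES)
```

Under these assumptions the 8 strips take about 1812 cycles of the roughly 2273 available, and adding a sprite DMA pushes the total to about 2325, slightly past the end of vblank.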
Re: Can't seem to beat the vblank clock
by on (#137299)
If you're using ca65, you can say the same thing a little more succinctly with the .repeat directive:

Code:
DrawNtLoopTop:
   lda ntRowHighAddress,Y
   sta PPUADDR
   lda ntRowLowAddress,Y
   sta PPUADDR
   iny
DrawNt24:
   .repeat 24, I            ; emits 24 lda/sta pairs, with I counting 0 to 23
      lda nameTableBuffer+I,X
      sta PPUDATA
   .endrepeat
   txa
   clc
   adc #24
   tax
   cpx drawLimit
   bcs DrawNtTopReturn
   jmp DrawNtLoopTop
DrawNtTopReturn:
   rts