PPU frame timing

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
PPU frame timing
by on (#138790)
I ran into a strange problem with the emulator that I am developing: to make games work, I need to let the CPU run at least 250 times the number of cycles that it is supposed to run during the VBlank period. If I let the CPU run the expected number of cycles during VBlank, then the game logic runs correctly, but the graphics end up all messed up. It's almost as if it did not have enough time to do the tile memory copies or name table updates. By artificially extending VBlank with an extra 250 non-rendered lines, all the graphics work perfectly.

Any idea what could cause a strange effect like this?
Re: PPU frame timing
by on (#138793)
The number you quoted (250) is suspiciously close to the height of the PPU's picture. Is the CPU running at all between the end of one vblank and the start of the next? It's not like the ZX81 or Commodore 128 where the PPU monopolizes the bus throughout the picture. Games expect 20 lines of vblank (during which the CPU runs), 1 line of pre-render (during which the CPU runs), 240 lines of picture (during which the CPU runs), and 1 line of post-render (during which the CPU runs).
Re: PPU frame timing
by on (#138794)
tepples wrote:
The number you quoted (250) is suspiciously close to the height of the PPU's picture. Is the CPU running at all between the end of one vblank and the start of the next? It's not like the ZX81 or Commodore 128 where the PPU monopolizes the bus throughout the picture. Games expect 20 lines of vblank (during which the CPU runs), 1 line of pre-render (during which the CPU runs), 240 lines of picture (during which the CPU runs), and 1 line of post-render (during which the CPU runs).


My CPU runs 341/3 cycles for every scanline, and I do have the 262 scanlines that you described. VBlank is 20 of them. But, as mentioned, for some odd reason, I have to pretend that VBlank is 5000 scanlines.
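For reference, the numbers in this exchange can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming the NTSC figures of 341 PPU dots per scanline and 3 PPU dots per CPU cycle; the names are made up for illustration, and it ignores the dot that is skipped on odd frames when rendering is enabled.

```python
from fractions import Fraction

# Assumed NTSC figures: 341 PPU dots per scanline, 3 PPU dots per CPU cycle.
DOTS_PER_SCANLINE = 341
DOTS_PER_CPU_CYCLE = 3

# Scanline counts as listed earlier in the thread.
SCANLINES = {"vblank": 20, "pre_render": 1, "picture": 240, "post_render": 1}


def cpu_cycles(scanlines):
    """CPU cycles elapsed over the given number of scanlines, as an exact fraction."""
    return Fraction(scanlines * DOTS_PER_SCANLINE, DOTS_PER_CPU_CYCLE)


total_scanlines = sum(SCANLINES.values())
cycles_per_frame = cpu_cycles(total_scanlines)
cycles_in_vblank = cpu_cycles(SCANLINES["vblank"])

print(total_scanlines)          # scanlines in one frame
print(float(cycles_per_frame))  # CPU cycles in one frame
print(float(cycles_in_vblank))  # CPU cycles available during VBlank
```

Keeping the counts as fractions avoids losing the third of a cycle that each scanline leaves over.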

My CPU uses a table containing the number of cycles for each opcode (and the table is adjusted depending on memory lookups, page boundaries, branches taken, etc.). Is there some multiplier that I am missing or something?

Edit: If an NES program has a lot of memory to copy, can it break up the copy across multiple frames with the help of some PPU flags? That is, maybe my emulator is failing because it thinks the copy is completed after the first frame of what is supposed to be a multi-frame copy.
Re: PPU frame timing
by on (#138823)
Maybe you have a counter which increments when it shouldn't and has to run until it overflows back to the proper value?
Re: PPU frame timing
by on (#138829)
Also, some instructions take variable time to complete, such as lda nnnn,x if it crosses a page. If you don't emulate the extra cycles, Battletoads will have a shaky status bar.
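As a rough illustration of the page-crossing penalty, here is a minimal Python sketch. It assumes the usual timing of 4 cycles for `lda nnnn,x` plus 1 extra cycle when the indexed address lands in a different 256-byte page; the function names are hypothetical and not taken from any particular emulator.

```python
def page_crossed(base, index):
    """True if base + index falls in a different 256-byte page than base."""
    return (base & 0xFF00) != ((base + index) & 0xFF00)


def lda_absolute_x_cycles(base, x):
    """Cycle count for LDA absolute,X: 4 cycles, plus 1 if a page is crossed."""
    return 4 + (1 if page_crossed(base, x) else 0)


print(lda_absolute_x_cycles(0x1200, 0x20))  # stays within page $12
print(lda_absolute_x_cycles(0x12F0, 0x20))  # crosses into page $13
```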
Re: PPU frame timing
by on (#138833)
mkwong98 wrote:
Maybe you have a counter which increments when it shouldn't and has to run until it overflows back to the proper value?


None that I can find :)

Dwedit wrote:
Also, some instructions take variable time to complete, such as lda nnnn,x if it crosses a page. If you don't emulate the extra cycles, Battletoads will have a shaky status bar.


I am computing the varying time required to complete each individual instruction. I have a table that is adjusted accordingly.

On average, an instruction takes about 4 cycles. With 29780 cycles per frame, at most about 7445 instructions get to run. When copying bulk data, what techniques are used to split the process across multiple frames? I suspect that I am missing something that allows the game to do that properly.
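The estimate above can be reproduced with a short Python sketch. The 4-cycle average is the poster's own rough figure, and the VBlank budget assumes 20 scanlines of 341/3 CPU cycles each; the constant names are only illustrative.

```python
CYCLES_PER_FRAME = 29780        # figure quoted in the post
AVG_CYCLES_PER_INSTRUCTION = 4  # the poster's rough average, not a measured value

# Assumed VBlank length: 20 scanlines at 341/3 CPU cycles each, rounded down.
CYCLES_PER_VBLANK = 20 * 341 // 3

instructions_per_frame = CYCLES_PER_FRAME // AVG_CYCLES_PER_INSTRUCTION
instructions_per_vblank = CYCLES_PER_VBLANK // AVG_CYCLES_PER_INSTRUCTION

print(instructions_per_frame)
print(instructions_per_vblank)
```

Under these assumptions only a few hundred instructions fit inside VBlank itself, which is why the size of a bulk copy matters here.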
Re: PPU frame timing
by on (#138838)
I think I finally solved it! When the PPU is disabled (i.e. when sprite rendering and background rendering are both off), the V and T registers mentioned in Loopy's scrolling doc should not be updated. Otherwise, bad things happen.
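A minimal Python sketch of that kind of guard is shown below. It reads the fix as applying to the automatic scroll updates made while rendering (the coarse X increment and the horizontal copy from T to V); CPU writes through $2005/$2006 are outside this sketch. The class and method names are hypothetical. The mask bits assume the usual PPUMASK layout, where bit 3 shows the background and bit 4 shows sprites.

```python
PPUMASK_SHOW_BACKGROUND = 0x08  # assumed PPUMASK bit 3
PPUMASK_SHOW_SPRITES = 0x10     # assumed PPUMASK bit 4


class ScrollRegs:
    """Hypothetical holder for the V and T scroll registers (15 bits each)."""

    def __init__(self):
        self.v = 0
        self.t = 0
        self.mask = 0  # last value written to PPUMASK

    def rendering_enabled(self):
        """Rendering counts as enabled if background or sprites are shown."""
        return bool(self.mask & (PPUMASK_SHOW_BACKGROUND | PPUMASK_SHOW_SPRITES))

    def increment_coarse_x(self):
        """Advance V by one tile horizontally, only while rendering is enabled."""
        if not self.rendering_enabled():
            return
        if (self.v & 0x001F) == 31:
            self.v &= ~0x001F  # wrap coarse X back to 0
            self.v ^= 0x0400   # switch horizontal nametable
        else:
            self.v += 1

    def copy_horizontal(self):
        """Copy the horizontal bits of T into V, only while rendering is enabled."""
        if not self.rendering_enabled():
            return
        self.v = ((self.v & ~0x041F) | (self.t & 0x041F)) & 0x7FFF


regs = ScrollRegs()
regs.v = 0x001F
regs.increment_coarse_x()  # rendering off: V is left alone
print(hex(regs.v))
regs.mask = PPUMASK_SHOW_BACKGROUND
regs.increment_coarse_x()  # rendering on: coarse X wraps, nametable bit flips
print(hex(regs.v))
```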