PPU timing problem - NESdev BBS

PPU timing problem
by Snaer on 2010-02-19 (#56787)

I was running some test roms on my emulator and ran into a problem.
When I ran blargg's "2.vbl_timing.nes", I got error 8, which is:
"Reading 1 PPU clock before VBL should suppress setting."
I don't know how to check this because 1 PPU clock is shorter than a CPU clock.
For example, if I used LDA absolute for the read, that would take 4 CPU cycles = 12 PPU cycles.
Would I then need to know exactly which of those 12 cycles the read occurs on?
Do I have to check inside the LDA function whether VBL is going to be set on the next PPU clock?
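
To make the arithmetic in the question concrete, here is a minimal Python sketch. It assumes the 3:1 PPU-to-CPU clock ratio implied by "4 CPU cycles = 12 PPU cycles"; the function name and the 0-based PPU clock numbering are hypothetical choices for illustration. It only shows which PPU clocks overlap a given CPU cycle and makes no claim about where inside that CPU cycle the bus read lands.

```python
PPU_PER_CPU = 3  # assumption: ratio implied by "4 CPU cycles = 12 PPU cycles"


def ppu_span_of_cpu_cycle(start_ppu, cpu_cycle):
    """Return (first, last) PPU clock covered by one CPU cycle.

    start_ppu -- PPU clock at which the instruction's first cycle begins
    cpu_cycle -- 1-based cycle number within the instruction
    """
    first = start_ppu + (cpu_cycle - 1) * PPU_PER_CPU
    return first, first + PPU_PER_CPU - 1


# The four CPU cycles of a 4-cycle instruction that starts at PPU clock 0.
spans = [ppu_span_of_cpu_cycle(0, n) for n in range(1, 5)]
```

With these assumptions, the fourth CPU cycle covers the last three of the twelve PPU clocks, so only that cycle needs to be compared against the clock on which VBL is set.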

I would also like to know exactly at what time the VBL should be set/reset and the NMI occurs.

Sorry if this has already been asked here before.

by Zepper on 2010-02-19 (#56792)

- Here's the solution I've found:
http://nesdev.com/bbs/viewtopic.php?t=3887

- It's all there. Feel free to ask. ^_^;;

by Snaer on 2010-02-19 (#56795)

hmm, maybe I'm just stupid but it's still unclear to me.

"If the VBlank flag rises at cycle 341"
Is there any "cycle 341"? If one scanline is 341 cycles then the last one should be 340, right?

and what does "2nd or 3rd byte of a LDA $2002" mean?

There are also different instructions that can read from $2002, and they have different lengths (in clock cycles). That might make it more complicated? Or maybe I just misunderstand the whole thing.

by Dwedit on 2010-02-19 (#56802)

I just fake it and assume that any instruction reading or writing a hardware register is one of the 4 CPU cycle instructions, and that the read/write will happen at the end of the fourth cycle of that instruction. That's good enough to pass blargg's tests, since they just use 16-bit absolute addressing on the registers in question.

Obviously that's incorrect if it's one of the indexed instructions, but I've never seen any game access the PPU or mapper IRQ registers with an indexed instruction.

Here's something to think about when you emulate events that happen in the future, such as Interrupts and the like:

I'll use something on the Gameboy as an example, but I'm sure there are also NES games that do things like this.

The ROM repeatedly reads from the Interrupt Flag to see if an interrupt happens. Then the interrupt would happen. The CPU enters the interrupt handler, which acknowledges (clears) the interrupt. Then it returns to the code that's polling for interrupts.

On hardware, the interrupt would happen some time after the Interrupt Flag read instruction starts to execute. So when it reads the flag, it will read that an interrupt has taken place.

On some kinds of emulators, this would be an infinite loop: the emulator reads the emulated interrupt flag, then the remaining cycles run out, then it triggers the interrupt event, so the game code polling the interrupt flag never sees anything happen.

If you don't want to simulate the CPU cycle by cycle and find when in time the reads and writes are within each instruction, there's something else you can do.

Have a 'Pre-Event' happen before the real event happens.

For example, you have an 'Interrupt is going to happen in 4 CPU cycles' event happen before the Interrupt event in your emulator code. The "Interrupt is going to happen soon" event will change the emulator's Interrupt Flag value, so the instruction that polls it will finally get a different value.
This also lets you create correct "point of no return" situations, where you have a write to a "Disable Interrupts" register executing, but there are 3 CPU cycles left. The naive approach would run the code that disables interrupts, then see that there are -1 cycles remaining and try to trigger the interrupt. But we just disabled interrupts, so that won't happen. That's not correct.
You hit the "point of no return" 4 CPU cycles before the interrupt happens, so the code to disable interrupts will still run, but the interrupt will still happen as well.
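
A minimal Python sketch of the polling case described above, comparing the naive approach with a pre-event. Everything here is a simplified assumption for illustration: the polling loop is modelled as a single 4-cycle instruction per iteration, the interrupt handler runs instantly and only acknowledges the flag, and `run_poll_loop`, `LEAD` and the other names are hypothetical.

```python
LEAD = 4  # CPU cycles; the "4 CPU cycles" lead time from the post above


def run_poll_loop(irq_time, use_pre_event, max_polls=100):
    """Return the poll iteration on which the game saw the flag, else None."""
    now = 0           # CPU cycles elapsed
    flag = False      # emulated Interrupt Flag
    serviced = False  # True once the interrupt handler has run
    for poll in range(max_polls):
        # Pre-event: expose the flag LEAD cycles before the real interrupt.
        if use_pre_event and not serviced and now + LEAD >= irq_time:
            flag = True
        seen = flag   # the polling instruction reads the flag
        now += LEAD   # the instruction is executed as one atomic step
        if not serviced and now >= irq_time:
            flag = True    # the real interrupt event fires
            flag = False   # the handler acknowledges (clears) it
            serviced = True
        if seen:
            return poll
    return None


naive_result = run_poll_loop(irq_time=10, use_pre_event=False)
pre_event_result = run_poll_loop(irq_time=10, use_pre_event=True)
```

In this model the naive run never observes the flag and gives up after `max_polls` iterations, while the pre-event run observes it on the iteration whose instruction overlaps the interrupt.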

by tepples on 2010-02-19 (#56811)

Dwedit wrote:
The ROM repeatedly reads from the Interrupt Flag to see if an interrupt happens. Then the interrupt would happen. The CPU enters the interrupt handler, which acknowledges (clears) the interrupt. Then it returns to the code that's polling for interrupts.

That's not reliable (not saying that bugs don't slip into released games like Ms. Pac-Man). Case in point:
Code:
:
lda interrupt_flag
bmi :-

If the interrupt happens during the bmi instruction, the CPU gets in and out of the ISR (which acknowledges the interrupt) before the main thread can see that the interrupt happened.

by Dwedit on 2010-02-19 (#56812)

Sqrxz for GBC used that kind of code to wait while fading the screen in. I don't think the programmer cared how many frames would be missed, just as long as it provided a delay.
No, I still haven't added the correct code to handle that situation into Goomba.

by Zepper on 2010-02-19 (#56813)

Snaer wrote:
Is there any "cycle 341"? If one scanline is 341 cycles then the last one should be 340, right?

- Right. You count 0,1,2...340.
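
A short Python sketch of that 0-based counting. The 341 cycles per scanline come from the discussion above; the 262 scanlines per frame is an assumption (the usual NTSC figure), and the `advance` helper is hypothetical. The sketch ignores any per-frame irregularities.

```python
DOTS_PER_SCANLINE = 341  # counted 0..340, as stated above


def advance(scanline, dot, ppu_clocks, scanlines_per_frame=262):
    """Advance a (scanline, dot) position by ppu_clocks, wrapping at frame end.

    scanlines_per_frame defaults to 262, which is an assumption (NTSC).
    """
    frame_len = DOTS_PER_SCANLINE * scanlines_per_frame
    total = (scanline * DOTS_PER_SCANLINE + dot + ppu_clocks) % frame_len
    return divmod(total, DOTS_PER_SCANLINE)


# One clock after dot 340 is dot 0 of the next scanline; there is no dot 341.
next_pos = advance(240, 340, 1)
```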

Quote:
and what does "2nd or 3rd byte of a LDA $2002" mean?

- LDA = 1 byte (3 ppu clocks)
- 02 = +1 byte (3 ppu clocks)
- 20 = +1 byte (3 ppu clocks)

- The VBlank can rise during the last 6 PPU cycles. I hope I'm not taking a wrong path, since nobody has commented on it..? It works fine, and I wouldn't consider it a "hack".

by Snaer on 2010-02-22 (#56922)

Hmm this is complicated.
So the only case when this happens is when lda $2002 starts on PPU cycle 329 of line 240? Or am I getting it all wrong?

by Zepper on 2010-02-22 (#56933)

Quote:
"Reading 1 PPU clock before VBL should suppress setting."

- "Read" means LDA $2002, and "1 PPU clock before" means the effective read of $2002 lands one PPU clock before the VBlank flag would be set.

- It's not over complicated. ^_^;;

by Snaer on 2010-02-22 (#56934)

hmm...
As I understand it, the read occurs at the end of the fourth clock cycle of lda $2002, which must be the 12th PPU clock cycle (I guess). Then it must be as I said in the previous post. Is that correct?
(sorry if my questions are silly, I just need to get things clear)

by Zepper on 2010-02-22 (#56944)

- As far as I understand it, "1 PPU clock before VBlank" means the effective read of $2002 happens 1 PPU cycle before the VBlank flag rises.

- Of course, there's a problem: the alignment of the CPU and PPU. Could anyone else give a hand? ^_^;;

by Snaer on 2010-02-23 (#56963)

What's an effective read?

by Zepper on 2010-02-23 (#56982)

Code:
LDA $2002
 #  address R/W description
--- ------- --- ------------------------------------------
 1    PC     R  fetch opcode, increment PC
 2    PC     R  fetch low byte of address, increment PC
 3    PC     R  fetch high byte of address, increment PC
 4  address  R  read from effective address

- Source: http://www.viceteam.org/plain/64doc.txt
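
The table above can be written down as data, which makes it easy to see that only the last cycle touches $2002. This is a hedged Python sketch: the function names and the tuple layout are hypothetical, and the example program counter $8000 is arbitrary.

```python
def lda_absolute_bus_accesses(pc, operand_addr):
    """List (cycle, address, R/W, description) for LDA absolute.

    Mirrors the four rows of the table above; pc is the address of the opcode.
    """
    return [
        (1, pc,           "R", "fetch opcode, increment PC"),
        (2, pc + 1,       "R", "fetch low byte of address, increment PC"),
        (3, pc + 2,       "R", "fetch high byte of address, increment PC"),
        (4, operand_addr, "R", "read from effective address"),
    ]


def cycles_touching(accesses, addr):
    """Return the cycle numbers whose bus access hits addr."""
    return [cycle for cycle, address, _rw, _desc in accesses if address == addr]


# LDA $2002 located at $8000 (arbitrary example address).
effective_read_cycles = cycles_touching(
    lda_absolute_bus_accesses(0x8000, 0x2002), 0x2002
)
```

The "effective read" asked about earlier is the access on cycle 4; the first three cycles only fetch the instruction bytes from the program counter.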

by Snaer on 2010-02-24 (#57023)

ah! thanks^^

by cpow on 2010-06-20 (#63151)

Snaer wrote:
hmm...
As I understand it, the read occurs at the end of the fourth clock cycle of lda $2002, which must be the 12th PPU clock cycle (I guess). Then it must be as I said in the previous post. Is that correct?
(sorry if my questions are silly, I just need to get things clear)

The simplest way to handle it is to keep track of the PPU frame cycle in your PPU object (i.e. keep a counter that you reset to zero at your 'start of frame'). Then, on any read of $2002 [which presumably is passed to or handled by your PPU object in some fashion], check whether the read is occurring on the cycle of interest and handle that special case.

The CPU object is the wrong place to keep track of it...

That way it doesn't matter what the instruction is that is used to do the read. The PPU object just checks for reads of $2002.
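
A minimal Python sketch of that idea, under stated assumptions. The class and attribute names are hypothetical, the frame length of 341 * 262 PPU cycles is an assumption (NTSC), and `vbl_set_cycle` is supplied by the caller because its value depends on where your emulator's frame starts. The sketch models only the flag being set and the "1 PPU clock before" suppression; clearing the flag at the end of VBlank and NMI generation are left out.

```python
class Ppu:
    DOTS_PER_FRAME = 341 * 262  # assumption: NTSC frame length in PPU cycles

    def __init__(self, vbl_set_cycle):
        self.vbl_set_cycle = vbl_set_cycle  # frame cycle on which VBL is set
        self.frame_cycle = 0                # reset to zero at start of frame
        self.vbl_flag = False
        self.suppress_vbl = False

    def tick(self):
        """Advance one PPU cycle and apply the event for the new cycle."""
        self.frame_cycle += 1
        if self.frame_cycle == self.DOTS_PER_FRAME:
            self.frame_cycle = 0
            self.suppress_vbl = False
        if self.frame_cycle == self.vbl_set_cycle and not self.suppress_vbl:
            self.vbl_flag = True

    def read_2002(self):
        """Handle a CPU read of $2002, whatever instruction caused it."""
        value = 0x80 if self.vbl_flag else 0x00
        self.vbl_flag = False  # reading the status register clears the flag
        if self.frame_cycle == self.vbl_set_cycle - 1:
            # Read lands 1 PPU clock before VBL: suppress setting this frame.
            self.suppress_vbl = True
        return value


# Small demo with an artificial set cycle of 10.
ppu = Ppu(vbl_set_cycle=10)
for _ in range(9):
    ppu.tick()
early = ppu.read_2002()   # read 1 PPU clock before the flag would be set
ppu.tick()
after = ppu.read_2002()   # the flag was never set this frame
```

Because the check lives in `read_2002`, the CPU core only needs to have advanced the PPU to the correct cycle before performing the bus read.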