NMI/BRK timing and such

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
NMI/BRK timing and such
by on (#110785)
I have been trying to get my NMI/BRK timing just right and am a bit stumped (it always seems just a hair off, no matter what I do), so I have a few questions. The first, I think, is the easiest:

1. IRQ/NMI/RESET are basically special cases of BRK. Do they still check whether an interrupt should happen during the last cycle? I would assume yes. I doubt it matters much since interrupts will be disabled by that time, but perhaps NMI could do something weird there? I dunno.

2. The normal operation of execution always has the first cycle do a read from PC and then increment PC. During an IRQ/NMI/RESET, the CPU will force the IR to be $00 in order to trigger the interrupt dispatch logic. That's fine. My question is: does the CPU still do the read from PC like normal, skip incrementing PC, and THEN force the IR to $00? Or is forcing the IR to $00 pretty much the only thing it does during that cycle if an interrupt is pending?

3. At the moment I assume that forcing the IR to $00 is just about the only real thing that happens during that cycle (it obviously does some other stuff to make the BRK act like an IRQ/NMI, but I'm keeping it simple). Based on that assumption, here's my core logic for executing 1 cycle of the CPU. Is there any obvious flaw? NOTE: instructions $100, $101, $102 are special indexes for RST/NMI/IRQ:

Code:
void clock() {
   if(current_cycle == 0) {
      if(rst_executing) {
         instruction = 0x100;
      } else if(nmi_executing) {
         instruction = 0x101;
      } else if(irq_executing) {
         instruction = 0x102;
      } else {
         instruction = read_byte(PC++);
      }
   } else {
      // execute the current part of the instruction
      execute_opcode();
   }

   ++current_cycle;
}


Thanks guys.
Re: NMI/BRK timing and such
by on (#110786)
Here is my interrupt service routine, that passes all the IRQ tests:

Code:
private void IsrIrq()
{
    PeekByte(pc.uw0); // 2 wait states at the beginning of irq
    PeekByte(pc.uw0);
    PokeByte(sp.uw0, pc.ub1); sp.ub0--;
    PokeByte(sp.uw0, pc.ub0); sp.ub0--;
    PokeByte(sp.uw0, sr); sp.ub0--;

    sr.i = !nmi; // todo: make sure i isn't set on nmi

    var vector = nmi ? 0xFFFAU : 0xFFFEU;
    nmi = false;

    pc.ub0 = PeekByte(  vector);
    pc.ub1 = PeekByte(++vector, true);
}


That method is called if the variable "interrupt" is true. "interrupt" is set on the last cycle as: "interrupt = nmi || (irq && !sr.i);". BRK has a similar vectoring scheme where the vector can be replaced by 0xFFFA if an NMI occurred. In the code above, PeekByte(addr, true); does the last-cycle interrupt flag polling. So in my code the ISR does poll on the last cycle.
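
For readers following along, the polling rule and vector choice described in this post can be sketched in C roughly as follows. This is only an illustration of the expression quoted above; the struct and function names (cpu_lines, poll_interrupts, pick_vector) are made up for the example and are not taken from the poster's emulator.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical CPU state; field names are illustrative only. */
typedef struct {
    bool nmi_pending; /* an NMI has been detected and not yet serviced */
    bool irq_line;    /* true while some device asserts IRQ */
    bool flag_i;      /* interrupt-disable flag in the status register */
} cpu_lines;

/* Evaluated on the last cycle of an instruction:
   NMI is never masked, IRQ is masked by the I flag. */
bool poll_interrupts(const cpu_lines *c)
{
    return c->nmi_pending || (c->irq_line && !c->flag_i);
}

/* Vector choice made inside the handler: NMI takes priority
   over the shared IRQ/BRK vector. */
uint16_t pick_vector(const cpu_lines *c)
{
    return c->nmi_pending ? 0xFFFA : 0xFFFE;
}
```

With I set and only IRQ asserted, poll_interrupts() returns false; once an NMI is pending it returns true regardless of I, and pick_vector() switches to $FFFA.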
Re: NMI/BRK timing and such
by on (#110789)
I just want to make sure we are testing the same stuff. Which IRQ tests do you pass?

I am specifically looking at the ones which deal with NMI and BRK happening at the same time and the BRK vectoring to NMI instead. Likewise, this is what my BRK routine looks like at the moment (it doesn't quite pass; it's off by a cycle or 2):

Code:
void op_brk() {
   static uint16_t vector_address;

   switch(current_cycle) {
   case 1:
      // read next instruction byte (and throw it away),
      // increment PC
      vector_address = IRQ_VECTOR_ADDRESS;
      read_byte(PC++);
      break;
   case 2:
      // push PCH on stack, decrement S
      write_byte(S-- + STACK_ADDRESS, pc_hi());
      break;
   case 3:
      // push PCL on stack, decrement S
      write_byte(S-- + STACK_ADDRESS, pc_lo());
      break;
   case 4:
      // push P on stack, decrement S
      write_byte(S-- + STACK_ADDRESS, P | B_MASK);
      set_flag(I_MASK);
      break;
   case 5:
      // fetch PCL
      set_pc_lo(read_byte(vector_address + 0));
      break;
   case 6:
      LAST_CYCLE; // this macro updates the various "nmi_pending/irq_pending/rst_pending" flags
      // fetch PCH
      set_pc_hi(read_byte(vector_address + 1));
      return;
   default:
      abort();
   }
   
   if(nmi_pending) {
      if(current_cycle < 5 && current_cycle > 1) {
         vector_address = NMI_VECTOR_ADDRESS;
         nmi_pending = false;
      }
   }
}
Re: NMI/BRK timing and such
by on (#110791)
proxy wrote:
I just want to make sure we are testing the same stuff. Which IRQ tests do you pass?


blargg's cpu interrupts tests, which contains the nmi/brk test.

Your brk function looks okay, are you sure that your nmi timing is correct?
Re: NMI/BRK timing and such
by on (#110793)
heh, no I'm not sure that my NMI timing is correct. But I'm not sure where it could be wrong. Are there any particulars that I should know of besides:

1. At cycle 0 if an NMI has been determined to be pending, IR is forced to $00 instead of a normal instruction fetch.
2. The NMI process is more or less the same as BRK (except it doesn't increment PC and vectors differently)
3. An NMI is pending if at the start of the last cycle of an instruction the NMI line is asserted.


Anything I'm missing timing wise?

I do pass the CLI latency tests, so I believe that point 3 is done correctly.

I currently Pass:
IRQ Flag Timing Test by Shay Green (30 Jun 2005) (PD).nes
IRQ Handler Test by Shay Green (30 Jun 2005) (PD).nes
IRQ Flag Operation Test by Shay Green (30 Jun 2005) (PD).nes

I fail:
NMI Suppression Timing Test by Shay Green (6 Nov 2005) (PD).nes (#3)
Disable NMI Timing Test by Shay Green (6 Nov 2005) (PD).nes (#2)
NMI Timing Test by Shay Green (6 Nov 2005) (PD).nes (#3)
Re: NMI/BRK timing and such
by on (#110795)
Until your NMI timing is right I wouldn't expect to pass the BRK/NMI test.

NOTE: The following is conjecture, but seems to go along with observations.

All interrupts use the same logic, with a few differences depending on the type of interrupt. For example RST forces /RW high during the pushes (This is why SP is decremented by 3, but nothing is written to the stack, and also why the i flag is set on resets). I believe the decision for which vector to use isn't made until after the return address and flags are pushed, which is why NMI can beat an IRQ or BRK. There are 4 actions that can trigger this logic, IRQ, NMI, RST, and a BRK (Software interrupt) instruction. The logic doesn't increment the PC, so there is some difference between IRQ/NMI/RST and BRK as far as that is concerned. So I suppose a "universal" interrupt service routine logic could look something like this (pseudocode):

Code:
void isr(bool brk = false)
{
    // 2 wait states would be consumed by BRK opcode, so the caller of this function for NMI/IRQ/RST would have to do the same before calling
    rw = rst ? 1 : 0;

    put_addr_on_bus( ( sp-- ) | 0x100 );
    put_data_on_bus( pch ); // not sure if the real hardware has this data on the bus with RW set to 1, just makes for cleaner code.
    put_addr_on_bus( ( sp-- ) | 0x100 );
    put_data_on_bus( pcl );
    put_addr_on_bus( ( sp-- ) | 0x100 );
    put_data_on_bus( sr | ( brk ? 0x10 : 0x00 ) );

    sr.i = true; // not sure whether i gets set on nmi or not

    u16_t vector = rst ? 0xFFFC : nmi ? 0xFFFA : 0xFFFE; // simulate vector priority
    rst = false;
    nmi = false;
    irq = false; // for devices with persistent logic, IRQ will be held high until acknowledged, this is partly why the i flag matters :)

    rw = 1; // read

    put_addr_on_bus( vector + 0 );
    pcl = get_data_from_bus( );
    put_addr_on_bus( vector + 1 );
    pch = get_data_from_bus( );
}


That SHOULD emulate all the quirks observed in the IRQ logic as long as you do the polling correctly on the last cycle. BRK would just do opcode fetch, data fetch, then call that as "isr( true );", your actual interrupt check would do two reads from the PC before calling the above as "isr( );". If anyone has any problems with what I just said, please let me know :)
Re: NMI/BRK timing and such
by on (#110804)
I did some tracing in Visual 6502. With a CLI+BRK, asserting NMI (which is active low btw) within the interval denoted by "----" below causes the BRK to be missed (each [] is a CPU tick):

Code:
     --------------------
[CLI][CLI][BRK][BRK][BRK][BRK][BRK][BRK][BRK]


When the BRK is missed, you also get a weird mix of NMI and BRK that's exactly like a BRK only it branches to the NMI vector instead. Here's a tick-by-tick breakdown of BRK from http://nesdev.com/6502_cpu.txt along with what seems to be going on:

Code:
 #  address R/W description
--- ------- --- -----------------------------------------------
 1    PC     R  fetch opcode, increment PC
 2    PC     R  read next instruction byte (and throw it away),
                increment PC
 3  $0100,S  W  push PCH on stack, decrement S
*** At this point, the signal status is queried to determine which interrupt vector to use ***
 4  $0100,S  W  push PCL on stack, decrement S
 5  $0100,S  W  push P on stack (with B flag set), decrement S
 6   $FFFE   R  fetch PCL
 7   $FFFF   R  fetch PCH


The BRK-specific behavior (incrementing of PC and pushing B -- NMI doesn't do that) seems to be triggered when NMI occurs within the interval above.

Is this information available anywhere on the wiki btw?
Re: NMI/BRK timing and such
by on (#110808)
To clarify, the BRK isn't really "skipped" - it's just hijacked by the NMI so that it uses the NMI interrupt vector instead of the BRK one.
Re: NMI/BRK timing and such
by on (#110810)
Just for another verification: Are you saying that it begins execution at the NMI vector, but in the flags pushed on the stack, the B bit is set?
Re: NMI/BRK timing and such
by on (#110811)
lidnariq wrote:
Just for another verification: Are you saying that it begins execution at the NMI vector, but in the flags pushed on the stack, the B bit is set?


Yup - the BRKy behavior extends to the point where it pushes with the B bit set even though it branches to the NMI vector.
Re: NMI/BRK timing and such
by on (#110814)
I experimented with some slower instructions before the BRK, and the condition for getting the glitchy behavior seems to be that NMI is asserted at or after the last tick of the previous instructions (or put another way, at or after the BRK opcode is fetched).

With LDX zero,y, which takes 4 cycles, you get this interval for example:

Code:
                                    ---------------------------
[LDX zero,y][LDX zero,y][LDX zero,y][LDX zero,y][BRK][BRK][BRK][BRK][BRK][BRK][BRK]
Re: NMI/BRK timing and such
by on (#110815)
It may be more accurate to say that the flags are polled in each cycle except for the last cycle, due to the 2-stage pipeline the CPU has.

Example:

Code:
and #$nn takes 3 cycles
1. opcode fetch
2. data fetch
3. perform and


While cycle 3 is occurring, the next instruction is being fetched, since the accumulator will never be important during that cycle. Now imagine that the CPU was allowed to be interrupted during that opcode fetch (AND's last cycle), and an interrupt indeed occurred. The PC would no longer point to the next opcode (unless you got lucky and it was an implied instruction, but even then it'd be skipped outright). So the CPU doesn't allow interrupts to occur during that cycle. Instead, if an interrupt was detected in the first 2 cycles (or, for ease of emulating, the cycle before the last), it forces the instruction register to 0, inhibits PC increments and executes a BRK instruction (the reason for the 2 reads at the beginning of the ISR I noted earlier). :D

edit: typos
Re: NMI/BRK timing and such
by on (#110818)
I'm pretty sure there's a test ROM for this hijacking behavior as well, in case you want to emulate it properly.
Re: NMI/BRK timing and such
by on (#110819)
blargg wrote:
I'm pretty sure there's a test ROM for this hijacking behavior as well, in case you want to emulate it properly.


Would be nice to have on the wiki too, in case it isn't already there and I'm not blind. I'm assuming NMI can hijack IRQ in a similar way too, and that IRQ can hijack BRK (though it wouldn't be as visible since they use the same interrupt vector. Might cause you to "miss" the IRQ perhaps...).
Re: NMI/BRK timing and such
by on (#110820)
Quote:
I'm assuming NMI can hijack IRQ in a similar way too, and that IRQ can hijack BRK (though it wouldn't be as visible since they use the same interrupt vector. Might cause you to "miss" the IRQ perhaps...).

Wouldn't matter with BRK and IRQ, since BRK hijacking IRQ would just mean that it would just delay the real IRQ vectoring until the BRK returned. IRQ hijacking BRK would be nasty, since it would be as if the CPU skipped the BRK. I believe this is what happens, though looks like ROM doesn't test it.

cpu_interrupts_v2

It even checks an NMI occurring during each cycle of an IRQ. And even checks the interrupt-delaying effect of the last cycle of a taken branch. Crazy.
Re: NMI/BRK timing and such
by on (#110821)
blargg wrote:
Quote:
I'm assuming NMI can hijack IRQ in a similar way too, and that IRQ can hijack BRK (though it wouldn't be as visible since they use the same interrupt vector. Might cause you to "miss" the IRQ perhaps...).

Wouldn't matter with BRK and IRQ, since BRK hijacking IRQ would just mean that it would just delay the real IRQ vectoring until the BRK returned. IRQ hijacking BRK would be nasty, since it would be as if the CPU skipped the BRK. I believe this is what happens, though looks like ROM doesn't test it.

cpu_interrupts_v2

It even checks an NMI occurring during each cycle of an IRQ. And even checks the interrupt-delaying effect of the last cycle of a taken branch. Crazy.


Nice. :)

Could copy the same info to the wiki so it's a bit easier to find I guess.
Re: NMI/BRK timing and such
by on (#110823)
I just realized I've been misreading the Visual 6502 traces. You get the previous opcode in the Execute field (which I'm guessing is the instruction reg) while the opcode fetch is taking place, which threw me off.

If you consider the opcode fetch as part of the instruction (like most docs do) the timing is actually like this:

Code:
          --------------------
[CLI][CLI][BRK][BRK][BRK][BRK][BRK][BRK][BRK]


And similarly:

Code:
                                                --------------------
[LDX zero,y][LDX zero,y][LDX zero,y][LDX zero,y][BRK][BRK][BRK][BRK][BRK][BRK][BRK]


This in turn gives

Code:
 #  address R/W description
--- ------- --- -----------------------------------------------
 1    PC     R  fetch opcode, increment PC
 2    PC     R  read next instruction byte (and throw it away),
                increment PC
 3  $0100,S  W  push PCH on stack, decrement S
 4  $0100,S  W  push PCL on stack, decrement S
*** At this point, the signal status is queried to determine which interrupt vector to use ***
 5  $0100,S  W  push P on stack (with B flag set), decrement S
 6   $FFFE   R  fetch PCL
 7   $FFFF   R  fetch PCH


Bit more intuitive. :)
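
A small C sketch of the corrected table above, for anyone who wants to play with the hijack window. It assumes, per the Visual 6502 observations in this post, that an NMI asserted during BRK cycles 1 to 4 redirects the vector fetch to $FFFA while P is still pushed with B set. The names run_brk and brk_result are hypothetical, and a late NMI (cycles 5 to 7) is simply left for the next instruction.

```c
#include <stdbool.h>
#include <stdint.h>

enum { NMI_VECTOR = 0xFFFA, IRQ_BRK_VECTOR = 0xFFFE, FLAG_B = 0x10 };

typedef struct {
    uint16_t vector;   /* vector the sequence ends up fetching from */
    uint8_t  pushed_p; /* status byte written to the stack in cycle 5 */
} brk_result;

/* Walks the 7 BRK cycles. nmi_cycle is the BRK cycle (1-7) in which NMI
   is first asserted, or 0 for "never".
   Assumption: the vector decision is made at the end of cycle 4. */
brk_result run_brk(uint8_t p, int nmi_cycle)
{
    brk_result r = { IRQ_BRK_VECTOR, 0 };
    bool nmi_seen = false;

    for (int cycle = 1; cycle <= 7; ++cycle) {
        if (cycle == nmi_cycle)
            nmi_seen = true;
        if (cycle == 4 && nmi_seen)
            r.vector = NMI_VECTOR;            /* hijack: decided after the PCL push */
        if (cycle == 5)
            r.pushed_p = (uint8_t)(p | FLAG_B); /* B is pushed set either way */
    }
    return r;
}
```

For example, run_brk(0x24, 3) yields vector $FFFA with $34 pushed, while run_brk(0x24, 5) keeps the normal $FFFE vector.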
Re: NMI/BRK timing and such
by on (#110826)
I created a wiki page at http://wiki.nesdev.com/w/index.php/CPU_interrupt_quirks (linked from http://wiki.nesdev.com/w/index.php/CPU). Please check for inaccuracies.
Re: NMI/BRK timing and such
by on (#110827)
Wiki wrote:
The RTI instruction affects IRQ inhibition immediately. If an IRQ is pending and an RTI is executed that clears the I flag, the CPU will invoke the IRQ handler immediately after RTI finishes executing.
The CLI, SEI, and PLP instructions effectively delay changes to the I flag until after the next instruction. For example, if an interrupt is pending and the I flag is currently set, executing CLI will execute the next instruction before the CPU invokes the IRQ handler. This delay only affects inhibition, not the value of the I flag itself; CLI followed by PHP will leave the I flag cleared in the saved status byte on the stack (bit 2), as expected.


This is easily explainable. /IRQ and /NMI are latched internally during the first half of each cycle. The actions that occur during CLI, SEI, and PLP occur during the second half. Because of this, the old value of the i flag is used when masking /IRQ. Any interrupt that is pending when CLI is executed will be masked, then i will be cleared, and one more instruction will execute (during which, i will now be clear, and /IRQ will not be masked, and the interrupt will be executed afterwards).

RTI on the other hand has 2 cycles after pulling the flags off the stack (Pull pcl, pull pch), so it will mask /IRQ with the expected value. And interrupts will occur immediately after, if they are now enabled.

EDIT: The latching process mentioned above should be:

Code:
irq_latch = irq & ~i;
nmi_latch = nmi & ~nmi_latch;

the above assumes irq and nmi aren't inverted, and that a value of 1 means "active"
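
The CLI/SEI delay described above can be illustrated at instruction granularity with a rough C sketch. It assumes the poll for each instruction sees the I flag as it was before that instruction changed it, and that IRQ is held asserted the whole time. The names op_t and irq_taken_after are invented for the example; RTI and PLP are not modeled.

```c
#include <stdbool.h>

typedef enum { OP_CLI, OP_SEI, OP_NOP } op_t;

/* Returns the index of the instruction after which the IRQ handler is
   entered, or -1 if it never is. flag_i is the initial I flag. */
int irq_taken_after(const op_t *prog, int n, bool flag_i)
{
    for (int k = 0; k < n; ++k) {
        bool irq_latched = !flag_i;  /* poll uses the OLD value of I */

        if (prog[k] == OP_CLI)
            flag_i = false;          /* flag change lands after the poll */
        if (prog[k] == OP_SEI)
            flag_i = true;

        if (irq_latched)
            return k;
    }
    return -1;
}
```

With the program CLI, NOP, NOP and I initially set, the function returns 1: the NOP following CLI runs before the handler, which is the latency the quoted wiki text describes.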


EDIT2: The above also explains why reading $2002 will "suppress" NMI. NMI will be pulled low, then $2002 will be read with bit 7 set, which clears bit 7 (and thus pulls NMI back up) before the CPU notices that it was ever low. This effectively suppresses the NMI, even though VBL was read as set. I assume some race condition also explains why reading $2002 will "suppress" setting the flag too.
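
A hedged C sketch of that suppression idea: the NMI line is modeled as "NMI enabled AND VBL flag set", the CPU only notices a rising edge when it samples, and a $2002 read that lands before the sample clears the flag again. All names here (ppu_cpu, read_2002, cpu_sample) are hypothetical, and the real ordering inside a cycle is exactly the part that is still conjecture in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint8_t ctrl;        /* $2000, bit 7 = NMI enable */
    uint8_t status;      /* $2002, bit 7 = VBL flag */
    bool    prev_line;   /* line level at the previous CPU sample */
    bool    nmi_pending; /* set when the CPU sees an inactive-to-active edge */
} ppu_cpu;

/* Active-high model of the /NMI output (true means "asserted"). */
bool nmi_line(const ppu_cpu *s)
{
    return (s->ctrl & 0x80) && (s->status & 0x80);
}

/* Reading $2002 returns the flags and clears the VBL flag. */
uint8_t read_2002(ppu_cpu *s)
{
    uint8_t v = s->status;
    s->status &= 0x7F;
    return v;
}

/* CPU-side edge detector, run once per sample point. */
void cpu_sample(ppu_cpu *s)
{
    bool line = nmi_line(s);
    if (line && !s->prev_line)
        s->nmi_pending = true;
    s->prev_line = line;
}
```

If the flag is set and read_2002() runs before cpu_sample(), the read returns bit 7 set but no NMI becomes pending; if cpu_sample() runs first, the NMI is latched.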
Re: NMI/BRK timing and such
by on (#110837)
Very good stuff guys. Back to me trying to pass the test ROMs, I've gotten closer but not quite right. When running blargg's VBL/NMI test suite I get the following results now:

01-vbl_basics.nes = PASS
02-vbl_set_time.nes = PASS
03-vbl_clear_time.nes = PASS
04-nmi_control.nes = PASS
05-nmi_timing.nes = PASS
06-suppression.nes = PASS
07-nmi_on_timing.nes = FAIL
08-nmi_off_timing.nes = PASS
09-even_odd_frames.nes = PASS
10-even_odd_timing.nes = FAIL


But here's the weird thing. The only way I was able to make it pass 5 and 6 was by slightly delaying the PPU triggering an NMI.

Here's the timing I used to have (Timing is in : PPU(y, x) notation, I consider VBLANK start to be y == 0):

Code:
PPU (0,0) = idle
PPU (0,1) = status_ |= 0x80; if ($2000 & 0x80) NMI(); execute_0_or_1_cpu_cycle();
...


Which was always a cycle or two off in either direction when running the NMI/VBL tests (depending on tweaks I made to my core).

But when I changed it to this:

Code:
PPU (0,0) = idle
PPU (0,1) = status_ |= 0x80; execute_0_or_1_cpu_cycle();
PPU (0,2) = execute_0_or_1_cpu_cycle(); if ($2000 & 0x80 && status_ & 0x80) NMI();
...


or equally as far as the CPU is concerned:

Code:
PPU (0,0) = idle
PPU (0,1) = status_ |= 0x80; execute_0_or_1_cpu_cycle();
PPU (0,2) = execute_0_or_1_cpu_cycle();
PPU (0,3) = if ($2000 & 0x80 && status_ & 0x80) NMI(); execute_0_or_1_cpu_cycle();
...



Suddenly I pass more tests! I would have figured that the NMI was tied directly to status_ & 0x80 and $2000 & 0x80 both being set, and would happen IMMEDIATELY. Is there actually supposed to be a 1 or 2 PPU tick delay?
Re: NMI/BRK timing and such
by on (#110865)
I pass all the tests with no delay whatsoever. I used to have a two dot delay, but I recently rewrote all my interrupt stuff to make it deadly accurate and it passes all the tests. The biggest change I made was emulating the processors on half cycles. Each read/write I run the PPU for 6 master cycles (for ntsc), poll the interrupt flags, then run the PPU for another 6. So basically 1.5 dots, check flags, another 1.5 dots. AFAIK there is no important delay associated with NMI, IRQ and RST.
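
The half-cycle scheme described here might look something like the following C sketch, assuming NTSC ratios of 12 master clocks per CPU cycle and 4 per PPU dot. The counters stand in for real PPU stepping and interrupt sampling, and the names (clocks, run_ppu, cpu_bus_access) are invented for illustration rather than taken from the poster's code.

```c
#include <stdint.h>

enum { MASTER_PER_CPU = 12, MASTER_PER_DOT = 4 };

typedef struct {
    uint64_t master; /* master clocks elapsed */
    uint64_t dots;   /* whole PPU dots stepped so far */
    uint64_t polls;  /* interrupt polls performed */
} clocks;

/* Advance the master clock and step every PPU dot that has become due. */
void run_ppu(clocks *c, int master_cycles)
{
    c->master += (uint64_t)master_cycles;
    while (c->dots < c->master / MASTER_PER_DOT)
        c->dots++;   /* a real core would run one PPU dot here */
}

/* One CPU read or write: 1.5 dots, poll the interrupt lines, 1.5 dots. */
void cpu_bus_access(clocks *c)
{
    run_ppu(c, MASTER_PER_CPU / 2);
    c->polls++;      /* a real core would sample NMI/IRQ here */
    run_ppu(c, MASTER_PER_CPU / 2);
}
```

Each call to cpu_bus_access() advances 12 master clocks, which is 3 PPU dots, with the poll sitting in the middle of the CPU cycle.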
Re: NMI/BRK timing and such
by on (#110867)
Hmm, I don't get it then :-/.

I run the PPU 1 dot at a time, every 3rd PPU dot, I run the CPU for 1 cycle. In other words, since right now I'm only concerned with NTSC, single PPU cycles are my basic unit of execution. I'm not sure where I'm going wrong with that.

What I was doing previously was, on the first line of the vblank @ dot 1, I set $2002.7 and at that same dot, if $2000.7 is also set, I would signal the CPU to do an NMI. So the absolute earliest an NMI could occur is literally on dot 1 of the first line of the vblank (assuming that the CPU was about to fetch a new instruction when it happened).
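
As a rough illustration of the dot-stepped loop described in this post, here is a minimal C sketch. It assumes NTSC dimensions of 341 dots by 262 scanlines, places the VBL flag at scanline 241, dot 1 (zero-based), and runs one CPU cycle every third dot. The odd-frame skipped dot and the flag clearing on the pre-render line are left out, and the names nes and step_dot are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    int      scanline, dot; /* zero-based PPU position */
    uint8_t  ctrl;          /* $2000, bit 7 = NMI enable */
    uint8_t  status;        /* $2002, bit 7 = VBL flag */
    bool     nmi_line;      /* true once NMI has been signalled */
    unsigned dot_phase;     /* counts 0..2 between CPU cycles */
    unsigned cpu_cycles;    /* CPU cycles executed so far */
} nes;

void step_dot(nes *n)
{
    /* Set the VBL flag and, if enabled, signal NMI on the same dot. */
    if (n->scanline == 241 && n->dot == 1) {
        n->status |= 0x80;
        if (n->ctrl & 0x80)
            n->nmi_line = true;
    }

    /* One CPU cycle for every three PPU dots. */
    if (++n->dot_phase == 3) {
        n->dot_phase = 0;
        n->cpu_cycles++;    /* a real core would clock the CPU here */
    }

    /* Advance the PPU position, wrapping at the end of line and frame. */
    if (++n->dot == 341) {
        n->dot = 0;
        if (++n->scanline == 262)
            n->scanline = 0;
    }
}
```

Starting from a zeroed struct with ctrl set to $80, the NMI line goes active on the call that processes scanline 241, dot 1.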
Re: NMI/BRK timing and such
by on (#110870)
And just to clarify, given a zero-based numbering system, your vblank would occur on scanline 241, dot 1?
Re: NMI/BRK timing and such
by on (#110877)
beannaich wrote:
I pass all the tests with no delay whatsoever. I used to have a two dot delay, but I recently rewrote all my interrupt stuff to make it deadly accurate and it passes all the tests. The biggest change I made was emulating the processors on half cycles. Each read/write I run the PPU for 6 master cycles (for ntsc), poll the interrupt flags, then run the PPU for another 6. So basically 1.5 dots, check flags, another 1.5 dots. AFAIK there is no important delay associated with NMI, IRQ and RST.


So to clarify, you poll the interrupt flags after the first half-cycle of each CPU cycle? I'm guessing the real 6502 does that, but I wasn't sure.
Re: NMI/BRK timing and such
by on (#110882)
Actually, I suspect the interrupt delay behavior after CLI and PLP might not require half-cycles at all to understand. Check the following Visual 6502 log (CLI + NOP + NOP + ...):

Code:
    cycle   ab      db      rw      Fetch   pc      a       x       y       s       p               Execute irq
C---0       0000    58      1       CLI     0000    aa      00      00      fd      nv‑BdIZc        BRK     1 <- Interrupt poll #1
L   0       0000    58      1       CLI     0000    aa      00      00      fd      nv‑BdIZc        BRK     1
I   1       0001    ea      1               0001    aa      00      00      fd      nv‑BdIZc        CLI     0 <- Interrupt poll #2
 \--1       0001    ea      1               0001    aa      00      00      fd      nv‑BdIZc        CLI     0
N---2       0001    ea      1       NOP     0001    aa      00      00      fd      nv‑BdiZc        CLI     0 <- Interrupt poll #3
O   2       0001    ea      1       NOP     0001    aa      00      00      fd      nv‑BdiZc        CLI     0
P   3       0002    ea      1               0002    aa      00      00      fd      nv‑BdiZc        NOP     0 <- Interrupt poll #4
 \--3       0002    ea      1               0002    aa      00      00      fd      nv‑bdiZc        NOP     0
I---4       0002    ea      1       NOP     0002    aa      00      00      fd      nv‑bdiZc        NOP     0 <- Interrupt poll #5
R   4       0002    ea      1       NOP     0002    aa      00      00      fd      nv‑bdiZc        NOP     0
Q   5       0002    ea      1               0002    aa      00      00      fd      nv‑bdiZc        BRK     0 <- Interrupt poll #6
|   5       0002    ea      1               0002    aa      00      00      fd      nv‑bdiZc        BRK     0
|   6       01fd    ea      0               0002    aa      00      00      fd      nv‑bdiZc        BRK     0 <- Interrupt poll #7
|   6       01fd    00      0               0002    aa      00      00      fd      nv‑bdiZc        BRK     0


Notice that the I flag only drops after the CLI finishes executing. Hence interrupt poll #2 during the CLI won't trigger the "do IRQ" logic, and once it does trigger at interrupt poll #3, the NOP has already begun executing, and so will need to finish.

(Note that instructions begin when they are Fetch'd and not when they are Execute'd. That tripped me up at first.)
Re: NMI/BRK timing and such
by on (#110884)
ulfalizer wrote:
Note that instructions begin when they are Fetch'd and not when they are Execute'd. That tripped me up at first.


I have to take issue with this, call it nitpicking if you want. But each instruction ends by fetching the next instruction. It doesn't begin by fetching itself. If you look at the Execute column, which is actually executing the PLA lines for the opcode stored in IR, then you'll see that the 2nd cycle of CLI clears the i flag, as it's fetching the NOP. This cycle (opcode fetch) won't poll the flags. It also won't increment PC if there was a hardware interrupt pending and nothing is inhibiting it. Instead IR will be forced to $00 (BRK).

EDIT: Moderators feel free to break this into two topics so we aren't hijacking proxy's thread.
Re: NMI/BRK timing and such
by on (#110885)
beannaich wrote:
ulfalizer wrote:
Note that instructions begin when they are Fetch'd and not when they are Execute'd. That tripped me up at first.


I have to take issue with this, call it nitpicking if you want. But each instruction ends by fetching the next instruction. It doesn't begin by fetching itself. If you look at the Execute column, which is actually executing the PLA lines for the opcode stored in IR, then you'll see that the 2nd cycle of CLI clears the i flag, as it's fetching the NOP. This cycle (opcode fetch) won't poll the flags. It also won't increment PC if there was a hardware interrupt pending and nothing is inhibiting it. Instead IR will be forced to $00 (BRK).

EDIT: Moderators feel free to break this into two topics so we aren't hijacking proxy's thread.


Are you sure the fetch cycle won't poll the flags? Maybe it's just considered too late at that point to take effect immediately.

It gets a bit confusing since e.g. http://nesdev.com/6502_cpu.txt does list the opcode read cycle as part of the instruction, and looking at that, no other useful work seems to get done during an opcode fetch. Seems fair to consider the fetch as part of the instruction to me at least, other than perhaps from a hw perspective (not sure how it's handled there. Does each instruction have "fetch next instruction" as an explicit step or something?).
Re: NMI/BRK timing and such
by on (#110886)
ulfalizer wrote:
Are you sure the fetch cycle won't poll the flags? Maybe it's just considered too late at that point to take effect immediately.


Means the same thing, they aren't polled because it would be too late. :)
Re: NMI/BRK timing and such
by on (#110895)
@beannaich, that's exactly when it happens. I've made my PPU timing match ulfalizer's PPU.svg chart exactly.
Re: NMI/BRK timing and such
by on (#111012)
proxy wrote:
@beannaich, that's exactly when it happens. I've made my PPU timing match ulfalizer's PPU.svg chart exactly.

The only thing that I can suggest at this point, is making a debugger. See exactly why the test is failing you. I've had tests fail before because individual lines of code were executing in the wrong order. Not because anything was functionally wrong, it was just a small oversight that made me believe my emulator was WAY off in execution. Failing a test doesn't always mean you failed it for the reason given, that's just what it detected.
Re: NMI/BRK timing and such
by on (#111017)
Yea, I think I am going to head down that route. One question, I was just re-reading the NMI/IRQ parts of the nesdev wiki and noticed something I didn't before (perhaps it was updated?).

Is NMI/RST also checked on the last cycle of an instruction? Or is it just IRQ? I've been checking all 3 on the last cycle historically cause I figured that they all shared similar logic.

Also, I know that you can toggle $2000.7 during the vblank and it will trigger multiple NMIs. (http://wiki.nesdev.com/w/index.php/NMI).

Blargg's tests say that there is a 1 instruction latency. Do we know if this is really one instruction, or is it a 1 cycle delay? I ask because I can't imagine how the hardware would work to make it always one instruction, given that the instructions are variable length.

Thoughts?
Re: NMI/BRK timing and such
by on (#111023)
Which test claims a one-instruction latency (so I can fix the description if it does)? It's more likely that it describes some cases where there is a latency of one instruction due to it being asserted just after the checking near the end of the previous instruction. Also, the only thing authoritative about the test ROMs is that they pass on a NES; all else are claims that can only be informally tested (the failure descriptions and readme with the tests) and are subject to my own errors.
Re: NMI/BRK timing and such
by on (#111024)
Blargg, here's an excerpt from one of your readmes for the ppu_vbl_nmi_suite. Thanks for your feedback.

04-nmi_control
--------------
Tests immediate NMI behavior when enabling while VBL flag is already set

2) Shouldn't occur when disabled
3) Should occur when enabled and VBL begins
4) $2000 should be mirrored every 8 bytes
5) Should occur immediately if enabled while VBL flag is set
6) Shouldn't occur if enabled while VBL flag is clear
7) Shouldn't occur again if writing $80 when already enabled
8) Shouldn't occur again if writing $80 when already enabled 2
9) Should occur again if enabling after disabled
10) Should occur again if enabling after disabled 2
**11) Immediate occurence should be after NEXT instruction**

PS: your tests are awesome :-D
Re: NMI/BRK timing and such
by on (#111029)
If you've never looked at the source to the tests, please do, since they're pretty clean. Here's 11:
Code:
        set_test 11,"Immediate occurence should be after NEXT instruction"
        jsr begin
        delay 200       ; VBL flag set during this time
        ldx #0
        lda #$80        ; enable NMI, which should result in immediate NMI
        sta $2000       ; after NEXT instruction
        stx nmi_count   ; clear nmi_count
        ; NMI should occur here
        lda nmi_count
        jeq test_failed
Re: NMI/BRK timing and such
by on (#111035)
Right, so from that, it seems that the test is expecting a 1 instruction latency. So I guess the question is, does anyone know if it is really one whole instruction? or is it a specific number of cycles?
Re: NMI/BRK timing and such
by on (#111038)
I believe it's always one because the NMI is asserted sometime after the $2000 write at the end of STA, which is after NMI is polled AFAIK, thus the 2A03 doesn't see NMI until it checks during the next instruction's execution. Maybe if you could do the $2000 write before the last cycle of an instruction (INC? one of those double-write ones) it'd occur before the next instruction.
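
To make that reasoning concrete, here is a toy C model. It assumes the poll on the last cycle of an instruction happens before that cycle's own write takes effect, which is the explanation offered in this post rather than a confirmed hardware fact. The function name and parameters are hypothetical.

```c
#include <stdbool.h>

/* length:      number of cycles in the instruction that writes $2000.
   write_cycle: 1-based cycle in which the NMI-enabling write happens.
   Returns how many further instructions run before the NMI handler. */
int instructions_before_nmi(int length, int write_cycle)
{
    bool nmi_line = false;
    bool seen = false;

    for (int cycle = 1; cycle <= length; ++cycle) {
        if (cycle == length)
            seen = nmi_line;   /* poll on the last cycle ... */
        if (cycle == write_cycle)
            nmi_line = true;   /* ... before this cycle's write lands */
    }
    return seen ? 0 : 1;
}
```

Under this model a 4-cycle STA abs that writes on cycle 4 gives a latency of one instruction, while a hypothetical enabling write on cycle 5 of a 6-cycle read-modify-write instruction would give zero, matching the "maybe" at the end of the post.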
Re: NMI/BRK timing and such
by on (#111047)
.
Re: NMI/BRK timing and such
by on (#111067)
A limitation of blargg's test roms is that they often only test observed behavior without giving information on the underlying signal levels and timings that cause the behavior. This is of course also a strength since it means that you can be sure that something is broken if your emulator fails a test.

Knowing more about the underlying signal levels and timings and having it written up on the wiki would be nice though, to eliminate some of the guesswork. Visual 2C02 and http://wiki.nesdev.com/w/images/d/d1/Ntsc_timing.png shows you the timing on the PPU side, but there's still trickiness involving how e.g. a CPU read might align itself with a flag setting or clearing. (For this reason I think you ought to think in terms of "a read starting at tick n" instead of "a read at tick n", since a read is not an atomic operation with the PPU running at 3x the CPU speed.)

I'm going to focus on coding for a while, but it'd be great if someone else would pick up. :)

I've updated http://wiki.nesdev.com/w/index.php/PPU_frame_timing with the latest information I know of btw.
Re: NMI/BRK timing and such
by on (#111068)
If you knew what happens when a read starts n master clock ticks before/after different flags change (i.e., whether the read sees the new value or the old, and what special behavior triggers), I guess that would cover everything you need to know. Would probably be easiest to figure out with a test program and a logic analyzer.
Re: NMI/BRK timing and such
by on (#111074)
Quote:
A limitation of blargg's test roms is that they often only test observed behavior without giving information on the underlying signal levels and timings that cause the behavior. This is of course also a strength since it means that you can be sure that something is broken if your emulator fails a test.

Yep. I try to break the test into many steps so that the one that fails will give more insight, but it's still often pass/fail. Giving detailed information might involve pages of code rather than the really concise ten-line test style I've developed. More recent timing tests do tend to take the approach of determining the timing and displaying it, then just checksumming the output, so that you can see a little more of what's going on. If someone had days to devote to writing a single test...

Quote:
If you knew what happens when a read starts n master clock ticks before/after different flags change (i.e., whether the read sees the new value or the old, and what special behavior triggers), I guess that would cover everything you need to know.

Yes, I'd love for these timing things to be documented in terms of absolute hardware timing. The way it is now, it's entirely circular, only stated in terms of CPU behavior. The things it's timed relative to are also CPU behavior, so you can't pin it down. Only indirectly by looking at relative timings to something else, and how that is timed (perhaps to yet another thing), can you deduce what of the many possible timings will fit all these.
Re: NMI/BRK timing and such
by on (#111076)
blargg wrote:
Yes, I'd love for these timing things to be documented in terms of absolute hardware timing. The way it is now, it's entirely circular, only stated in terms of CPU behavior. The things it's timed relative to are also CPU behavior, so you can't pin it down. Only indirectly by looking at relative timings to something else, and how that is timed (perhaps to yet another thing), can you deduce what of the many possible timings will fit all these.


With a logic analyzer you could get an absolute reference for where you are PPU-wise by watching stuff like PPU memory accesses at least, if that's hard to get otherwise. That could then be used to infer where a later read starts, and the resulting behavior could be checked in the program.

Electronics isn't exactly my forte unfortunately. :|

Second best option would be Visual 2C02 and timed reads I guess, but I'm not sure if reads in Visual 2C02 are exactly like CPU reads, and Quietust mentioned some bug with reads being slightly shorter...
Re: NMI/BRK timing and such
by on (#111100)
blargg wrote:
Yes, I'd love for these timing things to be documented in terms of absolute hardware timing.

Would that be master clocks (periods of the 21,477,272.72~ Hz input clock)? That would be nice to have, I only recently was given the information that phi2 doesn't have a 50% duty cycle (it's really 62.5%, or 7.5/12 master cycles are "high"). That means a lot for emulator developers looking to do cycle perfect implementations (a pipe dream, but it's still pursued).

Every sentence in the above paragraph ends with a parenthesis..:D
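For anyone who wants to check the duty cycle arithmetic, here is a small sketch in C. It counts in half master clocks so that 7.5 becomes an integer (15 out of 24). Which part of the cycle is labelled "high" is an assumption here (low phase first), and the function names are made up for illustration.

```c
#include <assert.h>

#define MASTER_HZ 21477272.72 /* NTSC master clock, as quoted above */

enum {
    HALF_CLOCKS_PER_CYCLE = 24, /* 12 master clocks per CPU cycle */
    HALF_CLOCKS_HIGH      = 15  /* 7.5 master clocks with phi2 high */
};

/* ASSUMPTION: the low phase comes first within the cycle. This only
   decides where counting starts, not the duty cycle itself. */
static int phi2_is_high(long half_clock)
{
    assert(half_clock >= 0);
    long pos = half_clock % HALF_CLOCKS_PER_CYCLE;
    return pos >= HALF_CLOCKS_PER_CYCLE - HALF_CLOCKS_HIGH;
}

/* Fraction of one CPU cycle during which phi2 is high. */
static double phi2_duty_cycle(void)
{
    int high = 0;
    for (long t = 0; t < HALF_CLOCKS_PER_CYCLE; ++t)
        high += phi2_is_high(t);
    return (double)high / HALF_CLOCKS_PER_CYCLE;
}

/* Length of the high phase in nanoseconds. */
static double phi2_high_ns(void)
{
    return HALF_CLOCKS_HIGH * 0.5 * 1e9 / MASTER_HZ;
}
```

With these numbers the duty cycle comes out as 15/24 = 0.625 and the high phase lasts roughly 349 ns.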
Re: NMI/BRK timing and such
by on (#111208)
beannaich wrote:
blargg wrote:
Yes, I'd love for these timing things to be documented in terms of absolute hardware timing.

Would that be master clocks (periods of the 21,477,272.72~ Hz input clock)? That would be nice to have, I only recently was given the information that phi2 doesn't have a 50% duty cycle (it's really 62.5%, or 7.5/12 master cycles are "high"). That means a lot for emulator developers looking to do cycle perfect implementations (a pipe dream, but it's still pursued).

Every sentence in the above paragraph ends with a parenthesis..:D


It'd only need to be accurate down to the master clock, since that's the highest resolution you'll ever deal with in practice (due to varying CPU/PPU clock alignment relative to the master clock).

It would probably be easiest and safest to do with a logic analyzer (to figure out the alignment and get an absolute reference for where you are PPU-wise) and a test program that e.g. reads later and later and sees what happens. By resetting a few times to get different alignments you could probably nail stuff down to the master clock. A nice thing about this approach is that you're still only testing CPU behavior, which makes it harder to mess up.

Results would be stuff like "a read starting at master clock n after <some convenient absolute PPU reference> gives behavior x". You could then pick some convenient alignment to use in your emulator and infer the timings for it (or emulate them all if you're feeling OCD :P).
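As a rough illustration of why a reset is needed to see a different alignment, here is a sketch in C. It assumes NTSC dividers (12 master clocks per CPU cycle, 4 per PPU dot) and that clock edges fall on master clock boundaries. The function names and the idea of numbering alignments 0..3 are made up for illustration, not taken from any measurement.

```c
#include <assert.h>

enum {
    MASTER_PER_CPU_CYCLE = 12, /* NTSC: CPU clock = master clock / 12 */
    MASTER_PER_PPU_DOT   = 4   /* NTSC: PPU clock = master clock / 4  */
};

/* Master clock at which CPU cycle `cycle` starts, counted from a PPU
   dot boundary used as the absolute reference. `alignment` is the
   offset (in master clocks) fixed at power-on or reset. */
static long cpu_cycle_start(int alignment, long cycle)
{
    assert(alignment >= 0 && alignment < MASTER_PER_PPU_DOT);
    assert(cycle >= 0);
    return alignment + cycle * MASTER_PER_CPU_CYCLE;
}

/* Offset of that CPU cycle start within its PPU dot (0..3). */
static int phase_within_dot(int alignment, long cycle)
{
    return (int)(cpu_cycle_start(alignment, cycle) % MASTER_PER_PPU_DOT);
}
```

Because 12 is a multiple of 4, phase_within_dot returns the same value for every cycle of a given run. A test program that reads later and later therefore only ever samples one of the possible offsets until the console is reset.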