NMI/IRQ edge cases and last cycle testing

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
View original topic
NMI/IRQ edge cases and last cycle testing
by on (#21883)
This is for a bug I recently came across with Mortal Kombat II. It applies both to NES and SNES, as they have virtually identical underpinnings, so I'm hoping someone here has experimented at this low level of detail and can shed some insight on this matter for me.

This is really deep into clock-perfect timing, so this is not for the inexperienced or faint of heart. You probably won't be able to help unless you really know this stuff inside and out.

Here is my IRQ log:
http://byuu.cinnamonpirate.com/temp/irq.txt

The second block is the problem.

Code:
80f826 stx $4200     [$804200] A:00c2 X:0020 Y:0000 S:01fb D:0000 DB:80 nVMxdIzC V:215 H:1346
- 1346 <opfetch>
- 1352 <$00 fetch>
- 1358 <$42 fetch>
-    0 <$4200 write start>
-    6 <$4200 actual write>
<lastcycle test>
-    6 <$4201 write start>
-   12 <$4201 actual write>
-   12 <opend>
80f829 cli                     A:00c2 X:0020 Y:0000 S:01fb D:0000 DB:80 nVMxdIzC V:216 H:  12


$4200 is the SNES NMI/IRQ enable register. As we know, a write that happens immediately at the same time as the lastcycle test does not get processed on that lastcycle event. It's as if the $4200 register holds its' previous value for the IRQ test (meaning lastcycle is tested first before the write takes event), or the IRQ test is simply blocked for this lastcycle event (eg a simple delay is needed).

What I do currently is implement a tiny delay by 2 clock ticks, to prevent the lastcycle test from succeeding here, because the former would require SNES CPU pipeline emulation to detect lastcycle edge before the previous cycles' write completes. The problem is that I then test to trigger IRQs again two ticks later, and now I see that V=215, and VIRQs are enabled on line 215, so I internally say "ok, we need to generate a VIRQ". cli doesn't occur until V=216, but an IRQ triggers here anyway because once the /IRQ line is raised, it stays raised. Hardware disagrees with me on this specific test. Note that on real hardware, if the lastcycle test were to happen 6 clocks sooner, it really would fire a second IRQ. So yes, this game will actually completely break, even on real hardware, if timing is adjusted by a mere four hertz on a 21mhz clock. And people say cycle-accurate emulation isn't necessary :P

Anyway, I'm kind of at a loss on how to fix this. The reason it sees V=216,HCLOCK=6 as being valid for V=215 IRQ is because of the delay between your IRQ/NMI positions and when they actually trigger. For NMI, the delay is 2 clocks, just like the NES NMI. For SNES IRQs, the delay is 10 clocks for VIRQ, 14 clocks for [V]HIRQ. So, subtract 10 from V216, HCLOCK6, and you get V215,HCLOCK1360. It thinks V==VTIME, so /IRQ gets ticked.

What I'm thinking is happening, is that because the write happened to $4200 right at the lastcycle edge, it isn't acknowledged here, and the lastcycle event doesn't set the flag to say "ok, we need to trigger another IRQ". And indeed, that's what happens with bsnes now. However, on the next clock, which is not a lastcycle edge, bsnes tests again (it tests every clock because range testing is a major PITA), and sees "ok, NOW the IRQ is valid, let's trigger it on the next lastcycle edge". Note that even range testing wouldn't fix this, as in this case if the SNES really were range testing (highly improbable), the range from immediately after the lastcycle edge would be tested and valid, so the IRQ would trigger anyway on the next lastcycle edge.

Is it possible that the real hardware doesn't actually raise the NMI/IRQ pins immediately when the clock hits their positions, and instead only raises them when the IRQ values are within range during lastcycle edges? The only way to verify this for sure would be with an oscilloscope watching the /NMI and /IRQ lines.

If bsnes were to ignore the "IRQ needs to fire" message at that lastcycle edge due to $4200 write at the same time, and then continued to ignore the message until the next lastcycle edge, no IRQ would trigger. Does this seem like the correct fix, or am I perhaps missing something else?

The problem with this theory is that it would screw with the four-clock "line raised to acknowledgement" delay. You know, say with NMI, it gets raised at clock 2, and then after the delay, the NMI won't fire until clock 6. So I would have to test the IRQ every clock tick, and then when the /IRQ line ticks, and the four-clock delay completes, then set another intermediate variable that says "ok we have an IRQ ready, but we need to wait until lastcycle to actually set the flag that says the IRQ has to fire, and give lastcycle a chance to dismiss this flag". Yet another layer of indirection to triggers the interrupts, doesn't sound correct to me either.

by on (#21886)
Wow, a game for SNES that's as cycle-picky as Battletoads? Never thought I'd see the day...

The NES's IRQ handling is very different from the SNES, so it's very hard to rely on the NES architecture in figuring out what's going on in your case. For one thing, there's no such thing as HIRQ's, and VIRQ's exist only with certain cartridge mappers (MMC3, MMC5, MMC6, and certain third-party mappers) and only work when the PPU is rendering (i.e. no IRQ's during V-Blank, except through the "manual" IRQ trigger that can be done with the MMC3/6).

If I understand right, you're getting an IRQ when there shouldn't be one and are wondering what may cause it not to fire. My guess is that there's probably a latency in the IRQ being enabled. What happens if you turn on an HIRQ on the precise cycle it would fire? What happens if you enable it two clocks before/after that time?

by on (#21890)
Oh yes, it's quite picky. It sets up an HIRQ to get to the near-end of the scanline, then enables VIRQs on the next scanline. You can change the HIRQ position (each 1 value change results in a 4-clock timing difference, plus rounding). Go to 0x86db in the Mortal Kombat II ROM, and this is the 16-bit HIRQ dot position to trigger at. It should be 0x18,0x01 (0x0118 HIRQ). Move it one forward, and it works in bsnes. Move it one backwards (to 0x17), or maybe two to 0x16, and the game will break on real hardware. Wonderful design, hm?

Anyway, I think I have an idea of what it might be. Have to wait until I get home tonight to test on hardware, but I'm thinking my WAI implementation timing might be wrong. If so, that would be a ~6-12 clock timing difference, which would be enough to "fix" MK2, and hopefully show that there isn't a (known) problem with my current IRQ implementation.

How does the NES implement WAI? Does it only execute the opcode one time and "stick" inside that opcode until an interrupt flag is set, or does it keep executing the opcode without incrementing regs.pc, over and over until an interrupt triggers? Or does it not even have WAI? Heh.

Lastly, from what I've seen here, the NES and SNES interrupt mechanisms seem extremely similar to me. Obviously the NES only has NMI, but it has the same "test on last cycle edge", the same "wait 4 clocks from triggering", the same "read NMI status during those four clocks of /NMI hold time, and the value stays set", the same ability to retrigger NMIs multiple times during the same frame, etc.

by on (#21891)
Just my $0.02 about this : I've a NTSC Mortal Kombat II cartridge working fine on my PAL SNES, so it just cannot be as tricky as Battletoads & cie.

However, HIRQs and VIRQs are generated by the SNES PPU itself, not by the cartridge, so they are just in sync for both NTSC or PAL, however, the CPU frequency is different I guess, just like the NES but 2x faster (or 3x for FastROMs).

by on (#21893)
Lame! The NES has no WAI instruction. How do you reliably poll when an NMI triggered on the NES, other than by making the NES NMI routine set a RAM variable to tell you it triggered, then? If you poll the NMI triggered register you have both the potential to block the NMI from occurring (I think), as well as to read the value as being set twice. Quite dangerous.

As far as MK2 working on a PAL SNES, sure. They have the same number of clocks per scaline. It's the number of scanlines per frame that is different. This HIRQ routine should perform identically on both systems. Even though the S-CPU frequencies between NTSC and PAL are slightly different (the NTSC one being optimized for NTSC color subcarrier crystal clock speed * 6; no idea where the PAL timing comes from), the IRQ and the V/H counters both use the same oscillator, so the results would be identical.

It may not be as tricky as Battletoads, but failing because timing is off by one five-millionth of a second is pretty bad nonetheless. The other emulators run this game because they don't support an edge case with IRQ triggering, and so the IRQ that "should" fire in them does not. I couldn't tell you which edge case they're missing, since I didn't write the code for them. I tested this by setting my test ROM to trigger a little sooner, where the interrupt should fire on hardware, and no other emulator fires the second IRQ at all. I do so love when bugs allow games to work, heh. Makes me feel better about all of the comments such as, "but such and such can run it just fine, and with only a 66mhz toaster oven!"

by on (#21894)
byuu wrote:
Lame! The NES has no WAI instruction. How do you reliably poll when an NMI triggered on the NES, other than by making the NES NMI routine set a RAM variable to tell you it triggered, then?

Commercial NES games that don't use the run-entire-game-in-an-NMI-handler method of Super Mario Bros. detect vblank by having the NMI routine change the value of a variable in RAM.

Quote:
I do so love when bugs allow games to work, heh. Makes me feel better about all of the comments such as, "but such and such can run it just fine, and with only a 66mhz toaster oven!"

A simplified model of the hardware has its uses.

by on (#21895)
Quote:
Lame! The NES has no WAI instruction. How do you reliably poll when an NMI triggered on the NES, other than by making the NES NMI routine set a RAM variable to tell you it triggered, then?

Another option is to go the Final Fantasy way : Call a subroutine wich enables NMI, and turn itself in an infinite jump. When NMI hits, clear the flag disables NMIs, discard the saved adress and status on the stack and return from the first subroutine.
However, myself, I prefer have some code in NMI and some other code out of NMI, it keep things well organised and simple.

by on (#21896)
Quote:
Commercial NES games that don't use the run-entire-game-in-an-NMI-handler method of Super Mario Bros. detect vblank by having the NMI routine change the value of a variable in RAM.


Ah, what I was thinking, then. That works too, but has additional overhead since you basically have to test, loop, test, loop, test, loop; rather than just test, test, test. So you potentially lose more time using the - lda $00 : beq - approach. Ah well.

Quote:
A simplified model of the hardware has its uses.


Absolutely. Especially for purity's sake. I've actually considered doing that, writing an emulator that ignores all of the pesky complex edge cases; I just don't have the time to maintain two emulators myself.

The idea would be: it would run on slower hardware, it would allow you to see if your own personal game (or someone else's) relied on an edge case (hence it probably won't work in many emulators and you should fix it), etc.

However, I don't agree with any current SNES implementations of such a concept. They all go for compatibility, add hacks, and add just enough accuracy to get games working, rather than keeping the design pure and simple.

For the SNES, I'd recommend:
- no bus hold delays for H/DMA
- synchronization on opcode edges and each H/DMA byte transfer only
- simplistic NMI / IRQ handler with no edge cases
- no DRAM refresh or long dots, no missing dots, no variable scanlines on interlace mode
- overscan enable is cached, takes effect at the start of the next even frame
- all PPU regs completely locked outside of blanking period
- DMA/HDMA/NMI/IRQ trigger points are always constant and where expected
- V/H blank flag set positions always constant and where expected
- clock rates run at manual specifications, rather than observed average variances (EWJ2 (E) breaks using stock S-SMP clock speed, relies on the fact that 100% of oscilloscope-tested SNES units run ~0.125% faster than stock speed)
- etc

I would welcome, encourage, and contribute free code to such a project, however. It would be cool if others would go back and patch other poorly programmed games to work under this model, as well. Authenticity is nice, but it's quite different when the original game only works by fluke.

Both extreme accuracy and extreme simplicity have their purpose in my book. Middle of the road emulators only help out in the interim while low-end hardware is not capable of running the accurate emulators, they will not serve a purpose in the future, same as NESticle, as soon as even cell phones can run the accurate ones at full speed. Let's hope Moore's Law keeps up though so that can become a reality :)

by on (#21904)
byuu wrote:
I would welcome, encourage, and contribute free code to such a project, however. It would be cool if others would go back and patch other poorly programmed games to work under this model, as well.

It had better work on the full model (and on a Super NES as well), or Bananmos will b*tch your head off.

Quote:
Both extreme accuracy and extreme simplicity have their purpose in my book. Middle of the road emulators only help out in the interim while low-end hardware is not capable of running the accurate emulators, they will not serve a purpose in the future, same as NESticle, as soon as even cell phones can run the accurate ones at full speed. Let's hope Moore's Law keeps up though so that can become a reality :)

The problem is that it becomes harder to run homebrew code on platforms with a D-pad as newer, faster platforms come out. GBA homebrew was easy after 6 months, but it took roughly 18 months (May 2006) to make homebrew on Nintendo DS easy to get started with. PSP is just an endless game of cat-and-mouse with Sony. Mobile phones are often locked down to run only applications approved by the network operator, and (except for N-Gage) they don't have a D-pad.

by on (#21916)
Found the problem. My IRQ support is fine, it was due to bad WAI timing. Not that anyone in the NES emu dev forum cares, but the proper way to handle WAI is:

Code:
wai(0xcb) {
//last_cycle() will set event.wai to false
//once an NMI / IRQ edge is reached
1:event.wai = true;
  while(event.wai) {
    last_cycle();
    op_io();
  }
2:op_io();
}


Guess it was an SNES problem after all, sorry about that.

Quote:
The problem is that it becomes harder to run homebrew code on platforms with a D-pad as newer, faster platforms come out. GBA homebrew was easy after 6 months, but it took roughly 18 months (May 2006) to make homebrew on Nintendo DS easy to get started with. PSP is just an endless game of cat-and-mouse with Sony. Mobile phones are often locked down to run only applications approved by the network operator, and (except for N-Gage) they don't have a D-pad.


Yes, DRM will always be beatable, but what most people don't understand is that as the industry refines it, it will become harder and harder to circumvent, and take more and more time. Eventually, products will have to be virtually ubiquitous to get hacked. Sony's bullshit cost them a customer for their PSP. I might pick one up after the product is discontinued, and the final BIOS is hacked, but not sooner. I've personally always considered phones phones, so I'm not too upset, but it would be nice to be able to run third party apps on them. I really can't see why they would be so against outside applications that would make their phones more valuable to own. There's no risk of piracy in allowing homebrew for them.

by on (#21917)
Quote:
I might pick one up after the product is discontinued, and the final BIOS is hacked, but not sooner.

That's just what I told to myself. I agree 100% with you on this point, heh.

by on (#21920)
byuu wrote:
I've personally always considered phones phones, so I'm not too upset, but it would be nice to be able to run third party apps on them. I really can't see why they would be so against outside applications that would make their phones more valuable to own. There's no risk of piracy in allowing homebrew for them.

They run the risk of people downloading homebrew instead of buying marked-up apps from the network operator's online store.