I modified the way that my emulator detects NMI. If (NMI_occurred AND NMI_output) changes from false to true, I set an NMI request flag in the CPU that is checked between instructions. The NMI handler sets the flag to false.
With that in place, there should not be any lost NMI events resulting from a race condition between reading the PPU Status Register ($2002), which has the side effect of setting NMI_occurred to false, and detecting the start of VBlank.
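A minimal sketch of the edge detection described above, written in Python for brevity. The class name `NmiLine` and all of its method names are hypothetical and are not taken from the emulator itself; the only details that come from the description are the (NMI_occurred AND NMI_output) rising edge, the latched request flag, and the $2002 read side effect.

```python
class NmiLine:
    """Latches an NMI request when (nmi_occurred AND nmi_output) rises."""

    def __init__(self):
        self.nmi_occurred = False    # set at the start of VBlank, cleared by reading $2002
        self.nmi_output = False      # bit 7 of the PPU control register ($2000)
        self.nmi_requested = False   # latched flag the CPU checks between instructions
        self._prev_line = False      # previous value of (nmi_occurred AND nmi_output)

    def _update(self):
        # Re-evaluate the combined signal after every change to either input.
        line = self.nmi_occurred and self.nmi_output
        if line and not self._prev_line:
            self.nmi_requested = True  # rising edge: remember it until the CPU takes it
        self._prev_line = line

    def start_vblank(self):
        self.nmi_occurred = True
        self._update()

    def end_vblank(self):
        self.nmi_occurred = False
        self._update()

    def write_ppuctrl(self, value):
        self.nmi_output = bool(value & 0x80)
        self._update()

    def read_ppustatus(self):
        # Reading $2002 returns the VBlank bit and clears nmi_occurred,
        # but it does not clear a request that was already latched.
        status = 0x80 if self.nmi_occurred else 0x00
        self.nmi_occurred = False
        self._update()
        return status

    def take_nmi(self):
        # Called between instructions; the NMI handler entry clears the flag.
        if self.nmi_requested:
            self.nmi_requested = False
            return True
        return False
```

Because the request is latched separately from `nmi_occurred`, a $2002 read that lands right after the start of VBlank clears the status bit without dropping the pending NMI.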
Even so, I still had to introduce two minor hacks. First, instead of setting NMI_occurred to true on dot 1 of scanline 241, the emulator sets it to true on dot 17 of that scanline. With that hack in place, the NTSC and PAL versions of Marble Madness and Battletoads appear to function correctly.
The second level of Battletoads appears to be sensitive to NMI timing, sprite 0 hit detection and sprite overflow detection. If NMI timing or sprite 0 hit detection is slightly off, the stage can freeze. If the overflow detection is slightly off, then enemy hit detection can fail completely, making it impossible to advance (neither the player nor the enemies can be injured).
The second hack was to set the sprite overflow flag at dot 256 (at the beginning of HBlank), even though it gets computed within the first 64 dots of the scanline.
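The two hacks can be sketched as a pair of timing constants plus a deferred flag. This is only an illustration under stated assumptions: `PpuFlags`, `evaluate_sprites`, and `tick` are made-up names, the "more than 8 sprites" rule is a simplification of the real overflow logic, and limiting the overflow update to scanlines below 240 is also an assumption.

```python
VBLANK_SCANLINE = 241
VBLANK_SET_DOT = 17         # hack 1: dot 17 instead of the nominal dot 1
OVERFLOW_VISIBLE_DOT = 256  # hack 2: expose the flag at the start of HBlank


class PpuFlags:
    def __init__(self):
        self.nmi_occurred = False
        self.sprite_overflow = False    # value a $2002 read would see
        self._pending_overflow = False  # result computed early in the scanline

    def evaluate_sprites(self, sprites_on_line):
        # Assumed simplification: overflow means more than 8 sprites on the line.
        # The result is stored but not yet made visible.
        self._pending_overflow = sprites_on_line > 8

    def tick(self, scanline, dot):
        # Hack 1: delay the VBlank flag to dot 17 of scanline 241.
        if scanline == VBLANK_SCANLINE and dot == VBLANK_SET_DOT:
            self.nmi_occurred = True
        # Hack 2: publish the early overflow result only at dot 256.
        if scanline < 240 and dot == OVERFLOW_VISIBLE_DOT and self._pending_overflow:
            self.sprite_overflow = True
            self._pending_overflow = False
```

Keeping both dots as named constants makes it easy to experiment with other values while looking for the timing that lets the hacks be removed.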
I would obviously like to remove these hacks at some point. My emulator passes only a subset of Blargg's timing tests. However, FCEUX 2.2.2 seems to fail the same tests that my emulator does, so I do not know which tests the emulator really needs to pass to improve things. Suggestions are welcome.
Also, level 3 of Battletoads contains a bug. When multiple rats are on the screen at the same time and the player punches one of them, the center (far) brain-like background glitches. You can see the effect in this video of the game running on an actual NES. We should probably mention that on the Game Bugs wiki page, especially since I spent a while trying to get rid of the effect until I reproduced it on FCEUX and Nestopia.