PAL NES, sprite evaluation and $2004 reads/writes

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
PAL NES, sprite evaluation and $2004 reads/writes
by on (#192566)
I'm trying to get a better idea of how PAL consoles behave when it comes to sprite evaluation and $2004 reads/writes.
In this old post, jsr said the oam_test on their PAL NES returned bad values from the middle of line 3, to the start of line 8.

As far as I can tell, this implies that, when rendering is disabled on a PAL NES, sprite evaluation is only active from around scanline 264 to the end of vblank (scanline 310). Implementing this in Mesen gives roughly the same result as jsr described:
[Attachment: oam_read_001.png]

So it seems like when vblank ends, sprite evaluation also stops if rendering is disabled - is this correct?

Quotes from a few pages on the wiki:
Quote:
(1) To compensate for PAL's longer vblank period, the 2C07 always enables the OAM refresh logic, regardless of whether rendering is enabled.
(2) Additionally, it will continue to refresh during the visible portion of the screen even if rendering is disabled. Because of this, OAM DMA must be done near the beginning of vertical blank on PAL, and everywhere else it is liable to conflict with the refresh
(3) In the 2C07, sprite evaluation can never be fully disabled, and will always start 20 scanlines after the start of vblank
This mixes sprite evaluation and oam refresh - are they 2 unrelated processes? These make it sound like sprite evaluation is always on during scanlines 0-239, even if rendering is off?
But if that's the case, it would contradict jsr's results for their PAL console. If sprite evaluation can't be disabled, the $2004 reads should give invalid results until the last line of the test.

Could someone confirm the output of blargg's oam_read test on a PAL console?

Also relevant is this video of a PAL NES running the "Quantum Disco Brothers" demo with a powerpak. This means the demo works correctly on a PAL console. This demo performs a sprite DMA during vblank which ends around scanline 264, cycle ~60. If I turn on sprite evaluation on scanline 263, the demo breaks (artifacts appear on the screen, or the scrolling stops). Turning on sprite evaluation at the start of scanline 264 makes it work (which means the last writes of sprite DMA may or may not be important in this case - so can't use this as a precise measurement).

So far, what I'm understanding from all this is that:
-It looks like PAL sprite evaluation can be turned off during scanlines 0-239 if rendering is disabled (how does that make sense vs OAM decay, since apparently PAL OAM ram can't decay, even with rendering disabled?)
-Sprite evaluation/oam refresh begins around the start of scanline 264 during vblank.
Re: PAL NES, sprite evaluation and $2004 reads/writes
by on (#192568)
Sour wrote:
This mixes sprite evaluation and oam refresh - are they 2 unrelated processes?
Sprite evaluation is the only mechanism for OAM DRAM refresh in the 2C02. I see no reason to think that part is different in the 2C07.

Reading at least one byte from every row of OAM DRAM at least every ≈1.5ms is necessary to prevent data loss. They could have gotten really clever, and just run sprite evaluation once every 20 scanlines or so to prevent data loss. But I don't think they were that clever.
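To put the ≈1.5ms figure in terms of vblank length, here is a rough Python calculation. The clock rates (master clock divided by 4 on NTSC and by 5 on PAL) and the 341-dot scanline are assumptions brought in from general NES documentation rather than from this thread, and the 1.5ms limit is only the approximate value quoted above, so treat the numbers as an estimate.

```python
# Rough timing sketch; the clock rates are assumptions taken from
# general NES documentation, not from this thread.
NTSC_PPU_HZ = 21_477_272 / 4   # 2C02 dot clock
PAL_PPU_HZ = 26_601_712 / 5    # 2C07 dot clock
DOTS_PER_SCANLINE = 341
REFRESH_LIMIT_MS = 1.5         # approximate figure quoted above


def scanlines_to_ms(scanlines, ppu_hz):
    """Duration of a whole number of scanlines, in milliseconds."""
    return scanlines * DOTS_PER_SCANLINE / ppu_hz * 1000.0


ntsc_vblank_ms = scanlines_to_ms(20, NTSC_PPU_HZ)  # 20 vblank scanlines
pal_vblank_ms = scanlines_to_ms(70, PAL_PPU_HZ)    # 70 vblank scanlines

print(f"NTSC vblank: {ntsc_vblank_ms:.2f} ms")
print(f"PAL vblank:  {pal_vblank_ms:.2f} ms")
print("NTSC vblank within limit:", ntsc_vblank_ms < REFRESH_LIMIT_MS)
print("PAL vblank within limit: ", pal_vblank_ms < REFRESH_LIMIT_MS)
```

Under these assumptions the NTSC vblank (about 1.27ms) stays under the limit while the PAL vblank (about 4.49ms) does not, which would explain why the 2C07 needs some form of refresh during vblank.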

Can't help with the other questions...
Re: PAL NES, sprite evaluation and $2004 reads/writes
by on (#192593)
Sour wrote:
Could someone confirm the output of blargg's oam_read test on a PAL console?

I ran it way back when: viewtopic.php?p=68971#p68971

I'm not all that convinced that sprite evaluation is active all the way 20 scanlines after vblank; it's conjecture. Most of what we know about this is based on my limited tests.

If you assume that OAM can't decay, it'd be fairly easy to write a new test for this. Simply wait varying times from the start of vblank with rendering off, read a byte (or two) from OAM and see if the byte(s) got corrupted. The data should tell us on which PPU cycles (or scanlines) sprite evaluation is active. (This is assuming that merely setting OAM_ADDR for reads and/or reading via OAM_DATA won't be able to somehow mess up the refresh and possibly corrupt the data.)
Re: PAL NES, sprite evaluation and $2004 reads/writes
by on (#192597)
Per the Micro Machines synchronization trick, we know that reading $2004 during rendering (and thus, presumably OAM refresh also) just returns the current byte on the OAM data bus.

I've been meaning to write (and failing to get around to it) a simple test that would do nothing but
* Make sure that $2004 is readable
* Preload OAM with a simple table where the first byte is $FF and the remaining bytes are all $00, or something else unambiguous
* Wait for vblank, and start a loop that checks whether the value read from $2004 is $FF or not (approximately equivalent to "rendering is inactive or active") that would last just a little over 312 scanlines on a Dendy, and save the result
* Upload the result to the screen

Should be good enough to let us know what scanlines refresh is active...
Re: PAL NES, sprite evaluation and $2004 reads/writes
by on (#192630)
I've edited blargg's original test to make it read a specific address ($80) in OAM, which has to contain a specific value (also $80). The rest of OAM contains $00.

I'm terrible at NES coding, so I built 8 different copies of the test for different test cases:
-NTSC vs PAL
-Rendering enabled vs disabled
-First 256 scanlines after vblank vs the next 256 scanlines after that

NTSC or PAL is to try to read the OAM around the same cycle on each scanline.
On NTSC, the read occurs around cycle 80. On PAL, it occurs around cycle 100 at first, and slowly drifts to around cycle 130 (because I suck at this - but this shouldn't matter much)
And then I need 2 tests because it would take me hours to figure out how to modify the code to log more than 256 values :)

So the code reads OAM, stores the value, wastes time until the next scanline, and repeats, 256 times. The "Rows257-512" versions just add a 256 scanline delay at the start of the test to resume where the last test left off.
If the value read is $80, a dash is shown, otherwise a star is shown. Each character equals 1 scanline - the first character being scanline 241 (i.e. after vblank flag is set).

It seems to be working properly on Mesen. I also tested it on puNES and Nestopia UE and got the same results from both. When rendering is enabled, NTSC/PAL both work correctly during all of vblank. With rendering disabled, NTSC/PAL both work the whole time. Which is pretty much what I expected from emulators.

Could I ask you guys to run these on NTSC/PAL and see what the results are?

Also: the test only writes to OAM RAM a single time. This means that I would expect the NTSC test with rendering disabled to fail completely.
Re: PAL NES, sprite evaluation and $2004 reads/writes
by on (#193199)
It'd really be appreciated if someone with the proper setup for it could run at least the PAL tests.

I'd run them myself if I could, but I'd have to spend a couple hundred dollars on a PAL console, a flash cart, and a bunch of other stuff.
Also, sorry for the bump - but if at all possible I'd like to avoid having wasted 2-3 hours writing tests that nobody will ever run :p
Re: PAL NES, sprite evaluation and $2004 reads/writes
by on (#193233)
I can run them at some point, I just don't have my PAL NES hooked up right now.
Re: PAL NES, sprite evaluation and $2004 reads/writes
by on (#193244)
OK, here are some results.

PalRenderingDisabled_Rows1-256.nes
- All "*"

PalRenderingDisabled_Rows257-512.nes
- All "*"

PalRenderingEnabled_Rows1-256.nes
- 24 "-", the rest were "*"
- Quite reliable from different powerons/resets, sometimes a few "-" appear at the bottom (probably coincidental).

PalRenderingEnabled_Rows257-512.nes
- Very dependent on poweron/reset:
- - Sometimes all "*"
- - Sometimes 55 "*", then 25 "-", the rest "*"

NtscRenderingEnabled_Rows1-256.nes
- Dependent on poweron/reset:
- - Sometimes all "*" from poweron, but changes to 20 "-" and rest "*" from reset (toggles between those two options randomly)
- - On some powerons works reliably (20 "-" and rest "*" over many resets).

NtscRenderingEnabled_Rows257-512.nes
- 5 "*", then 21 "-", the rest "*"
- After a few resets all "*" (probably poweron/reset dependent like the one above).

NtscRenderingDisabled_Rows1-256.nes
NtscRenderingDisabled_Rows257-512.nes
- Sometimes all "*", sometimes all "-" (the "-" seems to be less common).
Re: PAL NES, sprite evaluation and $2004 reads/writes
by on (#193250)
Thanks a lot for taking the time to run these!

The results for PAL are a bit surprising though.

There are 2 ways I can explain the tests with rendering disabled displaying all stars. Either using sprite DMA when rendering is disabled doesn't work properly, or there is some sort of OAM decay going on. The decay theory would imply that it might decay after 300+ scanlines with rendering disabled and no accesses to OAM - but this probably contradicts your previous test?

I expected the 257-512 PAL test w/ rendering enabled to always be ~55 '*' + ~25 '-' + rest is stars. I'm not sure what could make it return all stars - decay seems like an unlikely culprit - maybe PPU/CPU alignment like what seems to be the case in the NTSC tests?

Either way, this seems to confirm that reading OAM via $2004 on PAL NES should be (mostly) reliable from scanline 240 to 264 (25 scanlines - the 257-512 test sometimes displaying 25 dashes implies that scanline 240 is also safe). So sprite evaluation (or whatever is actually happening during vblank) seems to start around scanline 265 (or in the last half of scanline 264).
So the "Quantum Disco Brothers" demo working on a PAL NES makes sense.

For NTSC:
-NTSC disabled returning all stars is what I expected due to OAM decay.
-Seems to be affected by ppu/cpu alignment (the same was true with blargg's original test too)
-Reading from $2004 (when it decides to work) is safe for 21 scanlines, from 240 to 260 (i.e. all scanlines outside the visible picture except the prerender line)
Re: PAL NES, sprite evaluation and $2004 reads/writes
by on (#193255)
Sour wrote:
There are 2 ways I can explain the tests with rendering disabled displaying all stars. Either using sprite DMA when rendering is disabled doesn't work properly, or there is some sort of OAM decay going on. The decay theory would imply that it might decay after 300+ scanlines with rendering disabled and no accesses to OAM - but this probably contradicts your previous test?

Yeah, it does. And given how simple that test is (it's oam-decay-test from here), I think it's unlikely to have a bug in it. It basically just wrote $00 to $2001 right after VBL when A was being pressed on the controller (otherwise it wrote $14 to enable rendering).

Some other hypotheses (might have some holes in them):
(1) OAM refresh is always active when rendering is disabled.
(2) The writes to $2003 and/or reads from $2004 somehow mess up the refresh and/or the data.

Quote:
I expected the 257-512 PAL test w/ rendering enabled to always be ~55 '*' + ~25 '-' + rest is stars. I'm not sure what could make it return all stars - decay seems like an unlikely culprit - maybe PPU/CPU alignment like what seems to be the case in the NTSC tests?

Yeah that one's strange, because the 1-256 test always worked as expected. (This makes me think CPU/PPU alignment shouldn't be an issue.)

The 2nd hypothesis at least partially agrees with this. If the sequence in the program is (1) Wait for VBL, (2) Wait for 256 scanlines, (3) Start reading OAM, there's a good 56 scanlines of rendering time before the next VBL to corrupt the data in OAM before landing on VBL (where the reads would work if it wasn't for the corrupted data). The corruption wouldn't be an issue for the 1-256 test, because the expected "good" values already appear at the beginning of the test before any (possible) corruption takes place. But this all is just guessing, of course.

I guess more testing is needed. It might be a good idea to isolate the reads, i.e., wait for VBL, wait some time, read a byte, and repeat until all timing offsets have been covered.

Hypothesis 1 should be easy to falsify by disabling rendering, waiting for VBL, doing an OAM DMA, waiting for VBL, and enabling rendering. If the sprites show up OK, then refresh certainly wasn't active all the way. (In fact, I think oam-decay-test already falsifies this hypothesis, because it comes straight out of disabled rendering state to VBL, and only uploads the sprites once before entering the main loop.)
Re: PAL NES, sprite evaluation and $2004 reads/writes
by on (#193262)
thefox wrote:
Some other hypotheses (might have some holes in them):
(1) OAM refresh is always active when rendering is disabled.
(2) The writes to $2003 and/or reads from $2004 somehow mess up the refresh and/or the data.
OAM refresh being always active when rendering is disabled would also mean that blargg's original test shouldn't have worked. So like you said, between your test and this, it's unlikely that OAM refresh would always be enabled.

Quote:
The 2nd hypothesis at least partially agrees with this. If the sequence in the program is (1) Wait for VBL, (2) Wait for 256 scanlines, (3) Start reading OAM, there's a good 56 scanlines of rendering time before the next VBL to corrupt the data in OAM before landing on VBL.
That's what the test does, but it seems unlikely that normal rendering would corrupt the contents of OAM if the CPU is just busy wasting cpu cycles?
It's also not a "perfect" continuation of the previous test - the first test, by the time it reaches the 256th scanline, will be reading $2004 around cycle 130. The 2nd test will resume on the 257th scanline, but around cycle 100 (and it will drift back to cycle 130 or so by the 512th scanline) - but it seems fairly unlikely that this would have any big impact.


I think the results might be my own fault though - I forgot to write $00 to $2003 before writing to $4014.
Since rendering is never enabled throughout the test, maybe the value in OAMADDR isn't guaranteed to be 0? Which would screw up the DMA transfer and make the entire test display stars (for both NTSC and PAL). If I tweak Mesen to have OAMADDR = 40 as a startup value, the rendering enabled tests still pass, but the rendering disabled tests fail (all stars).
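To illustrate why a non-zero OAMADDR would turn the whole test into stars, here is a small Python sketch. It assumes OAM DMA behaves like 256 sequential $2004 writes starting at the current OAMADDR and wrapping around, and it reads the "40" above as a decimal value; `oam_dma` is a hypothetical helper, not code from the test ROMs or from Mesen.

```python
def oam_dma(cpu_page, oam_addr):
    """Copy 256 bytes into OAM starting at oam_addr, wrapping at 256."""
    oam = [0x00] * 256
    for offset, value in enumerate(cpu_page):
        oam[(oam_addr + offset) & 0xFF] = value
    return oam


# Source page as in the first version of the test: $80 at offset $80,
# $00 everywhere else.
page = [0x00] * 256
page[0x80] = 0x80

good = oam_dma(page, oam_addr=0)
bad = oam_dma(page, oam_addr=40)   # startup value tried in Mesen (assumed decimal)

print(hex(good[0x80]))             # 0x80: the test would see the expected value
print(hex(bad[0x80]))              # 0x0: the test would print a star
print(hex(bad.index(0x80)))        # 0xa8: where the marker byte actually landed
```

With the offset, the marker byte ends up at a different OAM address, so every read of $80 returns $00 regardless of what the refresh logic is doing.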

I recompiled all 8 tests to add a write to $2003 before the write to $4014. This should fix any potential issue with the oam address possibly not being set to 0 due to rendering always being turned off. Hopefully that changes the PAL results with rendering off - otherwise there's something more complex going on that I'm not quite understanding.
Re: PAL NES, sprite evaluation and $2004 reads/writes
by on (#193277)
OK, well this is interesting.

PalRenderingDisabled_Rows1-256v2.nes
- 24 "-", 46 "*", rest "-"
- The results were fairly consistent across resets/powerons.
- Once or twice got 24 "-", rest "*" after reset.

PalRenderingDisabled_Rows257-512v2.nes
- 80 "-", 46 "*", rest "-"
- Results were consistent wrt poweron/reset.

...

So, seems like 24 scanlines after start of VBL it does "refresh" (or whatever) until the end of VBL, and nowhere else. Now, not sure how this fits together with oam-decay-test. It seems like there are 312-46 = 266 scanlines without refresh, yet the sprites somehow survive over to the next frame.

(BTW, please use shorter filenames the next time. It's a PITA telling them apart in PowerPak otherwise because it only displays ~30 or so characters. :))
Re: PAL NES, sprite evaluation and $2004 reads/writes
by on (#193281)
So I guess we've proven that sprite evaluation can actually be disabled on a PAL NES - except from scanline 265 to 310 (assuming that "Sprite evaluation" is what is actually running during those scanlines).
Like you said though, it's a mystery how the ram's contents survived the 1-frame delay between the $4014 write and the start of the test. I'm not sure any sort of test rom would be able to figure that part out, though.
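The scanline ranges discussed so far can be turned into a small Python model that predicts the dash/star columns. This is only a sketch of the working hypothesis: it assumes reads are bad during rendering (scanlines 0-239 plus the pre-render line) and, on PAL, during scanlines 265-310 regardless of rendering. It ignores OAM decay and CPU/PPU alignment, and `read_is_bad` and `pattern` are made-up names.

```python
from itertools import groupby

FRAME_SCANLINES = {"ntsc": 262, "pal": 312}
FIRST_ROW_SCANLINE = 241  # the tests start logging right after the vblank flag is set


def read_is_bad(region, rendering, scanline):
    """Hypothetical model: True if a $2004 read should not return OAM data."""
    prerender = FRAME_SCANLINES[region] - 1
    if rendering and (scanline < 240 or scanline == prerender):
        return True   # sprite evaluation during rendering
    if region == "pal" and 265 <= scanline <= 310:
        return True   # forced refresh window suggested by the results
    return False


def pattern(region, rendering, first_row=0, rows=256):
    """Run-length encode the expected '-' (good) / '*' (bad) column."""
    total = FRAME_SCANLINES[region]
    marks = [
        "*" if read_is_bad(region, rendering, (FIRST_ROW_SCANLINE + row) % total) else "-"
        for row in range(first_row, first_row + rows)
    ]
    return [(mark, len(list(group))) for mark, group in groupby(marks)]


print(pattern("pal", False))        # rows 1-256, rendering disabled
print(pattern("pal", False, 256))   # rows 257-512, rendering disabled
print(pattern("pal", True, 256))    # rows 257-512, rendering enabled
print(pattern("ntsc", True, 256))   # rows 257-512, rendering enabled
```

Under these assumptions the PAL rendering-disabled columns come out as 24 "-", 46 "*", rest "-" and 80 "-", 46 "*", rest "-", which matches the v2 results. The rendering-enabled columns reproduce the most common patterns reported earlier (55 "*" then 25 "-" on PAL, 5 "*" then 21 "-" on NTSC).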

So to sum up, the most common results for these tests should look like this:
[Attachment: results.png]

This is what I get in Mesen at the moment (if I turn on OAM decay emulation for NTSC).
Although it looks like sometimes the OAM decay on NTSC isn't quite as fast as you would expect (and you end up with a screen of dashes for the rendering disabled test, instead of stars)?

Thanks again for taking the time to run these!
Sorry about the long filenames - I'll keep that in mind for the next time.
Re: PAL NES, sprite evaluation and $2004 reads/writes
by on (#193318)
Sour wrote:
Although it looks like sometimes the OAM decay on NTSC isn't quite as fast as you would expect (and you end up with a screen of dashes for the rendering disabled test, instead of stars)?

Yeah, I got that a few times. Although the fact that this test only reads one value ($80) and from a single memory address ($80) might have something to do with it.
Re: PAL NES, sprite evaluation and $2004 reads/writes
by on (#193420)
I did some improvements to the tests, which should (hopefully) give similar/better results.

Changes:
-The tests now read 4x from OAM per scanline at the following addresses and expect the following values:
Addr : Value
$22 : $A0 (Note: The test writes $BC, but expects $A0 to be read due to the missing bits in OAM - so not emulating the missing bits will cause the tests to fail)
$80 : $80
$A0 : $FC
$E0 : $55
(All other addresses are initialized with $C3)

The results are XORed together, and the code expects the end result to be $89 ($A0^$80^$FC^$55 = $89)
This means it should be more sensitive to OAM decay, among other things (I would expect the NTSC rendering off test to display all stars most of the time with this).

-Less jitter in the PAL test - the first OAM read occurs between cycle 80-90, the 4th read is between cycle 177-187.
-They display "Passed" when they match the patterns shown on the screenshot I posted above.
-Shorter filenames :)
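The expected values in the first change above can be checked with a few lines of Python. The sketch assumes the "missing bits" are bits 2-4 of each sprite's attribute byte (byte 2 of every 4-byte OAM entry), so such bytes read back masked with $E3. `oam_readback` is a hypothetical helper, not part of the test ROMs.

```python
from functools import reduce

ATTRIBUTE_MASK = 0xE3  # assumption: bits 2-4 of each attribute byte do not exist


def oam_readback(addr, written):
    """Value expected back from OAM for a byte written at addr."""
    if addr % 4 == 2:          # byte 2 of each 4-byte entry is the attribute byte
        return written & ATTRIBUTE_MASK
    return written


# Address -> value written by the test; only $22 is an attribute byte.
writes = {0x22: 0xBC, 0x80: 0x80, 0xA0: 0xFC, 0xE0: 0x55}
reads = [oam_readback(addr, value) for addr, value in writes.items()]
checksum = reduce(lambda a, b: a ^ b, reads)

print([f"${value:02X}" for value in reads])   # ['$A0', '$80', '$FC', '$55']
print(f"${checksum:02X}")                     # $89
```

An emulator that returns $BC for address $22 would produce a different XOR result, which is why not emulating the missing bits makes the tests fail.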

On Mesen, these changes give the exact same results as before, and I'd expect the results to be fairly similar on a NES too (hopefully more consistent results due to reading 4 different addresses?).

If you get a chance, it'd be interesting to see how these behave on NTSC & PAL - no rush though, whenever you have the time to spare is fine!
Re: PAL NES, sprite evaluation and $2004 reads/writes
by on (#193460)
I adopted a new notation below, hopefully it's self explanatory. Basically, 24-, * = "24 -, rest *".

PalRendOff1_v3.nes
- 24-, *
- I'm quite sure I got a 24-, 46*, - pattern the first few times I tried this, but after the 24-, * first appeared it always appeared, over MANY tries (powerons/resets). (I may have been running a wrong test by accident the first few times, who knows.)

PalRendOff2_v3.nes
- 80-, *
- Not really consistent with results from PalRendOff1_v3.

PalRendOn1_v3.nes
- This worked as expected: 24-, *

PalRendOn2_v3.nes
- *
- Certainly not expected, looks like something goes wrong while it tries to read during VBL.

Good news is that the results were consistent across resets/powerons (discounting the first test, but that could've been my mistake). Unfortunately they don't line up with previous results.
Re: PAL NES, sprite evaluation and $2004 reads/writes
by on (#193548)
Well, that's... confusing. I'm not really quite sure what to make of those results.

blargg's framework is able to print the values of registers on the screen, so I'll try to come up with a slightly different test that displays the values actually read from OAM - maybe it'll help figure out what's actually going on.
Re: PAL NES, sprite evaluation and $2004 reads/writes
by on (#215313)
Sorry for reviving a dead topic.

Question: why does Micro Machines read from $2004?

And could it be used as a second "sprite zero" style midscreen hit?
Re: PAL NES, sprite evaluation and $2004 reads/writes
by on (#215314)
See - viewtopic.php?p=142001#p142001

If you can guarantee that a specific byte occurs exactly once in OAM, and not as the Y coordinate, and the value is not the $FF "empty" value, then ... I think you might be able to poll reads from $2004 to find out when that sprite is being drawn?

Or maybe not; the rate at which the CPU can poll may well just be simply too slow to catch the exact moment that the right byte shows up in OAM evaluation.
Re: PAL NES, sprite evaluation and $2004 reads/writes
by on (#215315)
What Micro Machines does: viewtopic.php?p=67668#p67668

It's just a simple +1/+0 CPU cycle adjustment based on what value is returned by PPU. I actually did the exact same thing in a demo I did once (before I knew what Micro Machines did...), although I don't think I ever released it in any form.
Re: PAL NES, sprite evaluation and $2004 reads/writes
by on (#215328)
What about Super Cars? It's definitely trying to time a border at the top of the screen via $2004 reads.
Re: PAL NES, sprite evaluation and $2004 reads/writes
by on (#215352)
tokumaru wrote:
What about Super Cars? It's definitely trying to time a border at the top of the screen via $2004 reads.

As far as I can tell, it's polling $2004 to wait for the start of pre-render line, then uses a timed loop to go the rest of the way. (They could have polled for sprite 0 hit flag to be cleared instead, had they known about that.)
Re: PAL NES, sprite evaluation and $2004 reads/writes
by on (#215403)
Well that's disappointing.
Re: PAL NES, sprite evaluation and $2004 reads/writes
by on (#229362)
Sorry for the additional bump, but I'm trying to get a better understanding of the PAL timings indicated by these results compared to prior testing referenced in the wiki. On the errata page, the wiki says OAM is inaccessible from scanlines 21 through 70, but these results seem to contradict that, showing that the actual region is 25 through 70. The valid 24 scanline region is larger than the NTSC's 20 scanlines, 2557.5 cycles (= 24 * 106 9/16) vs 2273.3 (= 20 * 113 2/3). I take this to mean that if sprite DMA can finish within vblank on NTSC, it should also be safe on a PAL system. Is this all correct?
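The cycle counts above can be reproduced exactly with Python's `fractions` module. The sketch assumes 341 dots per scanline, 3 dots per CPU cycle on NTSC and 3.2 on PAL; those ratios are standard figures rather than something measured in this thread.

```python
from fractions import Fraction

DOTS_PER_SCANLINE = 341
NTSC_DOTS_PER_CPU_CYCLE = Fraction(3)       # assumed 2C02 ratio
PAL_DOTS_PER_CPU_CYCLE = Fraction(16, 5)    # assumed 2C07 ratio (3.2)

ntsc_per_line = DOTS_PER_SCANLINE / NTSC_DOTS_PER_CPU_CYCLE   # 113 2/3 CPU cycles
pal_per_line = DOTS_PER_SCANLINE / PAL_DOTS_PER_CPU_CYCLE     # 106 9/16 CPU cycles

ntsc_window = 20 * ntsc_per_line   # safe scanlines on NTSC
pal_window = 24 * pal_per_line     # safe scanlines on PAL per these results

print(float(ntsc_window))               # about 2273.3 CPU cycles
print(float(pal_window))                # 2557.5 CPU cycles
print(float(pal_window - ntsc_window))  # extra CPU cycles available on PAL
```

If the 24-scanline figure holds, the PAL window is roughly 284 CPU cycles longer than the NTSC one.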

I ask because while I'd like sprite DMA done last during vblank to use the guaranteed even cycle for quickly reading input on NTSC, I've seen frequent mention that sprite DMA should be done first on PAL and would prefer to avoid special-casing it if I can.

Edit: Fixed some typos.
Re: PAL NES, sprite evaluation and $2004 reads/writes
by on (#229363)
Fiskbit wrote:
Sorry for the additional bump, but I'm trying to get a better understanding of the PAL timings indicated by these results compared to prior testing referenced in the wiki. On the errata page, the wiki says OAM is inaccessible from scanlines 21 through 70, but these results seem to contradict that, showing that the actual region is 25 through 70. The valid 24 scanline region is larger than the NTSC's 20 scanlines, 2557.5 cycles (= 24 * 106 9/16) vs 2273.3 (= 20 * 113 2/3). I take this to mean that if sprite DMA can finish within vblank on NTSC, it should also be safe on a PAL system. Is this all correct?

The wiki information is likely not accurate down to a single scanline in this case. It's just roughly correct: the PAL PPU does (has to) do something like this during vblank because of DRAM decay. What you said seems like a fair assumption (in fact, it would make sense from Nintendo's point of view to ensure that NTSC code stays compatible with PAL in this manner). So I'm fairly sure if your OAM DMA fits within vblank on NTSC, it should also work on PAL, regardless of where you do it.