I've been going over everything in my emulator again trying to see if there were any open ends or inconsistencies I could find. This test stands out as being pretty important but untested.
Just as a refresher, this test was originally put out by fred in response to this thread. The idea is that latching the second $2006 write is delayed by a few ppu ticks.
However the results were never verified on real hardware (that I know of), and the test seems to have changed somewhat since its original creation (there are now 2 'unstable' offsets.)
Can anyone run this test on console and verify the results?
Also this isn't the only ppu register that might be experiencing this effect. I think it would be useful for example to modify the bit that is being changed in blargg's original nmi_sync test rom to gauge different relative timings of when writes to PPU registers take effect.
If anyone has any other ideas about this I'd really be interested to hear them.
I guess 2nd2006_next_level.nes is invalid ROM
I tested it on Famicom AV via Everdrive N8 and InviteNES flashcartridges
and got blank screen in both cases.
BTW, nestopia says it is corrupt file.
256inc.nes is valid
2nd2006-256.7z
Here is an AVI capture (FCAV+EDN8)
Eugene.S wrote:
I guess 2nd2006_next_level.nes is invalid ROM
I tested it on Famicom AV via Everdrive N8 and InviteNES flashcartridges
and got blank screen in both cases.
BTW, nestopia says it is corrupt file.
It is incorrectly marked as having 2 16KiB PRG banks but only has one. You can fix the header accordingly and it will work.
Cool thanks! That certainly raises more questions than it answers. Why do they turn blue? Curious.
The other file should have 16k PRG and 8k CHR, so the header is wrong. Byte 4 should be 1 instead of 2. If you can get this version working and get some results for each of the 8 alignments that would be immensely helpful. EDIT: Oops, ninja'd.
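The header fix described above can be sketched in a few lines of Python. This is only a minimal sketch, assuming the standard iNES layout (a 16-byte header starting with `NES\x1A`, with byte 4 giving the number of 16 KiB PRG banks); the function name `fix_prg_bank_count` is made up for illustration.

```python
INES_MAGIC = b"NES\x1a"

def fix_prg_bank_count(rom: bytes, prg_banks: int = 1) -> bytes:
    """Return a copy of an iNES image with header byte 4 set to prg_banks."""
    if rom[:4] != INES_MAGIC:
        raise ValueError("not an iNES file")
    patched = bytearray(rom)
    patched[4] = prg_banks  # byte 4: PRG ROM size in 16 KiB units
    return bytes(patched)
```

Reading the ROM with `open(path, "rb").read()`, passing it through this function, and writing the result back out should be enough to make the file load, assuming the header is the only problem.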
EDIT: For what it's worth, VisualNES displays the same blue triangles occasionally as well. (attached) So, something interesting is happening here.
Here's a little more info. On my emulator, the $2006 latch is delayed by 3 ppu ticks. For this test ROM, those latches occur at the following 10 cycles (all on scanline 2):
Quote:
252
254
255
257
258
260
261
263
264
266
The video from Eugene demonstrates consistent changes to blue triangles every 10 frames. So, one of the above cycles is responsible for it.
Awesome thanks! That's really interesting. I didn't expect the text to go away sometimes. 0_0
Well, I know for certain now that my emulator is way off in dealing with these things, time to get at it.
(That video format is fine once I found the codec)
So it looks like the automatic update done by the ppu is simply blocking the latching from $2006 from occurring altogether in the cases where the triangles turn blue. Not sure if it's losing a race condition or if it's intentional, but that's pretty clearly what's happening. Neat!
It looks like matching the console behaviour should be pretty simple and deterministic.
I'll start thinking about what other latching behaviour cases would be good to have. $2007 comes to mind as well as $2005, but not sure they'll be as easy to test.
I'm guessing all my recent posts on this forum have been lamenting the fact that I haven't tested this on real hardware but I'll say it again! Haha.
I'm sadly not in a great position to get a NES and don't know anyone with an everdrive, so the test is what it is, sadly.
Great to see more interest in researching the results however!
These kinds of tests are kind of the last frontier for NES research, if you have any other ideas for similar tests please do put them together and hopefully we can get them run. The results so far are already very interesting.
ok, I can get the exact same behaviour as in the videos if I do the following things.
-Delay latching of $2006 by 3 ppu cycles (this was already known and expected to be correct.)
-Have a race condition between the h and v increments on cycle 256 and the latching from $2006 occurring on the same cycle that results in all related variables being reset to 0.
Obviously that second one is highly speculative. But that is what results in correct behaviour according to the two tests.
@fred: In order to better understand this behaviour I think we'll need some tests that happen on different locations. How hard would it be to modify the test rom to occur in the middle of the scanline for example instead of at the end? It would also help to have more geometry so $2006 can be set to different values and see the effects.
Seems like we're only scratching the surface here.
The second detail there seems to be on track to explain glitching when scrolling vertically in Zelda, which I've been working on recently. I discuss that in this thread, which includes test ROMs that can reproduce this on real hardware. The ill-timed writes cause some bits to get cleared, but not necessarily all of them; the screen sometimes ends up at $2100 in the nametable. Here are some twitch clips of the glitching in action: one two. From what I've seen in Zelda, when $22xx is written, the split goes to $2000 in the nametable, and when $21xx is written, it goes to $2100.
Edit: Fixed the second Twitch link.
@alyosha: The test is a bit cumbersome to modify but I've been wanting to improve it, so I'll give it a go.
Haven't kept up on NES dev for a while so I'm a bit rusty though, haha.
Edit: as previously mentioned, the text is shaking... hmm haha, something is wrong. It didn't do that before, but it's hard to figure out exactly what's wrong.
I need to understand both the goal and the code again so I can make it simpler. I remember running into branches crossing pages being a big problem for exact timings...
Fiskbit wrote:
The second detail there seems to be on track to explain glitching when scrolling vertically in Zelda, which I've been working on recently. I discuss that in this thread, which includes test ROMs that can reproduce this on real hardware. The ill-timed writes cause some bits to get cleared, but not necessarily all of them; the screen sometimes ends up at $2100 in the nametable. Here are some twitch clips of the glitching in action: one two. From what I've seen in Zelda, when $22xx is written, the split goes to $2000 in the nametable, and when $21xx is written, it goes to $2100.
Yeah the 'everything goes to zero' thing is just a guess that works for those 2 test roms. Probably the behaviour is a bit more complicated (but likely still deterministic, since so far everything matches frame for frame and pixel for pixel.) We'll need some more test cases to figure it out. Also, pretty crazy coincidence that we were looking at the same problem at almost the exact same time.
I tried out your test roms, but the one that's labeled 'delay' seems to be acting without glitches, while the one labeled 'normal' does hit race conditions, but not at cycle 256. These glitches happen on the h-increments at cycle 215, 223, 239. So, maybe they are labeled backwards?
Anyway, it would be very helpful to have hardware captures of these cases too to see how they are different from the 256 case.
@Eugene.S can you help with this?
@fred: Cool thanks! good luck, I'm sure that stuff is pretty complicated to get right.
I feel like we're really on to something here!
EDIT: VisualNES doesn't seem to work with the split_scroll test roms for some reason. It doesn't split the screen or scroll at all. That's kind of disappointing. (Unless it also doesn't split/scroll on hardware, in which case I don't understand what's happening.)
So, per PPU scrolling:
On dot 256, the fine Y coordinate is incremented within V
On the second write to $2006, T is copied into V
So we've got a situation where parts of the bits in V are both incremented and loaded at the same time...
Looking at Visual2C02, vramaddr_t12 (fine Y lsbit) is copied into vramaddr_v12 when copy_vramaddr_vscroll is true.
vramaddr_v12 toggles after pclk0, then node 2101, pclk0 again, and inc_vramaddr_vscroll.
If either xxx_vramaddr_vscroll would cause the node to be pulled low ... the result should be low, if I'm reading this correctly.
Alyosha_TAS: The two test ROMs try to replicate the scroll behavior in Zelda. split_scroll_normal.nes is normal vertical scrolling without the added DPCM delay, with some garbage at the end of the last scanline of the stationary section caused by an early split (which is normal in Zelda). The issue does not occur with this timing. split_scroll_delay.nes adds a variable delay (between 8 and 22 CPU cycles, chosen by 15-bit LFSR) to try to reproduce the issue. This delay removes the garbage and, on real hardware, successfully triggers the glitching seen in Zelda.
Here's a clip of split_scroll_delay.nes being run on an Everdrive. I'm not sure why it's not running correctly in VisualNES, but will try to look at that hopefully by the end of the weekend.
Tonight, I created a test ROM that allows for selection with the joypad of different PPU_ADDRESS values for the split, and which utilizes blargg's PPU/NMI synchronization code to target specific dots. I'll post it here when I nail down the timings by testing on real hardware; unfortunately, my Everdrive is still in the mail, so I'm currently relying on others for my hardware testing, which is a significant bottleneck.
Stepping through the split_scroll_delay video frame by frame, the glitch seems pretty consistent in what it does for each value we write to PPU_ADDRESS. Here, the first number is what we attempted to write to the register, and the second is what it behaved like. I didn't notice any times where one value glitched in a different way. I had to refer to the Twitch clip for G because I didn't see it glitch in this video.
Code:
$0100 (8) never glitched
$0120 (9) -> $0100
$0140 (A) -> $0100
$0160 (B) -> $0100
$0180 (C) -> $0100
$01A0 (D) -> $0100
$01C0 (E) -> $0100
$01E0 (F) -> $0100
$0200 (G) -> $0000
$0220 (H) -> $0000
$0240 (I) -> $0000
$0260 (J) -> $0000
$0280 (K) -> $0000
$02A0 (L) -> $0000
$02C0 (M) -> $0000
$02E0 (N) -> $0000
$0300 (O) -> $0100
$0320 (P) -> $0100
$0340 (Q) -> $0100
$0360 (R) -> $0100
$0380 (S) -> $0100
$03A0 (T) -> $0100
Edit: Notably, the PPU would be advancing to $0100 at the point where the split happens, so I think these results make sense given lidnariq's post.
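The table above can be checked mechanically against a simple guess. This is a rough sketch, assuming that the "pulled low" behaviour lidnariq describes acts like a bitwise AND between the written value and the address the PPU was advancing to ($0100); that reading is an interpretation, and the names `OBSERVED`, `INCREMENTED_V` and `predicted` are made up here.

```python
# Values written to PPU_ADDRESS and how the split behaved, transcribed from
# the table above ($0100 is left out because it never visibly glitched).
OBSERVED = {}
for written in range(0x0120, 0x0200, 0x20):
    OBSERVED[written] = 0x0100
for written in range(0x0200, 0x0300, 0x20):
    OBSERVED[written] = 0x0000
for written in range(0x0300, 0x03C0, 0x20):
    OBSERVED[written] = 0x0100

# Assumption: the address the PPU was advancing to when the split landed.
INCREMENTED_V = 0x0100

def predicted(written: int) -> int:
    """Hypothetical model: conflicting bits resolve low, i.e. a bitwise AND."""
    return written & INCREMENTED_V

# Collect every table row the AND guess fails to reproduce.
mismatches = [hex(w) for w, seen in OBSERVED.items() if predicted(w) != seen]
print(mismatches)
```

Under this guess every row of the table is reproduced, and $0100 ANDed with $0100 is still $0100, which would explain why that value never appeared to glitch.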
Alright, I've gotten a grip on my code and annotated it a bit better for easier updating, I hope, haha.
I took the liberty to make the text static and moved the triangles somewhere closer to the middle of the screen also.
Quote:
modify the test rom to occur in the middle of the scanline for example?
Could probably do that, but I don't see what that would do? But then again, I'm rusty at NES stuff haha.
Quote:
It would also help to have more geometry so $2006 can be set to different values and see the effects.
Did you have something in mind? I tried placing a square on the second nametable so you can see it moving up/down 1 pixel and shake.
If my hypothesis that this is a bus conflict between the vramaddr_v increment and copying vramaddr_t to vramaddr_v is correct, then it might be possible to coax out bus conflicts in the middle of a rendering scanline too.
Hmmm, this doesn't appear to be quite the same thing as the original tests.
According to my emulator read out, the glitches that would result from the ppu scroll tests would be happening because of the v -> t operation at cycle 257. The previous tests were testing glitches at the increments at 256. In fact, split_scroll_delay amazingly misses 256, hitting 255, 257, and 258 instead.
So, it looks like something is probably happening there too.
Actually, I tried varying power up timing and it turns out that this affects what cycles the glitches hit at. So while it's not impossible that some glitches might be happening on 257, it seems less likely now (since the original tests hit 257 sometimes as well and nothing happened on them.) I think we just need a more comprehensive test for this.
Also curiously, split_scroll_normal is hitting what would be h increments at various points in the scanline, but if it is experiencing any conflicts, we aren't testing them so far.
I'll try to go over this a bit more over the weekend when i have more time.
@fred: yeah as lidnariq says, we likely need some cases where $2006 is latching things that would be affecting h at mid-scanline points. The present tests do not accomplish this (since they are writing things with 0 for those.) It would help if the shapes on a scanline were noticeably different, so we can see any glitchiness very clearly.
EDIT: also thanks again Eugene.S for the frame by frame videos.
Quote:
I didn't notice any times where one value glitched in a different way. I had to refer to the Twitch clip for G because I didn't see it glitch in this video.
All of my tests were recorded on a Famicom AV, which has:
RP2A03H - "rev H" CPU
RP2C02H-0 - "rev H" PPU
Alright, here's a new test ROM for NTSC that hopefully provides enough flexibility to nail this down. It hasn't been run on real hardware yet, but I don't have any reason to think it won't work there.
This ROM uses blargg's PPU/NMI synchronization code to target pairs of neighboring dots, and allows nearly all of the split parameters to be customized with the joypad so it can occur anywhere onscreen and split to any target location. Up/down adjust the split by 113 CPU cycles and left/right adjust by 1. Select changes the nametable for the pre-split region. A/B adjust the value written to PPU_ADDRESS by $20, and start increments it. All of the code paths should be appropriately timed so they won't affect the location of the split. If timings are the same on console, this should start up around scanline 63, dots 252 and 253.
I went over the frame by frame for scroll delay v1 test, and yeah it looks like the glitch is properly characterized by doing the v increment but then '&' the result with the latched value. My emulator gets exact results by doing this. This also still gives correct results for the original 2 tests.
hopefully the v2 test will give final confirmation on this. (On my emulator though it powers up to scanline 64 cycle 255/256)
Then we just need to see if any similar effects happen for h.
Alyosha_TAS wrote:
I went over the frame by frame for scroll delay v1 test, and yeah it looks like the glitch is properly characterized by doing the v increment but then '&' the result with the latched value. My emulator gets exact results by doing this. This also still gives correct results for the original 2 tests.
Alyosha_TAS wrote:
Also curiously, split_scroll_normal is hitting what would be h increments at various points in the scanline, but if it is experiencing any conflicts, we aren't testing them so far.
I think that the bus conflicts should only be in the "V" bits (0x7BE0) on dot 256, and should only be in the "H" bits (0x041F) in the middle of the scanline. The other bits look like they should be correctly copied from T to V.
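To make the two masks concrete, here is a small sketch of how an emulator might apply this hypothesis. It assumes the conflict behaves as an AND of the incremented value and the written value (the "inc then &" idea from earlier in the thread), restricted to the mask lidnariq gives; the increment helpers follow the commonly documented PPU scrolling pseudocode, and `copy_with_conflict` is a made-up name, not anything from an existing emulator.

```python
V_BITS = 0x7BE0  # fine Y, nametable Y select, coarse Y
H_BITS = 0x041F  # nametable X select, coarse X

def increment_y(v: int) -> int:
    """Vertical increment normally performed on dot 256."""
    if (v & 0x7000) != 0x7000:
        return v + 0x1000          # fine Y < 7: just bump fine Y
    v &= ~0x7000                   # fine Y wraps to 0
    y = (v & 0x03E0) >> 5          # coarse Y
    if y == 29:
        y = 0
        v ^= 0x0800                # switch vertical nametable
    elif y == 31:
        y = 0                      # wraps without switching nametable
    else:
        y += 1
    return (v & ~0x03E0) | (y << 5)

def increment_coarse_x(v: int) -> int:
    """Horizontal increment performed every 8 dots while fetching tiles."""
    if (v & 0x001F) == 31:
        return (v & ~0x001F) ^ 0x0400  # wrap coarse X, switch nametable X
    return v + 1

def copy_with_conflict(incremented_v: int, t: int, conflict_mask: int) -> int:
    """T->V copy where bits in conflict_mask are also driven by an increment.

    Assumption: conflicting bits resolve low (AND); all other bits copy cleanly.
    """
    clean = t & ~conflict_mask
    fought = incremented_v & t & conflict_mask
    return (clean | fought) & 0x7FFF
```

For a second $2006 write landing on dot 256 this would be called as `copy_with_conflict(increment_y(v), t, V_BITS)`, and for one landing on a mid-scanline coarse X increment as `copy_with_conflict(increment_coarse_x(v), t, H_BITS)`. Whether hardware really behaves this way for the H bits is exactly what still needs testing.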
lidnariq wrote:
I think that the bus conflicts should only be in the "V" bits (0x7BE0) on dot 256, and should only be in the "H" bits (0x041F) in the middle of the scanline. The other bits look like they should be correctly copied from T to V.
Yeah that definitely would make the most sense. Would you expect $2005 to behave similarly? That seems like the next obvious choice to test.
I can record this ROM on Saturday. Which values are you interested in?
Alyosha_TAS wrote:
Yeah that definitely would make the most sense. Would you expect $2005 to behave similarly?
Let me think this through:
The conflict comes specifically from the copy_vramaddr_Qscroll signal being true at the same time as the inc_vramaddr_Qscroll signal.
inc_vramaddr_hscroll is true on dots 328, 336, and 0,8,16,...248, 256
copy_vramaddr_hscroll is true on dot 257 or on the second write to $2006
inc_vramaddr_vscroll is true on dot 256.
copy_vramaddr_vscroll is true during part of the pre-render scanline, or on the second write to $2006
So, I don't see an obvious way to tickle this on anything but the second write to $2006.
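That conclusion can be sanity-checked by writing the signal timings above as predicates and looking for overlaps within one rendering scanline. This is just a sketch of the listed dots; the function names are made up, and the pre-render vertical copy is left out because only "part of" that scanline is specified here.

```python
DOTS_PER_SCANLINE = 341

def inc_hscroll(dot: int) -> bool:
    """Dots 328, 336 and 0, 8, 16, ..., 248, 256, as listed above."""
    return dot in (328, 336) or (0 <= dot <= 256 and dot % 8 == 0)

def inc_vscroll(dot: int) -> bool:
    """The vertical increment fires on dot 256 only."""
    return dot == 256

def auto_copy_hscroll(dot: int) -> bool:
    """The automatic horizontal T->V copy fires on dot 257."""
    return dot == 257

# Dots where the automatic copy coincides with any increment.
auto_overlap = [d for d in range(DOTS_PER_SCANLINE)
                if auto_copy_hscroll(d) and (inc_hscroll(d) or inc_vscroll(d))]

# Dots where a copy triggered by a second $2006 write could collide.
write_hazard_dots = [d for d in range(DOTS_PER_SCANLINE)
                     if inc_hscroll(d) or inc_vscroll(d)]
```

With these timings `auto_overlap` comes out empty, so the automatic copy on dot 257 never meets an increment, while a CPU-timed second write to $2006 can land on any of the dots in `write_hazard_dots`.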
—
Now, separately ... in the NES, M2 is true for 1 7/8 pixels (7.5 XCy), and the exact amount of overlap of the M2 cycle during dot 256 should have analog effects. The hazard should only happen during pclk1=the second half of each pixel (2 XCy), so ... I'd be idly curious how the four different 2A03-vs-2C02 phases work out. No idea how to set up that test, though.
edit: per this thread, note that ulfalizer's graph needs the CPU cycles to be shifted by one half crystal cycle, there should be three race-y cases:
M2 falls 1 XCy after dot 255 and pclk1 (overlap of 1 XCy ; alignment A)
M2 rises 1.5 XCy after dot 255 and pclk1 (overlap of 0.5 XCy ; alignment B)
M2 rises 0.5 XCy after dot 255 and pclk1 (overlap of 1.5 XCy ; also alignment A)
The remaining two alignments should always have dot 255 and pclk1 happen without an edge of M2 during
edit2: The more I look into this in Visual2C02, the more I'm confused by the "write to $2000 on dot 255" glitch.
Yo, sick test fiskbit! That totally obsoletes my crude attempts haha!
I've thought about it a little... here's a guess as to how the "write to $2000 at the end of a scanline" glitch happens:
* Writes to PPU registers always happen during multiple pixels
* There is no alignment of CPU and PPU such that pclk1 is only true once while M2 is true
* Krzysiobal has found that the data bus is not actually valid when M2 rises
* Blargg found that only two of the four CPU-PPU phases cause it
so...
* CPU writes invalid data (upper byte of address, per capacitance) to PPU register $2000 on dot 257, and this bad value shoots through straight from _db0 to _vramaddr_t10 to _vramaddr_v10 because of timing. Which exact two phases cause this result is unclear ...
* On dot 258, the write is still happening, the data is now valid, and the PPU loads the correct value into T, meaning that it will be copied correctly to V on subsequent scanlines.
If this interpretation is correct, then writing to the PPU register mirror at $2100 should cause the glitchy single scanline to instead source from the right nametable.
@lidnariq: you seem to be really good at understanding what the circuits in the ppu are supposed to do, do you have any idea why VisualNES is failing to scroll in the first version of the tests fiskbit posted? That error seems pretty serious and makes me a bit wary of VisualNES.
@Eugene.S: well the first thing we need is just a video of what the test looks like on poweron. This will hopefully confirm the initial timing. On my end it powers up to the glitchy state right away. I think the v bit effects are pretty well settled, so I'll look for conditions where the h bit should be affected. Also I guess we have to test the pre-render line where v=t continuously.
EDIT: hmmm, actually maybe the test needs to be modified to test the h bits.
Alyosha_TAS wrote:
That error seems pretty serious and makes me a bit wary of VisualNES.
To be fair, you should be wary of Visual NES either way, there are a number of issues with it (e.g palette is often incorrect, etc.) that I haven't taken the time to analyze in more detail ever since I initially made it. It may or may not have more issues than the standalone ones - I wish I had the proper knowledge to get it fixed, but unfortunately, I've forgotten most of the minimal knowledge of transistors and circuits that I had :p
The scrolling isn't happening because the sprite 0 hit isn't happening in VisualNES.
Since Fiskbit's test doesn't rely on sprite 0, it might work, but requires user interaction.
lidnariq wrote:
The scrolling isn't happening because the sprite 0 hit isn't happening in VisualNES.
Since Fiskbit's test doesn't rely on sprite 0, it might work, but requires user interaction.
Oh, ok well that settles that. Thanks.
@Sour: That's ok, I sure as heck can't fix it. It did perfectly duplicate 3 out of the four tests though, so that's not bad.
For the write to $2000, does anybody have the ability to run blargg's test in this post
http://forums.nesdev.com/viewtopic.php?p=112503#p112503 ?
Theoretically every 3rd scanline should be glitched out, I think. (And only half the times the machine comes out of reset)
If my understanding is right, then changing two bytes: (6502 addresses 0xE41F and 0xE434 = file offsets 0x642f and 0x6434 = lda #1 and sta $2000) to instead be 0 and $21 should cause the glitch to happen the other direction.
(No photos or videos are necessary, just verbal verification)
Relevancy: I think this glitch should happen with the first write to $2005 or the first write to $2006 also...
Eugene.S: It's hard to say what specific values to test in split_scroll_test_v2 because it likely doesn't match Mesen timing exactly and different power-on states may affect it. You can get an idea of where you are by looking at the split behavior; on a given scanline, there will be a point where either the split will jitter because the two target dots are on either side of the end of the visible scanline, or adjusting by 1 CPU cycle left or right will cause a scanline to be added or dropped. I tried to time the split to occur by default right around the start of hblank on scanline 63, which is the last scanline of the 7's row. If the split occurs after hblank begins, the first scanline of the 8's will be cut off.
If you go down 1 scanline (moves split left by 2 dots) and right 1 CPU cycle (moves split right by 3 dots), you'll advance 1 dot forward on that next scanline. If you do this across several scanlines around the start of hblank, we can perhaps get an idea of where this stuff is happening. And, doing this at different parts of the screen and with different write values can help verify that the behavior is actually an AND of the incremented value and the write value. I don't know if it's important to do this with different CPU/PPU alignments.
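The dot arithmetic behind the "down 1, right 1" trick can be written out explicitly. This is a small sketch using the NTSC figures of 3 PPU dots per CPU cycle and 341 dots per scanline, plus the 113-cycle step the ROM uses for up/down; the helper name `move` is made up, and the skipped dot on odd frames is ignored since it does not matter within a single frame.

```python
DOTS_PER_CPU_CYCLE = 3
DOTS_PER_SCANLINE = 341
SCANLINE_STEP_CYCLES = 113  # what up/down adjusts by in the test ROM

def move(scanline: int, dot: int, cpu_cycles: int) -> tuple[int, int]:
    """Shift a (scanline, dot) position by a number of CPU cycles."""
    total = scanline * DOTS_PER_SCANLINE + dot + cpu_cycles * DOTS_PER_CPU_CYCLE
    return divmod(total, DOTS_PER_SCANLINE)

# Down: 113 cycles = 339 dots, i.e. one scanline minus 2 dots.
after_down = move(63, 252, SCANLINE_STEP_CYCLES)
# Right: 1 cycle = 3 dots, for a net gain of 1 dot on the next scanline.
after_right = move(*after_down, 1)
```

Starting from the expected power-on position of scanline 63, dot 252, the "down" step lands on scanline 64, dot 250, and the following "right" step lands on scanline 64, dot 253.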
Also interesting would be test results across several power cycles of the two versions of blargg's PPU_CONTROL write test. I've attached my modified version below, and the original is here. As lidnariq stated, we just need confirmation that the glitch happens on the $2100 version like it does on the $2000 version, so video isn't necessary.
fred: Thanks for the kind words. Your tests were super useful in getting us this far, too! Also helped improve my timing substantially, since I previously had no idea blargg's synchronization code existed.
Alyosha_TAS: If you have specific requests for what modifications you'd like for testing additional cases, please let me know and I can try to work them in or make a different version for them.
Alyosha_TAS & lidnariq: If user input can't be done in VisualNES, the default values for scroll_nametable_high, scroll_nametable_low, nametable, scanline_delay, and cycle_delay can be modified in the assembly file. Assembling with snarfblasm can be done by placing the binary into the source directory and running snarfblasm.exe split_scroll_test.asm in command prompt. It will output the result as split_scroll_test.bin. Adding additional LDA's for setting these variables at this point is fine; this part of the code is not timing-critical.
lidnariq: Interesting hypothesis about the $2000 writes. For modifying blargg's test, the correct changes are actually ROM file $6434 = #$00 and $6444 = #$21. I've attached the modified ROM below. If this works out, then the glitch can be avoided when using the SMB-style NMI handler by writing to the PPU_CONTROL mirror that matches the current nametable X, which is an easy workaround. Regarding this happening with $2005/2006, as well, we should be able to test the $2006 case with the current split_scroll_test_v2. The first write occurs 12 dots before the second (and probably seems like 15 because the main effect we see from the second write in the test should be 3 dots early, I guess?).
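For anyone double-checking the patch offsets, the conversion between 6502 addresses and file offsets can be sketched as below. This assumes a 16-byte iNES header followed by 32 KiB of PRG mapped linearly at $8000-$FFFF, which is inferred from the 0xE41F = 0x642f pairing quoted earlier rather than confirmed from the ROM itself; both function names are made up.

```python
INES_HEADER_SIZE = 16
PRG_BASE = 0x8000  # assumption: 32 KiB PRG mapped at $8000-$FFFF

def cpu_to_file_offset(addr: int) -> int:
    """Map a 6502 address inside PRG to an offset in the .nes file."""
    return addr - PRG_BASE + INES_HEADER_SIZE

def file_offset_to_cpu(offset: int) -> int:
    """Map an offset in the .nes file back to a 6502 address."""
    return offset - INES_HEADER_SIZE + PRG_BASE

print(hex(cpu_to_file_offset(0xE41F)))
print(hex(file_offset_to_cpu(0x6434)), hex(file_offset_to_cpu(0x6444)))
```

Under that assumption, 6502 address $E434 corresponds to file offset $6444, and file offset $6434 corresponds to $E424, which is consistent with the corrected pair of offsets given above.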
It is so very satisfying to have one's hypothesis verified.
Here's a pair of screen captures from Eugene.S's video
Can confirm that the scroll_v2 test powers up in the same state as I see on emulator, and that 'inc then &' correctly reproduces the results pixel for pixel.
Cool stuff!
Ok, so thinking about the rest of my hypothesis.
The shoot-through glitch can only happen on dot 257, on those bits that are reloaded from T to V and are part of the horizontal position. It can't happen on the second write to $2006 even if it collides with dot 257, because the T is copied to V for the entire duration of that write, and the wrong value at the beginning can't get stuck in V for one scanline.
The second write to $2005 will also have no effect, because it only sets vertical position bits.
So, describing the glitch we "should" see on a first write to $2005 that collides with dot 257, it should end up setting just the following scanline to a coarse X of (upper byte of address of PPU mirror)÷8 - i.e. ranging from 4 to 7, ultimately. There is no "V" for a wrong fine X to get caught in, so that will produce the expected result. (i.e. writing to $2005 should cause a glitched coarse X of 4. Writing to $3F05 should cause a glitched coarse X of 7)
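The predicted coarse X values follow from simple arithmetic on the register address. Here is a tiny sketch, assuming the stale bus value really is the upper byte of the address being written and that the first $2005 write takes coarse X from the top five data bits (hence the divide by 8); `glitched_coarse_x` is a made-up name.

```python
def glitched_coarse_x(register_address: int) -> int:
    """Coarse X predicted if the upper address byte is latched as data."""
    upper_byte = (register_address >> 8) & 0xFF
    return upper_byte // 8  # coarse X comes from data bits 7-3

# Every mirror of $2005 in $2000-$3FFF repeats at an 8-byte stride.
possible = sorted({glitched_coarse_x(a) for a in range(0x2005, 0x4000, 8)})
print(possible)
```

This gives 4 for $2005 and 7 for $3F05, and across all mirrors the predicted values stay within 4 to 7, matching the range stated above.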
On a first write to $2006 that collides with dot 257 ... the only horizontal bit affected by the first write to $2006 is the one that selects the nametable, so writing to the PPU register mirrors at $2006 vs $2406 should have the same effect as writing to $2000 vs $2100.
Both of these assume that there isn't another write to $2006 copying from T to V before the background tile fetches start for the next line.
So I guess at this point we want tests to verify these?
* If the second write to $2006 ends during the right half of dot 256 or the left half of dot 257, the resulting value in both T and V will be the result of the bus conflict between the incremented X/Y location and the written value
* If the second write to $2006 lands on a whole bunch of other X coordinates (multiples of eight, except 256 through 320 inclusive), the resulting value in both T and V will be the result of the bus conflict between the X location and the written value, but the Y location will be loaded as intended
* A write to $2000(✓), or a first write to $2005 or $2006, that starts in the middle of the right half of dot 257 will end up writing open bus to V, and then the desired value will be in T during dot 258.
I've modified blargg's test to handle the first-write case for $2005 and $2406, which was a little more involved than the other modification. For both, #$00 is written to the register, $2002 is read after the write to clear W, and the timing has been changed to accommodate this new read. These new tests are attached below. I believe they're correct, but I'd like if someone could verify that they are successfully testing the cases we care about.
For the two $2006 second-write cases, I think split_scroll_test_v2 can handle them, but if it can't, I can update it to do so if someone can explain what it's lacking. Alternatively, I could try to hack up blargg's test further to get those writes to hit specific dots, which might be a better idea just for ease of testing.
I think it would be good to modify the scroll test to check h-scroll as well. As in writing to the lower bits of $2006 (and having some horizontal scroll apparent on screen of course.)
It seems highly likely that the results will agree with the v-scroll ones, but having a dedicated test for it would be good.
The start button can be used to increment the value being written to $2006 by 1, which should allow for h-scroll tests.
Oh I didn't even try that, shows my reading skills.
Looks like there is an interesting case for v2 scroll at:
4A
40
00
012X
My emulator and Mesen seem to disagree here (and apparently not because of 2006 writes?)
@Eugene.S: Do you think you can make a video for this case?
er... something should have shown up. Either that or I'm completely wrong, but I can't see an obvious reason how.
Given that the odds of the effect showing up at all are only 50% for any given reboot... I was already kinda surprised that it worked in both ppu_2000 and ppu_2100.
—
Tangentially related, this makes me really curious about the breakdown of pclk0 vs pclk1 in the 2C07 / UA6538.
I have an original PAL NES, but I'm still waiting for a 72-to-60-pin converter from eBay.
Note that these tests won't function properly on PAL as-is, but may still be able to provide useful results without being retuned. The ppu_xxxx_glitch tests will drift, but it looks in Mesen like they'll occasionally hit the correct dots. split_scroll_test_v2 will fail its NMI/PPU sync, so there will be much more jitter, and the scanline delay isn't tuned to the different ratio of dots to CPU cycles.
For the new 2005 and 2406 tests, did anyone verify in Mesen that the tests do what we expect? I think they're right, but especially with them not working on real hardware like we expected, I'd like to be sure I didn't misunderstand the requirements.
They looked like they put the write of the right value at the right time to me, but I was just using Mesen's event viewer.
For me, I am seeing the first write to $2005 happen one tick too late.
Oops nevermind, it's happening on the correct tick.
I finally got all the equipment I need to run some tests on my NTSC frontloader via Everdrive. Here are some results.
ppu_2005_glitch.nes & ppu_2406_glitch.nes
After many resets, I can confirm that the $2005 and $2406 tests do indeed show the glitching that lidnariq predicted for some (but not all) PPU alignments.
split_scroll_test_v2.nes
When powering on, at 40,39, the 2 dots I hit are right before the increment (scanline lost) and right after. The latter dot is the one that causes the glitch. This shows onscreen as vertical jitter plus the glitch. Moving down and right (41,3A) puts us one dot further, so both target dots are after the increment, which makes the left one cause the glitch (but there's no vertical jitter).
At 40,39,00, if I try the range 0101-011F, all of these cause glitching except 0102, which just moves the post-split section 2 tiles left without any glitching.
40,39,00,0800 causes glitching, but 40,39,02,0800 does not, showing the glitching with just a nametable select bit.
Positioning at any place where the first tile post-split on the target scanline flickers can cause coarse x glitching for the full screen post-split. This can be seen at 40,11,00,0104. 41,12,00,0104 also causes this, so it's the right dot for the former and left for the latter. This confirms the coarse x behavior that had been guessed earlier in this thread.
Some emulator comparisons:
- 40,39,00,0100: Post-split, the screen should jitter vertically by 1 scanline. puNES 0.100 matches this. Mesen 0.9.7 does not; it must be at 41,3A,00,0100 for this to happen.
- 3F,10,00,0100: Post-split, the target scanline should jitter horizontally by 1 tile. Mesen and puNES both do this at 3E,10,00,0100, a position at which real hardware consistently skips the first tile post-split.
- 40,11,00,0100: The first visible tile of the target scanline post-split should flicker. Mesen does this at 3F,11,00,0100 and puNES at 3E,11,00,0100 (both should skip the first tile on that scanline post-split).
There appears to be another PPU alignment where the jitter+glitch occur at 3F,38 and glitch at 40,39, instead of 40,39 and 41,3A as I had described earlier.
If anyone's interested in results for specific values, I can test them and report results, but I don't expect to have the means to capture video of this anytime soon.
Fiskbit wrote:
ppu_2005_glitch.nes & ppu_2406_glitch.nes
After many resets, I can confirm that the $2005 and $2406 tests do indeed show the glitching that lidnariq predicted for some (but not all) PPU alignments.
If you had to take a guess, did it feel like it was roughly half the time, or roughly one quarter the time?
As far as I know, the different alignments should be all equally likely...
I've tested how often I get each result across batches of 30 resets. I only tracked the counts for the first several tests, and then started tracking the order as well. I also did tests on how often I get jitter when starting up split_scroll_test_v2. Here are my results in the order I got them:
Code:
(y = yes glitch, n = no glitch)
2000: y = 15, n = 15
2005: y = 7, n = 23
2100: y = 5, n = 25
2406: y = 9, n = 21
2000: y = 15, n = 15
2100: y = 5, n = 25
2005: y = 9, n = 21
2406: y = 8, n = 22
2100: y = 10, n = 20 (nyyny nnnny nyyyn ynynn ynnnn nnnnn)
2406: y = 8, n = 22 (nnnny nnyny nynnn nnnnn ynnnn yyynn)
2005: y = 9, n = 21 (nnyyn nnnyn nnynn nnnny ynyny nnnyn)
2000: y = 17, n = 13 (yyyyy nnyny ynyny yyyny nnnnn nnyyy)
(y = yes jitter, n = no jitter)
sst2: y = 24, n = 6 (yyyyy yynyy yynyy yynyy nnyyy yynyy)
sst2: y = 26, n = 4 (nyyyy yyyyy yyyyn nyyyy yyyyy yyyyn)
sst2: y = 23, n = 7 (yyyny yyyyy yyynn ynyyy yyyyy ynnyn)
If we assume the 4 PPU alignments are equally likely, this feels to me like everything here is around 1/4 except the 2000 test, which is 1/2. I expected 2000 and 2100 to have the same results and am a bit confused by them differing, though.
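To put rough numbers on that impression, the batches can be pooled per test. This is a quick sketch that only re-adds the counts from the table above; the dictionary layout and the helper names `rate` and `nearest_quarter` are made up, and rounding to the nearest quarter assumes the four alignments really are equally likely.

```python
# (yes, no) counts per batch of 30 resets, copied from the table above.
RESULTS = {
    "2000": [(15, 15), (15, 15), (17, 13)],
    "2005": [(7, 23), (9, 21), (9, 21)],
    "2100": [(5, 25), (5, 25), (10, 20)],
    "2406": [(9, 21), (8, 22), (8, 22)],
    "sst2": [(24, 6), (26, 4), (23, 7)],
}

def rate(batches: list[tuple[int, int]]) -> float:
    """Pooled fraction of 'yes' outcomes across all batches of one test."""
    yes = sum(y for y, _ in batches)
    total = sum(y + n for y, n in batches)
    return yes / total

def nearest_quarter(fraction: float) -> float:
    """Snap a fraction to the closest multiple of 1/4."""
    return round(fraction * 4) / 4

for name, batches in RESULTS.items():
    print(name, f"{rate(batches):.3f}", nearest_quarter(rate(batches)))
```

Pooled this way, 2000 glitches on 47 of 90 resets, 2005 and 2406 on 25 of 90 each, and 2100 on 20 of 90, while sst2 jitters on 73 of 90. With only 90 resets per test these are loose estimates, so the gap between 2000 and 2100 is the only difference that clearly stands out.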
Assuming my mental model is correct, there's two cases here:
A) M2 rises 1/2 master cycle before dot 257
B) M2 rises 3/2 master cycle before dot 257
The CPU/PPU's pulldowns are a little stronger than its pullups (because NMOS)... Maybe, because this is analog race condition land anyway, the $2100 leaves $21 on the bus during φ1 and the CPU can pull down the least significant bit in case B but not case A, before it gets stuck inside? Whereas for the reverse, $2000 leaves $20 on the bus and the CPU can't pull the line up early enough?
I don't suppose anyone (Quietust? Kevtris?) knows of ways for software to detect which of the four alignments one's NES is in?
This post by blargg indicates there is a way to detect the alignment and that he may have made a test ROM for it, though I haven't yet found evidence of such a ROM being posted. The data also shows that alignments may have different likelihoods, though I'd like to see more testing done on that to be sure. I'll continue digging into alignment detection, but it's definitely at the edge of my understanding at the moment, so any help would be appreciated.