So...why not? May or may not implement this, but now I want to play with it. Generally, I've been preserving a subpalette for the menu area. I got the thought about possibly trying to swap the palette after the menu draw, since that area is always static, to maximize colors in gameplay area. Game has no scrolling. Started searching around the forums a bit and found some tips and tricks (and nightmares). I figured I'd toss it out there to see if anyone could help walk me through it.
I saw Tokumaru's responses about preloading A,X,Y with the values, that way you could set $2006 to $3fxx, bit the first write to $2007, then fire AXY as successive writes to $2007 to minimize the time this would take. So...conceptually, I would...
1) Load palette with menubar subpal
2) Wait for sprite zero hit (which I'd put at the end of the menu bar), which would look something like:
Code:
checkSpriteZnotHit:
bit PPUSTATUS
bvs checkSpriteZnotHit
checkSpriteZhit:
bit PPUSTATUS
bvc checkSpriteZhit
Then wait for hblank (how?)
3) Turn rendering off, then push the three new values to the subpal
Then wait for next hblank (how?)
4) Turn rendering on and reset scroll.
That seems...too easy. Granted, I'm not trying to do any sort of scrolling or anything. Is it really that easy? I'm reading about waiting for hBlank, but haven't ever had to do that. How does one know when the hBlank fires (understand the concept of an hBlank, but don't know how to know when it fires like vBlank, or if it's even necessary). Or is it that sprite0 hit *forces* an hblank or something?
Thoughts?
Thanks!
JoeGtake2 wrote:
Then wait for hblank (how?)
There's no signal in the NES which tells you that hblank has started. You're going to have to use a timed delay loop (well, it doesn't have to be a loop specifically) after the sprite 0 hit. How long a delay depends on your sprite's X position. (E.g., if your sprite is at X=200, you're going to have to wait approximately 256-200=56 pixels to reach the right side of the screen, i.e., the start of hblank. That comes down to 56/3 ≈ 18.67 CPU cycles on NTSC machines, so about 9-10 NOP instructions would do the job in this case.)
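A rough sketch of that fixed delay, assuming the hit lands at X=200 as in the example above. `PPUSTATUS` is just a name for $2002, and the NOP count is an estimate to tune in a debugger, not a measured value:

```asm
PPUSTATUS = $2002

wait_hit:
    bit PPUSTATUS      ; bit 6 (sprite 0 hit) is copied into the V flag
    bvc wait_hit       ; keep polling until the hit is flagged
    ; about 56 pixels remain on the line = about 18.67 CPU cycles (NTSC).
    ; The polling loop has already used some of those cycles by the
    ; time it exits, so fewer NOPs than this may be needed in practice.
    nop                ; 2 cycles each
    nop
    nop
    nop
    nop
    nop
    nop
    nop
    nop                ; 9 NOPs = 18 cycles
    ; hblank should begin around here
```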
The debugger in Mesen has a pretty cool feature that lets you see the screen as it's drawing (Options -> Draw Partial Frame in the debugger window). That could come in useful when debugging things like this. You can also see the PPU cycle count (0-340 -- anything over 256 means you're in hblank).
Thanks - yeah, that's getting into stuff I haven't really pushed into yet. Hm....
Alright, i could draw this sprite at, say, 240...then i'd have 16 pixels...5.3 cpu cycles...2-3 NOP instructions...
And then at that point, do the steps mentioned...but no need to wait until the next hblank in step 3, as it should all fit in that tiny hblank, correct?
Again...that kinda feels too easy...
Sprite 0 hit happens at a certain horizontal position on the scanline, and hblank begins at x=256. Each CPU cycle is worth three pixels. So if you have sprite 0 hit at (say) x=76, that means 256 - 76 = 180 pixels or 60 cycles from that point to the start of hblank. The end of your waiting loop will probably use 5 to 11 of those cycles.
Code:
lda #$C0 ; $80 is vblank; $40 is sprite 0
s0wait:
; The read and branch total 7 cycles, with the actual PPU port
; read on the fourth cycle of 7.
bit PPUSTATUS
beq s0wait
; At this point, one of two things has happened: either sprite 0 hit,
; or sprite 0 missed and vblank began instead. Occasionally
; sprite 0 may miss in lag frames or in the first frame before the
; OAM DRAM controller becomes stable. Distinguish these and skip
; mid-frame effects if it missed.
bmi sprite0_missed
; Now we're 5 to 11.67 cycles (15 to 35 pixels) after sprite 0.
; Do sprite 0 stuff.
So there are two ways to buy enough cycles to set up the PPU port write sequence to happen during horizontal blanking: sprite 0 on the left side of the same scanline or on the right side of the previous scanline.
Awesome guys...good info!
Yeah, tepples, my thought was last pixel of former scanline.
Any suggestion as to generally *where* to place this in relation to other code in order to minimize potential for gremlin chaos? haha
There's a quirk of the PPU's pixel pipeline such that a hit on x=255 (and only x=255) won't register. If you're looking to use a whole line to prepare, put the overlap at x=252-254 or the like.
So, I'm getting hung up right out of the gate.
In my NMI, rendering is off. This code just causes the game to hang indefinitely:
Code:
WaitNotSprite0:
LDA $2002
AND #%01000000
BNE WaitNotSprite0
WaitSprite0:
LDA $2002
AND #%01000000
BEQ WaitSprite0
LDX #$10 ;; arbitrary - variable
WaitScanline:
DEX
BNE WaitScanline
I'd delved deeper but kept freezing, so i figured I'd just start whittling away. It's the WaitSprite0 that causes it to hang. I also made absolute sure, in case it made a difference, that Sprite0 was being drawn on the screen (by literally drawing it just prior to this code). It still froze, gray screen. This is without any sprite0 effect...just the above code inserted in the NMI just prior to palette loads in otherwise working code.
Any thoughts? Thanks!
This has nothing to do with the hanging, but here's a quick tip: to reduce the latency when detecting sprite 0 hits, use BIT instead of AND to test the flag:
Code:
WaitSprite0:
BIT $2002
BVC WaitSprite0
BIT copies bits 7 and 6 of the read value to flags N and V, respectively, so you can test those bits more quickly. But even when you need to test bits other than 6 and 7, BIT still results in a shorter loop since you can preload the bit mask, reducing the latency anyway:
Code:
LDA #%00100000
WaitSpriteOverflow:
BIT $2002
BEQ WaitSpriteOverflow
This helps keep your raster effects timed from sprite hits or overflows more stable.
As for the hanging, you do know that a sprite 0 hit only happens if an opaque sprite pixel overlaps an opaque background pixel, right? Transparent pixels (color 0) don't count.
Actually, I did know that, but I was absently not thinking about the fact that only my MainGame state was drawing sprite 0...forgot to check the game state before doing this call. Oops! Thanks for the tip. Now let's see if I can get it working
I would actually recommend testing both vblank and sprite 0 hit together in the second wait loop:
Code:
lda #%11000000
:
bit $2002
beq :-
If done correctly it should only ever exit the loop due to sprite 0 hit, but vblank is an appropriate fall-back in case some disaster causes that sprite 0 to miss. The fallback result wouldn't be pretty, probably it will drop to 50% framerate and be scrolled completely wrong, etc. but the game won't lock up, and if the player can manage to get themselves where it gets restored, they can get back on track without having to abandon and reset.
Just in case, might also be worth saying that you can have "opaque" pixels hit that are the same colour as the background, they just can't be colour index 0. Also the sprite background priority bit doesn't affect the hit test so you can put it "under" the background to hide it as well.
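For illustration, a sprite 0 entry hidden behind the background could be set up roughly like this. The shadow OAM page at $0200 matches what is used elsewhere in this thread; `SPRITE0_TOP` and `HIT_TILE` are made-up names and values:

```asm
OAM_BUF     = $0200   ; shadow OAM page, copied to the PPU by sprite DMA
SPRITE0_TOP = 24      ; hypothetical: first scanline sprite 0 should appear on
HIT_TILE    = $01     ; hypothetical tile with at least one non-zero pixel

    lda #SPRITE0_TOP-1  ; OAM Y is one less than the first visible scanline
    sta OAM_BUF+0
    lda #HIT_TILE
    sta OAM_BUF+1
    lda #%00100000      ; bit 5 set = behind background; the hit test ignores it
    sta OAM_BUF+2
    lda #$d0            ; X position
    sta OAM_BUF+3
```

The background tile under the sprite still needs a non-zero pixel at the overlap, or the hit never fires.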
I'm making a bit of progress...right now, in just trying to successfully read a hit without glitching or locking up.
I can visibly see where I'm getting a split (right now, ugly horizontal line that flickers...i'm sure that's a timing issue) but unfortunately, the bigger problem is that:
a) I get a push (like the scroll is getting messed up...can mess with that a bit later)
b) it seems like everything *above* the line is drawing from the wrong table, even though there haven't been any writes to 2000 (and even hard-writing something just before the sprite zero check doesn't help this).
Any ideas?
**** Actually, not sure what magic just happened but just going back through seemed to fix things. Huh. Ok. I think I'm right there.
********** Nope, nevermind. Here's what's happening.
(rendering is off)
Code:
;; checks game state...if not a screen where sprite 0 is drawn, skip this code
WaitSpriteNot0:
BIT $2002
BVS WaitSpriteNot0
LDA #%11000000
checkSprite0:
bit $2002
BEQ checkSprite0
;;;;;; some arbitrary numbers in a dummy loop here to try to get to the end of the scanline
;;;;; should change a color in the last sub palette.
;;;;;; this works with no issue if I don't do the sprite zero check
;;;;; so something with the sprite zero check is the problem?
DoSingColorSwap:
LDA $2002
LDA #$3f
STA $2006
LDA #$0c
STA $2006
LDA #$30
STA $2007
If I skip the sprite 0 loop and just do the single color swap, no problem. Works just fine, for the entire screen. But with the sprite 0 hit and trying to separate the screen, the color does not change, I get *animated junk* at the top, scroll value was pushed off, and it seems that in some spots it is drawing the right value (f5, which should be a blank tile) but from the wrong pattern table. I tried different versions of the sprite zero check discussed here. All with the same result.
Currently, i am drawing sprite zero at #$d0 if it helps (and it is definitely colliding with background).
JoeGtake2 wrote:
(rendering is off)
I find this statement a bit confusing, because rendering must be on for the sprite hit to happen, and off for the palette update to happen, so rendering must be turned off after the hit and before the palette update. Is this what you're doing? Also, after the update, the scroll has to be restored (since you trashed it when updating the palette), and rendering turned back on.
Can you show the complete code for the split, even if it contains temporary junk?
Sorry - that was a misplaced note to myself. Yes, rendering is on for the eval.
Essentially, I load the palettes (all), then turn rendering back on, do the checks for sprite 0 hit, turn rendering back off, and then load the *other* palette once the hit fires. I have tried zeroing out $2005 afterwards to no avail.
Right now, it looks like I'm getting closer, but my scroll is still off. Hm.
You seem to be on the right path but this post may help you :
viewtopic.php?f=2&t=13188
Yeah, I looked at that for a lot of reference. Everything seems fine, except for the sprite 0 hit. If I take the sprite zero hit out, i can change the palettes no problem. But as soon as i do the sprite 0 hit stuff, I crash and get all sorts of funky things happening with scroll. Hm.
Stupid question : are you using a fully transparent sprite ?
glutock wrote:
Stupid question : are you using a fully transparent sprite ?
I mentioned the transparency thing and he said he was aware of it, so I guess this isn't the problem.
Here's another idea: are you enabling rendering on the first frame when the hit is supposed to happen during vblank? If you're not waiting for vblank to turn rendering on, that may cause the background to shift down for a frame, and that may prevent the hit from happening.
JoeGtake2 wrote:
I have tried zeroing out $2005 afterwards to no avail.
Setting scroll mid-frame requires the full $2006-2005-2005-2006 cycle. $2005 alone only updates the X scroll; the Y is stored in the temporary scroll register until the end of vblank, where it finally gets copied in. (You might get away with just two writes to $2006 though, if you're just trying to use 0,0.)
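A sketch of the four-write sequence, following the usual description of the PPU scroll registers. `NT_SELECT`, `SCROLL_X` and `SCROLL_Y` are placeholder constants, and the write latch is assumed to be in its reset state (read $2002 beforehand):

```asm
PPUSCROLL = $2005
PPUADDR   = $2006
NT_SELECT = 0         ; hypothetical: nametable 0-3
SCROLL_X  = 0         ; hypothetical pixel position to resume at
SCROLL_Y  = 32

    lda #(NT_SELECT << 2)
    sta PPUADDR       ; 1st write: nametable select lands in bits 3-2
    lda #SCROLL_Y
    sta PPUSCROLL     ; 2nd write: taken as Y because the latch is toggled
    lda #SCROLL_X
    sta PPUSCROLL     ; 3rd write: X (fine X takes effect immediately)
    lda #(((SCROLL_Y & $38) << 2) | (SCROLL_X >> 3))
    sta PPUADDR       ; 4th write: coarse Y/X low byte ($80 here), applied at once
```

Only the last write changes what is being fetched, so that is the one to land in hblank.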
So...I've gone back to a much earlier (read: simple) version of the code. After a lot of trial and error, I found that putting in some flags to make sure that the nametable AND zero sprite are drawn prior to checking for the hit got me past the *freezing* in that version. It's possible that a sprite-clear function in my current version of the code at time of screen load is causing the main issue...though I thought I disabled that. Anyhow...it looks like it's not necessarily the method implementation that's the problem, but something that is countering it somewhere in code (my guess, just a single frame where sprite zero is being *cleared* in my object drawing loop, and drawn off screen).
However, things still got weird when I tried to fix the scroll. Any writes to $2006 seem to botch my CHR-Ram graphics load. And zeroing $2006 / $2005 / $2005/ $2006 (in any combination of order, via experimenting) still did not correct the scroll properly (changed it, but still was not in the right place, and like I said, it seemed to stop my chr-Ram graphics from loading / loaded in null values). If I had to venture a guess, I would think maybe the first NMI was being fired in the middle of the graphics load and that I should similarly flag whether or not all graphics were loaded before checking sprite zero hit? Maybe the vblank was striking between 2006 reads elsewhere causing chaos?
**** Update - never mind...found this issue. It had to do with soft 2001 update conditional...just had to move this inside the conditional.
Ok - now I'm not *that far* away from implementing this. I'm circling the drain and generally am able to substitute values for the bottom half to get different palettes from the menu bar. I'm having trouble (presumably) with getting the hblank timing right, and still having slight issues getting the scrolling right. Perhaps filling in these gaps in my understanding will help me get this fully functional. Mind you, a lot of this is from your collective posts, searching the forums, and trial and error, so there may be some ugliness or redundancy here...this is more the blunt instrument approach, and now I need someone with a sharper set of proverbial tools.
First I want to get things lining up right, and then I'll worry about protections against sprite zero misses.
In the vBlank, I take care of the initial run through the palettes. After which, the code looks something like this:
Code:
LDA readyToCheckSpr0 ;; making sure nametable is loaded and sprites are drawn before checking for a sprite 0 hit
;; early vBlanks were happening before nametable / sprites were being loaded / drawn
;; causing hit to fail and the game freezing.
BEQ skipWaitingForSpriteZeroHit
;; turn on rendering to check for sprite 0 hit
LDA #%00011110
STA $2001
LDA #%10010000
STA $2000
;; check for sprite 0 hit
WaitNotSprite0: ;; I know there is a better way to do this as described...again, this is blunt-instrumenting
LDA $2002
AND #%01000000
BNE WaitNotSprite0
WaitSprite0: ;; I know there is a better way to do this as described...again this is blunt-instrumenting
LDA $2002
AND #%01000000
BEQ WaitSprite0
;;;;; sprite 0 is now detected (it is at #$b0 - i think the first pixel hit is a few pixels later), so now wait until end of scanline for beginning of hblank
LDX #$08 ;; arbitrary number here as to how long it takes to get to end of scanline
WaitScanline:
DEX
BNE WaitScanline
;;;;;;;; presumably, now we are in hBlank, if the above wait was timed right.
;;;;;;; my guess is I should put the writes to 2006 and turning off rendering *before* waiting for the scanline so less is done
;;;;;; during the hblank, and this can substitute for some of the loop time...is that sound?
bit $2002
LDA #$3F
STA $2006
LDA #$00
STA $2006
LDA #$00
STA $2001
STA $2000
;;;;;;;; change the palette
bit $2007
bit $2007
bit $2007
LDA #$18 ;; brown
STA $2007 ;;; now the last value in the first subpalette should be *brown*
;;;; now we have to set 2006/2005/2005/2006 to match the scroll position...
;;;; this part has been really tricky. I'm not sure exactly where or what to adjust.
;;; for now, I'm doing two writes to $2006, i've tried sandwiching two writes to $2005 between, I've tried many
;;;; arbitrary values but am not ever really getting results I expect.
;;; this gets me close to lined up:
LDA #$00
STA $2006
LDA #$c8
STA $2006
;;; now that scroll is lined up, wait for the hblank again to continue on?
LDX #$08 ;; i know this value will have to be played with to get it right
WaitScanLine2:
DEX
BNE WaitScanLine2
skipWaitingForSpriteZeroHit: ;; if the nametable / sprite 0 were not loaded/drawn yet, from above.
Again, this gets me super close. You can see the brown in the image - it is replacing the *peach* color spot (in the topmost part of the screen, peach is subpal0, color 3...below, brown is subpal0, color 3...Eureka, it works! but...not quite there...)
Attachment: close.png
So...the immediate questions become, how do I better finely adjust the y-scroll (played around with writes to $2005...this didn't seem to work, so I suppose I'm doing something wrong), and also, how to adjust my scan-line waits correctly to properly get to start of hblank?
Additionally - I liked the idea of loading A,X,andY with the color values and firing them off, but in this context I don't see how that would be possible since I'm using two of those values in the scanline wait for hblank loop. Would love thoughts on that.
One thing that's important to note is that hblank is really, really short, and there's no time to do everything in one go without glitches either in the scanline before or the one after. You may need 2 or 3 scanlines to accomplish what you want cleanly.
From the screenshot it looks like you're enabling rendering way too long after hblank, well into the visible picture, which is why the graphics for that scanline are shifted.
The simplest solution would be to use 1 hblank to turn rendering off, the next to write the colors, and a third one to enable rendering (after setting the scroll during the scanline).
The challenge of changing the palette mid-screen is precisely the fact that it can only be done during hblank (or there'll be "rainbows"), and that 1 hblank is way too short to do anything useful, so you absolutely do need a "break" in the graphics so you can spread the work across several scanlines. This makes the effect unfeasible in the middle of busy images.
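Spelled out as code, that three-hblank plan might look roughly like this. The timed waits are left as comments because their lengths depend on where the hit lands, and the color values are placeholders:

```asm
PPUMASK   = $2001
PPUSTATUS = $2002
PPUADDR   = $2006
PPUDATA   = $2007

    ; (timed wait 1: reach hblank on the last menu scanline)
    lda #$00
    sta PPUMASK       ; rendering off
    ; -- blank scanline: do everything that is not time-critical --
    bit PPUSTATUS     ; reset the $2005/$2006 write latch
    lda #$3f
    sta PPUADDR
    lda #$01
    sta PPUADDR       ; address $3F01, first color after the backdrop
    lda #$18          ; three placeholder colors, preloaded so the
    ldx #$28          ; hblank itself only has to do three stores
    ldy #$38
    ; (timed wait 2: reach the next hblank)
    sta PPUDATA
    stx PPUDATA
    sty PPUDATA       ; 12 cycles = 36 pixels
    ; -- blank scanline: restore the scroll with $2006 writes here --
    ; (timed wait 3: reach the next hblank)
    lda #%00011110
    sta PPUMASK       ; rendering back on
```

One caveat, as far as I know: while rendering is off and the PPU address points into $3Fxx, the screen shows that palette entry rather than the backdrop, so the "blank" lines may not come out black.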
Yeah, I figured I was cramming too much in there. I had a feeling that was one of the problems...
So then the solution would (or could) be something like...
Code:
;;;;;; HBLANK WAIT 1
LDX #$08 ;; i know this value will have to be played with to get it right
WaitScanLine:
DEX
BNE WaitScanLine
;;;;;;TURN RENDER OFF
;;;;; HBLANK WAIT 2
LDX #$08 ;; i know this value will have to be played with to get it right
WaitScanLine2:
DEX
BNE WaitScanLine2
;;;;;;;; update colors
;;;;; HBLANK WAIT 3
LDX #$08 ;; i know this value will have to be played with to get it right
WaitScanLine3:
DEX
BNE WaitScanLine3
;;;;; Turn on rendering / fix scroll
Which would give me two *black scanlines*, correct? I mean, theoretically this would be acceptable - the bottom most row in the HUD, the topmost row in the playfield...would be a solid black line that would be fairly negligible.
Still two questions:
1) Is there a better way to do the hblank waiting?
2) How to correctly adjust the scroll, if I'm indeed doing something wrong (keeping in mind that the game has no scrolling).
Thanks!
That's about right.
JoeGtake2 wrote:
;;;;; Turn on rendering / fix scroll
Fix the scroll first (preferably *before* the hblank), turn on rendering second.
Quote:
1) Is there a better way to do the hblank waiting?
Your loops on X are indeed too freaking coarse if that's all you're using. At 5 cycles per loop iteration you're working in increments of 15 pixels, not great. You can do the bulk of the waiting like this, but add some "padding" afterwards with single cycle precision (increments of 3 pixels). Also, these timed loops are incompatible with PAL consoles, where scanlines are ~7 cycles shorter, so you have to account for that if you plan on making a multi-region program.
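To put numbers on that, here is a small sketch of a coarse loop followed by fine padding. The counts assume the branch does not cross a page boundary, and the loop count of 8 is only an example:

```asm
    ldx #8            ; 2 cycles
coarse_wait:
    dex               ; 2 cycles
    bne coarse_wait   ; 3 cycles when taken, 2 on the final pass
                      ; loop = 8*5 - 1 = 39 cycles, 41 with the LDX
    nop               ; +2 cycles -> 43
    bit $00           ; +3 cycles -> 46 (zero-page read, only flags change)
```

Each step of the loop count moves the timing by 5 cycles (15 pixels); the padding after it is where the 3-pixel adjustments happen.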
Quote:
2) How to correctly adjust the scroll, if I'm indeed doing something wrong (keeping in mind that the game has no scrolling).
If you want to set the scroll to a tile boundary (which should be enough for status bars), simply writing the NT address of that tile (minus $2000) to $2006 will do the trick. If you need pixel-perfect precision, however, then you'll need to master the dreaded $2006/5/5/6 trick (it's quite simple once you do get it).
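The two-write version for a tile boundary could look like this. `SPLIT_ROW` is a hypothetical tile row, and the write latch is assumed to be reset (read $2002 first):

```asm
PPUADDR   = $2006
SPLIT_ROW = 4                 ; hypothetical: first tile row below the menu bar
SPLIT_OFS = SPLIT_ROW * 32    ; nametable address of that row's first tile, minus $2000

    lda #>SPLIT_OFS           ; high byte first ($00 here)
    sta PPUADDR
    lda #<SPLIT_OFS           ; low byte ($80 here); fetching resumes from this tile
    sta PPUADDR
```

Fine X is not touched by $2006, so it keeps whatever the last $2005 write set.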
Cool. For single cycle increments, just load dummy values into the accumulator or something? Never really done cycle counting like this, not sure what opcodes = how many cycles, etc. Best practice? Is there a source with this so I could evaluate how many cycles each part of this is taking? I'd imagine that would take at least a *little* of the arbitrary guess work out of what I'm currently doing...haha.
I imagined some of this would have to be re-figured for PAL version, but would likely be a matter of simply adjusting values to compensate?
JoeGtake2 wrote:
bit $2002 ; 4
LDA #$3F ; 2
STA $2006 ; 4
LDA #$00 ; 2
STA $2006 ; 4
LDA #$00; 2
STA $2001 ; 4
STA $2000 ; 4
;;;;;;;; change the palette
bit $2007; 4
bit $2007; 4
bit $2007; 4
LDA #$18; 2
STA $2007; 4
;;; this gets me close to lined up:
LDA #$00; 2
STA $2006; 4
LDA #$c8; 2
STA $2006; 4 -> 56cy, 168pixels, twice as long as the actual hblanking period
You don't have the time to do all this extra stuff during a mid-screen palette change if you can't tolerate a blank line.
Here's my annotated visual description of how the title screen for Indiana Jones and the Last Crusade does it.
Note that that loop is
1- exactly timed, so there's no poll loop for sprite 0
2- uses A, X, and Y
3- reloads as little as possible
4- doesn't use 2006/5/5/6 but just 2006/6
Thanks for posting that - I think, due to the fact that other things will be going on (and I'm perfectly ok with a single black line...even two wouldn't be so horrible, since it bottoms out with a black line for separation as it is) separating the menu from the action, the looser i could be with the already tight timing, the better.
I'm doing some research on opcodes and their cycle count. Tokumaru suggested padding with single-cycle precision (so I could be finely adjusting the hblank wait time to 3 pixels rather than 15 pixel increments a time in my loop...totally understand that). However, after quick researching, I don't see any operations that would take a single cycle. The smallest seems to be two, so how would I do this?
STA address,X/Y always takes 5 cycles instead of 4 ... although X or Y has to be a known value so that you can know the resulting address at build-time .
They also cause an extra dummy read, from address+X or address+X-256, so can't be used safely to slow a write to $2007
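A tiny sketch of using that as a 5-cycle delay. `SCRATCH` is a hypothetical unused RAM byte, placed outside zero page on purpose so the assembler picks the absolute,X form:

```asm
SCRATCH = $0300     ; hypothetical unused RAM byte (absolute, not zero page)

    ldx #$00        ; X must be a known value
    sta SCRATCH,x   ; absolute,X store: 5 cycles, page cross or not
```

The dummy read hits plain RAM here, so it has no side effects; the only cost is clobbering the scratch byte.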
JoeGtake2 wrote:
1) Is there a better way to do the hblank waiting?
The X counter loop is okay for the bulk of the cycles, and then you can use a couple of finer instructions at the end to finish it off with an exact number of cycles.
There is no 1 cycle wait (but you can do without).
If waiting 2 cycles,
NOP is the best choice. No side effects.
For 3 cycles there's a lot of options but they all have side effects.
BIT ZP only affects flags,
STA ZP only overwrites one byte in memory, etc.
I use this reference constantly:
http://www.obelisk.me.uk/6502/reference.html
All branch instructions are problematic (but not impossible) for cycle counting: different cycle count if branch taken, additional different cycle count if branch goes to a new page, etc. When writing cycle timed code, you should try to protect yourself against accidental page crossings:
https://forums.nesdev.com/viewtopic.php?f=2&t=14622
Finally, using breakpoints in a debugger that can count cycles is a good idea. You can use it to check your work, or even do the counting for you.
JoeGtake2 wrote:
The smallest seems to be two, so how would I do this?
If all you had was 2 and 3 cycle instructions:
2 = 2
3 = 3
4 = 2 + 2
5 = 2 + 3
6 = 2 + 2 + 2 = 3 + 3
7 = 2 + 2 + 3
etc...
Yes, there are no single cycle instructions, but you never have to wait a single cycle. The amount of time to wait can be an even or an odd number of cycles though, so in order to wait 3 more pixels you can, for example, replace a NOP (2 cycles) with a BIT $ff (3 cycles), and you'll effectively have waited 1 more cycle. As long as the minimum padding is 2 cycles, you can increase it in single cycle increments.
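As a quick illustration of that swap, here are two paddings one cycle apart (the zero-page address is arbitrary; BIT only reads it):

```asm
    ; 6 cycles of padding
    nop             ; 2
    nop             ; 2
    nop             ; 2

    ; 7 cycles of padding: one NOP swapped for a zero-page BIT
    nop             ; 2
    nop             ; 2
    bit $ff         ; 3
```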
Ha. That's funny. I didn't realize NOP was actually an instruction. I just thought it was an abbreviation for a family of potential operations that acted as null. This is where having studied computer science probably would've paid off! haha
And...whoa! After counting cycles (and understanding how that works) and doing the math on the scroll...........
Attachment: Whoa.png
Eureka! The brown text is the same palette spot as the pink text above. Obviously, this is just a dummy - and that is a dummy sprite compared to the one I would use...but...ha! YES! And in context, that single black line at the bottom of the menu bar would be a complete non-issue
You guys are awesome. The real question is...do I want to go into the REAL game, start futzing with my object drawing routines (starting them all at 0204), find all my sprite-blanking routines, add the checks for nt and sprite0 being drawn, and add this to an already crowded NMI..........
Hmmmmm......still though, very cool to *get* this. As always, you guys rock. Thank you for your constant feedback and unparalleled patience
JoeGtake2 wrote:
I didn't realize NOP was actually an instruction.
Every CPU I've learned machine code for has had an official NOP instruction (may or may not have that specific name, though). They're very useful in a lot of cases, not just delay, but for example patching an executable either temporarily for debugging or to fix something without having to or being able to recompile from source, etc.
JoeGtake2 wrote:
And...whoa!
Nice, it's finally working!
Just one more thing: if you do decide to put this in an actual game that will be released, be sure to test on real hardware! Fiddling with the PPU mid-frame is one of the hardest things to time right, and several emulators get it wrong (FCEUX for example, despite being an *awesome* development tool, is notoriously bad at replicating these minute timing aspects of the PPU).
Oh, another thing: as discovered semi-recently, disabling rendering mid-frame can corrupt the OAM *once rendering is enabled again* (so even if you do a sprite DMA while rendering is off, they will still get corrupted), so you should either understand the technical reasons why this happens and avoid the specific situations that cause this (I still haven't fully understood this!) or keep sprites disabled (i.e. masked) after the split to hide any possible sprite glitching for the remainder of the frame. AFAIK, no emulators will replicate this behavior, so again, testing on hardware is highly recommended.
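For the "keep sprites masked" option, re-enabling only the background after the split would look something like this. It only helps if nothing below the split needs sprites:

```asm
PPUMASK = $2001

    lda #%00001010  ; bit 3 = show background, bit 1 = background in left 8 pixels
    sta PPUMASK     ; bits 4 and 2 (sprites) stay clear, so sprites remain hidden
```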
Yeah, my order of operations for tomorrow is:
1) flash this to cart, test it on my top loader. If it works...
2) Iterate the actual game in its current form and spend likely the whole day moving some memory around, cleaning up my NMI to make it a bit more economical, tracking down places that would write to 0200, making contingency cases for when sprites might not be drawn...etc...hopefully get it working...
3) Get it looking good (right sprite in the right place with background...maybe even prioritized behind...so as not to be just some anomalous sprite dangling there)
4) flash to cart and test again on hardware.
If, after a good 10 hour day tomorrow it's not working right, I just might jettison it, but at least now thanks to you guys here I have an accurate record of how I got it working, and *get* how I got it working (and how to adjust accordingly to get it working again).
I didn't quite get your meaning on the last point. You're saying that this could corrupt any sprite draws that follow this during that frame? Because obviously, using this as a top menu bar...that would mean...*all* other sprites in the gameplay area. Is this what you mean?
Also, would you suggest an emulator that supports mapper 30 that might be a good second-check? I have a lot of other emus, but none that support mapper 30, which is why I've relied on a thing Shiru built for us for SUPER birds eye view quick proto, FCEUX to get closer and debug, and then hardware as a final look. It might be good to have another intermediary.
JoeGtake2 wrote:
I didn't quite get your meaning on the last point. You're saying that this could corrupt any sprite draws that follow this during that frame?
Yup. There's this nasty thing about the OAM that if you turn rendering off at certain points of the scanline, it starts overwriting OAM data when you turn rendering back on, be it on the same frame or the next, regardless of what you do while rendering is off... tepples discovered this when he was turning rendering off early to increase VRAM bandwidth and observed flickering sprites on the next frame.
Quote:
Because obviously, using this as a top menu bar...that would mean...*all* other sprites in the gameplay area. Is this what you mean?
In this case, you better read up about the glitch and how to avoid it:
viewtopic.php?t=4647
From memory, you have to keep the number of sprites in the scanline where rendering is turned off to a minimum (ideally zero, but you obviously need at least one for the hit) and turn rendering off as late as possible but still before hblank (i.e. x <= 255), but don't quote me on that.
Quote:
Also, would you suggest an emulator that supports mapper 30 that might be a good second-check?
I'm not familiar with mapper 30 at all, so I really don't know. Hopefully someone else can help you with this.
JoeGtake2 wrote:
Also, would you suggest an emulator that supports mapper 30 that might be a good second-check? I have a lot of other emus, but none that support mapper 30, which is why I've relied on a thing Shiru built for us for SUPER birds eye view quick proto, FCEUX to get closer and debug, and then hardware as a final look. It might be good to have another intermediary.
Mapper 30 is just UNROM with 2 additional optional things:
1- CHR-RAM bankswitching
2- self-flash capability
Caitsith2 had added support to FCEUX as of april 2014, so modern releases should have it...
Mesen doesn't have self-flashing but does support CHR-RAM bankswitching.
I don't think the previous generation of best-in-breed emulators are supported enough to have it.
But if you're not relying on those two behaviors, you should be able to just run it as mapper 2 for testing.
tokumaru wrote:
In this case, you better read up about the glitch and how to avoid it:
http://forums.nesdev.com/viewtopic.php?t=4647
From memory, you have to keep the number of sprites in the scanline where rendering is turned off to a minimum (ideally zero, but you obviously need at least one for the hit) and turn rendering off as late as possible but still before hblank (i.e. x <= 255), but don't quote me on that.
It's not as bad as that, I think? Just turn rendering off (i.e. write $2001) after pixel 240 on a scanline and you should be fine, unless I'm mistaken. This is easy to time if you're already trying to set up an hblank timing.
Reference:
https://wiki.nesdev.com/w/index.php/Errata#OAM_and_Sprites
(Edit: corrected 192 to 240, and additional note moved below.)
Annnnnd first step = a failure. You are correct. On hardware, I had the glitch that tells me timing was, indeed, off. Drat. Curses. Foiled.
Attachment: IMG_2910.JPG
I'd imagine this would be a fairly obnoxious series of trial and erroring to get exactly right (changing values, flashing, testing...rinse, repeat)...then the time it would take to modify the code...then the potential for the sprite flickering issue...maaaaaaaybe it's not so big a deal. At least I figured it out on a conceptual level.
(Continued from above...)
FWIW, there are only very few games that actually do mid-screen palette changes, and I don't think any of them use sprites below the palette change.
Actually, come to think of it, you may have to turn rendering back on later than pixel 240 as well to keep from coming online mid-evaluation too (though you might as well do it in hblank for cleanliness?)... since sprite evaluation is buffered for one line also, I am not sure if you'll get 1 line of pre-split sprites in your buffer still when you resume rendering... you could turn on background but hide sprites for one line just to flush it if that's the case... can't recall if this is really an issue though? (It's been a while since I tried it.)
JoeGtake2 wrote:
Annnnnd first step = a failure. You are correct. On hardware, I had the glitch that tells me timing was, indeed, off. Drat. Curses. Foiled.
I'd imagine this would be a fairly obnoxious series of trial and erroring to get exactly right (changing values, flashing, testing...rinse, repeat)...then the time it would take to modify the code...then the potential for the sprite flickering issue...maaaaaaaybe it's not so big a deal. At least I figured it out on a conceptual level.
If you are using FCEUX, use the "new PPU" setting if you're not already, but... I don't think it's really quite adequate for this kind of timing.
Nintendulator has given very good/close results in my experience with this, not sure if it's 100% but it's as good as I've seen when comparing.
I would strongly suggest that you step through the code in Nintendulator and pay very particular attention to what pixel of the scanline the writes are taking place. Make sure it's within the hblank pixel range, and try to leave a margin for error on both sides. (Measure it many times, there is going to be jitter of several pixels from your BIT loop too, pushing it back and forth, make sure you find both the low and high range of this jitter for the write timing!)
One other thing you may notice on hardware is that the NTSC PPU's "hblank" area isn't entirely blank, the image starts and ends early so often "in hblank" effects like this can still be seen at the edge of the picture (some TVs or capture devices will show more of this than others).
rainwarrior wrote:
It's not as bad as that, I think? Just turn rendering off (i.e. write $2001) after pixel 192 on a scanline and you should be fine, unless I'm mistaken.
The wiki says "x=192 for lines with no sprites, x=240 for lines with at least one sprite", and while he could trigger the sprite 0 hit using the bottom row of the sprite to avoid evaluating it on the scanline where rendering is turned off, this is for a status bar at the top of the screen, so there may be sprites scrolling from/to the top of the screen and crossing the boundary of the status bar, where rendering is turned off.
JoeGtake2 wrote:
Annnnnd first step = a failure.
Yeah, raster effects can be annoying like that to code. I've had a lot of headaches with these kinds of glitches in the past (don't be surprised if you get it to look right on hardware and then it looks wrong on emulators!), enough for me to give up on anything that turns rendering off/on mid-frame. I'll still do tamer raster effects, such as scroll changes or anything that doesn't interfere with the PPU's work (color emphasis, grayscale, bankswitching, etc.) because the timing constraints aren't so severe in those cases.
tokumaru wrote:
The wiki says "x=192 for lines with no sprites, x=240 for lines with at least one sprite"
Pardon that omission, was working from memory and was only skimming the reference I provided. I thought the margin was wider.
So yeah, 240 not 192.
rainwarrior wrote:
Nintendulator has given very good/close results in my experience with this, not sure if it's 100% but it's as good as I've seen when comparing.
A few years back when I was experimenting with sensitive timing like this, Nestopia always gave me the closest results compared to what I saw on real hardware, but unfortunately it didn't have any debugging features at all. Nintendulator was pretty close too. I can't vouch for any of the more current emulators, because I haven't done this kind of testing in a while.
Yeah, what I'm vouching most for is Nintendulator's debugger's pixel timing feature. Like, don't try and do it by just visual output alone, get your writes timed within its hblank (which you can only verify the specific timing of in the debugger). If you get them centred in that range (hopefully with a little bit of padding on both ends), that has the best chance of transferring to hardware correctly.
A visual test alone is not good enough. "This appears to work on FCEUX/Nestopia/Nintendulator/etc." has far less chance of transferring correctly to hardware than something you've intentionally designed to fit in the centre of a range of timing tolerance.
Also, jitter is important to account for too. The BIT/BEQ loop is 7 cycles, so that's 21 pixels your timing is going to move back or forth. Make sure you verify/measure this and account for it. Centre your timings based on the middle of this range, and choose your margins to accommodate this range. (If you do LESS stuff per hblank you can have a bigger margin, too.)
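To put numbers on that, here is a quick Python sketch of the centring arithmetic (not NES code). It assumes NTSC timing (3 pixels per CPU cycle, 341 pixels per scanline, with hblank taken as pixels 256 to 340) and models the jitter as the full 21 pixels after the earliest possible write. The function names are made up, and the real safe window for a given effect may well be narrower than the whole of hblank, so measure in the debugger instead of trusting these bounds.

```python
PIXELS_PER_CPU_CYCLE = 3        # NTSC
POLL_LOOP_CYCLES = 7            # BIT absolute (4) + taken branch (3)
JITTER_PIXELS = POLL_LOOP_CYCLES * PIXELS_PER_CPU_CYCLE   # 21

# Assumed window: an NTSC scanline is 341 pixels (0..340), visible part is 0..255
HBLANK_START, HBLANK_END = 256, 340


def margins(earliest_write_pixel, window=(HBLANK_START, HBLANK_END)):
    """Return (left, right) slack in pixels; a negative value means a write can miss the window."""
    latest_write_pixel = earliest_write_pixel + JITTER_PIXELS
    return earliest_write_pixel - window[0], window[1] - latest_write_pixel


def centred_earliest(window=(HBLANK_START, HBLANK_END)):
    """Earliest-write pixel that puts the whole jitter range in the middle of the window."""
    width = window[1] - window[0]
    return window[0] + (width - JITTER_PIXELS) // 2


target = centred_earliest()
print("aim earliest write at pixel", target, "margins", margins(target))
```

If several writes have to fit in the same hblank, pass a narrower `window` that leaves room for the last write; the margins shrink accordingly, which is the "do LESS stuff per hblank" trade-off.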
edit: corrected another mistake
rainwarrior wrote:
Yeah, what I'm vouching most for is Nintendulator's debugger's pixel timing feature. Like, don't try and do it by just visual output alone
The problem is that a pixel timing feature doesn't mean much if the visual output doesn't match what the hardware shows. It's not that Nintendulator is way off, so its debugging features are indeed *very* useful, but in my old tests, Nestopia consistently outperformed Nintendulator for tiny timing issues like this, even if not by much, so I thought I'd mention it.
Quote:
The BIT/BEQ loop is 6 cycles
7 cycles actually, since the branch is continuously taken while waiting for the hit, so 21 pixels. But yes, it's very important that you account for this jitter.
tokumaru wrote:
rainwarrior wrote:
Yeah, what I'm vouching most for is Nintendulator's debugger's pixel timing feature. Like, don't try and do it by just visual output alone
The problem is that a pixel timing feature doesn't mean much if the visual output doesn't match what the hardware shows. It's not that Nintendulator is way off, so its debugging features are indeed *very* useful, but in my old tests, Nestopia consistently outperformed Nintendulator for tiny timing issues like this, even if not by much, so I thought I'd mention it.
In my experience the visual output will usually match Nintendulator if you're within the tolerance range. (With enough leeway you can get it to work universally across hardware and many emulators too.)
If you're at the edge, that's where you tend to get problems. ...and if you're just going by visual you can't tell the edge from the middle of tolerance. The debugger makes that easy to target.
That's the problem with Nestopia for testing. Whether or not it has more accurate timing (unsure), it has no debugging features.
Of course, you could try to feel out the edges of the timing by writing a ROM that lets you adjust it on the fly (and outputs the relevant values), and you could find the low/high point where it breaks. That'd probably get you the best info from a hardware test, and you could evaluate emulators with it too... (maybe one of us should write something like this... an hblank timing explorer ROM)
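As a sketch of what you'd do with the output of such a ROM: note down which delay settings gave a clean picture, then take the middle of the longest clean run. The Python below is only an illustration; the function name is made up, the sample results are invented placeholders and not measurements, and it assumes every consecutive delay setting in the range was tested.

```python
def working_range(results):
    """results maps delay setting -> True if the picture was clean.

    Returns (low, high, centre) of the longest run of clean settings, or None.
    """
    best = None
    start = None
    delays = sorted(results)
    for i, delay in enumerate(delays):
        clean = results[delay]
        if clean and start is None:
            start = delay
        run_ends = start is not None and (not clean or i == len(delays) - 1)
        if run_ends:
            end = delay if clean else delays[i - 1]
            if best is None or end - start > best[1] - best[0]:
                best = (start, end)
            start = None
    if best is None:
        return None
    low, high = best
    return low, high, (low + high) // 2


# Hypothetical observations, for illustration only
observed = {10: False, 11: False, 12: True, 13: True, 14: True, 15: True, 16: False}
print(working_range(observed))
```

Running the same sweep on hardware and on each emulator would show how far their working ranges overlap.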
rainwarrior wrote:
Nintendulator's debugger's pixel timing feature
Is this an actual feature, or are you referring to the "ticks" textbox showing the CPU/PPU timing? I took a look and wasn't really able to see anything that Nintendulator has that Mesen & FCEUX's debuggers don't have in terms of showing the timing where instructions occur vs the PPU? (FCEUX's relatively low accuracy aside)
Mostly asking since if this is an actual useful feature in Nintendulator that I'm not aware of, I'd try to add something similar to Mesen.
Sour wrote:
rainwarrior wrote:
Nintendulator's debugger's pixel timing feature
Is this an actual feature, or are you referring to the "ticks" textbox showing the CPU/PPU timing? I took a look and wasn't really able to see anything that Nintendulator has that Mesen & FCEUX's debuggers don't have in terms of showing the timing where instructions occur vs the PPU? (FCEUX's relatively low accuracy aside)
Yes, just showing the current pixel position of the scanline in the debugger while stepping is what I'm referring to. FCEUX has it as well, just the timing seems to be shifted slightly. I think generally the relative CPU timing of all of them is perfect, but the specific timing of synchronization events like sprite 0 hit or when exactly PPU writes take effect, etc. is the source of divergence between one emulator and another, and between emulators and hardware.
Edit: I've started a test ROM, not really any kind of conclusive result yet but trying to see how emulators differ in this regard:
https://forums.nesdev.com/viewtopic.php?f=3&t=16308