NMI and Main Loop - NESdev BBS

NMI and Main Loop
by Celius on 2008-03-16 (#31702)

So I've heard many things about how the NMI should do visual and audio updates, and the main loop should do all game related calculations. The NMI should remain enabled. I wish I could agree with this statement, but I can't, because I just don't know how you can call the NMI to update that stuff when you can't guarantee that you won't be updating garbage because the main loop isn't done calculating stuff.

For example, my RPG project calls for a possibly very long map decompressing routine that may take longer than a frame (at the very worst). I do this in my endless loop and shut NMIs off, because if I had NMIs enabled and wasn't done decompressing the map, the NMI would be putting garbage on the screen. What is a standard setup for making this all work, and could someone explain how they can guarantee garbage won't be on screen?

Re: NMI and Main Loop
by tepples on 2008-03-16 (#31704)

Celius wrote:

It's called locking. When the main thread is ready to make the transfer buffers inconsistent, it sets a flag stating so. Once the transfer buffers are consistent again, the main thread turns off that flag. The NMI thread checks the flag and delays updates if it is on.

Ordinarily, on an NES game using a split main/NMI structure, the NMI thread is not interruptible. It runs with IRQs blocked (flags.I = 1), and it finishes within 2300 (NTSC) or 7500 (PAL) cycles of hard real time so another NMI won't happen. But if you do make the NMI interruptible, such as in an NMI-only engine, you'll have to use INC and DEC to implement locking.
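
The flag-based locking described above can be sketched as a small Python simulation. This is only an illustration under assumptions, not 6502 code and not anyone's actual engine: the names `Shared`, `main_update` and `nmi` are hypothetical, and the `ppu` list merely stands in for what ends up on screen.

```python
class Shared:
    """State visible to both the main thread and the NMI handler (hypothetical)."""
    def __init__(self):
        self.locked = False    # set by the main thread while the buffer is inconsistent
        self.buffer = [0, 0]   # transfer buffer: two values that belong together
        self.ppu = [0, 0]      # stand-in for what is shown on screen
        self.music_ticks = 0

def nmi(s):
    """Runs once per vblank; delays the copy while the lock flag is on."""
    if not s.locked:
        s.ppu = list(s.buffer)
    s.music_ticks += 1         # sound keeps running either way

def main_update(s, new_values, nmi_hits_midway=False):
    """Main thread rewrites the buffer under the lock flag."""
    s.locked = True
    s.buffer[0] = new_values[0]
    if nmi_hits_midway:
        nmi(s)                 # simulate an NMI landing between the two writes
    s.buffer[1] = new_values[1]
    s.locked = False
```

If the simulated NMI lands between the two writes, the screen keeps its old contents for one more frame and picks up the complete pair on the next vblank, while the music counter still advances.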

by Bregalad on 2008-03-16 (#31705)

Well, I currently do the opposite (sort of): whenever I want strings of data to be transferred to $2007, I set a flag. Once the main thread sets the flag, it's assumed that the data is ready to be transferred. The main thread can do all sorts of calculations using the same buffer, under 2 conditions:
- Make sure the previous data has effectively been transferred by the NMI update routine (so that you don't skip graphics updates)
- The flag is clear, so that the NMI routine will never try to read this data.

The easiest way to handle it is to have the NMI routine immediately clear the flag when it's set, so that you always know:
- If the flag is set, data is ready to be transferred
- If the flag is clear, either no string is pending, or the main thread is currently working on it.

So my NMI routine just checks a set of flags, updates data to the PPU if those are set (with some kind of priority order), and clears them. If all flags are clear, the NMI does nothing except write to $2005 and $2000 and do sound. This allows the main thread to do PPU writes as long as all flags are clear (and rendering is disabled) while the NMI stays on (so with music), as $2006 remains untouched.
Of course you're free to write an NMI routine that does anything in any order, however you like. There are many different ways to code the same thing. However, once you've chosen one, it's big trouble to change.
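
The "ready" flag handshake from this post can be sketched as follows, again as a Python simulation rather than real NES code. `UpdateSlot`, `main_submit` and `nmi` are hypothetical names, and the `ppu` list is just a stand-in for data written through $2007.

```python
class UpdateSlot:
    """One pending string of PPU data plus its 'ready' flag (hypothetical)."""
    def __init__(self):
        self.ready = False
        self.data = []

def main_submit(slot, data):
    """Main thread: refuse to touch the buffer until the previous string was sent."""
    if slot.ready:
        return False          # previous data not transferred yet; try again later
    slot.data = list(data)    # safe: the NMI ignores the buffer while the flag is clear
    slot.ready = True         # set the flag last, once the data is complete
    return True

def nmi(slot, ppu):
    """NMI: transfer only when the flag is set, and clear it so main can reuse the buffer."""
    if slot.ready:
        slot.ready = False
        ppu.extend(slot.data)
```

With one flag per kind of update, the NMI can test them in priority order, and a frame in which no flag is set costs almost nothing.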

by Celius on 2008-03-16 (#31708)

I guess you can use flags to indicate when updating should occur. Actually, my NMI system would allow for me to forget about flags, and just stick updating routines in the indirect jump list (My NMI routine is just a list of 16 indirect jumps). But if you're not ready to update sprites, I think it'd be really bad to update the scroll and the background, because you'd end up with jumping sprites when scrolling. So basically I think you have to be ready to update the whole screen if you want to update it at all. But I suppose music can be updated every frame.

If that's the case, where you really only want to update the whole screen at once, then your screen has to be ready to update in less than a frame if you want the framerate to be over 30FPS. I just don't get how you can update some things and not others without causing visual problems. And also, isn't what you update really dependent on game logic? So to me, I'm still seeing it as you need to complete a "game frame" before you can update. By game frame, I mean a complete cycle of the game loop. It still seems like everything needs to be updated if anything is updated.

by Bregalad on 2008-03-16 (#31711)

Quote:
I just don't get how you can update some things and not others without causing visual problems.

I update scroll registers every frame, so that the screen doesn't risk shaking or anything like that. This doesn't cause any visual problems. I update sprites only when they are ready. Anyway, in 99% of cases your game loop should complete within a frame, else your game will be buggy anyway. It's just good to design it so that it still looks alright when it slows down.

by tokumaru on 2008-03-16 (#31712)

Bregalad wrote:
I update scroll registers every frame, so that the screen doesn't risk shaking or anything like that. This doesn't cause any visual problems.

I've seen this cause weird effects in games before. Can't name any though. But you know how it looks, when a block that should be part of the scenery shakes a bit compared to the ones that are actually drawn with background tiles.

It's probably not an issue with Bregalad's game because he has a very efficient engine for a somewhat simple game (for example, there is no scroll and enemy code running at the same time), and all updates are probably done within the time of a frame.

In my game, I only update the PPU if all the data is ready, or it'd look weird. I still have NMIs enabled at all times, because I need to set up the IRQ responsible for the top border that masks the top 16 scanlines. When the IRQ fires, I also handle the music, because the music shouldn't slow down in case the game does.

In fact, the music is a very good reason to have NMIs always enabled... If a frame is not enough for all your calculations, you may not do anything with the graphics during VBlank, but at least call your music routine so that it doesn't slow down too.

by Celius on 2008-03-16 (#31713)

Bregalad wrote:
I update scroll registers every frame, so that the screen doesn't risk shaking or anything like that. This doesn't cause any visual problems. I update sprites only when they are ready. Anyway, in 99% of cases your game loop should complete within a frame, else your game will be buggy anyway. It's just good to design it so that it still looks alright when it slows down.

Well if you update the scroll registers, you have to update the sprites too. So if you're not ready to update sprites, and you update the scroll registers, that could lead to some shaky sprites. But I'm assuming you somehow don't allow the screen to scroll if your sprites aren't completely updated?

I'm trying to look at some code from other sources to get it into my mind how partial updates are okay. It's obvious that somehow, it's okay, and pulled off correctly.

It's true that a game should take about a hardware frame to execute a software frame. I have to think more about this, for some reason I find it confusing.

by Bregalad on 2008-03-16 (#31714)

Well, updating the scroll registers says nothing about what value you put in them. Whether the sprites scroll at the same time as the main screen depends only on when you write the new scroll value to your buffer, and not on whether your NMI routine always writes to $2005 (mine does; there's no flag to prevent it). Effectively my game doesn't run scroll and enemies at the same time (it's one or the other), but if I were to introduce scrolling in my main loop then I guess the best would be to place it right after the sprite updates. If the object handling and sprite updating parts take too long (those are definitely the only 2 parts that actually take very long), the scroll is then updated right after them, so in practice the NMI is unlikely to hit between the moment the new sprites are ready and the moment the new scroll values are ready.

Anyway, do as you like, and modify your NMI engine according to your game's needs... I made a pretty general-purpose NMI handler myself, but nobody has to. For example, Final Fantasy does everything in its main code and the NMI does nothing but return. If the main loop were ever to slow down, everything, including music and sprites, would. However, this never happens because an RPG is usually made of short frames. This worked in 3D-World Runner and Rad Racer because they wrote them in such a way that they would never slow down (in fact pretty much everything has to be timed in those games because of raster effects). However, if Square were to code a more advanced platformer like Kirby, they'd most likely have to change their tactics.

SMB, Zelda and Metroid are games where the music can lag when the game does, and it really sounds bad (especially in SMB where the status bar also shakes).

by Celius on 2008-03-16 (#31715)

tokumaru wrote:
In my game, I only update the PPU if all the data is ready, or it'd look weird.

This is exactly what I'm talking about. But I suppose it wouldn't matter as much if the sprite's graphics/positions in relation to the map weren't completely ready. The sprite drawing routine needs to happen as often as the scroll is set, since the sprite drawing routine is responsible for placing sprites in relation to the scroll. And updates should be ready to happen as often as the scroll is set, because if the scroll is changed, exposing a new row/column of tiles, the background will have to be updated. And in order to update, you must have all of your PPU data ready, because if you don't, you may be updating half of a row/column and missing attributes and stuff like that. It all has to be ready, pretty much.

by Bregalad on 2008-03-17 (#31738)

I'd personally recommend coding your stuff as naturally as possible (and as optimised as possible) while avoiding all overhead. Since the game will look wrong when slowing down, does it really matter if it looks a little more wrong or a little less wrong? The answer is obviously no, as your game shouldn't slow down anyway, and it shouldn't crash or flicker if it does, but nobody would expect it to look "all right" since it's lagging anyway.

In short, just code stuff naturally, and if bugs are seen, then fix them. As long as no bugs are seen you're fine, and you don't need overhead (except for optimising, which is another problem).

by Celius on 2008-03-17 (#31740)

All I'm wanting to do is make sure that JUST IN CASE there needs to be any slow down, the game will be able to slow down just a little bit without causing a catastrophe. And I am also wanting the game to mostly run at 60FPS. I'm trying to avoid running at a constant 30FPS, or running at 60FPS and updating garbage on the screen when it takes a little longer than a frame.

I have to try and avoid one of these problems, otherwise I'm going to end up with either really annoying slow framerates, or disaster. Neither one is good.

I suppose I can just update the PPU data right away after the NMI, and then the rest of the frame would just be object calculations. And thankfully, my sprites' graphics are represented with one byte, so if an NMI were to occur in the middle of me assigning a new graphic to a metasprite, it would show up as one graphic, or another. If it were represented with two bytes, then I'd have some trouble. Because I could update one byte, and the other would remain un-updated while the NMI is called. This could result in really really bad things. I'd have to guarantee pretty much that the PPU data, excluding sprites, would be updated in less than a frame. This is very possible for my platformer.

I suppose I see how it can work a little bit. I just don't see how you could update some PPU stuff, and not other PPU stuff. Like if you scroll over, you HAVE to be ready to update the background including the attribute table. And if you're touching sprite coords in relation to the map, you have to set a flag that states you can't scroll over. It gets back to two byte writes. If you update one byte, and not the other while being interrupted by the NMI, your sprite will end up in a place that you're just not wanting it to end up. Although, it would only be for a frame, and people might not even notice it.

by Bregalad on 2008-03-18 (#31769)

Then it's quite simple to do what you describe. Your routine that waits until an NMI has passed most certainly relies on a flag to do so. So in the NMI engine, if you see this flag is in the state where the main program was waiting, you do all PPU and sprite updates before returning; otherwise you only do music and then return. I don't handle it that way myself because it rules out any hope of getting an update if the main program is doing something other than an "official" wait, but if you want to do it that way it's fine. If I remember correctly, Mega Man worked in a similar manner.
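
As a rough sketch of the scheme just described, here is a Python simulation in which the NMI does video updates only when the main loop is sitting in its wait, and does music every time. Everything here is a hypothetical model: `State`, `nmi`, `run`, and the `frame_costs` input (how many hardware frames each game frame needs) are made up for illustration.

```python
class State:
    def __init__(self):
        self.waiting = False    # set by the main loop once a game frame is finished
        self.video_updates = 0
        self.music_ticks = 0

def nmi(st):
    """Runs every vblank: video only if the main loop was waiting, music always."""
    if st.waiting:
        st.video_updates += 1   # PPU and sprite updates would go here
        st.waiting = False      # releases the main loop from its wait
    st.music_ticks += 1

def run(frame_costs):
    """frame_costs[i] = hardware frames that game frame i takes to compute."""
    st = State()
    for cost in frame_costs:
        for _ in range(cost - 1):
            nmi(st)             # NMI fires while the main loop is still busy
        st.waiting = True       # main loop is done and enters its "official" wait
        nmi(st)
    return st.video_updates, st.music_ticks
```

In this model, `run([1, 2, 1])` performs three video updates but four music ticks: the slow game frame delays the picture by one hardware frame, and the music does not slow down.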

If I'm not mistaken you're living in the USA, so that's a free country (at least from what I've heard, and they're even proud of it to the point where they make statues about it), so it should be all right for you to do it the way you want without having anyone torture you or anything.

by tepples on 2008-03-19 (#31805)

Bregalad wrote:
If I'm not mistaken you're living in the USA, so that's a free country (at least from what I've heard, and they're even proud of it to the point where they make statues about it), so it should be all right for you to do it the way you want without having anyone torture you or anything.

Except perhaps copyright owners, who can threaten you with the DMCA or something. But does that apply here?

by Celius on 2008-03-19 (#31840)

I think I'll be fine if I do it this way. I am well aware that I can do it any way I want, but that's not the point. I just wanted to know what most games do for this, and how they do it. If I have any more questions about this, I'll be sure to come back.

by tokumaru on 2008-03-19 (#31854)

I think it's fine for anyone to ask for opinions on how to do a certain thing. Bregalad was a bit too harsh when he said "do it however you want", I must agree. But now I feel that we have discussed the pros and cons of a few different methods, and Celius seems to have made a decision, so everything is fine.

This is a perfectly good subject to discuss, because setting up the way game logic and screen updates are performed is not as straightforward as it may seem at first, and it's easy to make a decision that will give you headaches later on. I have changed the way my main loop is organized a few times already, and even a change of mapper hardware has made an impact on that.

by Bregalad on 2008-03-20 (#31865)

Quote:

This is a perfectly good subject to discuss, because setting up the way game logic and screen updates are performed is not as straightforward as it may seem at first, and it's easy to make a decision that will give you headaches later on. I have changed the way my main loop is organized a few times already, and even a change of mapper hardware has made an impact on that.

This is, sadly, right. When I first started NES programming I decided to do everything in the NMI and only have a jmp * instruction outside of it (like SMB), mostly because many demos available back then worked like this.
But this later turned out to be a very bad choice and I eventually stopped doing it that way.

Being a more experienced programmer doesn't solve this kind of "bad choices" problem at all. I recently completely changed the way objects work in my main game project, and even if the new method is a lot better than the old one, that is quite a big thing to fix, and it needs quite some work (it's still not finished). I also made a major change to the routine that handles sprites, and made it much more optimised and user friendly.
Overall, I don't know if that's the case for many people here, but I personally spend more time improving/fixing/optimizing code I wrote before than coding new things.

That's probably why, when someone has a problem on nesdev, nobody can come and say "I have the solution to your problem". Instead, one will maybe have a part of a possible solution, and worse, you'll most likely get completely different solutions, and it's hard to determine which is the best.

by nineTENdo on 2008-03-23 (#32013)

I'm kind of in the same boat right now, deciding which parts of my code are better run in the NMI or in the main loop. So far I've found that, at least, delays work best in the main loop and not in the NMI. Basically almost everything in my code is being run in the NMI, but there are some slight problems in the NMI that run better in the main loop, like loading a new nametable with little or no glitches, or moving a sprite while another one is moving under its own algorithm. Glitchy, glitchy, glitchy. That's the problem. I figure flags, flags, flags, but I just haven't envisioned it yet.

by tokumaru on 2008-03-23 (#32015)

Wait a minute... "delays"? What do you mean by "delays"? I hope you don't mean a loop that just wastes time in order for something to remain on the screen for a while... because that's just terrible.

It's terrible because you end up delaying EVERYTHING. A proper delay system should be defined in terms of frames, and variables in RAM would act as counters. If you decrement these variables once every frame, ideally in your NMI routine, when they reach zero you know the delay is over. This allows you to delay different parts of your program independently if you want to.

For example, say you have an enemy walking left, but then he stops moving for a while before turning around and moving right. You surely won't want the whole game to stop (possibly even the music) just because the enemy needed to remain stopped for a while.

About the PPU updates, you can only do them when the screen is not rendering, regardless of whether you're doing it inside the NMI routine or not.
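
The enemy example above can be sketched with a per-frame countdown. This is a minimal Python illustration under assumptions: the names `tick_timers`, `enemy_step`, `run` and the timer name `enemy_pause` are hypothetical, and the "music" is just a counter showing that the rest of the game keeps running during the pause.

```python
def tick_timers(timers):
    """Called once per frame (e.g. from the NMI): count every active timer down."""
    for name in timers:
        if timers[name] > 0:
            timers[name] -= 1

def enemy_step(enemy, timers):
    """The enemy moves only when its own pause timer has run out."""
    if timers["enemy_pause"] > 0:
        return                       # only the enemy waits; nothing else is blocked
    enemy["x"] += enemy["dx"]

def run(frames):
    timers = {"enemy_pause": 0}
    enemy = {"x": 10, "dx": -1}      # starts out walking left
    music = 0
    for f in range(frames):
        tick_timers(timers)
        music += 1                   # music advances every frame regardless
        if f == 2:                   # on frame 2: stop for 3 frames, then head right
            timers["enemy_pause"] = 3
            enemy["dx"] = 1
        enemy_step(enemy, timers)
    return enemy["x"], music
```

Over 8 frames the enemy walks left twice, stands still for three frames, then walks right three times, while the music counter ticks on all 8 frames.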

The most common cases have already been explored:

1. Whole game inside the NMI routine;
2. Whole game in the main program;
3. Game logic in the main program, PPU updates inside NMI routine;

All of them work fine, but one might be better than the other for a particular project, and only the programmer can decide that.

Number 1 can be a bit confusing, because every VBlank you update the PPU with data calculated during the last frame, and then calculate the data for the next one. Number 2 works fine too, when you use the NMI to set a flag to indicate that VBlank started. Neither 1 nor 2 handles slowdowns very well, because you either can't tell you missed a frame, or you can only tell a while after it happened. So you'd better make sure your frame calculations fit within a frame's time at least 99% of the time.

With number 3 you can split the work, in a way that a piece of it can take as long as necessary, while the other will run once every frame for sure. This makes it easy to take care of things that need critical timing, such as music.

Frankly, if your game logic is pretty fast, it doesn't matter at all what method you choose. The problem is mostly how to handle slowdowns.

There are a few rare cases where a game could seriously benefit from the 3rd method. When I was first thinking about making a raycaster, I knew I'd have a lot of things to calculate. So, once there was enough data to update the PPU, instead of waiting for VBlank doing nothing (methods 1 and 2) the data for the next frame could already be calculated, so chances were frames would be ready faster. In some cases it is necessary to keep the main code always busy, while the NMI simply updates the PPU with whatever data is ready at the time it fires.

Again, this is the case of more complex programs, where a software frame is not necessarily aligned to a single hardware frame.

by nineTENdo on 2008-03-23 (#32017)

tokumaru wrote:
If you decrement these variables once every frame, ideally in your NMI routine, when they reach zero you know the delay is over.

But then again you will have to nest other counters, because the maximum possible value would be FF to 00.

tokumaru wrote:
This allows you to delay different parts of your program independently if you want to.

This is pretty genius though, I never thought of it like that.

by tokumaru on 2008-03-23 (#32019)

nineTENdo wrote:
But then again you will have to nest other counters, because the maximum possible value would be FF to 00.

You don't have to nest, simply using multi-byte variables will do. With 2 bytes, you can count up to $FFFF. That's 65535 frames, or about 1092 seconds. That's 18 minutes! Why would you want to delay anything more than that? If by any chance you do need to, use one more byte and you can wait about 77 hours. I doubt anyone even keeps their NES on for that long! =)
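
The arithmetic above can be checked with a short Python sketch. It assumes a flat 60 frames per second, as the post does, and the names `max_delay_seconds` and `dec16` are hypothetical. `dec16` mimics how a two-byte counter would be decremented with a borrow from the low byte into the high byte.

```python
FPS = 60  # assumption: one counter tick per frame at 60 frames per second

def max_delay_seconds(num_bytes):
    """Longest delay a counter of num_bytes bytes can express, in seconds."""
    return (256 ** num_bytes - 1) / FPS

def dec16(lo, hi):
    """Decrement a 16-bit counter stored as two separate bytes; returns (lo, hi)."""
    if lo == 0:
        hi = (hi - 1) & 0xFF   # borrow from the high byte
    lo = (lo - 1) & 0xFF
    return lo, hi
```

One byte gives a little over 4 seconds, two bytes about 1092 seconds (roughly 18 minutes), and three bytes about 77 hours, which matches the figures in the post.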

nineTENdo wrote:
This is pretty genius though, I never thought of it like that.

This is the most important thing that makes games possible. Most people just study linear logic when learning how to program, and because of that they fail to understand how an interactive program works. In a game, everything works in little pieces over time. If you hog the CPU for a single purpose, how will the other things in the program function?

I guess that it's like each entity (enemies, player, animations, delays, whatever) only "thinks" once per frame, and what they think is: "Based on my current state and on the current circumstances, what action should I take during this frame?". In the case of the title screen delay, the current state is "waiting for the counter to reach 0", the circumstance is the value of the counter, and the action it takes is either "decrement the counter" (in case it hasn't reached 0) or "fade out the screen and move along" (in case it did reach 0). Most events in a game can be described like this.
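
The title screen example can be written as a tiny state machine that is called once per frame. This is only a sketch in Python: the state names, the `brightness` field and the function name `title_screen_think` are hypothetical, chosen to mirror the "state, circumstance, action" description above.

```python
def title_screen_think(state):
    """One 'thought' per frame: look at the current state, take one small action."""
    if state["mode"] == "waiting":
        if state["counter"] > 0:
            state["counter"] -= 1        # not there yet: just count down
        else:
            state["mode"] = "fading"     # counter reached 0: start fading out
    elif state["mode"] == "fading":
        state["brightness"] -= 1         # fade one step per frame
        if state["brightness"] == 0:
            state["mode"] = "done"       # move along to the next screen
```

Because each call returns immediately, every other entity (and the music) gets its own turn in the same frame.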

by Celius on 2009-03-20 (#44572)

I really hate digging up this old thread, especially considering that this has been talked about a bunch of times. But I'm still confused about something.

Do most games force a software frame to take an integer number of hardware frames? So a game frame will either take 1 or 2 hardware frames, not like 1.5? It seems like this would be the only way to go, because you never want the game loop to be executed unless all of the necessary PPU and audio data has been copied, which can only really happen at the start of a frame. If that were the case, then how do (if any do) games manage to slow the frame rate down to anything other than 30, 20, 15, 12, etc. FPS?

by tepples on 2009-03-20 (#44573)

A game might use a pattern of 2, 1, 2, 1, etc. I believe it's called triple buffering. Even with plain-old double buffering,

Field 1: Video
Field 2: Audio and physics, then half of video
Field 3: Half of video, then audio and physics

On the NES, "half of video" might refer to backgrounds vs. sprites, or backgrounds at the scroll seam vs. backgrounds within the visible area, or backgrounds and sprites vs. CHR RAM.
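
The frame rates mentioned in this exchange follow from simple arithmetic, sketched here in Python. The function name `effective_fps` is hypothetical, and 60 hardware frames per second is an approximation for NTSC.

```python
def effective_fps(pattern, hardware_fps=60):
    """pattern: hardware frames used by each game frame, repeated forever."""
    return hardware_fps * len(pattern) / sum(pattern)
```

A game frame that always takes n hardware frames gives 60/n, that is 30, 20, 15, 12 and so on. An alternating 2, 1, 2, 1 pattern completes two game frames every three hardware frames, which averages out to 40 FPS.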

by Celius on 2009-03-20 (#44579)

Wow, I can't believe I wasn't thinking about double-buffering or triple-buffering. This opens up a whole new world of ideas for me.

I was thinking of something pretty interesting. What if you had like a "stack" of updates, and every frame you pulled off as many as you could, leaving the rest for next frame? Though after a certain amount of time, you would end up with mixes of multiple frames, which would be bad.

I'll have to think about this some more...

by Bregalad on 2009-03-21 (#44581)

Quote:
I was thinking of something pretty interesting. What if you had like a "stack" of updates, and every frame you pulled off as many as you could, leaving the rest for next frame? Though after a certain amount of time, you would end up with mixes of multiple frames, which would be bad.

If the CPU is too slow, the stack will simply be pushed more than pulled each frame and will finally overflow, so I believe it's a bad idea. I could be wrong though.
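
The overflow argument can be illustrated with a small Python simulation. It is a sketch under assumptions: the pending updates are modelled as a first-in, first-out queue with a fixed capacity, and the name `frames_until_overflow` and its parameters are hypothetical.

```python
from collections import deque

def frames_until_overflow(pushed_per_frame, pulled_per_frame, capacity, max_frames=10000):
    """Return the frame on which the queue exceeds capacity, or None if it never does."""
    queue = deque()
    for frame in range(1, max_frames + 1):
        for _ in range(pushed_per_frame):
            queue.append(frame)          # game logic requests updates
        if len(queue) > capacity:
            return frame                 # more pending updates than RAM set aside
        for _ in range(min(pulled_per_frame, len(queue))):
            queue.popleft()              # vblank performs as many as fit
    return None
```

Whenever more updates are requested per frame than vblank can perform, the backlog grows by the difference every frame, so a fixed-size buffer overflows after a predictable number of frames. If the two rates match, it never does.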

Also what tepples says applies if the game can predict that it slows down, and in most cases you can't really predict it. Normally, you'd do it so that most of the time it runs at 60FPS, but if too many objects are active, it may slow down to 30FPS or even lower (20, 15, etc...)

Also it depends on the kind of game. An RPG is much less likely to slow down than a space shooter.

by Celius on 2009-03-21 (#44592)

Well I have a multi-directional scrolling platformer with RPG aspects, so slow-down is really possible. I guess it wouldn't be so bad if it slowed down to 30 FPS, but I wouldn't want this to happen often...

I agree about the stack idea. It would eventually overflow, which would be really bad. It was just an idea though.

The bottom line is every requested PPU/APU update has to happen one way or another. You cannot skip any updates, otherwise really really bad things could happen. So in order to be certain that all of these updates have happened, it seems the game loop cannot be executed again until all of the necessary information has been copied over, thus meaning at the end of a game loop, one must wait for the next NMI to copy the PPU/APU data, which forces the game loop to take an integer number of frames.

Now that I think about it, how would double-buffering be implemented for a game loop? Wouldn't that only apply if the game loop was shorter than a frame, and not longer?