Code organization with bank switching
by bleubleu on 2019-01-12 (#232074)

I'm trying to figure out the best way to organize my bank switching for code (I'm working with an MMC3).

Basically I made one of the (two) switchable 8KB banks for data, and the other one for code. Seems like the best setup.

For code, some big chunks are relatively independent of the rest, so they can easily be moved into their own banks; simply remember to switch before calling and it's all good.

Other parts are more likely to be called from all over the code (ex: a procedure to draw text). Sometimes calls can be made from one code bank to another. It can get pretty annoying to track all the dependencies. But these functions can be big and are not called hundreds of times per frame, so moving them to a bank makes sense to me.

Ideas I tried so far (they work, but none make me 100% happy) :

Keep all the generic code in the fixed bank, more specialized code in swappable banks. Swap banks manually.
Create little stubs for these big functions that are called all over the place; the stubs automatically save/restore the code bank when the function is called. I'm worried about perf with this idea (easily 50-100 cycles of overhead).
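
A minimal sketch of the stub idea from the second option, written in C with the mapper modelled by a plain variable so it can run anywhere. All names here (switch_bank, current_bank, BANK_TEXT, draw_text, draw_text_banked) are hypothetical and only illustrate the save/switch/call/restore pattern; on real hardware switch_bank would write the MMC3 registers and the stub would have to live in the fixed bank.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the mapper: on hardware, switch_bank() would
   write the bank registers; here it only records the selected bank. */
uint8_t current_bank = 0;
unsigned switch_count = 0;

void switch_bank(uint8_t bank)
{
    current_bank = bank;
    ++switch_count;
}

/* Assumed bank number for the text routines (illustrative only). */
enum { BANK_TEXT = 5 };

int text_calls = 0;

/* Stands in for a big routine that lives in a switchable bank. */
void draw_text_banked(const char *s)
{
    /* The routine is only reachable while its own bank is mapped in. */
    assert(current_bank == BANK_TEXT);
    (void)s;
    ++text_calls;
}

/* Stub kept in the fixed bank: remember the caller's bank, map in the
   callee's bank, call the routine, then put the caller's bank back. */
void draw_text(const char *s)
{
    uint8_t saved = current_bank;
    switch_bank(BANK_TEXT);
    draw_text_banked(s);
    switch_bank(saved);
}
```

Callers then use draw_text("HELLO") from any bank without caring where the routine lives. The cost is two extra bank switches per call, which is where the cycle overhead mentioned above comes from.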

How do you guys organize your code in multiple banks?

-Mat

Re: Code organization with bank switching
by DRW on 2019-01-13 (#232086)

This is how I'm doing it in MMC3:

I use one of the switchable banks as a fixed bank. So, after this bank is set once in the beginning, it never gets changed anymore.
This way, I have 24 KB of space that is always available.

Furthermore, I programmed a function that can copy code from a bank into the cartridge RAM.*
This way, I have up to 8 KB of space that is also always available, minus the RAM space that you actually need for conventional variables.

And so, we have 24-32 KB of always available code in contrast to the 16 KB of UNROM, so that we hopefully don't have to worry about bankswitching actual code in the first place and can dedicate the bankswitching to data alone.

If it is still not enough, then I would look for all the functions that are independent from other code and data and put them into a bank. Then I'd use the one bank that is still switchable for these things.

In this case, the functions of course have to be independent not only from other functions, but also from data constants, unless these constants and functions are all located within the same bank.

In my case, even though it wouldn't be necessary yet, I store the code and data of the title screen, game over screen, save data menu etc. into a separate bank.
These things never need anything from the rest of the game, only battery RAM values for savestates, but nothing from code or external ROM data.
So, I can waste eight more KB for text screens that are not counted towards my up to 32 KB of always available stuff. And the text screens are guaranteed to never interfere with the flow of my program since they are only called in one specific location that will never change ever again.

* About putting code from ROM to RAM:
If you use cc65, the config file allows you to declare a segment that is stored in one location, but that gets executed in another location. You just have to program the copy function, but from then on, you can use the functions out of the RAM, even though at compile time, they are stored at a different address in ROM.

bleubleu wrote:

simply remember to switch before calling and it's all good

A little hint: Everything that is in a switchable bank gets a postfix in my code:

ProcessTitleScreen_bTxt

This shows me that the function is stored in the text screens bank, so I have to call SwitchBank(BankTextScreens) before I call it. (Unless the function that calls ProcessTitleScreen also has the postfix _bTxt in which case I know that I'm in this bank.)

Furthermore, any function that switches a bank gets the postfix _sB, so that I know: After I called this function, it's not guaranteed anymore that the current bank is still the same:

Code:

SwitchBank(BankXyz);
DoSomething_bXyz();

// Stored in global bank:
SomethingDifferent();

// No bank switch necessary: We're still in XYZ.
DoSomethingElse_bXyz();

// Stored in global bank, but changes the variable bank to any value:
SomethingCompletelyDifferent_sB();

// Bankswitch necessary again because it was changed.
SwitchBank(BankXyz);
DoSomething_bXyz();

Re: Code organization with bank switching
by bleubleu on 2019-01-13 (#232089)

Thanks for the feedback.

Wow, copying code in RAM and executing it? I've honestly never thought of that. Not sure that's a good fit for the design of my engine, but that's really original! :mrgreen:

About the suffix, I considered that too. I might do that to make it really clear which function might trigger a bank switch or something. I'm still thinking about my final solution.

Thanks!!

-Mat

Re: Code organization with bank switching
by 3gengames on 2019-01-13 (#232099)

Early arcade games would call a routine that was in RAM, which switched banks and then jumped back into the PRG code through a table at the very beginning of the ROM that selected the subroutine they wanted to call. Probably also an okay way of doing it.

Re: Code organization with bank switching
by NovaSquirrel on 2019-01-13 (#232113)

In Nova the Squirrel I used MMC1, so the approach was necessarily different (16KB switchable and 16KB fixed) but things should mostly apply.

Most of my things self-organized intuitively into related groups of code/data that accessed each other, and those fit nicely into 16KB banks. I have one 16KB bank that contains the entire soundtrack and almost the entire music engine (I had to spill a few hundred bytes into the fixed bank), one 16KB bank that contains almost all of the enemy code (with lots of shared code between enemies), one that contains the player code and the player's interaction with level tiles, and so on.

I think it would make a lot of sense to use MMC3's two bank slots as a single 16KB unit for cases like that where it works out. Your code gets to be simpler and you can keep the amount of bank switching down.

It also made sense for me to organize things deliberately by what dependencies they have. My inventory screen and level select both needed the variable width font so they made sense to put into the same bank as it and avoid far calls.

This isn't to say far calls ended up being a big problem, though! I used a lot of them and the performance hit of using them never really mattered. If they aren't a very frequent thing (like changing a block of the level, or displaying some text, or playing a sound effect are all things you aren't going to do many times a frame) then it's negligible.

The fixed bank is good for common routines you don't need to bank switch to in order to access, but also consider putting routines that need to switch to a lot of different code banks in there. The biggest candidate for that sort of thing is your main gameplay loop.

Re: Code organization with bank switching
by tokumaru on 2019-01-13 (#232118)

The exact solution may vary depending on the configurations allowed by the mapper, but I generally try to minimize the amount of bankswitch operations necessary per frame.

In my own projects, I consider up to 30 or so switches per frame a very acceptable amount. So far I've used mostly 32KB bankswitching (BNROM), in which case I try to group the data and the subroutines that make use of that data in the same bank. For example, in a platformer, every bank that contains level maps will also contain subroutines to test for collisions against the maps, and to decode rows and columns of metatiles for rendering purposes.

What drives the number of switches per frame up is the amount of active objects, since each one will have to switch banks at least once in order to collide with the background. Some less common objects might even have their logic tucked away in a different bank if there's no room for all of them in a single bank.

One switch for music playback is also a given, as is one for video updates during vblank, where I normally use a lot of unrolled code.

I'm working on an MMC3 project now, and at first I thought that I'd need to switch banks less often, but since the banks are so small, I have to split the object logic across many banks, so that's still at least one switch per object, and possibly more if the level's data doesn't fit in one 8KB bank.

As for copying code to PRG-RAM in order to reduce the amount of bankswitching you have to do per frame, I think that's a valid approach *IF* you already have that RAM available for other reasons (such as battery saves or very dynamic maps), but I think it's overkill to use PRG-RAM just for that.

Re: Code organization with bank switching
by tepples on 2019-01-13 (#232124)

Rule of thumb: anything with less than 14K of data gets its own bank; anything with more goes in the fixed bank.

MMC3 and other mappers with dual 8 KiB windows let you do a slick trick called "linear memory", which I've used in The Curse of Possum Hollow. Treat the entire first three-fourths of your ROM, such as 384 KiB out of 512 KiB (48 MMC3 banks out of 64), as one memory area with one segment. Then you can encode a pointer into this area as the bank number (address >> 13) and the effective address ((address & 0x1FFF) | 0x8000), and you set window $8000 (MMC3 window 6) to bank B and window $A000 (MMC3 window 7) to bank B+1. For example, if the address of something within linear memory is $013D00, it turns into bank $09 and address $9D00, and when accessing that item, banks $09 and $0A get loaded into windows $8000 and $A000 respectively. Then you don't have to worry about bank boundaries when laying out data in ROM. Nor do you have to worry about them when reading things from ROM unless a single object is bigger than 8K, and even if so, you can occasionally "renormalize" an address while streaming through a big array: once the address reaches $A000 or beyond, subtract $2000 from the address and add 1 to the bank.
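
The pointer arithmetic described above can be sketched in C as follows. This is only an illustration under stated assumptions: the FarPtr type, the function names, and the mmc3_window array (a stand-in for the MMC3 bank registers, with index 6 backing $8000 and index 7 backing $A000) are all hypothetical and not taken from any actual game source.

```c
#include <stdint.h>

/* A decoded linear-memory pointer: an 8 KiB bank number plus a CPU
   address that initially falls inside the $8000-$9FFF window. */
typedef struct {
    uint8_t  bank;
    uint16_t addr;
} FarPtr;

/* Stand-in for the MMC3 bank registers (hypothetical model). */
uint8_t mmc3_window[8];

/* Split a linear ROM address into bank number and effective address. */
FarPtr far_from_linear(uint32_t linear)
{
    FarPtr p;
    p.bank = (uint8_t)(linear >> 13);
    p.addr = (uint16_t)((linear & 0x1FFF) | 0x8000);
    return p;
}

/* Map bank B at $8000 and bank B+1 at $A000, so reads may run past the
   end of the first bank without another switch. */
void far_map(FarPtr p)
{
    mmc3_window[6] = p.bank;
    mmc3_window[7] = (uint8_t)(p.bank + 1);
}

/* While streaming through a large array: once the address has moved
   into the $A000 window, pull it back by $2000 and advance the bank. */
void far_renormalize(FarPtr *p)
{
    while (p->addr >= 0xA000) {
        p->addr = (uint16_t)(p->addr - 0x2000);
        p->bank = (uint8_t)(p->bank + 1);
    }
}
```

After far_renormalize changes the bank, far_map has to be called again before the next read, since the windows still point at the old pair of banks.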

Re: Code organization with bank switching
by DRW on 2019-01-14 (#232133)

bleubleu wrote:

Wow, copying code in RAM and executing it? I've honestly never thought of that. Not sure that's a good fit for the design of my engine, but that's really original! :mrgreen:

Not really that original. It's common enough that cc65 has built-in functionality for this. Also, that's the way the C64 runs its code, since it cannot run it directly from the disk due to the slow speed.

I believe "The Legend of Zelda" does this as well because you see a whole bunch of weird values in the battery file when you simply start the game.

Also, how can this be a good or a bad fit for an engine? It's not like this will influence your general design, so I think it's always neutral. As long as you have a bunch of RAM left, it can always be used, no matter how the rest of your code is structured.

You simply dedicate a certain SEGMENT (doesn't even have to be a whole bank, just one SEGMENT in a MEMORY block) to your RAM:

Code:

MEMORY
{
   WRAM:      type = rw, start = $6000, size = $2000;

   # Here we store the code that shall go into RAM.
   # Of course, this bank can also still be used
   # for other stuff if your RAM code is much less than 8 KB.
   BANK_RAM: type = ro, start = $8000, size = $2000, file = %O, fill = yes;
}

SEGMENTS
{
   # Variables that need to be saved between two game sessions, i.e. savestates.
   BATTERY: load = WRAM, type = bss;

   # Generic additional RAM variables if the regular console RAM isn't enough.
   BSS_EX:  load = WRAM, type = bss;

   # Code executed in RAM.
   # Note the difference between "load =" and "run =".
   RAM_CODE_AND_RODATA: load = BANK_RAM, type = ro, run = WRAM, define = yes;
}

And then you do this in your initialization, for example before the second vblank:

Code:

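   ; Map in the ROM bank that holds the code to copy.
   LDA #BankRam
   JSR SwitchBank

   ; Pointer1 = destination in WRAM (the segment's run address).
   LDA #<__RAM_CODE_AND_RODATA_RUN__
   STA Pointer1 + 0
   LDA #>__RAM_CODE_AND_RODATA_RUN__
   STA Pointer1 + 1

   ; Pointer2 = source in ROM (the segment's load address).
   LDA #<__RAM_CODE_AND_RODATA_LOAD__
   STA Pointer2 + 0
   LDA #>__RAM_CODE_AND_RODATA_LOAD__
   STA Pointer2 + 1

   ; Y stays 0; the pointers themselves are advanced.
   LDY #0

@copyCodeToRamLoop:

   ; Copy one byte from ROM to WRAM.
   LDA (Pointer2), Y
   STA (Pointer1), Y

   ; 16-bit increment of the destination pointer.
   INC Pointer1 + 0
   BNE @pointer1IncrementEnd
   INC Pointer1 + 1

@pointer1IncrementEnd:

   ; 16-bit increment of the source pointer.
   INC Pointer2 + 0
   BNE @pointer2IncrementEnd
   INC Pointer2 + 1

@pointer2IncrementEnd:

   ; Repeat until the source pointer reaches the end of the segment in ROM.
   LDA Pointer2 + 1
   CMP #>(__RAM_CODE_AND_RODATA_LOAD__ + __RAM_CODE_AND_RODATA_SIZE__)
   BNE @copyCodeToRamLoop
   LDA Pointer2 + 0
   CMP #<(__RAM_CODE_AND_RODATA_LOAD__ + __RAM_CODE_AND_RODATA_SIZE__)
   BNE @copyCodeToRamLoop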

Now you have up to 8 KB of additional always available space for anything. Just put the RAM_CODE_AND_RODATA segment declaration over anything that shall go into the RAM. It doesn't even matter which functions you use for this. You can change them on demand without altering anything else.

So, this is not even a concept that you need to plan in your engine design. It's a totally independent thing that always works as long as you have some WRAM left and as long as there's still room in any of your banks.

I'm doing this because in my opinion, it's the best if code is always available.

I would never do the approach where I split the code and its data into small, logical units (like all music and the music engine go into one bank, sprite declarations and sprite handling go into another bank etc.).
Because if you have data that's larger than one bank (for example screen definitions in an action RPG), this means you have to mirror the code into several banks.

Instead, my goal is to put the whole code into the global bank and to only use banks for data. (Maybe with the exception of completely separated code that will never ever interfere with anything else, like the title screen.)

That's why I welcome ways where you can maximize the always available code. So, I would never use 16 KB of switchable banks if I can cut it down to 8 KB while leaving the other 8 KB constant. In fact, if MMC3 allowed for 4 KB banks, I would leave three of them constant and only use one for switchable stuff.

In my opinion, having code always available is better than grouping code along with its corresponding data into banks. This way, less conscious planning is needed.
If you find out that one of your functions needs stuff from two different banks where it only needed stuff from one bank before, you don't have to reorganize everything. The function can still switch the banks on demand, without the fear that the bank switch will pull the function itself away from the execution.

Re: Code organization with bank switching
by Fiskbit on 2019-01-14 (#232135)

On the topic of code in RAM, Zelda does this because most ($C000-$E3FF) of the fixed bank is dedicated to PCM data, so it copies about $1251 bytes of code and data to PRG-RAM to effectively extend the fixed bank. It expects the same code and data to be there at all times, and it doesn't do any tricks with it like self-modifying code (though PRG-RAM can be really handy for that if you want to make something pretty fast for slightly different use cases without a large space footprint).

Re: Code organization with bank switching
by Bregalad on 2019-01-14 (#232136)

It could also be a holdover from the original code, which ran in the FDS RAM adapter. Either they used self-modifying code, or they effectively had a "fixed bank" (i.e. data that is loaded once from the disk at boot and never edited ever again) larger than 16k.

Re: Code organization with bank switching
by tepples on 2019-01-14 (#232156)

DRW wrote:

Also, how can this be a good or a bad fit for an engine? It's not like this will influence your general design, so I think it's always neutral. As long as you have a bunch of RAM left, it can always be used, no matter how the rest of your code is structured.

The compromises to the engine's "general design" to ensure that "you have a bunch of RAM left" might not necessarily be a good fit, particularly if you're trying to ship your game on a cartridge board without extra RAM (such as the NESdev Compo board). Because Haunted: Halloween '85 and the Action 53 menu run on 32K bankswitching boards, they copy about 256 bytes of code for interbank data fetching and decompression to RAM, for instance. A53 doesn't do much, but HH85 ends up with fairly static terrain geometry because its actor-to-terrain collision detection is largely ROM-based.
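
To illustrate why a small interbank fetch routine is worth keeping in RAM on a 32K bankswitching board, here is a rough C model. It is a sketch under assumptions, not the code of either project named above: the prg array, BANK_COUNT, switch_bank, cpu_read and far_read are hypothetical names, and the whole mapper is simulated with an array. On hardware the body of far_read is the part that would have to execute from RAM, because switching a 32K bank replaces all of the ROM the CPU can see, including the caller.

```c
#include <assert.h>
#include <stdint.h>

enum { BANK_COUNT = 4, BANK_SIZE = 0x8000 };

/* Simulated 32 KiB PRG banks; on hardware only one of them is visible
   at $8000-$FFFF at any time. */
uint8_t prg[BANK_COUNT][BANK_SIZE];
uint8_t current_bank = 0;

void switch_bank(uint8_t bank)
{
    assert(bank < BANK_COUNT);
    current_bank = bank;
}

/* Reads through whichever bank is currently mapped in. */
uint8_t cpu_read(uint16_t addr)
{
    assert(addr >= 0x8000);
    return prg[current_bank][addr - 0x8000];
}

/* Fetch one byte from another bank and return to the caller's bank. */
uint8_t far_read(uint8_t bank, uint16_t addr)
{
    uint8_t saved = current_bank;
    uint8_t value;

    switch_bank(bank);
    value = cpu_read(addr);
    switch_bank(saved);
    return value;
}
```

A decompressor built on top of such a routine can stream data out of any bank while the code that called it stays in a different one.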

DRW wrote:

I would never do the approach where I split the code and its data into small, logical units (like all music and the music engine go into one bank, sprite declarations and sprite handling go into another bank etc.).
Because if you have data that's larger than one bank (for example screen definitions in an action RPG), this means you have to mirror the code into several banks.

Hence the 14K rule of thumb. For something with as many rooms as Lizard, provided they aren't packed as tightly as the rooms in Super Mario Bros. or Cat Quest, I'd know from the start that there'd be far more than 14K and would plan to put their decoder in the fixed bank.

Fiskbit wrote:

Zelda does this because most ($C000-$E3FF) of the fixed bank is dedicated to PCM data

Much of which, if I remember correctly, consists of samples of the original sound effects that used the FDS chipset's wavetable channel.

Re: Code organization with bank switching
by DRW on 2019-01-14 (#232157)

tepples wrote:

Yeah, o.k., I was talking under the condition that you actually need the additional RAM in the first place, for example if you have a battery anyway or if you need a huge amount of additional RAM for world maps like in "Super Mario Bros. 3". Then you can just as well use the rest of the RAM for code to avoid some bankswitching.

tepples wrote:

Hence the 14K rule of thumb.

If it works for you, then that's good. However, there's always the danger that one function might also need something from a different bank, not just its own bank.
Or that data might grow too much after you initially decided that one bank is enough.
Or that your code grows so much that the data itself would still fit nicely, but code + data exceeds the bank size.

To avoid this kind of restructuring and to avoid the issue of always keeping exact track which function uses which data, I strive to make sure that all of my code is always available.

Re: Code organization with bank switching
by Banshaku on 2019-01-14 (#232205)

I think it is hard to name a "good" code organization for bank switching since it will depend on the needs of your project. Since (I guess) it's your first MMC3 one, the needs will change along the way, since you are still not sure how to manage your assets, or maybe even the scope of the project is not clear yet. This is what happened in my case, and I had to do some refactoring to re-adjust the code and banking based on the current requirements, which are always changing.

You will hit a few walls along the way on the first project, but later ones will be easier once you get the hang of how to manage the data in your own style. One thing you will surely need to bank in on every frame is the music driver and song data, once you start running out of space in the currently selected banks during prototyping.