Getting HiROM/map no 21 image to work correctly?

by benjaminsantiago on 2014-07-28 (#131687)

Hey so I have been trying to convert this SNES init tutorial to HiROM

http://wiki.superfamicom.org/snes/show/ ... ES+Program

and have been facing a few issues.

So initially I did the first part of what bazz specifies here:

http://www.cs.umb.edu/~bazz/snes/wladx/loromtohirom/

and got the ROM to work. However, the other day I was trying to transfer the ROM to a chip/PCB and opened up SNESRomUtil to SwapBin the ROM and got an error. The ROM will show up as a HiROM game, as a "good" ROM in every emulator I have (ZSNES, NO$SNS, SNES9X, BSNES).

When I change all the bank definitions to .ORG $0000, SNESRomUtil understands I have a HiROM game; however, the ROM will not work (black screen) and shows up as a "bad ROM" in the emulators mentioned above.

I have tried what bazz recommends (using the .BASE $40 directive), but I'm not really getting anything with that either. Do I have to add a colon to every subroutine call? Do I have to add a .BASE $40 to every label?

When I look at the ROM in a hex editor things seem to check out (the header is at $FFC0, all the interrupt vectors look okay, the byte that designates the memory map is $21). Despite that, it seems like the code is never getting to the reset vector. I'm not sure of the best way to "step through" the code, but when I reset it does not go to SEI, the first opcode at the reset vector. Is there a hopefully dumb explanation as to why this might happen? $FFC0 is still in the same "bank" (as far as the 65816 64K bank definition is concerned). Do I have to do something to make it get to the reset vector?

(sorry if I am saying anything that is obviously incorrect, I have been reading a lot of 65816 info where a "bank" is defined as 64K bytes and all the SNES stuff indicates a bank as 32K bytes).

Re: Getting HiROM/map no 21 image to work correctly?
by lidnariq on 2014-07-28 (#131688)

Well, using NO$SNS's debugger, where does it start execution? Do you see those bytes near the header?
AIUI, the 65816 is "supposed" to start execution from the address pointed to at [$00FFFC] in 65C02 compatibility mode.

Re: Getting HiROM/map no 21 image to work correctly?
by koitsu on 2014-07-28 (#131689)

lidnariq wrote:

Correct.

This brings into question how exactly your ROM file is being laid out (as in what sections of the file contain what). For all we know the emulator is attempting to go to the reset vector properly, but the reset vector area is all filled with zeros or $FFs (due to layout mistakes). It really helps if you're using an assembler that can actually generate a listing which also includes what banks and areas of the ROM it thinks your code should be physically mapped to. Some do not output this, which is very disappointing and requires a lot of trial and error to learn what works and what doesn't.

Additionally, you need a SNES emulator that offers a debugger that can start/launch immediately at power-on, and display what the addresses of all the vectors are. Geiger's SNES9x Debugger version offers this through the button labelled "Vector Info". Example using Actraiser 2 (HiROM / mode 21 / 16mbit) (don't ask me what the hell "8-bit" vs. "16-bit" vectors are, I have no clue, the 16-bit ones are what's correct).

Code:

Vectors:
   8 Bit      16 Bit
ABT   $00:FFFF   $00:FFFF
BRK   $00:FFFF   $00:8013
COP   $00:FFFF   $00:800F
IRQ   $00:FFFF   $00:800B
NMI   $00:FFFF   $00:8007
RES  $00:8000

APU vectors:
FFC0 0000 0000 0000 0000 0000 0000 0000 0000 
FFD0 0000 0000 0000 0000 0000 0000 0000 0000 
FFE0 00FF 00FF 00FF 00FF 00FF 00FF 00FF 00FF 
FFF0 00FF 00FF 00FF 00FF 00FF 00FF 00FF 00FF 

The vectors in question are found within memory offsets $C0FFE0 to $C0FFEF, as you'd expect. See a mode 21 memory map for how that memory region actually "maps" into bank $00 so that they work.

You'll find nothing but pain and suffering in the SNES emulator debugger world. Just warning you up front. :-)

All the assembler pseudo-directives (I'm not familiar with bazz though) like .base and .org are to help the assembler know how to generate addresses, but do not necessarily correlate with what you need to do to ensure that the actual ROM file is laid out properly so that it correlates with the proper memory map.

P.S. -- Don't forget that one of the very first things you need to do in your reset and NMI vectors in mode 21 is to do a long jmp (i.e. a full 24-bit jump) to the next opcode/line. This is documented in the official SNES developers manual as well, and has to do with what bank the 65816 thinks it's operating in on start-up and within NMI. I can point you to threads about this if you want, but the thread I'd reference is with regards to a different subject.

Re: Getting HiROM/map no 21 image to work correctly?
by benjaminsantiago on 2014-07-28 (#131690)

lidnariq wrote:

I'm a be real, I don't know how to do this in NO$SNS. In SNES9X (the debugging one), it just says it is at $00:0000, but I'm not sure where in the cart that is supposed to be since almost everything in there is empty.

koitsu wrote:

P.S. -- Don't forget that one of the very first things you need to do in your reset and NMI vectors in mode 21 is to do a long jmp (i.e. a full 24-bit jump) to the next opcode/line. This is documented in the official SNES developers manual as well, and has to do with what bank the 65816 thinks it's operating in on start-up and within NMI. I can point you to threads about this if you want, but the thread I'd reference is with regards to a different subject.

Do you know what section this is in, in the SNES Dev manual? I was looking through it before and saw the section where it gives the memory maps for various carts.

Re: Getting HiROM/map no 21 image to work correctly?
by benjaminsantiago on 2014-07-28 (#131691)

koitsu wrote:

All the assembler pseudo-directives (I'm not familiar with bazz though) like .base and .org are to help the assembler know how to

just for clarity, the assembler I am using is wla, bazz is a dude who made tutorials (sometimes on here), and bass is another assembler by byuu (also sometimes on here).

Re: Getting HiROM/map no 21 image to work correctly?
by lidnariq on 2014-07-28 (#131693)

benjaminsantiago wrote:

In SNES9X (the debugging one), it just says it is at $00:0000, but I'm not sure where in the cart that is supposed to be since almost everything in there is empty.

That sure looks like some kind of misalignment... Anyway, what does (debugger of choice) show is at $00FFFC ?

Re: Getting HiROM/map no 21 image to work correctly?
by koitsu on 2014-07-28 (#131705)

If SNES9x says the vectors are values $00:0000, then to me that means your ROM format is "incorrect" -- meaning your actual vector values are being placed in the ROM at file offsets that aren't correct for what the emulator is expecting. This is hard to explain, but: basically your assembler is configured in such a way to be generating a ROM image that does not truly mimic the operating mode (mode 21) you wish to use. As such, the emulator ends up loading the ROM however it does, and within the ROM there is probably a sequence of empty banks (read: all zeros) and one of those is being chosen as bank 0, which includes the vector locations. If you replaced the all-zeros with all-$FF then you'd end up seeing a vector location of $00:FFFF. You get the idea.
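A quick way to see what the emulator sees is to read the reset vector straight out of the file. The following is only a minimal sketch: it assumes a headerless image (no 512-byte copier header), and the function names are made up for illustration.

```python
import struct

RESET_VECTOR_ADDR = 0xFFFC  # emulation-mode reset vector at CPU address $00:FFFC


def read_reset_vector(image: bytes, hirom: bool = True) -> int:
    """Return the 16-bit little-endian reset vector from a headerless image."""
    # In a HiROM image CPU address $00:FFFC is file offset $FFFC;
    # in a LoROM image the same CPU address is file offset $7FFC.
    offset = RESET_VECTOR_ADDR if hirom else RESET_VECTOR_ADDR - 0x8000
    (vector,) = struct.unpack_from("<H", image, offset)
    return vector


def vector_looks_sane(vector: int) -> bool:
    """ROM is only visible at $8000-$FFFF in bank $00, so reset must point there."""
    return 0x8000 <= vector <= 0xFFFF
```

An image whose first 64 KB is all zeros gives a vector of $0000 here, which matches the $00:0000 that the debugger reported.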

I can assure you that what SNES9x debugger for "Vector Info" tells you is correct though, for those vector locations I mean. Lots of commercial mode 21 hirom games do the Right Thing(tm), so this is purely an issue with the file your assembler has generated.

I don't have any familiarity with WLA though, so I can't help there. :/ I would not be surprised if the issue boiled down to how and where to use .org and .base. There is probably a relationship between those pseudo-ops and how the assembler generates a ROM image.

So TL;DR -- this is purely a "how do I get WLA to make a proper mode 21 hirom ROM image I can use?"

Note, do not use any kind of "conversion tools" to try and "convert from LoROM to HiROM". These things, in my experience, are horribly broken and don't take into account all the nuances involved (changing addresses the underlying 65816 code has to load from, etc.) -- many of these tools think "all you have to do is fix the bits in the SMC header and maybe the cart header and you're done!" Wrong.

As for where in the SNES Developers Manual it tells you that you need to do a long jmp: it depends on what revision of the manual you have. But it's in the section titled "Programming Cautions" (Chapter 24 for me, but may vary for you). For me it's labelled as Caution #14 and Caution #15. There are alternate ways to accomplish what they show you there, but the long jmp is the easiest. (Don't forget that you might also need to change the B register depending on what bank you want to read data from, if using 16-bit addresses in your code!).

Re: Getting HiROM/map no 21 image to work correctly?
by benjaminsantiago on 2014-07-29 (#131718)

Thanks koitsu!

So the reset vector has to be somewhere in $8000-$FFFF because the bank (can't remember if it's supposed to be the program or data bank) is initially zero right? I assume that has something to do with the jump you (koitsu) were talking about. If that's the case and you can use the $0000-$7FFF area for any non-reset vector stuff that makes sense and I guess problem solved.

I had things ORG'ed to $0000, but seems like it was being confused with the WRAM. My header looked okay in a hex editor but was just zeros in SNES9X debugger.

It looks like the aforementioned issue with the ROMs not working in SNESRomUtil was simply because I used $00 to represent empty space instead of $FF. I looked at N-Warp (the dforce3000 game) and it seemed like everything in the headers was the same except that. Sorry if this has been documented somewhere else, or if it is a problem related to the particular version I have (I grabbed 2.1, the one from Romhacking.net).

Re: Getting HiROM/map no 21 image to work correctly?
by tepples on 2014-07-29 (#131721)

Yes. Just as reset has to be within a fixed bank (usually $C000-$FFFF or $E000-$FFFF) on the NES, reset has to be within $008000-$00FFFF on Super NES HiROM.

Re: Getting HiROM/map no 21 image to work correctly?
by koitsu on 2014-07-29 (#131729)

Yes, what you've concluded and what Tepples said are both correct. I guess I'll start with some of the basics, leading up to the ROM layout/format ordeal (which should hopefully make sense by the time I get to that paragraph).

When the 65816 starts up, it's operating in emulation mode (which is why clc/xce is needed first thing). This also means K -- the name of the register associated with the bank where PC is executing -- is $00. It has to be -- it's emulating a 65c02 which only has 16 bits of addressing space (no concept of banks). So everything operates out of $00.

The one exception to this is the B register (where things like lda $1234 load from), which if I remember right, is "unknown" on power-on or reset, meaning it can be any value. It's up to the programmer to initialise that, commonly people doing things like phk/plb (to set B to the same value as K). As such, you'll often see reset vector code that looks like: sei/clc/xce/jml {next instruction}/phk/plb. Now you know why.

The use of a long jmp solves a couple things. The Cautions I mentioned explain the reasoning, just not very verbosely. The first caution has to do with what I already said -- on start-up K is $00. Now is when we have to talk about mode 21's memory map -- sorry to suddenly switch topics but I'll circle back:

Now look at a mode 21 memory map, specifically the one in the developers manual (for me it's figure 2-21-3). You'll see the following layout:

Banks $00 to $7D, address range $8000-FFFF = mirrors "Program ROM Area (1)"
Banks $80 to $BF, address range $8000-FFFF = mirrors "Program ROM Area (1)"
Banks $C0 to $FF, address range $0000-FFFF = Program ROM Area 1

You're going to ask how exactly the mirroring works for banks $00-7D and $80-BF, because there's only 32KBytes of mirroring going on (i.e. the $0000-7FFF range isn't mirrored). "So how does that work?"

The explanation is given in Note #1 at the bottom of the diagram: banks $00-3F address range $8000-FFFF are mirrors of banks $C0-FF address range $8000-FFFF. Don't ask me what's in banks $40-7D, but my gut feeling is that it's just the same thing over and over (missing a couple banks at the end of course), i.e. still the upper 32KBytes. It doesn't matter -- do not care, do not focus on that.
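The mirroring described above can be written down as a small address-to-file-offset function. This is a sketch under stated assumptions: a headerless image, no handling of images smaller than the address space, and banks $40-$7D deliberately left out since they are not pinned down here. The function name is hypothetical.

```python
from typing import Optional


def hirom_file_offset(address: int) -> Optional[int]:
    """Map a 24-bit CPU address to a mode 21 file offset.

    Returns None for anything this sketch does not treat as ROM.
    """
    bank = (address >> 16) & 0xFF
    offset = address & 0xFFFF
    if bank >= 0xC0:
        # Banks $C0-$FF: the full linear 64 KB of each ROM bank.
        return ((bank & 0x3F) << 16) | offset
    if (bank <= 0x3F or 0x80 <= bank <= 0xBF) and offset >= 0x8000:
        # Banks $00-$3F and $80-$BF: only $8000-$FFFF mirrors ROM.
        return ((bank & 0x3F) << 16) | offset
    return None
```

With this mapping, $00FFFC and $C0FFFC both land on file offset $FFFC, while $0058FF is not ROM at all.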

Right about now something should be going off in your head: "ohh! OHHHH! *NOW* I see how $C08000 to $C0FFFF play a role in regards to bank $008000 to $00FFFF, which includes all the vectors!"

And now you ALSO see the danger in continuing to operate code out of bank $00 while in mode 21 -- only the upper 32KBytes are mirrored. So what happens when your code (built to use banks $C0-FF, thus "a linear 64KBytes") tries to do something like lda $58FF? Yeah, you guessed it -- unless it's either a) using 24-bit addressing (ex. lda $C058FF), or b) explicitly setting the B register in advance, you're not going to get the data within your ROM like you expect. :-)

So that should answer the first Caution point and what it's about.

The 2nd Caution point has to do with high-speed mode (3.58MHz). There's a memory map layout figure that outlines this (in my manual it's Figure 2-21-1). The docs are confusing since now you have a "memory speed" map in addition to the mode 20 and mode 21 maps, I know. To be honest I've never used high-speed mode (the old SNES intro/demo/trainer I wrote claimed to but I didn't know what the fuck I was doing, I really thought it was just as simple as setting bit 0 to 1 in $420d and it'd "magically be faster" but I was young and we didn't have this kind of thing on the Apple IIGS). My understanding is that banks $80-BF can operate "faster" than other banks, when bit 0 of $420d is set to 1. But if you notice, banks $80-BF don't have the linear 64KByte arrangement like $C0-FF do, so... yeah. So anyway, what the Caution point is telling you is that you need to do a long jmp to make sure K=$80 so that you can benefit from high-speed mode. (I should note "high-speed mode" just means that the speed (I think?) at which ROM is accessed is faster -- this is a separate bus and doesn't have any correlation with how many CPU cycles it takes to execute an instruction, those remain the same as always. I don't understand the hardware part of the thing so I just don't care much about this).

So now that I've covered all that, does the reason the ROM layout matters, re: vectors, start to make more sense? :-)

In general I tell people this: if you're going to try converting a LoROM thing into HiROM, you need to understand the memory maps correctly. And this is purely my opinion, but if you're going to use mode 21, just use banks $C0-FF to your heart's content -- now you have a full linear 64KByte bank to work with, and everything "just works" -- with one major caveat: what about registers? If you're in bank $C0, what is sta $2100 going to do? (Or better yet, what about loading from memory-mapped registers?) It's a great question, and I believe the answer is that it'll do exactly what you think: it'll be accessing ROM (the STA won't work, duh, can't write to ROM), not the registers.
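To make the register caveat concrete, here is a rough sketch of where a 16-bit absolute operand ends up. It assumes the data bank register B supplies the bank byte for such operands, lumps everything between $2000 and $7FFF together, and does not cover banks $40-$7F. The function name and category labels are invented for illustration.

```python
def absolute_target(data_bank: int, operand: int) -> str:
    """Describe what a 16-bit absolute access hits for a given B register value."""
    address = ((data_bank & 0xFF) << 16) | (operand & 0xFFFF)
    bank, offset = address >> 16, address & 0xFFFF
    if bank >= 0xC0:
        kind = "ROM"  # linear 64 KB ROM banks
    elif bank <= 0x3F or 0x80 <= bank <= 0xBF:
        if offset < 0x2000:
            kind = "WRAM mirror"
        elif offset < 0x8000:
            kind = "registers or other non-ROM"
        else:
            kind = "ROM"
    else:
        kind = "not covered"  # banks $40-$7F are outside this sketch
    return f"${address:06X}: {kind}"
```

So sta $2100 with B = $C0 targets $C02100 (ROM), while the same instruction with B = $00 targets $002100 (the register area).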

I remember many Square and Enix games doing something I thought was absolutely ingenious the first time I saw it: rep #$20 / lda #$2100 / tcd, then proceeding to access the memory-mapped registers ($2100 up) as direct page, e.g. sta $00 would actually write to register $2100 and so on. Clever, not to mention it takes fewer CPU cycles to access, since direct page accesses are faster.
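The direct page trick boils down to simple arithmetic. A minimal sketch, assuming native mode (where the sum wraps at 64 KB and always stays in bank $00); the helper names are hypothetical.

```python
def direct_page_address(d: int, offset: int) -> int:
    """Effective bank $00 address of a direct page access: D plus the 8-bit offset."""
    return (d + (offset & 0xFF)) & 0xFFFF


def direct_page_penalty(d: int) -> int:
    """Extra cycle paid on direct page accesses when the low byte of D is nonzero."""
    return 1 if d & 0xFF else 0
```

With D = $2100, sta $00 reaches $2100 and sta $15 reaches $2115, and since the low byte of D is zero there is no extra cycle.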

But I've also seen games using 24-bit accesses for registers, ex. sta $002100 and the like.

Or, alternately, you can use banks $80-BF, assuming your code and ROM is laid out correctly to comprehend the fact that you only have address ranges $8000-FFFF to work in. But then we're back to the "32KByte bank" concept (for lack of better term), which doesn't really impact you directly per se as long as you remember that that's how the memory layout works. Me personally? I come from an Apple IIGS background so I'm used to the entire address range being linear (e.g. 64KBytes) and that mapping directly to things, none of this "32KByte mirroring" half-ass crap. So if I were doing a HiROM thing, I'd just use banks $C0-FF and design everything with that in mind. How I'd do register accesses would probably be through similar means as what Square/Enix did, or do something like lda #$00 / pha / plb followed by register accesses, or use 24-bit addresses exclusively (which are slower). Really hard decision though.

I think the SNES memory map / modes are pretty awful, to be honest. I posted something somewhere semi-recently about how **I** would have designed the layout from the beginning, where you'd have a lot more ROM space, and it'd always be linear, and the overall layout becomes a lot simpler. I can't find that post though. Well whatever.

Hope this kinda covers numerous bases for you and helps shed some light on it all.

Re: Getting HiROM/map no 21 image to work correctly?
by benjaminsantiago on 2014-07-29 (#131733)

koitsu wrote:

It has to be -- it's emulating a 65c02 which only has 16 bits of addressing space (no concept of banks). So everything operates out of $00.

ohhhhh yeah! I didn't try to think about it that way. I know that you have to get into native mode on start up but I didn't think about WHY.

koitsu wrote:

Now look at a mode 21 memory map, specifically the one in the developers manual (for me it's figure 2-21-3).

forgot to ask, how many versions of the SNES Dev Manual are there? I've only found one on RHDN. There are other ones floating around?

koitsu wrote:

But I've also seen games using 24-bit accesses for registers, ex. sta $002100 and the like.

Are you just disassembling games, or is there code floating around somewhere? Also, are the demos you mention up anywhere?

koitsu wrote:

How I'd do register accesses would probably be through similar means as what Square/Enix did, or do something like lda #$00 / pha / plb followed by register accesses, or use 24-bit addresses exclusively (which are slower).

I imagine most of the time with RPGs the slower access was not a big deal. I think I read another post on here that suggested one of the advantages of HiROM was for large scripts and stuff. The only reason I actually needed a HiROM version of a ROM was because those were the only PCBs I had around, not really a programming choice.

In your suggested model how would you handle accessing stuff like VRAM/CGRAM etc? Just a different (64Kbyte) bank? I haven't gotten to this level with other computers/game consoles that are this low level, so I can't compare, but I do like being able to access the SNES registers wherever. I don't find it that difficult, but I am at about the edge of being able to make a full game and I'm going to start being able to push past the 2Mbit boundary.

Anyway thanks for the info! I read through the map 20/21 diagrams again and the programming cautions you mentioned last night (I actually fell asleep without sending my last post), so things are starting to make more sense and your explanation of other stuff like the access time is helpful.

Re: Getting HiROM/map no 21 image to work correctly?
by koitsu on 2014-07-29 (#131737)

1. Re: dev manuals: there were many revisions made over the years. I'd rather not discuss this matter publicly for a lot of reasons (which I also won't go into here). But the one you say you have is the last revision I'm aware of.

2. re: disassembling games vs. code floating around: it's based on disassembling games / doing romhacking, but there's nothing "special" about someone deciding to do sep #$20 / lda #$00 / pha / plb / lda #$f8 / sta $2100 vs. sep #$20 / lda #$f8 / sta $002100. Code is code is code. :-)

3. re: demos: the code I was referring to was in my "infinity" demo, which comes with my old SNES documentation. Within the .zip there is a file called test.zip which contains some example code. It's not made for WLA; it's intended to be assembled with TRASM (that's not a typo), which was later superseded by x816 (done by the same author: Norman Yen / minus), so it probably can be slightly modified to work with x816. Be aware that any public scrutiny of this is probably legit, but ultimately I really don't care -- every single person I've witnessed over the years "judging" said work are people who weren't in the snesdev "scene" during the mid-to-late 90s, which is when all of the reverse-engineering and hacking were going on. So it's all after-the-fact whining. (For those people: I'm sorry we didn't have time travel capability in the fucking 90s! :P) That demo is also LoROM but I'm sure I could reassemble it (using x816) with changes to make it HiROM/mode 21. If you want me to, let me know, but I can't give a time frame on when I can have this done. I have a lot going on this week.

4. re: access times: the "type" of game has no bearing on this. Chrono Trigger, for example, is an RPG, right? It's also one of the games which uses all sorts of tweaky effects on the SNES and does a lot of "clever stuff" because of all the visual effects and stuff going on during a lot of the scenes. As I remember it, it's a very "timing-sensitive" game. So don't ever get in the habit of thinking a particular game genre correlates with slower code per se. The genre rarely has anything to do with it. By 3rd-gen SNES/SFC games, most of the major companies had their own libraries or entire suites of code that they could use to build a game. I've heard of some games which had parts of them done in C.

5. re: suggested model: accessing VRAM is always the same: you write to it using DMA. If you meant just general memory-mapped register accesses, my recommended method is this: for a few one-off accesses or writes, just use 24-bit addressing (ex. sta $002100). But if you know you're going to be doing a lot of memory-mapped register accessing (more likely within NMI), then I suggest using sep #$20 / lda #$00 / pha / plb / sta $2100 followed by your large batch of register accesses. Most NMI routines I've seen tend to do that. You could probably also shove a phb/plb in there so that you restore B near the end of the entire routine too (make sure accumulator size is the same as when you pushed it, though, else you'll have a stack overflow/underflow eventually). If timing is really critical (e.g. you want to save bytes and cycles), then use the rep #$20 / lda #$2100 / tcd method I described earlier (to move the direct page base to $2100). Honestly the first two methods I describe above are the easiest to use in combination and make the most sense / don't cause a lot of confusion.

If you're just starting out and really want to use mode 21 with a linear memory map, then I suggest just using full 24-bit addresses for your memory-mapped register accesses. Yes it's going to be slower + waste a byte per statement, but you can optimise all that out later. :-)

Re: Getting HiROM/map no 21 image to work correctly?
by tepples on 2014-07-29 (#131741)

Good explanation.

For the most part, I'm inclined to prefer the method of keeping all PPU-related code and read-only data in the second half of each 64K bank. One problem with the Square Enix method (redirecting the direct page to $2100) is that now your indirect addressing has to go through the stack (the d,s and (d,s),y modes) instead of a direct page in RAM. But with DMA, there might not be quite as much indirect addressing going on in an NMI handler as there was on the NES. And I guess you could put the direct page at $1F40 to map the PPU at $C0-FF. This puts both RAM and the PPU in the same direct page, almost like it was on the Atari 2600, but now you must eat the extra CPU cycle.

If you do go the 24-bit route, you might want to define the registers to sit at $212100-$21213F as a cheap way to ensure that the assembler doesn't go and "optimize" accesses into 16-bit addresses.