After a lot of work and general examination (bsnes-plus's trace logger was a godsend), I've managed to figure out why the initial text and Censor logo appear corrupted on some emulators (e.g. SNES9x) and on SD2SNES. I spent a lot of time comparing VRAM dumps between SNES9x and bsnes-plus (the differences were huge, but some parts matched up). This was confusing at first, and I put about 12 hours into RE'ing it before going to bed and having a minor epiphany while falling asleep: "Were VRAM updates being done inside VBlank? I don't remember seeing them wait for VBlank. Check that tomorrow".
Sure enough, the initial VRAM population was being done regardless of VBlank. This includes all BG1 and BG2 CHR data (BG1 = Censor logo, BG2 = ASCII text overlay), as well as BG1/BG2 screen/layout data. Purely because of timing/luck, a bunch of the VRAM update routines were running partially inside VBlank and partially outside. Forced blanking would fix this problem, especially so early on -- and the demo does set $2100 gradually (from $00 to $0F, for a fade-in) on its own, but none of those values has the forced-blank bit set. Setting $2100 = $8F is required per official Nintendo documentation; surely they didn't omit this?
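For anyone not familiar with $2100 (INIDISP): bit 7 turns forced blanking on, and the low four bits are the master brightness. Here is a minimal Python sketch of that split, just to show why a $00-to-$0F fade never enables forced blanking while $8F does. The helper name decode_inidisp is my own invention, not something from bsnes-plus or SNES9x.

```python
def decode_inidisp(value):
    """Split a byte written to $2100 (INIDISP) into its two fields."""
    forced_blank = bool(value & 0x80)  # bit 7: 1 = forced blanking enabled
    brightness = value & 0x0F          # bits 0-3: master brightness, 0..15
    return forced_blank, brightness


if __name__ == "__main__":
    # $00 and $0F are the ends of the demo's fade-in; $80 and $8F force blanking.
    for v in (0x00, 0x0F, 0x80, 0x8F):
        fb, br = decode_inidisp(v)
        print(f"${v:02X}: forced_blank={fb} brightness={br}")
```

Every value in the fade range leaves bit 7 clear, so the PPU is actively rendering the whole time the fade runs.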
This is a mode 20 demo. The reset vector is $8000. The code at $008000 is nothing more than jml $168000. The code at $168000 contains what looks *almost* like the per-official-documentation initialisation code. Here it is:
Code:
008000 jml $168000 [168000] A:0000 X:0000 Y:0000 S:01ff D:0000 DB:00 nv1BdIzc V: 0 H: 46 F: 0
168000 clc A:0000 X:0000 Y:0000 S:01ff D:0000 DB:00 nv1BdIzc V: 0 H: 54 F: 0
168001 xce A:0000 X:0000 Y:0000 S:01ff D:0000 DB:00 nv1BdIzc V: 0 H: 58 F: 0
168002 sei A:0000 X:0000 Y:0000 S:01ff D:0000 DB:00 nvMXdIzC V: 0 H: 61 F: 0
168003 phk A:0000 X:0000 Y:0000 S:01ff D:0000 DB:00 nvMXdIzC V: 0 H: 65 F: 0
168004 plb A:0000 X:0000 Y:0000 S:01fe D:0000 DB:00 nvMXdIzC V: 0 H: 70 F: 0
168005 rep #$20 A:0000 X:0000 Y:0000 S:01ff D:0000 DB:16 nvMXdIzC V: 0 H: 77 F: 0
168007 lda #$01fd A:0000 X:0000 Y:0000 S:01ff D:0000 DB:16 nvmXdIzC V: 0 H: 83 F: 0
16800a tcs A:01fd X:0000 Y:0000 S:01ff D:0000 DB:16 nvmXdIzC V: 0 H: 89 F: 0
16800b lda #$0000 A:01fd X:0000 Y:0000 S:01fd D:0000 DB:16 nvmXdIzC V: 0 H: 92 F: 0
16800e tcd A:0000 X:0000 Y:0000 S:01fd D:0000 DB:16 nvmXdIZC V: 0 H: 98 F: 0
16800f sep #$30 A:0000 X:0000 Y:0000 S:01fd D:0000 DB:16 nvmXdIZC V: 0 H:102 F: 0
168011 stz $2101 [162101] A:0000 X:0000 Y:0000 S:01fd D:0000 DB:16 nvMXdIZC V: 0 H:107 F: 0
168014 stz $2102 [162102] A:0000 X:0000 Y:0000 S:01fd D:0000 DB:16 nvMXdIZC V: 0 H:115 F: 0
168017 stz $2103 [162103] A:0000 X:0000 Y:0000 S:01fd D:0000 DB:16 nvMXdIZC V: 0 H:122 F: 0
16801a stz $2104 [162104] A:0000 X:0000 Y:0000 S:01fd D:0000 DB:16 nvMXdIZC V: 0 H:130 F: 0
16801d stz $2104 [162104] A:0000 X:0000 Y:0000 S:01fd D:0000 DB:16 nvMXdIZC V: 0 H:147 F: 0
168020 stz $2105 [162105] A:0000 X:0000 Y:0000 S:01fd D:0000 DB:16 nvMXdIZC V: 0 H:155 F: 0
...
$2100 isn't being set until later on. If I set a write breakpoint on $2100, this is the result:
Code:
Loaded Sidmania.
Breakpoint 0 hit (1).
16825d sta $2100 [162100] A:3c01 X:0000 Y:0000 S:01fc D:0000 DB:16 nvMXdIzC V:231 H:310 F:15
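If you'd rather grep a full trace log than set breakpoints, something like the following Python sketch can pull out every sta $2100 and say where on the frame it landed. This is only a rough sketch: the regex is fitted to the trace lines pasted in this post, it only looks at sta (not stz), it assumes the accumulator is 8-bit at the time of the write (the uppercase M in the flags), and it assumes a 224-line display so that VBlank starts at V=225. The names TRACE_RE and inidisp_writes are made up for illustration.

```python
import re

# Fitted to lines such as:
# 16825d sta $2100 [162100] A:3c01 X:0000 ... nvMXdIzC V:231 H:310 F:15
TRACE_RE = re.compile(
    r"^(?P<pc>[0-9a-f]{6})\s+(?P<op>\w+)\s+(?P<operand>\S*).*?"
    r"A:(?P<a>[0-9a-f]{4}).*?V:\s*(?P<v>\d+)\s+H:\s*(?P<h>\d+)"
)


def inidisp_writes(trace_lines, vblank_start=225):
    """Return one dict per 'sta $2100' found in the trace lines."""
    hits = []
    for line in trace_lines:
        m = TRACE_RE.match(line.strip())
        if not m or m["op"] != "sta" or m["operand"] != "$2100":
            continue
        v = int(m["v"])
        hits.append({
            "pc": m["pc"],
            "value": int(m["a"], 16) & 0xFF,  # assumes 8-bit accumulator
            "v": v,
            "in_vblank": v >= vblank_start,   # assumes 224-line display
        })
    return hits


if __name__ == "__main__":
    sample = [
        "168011 stz $2101 [162101] A:0000 X:0000 Y:0000 S:01fd D:0000 DB:16 nvMXdIZC V: 0 H:107 F: 0",
        "16825d sta $2100 [162100] A:3c01 X:0000 Y:0000 S:01fc D:0000 DB:16 nvMXdIzC V:231 H:310 F:15",
    ]
    for hit in inidisp_writes(sample):
        print(hit)
```

Under those assumptions, the breakpoint line above decodes to a write of $01 at scanline 231, i.e. brightness 1 with the forced-blank bit clear, which fits a fade-in step.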
That's smack dab in the middle of the fade-in and fade-out routine (which also has some support for setting $2100 = $80, but I haven't sat down to fully understand the "logic flow" of the routine).
Patching the game to use a reset vector of $FC00, then adding some code there (file offset $7C00; headerless .sfc file, not .smc) was easy: clc / xce / sei / sep #$30 / lda #$8f / sta $2100 / jml $168000. The results in the SNES9x debugger are attached.
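For anyone who wants to reproduce the patch without a hex editor, here is a rough Python sketch of the same thing done on an in-memory copy of the ROM. The assumptions are mine: a headerless LoROM image, the bytes at $00FC00 being free to overwrite, and no attempt to fix the header checksum. The function names lorom_to_file_offset and apply_patch are hypothetical helpers, not part of any tool.

```python
def lorom_to_file_offset(bank, addr):
    """Map a LoROM CPU address (addr in $8000-$FFFF) to a headerless file offset."""
    if not 0x8000 <= addr <= 0xFFFF:
        raise ValueError("LoROM maps ROM into $8000-$FFFF of each bank")
    return ((bank & 0x7F) << 15) | (addr & 0x7FFF)


# Hand-assembled 65816 bytes for the stub described above.
PATCH = bytes([
    0x18,                    # clc
    0xFB,                    # xce         (switch to native mode)
    0x78,                    # sei
    0xE2, 0x30,              # sep #$30    (8-bit A/X/Y)
    0xA9, 0x8F,              # lda #$8f
    0x8D, 0x00, 0x21,        # sta $2100   (forced blank on, brightness 15)
    0x5C, 0x00, 0x80, 0x16,  # jml $168000 (original entry point)
])


def apply_patch(rom):
    """Return a copy of rom with the stub at $00FC00 and the reset vector repointed."""
    out = bytearray(rom)
    code_at = lorom_to_file_offset(0x00, 0xFC00)    # file offset $7C00
    vector_at = lorom_to_file_offset(0x00, 0xFFFC)  # emulation-mode reset vector
    out[code_at:code_at + len(PATCH)] = PATCH
    out[vector_at:vector_at + 2] = (0xFC00).to_bytes(2, "little")
    return bytes(out)


if __name__ == "__main__":
    dummy = bytes(0x8000)  # stand-in for bank 0 of a real ROM image
    patched = apply_patch(dummy)
    print(hex(lorom_to_file_offset(0x00, 0xFC00)), patched[0x7FFC:0x7FFE].hex())
```

Check that the target area really is unused in your copy of the ROM before writing the result back to disk.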
I believe the reason it has worked on older copiers (SWC DX32, etc.) is that those probably pre-init $2100 (and certainly other registers) to $8F (or maybe just $80) prior to actually mapping the ROM address-line-wise and jumping to the reset vector. SD2SNES and emulators showing garbage is 100% legitimate -- the core issue is that forced blanking wasn't set, or, looked at another way, that the VRAM update routines weren't waiting for VBlank (forced blanking is by far the easier fix).
There's a lot more work to do on this, including figuring out why the starfield graphics during the actual Sidmania part are all garbled/messed up (I'm thinking similar issues, re: doing VRAM updates outside of VBlank, or memory that is being pre-used/pre-initted by the previous phase of the demo, or some mode 7 registers not being initialised correctly/inconsistently), but hey, one thing at a time.