I just thought I'd include that in the title so as not to mislead anyone. Anyway, thanks to darryl.revok (me misreading the logo) and psychopathicteen (the bullet rendering) for the idea.
Anyway, the plan is that it'll be top down, with a layer solely devoted to the ink colors. Whenever that layer changes (a splat appears), the tiles where the change occurred are uploaded to VRAM, but only if they're onscreen. Whenever the screen scrolls, it'll load extra tiles like any other game. The buffer is to be a lot bigger than the screen, and I'd probably need extra RAM on the cartridge if I get that far.
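To make the "upload only the changed tiles that are onscreen" idea concrete, here is a minimal Python sketch. It is not SNES code; the function name, the 256x224 screen size and the camera parameters are assumptions made for illustration.

```python
TILE = 8  # tile size in pixels


def tiles_to_upload(splat_x, splat_y, splat_w, splat_h,
                    cam_x, cam_y, screen_w=256, screen_h=224):
    """Return the (tile_x, tile_y) pairs a splat touches that are also on screen.

    All positions are in buffer pixels; cam_x/cam_y is the top-left corner
    of the visible area (hypothetical names, not from the original post).
    """
    # Inclusive tile range covered by the splat.
    tx0, ty0 = splat_x // TILE, splat_y // TILE
    tx1 = (splat_x + splat_w - 1) // TILE
    ty1 = (splat_y + splat_h - 1) // TILE
    # Inclusive tile range covered by the screen.
    sx0, sy0 = cam_x // TILE, cam_y // TILE
    sx1 = (cam_x + screen_w - 1) // TILE
    sy1 = (cam_y + screen_h - 1) // TILE
    # The tiles to upload are the intersection of the two rectangles.
    return [(tx, ty)
            for ty in range(max(ty0, sy0), min(ty1, sy1) + 1)
            for tx in range(max(tx0, sx0), min(tx1, sx1) + 1)]
```

A splat that is entirely offscreen gives an empty list, so nothing is queued for VRAM until scrolling brings those tiles into view.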
This is really pretty incomplete (I still need to know how to code an 8x16 multiplication, done in two parts, for multiplying the Y position by the buffer width), and I'm not sure how I should do everything. While writing this, I realized that my idea of having the code scan through the whole splat horizontally won't really work, because the real thing would need to check collisions... I think I'll recode it so that it draws the splat 8x8 block by block. The splat graphics will just have to have a blank tile at the top and bottom. (I'll be drawing blank areas, but whatever. It's probably faster than checking every 8x1 pixel sliver.) I'll also reorder where the pixel data sits in them, but I'll keep the mask and the actual pattern as they were, with the bytes for both interleaved. Also, if I'm drawing tile by tile, it should be a lot easier to see which tiles I need to upload when the splat appears onscreen, and I won't need different code for every splat width. I haven't done anything about checking the edges of the buffer when drawing the graphic, and I really wonder how I'll handle a graphic that starts off the buffer (when only parts of the splat are visible from the top and left).
Anyway, it isn't commented right now because I'm feeling lazy. (I have to fix it anyway, so I can do it then.) I find it pretty easy to follow, but then again, I wrote it, so...
Code:
lda SplatRequestTable+YPosition,x
; TODO: multiply by the bytes per line in the buffer (I don't know how yet)
sta SplatPosition
lda SplatRequestTable+XPosition,x
lsr                          ; lsr, not ror: ror would pull the carry flag into bit 15
lsr
clc
adc SplatPosition
sta SplatPosition
lda SplatRequestTable+XPosition,x
and #%0000000000000111       ; sub-tile X offset (0-7)
tay                          ; tay sets Z from the value, so the branch still works
beq continue_calculating_offset
lda #$0001                   ; not tile-aligned: the splat sits between tiles
sta SplatBetweenTilesWidth
continue_calculating_offset:
lda SplatGraphicOffsetPositionTable,y
clc
adc SplatRequestTable+GraphicOffset,x
sta SplatGraphicOffset
lda SplatRequestTable+Height,x
; TODO: multiply by the bytes per line in the buffer (I don't know how yet)
clc
adc SplatPosition
sta EndOfSplat
lda SplatRequestTable+Width,x
lsr
lsr
sta SplatDataWidth
lda SplatRequestTable+Width,x
clc
adc SplatBetweenTilesWidth
tax                          ; the 65816 has jsr (addr,x) but no jsr (addr,y)
jsr (VaryingWidthCodeAddressTable,x)
;======================================================================
start_draw_splat_8_tile:
ldx SplatPosition
ldy SplatGraphicOffset
draw_splat_8_tile_loop:
lda Buffer,x
and $0000,y                  ; mask word (absolute,y: an immediate operand can't be indexed)
ora $0002,y                  ; pattern word
sta Buffer,x
lda Buffer+16,x
and $0004,y
ora $0006,y
sta Buffer+16,x
lda Buffer+32,x
and $0008,y
ora $000A,y
sta Buffer+32,x
lda Buffer+48,x
and $000C,y
ora $000E,y
sta Buffer+48,x
lda Buffer+64,x
and $0010,y
ora $0012,y
sta Buffer+64,x
lda Buffer+80,x
and $0014,y
ora $0016,y
sta Buffer+80,x
lda Buffer+96,x
and $0018,y
ora $001A,y
sta Buffer+96,x
lda Buffer+112,x
and $001C,y
ora $001E,y
sta Buffer+112,x
cpx EndOfSplat               ; EndOfSplat is a buffer position, so compare X, not Y
bcc continue_draw_splat_8_tile_loop
; TODO: whatever is needed to get ready for looking at the next splat
continue_draw_splat_8_tile_loop:
inx
inx
tya
clc
adc SplatDataWidth
tay
bra draw_splat_8_tile_loop
Espozo wrote:
The buffer is to be a lot bigger than the screen
How large do you expect it to be? And at pixel resolution, or would something coarser do?
It's to be screen resolution. I looked at 2x2, but it looks like crap, and trying to use some sort of algorithm to smoothen the edges takes too much CPU time.
How large do you expect the buffer at screen resolution to be?
Well, the characters are probably going to be about 32x32, and the maps would probably be about the size of the ones in the actual game (If they aren't to be straight up copies, but you'd have to deal with depth then)
and on this same map, you're this size:
So pretty big.
if your characters are going to be that large, why not use 8x8 predefined tiles for the ink splatter? (no ink, team A ink, team B ink).
Use solid ink on the inside of the spray shape and draw rounded corner edge pieces depending on its neighboring tile values.
Code:
........
...CC...
..CSSC..
..SSSS..
..CSSC..
S = known solid ink tiles, C = check when drawing, as in look at its neighbors to determine the tile number for an appropriate rounded edge. When drawing, optimize to avoid redundant checking of neighboring cells multiple times.
end result:
Code:
........
...__...
../OO\..
..OOOO..
..\OO/..
For example, look at an overhead map view in a traditional JRPG, and notice how the water coastline and land transition has rounded, stylized tiles. Or how forest tiles have rounded edges at the transition to grass tiles so as to make more of a blob of forest instead of strictly square edges.
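As a rough illustration of the neighbor check described above, here is a small Python sketch. The 4-bit mask layout (N=1, E=2, S=4, W=8) and the function name are assumptions; the resulting mask would index a hypothetical table of 16 edge tiles, with 15 meaning "solid ink".

```python
# The example shape from the post: C and S cells both hold ink.
SHAPE = [
    "........",
    "...CC...",
    "..CSSC..",
    "..SSSS..",
    "..CSSC..",
]
grid = [[1 if c in "CS" else 0 for c in row] for row in SHAPE]


def edge_mask(grid, x, y):
    """Return a 4-bit mask of same-team neighbors (N=1, E=2, S=4, W=8).

    Returns None for a cell with no ink. Cells outside the grid count
    as "not the same team".
    """
    team = grid[y][x]
    if team == 0:
        return None
    mask = 0
    for bit, (dx, dy) in enumerate(((0, -1), (1, 0), (0, 1), (-1, 0))):
        nx, ny = x + dx, y + dy
        inside = 0 <= ny < len(grid) and 0 <= nx < len(grid[0])
        if inside and grid[ny][nx] == team:
            mask |= 1 << bit
    return mask
```

Only cells on the outline of a new splat need this check; interior cells are known to be solid, which is the redundancy the post suggests skipping.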
i have a feeling splatoon itself is using a floor grid and is marking each cell either ink color (or none) and picking a texture based on neighbors and its own saturation value. For "immersion" sake, it draws the walls in finer grained per-pixel splatter textures. But walls don't count toward the end score.
whicker wrote:
i have a feeling splatoon itself is using a floor grid and is marking each cell either ink color (or none) and picking a texture based on neighbors and its own saturation value. For "immersion" sake, it draws the walls in finer grained per-pixel splatter textures. But walls don't count toward the end score.
Grass wear in Animal Crossing: City Folk operates similarly.
whicker wrote:
if your characters are going to be that large, why not use 8x8 predefined tiles for the ink splatter? (no ink, team A ink, team B ink). Use solid ink on the inside of the spray shape and draw rounded corner edge pieces depending on its neighboring tile values.
I understand the concept, but I don't think it's possible to get a pattern this random-looking that way:
I don't have a clue how they have enough memory to cover every surface if it works the way I described, though.
I could always join a Splatoon fansite or something and ask if anyone knows how it's rendered if it's that big of a deal.
I'm trying to fix my code, but it's melting my brain...
The actual resolution of the "blue or orange" texture map is a lot lower than the visual rendering of it.
It takes a sample from the low resolution map, adds some spatially-stable noise to it (i.e. always the same noise in the same world coordinates), and then takes a threshold to decide whether to render that pixel blue or orange.
rainwarrior wrote:
It takes a sample from the low resolution map, adds some spatially-stable noise to it (i.e. always the same noise in the same world coordinates), and then takes a threshold to decide whether to render that pixel blue or orange.
You're saying it gets a low map, blows it up (by like a bilinear filtering sort of deal) and then it does what? This seems like it's more for taking less memory than for speed (which I guess partially explains why the game is only 720p and no anti-aliasing when the geometry is so simple) but I'd rather this run fast and have extra memory, even if it is "cheating".
Wait, how did you even know that?
Yes, bilinear filtering on the low res texture. Then, to add fake visual detail, you layer noise on top of it. Finally, to create sharp edges you apply a threshold.
This process should not be particularly expensive for a pixel shader to do. I don't think it's a tradeoff for rendering speed at all. The gameplay map is low res to conserve memory, and most importantly to conserve network bandwidth, not rendering time.
low res map = less data to send over the network = more responsive netplay
Espozo wrote:
Wait, how did you even know that?
It's an educated guess. I did 3D graphics programming for a living before I quit to work on my NES game.
rainwarrior wrote:
Then, to add fake visual detail, you layer noise on top of it.
You mean like gray if the two colors were to be white and black?
rainwarrior wrote:
Finally, to create sharp edges you apply a threshold.
What does that mean?
rainwarrior wrote:
This process should not be particularly expensive for a pixel shader to do.
It would be for a console without one, which is the deal...
rainwarrior wrote:
more responsive netplay
A second less responsive, and it would be unplayable. The lag is already the worst part of the game. You'd have thought they'd base the gameplay a little more around the fact that there's always a good amount of lag, by giving you more health or something so you don't just magically explode right after meeting someone.
I'll just try to explain visually:
Attachment: ink_demo.png
1. Low res ink map.
2. Bilinear filtering.
3. Threshold of 2: this makes a nicer edge but the low resolution underlying is too apparent.
4. Noise map.
5. Noise map + bilinear filtered low res ink map.
6. Threshold of 5: creates extra details hiding the low resolution but maintaining its basic shape.
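The six steps can be sketched in a few lines of Python. This is only an illustration of the idea, not how Splatoon is implemented: the hash used for the noise, the noise amplitude and the normalized threshold of 0.5 are all assumptions (the thresholds of 2 and 5 above belong to the scale used in the image).

```python
def bilinear(low, u, v):
    """Sample the low-res map at fractional coordinates, clamping at the edges."""
    h, w = len(low), len(low[0])
    x0, y0 = int(u), int(v)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = u - x0, v - y0
    top = low[y0][x0] * (1 - fx) + low[y0][x1] * fx
    bottom = low[y1][x0] * (1 - fx) + low[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def stable_noise(x, y):
    """Deterministic noise in [-0.25, 0.25): same coordinates, same value."""
    h = (x * 73856093) ^ (y * 19349663)
    h = (h * 2654435761) & 0xFFFFFFFF
    return (h / 2**32 - 0.5) * 0.5


def render(low, scale, threshold=0.5):
    """Upscale a 0/1 ink map: filter, add noise, then threshold to 0 or 1."""
    out = []
    for py in range(len(low) * scale):
        row = []
        for px in range(len(low[0]) * scale):
            sample = bilinear(low, px / scale, py / scale) + stable_noise(px, py)
            row.append(1 if sample >= threshold else 0)
        out.append(row)
    return out
```

Because the noise depends only on the pixel's coordinates, the ragged edge stays put from frame to frame; only changes to the low-res map move it.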
espozo wrote:
rainwarrior wrote:
This process should not be particularly expensive for a pixel shader to do.
It would be for a console without one, which is the deal...
I'm not offering a SNES solution. I'm merely offering an explanation as to how Splatoon might be rendered.
Okay, here's the revised code: (but still far from complete)
Code:
lda SplatRequestTable+YPosition,x
and #%0000000000000111
beq continue_calculating_vertical_offset
inc SplatHeight              ; not tile-aligned vertically: one extra tile row (assumed intent; SplatHeight is added to the height below)
continue_calculating_vertical_offset:
lda SplatRequestTable+YPosition,x
lsr                          ; lsr, not ror: ror would pull the carry flag into bit 15
lsr
lsr                          ; Y / 8 = tile row
; TODO: multiply by the data per line in the buffer (I don't know how yet)
sta SplatPosition
lda SplatRequestTable+YPosition,x
and #%0000000000000111
asl
sta SplatGraphicOffset
lda SplatRequestTable+GraphicOffset,x
sec
sbc SplatGraphicOffset
sta SplatGraphicOffset
lda SplatRequestTable+XPosition,x
lsr
lsr
clc
adc SplatPosition
sta SplatPosition
lda SplatRequestTable+XPosition,x
and #%0000000000000111
beq continue_calculating_x_offset
inc SplatWidth               ; not tile-aligned horizontally: one extra tile column
continue_calculating_x_offset:
lda SplatRequestTable+XPosition,x
and #%0000000000000111
; TODO: multiply by the data per variation of the splat graphic (all sizes should be the same)
clc
adc SplatGraphicOffset
sta SplatGraphicOffset
lda SplatRequestTable+Height,x
clc
adc SplatHeight
; TODO: multiply by the data per line in the buffer (I don't know how yet)
clc
adc SplatPosition
sta EndOfSplat               ; same name the draw loop compares against
lda SplatRequestTable+Width,x
lsr
lsr
sta SplatDataIncrement
lda SplatRequestTable+Width,x
clc
adc SplatWidth
sta SplatWidth
;======================================================================
start_draw_splat_loop:
ldx SplatPosition
ldy SplatGraphicOffset
draw_splat_loop:
lda Buffer,x
and $0000,y                  ; mask word (absolute,y: an immediate operand can't be indexed)
ora $0002,y                  ; pattern word
sta Buffer,x
lda Buffer+2,x
and $0004,y
ora $0006,y
sta Buffer+2,x
lda Buffer+4,x
and $0008,y
ora $000A,y
sta Buffer+4,x
lda Buffer+6,x
and $000C,y
ora $000E,y
sta Buffer+6,x
lda Buffer+8,x
and $0010,y
ora $0012,y
sta Buffer+8,x
lda Buffer+10,x
and $0014,y
ora $0016,y
sta Buffer+10,x
lda Buffer+12,x
and $0018,y
ora $001A,y
sta Buffer+12,x
lda Buffer+14,x
and $001C,y
ora $001E,y
sta Buffer+14,x
cpx EndOfSplat               ; EndOfSplat is a buffer position, so compare X, not Y
bcs done
inc TilesDrawnHorizontally
lda TilesDrawnHorizontally   ; cmp tests A, which still held buffer data
cmp SplatWidth
beq next_row
txa
clc
adc #$0010
tax
tya
clc
adc SplatDataIncrement
tay
bra draw_splat_loop
next_row:
stz TilesDrawnHorizontally
txa
clc
adc BufferDataPerRow
tax
tya
clc
adc SplatDataIncrement
tay
bra draw_splat_loop
I actually just now realized that "Multiply by data per line in buffer" doesn't exactly work here because of how the tiles are set up in the buffer, but my brain hurts now so it'll have to wait.
rainwarrior wrote:
I'll just try to explain visually:
That's definitely clever... (Not necessarily for Nintendo, but for whoever came up with the idea.)
Espozo wrote:
That's definitely clever... (Not necessarily for Nintendo, but for whoever came up with the idea.)
Using noise to create extra visual detail in computer graphics is probably at least 40 years old, but if you want someone to thank I think Ken Perlin did the most notable work to popularize the idea in the early 80s.
Oh yeah, how do you multiply on the SNES? I looked at the SNES hardware register page on superfamicom.org and it lists the multiplicand and multiplier registers for the CPU, but it doesn't seem to for the 8x16 multiplication in the PPU for mode 7. However, it does list the registers that hold the product. Obviously, we're not dealing with Mode 7 here so it should be safe.
The part that talks about the result registers tells you what the inputs are. They're just part of the Mode 7 matrix.
This is a neat idea, though. Good thing it's not me conceptualizing it - I'd have insisted on doing a full 3D version with the Super FX2 and software texture re-rendering or some nonsense... with explicit XBAND support, of course... I haven't yet learned that there's a difference between what a SNES could do and what a SNES should do...
93143 wrote:
The part that talks about the result registers tells you what the inputs are. They're just part of the Mode 7 matrix.
Wow, I'm an idiot...
Quote:
Registers 211b through 2120 are 16 bits wide.
0x211B is also used as the 16-bit multiplicand for registers 0x2134-6 (write twice)
0x211C is also used as the 8-bit multiplier for registers 0x2134-6
What does it mean by "write twice" though? Is it just that it's two 8 bit writes, or one 16 bit write?
Anyway, I think I figured out how to calculate the vertical offset: You take the YPosition, divide it by 8 by bit shifting, multiply this number by the amount of pixels in the buffer horizontally times the amount of data per tile (pre calculated obviously) store this, load YPosition again, "and" this to where only the first 3 bits are left, multiply this number times the amount of data per 8 pixels (2 for 2 bytes) and then add the result of the previous thing to this for the final result. I'll probably code this tomorrow and try and do some other stuff to it also.
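Written out as arithmetic, the vertical offset described above looks roughly like this Python sketch. It assumes the layout the drawing code implies (2 bytes per 8-pixel line, 16 bytes per tile, tiles stored one after another along a row); `tiles_per_row` and the function name are made up for the example, and the X term is added for completeness.

```python
BYTES_PER_LINE = 2    # one 16-bit word per 8-pixel line of a tile
BYTES_PER_TILE = 16   # 8 lines * 2 bytes


def buffer_offset(x, y, tiles_per_row):
    """Byte offset of pixel (x, y) in a buffer stored tile by tile."""
    bytes_per_tile_row = tiles_per_row * BYTES_PER_TILE  # precalculated constant
    tile_row_part = (y >> 3) * bytes_per_tile_row        # whole tile rows above
    tile_col_part = (x >> 3) * BYTES_PER_TILE            # whole tiles to the left
    line_part = (y & 7) * BYTES_PER_LINE                 # line inside the tile
    return tile_row_part + tile_col_part + line_part
```

For a 64-tile-wide buffer, pixel (13, 11) lands at 1 * 1024 + 1 * 16 + 3 * 2 = 1046.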
93143 wrote:
This is a neat idea, though.
Just thinking, another thing that made me think about it is how the maps are mostly 2D but with elevation differences. I wonder how well you could port the maps in this game to Doom...
93143 wrote:
Good thing it's not me conceptualizing it - I'd have insisted on doing a full 3D version with the Super FX2 and software texture re-rendering or some nonsense... with explicit XBAND support, of course... I haven't yet learned that there's a difference between what a SNES could do and what a SNES should do...
Are you regretting working with the Super FX? Although something like XBAND is obviously out of the question, making a local cable for hooking multiple SNES's together would be awesome. I was actually thinking about this in the old discussion about how a 2 player F-Zero wouldn't be (easily) possible because of only having one tilemap.
Wait... SNES Doom could be 2 player!?
https://www.youtube.com/watch?v=-P3JGxBNUyM I didn't think any game actually worked around the XBAND. It's not like there's a game built around the Game Genie anyway.
Espozo wrote:
Quote:
Registers 211b through 2120 are 16 bits wide.
0x211B is also used as the 16-bit multiplicand for registers 0x2134-6 (write twice)
0x211C is also used as the 8-bit multiplier for registers 0x2134-6
What does it mean by "write twice" though? Is it just that it's two 8 bit writes, or one 16 bit write?
Talking purely about mode 7 crap:
It means two 8-bit writes, in the order of low byte followed by high byte of whatever 16-bit multiplicand value you have. The 16-bit value is signed, so the MSB of the 16-bit value is the sign bit.
A native 16-bit write to $211B would write to both registers $211B and $211C, which isn't what you want.
Proper order of operation:
1. Write 8-bit value (lower byte of multiplicand) to $211b
2. Write 8-bit value (upper byte of multiplicand) to $211b
3. Write 8-bit value (multiplier) to $211c
4. Read $2134, $2135, $2136 for results
I've never done mode 7 stuff, and these registers are actually described more thoroughly in the official documentation, including mentioning something about decimal placement.
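For anyone who wants to check values by hand, here is a small Python model of what the $211B/$211C multiply computes. It is a sketch of the arithmetic only, not of the register timing, and it assumes that the 8-bit multiplier is treated as signed as well (my understanding of the hardware, not something stated above). The function names are made up.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def mode7_multiply(multiplicand16, multiplier8):
    """Return the three result bytes as read from $2134, $2135, $2136."""
    a = to_signed(multiplicand16, 16)   # written to $211B, low byte then high byte
    b = to_signed(multiplier8, 8)       # written to $211C
    product = (a * b) & 0xFFFFFF        # 24-bit two's complement result
    return product & 0xFF, (product >> 8) & 0xFF, (product >> 16) & 0xFF
```

For example, `mode7_multiply(0x0100, 0x02)` gives the bytes of 512, which is `(0x00, 0x02, 0x00)`.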
If what you're looking for is literal/absolute unsigned multiplication or division, then those can be done through different registers. For 8-bit multiplication (with a 16-bit result):
1. Write $4202 (multiplicand) (8-bit)
2. Write $4203 (multiplier) (8-bit)
3. Wait 8 CPU cycles
4. Read $4216 (result or "product") (16-bit)
For division:
1. Write $4204 (dividend) (16-bit)
2. Write $4206 (divisor) (8-bit)
3. Wait 16 CPU cycles
4. Read $4214 (quotient) (16-bit)
5. Read $4216 (remainder) (16-bit)
The order in which you write to these registers matters.
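The same registers can be modelled in Python to show how an 8x16 multiplication can be built from two 8x8 ones, which is what the splat code needs for "Y times bytes per row". This is a sketch of the arithmetic only; the function names are made up, and division by zero is not modelled.

```python
def alu_multiply(a8, b8):
    """Model of $4202 * $4203 -> $4216/$4217 (unsigned 8x8 -> 16 bits)."""
    return (a8 & 0xFF) * (b8 & 0xFF)


def alu_divide(dividend16, divisor8):
    """Model of $4204/5 divided by $4206 -> (quotient $4214, remainder $4216)."""
    return divmod(dividend16 & 0xFFFF, divisor8 & 0xFF)


def multiply_8x16(a8, b16):
    """8-bit times 16-bit using two hardware-style 8x8 multiplies."""
    low = alu_multiply(a8, b16 & 0xFF)           # a * low byte of b
    high = alu_multiply(a8, (b16 >> 8) & 0xFF)   # a * high byte of b
    return low + (high << 8)                     # result can need 24 bits
```

On the real hardware the second product would be shifted by storing it one byte higher before adding, and each multiply still needs its 8-cycle wait.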
"Wait N CPU cycles" means you won't have a valid result (in $4216/4217) until N number of CPU cycles has passed. You can do other things in the meantime (i.e. you don't need to do 8 or 16 cycles worth of NOPs unless you really want to).
Espozo wrote:
Are you regretting working with the Super FX?
No, no. It's just that in this case, judging by how Doom turned out, I imagine it'd probably be better to go with a demake rather than trying for a direct port. In the case of the shmup I'm porting, based on my calculations I think I can pretty much preserve the gameplay and (to an extent) the graphical look and feel without having to redesign it to fit on the platform. Splatoon is a different matter.
Why add a coprocessor (thus "cheating") and all the programming headaches of a 3D Super FX game in order to produce an end product that looks and plays worse than a 2D version that would only have needed the base hardware? Not to mention that while a 2D version would be its own thing, a 3D version would just be a bad version of the Wii U game.
As I said, I'm speaking against my instincts here...
koitsu wrote:
"Wait N CPU cycles" means you won't have a valid result (in $4216/4217) until N number of CPU cycles has passed.
If I recall correctly, the ALU multiplier does give a valid result early if the numbers being multiplied are small. Taz-Mania relies on this behaviour, and I believe higan emulates it correctly. As far as I know, no other emulator even bothers with the delay.
Wait, I'm an idiot. Instead of leaving empty space at the top and bottom of the splat graphic, I could still draw the splat tile by tile, but with special code for each tile height. It would go through the first tile-height loop for one whole row, then through the second loop for what will most likely be multiple rows (this one would always be 8 pixels tall), and then through the last loop. Although there'd be slightly more preparation at the beginning, it would probably still save time overall, and a little memory too. I'm working on it now.
Espozo wrote:
I was actually thinking about this in the old discussion about how a 2 player F-Zero wouldn't be (easily) possible because of only having one tilemap.
You know, even F-Zero X and GX don't have multiplayer GPs like Mario Kart. It's always head-to-head with maybe a couple of CPU opponents. And X provides a precedent for streamlining the graphics in multiplayer modes. In that context, it seems less unreasonable to use double-buffered quarter-maps (possibly with reduced texture resolution to help ensure that the edges of a quarter-map are always out of sight) and devote 4 KB of DMA per frame to tilemap updates; there should be enough room for software-scaled sprites at a reasonable frame rate if there can only ever be 6 of them.
And I'm sure the SA-1 in an F-Zero SX cart could spare the ~6% of a frame it would take to build a half-dual-quarter-map for both players with repeated ROM-to-BWRAM DMA transfers, so it's ready for an easy one-shot DMA to VRAM during VBlank. You wouldn't need to waste VBlank time on Mode 7 HDMA tables for non-flat courses either, since unlike the Super FX (which either hogs its memory or can't use it at all) the SA-1 can share memory with the S-CPU, so if you build the tables in BWRAM there's no need to move them.
...see, that's how I normally think about SNES development (and I've barely scratched the surface of what I'd want to do with that game). I'm apparently a big fan of enhancement chips. I know lots of people consider it cheating, but I just don't feel it...
There's a big difference between the kind of chips that were released at the time and a coprocessor based on modern hardware.
93143 wrote:
You know, even F-Zero X and GX don't have multiplayer GPs like Mario Kart. It's always head-to-head with maybe a couple of CPU opponents. And X provides a precedent for streamlining the graphics in multiplayer modes. In that context, it seems less unreasonable to use double-buffered quarter-maps (possibly with reduced texture resolution to help ensure that the edges of a quarter-map are always out of sight) and devote 4 KB of DMA per frame to tilemap updates; there should be enough room for software-scaled sprites at a reasonable frame rate if there can only ever be 6 of them.
I know. We went over this, it's just at the time I didn't know. Anyway, I'm not interested in doing that, at the moment anyway.
93143 wrote:
I know lots of people consider it cheating, but I just don't feel it...
Well, I mean, I wouldn't compare a game using software rendering vs. the Super FX... It's its own category.
Sik wrote:
There's a big difference between the kind of chips that were released at the time and a coprocessor based on modern hardware.
Well, this is an entirely different category.
Anyway, about having the code for each tile height, I think I'll have the code for a 5 pixel height and then an 8 pixel height and then a 3 pixel height grouped together because it's faster, but I'm also lazy...
The problem with this is though that it will always assume the splat is always at least 16 pixels tall, but I could always make it to where it skips the middle 8 pixel loop if it detects it's not big enough. I have it to where it skips this thing altogether and only goes to an 8 pixel loop if it's in exactly the right position, and by the nature of how this is, it works with 8 pixel high splats. I'll work on it again when I get home. In programming this, there's a lot to keep track of... (To me anyway.)
I'm sorry; I'm not intending to drag your thread off topic. I had just realized that with an SA-1, the map preparation actually seems quite tractable. And since you had mentioned it, and I had been generating (largely useless) noise over my tendency to want to use special chips with everything...
I might be interested in doing it. But probably not until after my current project, which is going to take a while...
I've posted my realization to the F-Zero 2-player thread, where it belongs.
It's fine. Most of the time, I derail my own topics anyway...
Anyway, what I said earlier about having the unrolled loop thing, (That's what an unrolled loop is, isn't it?) that's definitely the best option. I forgot how drawing four pixels at the top of a tile and drawing four at the bottom aren't exactly the same thing...
Also, I have the graphics format down for how I want the splats. It's going to alternate between a word that's a mask and a word that's the actual pattern, like I always thought it would. However, the main thing is that when a tile is completed, instead of moving to the tile on the right until the end of the row, it'll go all the way down and then over one. The reason is that regardless of the splat's vertical position, every line down advances by the same amount of data, which isn't the case the normal way because of tile boundaries. Luckily, I have no homework or tests, so I can work on this again. I did not expect this to be even remotely as difficult as it is...
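The mask-then-pattern format boils down to one AND and one OR per 16-bit line. Here is a minimal Python sketch of that blit, treating the buffer as a list of 16-bit words; the function name and the flat-list layout are assumptions for illustration.

```python
def draw_splat_column(buffer, pos, graphic, lines):
    """Blend `lines` lines of splat data into buffer starting at word index pos.

    graphic alternates mask word, pattern word, mask word, pattern word...
    A 0 bit in the mask clears the old pixel bit so the pattern can replace it;
    a 1 bit in the mask keeps whatever was already in the buffer.
    """
    for line in range(lines):
        mask = graphic[2 * line]
        pattern = graphic[2 * line + 1]
        buffer[pos + line] = (buffer[pos + line] & mask) | pattern
```

Because the graphic data runs straight down a column of tiles, the read position advances by one mask/pattern pair per line no matter where the splat starts vertically, which is the property the post is after.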
Also, are you squidding (sorry...) me Nintendo?
There are people in territories outside of Japan that play this game... Why aren't we allowed to have cheap gummies and crappy plastic watches and picture books/mangas and admittedly cool plushies and t-shirts?
Here's the rendering code for a splat that's been shifted down 3 pixels. (If the splat is partially off the screen vertically, it may skip the first tile row code and jump to the second or third set. This would be coded elsewhere.) Holy crap... I didn't expect it to be even a fraction as difficult to make as this, and the worst part is that I have to make this six more times... (The one for when the tile is directly in the right spot won't be nearly as difficult.) I know this is a giant waste of memory, but frankly, I'm going to be more worried about my sanity in copying this several times... In the jump table for selecting what code to jump to because of the different vertical heights, I'll have all of them point to this, and if it works, I'll copy it. I need to fix the preparation code for this now. Anyway...
Code:
start_draw_offset_3_splat:
ldx SplatBufferPosition
cpx EndOfBuffer
bcs draw_offset_3_splat_done
ldy SplatGraphicOffset
lda SplatGraphicOffset
clc
adc #$0014
sta SplatGraphicOffset
draw_offset_3_splat_first_rows:
lda Buffer+6,x
and $0000,y                  ; mask word (absolute,y: an immediate operand can't be indexed)
ora $0002,y                  ; pattern word
sta Buffer+6,x
lda Buffer+8,x
and $0004,y
ora $0006,y
sta Buffer+8,x
lda Buffer+10,x
and $0008,y
ora $000A,y
sta Buffer+10,x
lda Buffer+12,x
and $000C,y
ora $000E,y
sta Buffer+12,x
lda Buffer+14,x
and $0010,y
ora $0012,y
sta Buffer+14,x
inc TilesDrawnHorizontally
lda TilesDrawnHorizontally   ; cmp tests A, which still held buffer data
cmp SplatTileWidth
bcs start_draw_offset_3_splat_middle_rows
txa
clc
adc #$0010
tax
tya
clc
adc SplatDataHeight
tay
bra draw_offset_3_splat_first_rows
;======================================================================
start_draw_offset_3_splat_middle_rows:
ldx SplatBufferPosition
stz TilesDrawnHorizontally
txa
clc
adc DataPerRowInBuffer
tax
ldy SplatGraphicOffset
draw_offset_3_splat_middle_rows:
lda Buffer,x
and $0000,y                  ; mask word (absolute,y: an immediate operand can't be indexed)
ora $0002,y                  ; pattern word
sta Buffer,x
lda Buffer+2,x
and $0004,y
ora $0006,y
sta Buffer+2,x
lda Buffer+4,x
and $0008,y
ora $000A,y
sta Buffer+4,x
lda Buffer+6,x
and $000C,y
ora $000E,y
sta Buffer+6,x
lda Buffer+8,x
and $0010,y
ora $0012,y
sta Buffer+8,x
lda Buffer+10,x
and $0014,y
ora $0016,y
sta Buffer+10,x
lda Buffer+12,x
and $0018,y
ora $001A,y
sta Buffer+12,x
lda Buffer+14,x
and $001C,y
ora $001E,y
sta Buffer+14,x
lda FullTilesDrawnVertically
cmp FullTileSplatHeight
bcs start_draw_offset_3_splat_last_rows
inc TilesDrawnHorizontally
lda TilesDrawnHorizontally   ; cmp tests A, so load the counter first
cmp SplatTileWidth
bcs offset_3_splat_next_row
txa
clc
adc #$0010
tax
tya
clc
adc SplatDataHeight
tay
bra draw_offset_3_splat_middle_rows
offset_3_splat_next_row:
cpx EndOfBuffer
bcs draw_offset_3_splat_done
inc FullTilesDrawnVertically
stz TilesDrawnHorizontally   ; restart the column count for the new row
txa
clc
adc DataPerRowInBuffer
tax
lda SplatGraphicOffset
clc
adc #$0020
sta SplatGraphicOffset
tay
bra draw_offset_3_splat_middle_rows
;======================================================================
start_draw_offset_3_splat_last_rows:
ldx SplatBufferPosition
cpx EndOfBuffer
bcs draw_offset_3_splat_done
stz TilesDrawnHorizontally
txa
clc
adc DataPerRowInBuffer
tax
ldy SplatGraphicOffset
draw_offset_3_splat_last_rows:
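lda Buffer,x
and $0000,y                  ; mask word (absolute,y: an immediate operand can't be indexed)
ora $0002,y                  ; pattern word
sta Buffer,x
lda Buffer+2,x
and $0004,y
ora $0006,y
sta Buffer+2,x
lda Buffer+4,x
and $0008,y
ora $000A,y
sta Buffer+4,x
inc TilesDrawnHorizontally
lda TilesDrawnHorizontally   ; cmp tests A, which still held buffer data
cmp SplatTileWidth
bcs draw_offset_3_splat_done
txa
clc
adc #$0010
tax
tya
clc
adc SplatDataHeight
tay
bra draw_offset_3_splat_last_rows
draw_offset_3_splat_done:
rts                          ; this routine is meant to be reached through a jsr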
I hope it works...
I know. Where did I get that wrong?
Also, you actually read it?
I think this should be the top, "preparation" part. There's initialization stuff I still need to do, but it's mostly done unless I find problems with it. I still need to code the part that says which tiles need to be uploaded, but I'll try uploading the tiles from a fixed position first, and if that works correctly, I'll implement it then. I also need to handle the splat being only partially on the buffer, but I have no clue how I want to do that. Oh yeah, and collision detection. Some tiles wouldn't be drawn at all, and some would be drawn with an extra mask for diagonal walls and stuff like that. If I ever get this far, I kind of wonder how I'll handle collision detection between the player and the ink. I'll probably just check one pixel in the middle of the player and see what color it is. Anyway...
Code:
lda SplatRequestTable+YPosition,x
and #%0000000000000111
asl
sta SplatSubTileYPosition    ; (Y & 7) * 2, stored unconditionally; already a word index
lda SplatRequestTable+YPosition,x
lsr                          ; lsr, not ror: ror would pull the carry flag into bit 15
lsr
lsr                          ; Y / 8 = tile row
sep #$20 ;A=8
sta $4202
lda DataPerRowInBuffer       ; only the low byte fits in $4203
sta $4203 ;The product needs 8 CPU cycles; the code below more than covers the wait
rep #$30 ;A=16, X/Y=16 (the height and width below need all 16 bits)
lda SplatRequestTable+Height,x
sta FullTileSplatHeight
asl
asl
asl
asl
asl
sta SplatDataHeight          ; same name the drawing code uses
lda SplatRequestTable+Width,x
sta SplatTileWidth           ; same name the drawing code uses
lda $4216
sta SplatBufferPosition
;======================================================================
lda SplatRequestTable+XPosition,x
lsr
lsr
clc
adc SplatBufferPosition
sta SplatBufferPosition
lda SplatRequestTable+XPosition,x
and #%0000000000000111
beq continue_calculating_x_offset
inc SplatTileWidth           ; not tile-aligned: one extra tile column
continue_calculating_x_offset:
clc                          ; A still holds X & 7 (inc on memory leaves A alone)
adc SplatRequestTable+GraphicOffset,x
sta SplatGraphicOffset
ldx SplatSubTileYPosition    ; already doubled above, so no extra asl
jsr (VariableYOffsetCodeJumpTable,x)
...I just noticed how the Splatoon candy says "ikasu", just like on the Batman squid snacks... Well, at least it makes sense here.
I'm fairly certain this should work (if I actually build the jump table, of course). I just got rid of the indexing on the "SplatRequestTable" because I'm only loading one splat right now, but I'll add it back later. I'm just trying to take baby steps.
Code:
start_draw_splat:
rep #$30 ;A=16, X/Y=16
lda SplatRequestTable+YPosition
and #%0000000000000111
asl
sta SplatSubTileYPosition    ; (Y & 7) * 2, stored unconditionally; already a word index
lda SplatRequestTable+YPosition
lsr                          ; lsr, not ror: ror would pull the carry flag into bit 15
lsr
lsr                          ; Y / 8 = tile row
sep #$20 ;A=8
sta $4202
lda DataPerRowInBuffer       ; only the low byte fits in $4203
sta $4203 ;The product needs 8 CPU cycles; the code below more than covers the wait
rep #$30 ;A=16, X/Y=16 (the height and width below need all 16 bits)
lda SplatRequestTable+Height
sta FullTileSplatHeight
asl
asl
asl
asl
asl
sta SplatDataHeight          ; same name the drawing code uses
lda SplatRequestTable+Width
sta SplatTileWidth           ; same name the drawing code uses
lda $4216
sta SplatBufferPosition
;======================================================================
lda SplatRequestTable+XPosition
lsr
lsr
clc
adc SplatBufferPosition
sta SplatBufferPosition
lda SplatRequestTable+XPosition
and #%0000000000000111
beq continue_calculating_x_offset
inc SplatTileWidth           ; not tile-aligned: one extra tile column
continue_calculating_x_offset:
clc                          ; A still holds X & 7 (inc on memory leaves A alone)
adc SplatRequestTable+GraphicOffset
sta SplatGraphicOffset
ldx SplatSubTileYPosition    ; already doubled above, so no extra asl
jsr (VariableYOffsetCodeJumpTable,x)
rts                          ; don't fall through into the jump table
;======================================================================
VariableYOffsetCodeJumpTable:
; TODO: table entries go here (I don't remember the structure for tables... I'm using the school's computer)
;======================================================================
start_draw_offset_3_splat:
ldx SplatBufferPosition
cpx EndOfBuffer
bcs draw_offset_3_splat_done
ldy SplatGraphicOffset
lda SplatGraphicOffset
clc
adc #$0014
sta SplatGraphicOffset
draw_offset_3_splat_first_rows:
lda Buffer+6,x
and $0000,y                  ; mask word (absolute,y: an immediate operand can't be indexed)
ora $0002,y                  ; pattern word
sta Buffer+6,x
lda Buffer+8,x
and $0004,y
ora $0006,y
sta Buffer+8,x
lda Buffer+10,x
and $0008,y
ora $000A,y
sta Buffer+10,x
lda Buffer+12,x
and $000C,y
ora $000E,y
sta Buffer+12,x
lda Buffer+14,x
and $0010,y
ora $0012,y
sta Buffer+14,x
inc TilesDrawnHorizontally
lda TilesDrawnHorizontally   ; cmp tests A, which still held buffer data
cmp SplatTileWidth
bcs start_draw_offset_3_splat_middle_rows
txa
clc
adc #$0010
tax
tya
clc
adc SplatDataHeight
tay
bra draw_offset_3_splat_first_rows
;======================================================================
start_draw_offset_3_splat_middle_rows:
ldx SplatBufferPosition
stz TilesDrawnHorizontally
txa
clc
adc DataPerRowInBuffer
tax
ldy SplatGraphicOffset
draw_offset_3_splat_middle_rows:
lda Buffer,x
and #$0000,y
ora #$0002,y
sta Buffer,x
lda Buffer+2,x
and #$0004,y
ora #$0006,y
sta Buffer+2,x
lda Buffer+4,x
and #$0008,y
ora #$000A,y
sta Buffer+4,x
lda Buffer+6,x
and #$000C,y
ora #$000E,y
sta Buffer+6,x
lda Buffer+8,x
and #$0010,y
ora #$0012,y
sta Buffer+8,x
lda Buffer+10,x
and #$0014,y
ora #$0016,y
sta Buffer+10,x
lda Buffer+12,x
and #$0018,y
ora #$001A,y
sta Buffer+12,x
lda Buffer+14,x
and #$001C,y
ora #$001E,y
sta Buffer+14,x
lda FullTilesDrawnVertically
cmp FullTileSplatHeight
bcs start_draw_offset_3_splat_last_rows
inc TilesDrawnHorizontally
cmp SplatTileWidth
bcs offset_3_splat_next_row
txa
clc
adc #$0010
tax
tya
clc
adc SplatDataHeight
tay
bra draw_offset_3_splat_middle_rows
offset_3_splat_next_row:
cpx EndOfBuffer
bcs draw_offset_3_splat_done
inc FullTilesDrawnVertically
inc TilesDrawnHorizontally
txa
clc
adc DataPerRowInBuffer
tax
lda SplatGraphicOffset
clc
adc #$0020
sta SplatGraphicOffset
tay
bra draw_offset_3_splat_middle_rows
;======================================================================
start_draw_offset_3_splat_last_rows:
ldx SplatBufferPosition
cpx EndOfBuffer
bcs draw_offset_3_splat_done
stz TilesDrawnHorizontally
txa
clc
adc DataPerRowInBuffer
tax
ldy SplatGraphicOffset
draw_offset_3_splat_last_rows:
lda Buffer,x
and #$0000,y
ora #$0002,y
sta Buffer,x
lda Buffer+2,x
and #$0004,y
ora #$0006,y
sta Buffer+2,x
lda Buffer+4,x
and #$0008,y
ora #$000A,y
sta Buffer+4,x
inc TilesDrawnHorizontally
cmp SplatTileWidth
bcs draw_offset_3_splat_done
txa
clc
adc #$0010
tax
tya
clc
adc SplatDataHeight
tay
bra draw_offset_3_splat_last_rows
;======================================================================
draw_offset_3_splat_done:
stz FullTilesDrawnVertically
rts
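On the "waiting time" comment next to the $4203 write: the hardware multiplier takes two unsigned 8-bit factors, starts when $4203 is written, and the product in $4216/$4217 is only valid about 8 CPU cycles later. A minimal sketch, assuming two hypothetical 8-bit variables TileRow and RowStride (they are not from the listing):
Code:
sep #$20 ;A=8
lda TileRow ;hypothetical: Y position in whole tiles
sta $4202 ;WRMPYA: first factor
lda RowStride ;hypothetical: bytes per tile row, must fit in 8 bits
sta $4203 ;WRMPYB: second factor, writing it starts the multiply
nop ;give the multiplier its 8 cycles
nop ;(unrelated work can go here instead of nops)
nop
nop
rep #$20 ;A=16
lda $4216 ;RDMPYL/RDMPYH: 16-bit product
sta SplatBufferPosition
In the listing above, the loads, shifts and stores that sit between the $4203 write and the $4216 read already take longer than that, so no extra delay should be needed there.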
Espozo wrote:
I know. Where did I get that wrong?
Also, you actually read it?
You indexed Y by an immediate. I didn't read the whole thing, I just noticed the repetitive part.
psycopathicteen wrote:
You indexed Y by an immediate.
Several times. I was wondering if this wasn't a 65816 thing...
Oh, I see now, when I was loading from $0000 which was indexed to get me the address of where the splat graphic is for drawing it... Yeah, I wasn't thinking.
It's an easy fix though.
I'll correct it, then make the jump table, and then make a 16x16 block graphic. I'll upload the tiles manually at first, and if it works, I'll go from there. I wonder how slow this will be... At least, although there will be plenty of splats, there shouldn't be too many appearing on the same frame. It's not like the guns shoot every frame. (The charger is a wide line of splats though.) I just thought about something: the roller is really going to suck to make.
Yeah, I fixed it now, and I actually noticed that I left part of my code unfinished, (the part where it decides on what graphics to choose based on the sub tile x position) so I fixed that along with some other mess ups. I finished the graphics for 8 copies of a 16x16 block, but it's so large and you already know how it's going to look so I won't bother showing it here. (I'm really going to need a tool for it if I ever get anywhere...) Hopefully, I'll have something show up onscreen soon.
Code:
start_draw_splat:
rep #$30 ;A=16, X/Y=16
lda SplatRequestTable+YPosition
and #%0000000000000111
asl
bne continue_calculating_vertical_offset
sta SplatSubTileYPosition
continue_calculating_vertical_offset:
lda SplatRequestTable+YPosition
ror
ror
ror
sep #$20 ;A=8
sta $4202
lda DataPerRowInBuffer
sta $4203 ;Apparently, there's some sort of waiting time for this?
rep #$30 ;A=16, X/Y=16
lda SplatRequestTable+Height
sta FullTileSplatHeight
asl
asl
asl
asl
asl
sta SplatDataHeight
lda SplatRequestTable+Width
sta SplatTileWidth
lda $4216
sta SplatBufferPosition
;======================================================================
lda SplatRequestTable+XPosition
and #%0000000000000111
bne continue_calculating_x_offset
inc SplatTileWidth
lda SplatRequestTable+XPosition
and #%0000000000000111
sep #$20 ;A=8
sta $4202
lda SplatRequestTable+DataSize
sta $4203 ;Apparently, there's some sort of waiting time for this?
rep #$30 ;A=16, X/Y=16
lda SplatRequestTable+XPosition
ror
ror
and #%0011111111111111
clc
adc SplatBufferPosition
sta SplatBufferPosition
lda $4216
sec
sbc SplatDataHeight
continue_calculating_x_offset:
clc
adc SplatRequestTable+GraphicOffset
sta SplatGraphicOffset
lda SplatSubTileYPosition
asl
tax
jsr (VariableYOffsetCodeJumpTable,x)
;======================================================================
VariableYOffsetCodeJumpTable:
.word start_draw_offset_3_splat,start_draw_offset_3_splat
.word start_draw_offset_3_splat,start_draw_offset_3_splat
.word start_draw_offset_3_splat,start_draw_offset_3_splat
.word start_draw_offset_3_splat,start_draw_offset_3_splat
;======================================================================
start_draw_offset_3_splat:
ldx SplatBufferPosition
cpx EndOfBuffer
bcs draw_offset_3_splat_done
ldy SplatGraphicOffset
lda SplatGraphicOffset
clc
adc #$0014
sta SplatGraphicOffset
draw_offset_3_splat_first_rows:
lda Buffer+6,x
and $0000,y
ora $0002,y
sta Buffer+6,x
lda Buffer+8,x
and $0004,y
ora $0006,y
sta Buffer+8,x
lda Buffer+10,x
and $0008,y
ora $000A,y
sta Buffer+10,x
lda Buffer+12,x
and $000C,y
ora $000E,y
sta Buffer+12,x
lda Buffer+14,x
and $0010,y
ora $0012,y
sta Buffer+14,x
inc TilesDrawnHorizontally
cmp SplatTileWidth
bcs start_draw_offset_3_splat_middle_rows
txa
clc
adc #$0010
tax
tya
clc
adc SplatDataHeight
tay
bra draw_offset_3_splat_first_rows
;======================================================================
start_draw_offset_3_splat_middle_rows:
ldx SplatBufferPosition
stz TilesDrawnHorizontally
txa
clc
adc DataPerRowInBuffer
tax
ldy SplatGraphicOffset
draw_offset_3_splat_middle_rows:
lda Buffer,x
and $0000,y
ora $0002,y
sta Buffer,x
lda Buffer+2,x
and $0004,y
ora $0006,y
sta Buffer+2,x
lda Buffer+4,x
and $0008,y
ora $000A,y
sta Buffer+4,x
lda Buffer+6,x
and $000C,y
ora $000E,y
sta Buffer+6,x
lda Buffer+8,x
and $0010,y
ora $0012,y
sta Buffer+8,x
lda Buffer+10,x
and $0014,y
ora $0016,y
sta Buffer+10,x
lda Buffer+12,x
and $0018,y
ora $001A,y
sta Buffer+12,x
lda Buffer+14,x
and $001C,y
ora $001E,y
sta Buffer+14,x
lda FullTilesDrawnVertically
cmp FullTileSplatHeight
bcs start_draw_offset_3_splat_last_rows
inc TilesDrawnHorizontally
cmp SplatTileWidth
bcs offset_3_splat_next_row
txa
clc
adc #$0010
tax
tya
clc
adc SplatDataHeight
tay
bra draw_offset_3_splat_middle_rows
offset_3_splat_next_row:
cpx EndOfBuffer
bcs draw_offset_3_splat_done
inc FullTilesDrawnVertically
inc TilesDrawnHorizontally
txa
clc
adc DataPerRowInBuffer
tax
lda SplatGraphicOffset
clc
adc #$0020
sta SplatGraphicOffset
tay
bra draw_offset_3_splat_middle_rows
;======================================================================
start_draw_offset_3_splat_last_rows:
ldx SplatBufferPosition
cpx EndOfBuffer
bcs draw_offset_3_splat_done
stz TilesDrawnHorizontally
txa
clc
adc DataPerRowInBuffer
tax
ldy SplatGraphicOffset
draw_offset_3_splat_last_rows:
lda Buffer,x
and $0000,y
ora $0002,y
sta Buffer,x
lda Buffer+2,x
and $0004,y
ora $0006,y
sta Buffer+2,x
lda Buffer+4,x
and $0008,y
ora $000A,y
sta Buffer+4,x
inc TilesDrawnHorizontally
cmp SplatTileWidth
bcs draw_offset_3_splat_done
txa
clc
adc #$0010
tax
tya
clc
adc SplatDataHeight
tay
bra draw_offset_3_splat_last_rows
;======================================================================
draw_offset_3_splat_done:
stz FullTilesDrawnVertically
rts
Wait, I forgot, what's the difference between "BSS" and "BSS7E"? I wanted to make a screen sized, 2bpp buffer, but it said there was a memory overflow so I moved it and it hasn't said anything else.
I was beginning to think about actual graphics, and there's no way to have individual clothes and stuff that aren't one color without overlaying 4 sprites over each other for one character, and even then, I'm not sure. I'd have all the clothes be the same for each character, but I'd have different weapons that would be separate sprites. I'll figure it all out if I get there.
My linker script has three BSS segments.
- BSS: $000000-$001FFF
- BSS7E: $7E2000-$7EFFFF
- BSS7F: $7F0000-$7FFFFF
I split out BSS from BSS7E because only the first 8K of RAM is mirrored into LoROM banks ($00-$3F and $80-$BF).
Ah, so it's fine. Thanks.
tepples wrote:
the first 8K of RAM is mirrored into LoROM banks ($00-$3F and $80-$BF).
What's the point of that?
It's mirrored into $00 to allow the direct page and stack to work, as the 65816 requires them to be in bank $00.
It's mirrored into $00-$3F and $80-$BF so that LoROM programs have some RAM to use no matter what the data bank register is set to.
Something that's making no sense to me: I'm trying to load the buffer position in RAM (the beginning of BSS7E) + the size of the buffer (the screen, so 256x224/4=14336), and it's saying that there's a range error, but I'm pretty sure they don't add up to 65536...
The first line is what's (somehow) problematic:
Code:
lda #InkBuffer+InkBufferSize
sta EndOfBuffer
$7E2000 is already greater than $00FFFF. Try using only the low 16 bits if necessary.
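A minimal sketch of "only the low 16 bits" in ca65 syntax, assuming the assembler already knows the accumulator is 16 bits wide at this point (via .a16 or .smart):
Code:
rep #$20 ;A=16
lda #.loword(InkBuffer+InkBufferSize) ;bits 0-15 of the 24-bit address
sta EndOfBuffer
.loword() keeps the low word of the expression and throws away the bank byte, so the immediate fits in 16 bits and the range error should go away. The result is only meaningful to code that already knows which bank it refers to.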
Oh, I was thinking about the whole first 8KB thing...
Something I never really thought about, though: how do you do 24 bit addressing? I mean, I remember on the Irem M92, you had the different registers for it. I think it was ds? Anyway, yeah.
Much as the M92's V33 (8086) has CS, DS, and SS segment registers, the SNES's 65816 has K, B, and 0 bank registers.
Could I have a rundown of what each does?
Before you get knee-deep in 65816 coding, I'd highly recommend you at least take a gander at the classic Programming The 65816, available for free download on our very own Wiki.
AFAICT, lidnariq's nomenclature is non-standard. I'm also, TBH, not exactly sure which three registers they're talking about - there's only two bank registers on the 65816, the data bank register and the program bank register. At risk of stating the obvious, the data bank is used for data accesses and the program bank can be thought of as the most significant byte of the 24-bit PC.
On the 65816, many (most?) instructions have long addressing modes that let you use a full 24-bit address. But this is slower and less space-efficient than taking advantage of the bank registers - if you use a 16-bit address, the 65816 will generate the full address by automatically tacking the relevant 8-bit bank register onto the front. This will save a byte in the instruction stream and execute in fewer cycles. Moreover, not every addressing mode for every instruction can be made long, so if you really want to take advantage of everything the chip can do you need to be bank-aware.
I think I recall reading a post here from someone who disassembled some early SNES games and found that one of the reasons for all of their slowdown was that they basically used 24-bit addresses for everything, since their programmers hadn't quite grasped how to effectively use the banks or something.
adam_smasher wrote:
AFAICT, lidnariq's nomenclature is non-standard.
While they're called the DBR and PBR, they're set using the PusH B/PulL B and PusH K/PulL K instructions respectively ...
Of course, "B" also means "the hidden half of the 16-bit A register when A is 8 bits wide" so, I think I'll just blame them for being bad at naming things.
(much like it both having the 16-bit C register and the C bit)
Or the X register and the X bit... or the D register and the D bit...
Well, I haven't done anything for a long time now toward this because I've been busy and I don't feel like reading an entire manual for one piece of information, so I made this:
Attachment:
SNES Splatoon Squid.png [ 485 Bytes ]
It didn't turn out quite as well as I thought it would, but at the same time, I have no clue how to improve it. I don't know if everything is really proportionally correct because I just looked at pictures, but I'd say it's "good enough". Better than the other shit I've seen people have drawn anyway...
I don't see any possible way to implement different clothing if I ever get there because of color and tile issues. I hate perks anyway.
I'm going to miss the hat shop though, If ya know what I mean...
Well, actually, I feel like being productive now and I downloaded the pdf for the 65816 manual, but I realized that I don't really know what I'm looking for...
However, I narrowed down my search to two chapters:
Attachment:
Table of Contents.png [ 409.87 KiB ]
When it says "long" addressing, is it talking about 24 bit?
Anyway, about artwork, I realized something, and that's that I'm not really sure how to go about something. I could either go about having a palette with the two main colors and then whatever extra for items and whatnot and then have a palette for the rest of the characters (the characters would use overlayed sprites) or I could essentially have a copy of the palette for each character but then I wouldn't have the extra half palette and I figure that if I want the character to visibly take "damage" in that the other team's color is on them, I'd need to use overlaying sprites anyway. Thinking about how many colors each character should take up (4 for hair, 4 for white shirt and black pants, 3 for skin, 3 for green colors (tank on back) 2 for red colors (also tank on back) 2 for shoes, 2 for laces) there's no way I could include the hair as part of the sprite. I would have to make 2 different copies of each character for each color, (4 really for male and female) but oh well. By the way, I'm basing how I want the character to look off this picture: (This is the smallest picture I could find)
One thing I've found out about this game is that the lighting is really weird in that things don't go from one color straight down in value, but they kind of actually change color like how orange starts to have a purple tint (It's not just the color of the tentacles, look at the shadow that's being projected on them from the gun) Speaking of the tentacles, the spots on the bottom on some pictures seem to be darker than the area they're on, and then lighter in other pictures. Even this official image isn't sure:
Just rambling on, I don't see how I'll be able to get all the gun colors in 8 palettes, (these will undoubtedly be different sprites) so I think I'll reserve 3-4 palettes and split them up to where for each gun onscreen. (there will be several copies of the guns) I'll fit the characters and the display and the special and sub weapon and tower/rainmaker palettes in to whatever is left. If this gets finished, there's going to be a lot of seemingly redundant data... 4 copies for hair (the body can probably stay the same for the girl and the boy, even if the girl has one big... difference...) 2 for the squids, 2-3 for the guns, and 16 copies of the splats (8 positions x 2 colors) The animations for the characters will be standing in 3 directions, (mirroring will be used, and there are only 8 possible directions on the d-pad so 3 copies for 3 directions) walking in 3 directions, shooting in 3 directions, strafing while shooting up in 2 directions (the game will have you start strafing when you start shooting like the real game) (backwards/left could be up/right played in reverse) shooting while strafing diagonally in 2 directions, and shooting while strafing right in 2 directions. Oh yeah, and I'm forgetting throwing weapons and jumping if I ever do that. Yeah, I'll definitely want tentacles to be their own palette so they can be their own sprite and be "free flowing". The ink palette could double as the score board. That gives me 2-3 palettes for subweapons and specials and the tower/rainmaker, and some can take the character and ink palette, like the splat bombs because they're the same color as the ink tank. Holy crap though, there's going to be beyond NES levels of sprite overlays (the hair, the body, the gun, and the ink visible in the gun).
You know, I just thought of something technical. On the one mode I always wrote off as useless (the 8bpp layer with the 2bpp one and offset per column), does it always look for the column offset, or is there a way to shut it off? I think I'll use that mode because I don't want the ink to be 4bpp (it takes too long to draw, and if I want it to look glossy but not screw up, it will take some fancy work, but most importantly, the size in vram is too big), but I want the background to be 8bpp because why not. If worse comes to worst, I'll just have it point to a table where every column has no offset and I'll still save a lot of space.
I'm sorry guys, but I couldn't help myself. I looked up "inkling" on Google images for the reference picture, and this was on the fifth line (not even a tenth of the way down the page) of pictures...
Attachment:
Uhh....png [ 164.2 KiB ]
It's amazing how I never found this stuff on the internet when I was younger...
Back in the good ol mid 2000's, when people weren't attached to their phones like life support and everything didn't beg for your money.
How do I always get so off topic? I think I have a mental disorder...
A very lame update, but I'm just here to say that I made the palette:
Attachment:
Inkling Body Palette.png [ 227 Bytes ]
However, you'll notice it's one color too many. I'm split between making the brightest shirt and skin color the same, or having only one shoe lace color (it's the blue color). Or maybe I could compromise with the darkest part of the shoe and the dark red on the ink tank. I find when doing this sort of thing, it's usually better to do it with areas that aren't close together. Actually, I don't really know what's close together because everything is overlapping because you're looking at the character from an overhead view. I probably don't need as many shades of a certain color then because it'll be hard to see fully. Whenever the character is hit with ink, I'll probably only draw the enemy ink on the hair because that'll be less sprites and less animation (the palette for both colors is the same)
I think I'll just draw it with all the colors and then see what I want to get rid of.
Actually, I really just blew everything out of proportion in regards to code. The only real thing that's a problem in regards to 24 bit addressing is this section right here:
Code:
lda InkBuffer,x
This is easy though, as it's just "absolute long indexed with X". (Apparently I got lucky, because it doesn't work with Y?) Because X is 16 bit, it can reach 65536 bytes. Multiply that by 4 (four 2-bit pixels make up one byte) for 262144 pixels, take the square root, and you get 512, so the largest possible map this way would be 512x512 or some variation of that. I'm not sure that that's large enough though.
Apparently, my memory overflow was caused by me being an idiot and doing this:
Code:
lda #InkBuffer+InkBufferSize
sta EndOfBuffer
When looking at the first code snippet I showed you, you'll notice that it's actually starting at InkBuffer, so this is screwed up:
Code:
cpx EndOfBuffer
If I just get rid of the "+InkBufferSize", there is no range error. Like I said, I'll probably edit the code to enable a bigger map, but I'm trying to take it one step at a time. I still need to upload the tiles of where the thing is in ram to vram. This is now one of those moments where I'm scared to put it all together and run it because I know there'll be something wrong...
I just realized I quadruple posted... Well, hopefully I'll post again in that I'll get it working.
Espozo wrote:
or having only one shoe lace color (it's the blue color).
Considering the shoelaces are going to be a handful of pixels at most (if even that)...
Well, I made the status bar for the most part. It was really difficult to figure out though because out of all the pictures I've seen of the game, only one was actually straight up what the Wii U was outputting and not some blurry garbage.
Anyway...
Attachment:
SNES Splatoon.png [ 2.19 KiB ]
I made it to where the body palette shares some of the same gray colors from the ink/scoreboard palette for consistency, because I just think it would look nice if the gray in the character was the same throughout. I know, I'm weird. I changed the orange palette because I originally wanted it to turn more purple later, but I looked at it on another computer and it was way too vibrant. I don't know if it was just that computer, or if it's mine, but it's not like it's difficult to change the palette later. Also, I think the "official" way the spots are supposed to be is that they're lighter than the very bottom, but darker than the top, so there's a small area where they aren't too visible. Anyway, I did this and I think it looks better.
Luckily, I have a week off for spring break so hopefully I'll be able to do more. I think I'll try and see if the code works. I didn't want to have it set up to be drawn and have it not work because then I'd freak out and wouldn't have an opportunity to fix it.
I just thought of something... I'm trying to DMA part of the buffer to VRAM, and it has the spot for the bank byte that I filled out with "^", but with the rest of the address, (the other 16 bits) how do I go about this? Just writing "InkBuffer" gives me a range error.
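One way to split the 24-bit source address for the DMA registers, as a sketch in ca65 syntax. It assumes DMA channel 0, that this runs during vblank or forced blank, that $2116 already points at the destination in VRAM, and that the assembler is tracking register widths (.smart or .a16/.a8):
Code:
rep #$20 ;A=16
lda #.loword(InkBuffer) ;bits 0-15 of the source address
sta $4302 ;A1T0L/A1T0H
lda #InkBufferSize ;number of bytes to copy
sta $4305 ;DAS0L/DAS0H
lda #$1801 ;$4300=$01 (alternate between two registers), $4301=$18 (so $2118/$2119)
sta $4300
sep #$20 ;A=8
lda #^InkBuffer ;bits 16-23, the bank byte
sta $4304 ;A1B0
lda #$80
sta $2115 ;VMAIN: step the VRAM address after each $2119 write
lda #$01
sta $420B ;MDMAEN: start channel 0
The range error comes from handing the full 24-bit label to a 16-bit immediate. .loword() and ^ each pick out the part that fits its register.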
Actually I figured I could just look at the values in ram, but what I found out is that although $7E0000 is supposed to be the start of the buffer because it's the first thing in "BSS7E", there are random, constantly changing values that start right there. I thought that maybe the drawing code was jacked up, but I got rid of the jump to there and it didn't change anything in that the weird values were still there.
I don't know what's up with this either:
Attachment:
SNES RAM.png [ 12.26 KiB ]
Again though, nothing else should be in $7E0000:
Code:
;======================================================================
.segment "ZEROPAGE"
;======================================================================
ColorCounter: .res 2
Joy1Data: .res 2
Joy2Data: .res 2
Joy1Press: .res 2
Joy2Press: .res 2
NewObjectRequest: .res 2
ObjectOffset: .res 2
SpriteCount: .res 2
MetaspriteCount: .res 2
HFlipMask: .res 2
VFlipMask: .res 2
MetaspriteAttributes: .res 2
TilesDrawnHorizontally: .res 2
FullTilesDrawnVertically: .res 2
SplatBufferPosition: .res 2
SplatSubTileYPosition: .res 2
SplatGraphicOffset: .res 2
SplatTileWidth: .res 2
SplatFullTileHeight: .res 2
SplatDataHeight: .res 2
DataPerRowInBuffer: .res 2
EndOfBuffer: .res 2
;======================================================================
.segment "BSS"
;======================================================================
ObjectTableEntryNumber= 32
ObjectTableEntrySize= .sizeof(ObjectTableSlot)
ObjectTable: .res ObjectTableEntryNumber * ObjectTableEntrySize
ObjectTableSize= *-ObjectTable
SpriteBuf1: .res 512
SpriteBuf2: .res 32
SpriteBuf3: .res 512
SplatRequestTableEntryNumber= 1
SplatRequestTableEntrySize= .sizeof(SplatRequestTableSlot)
SplatRequestTable: .res SplatRequestTableEntryNumber * SplatRequestTableEntrySize
SplatRequestTableSize= *-SplatRequestTable
;======================================================================
.segment "BSS7E"
;======================================================================
InkBuffer: .res 14336
InkBufferSize= *-InkBuffer
;======================================================================
.segment "BSS7F"
;======================================================================
;======================================================================
.segment "RODATA"
;======================================================================
I even disabled my object routine to see if that was causing the problem, but it wasn't either. (No objects are actually being drawn, it's just changing the background color every frame. I wanted to see if it properly jumped correctly.)
$7E0000-$7E1FFF is a mirror of $000000-$001FFF. For this reason, the BSS7E segment in my own linker script starts at $7E2000.
Ah, that explains it. I had started to enter panic mode.
Anyway, it doesn't work, but it doesn't crash either (which I've known for a while). It must not be getting to the actual drawing part. I'm really not good at debugging, so this is the part where I just put a brk somewhere and see if it crashes.
(To see if that part of the code is even read.)
Okay, this makes absolutely no sense. I was trying to make the SNES freeze to see if it's going there, and I couldn't make it with the drawing code, even when I made the first line that's supposed to be run "brk", so I was seeing if it even approaches the jump and I couldn't even get that to work until I moved the "brk" higher up.
This freezes it:
Code:
lda #$0040
brk
sta DataPerRowInBuffer
lda #InkBufferSize
sta EndOfBuffer
jsr start_draw_splat
While this doesn't:
Code:
lda #$0040
sta DataPerRowInBuffer
brk
lda #InkBufferSize
sta EndOfBuffer
jsr start_draw_splat
I'm completely stumped. If any of you want to figure out what's wrong, here's the code. It really isn't much aside from the ink drawing code, just some code that identifies objects and jumps to the correct routine.
Attachment:
Espozo's Failure.zip [207.2 KiB]
Downloaded 75 times
In both cases, the BRK interrupt is returned from immediately, so it doesn't actually "do" anything on its own. The problem is that BRK is supposed to be treated as a two-byte instruction, but cc65 only represents it as one byte in the actual generated code, so once you return from the interrupt, the first byte of the next instruction is skipped and you're basically executing garbage. What happens then depends on what the next instructions are supposed to be.
If you want the BRK instruction to actually freeze the game, you should make it lead to an infinite loop rather than an RTI. Or ideally, use an emulator that supports breakpoints (which it looks like you're already doing) and use those instead.
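A sketch of both suggestions, assuming ca65 emits a single byte for a bare brk as described above. BrkHandler is a hypothetical label that would have to be installed as the native-mode BRK vector at $00FFE6:
Code:
lda #$0040
brk
.byte $00 ;signature byte, so the return address lands on the next instruction
sta DataPerRowInBuffer

BrkHandler: ;hypothetical handler that hangs instead of returning
bra BrkHandler
With the padding byte in place, a handler that does return with rti resumes at the sta rather than in the middle of it. With the looping handler, the game visibly freezes as soon as the brk is reached.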
no$sns will automatically open the debugger on STP. I believe Snes9XW treats it as a breakpoint too. No idea about bsnes-plus; I haven't used it yet...
The latest version of bsnes-plus has the option to use the otherwise-unused WDM instruction as a breakpoint, so it will only affect execution if that option is enabled but is basically a no-op otherwise. It's probably ideal for development compared to "normal" breakpoints since you don't have to know exactly where the code of interest is located in memory.
Revenant wrote:
If you want the BRK instruction to actually freeze the game, you should make it lead to an infinite loop rather than an RTI. Or ideally, use an emulator that supports breakpoints (which it looks like you're already doing) and use those instead.
93143 wrote:
no$sns will automatically open the debugger on STP.
Oh, that's what I wanted to do. I thought "BRK" would just stop the CPU in whatever way, but "STP" is what does that. Yeah, it freezes both times now, like intended. I actually went through the splat drawing code, and somehow, it goes through the entire thing and exits at the right spot (I put STPs at all the exits except the intended one, and I tested to see how far it would go by moving how far I placed the STP) but nothing gets drawn. I'm going to have to look over it carefully. I don't think I've ever made a code work not as intended but without having it crash, so this is new to me.
I think I might have found something out. I've actually been busy even during spring break, so I haven't been able to work on this...
Anyway, instead of
Code:
lda InkBuffer,x
and $0000,y
ora $0002,y
sta InkBuffer,x
I wrote
Code:
lda #$FFFF
sta InkBuffer,x
and looked at $7E2000 to see if anything showed up, but somehow, the game actually crashed. I looked at the spot in ram, and it appeared to be empty, and I then looked around $0000 and it was filled with FFFF. I'm assuming by some miracle that it worked originally, because I thought that "and #$0000" would make everything there 0, and I was speculating that the "y" value was incorrect and not getting the right data but the weirdest thing is that it's crashing even when I wrote
Code:
lda #$0000
sta InkBuffer,x
Anyway though, it really shouldn't be writing anything in that location anyway if long addressing is being used, which I feel it isn't. Is there a way to force it?
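ca65 does have address-size prefixes for this: z: forces direct page, a: forces 16-bit absolute, and f: forces 24-bit far addressing. A minimal sketch, assuming InkBuffer really is linked into bank $7E:
Code:
lda f:InkBuffer,x ;absolute long indexed with X, 4-byte instruction (opcode $BF)
and $0000,y ;still 16-bit absolute,y, resolved through the data bank register
ora $0002,y
sta f:InkBuffer,x ;long store (opcode $9F)
As far as I know, the segment can also be declared far (.segment "BSS7E" : far) so ca65 chooses the long form on its own, but treat that as something to confirm in the ca65 manual rather than a tested fix.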
Anyway, apparently to add to my confusion, take a look at this:
Attachment:
Writing $0000 vs. writing $FFFF.png [ 25.41 KiB ]
None of those #$87's should be there, and only two bytes should be filled out at most. I'm assuming some other code is looking at the variables in ram that are now incorrect and is freaking out. A couple values actually get updated every frame, surprisingly.
I found something. Putting an "stp" at the end of the drawing code (so that no other code gets run) still produces the same garbage in ram, so it must be something self contained. Isn't there a way to make ca65 give you files that say exactly what the assembler is doing? It makes zero sense for it to be screwing up when loading a number there. Putting an "stp" after the first thing I did in ram doesn't appear to create the weird crazy patterns I showed you guys, but it might be messing up a variable that's triggering a chain reaction. Again, I need the files to tell me if it's the assembler's fault, and I've already had one of those with a different assembler...
Damn it, does anyone else ever get problems like this?
Edit: I thought it would be wise to upload the .bat file I use for assembling everything, so if anyone knows how to fix it to give me .lst files, here it is:
Code:
@echo off
ca65 Main.asm -o Game.o
if errorlevel 1 goto end
ld65 -o Game.sfc Game.o -C SNES.cfg
if errorlevel 1 goto end
ucon64 -q --snes --nhd --chk %1.sfc >nul 2>&1
echo assembly completed.
if exist Game.o del Game.o
if exist Game.sfc start Game.sfc
exit
:end
echo assembly failed.
if exist Game.o del Game.o
I have two .exe files called ca65 and ld65, if that helps. It looks like several people have already downloaded the .zip file I posted though.
Espozo wrote:
Code:
ca65 Main.asm -o Game.o
Code:
ca65 -l Main.asm -o Game.o
My guess is that you're pushing a register without popping it, building up values on the stack. Checked for that?
nicklausw wrote:
My guess is that you're pushing a register without popping it, building up values on the stack. Checked for that?
I just checked for that, and no, there's no pushing or pulling involved. That shouldn't even mess it up anyway, because it's only run once. The "lda #$0000" definitely shouldn't mess it up.
Joe wrote:
ca65 -l Main.asm -o Game.o
I tried that but it didn't work. I was actually messing around with it, and I fixed it though by guessing:
Code:
ca65 Main.asm -o Game.o -l Game.lst
Unfortunately, it only makes one file with everything together, but it still works.
I just noticed something. How did I know this would happen...
Code:
0003BFr 2 BD rr rr lda InkBuffer,x
There's kind of an "rr" missing, because "InkBuffer" is in 7EXXXX... Is there a way to save a cycle or two by only having one "rr" for the top 4 of the 24 bits and having the other bottom 8 bits just be offset by x?
If the data segment register B is equal to $7E, you can access variables in $7E0000-$7EFFFF using 2-byte addresses. That might save a cycle.
How would I do this? I just don't want to do long addressing when I don't need to, and the assembler seems to think that InkBuffer fits in a 16 bit address...
Code:
phb
; 8-bit accumulator (EDIT)
lda #$7e
pha
plb
; do stuff with bank $7e
lda $2100 ;wow mom, I loaded $7e2100 !!
plb ; restore old databank
Of course you will be losing cycles unless you can justify the additional stack-processing
Don't forget the accumulator needs to be in 8-bit mode (sep #$20 minimum) for aforementioned to work properly, otherwise you'll eventually end up with a stack overflow. So be sure to add that to your cycle counting, in addition to any necessary rep #$20 or equivalent thereafter.
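To put a number on that stack problem, here's a rough sketch that only counts the bytes pushed and pulled by the phb / pha / plb ... plb sequence above. It's Python rather than 65816, and `stack_drift` is a made-up helper name for illustration, not part of any real tool.

```python
def stack_drift(a_is_16bit: bool, calls: int) -> int:
    """Bytes left behind on the stack after running the sequence `calls` times.

    Sequence modeled: phb, pha, plb, (work), plb.
    """
    pushed = 1                         # phb always pushes one byte
    pushed += 2 if a_is_16bit else 1   # pha pushes two bytes when A is 16-bit
    pulled = 1 + 1                     # each plb pulls exactly one byte
    return (pushed - pulled) * calls
```

With an 8-bit accumulator the pushes and pulls balance. With a 16-bit accumulator one byte is left behind on every pass, which is how the stack eventually runs out.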
One trick I like to do is pea two bytes onto the stack and then plb one at a time.
Wait a minute, if the bank register is set to anything other than #$00 and I want to load from something from an area in ram where the top 8 bits are #$00, then won't I have to do "a:" and take up an extra cycle for that?
This is what I'm doing, if it helps. It's taking 8 2bit pixels, masking them, applying a pattern, and then storing them in the original spot. Each step is its own line.
Code:
lda InkBuffer,x
and $0000,y
ora $0002,y
sta InkBuffer,x
The RAM at $0000-$1FFF is mirrored in both bank $00 and bank $7E.
Exhaustively:
The only RAM as such in bank $00 (other than a handful of sometimes-useful bytes in the $43xx range) is what some call shadow RAM, the mirror of the bottom 8K of WRAM. If you're in any bank from $00-$3F or $80-$BF, shadow RAM works normally. If you're accessing bank $7E, shadow RAM still works normally, because $7E:0000-1FFF is what shadow RAM is a mirror of.
If you're in bank $7F or $40-$7D or $C0-$FF, you don't have access to shadow RAM. You can use direct page, because direct page is always in bank $00. But if what you want isn't in direct page, you have to either change the data bank register to access a bank with shadow RAM, change the direct page register so the data you want is in range, or use 24-bit addressing.
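The bank rules above boil down to a small check. This is only a sketch of the description in this post (Python, with a hypothetical function name), not something taken from an emulator or from official docs.

```python
def sees_low_ram(bank: int) -> bool:
    """True if a 16-bit access to $0000-$1FFF with this data bank reaches WRAM."""
    if not 0x00 <= bank <= 0xFF:
        raise ValueError("bank must be in the range $00-$FF")
    if bank == 0x7E:
        return True               # $7E:0000-1FFF is the RAM that gets mirrored
    return (bank & 0x40) == 0     # $00-$3F and $80-$BF carry the mirror
```

Banks $40-$7D, $7F and $C0-$FF all come back False, which are the cases where you need direct page, a different data bank, or 24-bit addressing.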
Espozo wrote:
load from something from an area in ram
Wait a minute, I'm an idiot, because I meant I want to load something from an area in rom. That's what the
Code:
and $0000,y
ora $0002,y
are for. If it were offset by #$7E, it would mess it up. Actually, looking at it, I just noticed I have to have all the splat graphics in one memory bank. I guess that shouldn't be too bad. I'm going to have to go through all this code at some point anyway for if there's extra ram in the cartridge.
Man, now looking at it, the SNES memory map is awfully complicated...
If you want to access both ROM and RAM without using any sort of 24-bit addressing, then you'll need to ensure that the ROM data is in the top half of bank $80-$BF and the RAM data is in the shadow RAM area ($7E0000-$7E1FFF). If you cannot arrange this, then you must use 24-bit addressing for either the source or the destination.
Are you using LoROM ($20/$30) or HiROM ($21/$31)? Or do you want the details explained for both?
I want the details explained for both...
Isn't HiROM bigger?
I wonder if I'm ever going to understand SNES memory mapping... Every time you guys talk about it, it sounds completely alien to me! It looks like everything is so broken up, it's insane! So yeah, I'm kinda interested in an explanation too!
The SNES's two different memory maps were approximately the two ways that N thought that people might want to use the SNES:
Mode 20 ("LoROM") assumed that people wanted to pretend that the SNES acted approximately like the NES's BNROM: shared resources (RAM from $0000-$1FFF, PPU, sound, DMA, on-cartridge save RAM) were always available in the lowest 64 banks. (Or, if the person paid the then-premium for faster memory, banks $80-$BF).
Mode 21 ("HiROM") was a forward-looking way of thinking about it, assuming that people were actually going to take advantage of the 65816's larger address space, with 24-bit pointers and remapping the direct page.
Of course, since the 65816 has both the PBR and DBR, this division is a little artificial. And the small handful of Mode 20 games that are larger than 2 MiB don't even get the advantages of LoROM in those upper banks (where $40 ≤ DBR ≤ $7D )
In LoROM (mode $20 or $30), PRG ROM runs from $808000 through $80FFFF, $818000 through $81FFFF, $828000 through $82FFFF, ..., $FF8000 through $FFFFFF, for a total of up to 32 megabits (4 MiB). A23 and A15 are ignored. Banks $80-$FD are mirrored into $00-$7D. If the data bank register is set to bank $00-$3F or $80-$BF, the program can access shadow RAM and a ROM bank from the first 16 megabits (2 MiB) at once without using 24-bit addresses.
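As a sketch of what "A23 and A15 are ignored" means for LoROM, this maps a 24-bit CPU address to an offset in the ROM image. It assumes a headerless image and does not check whether the address actually lands on RAM or registers; `lorom_offset` is a hypothetical name.

```python
def lorom_offset(cpu_addr: int) -> int:
    """Offset into the ROM image for a 24-bit CPU address under LoROM."""
    bank = (cpu_addr >> 16) & 0x7F    # drop A23: $80-$FF behave like $00-$7F
    low15 = cpu_addr & 0x7FFF         # drop A15: only 32 KiB per bank is ROM
    return (bank << 15) | low15
```

So $808000 and its mirror $008000 both land on offset 0, and each bank advances the offset by $8000.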
In HiROM (mode $21 or $31), PRG ROM runs from $C00000 through $FFFFFF, for a total of up to 32 megabits (4 MiB). A23 and A22 are ignored. Banks $C0-$FD are mirrored into $40-$7D, and the second half of each bank is mirrored into $00-$3F and $80-$BF. If the data bank register is set to one of these banks, the program can access shadow RAM and the second half of a ROM bank at once without using 24-bit addresses.
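The HiROM case is simpler to sketch, since ignoring A23 and A22 is just a mask. Same caveats as any toy model: a headerless image is assumed, nothing checks whether the address is really ROM, and `hirom_offset` is a made-up name.

```python
def hirom_offset(cpu_addr: int) -> int:
    """Offset into the ROM image for a 24-bit CPU address under HiROM."""
    return cpu_addr & 0x3FFFFF    # drop A23 and A22, keep the low 22 bits
```

This also shows why a header placed at $C0FFC0 is visible at $00FFC0: both addresses reduce to offset $00FFC0.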
In ExHiROM (mode $25 or $35), PRG ROM runs from $C00000 through $FFFFFF, $400000 through $7DFFFF, $3E8000 through $3EFFFF, and $3F8000 through $3FFFFF, for a total of up to 63.5 megabits (just under 8 MiB). A22 is ignored. The second half of each bank $C0-$FF is mirrored into $80-$BF, and the second half of each bank $40-$7D is mirrored into $00-$3D. If the data bank register is set to one of these banks, the program can access shadow RAM and the second half of a ROM bank at once without using 24-bit addresses.
The difference between $20 and $30, $21 and $31, and $25 and $35 is that $30, $31, and $35 can take advantage of faster 120 ns ROM, which was more expensive at the time than the common 200 ns ROM. The S-CPU runs at 3.6 MHz when accessing fast ROM or 2.7 MHz when accessing RAM or slow ROM. In ExHiROM, only the area from $C00000-$FFFFFF (including its mirrors in $80-$BF) is fast. DMA is always slow (2.7 MHz), so if a game DMAs from ROM, it can put the data used as a DMA source in the slow part and gain the benefit of fast ROM for the rest of the ROM.
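For where the 3.6 and 2.7 MHz figures come from, here's a small calculation. It assumes the NTSC master clock of about 21.477 MHz, with 6 master cycles per fast access and 8 per slow access; neither number is stated in the post above, so treat them as assumptions.

```python
MASTER_CLOCK_HZ = 21_477_272   # assumed NTSC master clock, not from the post


def cpu_mhz(master_cycles_per_access: int) -> float:
    """Effective CPU rate in MHz for a given number of master cycles per access."""
    return MASTER_CLOCK_HZ / master_cycles_per_access / 1_000_000


FAST_MHZ = cpu_mhz(6)   # fast ROM access
SLOW_MHZ = cpu_mhz(8)   # RAM and slow ROM access
```

Rounded to one decimal place these give the 3.6 and 2.7 quoted above.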
I made a diagram several months ago.
That's confusing as fuck. If I was ever gonna jump into 16-bit development, I'd probably choose the Genesis/MD, which appears to have a much more consistent architecture, without a shit-ton of possible configurations, from memory mappings to video modes. Is there at least a consensus on what's better to do on the SNES these days, when cost isn't such a big factor?
-Banks $00-$3f and $80-$bf contain both ROM and RAM.
-Banks $40-$7d and $c0-$ff are ROM only.
-Banks $7e and $7f are RAM only.
tokumaru wrote:
That's confusing as fuck. If I was ever gonna jump into 16-bit development, I'd probably choose the Genesis/MD, which appears to have a much more consistent architecture, without a shit-ton of possible configurations, from memory mappings to video modes.
I'm beating a dead horse here, but the color count kills it for me. Honestly, I think the onscreen color count on the SNES is kind of mediocre, especially when you see an arcade system from 1985-1986 or so that's inferior to the SNES in every way except it has over twice the number of colors. Hell, no need to look further than the PCE, and it was a home console. Anyway, if I were to go for a "balanced" 16 bit system, I would have started out with the M92.
(Unfortunately though, it has piss poor documentation...)
tokumaru wrote:
Is there at least a consensus on what's better to do on the SNES these days, when cost isn't such a big factor?
Definitely FastROM. (You'd be a sucker not to use it.) I don't know about LoROM vs. HiROM, but I'd probably use HiROM because it's easy to switch over to ExHiROM if you find yourself running out of memory with graphics or whatever. (Although I can't imagine myself ever filling up over 4MB worth of graphics at the rate I draw... Better be safe than sorry?)
You know tepples, I was always under the impression that you made this (SNES.cfg) and although I think it's meant for LoROM, will it work for HiROM?
Code:
# ca65 linker config for 256 KiB (2 Mbit) sfc file
# Physical areas of memory
MEMORY {
ZEROPAGE: start = $000000, size = $0100; # $0000-00ff -- zero page
# $0100-01ff -- stack
BSS: start = $000200, size = $1e00; # $0200-1fff -- RAM
BSS7E: start = $7e2000, size = $e000; # SNES work RAM, $7e2000-7effff
BSS7F: start = $7f0000, size = $10000; # SNES work RAM, $7f0000-$7fffff
ROM0: start = $008000, size = $8000, fill = yes;
ROM1: start = $018000, size = $8000, fill = yes;
ROM2: start = $028000, size = $8000, fill = yes;
ROM3: start = $038000, size = $8000, fill = yes;
ROM4: start = $048000, size = $8000, fill = yes;
ROM5: start = $058000, size = $8000, fill = yes;
ROM6: start = $068000, size = $8000, fill = yes;
ROM7: start = $078000, size = $8000, fill = yes;
}
# Logical areas code/data can be put into.
SEGMENTS {
CODE: load = ROM0, align = $100;
RODATA: load = ROM0, align = $100;
SNESHEADER: load = ROM0, start = $ffc0;
CODE1: load = ROM1, align = $100, optional = yes;
RODATA1: load = ROM1, align = $100, optional = yes;
CODE2: load = ROM2, align = $100, optional = yes;
RODATA2: load = ROM2, align = $100, optional = yes;
CODE3: load = ROM3, align = $100, optional = yes;
RODATA3: load = ROM3, align = $100, optional = yes;
CODE4: load = ROM4, align = $100, optional = yes;
RODATA4: load = ROM4, align = $100, optional = yes;
CODE5: load = ROM5, align = $100, optional = yes;
RODATA5: load = ROM5, align = $100, optional = yes;
CODE6: load = ROM6, align = $100, optional = yes;
RODATA6: load = ROM6, align = $100, optional = yes;
CODE7: load = ROM7, align = $100, optional = yes;
RODATA7: load = ROM7, align = $100, optional = yes;
ZEROPAGE: load = ZEROPAGE, type = zp;
BSS: load = BSS, type = bss, align = $100, optional = yes;
BSS7E: load = BSS7E, type = bss, align = $100, optional = yes;
BSS7F: load = BSS7F, type = bss, align = $100, optional = yes;
}
It would need to be modified. But right now, I'm running a bit of a low grade fever, which means I'm not in the mental state to make and test a HiROM.
If you want to try yourself, change the memory area sizes to $10000, the memory area starts to $C00000, $C10000, ..., and the SNESHEADER start to $C0FFC0.
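To make the "$C00000, $C10000, ..." pattern concrete, here's a sketch that prints the eight ROM lines for the MEMORY block. It's a Python helper with a made-up name, and the output has not been tested against ld65, so take it as a starting point only.

```python
def hirom_memory_lines(banks: int = 8) -> list[str]:
    """Build one ld65 MEMORY line per 64 KiB HiROM bank, starting at $C00000."""
    lines = []
    for n in range(banks):
        start = 0xC00000 + n * 0x10000    # each area begins one full bank later
        lines.append(f"ROM{n}: start = ${start:06X}, size = $10000, fill = yes;")
    return lines


if __name__ == "__main__":
    print("\n".join(hirom_memory_lines()))
```

The point is that every area gets its own start address; if two areas share a start, the linker places their contents at the same CPU addresses.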
Dang, I switched it, but I forgot how it would give me a bunch of range errors...
Why do all the starter demo stuff use LoROM? I guess they assume that everyone who works on the SNES has started with the NES, but I'm not under the impression that any of the people working on the SNES have worked on the NES.
Anyway, this is my first (and hopefully only) problem it's given me. It's code for jumping to appropriate object code. Direct page is being moved here, which is why "Identity" isn't being offset by anything. It loads "identity", first checks to see if it's 0 (no object), if not, gets the value and offsets the table by it, and loads that value in the table it landed on which is the address of where the code starts. I'm still not entirely sure how "()" works, but I guess "jsr Whatever" and "jsr (Whatever)" are different instructions, because one is jumping to the posted address and the other is jumping somewhere based on the value at that address?
Code:
object_identifier_loop:
lda Identity
beq next_object
tax
jsr (ObjectIdentificationJumpTable-2,x)
Further down...
;========================================================================
.segment "RODATA"
;========================================================================
ObjectIdentificationJumpTable:
.word object1_code
Could I just get the Identity, have it be 24 bit, and somehow transfer the value to the program counter directly? Actually, how would I even do this in one go? You can't transfer 24 bits at once, and that's kind of important here. Yeah, it was a dumb idea anyway...
Espozo wrote:
Dang, I switched it, but I forgot how it would give me a bunch of range errors...
Why do all the starter demo stuff use LoROM?
Because I flipped a coin.
Quote:
I'm still not entirely sure how "()" works, but I guess "jsr Whatever" and "jsr (Whatever)" are different instructions, because one is jumping to the posted address and the other is jumping somewhere based on the value at that address?
Yes. If $FFFC-$FFFD have values $5E $A0, then JMP $A05E and JMP ($FFFC) do the same thing.
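A toy model of that difference, with memory as a plain Python dict. The function names are invented, and the table part is only my reading of the `jsr (ObjectIdentificationJumpTable-2,x)` line quoted above (identities of 2, 4, ... with 0 meaning "no object"); bank handling is left out entirely.

```python
def read_word(mem: dict[int, int], addr: int) -> int:
    """Read a little-endian 16-bit value from two consecutive bytes."""
    return mem[addr] | (mem[addr + 1] << 8)


def jmp_indirect(mem: dict[int, int], pointer: int) -> int:
    """jmp (pointer): the target is the word stored AT pointer."""
    return read_word(mem, pointer)


def jsr_indexed_indirect(mem: dict[int, int], table: int, x: int) -> int:
    """jsr (table,x): the target is the word stored at table + x."""
    return read_word(mem, (table + x) & 0xFFFF)
```

With $5E at $FFFC and $A0 at $FFFD, `jmp_indirect(mem, 0xFFFC)` gives $A05E, the same place a plain `jmp $A05E` goes. With the table base passed in as table-2, an identity of 2 picks the first `.word` entry.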
Quote:
[Jump table stuff]
Could I just get the Identity, have it be 24 bit, and somehow transfer the value to the program counter directly? Actually, how would I even do this in one go? You can't transfer 24 bits at once, and that's kind of important here. Yeah, it was a dumb idea anyway...
You could try pushing the 24-bit address minus 1 on the stack and using the RTS trick, like the Apple IIGS Toolbox does. (Caution: The minus 1 part has to be done modulo 65536, as execution wraps within a program bank.)
Espozo wrote:
Why do all the starter demo stuff use LoROM?
I imagine Super Mario World is LoROM? Because I imagine a lot of people would be in it to hack SMW instead of making games from scratch.
tepples wrote:
Because I flipped a coin.
Really?
tepples wrote:
You could try pushing the 24-bit address minus 1 on the stack and using the RTS trick, like the Apple IIGS Toolbox does.
That's actually pretty ingenious...
tepples wrote:
(Caution: The minus 1 part has to be done modulo 65536, as execution wraps within a program bank.)
I don't have a clue as to what that means, but I'm guessing that this can just be static? I mean, for the "Identity" of "Object1", it would just be "#object1_code-1". (or something like that. I don't know how it'll tell if the -1 is part of the name or not.)
Sik wrote:
I imagine a lot of people would be in it to hack SMW instead of making games from scratch.
But, I mean, we're talking about things like the SNES
Starter Kit.
Has anyone ever made a sort of reassembly of a game to where they've made assembly code that completely recreates the game just how it was made? I guess different assemblers have different preferences as to where they want to map everything and memory and stuff like that though.
Espozo wrote:
tepples wrote:
(Caution: The minus 1 part has to be done modulo 65536, as execution wraps within a program bank.)
I don't have a clue as to what that means, but I'm guessing that this can just be static? I mean, for the "Identity" of "Object1", it would just be "#object1_code-1". (or something like that. I don't know how it'll tell if the -1 is part of the name or not.)
he means if your symbol is at $040000 then the math must say $04FFFF instead of $03FFFF. most programs use % for modulus, but you don't really need to actually use it. just do the subtraction on the lower 16 bits and leave the bank byte unmodified.
I'm all for sharing neat tips and tricks but I feel like Espozo is in over his head.
bazz wrote:
just do the subtraction on the lower 16 bits and leave the bank byte unmodified.
I know how you'd do this in software, but how would you write it in the assembler (ca65)? There's no sense in doing something at runtime that doesn't have to be.
bazz wrote:
most programs use % for modulus, but you don't really need to actually use it.
You're saying I could just write "%" before the value of the address? I thought this was for telling the assembler that the value is to be interpreted as binary. I'm not sure what "modulus" are.
bazz wrote:
I'm all for sharing neat tips and tricks but I feel like Espozo is in over his head.
It's not like anyone is obligated to try and help me.
Espozo wrote:
You're saying I could just write "%" before the value of the address?
No. He meant that in many programming languages, "%" is the symbol for the modulo operation, just like "+" is for addition and "*" is for multiplication.
Quote:
I'm not sure what "modulus" are.
The modulo operation finds the remainder of a division. In binary, much like division and multiplications by powers of 2 can be quickly done through shifting, you can find the modulo by a power of 2 with an AND operation. For example, to divide a number by 8, you shift it right 3 times, and to find the modulo you AND it with %00000111, and you'll get a remainder ranging from 0 to 7.
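A quick way to convince yourself of the shift/AND equivalence is to compare it with ordinary division. This is a Python sketch with an invented helper name, and it only holds for non-negative numbers and power-of-2 divisors.

```python
def div_mod_pow2(n: int, shift: int) -> tuple[int, int]:
    """Divide n by 2**shift using a shift, and get the remainder using an AND."""
    mask = (1 << shift) - 1      # shift=3 gives %00000111
    return n >> shift, n & mask
```

For example, `div_mod_pow2(29, 3)` gives the same pair as `divmod(29, 8)`.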
When tepples said you had to do the minus 1 part modulo 65536, he meant that you should keep the math restricted to the lower 16-bits of the address, preventing any bits above that from being affected.
As for how to do it in ca65... maybe (address & $ff0000) | ((address - 1) & $ffff)? Unless ca65 has some feature I'm not aware of specifically implemented to handle this case.
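The same expression can be checked outside the assembler. This is just the formula above rewritten in Python under a made-up function name; it says nothing about how ca65 itself evaluates it.

```python
def return_trick_address(address: int) -> int:
    """Subtract 1 from the low 16 bits of a 24-bit address, leaving the bank alone."""
    bank = address & 0xFF0000
    low = (address - 1) & 0xFFFF    # wraps $0000 to $FFFF instead of borrowing
    return bank | low
```

A symbol at $040000 comes out as $04FFFF rather than $03FFFF, which is the case the caution is about. Any address whose low word isn't zero just drops by one.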
I used ca65 for SNES years ago and I recall being able to separate symbols into their constituent pieces. RTFM
.BANKBYTE
.LOWORD
ie lda #.LOWORD(foo) - 1
16-bit accum implied..
The next time you don't know how to do something, I highly recommend taking a look at the manual and learning at least one new thing while trying to find your target matter. better yet, gloss over the entire manual for a few days with the direct purpose of gaining a high understanding of how everything works. that's what I did when I first learned.
you still don't need to do any actual modulus operations, but that's EXPECTED to be obvious to you. if it is not, then you don't really understand what a modulus is -- it is a cycling constrainer .. think of a byte - you know it can only go from 0-255.. if you try to go 256 it just becomes zero. that phenomenon is mathematically identical to a % 256 operation.
ie 256 % 256 = 0
257 % 256 = 1
but it is a little different since we can have great input values
513 % 256 = 1
the value to the right of the % is defining the range of the modulus. subtract 1 and you get the largest eligible value; the results start from 0. basically, the input value is divided by the modulus value, and you are given the remainder.
ie 512 % 256
512 / 256 = 2 r 0
therefore, 512 % 256 = 0
54 % 10
54 / 10 = 5 r 4
therefore, 54 % 10 = 4
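The byte wraparound described above can be written out directly. A small Python sketch (`wrap_byte` is a hypothetical name) showing that keeping only 8 bits is the same thing as % 256:

```python
def wrap_byte(value: int) -> int:
    """Keep a value in the 0-255 range the way an 8-bit register would."""
    return value % 256


def wrap_byte_and(value: int) -> int:
    """Same result using an AND mask, since 256 is a power of 2."""
    return value & 0xFF
```

Both versions agree for any non-negative input, and the worked examples above (256, 257, 513, 512) all fall out of `wrap_byte`.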
Anyway, we don't need to manually use any modulus symbols since we separated our arithmetic to the lowest 16 bits with .LOWORD, making a 16-bit modulus automatically. just load the bank byte yourself and you'll be on your merry way -
any self-respecting programmer will also ensure that their assembler outputs #$FFFF when performing the-operation at-the-top-of-this-post -- on a symbol whose bottom 16-bits is 0000. That is just to fill in the hole between desired behavior and actual behavior -- this is important to do when you try to use newly acquired knowledge -- in my experience I have developed a sense of how many assumptions I am making as I go along -- and to verify those assumptions at the most ideal times (altogether now!)
also tokumaru, your reaction to the SNES memory map, although valid, was based on a terse textual description.. I am overwhelmed at the idea of impressing upon you how simple it seems to me, but I believe this page is the source of my happiness relating to understanding the SNES memory map - relating to how it maps the cartridge particularly. Bear in mind that there is a certain harmony of knowledge in software and hardware that facilitates a simple understanding of the matter. That, I earned over years and years of patience and a hardcore deliberation to master an understanding of what is required to make a game on the SNES. it was a childhood dream, for me. and it has led into many other relevant areas.
I would be hard-pressed to deliver that amount of knowledge to you in a brief moment, but I can say that I looked up the Sega Mega Drive memory map, and the fact that it allots a completely linear chunk of space to the cartridge looks very comfortable. Of course, I do not have as vast an understanding of architecture design to be able to consider potential pros/cons of a linear model and non-linear model.. I could only see that the architecture was undoubtedly designed around the CPU features, at least on the SNES.. Whether the SNES CPU was well designed, I cannot say.. I am not sure exactly why certain hardware is mapped where - except probably to accommodate the Direct Page of the CPU...
I had to stop learning 65816 in favor of z80 when I was young so that I could digest easier. z80 on ticalc and Gameboy helped act as a stepping stone to SNES understanding. I would not be surprised if anyone else besides yourself needs a stepping stone. I seem to have not appealed to your interest in SNES. I have no more energy to properly tie the conclusion to this post. good night.
I'm getting in way over my head so I'm just going to sit back now...
That table is still a horrible mess.
The biggest problem seems to be $00~$3F and $80~$BF (and when you think about it, that looks a lot like the NES, with RAM in the bottom 8KB range and cartridge in the top 32KB range). Running off the other banks seems much easier, but I wonder how that affects port accesses.
"problem" is a funny thing to call it -- I mean I believe it's setup this way to simplify the programmer's job -- word access to RAM and system registers in the execution bank - saves opcodes and thought.
If anything I figure the banks that the cart has complete access to, if even used (Super Mario World only uses banks 00-0F), would probably just store static assets to the game, whether that is LUT, gfx, sfx/music, you get the idea.
Code can go in there, it's just non-ideal since it requires extra manipulation of Direct Page (DP) and Data Bank Register (DBR) to access the world outside of that bank. Therefore, simple library routines that make little to no use of external RAM/PPU/CPU registers are best.
I definitely want to see myself using the DP more ingeniously... I nearly forgot its role
I'm an idiot, but RAM really shouldn't change position when using HiROM, should it? I'm just perplexed because RAM for me is full of #$55, even when I made the first instruction be "stp", in case my code no longer worked correctly.
This is what I did, and it looks good to me. (it's for identifying objects and jumping to their code.) Of course, like I said, it appears to crash before it even gets here.
Setting up the object:
Code:
sep #$20 ;A=8
lda #.BANKBYTE(object1_code)
sta ObjectTableSlot::Identity
rep #$30 ;A=16, X/Y=16
lda #.LOWORD(object1_code-1)
sta ObjectTableSlot::Identity+1
Jumping to the object's code:
Code:
sep #$20 ;A=8
lda ObjectTableSlot::Identity
pha
rep #$30 ;A=16, X/Y=16
lda ObjectTableSlot::Identity+1
pha
rts ;note: rts only pulls the 16-bit address; rtl is what also pulls the bank byte pushed above
rep #$30 ;A=16, X/Y=16
Espozo wrote:
I'm an idiot, but RAM really shouldn't change position when using HiROM, should it?
Save RAM moves, but internal work RAM ($7E0000-$7FFFFF) doesn't, nor does shadow RAM.
Quote:
I'm just perplexed because RAM for me is full of #$55, even when I made the first instruction be "stp", incase my code no longer worked correctly.
Do you know the value of the B register at any given moment? A general strategy of B=K is more viable in LoROM than in HiROM. And you can prove out "appears to crash before it even gets here" by using debug breakpoints and the emulator's memory viewer/hex editor.
tepples wrote:
Do you know the value of the B register at any given moment
I imagine the initialization routine sets it to 0, but how do you interact with the B register again? is it "tab", kind of like "tcd" with the d register? (there's only 24 bits of addressable space, so B would only be 8 bits)
tepples wrote:
A general strategy of B=K is more viable in LoROM than in HiROM.
What's "K"?
tepples wrote:
you can prove out "appears to crash before it even gets here" by using debug breakpoints and the emulator's memory viewer/hex editor.
Do you know how to place a breakpoint in the bsnes debugger? I don't know what all these fields mean. The options other than "Exec" are Read, and Write.
Attachment:
Breakpoint Editor.png
Really though, I don't know how it's doing this, and I even put an stp before and then moved it to after the initialization routine and it's still messed up. I think my configuration file must be messed up or something, considering I made it.
This is the HiROM configuration file I made using your advice. There's probably an error somewhere.
Code:
# ca65 linker config for 256 KiB (2 Mbit) sfc file
# Physical areas of memory
MEMORY {
ZEROPAGE: start = $000000, size = $0100; # $0000-00ff -- zero page
# $0100-01ff -- stack
BSS: start = $000200, size = $1e00; # $0200-1fff -- RAM
BSS7E: start = $7e2000, size = $e000; # SNES work RAM, $7e2000-7effff
BSS7F: start = $7f0000, size = $10000; # SNES work RAM, $7f0000-$7fffff
ROM0: start = $C00000, size = $10000, fill = yes;
ROM1: start = $C00000, size = $10000, fill = yes;
ROM2: start = $C00000, size = $10000, fill = yes;
ROM3: start = $C00000, size = $10000, fill = yes;
ROM4: start = $C00000, size = $10000, fill = yes;
ROM5: start = $C00000, size = $10000, fill = yes;
ROM6: start = $C00000, size = $10000, fill = yes;
ROM7: start = $C00000, size = $10000, fill = yes;
}
# Logical areas code/data can be put into.
SEGMENTS {
CODE: load = ROM0, align = $100;
RODATA: load = ROM0, align = $100;
SNESHEADER: load = ROM0, start = $C0FFC0;
CODE1: load = ROM1, align = $100, optional = yes;
RODATA1: load = ROM1, align = $100, optional = yes;
CODE2: load = ROM2, align = $100, optional = yes;
RODATA2: load = ROM2, align = $100, optional = yes;
CODE3: load = ROM3, align = $100, optional = yes;
RODATA3: load = ROM3, align = $100, optional = yes;
CODE4: load = ROM4, align = $100, optional = yes;
RODATA4: load = ROM4, align = $100, optional = yes;
CODE5: load = ROM5, align = $100, optional = yes;
RODATA5: load = ROM5, align = $100, optional = yes;
CODE6: load = ROM6, align = $100, optional = yes;
RODATA6: load = ROM6, align = $100, optional = yes;
CODE7: load = ROM7, align = $100, optional = yes;
RODATA7: load = ROM7, align = $100, optional = yes;
ZEROPAGE: load = ZEROPAGE, type = zp;
BSS: load = BSS, type = bss, align = $100, optional = yes;
BSS7E: load = BSS7E, type = bss, align = $100, optional = yes;
BSS7F: load = BSS7F, type = bss, align = $100, optional = yes;
}
And also, here's the header. I edited it slightly too.
Code:
MAPPER_LOROM = $20
MAPPER_HIROM = $21
ROMSPEED_200NS = $00
ROMSPEED_120NS = $10
;ROM and backup RAM sizes are expressed as log2(size in bytes) - 10
MEMSIZE_NONE = $00
MEMSIZE_2KB = $01
MEMSIZE_4KB = $02
MEMSIZE_8KB = $03
MEMSIZE_16KB = $04
MEMSIZE_32KB = $05
MEMSIZE_64KB = $06
MEMSIZE_128KB = $07
MEMSIZE_256KB = $08
MEMSIZE_512KB = $09
MEMSIZE_1MB = $0A
MEMSIZE_2MB = $0B
MEMSIZE_4MB = $0C
REGION_JAPAN = $00
REGION_AMERICA = $01
REGION_PAL = $02
;======================================================================
.segment "SNESHEADER"
;======================================================================
romname:
.byte "SNES Splatoon"
.assert * - romname <= 21, error, "ROM name too long"
.if * - romname < 21
.res romname + 21 - *, $20 ; space padding
.endif
.byte MAPPER_HIROM|ROMSPEED_120NS
.byte $00 ; 00: no extra RAM; 02: RAM with battery
.byte MEMSIZE_256KB ; ROM size (08-0C typical)
.byte MEMSIZE_NONE ; backup RAM size (01,03,05 typical; Dezaemon has 07)
.byte REGION_AMERICA ; region code
.byte $33 ; publisher id, or $33 for "see 16 bytes before header"
.byte $00 ; ROM revision number
.word $0000 ; Checksum of all bytes will be poked here after linking
.word $0000 ; $FFFF minus above sum will also be poked here
.res 4 ; Unused vector space
;Native 65816 vectors
;NOTE: Reset vector set to $0000 because it doesn't serve a purpose here;
;the 65816 starts up in emulation mode (even on soft reset).
.addr EmptyHandler ; 65816 COP vector
.addr EmptyHandler ; 65816 BRK vector
.addr EmptyHandler ; 65816 ABORT vector
.addr VBlank ; 65816 NMI vector
.addr $0000 ; 65816 RESET vector -- see above
.addr EmptyHandler ; 65816 IRQ vector
.res 4 ; Unused vector space
;6502/65c02 vectors
;NOTE: BRK vector set to $0000 because 6502/65c02 doesn't have an actual
;BRK vector, it only has an IRQ vector.
.addr EmptyHandler ; 6502/65c02 COP vector
.addr $0000 ; 6502/65c02 BRK vector -- see above
.addr EmptyHandler ; 6502/65c02 ABORT vector
.addr EmptyHandler ; 6502/65c02 NMI vector
.addr Main ; 6502/65c02 RESET vector
.addr EmptyHandler ; 6502/65c02 IRQ vector
;These are vectors which are essentially unused on the SNES, i.e.
;defined to meet ca65 design requirements.
;
;The only ones which could be used are cop_handler, brk_handler,
;abort_handler, and irq_handler, but at present those do not serve
;a purpose for this game.
;======================================================================
.segment "CODE"
;======================================================================
.proc EmptyHandler
rti
.endproc
.proc EmptyVBlank
rep #$30
pha
php
sep #$20
lda $4210 ;clear NMI Flag
plp
pla
rti
.endproc
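The MEMSIZE_* constants in the header above follow the rule in its own comment, log2(size in bytes) - 10. Here's a sketch of that rule in Python with an invented helper name; it only covers power-of-2 sizes from 1 KiB up.

```python
def memsize_code(size_bytes: int) -> int:
    """Header size code: log2(size in bytes) - 10, for power-of-2 sizes."""
    if size_bytes < 1024 or size_bytes & (size_bytes - 1):
        raise ValueError("size must be a power of two of at least 1 KiB")
    return (size_bytes.bit_length() - 1) - 10    # bit_length - 1 is log2 here
```

256 KiB comes out as $08 and 4 MiB as $0C, matching the table. Eight areas of $10000 bytes each would add up to 512 KiB, which this rule maps to $09.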
The way to set B is PLB.
K is the bank bits (23-16) of the program counter, as pushed with PHK.
Wait, I thought B was for selecting the bank (the top 8 bits)...
What does it do if K does that?
B (DBR) is for data fetches, and is used whenever something has a 16-bit address.
K (PBR) is for code fetches.
In other words: K is used for the program, B is for everything else (right?) Makes sense, since in many cases you run the code from a different bank than the data it's using.
(Stack and direct page are always from bank=0)
Wouldn't you not really need to touch K? I mean, whenever you go further down the program and the program counter gets incremented automatically, it should also automatically increment K, shouldn't it? Even in the case of the rts that serves as a way to jump to the object code, it shouldn't even be messed with there. Where does the SNES start running code at anyway, as in what's the value in the program counter (including K)? In fact, just out of curiosity, what is the register for the lower 16 bits of the program counter?
JMP f:address (or JML), JSR f:address (or JSL), RTL, and RTI automatically handle K for you. Usually the reason you'd want to push it is to be able to use B=K with data in the same bank as the program.
Code:
phk
plb
So, I guess that's not really the problem... This is how the initialization routine starts, and it sets B=K:
Code:
.macro InitializeSNES
sei ;Disable interrupts
clc ;Clear carry, used by XCE
xce ;Set 65816 native mode
jmp :+ ;Needed to set K (bank of PC) properly if MODE 21 is ever used;
;see official SNES developers docs, "Programming Cautions"
: cld ;Disable decimal mode
phk ;Push K (PC bank)
plb ;Pull B (data bank, i.e. data bank now equals PC bank)
I really think the problem lies in the configuration files.
It could still be the problem so-to-speak.
That jmp :+ needs to be a "long jump" (a.k.a. JML) (full 24-bit address; opcode $5C) for K to be "correct". A "short jump" (16-bit address; opcode $4C) won't suffice. A listing generation of the code will show you which opcode is being used -- and, if it's a long jump, what the destination bank is (which is what K will be after the JMP takes place).
What controls the generation of the address and bank the JMP refers to is -- you guessed it -- the assembler and its related configuration directives.
My gut feeling is that it's either a 16-bit JMP (which means K is going to be whatever it is on power-on/reset -- on 65816, that's always $00), or it's a 24-bit JMP with the assembler choosing a bank of $00 itself.
Which bank you want it to be is up to you -- you need to understand the SNES's memory map (mode 20 vs. mode 21) to understand what bank is relevant. The official documentation outlining mode 21's memory map is here. This subject has come up time and time again over the years:
viewtopic.php?f=12&t=9286
viewtopic.php?f=12&t=10389 (note my post here discusses mode 25, which is not mode 21)
viewtopic.php?f=12&t=10423 (relevant to previously-linked diagram/picture)
In case it hasn't been made clear:
B is the Data Bank register, which is the bank that 16-bit memory operations will use (e.g. lda $1234 would load from address ${B}/1234). Indexed memory access does cross into the next bank (e.g. sep #$30, lda #$c4, pha, plb, ldx #1, lda $ffff,x would effectively read from address $c50000).
K is the Program Bank register, which is the bank where code is actively running from (e.g. in a real-time debugger, when your code first starts, you might see something like $008000: sei -- in this case K is $00. Let's say it's followed by a $008001: jmp $c08005, after which K would be $c0 and the PC would be $8005). The only way you can change K is through jump operations or through stack manipulation followed by rtl. K also doesn't "roll over" to the next bank when executing code (e.g. $c0ffff: sei, after executing, does not result in K=$c1 PC=$0000, instead it results in K=$c0, PC=$0000. In other words, PC always wraps).
Direct page reads/writes always (effectively) read/write to/from bank $00 (the D register is only 16 bits).
Stack reads/writes (pulls/pushes) always stick data on the stack which is located in bank $00 (the S register is only 16 bits).
All this is covered in Western Design Center's book on the 65816; see pages 47 and 48.
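The bank rules above can be summed up in a few lines of Python. This is only a toy model of how the addresses are formed, not an emulator; the function names are made up for illustration, and the direct page helper assumes native mode.

```python
# Toy model of 65816 address formation (illustrative names, not an emulator).

def data_address(b, addr16, index=0):
    """Absolute (optionally indexed) data access: B supplies the bank,
    and the 24-bit sum is allowed to carry into the next bank."""
    return ((b << 16) + addr16 + index) & 0xFFFFFF

def next_pc(k, pc, length=1):
    """Instruction fetch: only the 16-bit PC advances; K never rolls over."""
    return k, (pc + length) & 0xFFFF

def direct_page_address(d, offset):
    """Direct page access: always lands in bank $00 (D is 16 bits)."""
    return (d + offset) & 0xFFFF

if __name__ == "__main__":
    # B=$c4, lda $ffff,x with X=1 reads from the start of bank $c5.
    print(f"${data_address(0xC4, 0xFFFF, 1):06x}")
    # A one-byte instruction at $c0ffff leaves K=$c0 and wraps PC to $0000.
    k, pc = next_pc(0xC0, 0xFFFF)
    print(f"K=${k:02x} PC=${pc:04x}")
```

The point of keeping the three helpers separate is that data, code, and direct page each get their bank from a different place: B, K, and a fixed $00.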
Even though the assembler should have told me that that's the problem... Ok though. I'll test it out.
Edit: I changed it, and nothing happened... You want to know what I noticed though? There are a bunch of (somewhat) random changing values right here, and only right here. (It stops and goes to #$55 right after and before.)
Attachment: SNES RAM.png
I even put in an "stp" right before the long jump, and the values still change somehow. Is this the "open bus" behavior or whatever? (I really don't know what it is) If it is, shouldn't it not be in ram? I'm 99% certain the problem lies with me messing with the header and configuration file.
Quote:
(e.g. $c0ffff: sei, after executing, does not result in K=$c1 PC=$0000, instead it results in K=$c0, PC=$0000. In other words, PC always wraps).
I just pray the assembler will give out a warning... Anyway, I've gone with not reading that book far enough. I should really probably read it.
The assembler does what you tell it -- this is exactly how a computer is supposed to behave. It has no way of automatically/intrinsically knowing what your goal/intentions were. Sorry :-)
If that's a bsnes/higan memory dump of $000000 to $0001ff (that's RAM), then you can thank byuu for that wonderful "feature" -- he pre-populates random values all throughout the SNES's RAM intentionally on power-on (may also be called a "hard reset" by some). His justification is that RAM is not initialised to zero on power-on (which is true -- it isn't! -- and your program shouldn't blindly assume so. I just disagree with him doing this for the reasons depicted in this thread -- it confuses people). A soft reset is a different situation (RAM should not be touched, and there are games (at least on the NES) which utilise this fact).
I'm not sure what relevancy this has to what we're discussing, but...
koitsu wrote:
The assembler does what you tell it
Well, I've had problems of referencing a 24 bit address and the assembler falsely only using the first 16 bits.
koitsu wrote:
he pre-populates random values all throughout the SNES's RAM intentionally on power-on
The weirdest thing is that they are changing, and that shouldn't happen.
koitsu wrote:
I'm not sure what relevancy this has to what we're discussing, but...
It's relevant because you're trying to help me solve my problem.
Espozo wrote:
Well, I've had problems of referencing a 24 bit address and the assembler falsely only using the first 16 bits.
The assembler isn't going to "know" how to warn you of that situation either -- again, it cannot read your intentions, it can only do what it's told. Different assemblers do different things. When learning an assembler, it's important to both read its documentation (as best as possible -- and if there aren't concise docs, I always advocate not using it), and "fool around" to learn its nuances. This can be accomplished with a combination of generated assembly listings and a real-time debugger. IMO, the latter -- something good, that is -- is what's been missing from SNES emulators for almost 2 decades.
Espozo wrote:
The weirdest thing is that they are changing, and that shouldn't happen.
Maybe what you're looking at isn't RAM then, but instead a different batch of data? It's not like the system doesn't have several chips all with dedicated memory... :-) I don't use bsnes/higan so I don't know what "S-CPU bus" refers to exactly. Possibly it refers to something irrelevant. In NO$SNS, the full 24-bit memory space within the SNES can be examined (up to you to pick where/what) in the lower part of the window (selected via Window -> Data).
Espozo wrote:
It's relevant because you're trying to help me solve my problem.
I don't know what problem you're trying to solve, actually. I just saw the thread here discussing something at length which I can shed light on.
P.S. -- God I hate quoting on this forum. It's like micro-managing a toddler.
koitsu wrote:
I don't know what "S-CPU bus" refers to exactly.
Well, the options are S-CPU bus, S-APU bus, S-PPU VRAM, S-PPU OAM, and S-PPU CGRAM. S-CPU bus goes to $FFFFFF, S-APU bus goes to $FFFF, S-PPU VRAM goes to $FFFF, S-PPU OAM goes to $21F, and S-PPU CGRAM goes to $1FF. S-APU bus must be audio ram, S-PPU VRAM must be vram (it makes it seem like it's actually inside the chips like oam and cgram, but that's irrelevant), S-PPU OAM must be oam, and S-PPU CGRAM must be cgram, so there's no other option.
koitsu wrote:
I don't know what problem you're trying to solve, actually.
Multiple.
koitsu wrote:
P.S. -- God I hate quoting on this forum. It's like micro-managing a toddler.
?
Okay, that helps. Then yeah, definitely S-CPU bus is the general 24-bit memory space for the 65816 (i.e. RAM, ROM, everything).
I don't have an explanation for why your values in $0000-01ff keep changing unless you have something that's either misbehaving code-wise or possibly stack underflow (pulling things off the stack more times than you pushed). I would suggest doing sei, clc, xce, stp as your first 4 instructions (sei should ideally always come first) and see if the problem continues (make sure in the debugger there is nothing actually executing -- hopefully it has a simple way to break-to-debugger upon reset/load of ROM, else you'll need to put a breakpoint at whatever your reset vector points to).
If the CPU is stopped and you've confirmed that, then there's literally nothing I can think of that would be changing values in $0000-01ff magically like that. My initial feeling is that it's a major bug in the emulator, but I simply don't know. Can you try it on NO$SNS and see if the same behaviour happens?
Otherwise if the CPU isn't stopped at all, then possibly your code in the ROM isn't where it should be. Check bank $00, addresses $FFE4-FFEF and $FFF4-FFFF to see what the vectors look like. There's a possibility they're all $0000 or $FFFF or something (depends on a lot of things), which wouldn't be right -- think about what a reset vector of $0000 would end up doing, especially if the emulator does something like fill RAM up with random bytes (hint: stp never gets run, but a bunch of other weird bytes do get run as code, and the results are gonna be as random as the bytes in RAM).
If your vectors aren't what they should be, then yeah, this is a configuration problem either in the linker/assembler, or the emulator is mapping the ROM into memory space in a way you aren't expecting. In the latter case, it's usually possible to figure out where the emulator *is* mapping things by simply searching (if the emulator can do that) the addressing space for bytes that correlate with your vector values. From there it's possible to determine a little bit more of what's going on under the hood (e.g. emulator is treating your ROM as mode 20, or things are just downright broken).
While I'm editing this, and I fully admit I haven't read the thread, I'd be curious to know why you're already moving from mode 20 to mode 21 considering (respectfully) you don't understand things like B vs. K. In other words: I think you may be trying to jump into something that's too advanced right now. Usually people go with mode 21 once they have a good justification/need for the additional addressing space. Or maybe this is just a learning experience, I dunno. Would be curious to know why you're doing it though (it feels like just yesterday I was helping you with code, which is why I ask...)
The SNES always starts up in bank $00. Even for HiROM, $0000-$7fff for banks $00-$3f, and $80-bf, are laid out like LoROM, and $8000-$ffff (where LoROM banks would be) are mirrors of the high halves of each HiROM bank.
What this means is, if your reset code is somewhere like $c00123, and you make your reset vector $0123, the SNES will start in $000123, which is smack-dab in the middle of (uninitialized!) RAM. In other words, it won't even make it to your code, it'll just execute random garbage in RAM instead.
To solve this problem, you'll want your reset code to be within the range $c08000-$c0ffff, so that it's mirrored to $008000-$00ffff, and have that code do a long jump to bank $c0.
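To make the mirroring concrete, here is a rough Python sketch of the mode 21 layout as described above. It is a simplification: it ignores banks $40-$7D, SRAM, and ROM-size mirroring, and hirom_rom_offset is just a name picked for this example.

```python
def hirom_rom_offset(addr24):
    """Return the ROM offset a CPU address maps to under a simplified
    mode 21 (HiROM) layout, or None if the address is not ROM here."""
    bank = addr24 >> 16
    offset = addr24 & 0xFFFF
    if 0xC0 <= bank <= 0xFF:
        # Full linear 64 KB banks.
        return ((bank - 0xC0) << 16) | offset
    if (bank <= 0x3F or 0x80 <= bank <= 0xBF) and offset >= 0x8000:
        # Upper half of each bank mirrors the upper half of the HiROM bank.
        return ((bank & 0x3F) << 16) | offset
    return None  # RAM, registers, or something this sketch does not model

if __name__ == "__main__":
    for addr in (0xC00123, 0x000123, 0xC08000, 0x008000):
        print(f"${addr:06x} ->", hirom_rom_offset(addr))
```

A reset handler at $c00123 has no counterpart at $000123 (the function returns None there, since that is RAM), while anything in $c08000-$c0ffff shows up at the same ROM offset when reached through bank $00.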
Something else to be aware of is that due to the heuristics BSNES uses to guess whether a given ROM is LoROM or HiROM, it can occasionally guess wrong for simple test ROMs like yours. In particular, it's programmed to heavily assume something starting with "stp" is not the right guess, meaning it could additionally be misinterpreting your ROM as LoROM, making the problem even more confusing. To be safe, I would start your ROM with "sei", "clc; xce", or a jump first, since BSNES takes that as a good sign that it's guessed correctly.
Espozo wrote:
Do you know how to place a breakpoint in the bsnes debugger? I don't know what all these fields mean. The options other than "Exec" are Read, and Write.
The checkbox activates the breakpoint. You do this last.
Then comes the address box; valid formats are 2000, 002000, etc. Automatically interpreted as hex.
Then the value field for read/write breakpoints -- will only break if the specific value is read or written.
Then specify exec/read/write breakpoint type - exec simply breaks when the address is to be executed as an instruction, read if the address is to be read from, write if the address is to be written to.
Then Specify the bus - "CPU bus" is to debug SNES memory map as seen from the cartridge, but you can also break on other memory areas directly such as SPC700 (SMP), PPU, OAM, CGRAM.
How sad is it that I just figured out how the debugger works... Anyway, this is what's happening at startup...
Code:
000000 eor $55,x [000055] A:0055 X:0000 Y:0000 S:01d8 D:0000 DB:00 nv1BdIzc V: 0 H: 186
000002 eor $55,x [000055] A:0000 X:0000 Y:0000 S:01d8 D:0000 DB:00 nv1BdIZc V: 0 H: 216
000004 eor $55,x [000055] A:0055 X:0000 Y:0000 S:01d8 D:0000 DB:00 nv1BdIzc V: 0 H: 246
000006 eor $55,x [000055] A:0000 X:0000 Y:0000 S:01d8 D:0000 DB:00 nv1BdIZc V: 0 H: 276
000008 eor $55,x [000055] A:0055 X:0000 Y:0000 S:01d8 D:0000 DB:00 nv1BdIzc V: 0 H: 306
00000a eor $55,x [000055] A:0000 X:0000 Y:0000 S:01d8 D:0000 DB:00 nv1BdIZc V: 0 H: 336
00000c eor $55,x [000055] A:0055 X:0000 Y:0000 S:01d8 D:0000 DB:00 nv1BdIzc V: 0 H: 366
00000e eor $55,x [000055] A:0000 X:0000 Y:0000 S:01d8 D:0000 DB:00 nv1BdIZc V: 0 H: 396
000010 eor $55,x [000055] A:0055 X:0000 Y:0000 S:01d8 D:0000 DB:00 nv1BdIzc V: 0 H: 426
000012 eor $55,x [000055] A:0000 X:0000 Y:0000 S:01d8 D:0000 DB:00 nv1BdIZc V: 0 H: 456
000014 eor $55,x [000055] A:0055 X:0000 Y:0000 S:01d8 D:0000 DB:00 nv1BdIzc V: 0 H: 486
000016 eor $55,x [000055] A:0000 X:0000 Y:0000 S:01d8 D:0000 DB:00 nv1BdIZc V: 0 H: 516
It's starting in ram ($000000)? Oh yeah, and because the values start out random... (as opposed to all 0's or 1's).
Wait a minute, what can I even do to fix this? I thought the SNES by default wouldn't start at $000000 because it's in ram.
Quote:
While I'm editing this, and I fully admit I haven't read the thread, I'd be curious to know why you're already moving from mode 20 to mode 21 considering (respectfully) you don't understand things like B vs. K. In other words: I think you may be trying to jump into something that's too advanced right now.
Unfortunately, I found out it was too advanced for me after I already started...
Quote:
Usually people go with mode 21 once they have a good justification/need for the additional addressing space. Or maybe this is just a learning experience, I dunno. Would be curious to know why you're doing it though (it feels like just yesterday I was helping you with code, which is why I ask...)
I just figured I'd get it out of the way first because I don't want to have to figure out I need the extra addressing space and mess with everything later. Addressing space is the main reason I switched over, as I wasn't previously aware that HiROM could address more. It definitely is a learning experience though. Had I never done this, I wouldn't have ever known about B or K, and that's pretty fundamental.
I don't know why, but the website keeps messing up the quotes by interjecting "[/quote]". Oh well.
I don't know why you'd think the CPU wouldn't do what it's told. If the reset vector is $0000, it's going to run code at $0000. What that *does* is a different story, but it's obviously not what you intended. It's perfectly fine to set a vector to somewhere in RAM -- many games do this for their IRQ vector. But that's beside the point.
In other words: the most likely explanations are that the ROM is organised incorrectly for mode 21 (hence your vectors are in the wrong spot, rather than where you hope/think they'll be) (which is likely the result of linker configuration mistakes (mode 20 vs. mode 21)), or the emulator is doing something odd (like Nicole mentioned -- possibly it's assuming mode 20 for some reason).
So now you know: your vectors are probably all $0000 (why not verify/check by looking at the address space I mentioned?), which explains the problem. The code which is being run is the result of bsnes/higan sticking random values in RAM on power-on. The solution is to figure out what's incorrect with your linker configuration and see if you can get emulators to do what you'd expect.
The configuration shown here is for mode 20, and will certainly do the wrong thing if used as mode 21 (I can absolutely see how the top 32KByte of bank $00 could be all zeros, for example). You've been given an accurate memory map of what mode 21's memory model looks like, so attached is one for mode 20 (what your config currently is set up for). You should be able to look at the differences between the two and (hopefully) go "OH! Now it makes sense why it's all messed up!" and maybe take a stab at fixing it yourself. :-) And don't forget that you need to change the SNES cartridge header region to properly refer to mode 21 as well (if you haven't already) -- this is the byte in $FFD5. It needs to be $21, not $20.
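A quick way to double-check the header byte without an emulator is to read it straight out of the ROM file. The Python sketch below assumes a headerless ROM image (no 512-byte copier header) in which bank $00's $FFD5 sits at file offset $FFD5 for mode 21 and $7FD5 for mode 20; the function names are hypothetical.

```python
def map_mode_byte(rom, hirom=True):
    """Return the map mode byte from the internal header.
    Assumes a headerless image: offset $FFD5 for mode 21, $7FD5 for mode 20."""
    return rom[0xFFD5 if hirom else 0x7FD5]

def header_says_mode21(rom):
    """True if the byte at $FFD5 is $21, as the post above asks for."""
    return map_mode_byte(rom, hirom=True) == 0x21

if __name__ == "__main__":
    # Synthetic 64 KB image standing in for a real ROM file.
    rom = bytearray(0x10000)
    rom[0xFFD5] = 0x20
    print(header_says_mode21(rom))
    rom[0xFFD5] = 0x21
    print(header_says_mode21(rom))
```

With a real file you would load it via open(path, "rb").read() instead of building the bytearray by hand.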
koitsu wrote:
I don't know why you'd think the CPU wouldn't do what it's told. If the reset vector is $0000, it's going to run code at $0000.
Oh, I thought the location it started at was static, which is why I was confused. I haven't read all that you've written, but I'll work on it tomorrow because I should go to sleep.
The vector locations themselves are at static locations (e.g. IRQ is at $FFFE / $FFEE, RESET is at $FFFC, NMI is at $FFFA / $FFEA, etc.), but the contents of the vectors can be anything you want, pointing to anywhere within 16-bit address space (bank $00, range $0000-FFFF) -- that includes RAM.
It should be obvious that you don't want to set the reset vector to something in RAM, but games can (and do!) set any of the vectors -- NMI and/or IRQ are the two most common -- to RAM locations so that the game can change those vector values on-the-fly.
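If the emulator has no "vector info" view, the vectors named above can be pulled out of the ROM file directly. A minimal Python sketch, assuming a headerless mode 21 image where bank $00 addresses $FFxx sit at the same file offsets (pass base=-0x8000 for a mode 20 image, where they sit at $7Fxx); the helper names are made up here.

```python
import struct

# Fixed vector locations in bank $00, as listed in the post above.
VECTORS = {
    "native NMI": 0xFFEA,
    "native IRQ": 0xFFEE,
    "emulation NMI": 0xFFFA,
    "emulation RESET": 0xFFFC,
    "emulation IRQ": 0xFFFE,
}

def read_vectors(rom, base=0):
    """Read each 16-bit little-endian vector from the ROM image."""
    return {name: struct.unpack_from("<H", rom, base + addr)[0]
            for name, addr in VECTORS.items()}

def lands_in_rom(vector):
    """In bank $00 only $8000-$FFFF is ROM; below that is RAM and registers."""
    return vector >= 0x8000

if __name__ == "__main__":
    rom = bytearray(0x10000)                    # synthetic image
    struct.pack_into("<H", rom, 0xFFFC, 0x0000)  # the broken RESET vector
    struct.pack_into("<H", rom, 0xFFEA, 0x845C)  # an NMI vector in ROM
    for name, value in read_vectors(rom).items():
        print(f"{name}: ${value:04X} rom={lands_in_rom(value)}")
```

A RESET vector that fails lands_in_rom is the situation in this thread; an NMI or IRQ vector that fails it may be intentional, as noted above.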
koitsu wrote:
His justification is that RAM is not initialised to zero on power-on (which is true -- it isn't! -- and your program shouldn't blindly assume so. I just disagree with him doing this for the reasons depicted in this thread -- it confuses people).
Considering the amount of real hardware-only bugs that comes up in homebrew because it assumes memory is cleared to zero? I'd rather have it do things this way. This is even moreso true for video memory, where people usually only initialize what they need and forget that other parts still need to be reset. (also this isn't specific to SNES, just saying)
Sik wrote:
koitsu wrote:
His justification is that RAM is not initialised to zero on power-on (which is true -- it isn't! -- and your program shouldn't blindly assume so. I just disagree with him doing this for the reasons depicted in this thread -- it confuses people).
Considering the amount of real hardware-only bugs that comes up in homebrew because it assumes memory is cleared to zero? I'd rather have it do things this way. This is even moreso true for video memory, where people usually only initialize what they need and forget that other parts still need to be reset. (also this isn't specific to SNES, just saying)
Not really going to get into a debate about this; way too off-topic.
Espozo wrote:
Code:
ROM0: start = $C00000, size = $10000, fill = yes;
ROM1: start = $C00000, size = $10000, fill = yes;
ROM2: start = $C00000, size = $10000, fill = yes;
ROM3: start = $C00000, size = $10000, fill = yes;
ROM4: start = $C00000, size = $10000, fill = yes;
ROM5: start = $C00000, size = $10000, fill = yes;
ROM6: start = $C00000, size = $10000, fill = yes;
ROM7: start = $C00000, size = $10000, fill = yes;
By the way, I noticed this in your linker file; these shouldn't all be $C00000, but should go $C00000, $C10000, $C20000, and so on.
Yup, that definitely looks wrong, and is certainly (at least part of) the problem.
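Since the mistake came from copy/pasting, one way to avoid it is to generate the lines. The Python sketch below just prints MEMORY entries in the same format as the quoted config, with each bank's start address 64 KB after the previous one; the function name and the bank count are only for illustration.

```python
def hirom_memory_lines(bank_count, first_bank=0xC0):
    """Build ld65 MEMORY entries with consecutive 64 KB banks."""
    lines = []
    for i in range(bank_count):
        start = (first_bank + i) << 16
        lines.append(f"ROM{i}: start = ${start:06X}, size = $10000, fill = yes;")
    return lines

if __name__ == "__main__":
    print("\n".join(hirom_memory_lines(8)))
```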
Nicole wrote:
Espozo wrote:
Code:
ROM0: start = $C00000, size = $10000, fill = yes;
ROM1: start = $C00000, size = $10000, fill = yes;
ROM2: start = $C00000, size = $10000, fill = yes;
ROM3: start = $C00000, size = $10000, fill = yes;
ROM4: start = $C00000, size = $10000, fill = yes;
ROM5: start = $C00000, size = $10000, fill = yes;
ROM6: start = $C00000, size = $10000, fill = yes;
ROM7: start = $C00000, size = $10000, fill = yes;
By the way, I noticed this in your linker file; these shouldn't all be $C00000, but should go $C00000, $C10000, $C20000, and so on.
Oh damn...
Well, I corrected it (looking at it now, I feel like a complete idiot. I just didn't type in the right thing because I copy/pasted) but it still doesn't work in that it does the eor $55 crap.
Anyway, look what I found...
Code:
;Native 65816 vectors
;NOTE: Reset vector set to $FFFF because it doesn't serve a purpose here;
;the 65816 starts up in emulation mode (even on soft reset).
.addr EmptyHandler ;65816 COP vector
.addr EmptyHandler ;65816 BRK vector
.addr EmptyHandler ;65816 ABORT vector
.addr VBlank ;65816 NMI vector
.addr $0000 ;65816 RESET vector -- see above
.addr EmptyHandler ;65816 IRQ vector
.res 4 ;Unused vector space
;6502/65c02 vectors
;NOTE: BRK vector set to $FFFF because 6502/65c02 doesn't have an actual
;BRK vector, it only has an IRQ vector.
.addr EmptyHandler ;6502/65c02 COP vector
.addr $0000 ;6502/65c02 BRK vector -- see above
.addr EmptyHandler ;6502/65c02 ABORT vector
.addr EmptyHandler ;6502/65c02 NMI vector
.addr Main ;6502/65c02 RESET vector
.addr EmptyHandler ;6502/65c02 IRQ vector
However, pay attention to the note.
Yeah, what the hell, I changed both of them to "Main", and it's still busted. I even did a soft reset, and it still starts out at $000000. I'm hopelessly lost.
Where is Main in your bank? If it's right at the beginning, it's probably at $c00000, making your vector $0000 in the first place. $c08000-$c0ffff are mirrored to $008000-$00ffff, so you need your interrupt handlers to fall within that range so that they're available in bank $00.
You could make a separate segment for $c08000-$c0ffaf (excluding the header), or, say, something like $c0f800-$c0ffaf if you want to keep most of your first bank contiguous, and put your handlers in there. How exactly you split it up is up to you, you just need to ensure your handlers are within that $c08000-$c0ffff range.
What Nicole said, again, is spot on. Making the vectors have their own section is the best way to ensure they end up in the proper place.
I really need a full .zip or .rar of the code at this point to be able to fix it; it's probably easiest to just do this and then explain to you what the issue is slowly. But yes, this is basically a linker configuration "issue" because of the fact that mode 21 uses linear 64KB banks (where the first 32KBytes starts at $0000-7FFF), compared to mode 20 which uses linear 32KByte banks (so the first 32KBytes starts at $8000-FFFF, which includes the vector space).
I've had a very very long and hard day at work so my description of this is not as clear/concise as it would be normally.
koitsu wrote:
I really need a full .zip or .rar of the code at this point to be able to fix it; it's probably easiest to just do this and then explain to you what the issue is slowly.
Have fun!
Attachment:
Broken HirROM.zip [2.34 MiB]
I see, though, that the problem is that I didn't set aside vector space, as it's not needed in LoROM or something like that. In all seriousness, thank you. I don't know how I can get so caught up on stupid shit and I'm perplexed how people like you got anywhere without people having to look at everything they're doing. I guess, in a sense, when you've learned one processor, you've learned them all. I actually learned more about addressing modes on the 80186 than I had on the 65816, and I kind of assumed what I learned there didn't overlap.
I now understand the nature of the problem. This is going to take a while to figure out how to solve. It's an issue of making ld65 do the "right thing" with regards to mode 21's memory map.
Here you go. Please read everything I've written before even loading the ROM. I can assure you, however, that your vectors are now correct. Before now, they weren't. I verified in SNES9x debugger (it has a button labelled "Vector Info" that makes this quite easy :-) ). Please be sure to look at Header.asm and snes.cfg -- pay close attention to what I've done with the segments, and with the vectors.
The issue: snes.cfg was causing the assembler/linker to generate (16-bit) addresses for vector locations within the linear mode 21 memory map. In other words: it was assuming that everything was running out of bank $C0. Your RESET vector was getting set to $0000 because your "Main" routine happened to be assembled at address $C00000 (lower 16 bits = $0000). If you're actually in bank $C0, a JMP $0000 would work -- but what happens if you're in bank $00 and do a JMP $0000? In effect, on reset/power-on, that's exactly what was happening. (We did tell you to go look at your vector values...)
The reason this worked in mode 20 was because the assembler knew that addresses ranged from $xx8000-xxFFFF ("32KByte banks" so-to-speak), so your code was all within that range too, hence RESET pointed to $8000, NMI pointed to something like $845C (I forget), and so on. And that worked. But the instant you switched to mode 21, all those addresses got changed.
What's important to understand about mode 21 is that there's a damn good reason $008000-00FFFF maps to $C08000-C0FFFF -- it's absolutely needed because when the CPU executes code where vectors point, it does so in bank $00. The way programmers solve this is either by sticking their vector-specific code in $C08000-C0FFFF, or by using bootstraps -- simple routines that do a jml SomeLabel that causes K to become something other than bank $00 and voila. Bank $00 is therefore "special" in this regard.
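The truncation is easy to reproduce on paper. The Python sketch below shows what a 16-bit vector ends up holding for a given 24-bit label address, and whether the CPU (which fetches from bank $00) would actually reach ROM. It assumes the handler is linked somewhere in bank $C0; the function names and the sample addresses are invented for this example.

```python
def vector_for(label_addr24):
    """A vector only stores the low 16 bits of the handler's address."""
    return label_addr24 & 0xFFFF

def reachable_from_bank0(label_addr24):
    """True if a bank $C0 handler is visible through the bank $00 mirror,
    i.e. it lives in $C08000-$C0FFFF."""
    bank = label_addr24 >> 16
    return bank == 0xC0 and (label_addr24 & 0xFFFF) >= 0x8000

if __name__ == "__main__":
    for name, addr in (("Main at $C00000", 0xC00000),
                       ("bootstrap at $C0FF80", 0xC0FF80)):
        print(f"{name}: vector=${vector_for(addr):04X} "
              f"reachable={reachable_from_bank0(addr)}")
```

The second case is the bootstrap idea: a tiny stub inside $C08000-$C0FFFF that the vector can reach, whose only job is to jml to the real code.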
So now you hopefully understand, and are going to open up the ROM and find nothing but a black screen. Why? I haven't debugged this -- this is for you to figure out, and a real-time debugger is going to help you immensely! My gut feeling: there is still some code that needs to be fixed to properly deal with mode 21 addressing -- I'd be willing to bet you have code modifying SNES registers in the $21xx or $42xx or $43xx range. If B=$C0 and you do sta $2100, for example, you're going to write to $C02100 (which is ROM / does nothing). What you want is a write to $002100. You can either use a long write (e.g. sta $002100 or possibly sta.l $002100, not sure what ca65 wants), or you can do lda #$00, pha, plb so that B=$00 and all reads/writes happen within bank $00 (where registers are). DMA registers, etc. will be affected by this too. On the other hand, if you try to do something like lda MyROMData,x, that's going to try to read from whatever bank B currently is set to too, so you get to deal with that the same way (e.g. lda.l MyROMData,x or dealing with B yourself). Welcome to mode 21 and the pros/cons of having a full linear 24-bit addressing space within banks $C0-FF.
"So why did all of this just magically work in mode 20?" Because in mode 20, pretty much all of bank $00 address range $0000-7FFF gets mirrored into banks $01-$3F. Your code/ROM is within $8000-FFFF within bank $00-3F, so any code running there can effectively write to registers/RAM/whatever without having to worry (as much) about banks.
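Here is the B problem in miniature, as a Python sketch. It works out where a 16-bit absolute store really goes for a given B and whether that address is in register space. The register ranges used ($2100-$21FF and $4000-$43FF in banks $00-$3F and $80-$BF) are a rough approximation for illustration, and the names are hypothetical.

```python
def absolute_target(b, addr16):
    """Where `sta addr16` actually writes when the data bank is B."""
    return (b << 16) | addr16

def hits_io_registers(addr24):
    """Rough check: registers are only visible in banks $00-$3F / $80-$BF."""
    bank = addr24 >> 16
    offset = addr24 & 0xFFFF
    in_system_bank = bank <= 0x3F or 0x80 <= bank <= 0xBF
    in_io_range = 0x2100 <= offset <= 0x21FF or 0x4000 <= offset <= 0x43FF
    return in_system_bank and in_io_range

if __name__ == "__main__":
    for b in (0xC0, 0x00, 0x80):
        target = absolute_target(b, 0x2100)
        print(f"B=${b:02X}: sta $2100 -> ${target:06X} "
              f"registers={hits_io_registers(target)}")
```

With B=$C0 the store lands in ROM space and does nothing; with B=$00 (or $80) it reaches the PPU register, which is why either long addressing or resetting B fixes it.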
In other words: moving to mode 21 is not a "change a config file and voila" task. You get to re-write a lot of code -- A LOT -- to deal with the major difference in memory layout.
I look forward to your progress. You can do it. Just pay close attention to what B is when working through things in a debugger. The easiest/quickest way to get things working is to use 24-bit (long) addressing EVERYWHERE. This is slower, cycle-wise, but it'd make things work.
Good luck.
koitsu wrote:
What's important to understand about mode 21 is that there's a damn good reason $008000-00FFFF maps to $C08000-C0FFFF -- it's absolutely needed because when the CPU executes code where vectors point, it does so in bank $00. The way programmers solve this is either by sticking their vector-specific code in $C08000-C0FFFF, or by using bootstraps -- simple routines that do a jml SomeLabel that causes K to become something other than bank $00 and voila. Bank $00 is therefore "special" in this regard.
And with the general recommendation to "get out of bank $00 as soon as possible to stay fast", I guess I'll need to make a special section in $C0FF80-$C0FFFF to put the jml instructions next to the internal header once I get around to porting my simple sprite demo to mode 21.
Quote:
What you want is a write to $002100. You can either use a long write (e.g. sta $002100 or possibly sta.l $002100, not sure what ca65 wants)
The ca65 syntax for this is sta f:$002100, where the "f" is for "far" (24-bit) addressing. Before I learned that, I had planned on using sta $802100.
Quote:
On the other hand, if you try to do something like lda MyROMData,x, that's going to try to read from whatever bank B currently is set to too, so you get to deal with that the same way (e.g. lda.l MyROMData,x or dealing with B yourself).
Here, lda f:MyROMData,x. It would be wise for you (and me, for that matter) to look up the "memory model" rules of ca65, which determine when ca65 guesses f: ("far" 24-bit addressing) and when it guesses a: ("absolute" 16-bit addressing).
Quote:
Welcome to mode 21 and the pros/cons of having a full linear 24-bit addressing space within banks $C0-FF.
The second half of each bank $C0-$FF gets mirrored down to $80-$BF, allowing some tricks to be done that mix mode 20 and mode 21 idioms. But that's an advanced technique, as is the technique of temporarily setting D (the direct page base pointer) to $2100 or $4300 to access PPU and DMA ports with direct page instructions. You can even use those DMA channels that you aren't otherwise using as spare direct page locations.
koitsu wrote:
Here you go.
You are absolutely a savior, koitsu.
And yes, I did read everything.
However, there's one thing I'm wondering if you know. I recently found out about the "step" thing in bsnes debugger, but it only increments by one instruction at a time. Do you think there would be some sort of feature to see every instruction that is being executed in one frame rather than pressing the step button thousands of times? It's got to be some sort of bad infinite loop.
koitsu wrote:
In other words: moving to mode 21 is not a "change a config file and voila" task. You get to re-write a lot of code -- A LOT -- to deal with the major difference in memory layout.
And that's exactly why I wanted to do it now!
Espozo wrote:
You are absolutely a savior, koitsu. :D And yes, I did read everything. :wink: However, there's one thing I'm wondering if you know. I recently found out about the "step" thing in bsnes debugger, but it only increments by one instruction at a time. Do you think there would be some sort of feature to see every instruction that is being executed in one frame rather than pressing the step button thousands of times? It's got to be some sort of bad infinite loop.
"Step" in every debugger on the planet means either "run one instruction" or "run one line of code". This is 100% normal.
Many debuggers implement several different permutations of this. Ones I deal with in FCEUX (NES emulator):
* Run -- runs until a breakpoint is hit or emulator is paused
* Step Into -- Runs a single instruction and then stops
* Step Over -- Runs one instruction, unless it's a JSR -- in which case, it runs until the corresponding RTS
* Step Out -- Runs until the current subroutine ends with an RTS (some cases will behave the same as Run)
* Run Line -- Runs for 1 scanline before breaking
* Run 128 Lines -- Runs for 128 scanlines before breaking
SNES9x debugger implements these -- note how the terminology vs. functionality differs:
* Run -- same as above
* Next Op -- displays current instruction
* Step Into -- same as above
* Step Over -- same as above, although I think it stops *after* the RTS has been executed
* Step Out -- same as above
* Skip Op -- skips over execution of the current instruction (danger! :-) )
* Frame Adv -- runs for a single frame (not sure if this means "one NMI", since NMI doesn't necessarily have to be tied to VBlank)
I mainly use SNES9x debugger, but it acts like a complete dick on Windows 7 due to how it's designed. There are VERY recent SNES9x builds (including 64-bit), but none of them have the debugger; Geiger did a really nice job implementing what he did, I just wish it was something the SNES9x folks officially integrated. But IMO, it's the best SNES emulator debugger I've used to date.
It's been a while since I've looked at no$sns, but nocash develops *really* good tools and definitely puts effort into his debugger and related tools. His emulator is more for "development" and accuracy than it is for "great game experience".
You can talk to byuu about bsnes/higan debugger features, but I swore I read somewhere he didn't care to focus on any of that any longer and instead was going to leave it up to others to implement (i.e. he focuses just on the emulator core; UI + debugger + etc. is up to other people). I could be remembering wrong, however. But again, I don't use bsnes/higan, so...
Espozo wrote:
Do you think there would be some sort of feature to see every instruction that is being executed in one frame rather than pressing the step button thousands of times?
Normally you'd use a log for this. In FCEUX, for example, you can pause the emulation, start logging, and then run 1 frame at a time until you're ready to stop logging. That will give you a log file you can analyze as much as you need. See if the emulator you're using has a logging feature.
Yeah, that's one way to do it, but can be painstaking to dig through hundreds of megabytes of log data. Regardless, what can help is if the emulator/debugger doing the logging writes what scanline and/or frame it's on (or, bare minimum, if it's in NMI or not). Some do this, some don't.
tokumaru wrote:
Espozo wrote:
Do you think there would be some sort of feature to see every instruction that is being executed in one frame rather than pressing the step button thousands of times?
Normally you'd use a log for this. In FCEUX, for example, you can pause the emulation, start logging, and then run 1 frame at a time until you're ready to stop logging. That will give you a log file you can analyze as much as you need. See if the emulator you're using has a logging feature.
BSNES-classic Debugger does have CPU Tracing -- if you have a start/end position in mind you can set that/those breakpoint(s) combined with starting and stopping the tracing. Tracing files are written to the Data path, which is by default the path of the BSNES executable (on OSX anyways).
I added a handy Frame-break feature into my fork of BSNES. It allows an arbitrary amount of vblanks to pass before breaking. It's very handy for a multitude of use-cases. Making this more precise, I also added a "numbreaks" feature to each breakpoint -- so you can specify how many times the breakpoint must hit before actually halting execution!!
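The "numbreaks" idea above can be sketched in a few lines. This is only a hypothetical model of the behaviour described (halt on the Nth hit), not code from the bsnes fork; the class and method names are made up for illustration.

```python
class CountedBreakpoint:
    """Hypothetical model of a breakpoint that halts only on its Nth hit."""

    def __init__(self, address, hits_required=1):
        self.address = address
        self.hits_required = hits_required
        self.hits = 0

    def should_halt(self, pc):
        """Return True when execution reaching `pc` should stop the emulator."""
        if pc != self.address:
            return False
        self.hits += 1
        # Assumption: once the count is reached, every later hit also halts.
        return self.hits >= self.hits_required


bp = CountedBreakpoint(0x808B5A, hits_required=3)
results = [bp.should_halt(0x808B5A) for _ in range(3)]
print(results)
```

Whether the real feature keeps halting after the Nth hit or resets its counter is not stated in the post, so that part is a guess.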
Of course, I added other useful things unrelated to this post, like a SAT sprite viewer!!
https://github.com/bazzinotti/bsnes-classic
Note: I've only compiled / tested my fork on OSX and Linux
koitsu wrote:
Yeah, that's one way to do it, but can be painstaking to dig through hundreds of megabytes of log data.
While I do agree that tracing for extended periods of time can yield gigantic logs -- that's not one frame of log -- I just tested this by logging Kirby's Dream Course gameplay idle state, and obtained 1.3 MB of log -- almost 15,000 lines. That's more appropriate. Also, since Espozo will primarily be debugging his small homebrew, he will have a much smaller amount of instructions per frame.
Quote:
Regardless, what can help is if the emulator/debugger doing the logging writes what scanline and/or frame it's on (or, bare minimum, if it's in NMI or not). Some do this, some don't.
BSNES-classic debugger does this : ) Sample Trace output:
Code:
808b5a sta $2c08,x [7e2c08] A:0080 X:0000 Y:0070 S:1ff4 D:1e00 DB:7e nvmxdIzc V:240 H:1160
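Since each trace line carries the scanline (V) and dot (H) counters, a small script can pull the fields apart so a big log can be filtered instead of read by eye. The following is a minimal sketch that assumes the line layout shown in the sample above; the regular expression and the function name `parse_trace_line` are illustrative, and other emulator builds may format their traces differently.

```python
import re

# Layout assumed from the sample trace line: PC, disassembly, registers,
# flag letters, then the V (scanline) and H (dot) counters.
TRACE_RE = re.compile(
    r"^(?P<pc>[0-9a-f]{6})\s+(?P<insn>.*?)\s+"
    r"A:(?P<a>[0-9a-f]{4}) X:(?P<x>[0-9a-f]{4}) Y:(?P<y>[0-9a-f]{4}) "
    r"S:(?P<s>[0-9a-f]{4}) D:(?P<d>[0-9a-f]{4}) DB:(?P<db>[0-9a-f]{2}) "
    r"(?P<flags>[nvmxdizcNVMXDIZC]{8}) "
    r"V:\s*(?P<v>\d+) H:\s*(?P<h>\d+)"
)


def parse_trace_line(line):
    """Return a dict of fields from one trace line, or None if it doesn't match."""
    m = TRACE_RE.match(line.strip())
    if m is None:
        return None
    fields = m.groupdict()
    for key in ("pc", "a", "x", "y", "s", "d", "db"):
        fields[key] = int(fields[key], 16)   # register values are hexadecimal
    fields["v"] = int(fields["v"])           # scanline counter is decimal
    fields["h"] = int(fields["h"])           # dot counter is decimal
    return fields


sample = ("808b5a sta $2c08,x [7e2c08] A:0080 X:0000 Y:0070 "
          "S:1ff4 D:1e00 DB:7e nvmxdIzc V:240 H:1160")
info = parse_trace_line(sample)
print(hex(info["pc"]), info["insn"], info["v"], info["h"])
```

With the fields parsed, keeping only the lines for a given range of V values is a one-line filter.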
bsnes-plus adds much-needed step-out and step-over functions to the debugger window, as well as various other additions like a faster memory editor and coprocessor debugging.
Yeah, I've been doing other stuff, and I just got back to this and this is really annoying me. I downloaded bsnes plus and have been doing stuff on it, but I'm still disappointed that there's no way to step more than one instruction. I tried just repeatedly pressing step, but I didn't even get past the initialization routine. I may just have to learn to use the breakpoint editor.
Espozo wrote:
but I'm still disappointed that there's no way to step more than one instruction. I tried just repeatedly pressing step, but I didn't even get past the initialization routine.
This is definitely not the way to go about debugging code. You don't trace the whole program until the point of interest, you trigger a breakpoint near the point of interest so you can step only through the part you really need to check. On the NES I often put an sta $ff instruction at the place I want to start debugging, and then set up a breakpoint for writes to $00FF. I chose that specific memory location because that byte is very likely to remain unused until the very end of the development process. The top of the stack is also a good place to trigger breakpoints.
Well dang it. I tried the debugger on bsnes plus, but it's got more fields I don't know:
[Attachment: Debugger.png]
Range? What? it'll go to any address between two numbers? Also, what's up with R, W, and X? I guess they start it, considering there isn't a check mark like in the other one.
Espozo wrote:
Range? What? it'll go to any address between two numbers?
Yes. In FCEUX if you need only one address, as opposed to a range, you can leave the second field empty, I don't know if it's the same here.
Quote:
Also, what's up with R, W, and X?
Read, write, and execute. You can select the types of accesses that will trigger the breakpoint.
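To make the range and R/W/X fields concrete, here is a rough sketch of how such a breakpoint could decide whether to trigger. It is a hypothetical model, not bsnes-plus code; the `Breakpoint` class and its `matches` method are invented for illustration.

```python
class Breakpoint:
    """Hypothetical address-range breakpoint with read/write/execute flags."""

    def __init__(self, start, end=None, access="x"):
        self.start = start
        # A single address is treated as a range of length one.
        self.end = start if end is None else end
        self.access = set(access.lower())   # any mix of "r", "w", "x"

    def matches(self, address, kind):
        """kind is "r", "w" or "x" for the access the CPU just performed."""
        return kind in self.access and self.start <= address <= self.end


# The "sta $ff" trick from earlier: break on writes to $0000FF only.
marker = Breakpoint(0x0000FF, access="w")
print(marker.matches(0x0000FF, "w"), marker.matches(0x0000FF, "r"))

# A range breakpoint covering the PPU registers, for reads and writes.
ppu = Breakpoint(0x002100, 0x00213F, access="rw")
print(ppu.matches(0x002115, "w"), ppu.matches(0x002115, "x"))
```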
Actually, when the breakpoint is triggered, what actually happens? It's annoying, because it looks like I can only step through the program right when it starts up, because if I try "step" any other time, nothing happens, and you can only even press it when the emulation is "broken".
Execution will automatically stop when a breakpoint triggers. If you had a breakpoint for when $801234 was executed (that is, you had "801234" in the first address range box, and X was checked), it would stop there once it reaches that point, and then, the next time you clicked Step, it would step over $801234.
However, that said, I think once STP is executed, it'll stop altogether, so Step won't do anything after that. That might be what you're running into.
Yeah, is there any way to see where it ran into the stp? I mean, otherwise, it's not way too helpful.
Hey, actually, look what I found (that I really should have looked at earlier)...
[Attachment: SNES RAM.png]
What's up with all these damn 5s?!
The initialization routine must not be correct (or at least that's the only explanation I can come up with.)
Those 0x55s are what bsnes (or at least bsnes-plus) initially fills WRAM with, so yeah, if those are still around, your init routine hasn't cleared WRAM.
The problem seems to be about here:
Code:
;**** clear SNES RAM ********
STZ $2181 ;set WRAM address to $000000
STZ $2182
STZ $2183
LDX #$8008
STX $4300 ;Set DMA mode to fixed source, BYTE to $2180
LDX #.LOWORD(zero_fill_byte)
STX $4302 ;Set source offset
LDA #.BANKBYTE(zero_fill_byte)
STA $4304 ;Set source bank
LDX #$0000
STX $4305 ;Set transfer size to 64KBytes (65536 bytes)
LDA #$01
STA $420B ;Initiate transfer
LDA #$01 ;now zero the next 64KB (i.e. 128KB total)
STA $420B ;Initiate transfer
What the heck? There's a large, 128KB patch of 0's in my rom? I'd have thought this would have been a simple loop, not a DMA transfer. I mean, why waste the space?
If the DMA channel's increment mode is set not to increment the A bus address, it will take one literal $00 in the ROM and spray it over the entire $7E bank and then over the entire $7F bank.
Yeah, there isn't a 128 KB patch of zeroes in your ROM, DMA is able to write a single byte to a bunch of addresses. That said, I am surprised that code doesn't seem to be working; it seems to be correct as far as I can see.
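A toy model may help show why one byte in ROM is enough. The sketch below imitates a DMA with a fixed source address writing through the WRAM port, which advances its own address after every write; a size register of 0 is treated as 65536 bytes, as the comments in the code above say. The function name and the flat `bytearray` standing in for WRAM are assumptions made for illustration.

```python
def dma_fixed_fill(source, src_addr, wram, wram_addr, size_reg):
    """Copy source[src_addr] repeatedly into wram, starting at wram_addr.

    The source address never changes (fixed mode); only the destination
    advances. Returns the WRAM address after the transfer.
    """
    count = size_reg if size_reg != 0 else 0x10000   # 0 means 65536 bytes
    value = source[src_addr]
    for i in range(count):
        wram[wram_addr + i] = value
    return wram_addr + count


rom = bytes([0x00])                     # the single zero_fill_byte
wram = bytearray([0x55]) * 0x20000      # 128 KB, pre-filled with $55

addr = dma_fixed_fill(rom, 0, wram, 0x00000, 0)   # first 64 KB
addr = dma_fixed_fill(rom, 0, wram, addr, 0)      # second 64 KB
print(hex(addr), wram.count(0x55))
```

After both transfers no $55 bytes remain, which is what the RAM viewer should show once the real DMA actually reaches the registers.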
I guess if there's not something wrong there, there must be something wrong in the rest of the initialization routine? I don't know. Anyway, it's not that big, so...
Code:
.macro InitializeSNES
rep #$30 ;A=16, X/Y=16
;Note: this should correlate with ZEROPAGE in snes.cfg
lda #$0000
tcd ;Set D = $0000 (direct page)
;Note: this should correlate with the top of BSS in snes.cfg
ldx #$1fff
txs ;Set S = $1fff (stack pointer)
;Register initialisation values, per official Nintendo documentation
sep #$20 ;A=8
lda #$80
sta $2100
stz $2101
stz $2102
stz $2103
stz $2104
stz $2105
stz $2106
stz $2107
stz $2108
stz $2109
stz $210a
stz $210b
stz $210c
stz $210d
stz $210d
stz $210e
stz $210e
stz $210f
stz $210f
stz $2110
stz $2110
stz $2111
stz $2111
stz $2112
stz $2112
stz $2113
stz $2113
stz $2114
stz $2114
lda #$80
sta $2115
stz $2116
stz $2117
stz $211a
stz $211b
lda #$01
sta $211b
stz $211c
stz $211c
stz $211d
stz $211d
stz $211e
lda #$01
sta $211e
stz $211f
stz $211f
stz $2120
stz $2120
stz $2121
stz $2123
stz $2124
stz $2125
stz $2126
stz $2127
stz $2128
stz $2129
stz $212a
stz $212b
stz $212c
stz $212d
stz $212e
stz $212f
stz $4200
lda #$ff
sta $4201
stz $4202
stz $4203
stz $4204
stz $4205
stz $4206
stz $4207
stz $4208
stz $4209
stz $420a
stz $420b
stz $420c
stz $420d
;ClearVram
LDA #$80
STA $2115 ;Set VRAM port to word access
LDX #$1809
STX $4300 ;Set DMA mode to fixed source, WORD to $2118/9
LDX #$0000
STX $2116 ;Set VRAM port address to $0000
LDX #.LOWORD(zero_fill_byte)
STX $4302 ;Set source address to $xx:0000
LDA #.BANKBYTE(zero_fill_byte)
STA $4304 ;Set source bank
LDX #$0000
STX $4305 ;Set transfer size to 65536 bytes
LDA #$01
STA $420B ;Initiate transfer
;ClearPalette
STZ $2121
LDX #$0100
ClearPaletteLoop:
STZ $2122
STZ $2122
DEX
BNE ClearPaletteLoop
;**** clear Sprite tables ********
STZ $2102 ;sprites initialized to be off the screen, palette 0, character 0
STZ $2103
LDX #$0080
LDA #$F0
_Loop08:
STA $2104 ;set X = 240
STA $2104 ;set Y = 240
STZ $2104 ;set character = $00
STZ $2104 ;set priority=0, no flips
DEX
BNE _Loop08
LDX #$0020
_Loop09:
STZ $2104 ;set size bit=0, x MSB = 0
DEX
BNE _Loop09
;**** clear SNES RAM ********
STZ $2181 ;set WRAM address to $000000
STZ $2182
STZ $2183
LDX #$8008
STX $4300 ;Set DMA mode to fixed source, BYTE to $2180
LDX #.LOWORD(zero_fill_byte)
STX $4302 ;Set source offset
LDA #.BANKBYTE(zero_fill_byte)
STA $4304 ;Set source bank
LDX #$0000
STX $4305 ;Set transfer size to 64KBytes (65536 bytes)
LDA #$01
STA $420B ;Initiate transfer
LDA #$01 ;now zero the next 64KB (i.e. 128KB total)
STA $420B ;Initiate transfer
.endmacro
;======================================================================
.segment "RODATA"
;======================================================================
zero_fill_byte:
.byte $00
Just to let you know, this macro is the first thing that gets run. One thing I did notice is:
Quote:
;Note: this should correlate with ZEROPAGE in snes.cfg
lda #$0000
tcd ;Set D = $0000 (direct page)
;Note: this should correlate with the top of BSS in snes.cfg
ldx #$1fff
txs ;Set S = $1fff (stack pointer)
That's screwed up? I don't know.
Oh, and to let you guys know, if I place an stp where the macro is over, all of ram is #$55, just as it (sort of?) should be.
There's nothing wrong with any of that code (the initialisation routine, nor the DMA routine). Proof that it works is in the fact that the same code worked just fine in mode 20. So your description of something being broken is vague and ambiguous; you aren't being clear enough in what's wrong because you aren't looking at everything you should be. :-)
The problem, as I've already said before, is almost certainly that you're blindly doing writes to SNES registers not taking into account what B is. Odds are B is probably $C0 or something, so something like sta $420b is going to write to $c0420b not $00420b like was happening with mode 20.
I already explained that the "easiest" way to rectify this is to use 24-bit addressing (e.g. sta $00420b, etc.), and tepples pointed out that ca65 requires very specific syntax to make that work properly.
The other option is to set B=$80 or B=$00 (your choice -- again, please see the mode 21 memory map) and then do writes to the SNES registers there, but this may cause problems if you're trying to load, say, some data from your ROM (B affects both reads and writes). In that situation, most people use 24-bit addressing on SNES register access.
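The difference between the two addressing forms can be written out as plain arithmetic. This is a minimal sketch; the function names are made up, and it ignores indexed modes and bank wrapping.

```python
def absolute_address(db, addr16):
    """Effective address of a 16-bit absolute data access: bank comes from B (DB)."""
    return ((db & 0xFF) << 16) | (addr16 & 0xFFFF)


def long_address(addr24):
    """Effective address of a 24-bit long access: B is not involved at all."""
    return addr24 & 0xFFFFFF


# sta $420b with B = $C0 lands in ROM space, not on the DMA enable register.
print(hex(absolute_address(0xC0, 0x420B)))
# The same instruction with B = $00 (or $80) reaches the register.
print(hex(absolute_address(0x00, 0x420B)))
# sta $00420b names the bank explicitly, so B does not matter.
print(hex(long_address(0x00420B)))
```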
What's wrong is that ram is filled with #$55, even after putting an stp after the routine (so that nothing else is run), so the initialization routine is the only thing being run. Because it's the first (and only) thing being run, B should be fine.
Do you have some form of instant messenger communication w/ audio (ex. Skype, TeamSpeak, Discord, etc.) and a microphone? I've noticed that several of your threads are ridiculously long because it's a constant back-and-forth that progresses very very slowly, all stuff that could be rectified or gone over easier in real-time.
If you do, I can try to spend some time with you some evening going over all of this alongside a screen capture/screen share so you can see what's going on. It would save a lot of time compared to these epic threads.
The "problem" isn't the initialisation routine, it's that everything was designed to be operating under the assumption of mode 20 -- where $0000-7FFF (which includes SNES registers) are mirrored across several banks where ROM resides (e.g. $0000-7FFF is mirrored from banks $00-3F, while your ROM is in $8000-FFFF in banks $00-3F as well, hence why a simple sta $2100 works -- this isn't the case with mode 21). I don't know how many times and ways I can explain this, LOL
Oh, if B is being set equal to K, yeah, that's almost certainly wrong for HiROM. B has to be set to one of the LoROM banks for this to work correctly. (That is, B should be one of $00-$3f, or $80-$bf.)
Otherwise, $2100, $2101, etc. will end up referring to e.g. $c02100, $c02101, which is ROM, not PPU registers. That would make writing to them do absolutely nothing.
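As a rough illustration of the bank rule above, the sketch below sorts a 24-bit address into "system area" or "cartridge". It is a simplified view of the map described in this thread (banks $00-$3f and $80-$bf carry the low-half mirror, $7e/$7f are WRAM); SRAM and other cartridge-specific regions are deliberately lumped together, and the labels are made up.

```python
def has_system_area(bank):
    """True for banks whose $0000-$7FFF half mirrors WRAM and the registers."""
    bank &= 0xFF
    return bank <= 0x3F or 0x80 <= bank <= 0xBF


def classify(address):
    """Return a coarse label for a 24-bit CPU address (simplified model)."""
    bank = (address >> 16) & 0xFF
    offset = address & 0xFFFF
    if bank in (0x7E, 0x7F):
        return "WRAM"
    if has_system_area(bank) and offset < 0x8000:
        if offset < 0x2000:
            return "WRAM mirror"
        if 0x2100 <= offset <= 0x21FF:
            return "B-bus registers (PPU, APU, WRAM port)"
        if 0x4200 <= offset <= 0x43FF:
            return "CPU/DMA registers"
        return "other system area"
    return "cartridge (ROM in mode 21)"


for addr in (0x002100, 0x80420B, 0xC02100, 0xC0420B):
    print(hex(addr), "->", classify(addr))
```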
Yes, that is the problem, but it's more complex than that because he's going to run into this dilemma throughout the rest of his code, not just the "initialisation routine". He needs to understand the key/major difference in mode 21's memory map and the ramifications of it, alongside properly using a debugger to figure this stuff out in real-time.
Without explaining it thoroughly/properly, he's just going to end up doing stuff like blindly copy-pasting sep #$20, lda #$00, pha, plb into places in his code and be like "well this works, but now this other thing doesn't, why doesn't it work!!" and the thread will reach 50 pages within a few days. ;-)
koitsu wrote:
Do you have some form of instant messenger communication w/ audio (ex. Skype, TeamSpeak, Discord, etc.) and a microphone?
No. I'd probably sound like a jackass too considering my sister's room is right beside mine.
Nicole wrote:
Oh, if B is being set equal to K, yeah, that's almost certainly wrong for HiROM. B has to be set to one of the LoROM banks for this to work correctly. (That is, B should be one of $00-$3f, or $80-$bf.)
I don't see that happening anywhere in the code though.
koitsu wrote:
the thread will reach 50 pages within a few days.
But it'll be EPIC.
Then how about text-based instant messaging, such as IRC, XMPP, AIM, Hangouts, or Skype? You can often find people with some Super NES programming knowledge in the #nesdev channel on EFnet.
(When mentioning #nesdev, I feel the need to point out again that the channel is known for tolerating off-topic talk to fill the time between on-topic conversations. Super NES programming is near-topic there, so it should be fine.)
The best I can do then is some kind of Twitch stream showing problems/issues/fixing all this stuff, but the problem is that there's a ~15 full second delay between what you see/hear and what the streamer is actually doing (this is, effectively, by design). The delay makes a Q&A/troubleshooting session fairly painful. I'm happy to do it anyway if asked; one could bang out most of the problems with this code under mode 21 in half an hour, most likely.
Upload a zip/rar of the latest code and I'll do it. -- Edit: I'll just use the code from here.
And no, 50-page threads are not "epic" -- they may be helpful to a single individual, but they're worthless as reference material, even for the individual. When a forum thread is being used as an equivalent replacement for real-time chat, then the thread has lost its purpose + lost its effectiveness.
I worked on this for about 2 hours, streaming it on Twitch. It's probably too long of a video to watch, but it's there.
It is 100% possible to get this to work in mode 21 -- I was successful in doing that -- however, this code is a complete travesty. The organisation is 100% chaotic, which is one of the biggest problems: there's a high amount of .include usage, while within those files there's setting of .segments so that the segment changes suddenly.
This is what happens when you take someone who's barely learning assembly, using a "bare-bones-ish" (more so than cc65!) assembler (WLA DX) and force them into using cc65, where this whole concept of segments/memory/etc. layout is extremely important. TL;DR -- cc65 is really intended for "advanced" programmers who understand what's going on and what to look for. Espozo doesn't even understand how to use a debugger, so um... yeah...
The "get it working and don't think about it" approach involved two things: 1) moving all the code into the CODE_UPPER segment (i.e. run everything out of $c08000-c0ffff -- more specifically, $008000-00ffff) so that SNES register access worked all the time. This is essentially "like mode 20 but in mode 21", in a way. It's crummy, but it worked (I tried to do it the Right Way(tm) but there's just too much to modify and it's all organised badly (see above paragraph)), and 2) re-working part of the startup routines a little bit (probably unnecessary). #1 was the important bit.
I got the code working up to this point:
Code:
c085e3 rti A:3800 X:0000 Y:0000 S:1ffb D:0000 DB:00 nvmxdIzC V:232 H: 730 F: 4
c085a3 jsr $8208 [c08208] A:3800 X:0000 Y:0000 S:1fff D:0000 DB:00 nvmxdizC V:232 H: 782 F: 4
c08208 rep #$30 A:3800 X:0000 Y:0000 S:1ffd D:0000 DB:00 nvmxdizC V:232 H: 828 F: 4
c0820a lda #$0200 A:3800 X:0000 Y:0000 S:1ffd D:0000 DB:00 nvmxdizC V:232 H: 850 F: 4
c0820d sta $000c [00000c] A:0200 X:0000 Y:0000 S:1ffd D:0000 DB:00 nvmxdizC V:232 H: 874 F: 4
c08210 tcd A:0200 X:0000 Y:0000 S:1ffd D:0000 DB:00 nvmxdizC V:232 H: 914 F: 4
c08211 sep #$20 A:0200 X:0000 Y:0000 S:1ffd D:0200 DB:00 nvmxdizC V:232 H: 928 F: 4
c08213 lda $00 [000200] A:0200 X:0000 Y:0000 S:1ffd D:0200 DB:00 nvMxdizC V:232 H: 950 F: 4
c08215 pha A:0200 X:0000 Y:0000 S:1ffd D:0200 DB:00 nvMxdiZC V:232 H: 974 F: 4
c08216 rep #$30 A:0200 X:0000 Y:0000 S:1ffc D:0200 DB:00 nvMxdiZC V:232 H: 996 F: 4
c08218 lda $01 [000201] A:0200 X:0000 Y:0000 S:1ffc D:0200 DB:00 nvmxdiZC V:232 H:1018 F: 4
c0821a pha A:0000 X:0000 Y:0000 S:1ffc D:0200 DB:00 nvmxdiZC V:232 H:1050 F: 4
c0821b rts A:0000 X:0000 Y:0000 S:1ffa D:0200 DB:00 nvmxdiZC V:232 H:1080 F: 4
c00001 brk #$55 A:0000 X:0000 Y:0000 S:1ffc D:0200 DB:00 nvmxdiZC V:232 H:1122 F: 4
This made me laugh -- it's obvious what's going on here (the RTS pulls values off the stack which are probably something that a previous routine mangled/overwrote, so PC gets set to something completely bonkers, which ends up running code that equates to $00 55 which is BRK $55).
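The jump to $c00001 can be reproduced from the numbers in the trace. As far as the trace shows, the two pha instructions push bytes loaded from $000200-$000202, all of them zero, and rts then pulls a 16-bit value and adds one. The sketch below models only that pull; the function name and the flat `bytearray` used for bank $00 are assumptions for illustration.

```python
def rts_target(memory, s, k):
    """Model RTS: pull a 16-bit address from the stack, add 1, stay in bank K.

    Returns (new 24-bit PC, new stack pointer).
    """
    lo = memory[(s + 1) & 0xFFFF]
    hi = memory[(s + 2) & 0xFFFF]
    pc = (((hi << 8) | lo) + 1) & 0xFFFF
    return (k << 16) | pc, (s + 2) & 0xFFFF


bank0 = bytearray(0x10000)      # stack area, holding the zeros that were pushed
pc, s = rts_target(bank0, s=0x1FFA, k=0xC0)
print(hex(pc), hex(s))
```

The result matches the last trace line (PC $c00001, S $1ffc), where the bytes $00 $55 happen to decode as brk #$55.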
None of this is my fault -- this is absolutely Espozo's code going nuts and has nothing (immediately) to do with mode 21 that I can tell.
BTW, the above code correlates with start_object_identifier and stuff around there. This looks like copy-pasta up the wazoo, so it's no wonder it's busted.
I can do a second attempt (still using the crappy "stick everything in CODE_UPPER" approach) and upload the results here later this week.
P.S. -- I became infuriated at a couple weird/wonky bugs I encountered in the process (ca65 wanting some sort of label notation on a macro when it wasn't necessary), as well as MAJOR annoyance when I found out that macros in the generated assembly listing don't actually get assembled to their byte counterparts until the macro is actually called (and you only get about 12 bytes worth). RIDICULOUS. Bring back the days of linear assemblers. Stop with all this mayhem! Listings are INSANELY important, and every time I see an assembler not do them correctly my blood boils.
P.P.S -- That build.bat script is so incredibly broken in numerous regards, including ca65/ld65 argument order. *sigh* :-)
koitsu wrote:
P.P.S -- That build.bat script is so incredibly broken in numerous regards, including ca65/ld65 argument order. *sigh*
I'm guessing it's mine. Welp.
Espozo, do you have GNU Make? I think it'd be better in practice for you to start using makefiles rather than completely rely on batch files.
As an example, this could be your makefile, assuming that the main file is main.s:
Quote:
CC = ca65
LD = ld65
FIX = superfamicheck
TITLE = basic
EMU = higan.exe
# Recipe lines below must start with a tab character.
all: $(TITLE).sfc
	$(EMU) $(TITLE).sfc
$(TITLE).sfc: main.o
	$(LD) -C snes.cfg -o $(TITLE).sfc main.o
	$(FIX) $(TITLE).sfc -S -f
main.o: main.s
	$(CC) -o main.o main.s
clean:
	del main.o $(TITLE).sfc
This assumes that you have ca65, ld65, superfamicheck (I can give binaries later today if need be) and higan.exe.
build.bat can consist of this:
Quote:
mingw32-make
And there you go.
I'd post how to get MinGW (not sure if you have it) but can't right now.
I'm on a phone and so probably made some errors in this post, so someone correct me.
Edit: yeah. I forgot that with single module makefiles, main.s or whatever you name it needs to have a dependency on all the other .s and .i files.
koitsu wrote:
P.P.S -- That build.bat script is so incredibly broken in numerous regards, including ca65/ld65 argument order. *sigh*
Might be caused by differing cc65 versions, the required argument order was changed at one point for some arguments.
koitsu wrote:
This is what happens when you take someone who's barely learning assembly, using a "bare-bones-ish" (more so than cc65!) assembler (WLA DX) and force them into using cc65, where this whole concept of segments/memory/etc. layout is extremely important. TL;DR -- cc65 is really intended for "advanced" programmers who understand what's going on and what to look for. Espozo doesn't even understand how to use a debugger, so um... yeah...
I've long felt that Espozo's learning approach is ill-fated. I recall mentioning it about a year ago on another of his threads, and I mentioned it on page 6 of this one. I received the same response ("You don't HAVE to help me"), but doesn't he realize the large gap he has created in his mental knowledge base?
I've watched Espozo reply to another thread focused on "the basics" with a similar gap in information -- that he himself doesn't seem capable of applying -- The full context can be gathered from here -- "Khaz" gives a great response as linked. But then, in the following post, Espozo tells this person an "off-the-cuff" advanced remark. This could tempt the OP to learn this new concept -- but the OP has not even yet understood the true main reason for turning off the screen or developed a sense of basic vblank/NMI usage -- and by this I mean at least making some programs with basic NMI usage, not just reading a tutorial example and saying "Yeah I get it" -- that's never enough.
No beginner is going to benefit from being overly aware of advanced concepts, and Espozo is a great example of this. Further, tempting a beginner to learn an advanced concept before understanding the basics is a no no.
Here's an example of what I believe to be a proper basic response to the aforementioned thread - "turning off the screen is what you'll do at the beginning of your program to safely prepare VRAM and video settings -- After the screen has been turned on, further updates to video data are typically performed regularly through a VBLANK handler." That could follow with a basic description of VBLANKing -- Anyways that's probably taught in the tutorial the OP was following.
IMO, even if I were to have an "advanced" section to my tutorials -- I would not even mention them in the beginner tutorials, but only as something accessible directly from the TOC -- something clearly only accessible for someone with the express desire to learn "advanced tips and tricks."
Learning too fast is something I am guilty of -- I've occasionally skipped over basic sections of documentation to "get to the good part" -- but I think Espozo has gone overboard.
I feel Espozo knows all about these "out-there" concepts but nothing about the basics.
Of course I sympathize for Espozo -- I know programming for SNES requires a solid understanding of a myriad of concepts, tools, and languages.
--
Getting back to the discourse on ca65 / HiROM, I recalled that Skipp and Friends and SNESKit contain a ca65 configurable ROM-mode implementation (slow/fast lo/hi), but I couldn't post it here because I knew Espozo would not understand what to do, without too much hand-holding - and I'm already over-investing my time with this post!
nicklausw wrote:
Espozo, do you have GNU Make? I think it'd be better in practice for you to start using makefiles rather than completely rely on batch files.
I love MAKE, but yeah, let's give the guy who barely understands how to write the proper SNES asm, or use his assembler conveniently, or use a debugger, a new build system! yay!!! And let's not teach him Makefile syntax, let's just give him a file and that's it! </sarcasm>
Case in point -- I don't think Espozo has learned to properly RTFM -- someone who *has* would look up Gnu Make / Makefiles / Makefile syntax and learn about it -- maybe ask some questions if that doesn't yield results -- but I can too easily imagine him just being like *derp, what's that? lol* And I don't blame him -- because who would want to go off and learn *yet another* new thing, while already having so many things that should be understood but aren't eg (proper use of assembler, debugger, asm, memory map, etc)
I think doing the manual conversion from lorom to hirom is a GREAT idea to learn the implementation details of the memory maps -- but Espozo, why don't you try converting a super-basic project from LoROM to HiROM first, at least!
If Koitsu is right that you don't properly understand how to use your assembler control commands or organize the code -- that should be a priority as you do this task. Try not to start learning or creating YET ANOTHER NEW THING. Maybe Koitsu can help guide you. Props to him for his time investment in you that he's already made with his Twitch video (no I did not watch it). Try to build upon your good techniques - not on bad ones. And if you realize you have a bad technique or habit -> think about taking the time to improve it.
koitsu wrote:
I worked on this for about 2 hours,
streaming it on Twitch.
So much effort!
Kudos to you though. Let's just hope the OP benefits from your findings and explanations. Also nice seeing you live and in color for once.
@00:14:47 -- I couldn't agree more. Never understood why on earth neviksti stuck his SNES initialization in a macro of all places, which in turn contains a jsl to the actual subroutine (named InitSNES in neviksti's SNES starter kit) ... just ridiculous. At least he didn't forget to back up the return address (as said subroutine zeroes-out WRAM, just like in this project) using some read/writable -- and unused in the process -- DMA registers, and restore it just before rtl occurs. All of this hacky maneuvering can and definitely should be avoided indeed (by simply merging all of the initialization stuff into your bootstrap).
bazz wrote:
I love MAKE, but yeah, let's give the guy who barely understands how to write the proper SNES asm, or use his assembler conveniently, or use a debugger, a new build system! yay!!! And let's not teach him Makefile syntax, let's just give him a file and that's it! </sarcasm>
I was just trying to help improve the situation with his build system. If that's your nicest way of saying it's a bad idea then uh...wow.
Anyway, if Espozo wants to change his build system then we'll continue this in PM.
koitsu wrote:
P.S. -- I became infuriated at a couple weird/wonky bugs I encountered in the process (ca65 wanting some sort of label notation on a macro when it wasn't necessary)
Not sure if you figured this out (only watched a minute or two of the stream), but it was probably caused by the macro name not being defined by the time the macro got invoked. So ca65 assumed the invocation was a label name, and thus expected ":" there. I.e.
Code:
foo ; oops, ca65 thinks this is a label that's missing a colon
.macro foo
nop
.endmacro
I wrote him a worthy Windows command/batch script for his Metasprite project thing a few months ago -- I don't know why that's not being used. And while I agree that make is convenient, I'm not really sure introducing make into the picture is a worthy cause right now given the plethora of other issues with this code. Download the .zip and dig through it yourself and you'll see the main issue has to do with code organisation.
Another "annoyance" (using term loosely here) has to do with use of opcodes like bit and stz (amongst others) with SNES registers -- many opcodes don't support 24-bit addressing, so you end up in this situation where you basically have to effectively be in a bank with $0000-7FFF mirrored from bank $00 so you can do register access, because the code itself is also accessing some data in segment RODATA (alongside direct page accesses, which are 100% OK -- except that there are several routines where he's using tcd to switch D around and that was one example at the end of my stream where I showed that was likely stomping over some area of the stack so an rti resulted in "amusing" unexpected behaviour). Other bits of code in the same area explicitly use a: to force absolute addressing, so what B is matters. It doesn't help that I harp heavily on the positive use of generated listings, and even ca65 doesn't do that great of a job making these.
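To make the "many opcodes don't support 24-bit addressing" point concrete, the set of 65816 instructions that do have an absolute long form is small enough to list. The helper below is only an illustrative lookup with a made-up name; it says nothing about how ca65 itself decides on addressing sizes.

```python
# Instructions that have an absolute long (24-bit) operand form on the 65816.
LONG_CAPABLE = {
    "ora", "and", "eor", "adc", "sta", "lda", "cmp", "sbc",  # ALU/load/store
    "jmp", "jml", "jsl",                                     # long jumps/calls
}


def supports_long(mnemonic):
    """True if the instruction can take a 24-bit address such as $00420b."""
    return mnemonic.lower() in LONG_CAPABLE


for op in ("sta", "lda", "stz", "bit", "stx", "ldx"):
    print(op, supports_long(op))

# Consequence: "stz $2100" has no long equivalent, so from a bank without the
# register mirror it has to become something like lda #$00 / sta $002100,
# or B has to point at a bank in $00-$3f / $80-$bf first.
```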
In other words: the code has to be fully reorganised to have a substantial amount of it stuck in the $8000-FFFF region so that it's usable/runnable from within banks $00-3F (or $80-BF) so register access can be done with "more ease". To those of us experienced, we "get" the how/why of this situation and know what's needed. But to someone who's struggling and doesn't understand some of the basic building blocks, this is hard to grasp. That leads me to this:
Stepping back for a moment and focusing, I think it would be important for him to have a better understanding of the system he's working on (specifically the CPU and the SNES memory model (mode 20 vs. mode 21 -- stay focused on just those two)). Respectfully, he's already admitted to not understanding the K and B registers and how he should go and read the WDC 65816 manual at some point (which actually has training material in it, to some degree), implying he hasn't. He admits to never having used a debugger before (I can relate to this when I was learning 6502 though, hence the need for good listing generations). This is just the tip of the iceberg. And again I say all that with respect, not actual negativity or heavy judgement. He really needs good training, with VERY basic tools (i.e. a linear/simple assembler that just creates a ROM image and doesn't screw over someone's brain with all the "organisation chaos" that ca65/ld65 brings to the table. That's highly advanced).
In other words: I can absolutely see how and why he might feel overwhelmed and extremely confused. Had I been in his shoes back in the 80s when I started learning 6502 on the Apple II+ -- inundated with highly complicated tools that required almost wizard-like knowledge of how to use them, vs. just generating a file linearly and letting me manage it myself -- I would have given up. On the flip side as a comparison, I spent a *lot* of time poring over the Merlin manual as well as 6502 books and Apple II manuals, plus I had a teacher who every day could point me in the right direction.
Understanding the CPU and the system he's developing on is incredibly important. He has a "general disorganised comprehension" of it, and he's accomplished a lot given the circumstances, but the situation is just going to get worse as time goes on (and as more and more of us try to help him). That's why I asked about microphones and instant messenger capability -- he really needs a good teacher with a very well-defined syllabus for learning 65816 (and exclude crazy "advanced opcodes" that are uncommon).
God I wish I could track down Norman Yen and see if he would hand over the code to x816 so it could be ported to C and Win32 binaries made from it. I still think it's the simplest, most "just makes sense" 65816 assembler there is -- nothing beat it back in the 90s.
See, part of why I suggested we bring make into the mix is the fact that I feel like a good way of reorganizing his code would be splitting it into multiple modules. Your old batch file handled this, yes, but I feel like make would handle it better.
I disagree. I don't think make vs. batch files is the problem -- the problem is how the actual source is organised; it's more to do with ca65/ld65 nomenclature and complexity than it is with batch vs. make. The way he's writing code is *absolutely* how you use a linear or "basic" assembler (lots of .incsrc / .incbin directives, and you put things where you want them and know, effectively, that's where they're going to end up in the resulting file/ROM). All make is going to do is allow for source dependency tracking (foobar.asm depends on blah.asm, blah.asm isn't newer than blah.o so don't reassemble blah.asm, etc.) -- it's not going to deal with ca65/ld65 SEGMENT/MEMORY complexities or explaining "why" certain code assembles fine but doesn't actually *work* at runtime.
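For reference, the dependency tracking described above boils down to a timestamp comparison. The sketch below is a simplified stand-in for what make does, with a made-up function name; real make also handles rule chains, phony targets and much more.

```python
import os


def needs_rebuild(target, sources):
    """Return True if target is missing or older than any of its sources."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(src) > target_mtime for src in sources)


# Example with hypothetical file names: reassemble only when a source changed.
if needs_rebuild("blah.o", ["blah.asm"]) if os.path.exists("blah.asm") else False:
    print("blah.asm is newer than blah.o, reassemble")
else:
    print("nothing to do")
```

Nothing in this check knows about segments or memory layout, which is the point being made: it only decides when to rerun the assembler, not whether the result is correct.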
I believe he was using WLA DX prior to this, which has not only some assembly code generation bugs that psychopathicteen found, but also breaks horribly on listing generation. Now that I've spent more time with ca65 and its -l argument, I can say I'm not all that impressed there either -- so in turn, that means spending more time in a debugger and really getting familiar with how to work out where a breakpoint should be set (I had to use ca65 listings to figure this out and did the manual math for the memory model in my head). That's beyond him right now.
So back to my point about teaching someone the basics being important... :-) I think that's also bazz's main point.
bazz wrote:
("You don't HAVE to help me")
Never mind, you "HAVE" to help me now.
bazz wrote:
Case in point -- I don't think Espozo has learned to properly RTFM -- someone who *has* would look up Gnu Make / Makefiles / Makefile syntax and learn about it -- maybe ask some questions if that doesn't yield results -- but I can too easily imagine him just being like *derp, what's that? lol* And I don't blame him -- because who would want to go off and learn *yet another* new thing, while already having so many things that should be understood but aren't (e.g. proper use of assembler, debugger, asm, memory map, etc.)
I know, what a complete dumbass. With a whopping 0 years of prior programming experience before landing here, you'd really think he'd know better. "</sarcasm>"
bazz wrote:
I sympathize for Espozo
I'm sensing truck loads of sympathy. Koitsu, on the other hand, is a total kaizo, telling people that they're doing everything wrong without giving any specific, useful advice on how to improve. "</sarcasm>"
nicklausw wrote:
I was just trying to help improve the situation with his build system. If that's your nicest way of saying it's a bad idea then uh...wow.
Remember though, "</sarcasm>" indicates it was a joke and that you must treat it as such, meaning there was absolutely no harm meant to be done. I have several examples above on how to properly use "</sarcasm>" for reference, in case you don't know how to RTFM for being humorous.
koitsu wrote:
I worked on this for about 2 hours, streaming it on Twitch. It's probably too long of a video to watch, but it's there.
What a trooper!
(I feel so special...)
koitsu wrote:
The way he's writing code is *absolutely* how you use a linear or "basic" assembler (lots of .incsrc / .incbin directives, and you put things where you want them and know, effectively, that's where they're going to end up in the resulting file/ROM).
I don't understand why they're not made like this now, especially considering how important the position of everything in memory is.
koitsu wrote:
I wrote him a worthy Windows command/batch script for his Metasprite project thing a few months ago -- I don't know why that's not being used. And while I agree that make is convenient, I'm not really sure introducing make into the picture is a worthy cause right now given the plethora of other issues with this code. Download the .zip and dig through it yourself and you'll see the main issue has to do with code organisation.
The reason for this is that I was tired of writing all the imports, exports, and all that stuff, and I didn't think I had to.
koitsu wrote:
so what B is matters.
It's sad I already forgot, but isn't B for selecting the bank for the program counter? What would you generally recommend having it as for HiROM?
Just thinking, how does the assembler split up banks and stuff? I mean, if the program counter just wraps around and doesn't increment the top 8 bits, then it should try to fit everything in 64KB regions. (banks) I imagine you'd want the same with data, not just code.
koitsu wrote:
"annoyance" (using term loosely here) has to do with use of opcodes like bit and stz (amongst others) with SNES registers -- many opcodes don't support 24-bit addressing
I hope ca65 will complain that there's a memory overflow rather than just discarding the top 8 bits of the address?
koitsu wrote:
Other bits of code in the same area explicitly use a: to force absolute addressing, so what B is matters.
Again, B is just kind of an extra 8 bit extension of D?
koitsu wrote:
he really needs a good teacher with a very well-defined syllabus for learning 65816 (and exclude crazy "advanced opcodes" that are uncommon).
Unfortunately, only a handful of people have this knowledge, none that I know locally, (I have a good deal of friends, but they don't understand why I like doing this) pretty much just the people here. I suppose I could set up something, but the problem with this kind of communication is that you need to meet up at certain times, and due to homework and chores and crap, it's not always the same. I get home around 4, and don't go to bed until about 12 (Local time. I don't know how it compares to you).
Espozo wrote:
I'm sensing truck loads of sympathy. Koitsu, on the other hand, is a total kaizo, telling people that they're doing everything wrong without giving any specific, useful advice on how to improve. "</sarcasm>"
koitsu wrote:
I worked on this for about 2 hours, streaming it on Twitch. It's probably too long of a video to watch, but it's there.
What a trooper!
(I feel so special...)
It seems like this was miscommunicated, but what koitsu meant was that he recorded a two hour video to explain it to you:
https://www.twitch.tv/koitsu/v/58543863
Rather than inundate the thread with a huge amount of inline quotes, I'll just answer the stuff here:
1. B is the Data Bank Register and is 8 bits (1 byte). You can think of B as the "upper 8 bits" of the effective 24-bit address for memory access. Example: lda $1234 would load into the accumulator data from $1234 in whatever bank B is set to. Most memory access uses B -- the exception is anything using direct page addressing (e.g. lda $12) or anything using full 24-bit addressing (e.g. lda $45aaaa or in ca65 nomenclature lda f:$45aaaa). bsnes-plus's debugger uses the name "DB" for B.
Opcodes like phb (push B) and plb (pull B) are how you can manipulate the B register.
Do not confuse this register with the upper and lower bytes of the 16-bit accumulator, which are sometimes referred to as "B" and "A" (ex: see xba opcode). The full 16-bit accumulator is also sometimes called "C" (referenced in opcodes like tcd, tcs, tdc, tsc). You learn what's referring to what based on context and experience.
2. K is the Program Counter Bank Register and is 8 bits (1 byte). You can think of K as the "upper 8 bits" of the effective 24-bit address of where code is actively running. Example: jml $45aaaa would jump to $aaaa in bank $45, hence K=$45 PC=$aaaa after the jump. bsnes-plus doesn't show K as a separate register, instead it displays PC as a full 24-bit address. (Note: jml is not a real opcode, the real opcode is jmp; it's a common alias, and is the same as doing jmp f:$45aaaa).
Opcodes like phk allow you to push K onto the stack but there's no equivalent pull operation -- instead manipulation of K is accomplished via something like jml $45aaaa (24-bit long jump), or rtl (which pulls a full 24-bit address (K and PC) off the stack).
3. D is the Direct Page Register and is 16-bits (2 bytes). It defines the "base address" of where direct page access happens; direct page is always in bank $00. For example, lda #$0800, tcd, lda $0a would set D=$0800 (hence $000800) and attempt to load a 16-bit value from direct page location $0a (hence effective addresses $00080a and $00080b). D has no relation to K or B.
There are some SNES games (ex. Chrono Trigger) that do something clever using D: lda #$2100, tcd. From this point forward, something like sep #$20, lda #$0f, sta $00 would write $0f to $2100 (SNES register). They essentially use D as a way to "quickly" access SNES registers (because direct page is always in bank $00) without having to deal with the complexity of B. The downside is that they end up having to use either 16-bit absolute addresses for accessing their variables (which are usually in direct page), or the full 24-bit addresses, depending on what their needs are. I only point this out because it's cute/clever and might be convenient, especially in mode 21.
4. "What would you generally recommend having it {not sure if referring to B or K} for mode 21?" -- that's the problem: it depends. You need to understand the mode 21 memory map to understand why the answer is "it depends".
For example, the first issue with most of your code stemmed from the fact that jml {next instruction}, phk, plb was being used. In essence, that set K=$C0, followed by setting B=$C0. Subsequently all SNES register I/O, e.g. sta $2100, were therefore (effectively) doing sta $c02100 -- that's not going to work (again: see mode 21 memory map). This is one of the main things that makes mode 21 tricky. The fact that banks $00-3f and $80-bf have their upper 32KBytes ($8000-ffff) mirrored from banks $c0-ff plays an important role, but makes things "difficult" to comprehend for a newcomer.
5. "... If the program counter just wraps around and doesn't increment the top 8 bits, then it should try to fit everything in 64KB regions. (banks) I imagine you'd want the same with data, not just code" -- your line of thinking here is absolutely correct.
On other 65816 systems (ex. Apple IIGS), the memory layout is substantially different (IMO better), and you end up using full 64KByte banks more commonly. There's nothing different about the 65816 in the SNES in this regard -- instead the complication stems from the memory map.
You need to look closely at the mode 21 memory map, specifically the difference between banks $c0-ff and banks $00-3f (or $80-bf), to understand exactly the complication. In mode 21, given the need to do SNES register access, you absolutely must end up writing code that goes into the upper half ($8000-ffff) of $c0-ff (thus ends up in the upper half of banks $00-3f or $80-bf). I don't know how else to describe this without writing a bloody book. It's a lot easier to just show someone the memory map and say "think about this fact".
6. (re: attempting to use 24-bit addressing with opcodes that don't support it) "I hope ca65 will complain that there's a memory overflow rather than just discarding the top 8 bits of the address?" -- correct. It spits out a message that's a little more helpful. (I ran into this myself at one point, I forgot stz doesn't support long addressing).
All in all, I really would suggest you spend some time cleaning up your code (maybe tepples could help with this, he's good with ca65/ld65) and stick with mode 20. I really do not think you are anywhere near ready for mode 21 yet. Spend more time learning the 65816, the SNES memory map (generally speaking), and ca65/ld65 complexities. Worry about moving from mode 20 to mode 21 when you actually have a need for it.
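To make the "sta $2100 with B=$C0" problem concrete, here is a rough Python sketch of the mode 21 map as described in the points above. The function name and the region labels are made up for illustration, and the table is deliberately coarse (it ignores SRAM, the joypad ports, and other details), so treat it as a thinking aid, not a reference.

```python
def classify_hirom(addr):
    """Very coarse label for a 24-bit CPU address under mode 21 (HiROM).

    Hypothetical helper for illustration only; real hardware has more
    regions (SRAM, joypad ports, open bus) than are modelled here.
    """
    bank = (addr >> 16) & 0xFF
    offset = addr & 0xFFFF

    if bank in (0x7E, 0x7F):
        return "WRAM"
    if 0x40 <= bank <= 0x7D or bank >= 0xC0:
        return "ROM"  # full 64 KB banks of ROM

    # What is left are the "system" banks $00-3F and $80-BF.
    if offset < 0x2000:
        return "WRAM (low 8 KB mirror)"
    if 0x2100 <= offset <= 0x21FF or 0x4200 <= offset <= 0x43FF:
        return "I/O registers"
    if offset >= 0x8000:
        return "ROM (upper-half mirror)"
    return "other"


# sta $2100 with B=$00 or B=$80 reaches the register...
print(classify_hirom(0x002100), "/", classify_hirom(0x802100))
# ...but with B=$C0 the same 16-bit operand lands in ROM.
print(classify_hirom(0xC02100))
```

The point of the sketch is only that the 16-bit operand is the same in both cases; which bank B selects decides whether the write hits a register or ROM.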
I'm not gonna dive too deep into this discussion, but I feel that the nomenclature around 65816 registers is further messing things up...
Koitsu prefers the shorthand names that can be inferred from the mnemonics, which is concise but quite confusing. "K" and "B" are both inferred from the word "BanK", so the abbreviations don't hold any clues about the purpose of the registers. (Great work with the stream btw!)
WDC, bsnes, and most other documentation use these names:
DB / DBR = Data Bank (Register)
PB / PBR = Program Bank (Register)
DP / DPR = Direct Page (Register)
Using "B" to refer to DB is rather confusing indeed, since "B" is what WDC calls the upper (hidden) 8-bits of the accumulator.
Edit: Of course we can blame the demented 3-letter mnemonics WDC uses for all instructions with implied operand... I fully sympathize with Nocash's motivation for using a custom 65816 dialect (and have dabbled with my own take on the problem as well, but always come to the conclusion that using the standard mnemonics is least confusing in the long run).
You know... If you put your LoRom code as-is into a HiRom model, with the current code going into the upper 32k of the preliminary banks, you could effectively use your code unmodified, and write the rest under the hirom model.
There is a thread by AWJ observing other commercial games that do this and use the lower 32k just for data.
EDIT: This LoRom code should be executed from "system" banks $00-$3F / $80-$BF so it can access the WRAM / PPU regs etc.
Espozo wrote:
It's sad I already forgot, but isn't B for selecting the bank for the program counter? What would you generally recommend having it as for HiROM?
K (program bank) is for the program counter; B (data bank) is for data. Point B to the data you're currently using. If you're using data in two different banks, such as data in ROM and data in RAM, use 24-bit addressing with one set of data, and point B at the set that you're using with an addressing mode that doesn't support 24-bit.
Espozo wrote:
I mean, if the program counter just wraps around and doesn't increment the top 8 bits, then it should try to fit everything in 64KB regions. (banks) I imagine you'd want the same with data, not just code.
The program counter does not increment the top 8 bits. Indexed addressing modes for data (al,x and [d],y) increment the top 8 bits.
Espozo wrote:
I hope ca65 will complain that there's a memory overflow rather than just discarding the top 8 bits of the address?
It should provide "Range error".
Espozo wrote:
Again, B is just kind of an extra 8 bit extension of D?
No. D (direct page base) and S (stack pointer) always point into bank $00. B is used only for 16-bit addressing modes.
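A small sketch of how B and D feed into the final 24-bit address may help here. This is Python, not 65816 code, and the function name, mode names, and parameter names (dbr, d) are invented for illustration; it only models the three plain (non-indexed) addressing modes discussed in this thread.

```python
def effective_address(mode, operand, dbr=0x00, d=0x0000):
    """Model the 24-bit address a plain data access ends up at.

    mode is one of "direct", "absolute", "long" (hypothetical names).
    dbr is the data bank register (B), d is the direct page register (D).
    """
    if mode == "direct":        # e.g. lda $12
        # Direct page always lives in bank $00; B is ignored.
        return (d + operand) & 0xFFFF
    if mode == "absolute":      # e.g. lda $1234
        # B supplies the upper 8 bits.
        return (dbr << 16) | (operand & 0xFFFF)
    if mode == "long":          # e.g. lda f:$45aaaa
        # The operand carries its own bank; B and D are ignored.
        return operand & 0xFFFFFF
    raise ValueError("unknown addressing mode: " + mode)


# D=$0800, lda $0a reads from $00080A.
print(f"{effective_address('direct', 0x0A, d=0x0800):06X}")
# D=$2100, sta $00 writes to $002100 whatever B is (the Chrono Trigger trick).
print(f"{effective_address('direct', 0x00, dbr=0xC0, d=0x2100):06X}")
# B=$C0, sta $2100 goes to $C02100, which is not a register.
print(f"{effective_address('absolute', 0x2100, dbr=0xC0):06X}")
```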
koitsu wrote:
In mode 21, given the need to do SNES register access, you absolutely must end up writing code that goes into the upper half ($8000-ffff) of $c0-ff (thus ends up in the upper half of banks $00-3f or $80-bf).
Or, with code in bank $C0-$FF, write to f:$0021xx or f:$0043xx. If the f: prefix is ugly, you could instead write to $8021xx or $8043xx.
koitsu wrote:
I don't know how else to describe this without writing a bloody book. It's a lot easier to just show someone the memory map and say "think about this fact".
I don't know if it'd count as "writing a bloody book" to you, but you could do the "common pitfalls" method: list each wrong way that you've seen often, explain why it's wrong with an excerpt from the memory map, show the corresponding right way, and explain why it's right with an excerpt from the memory map.
At this point, I wonder if it'd be better to teach mode 21 first, not bothering with mode 20, because mode 21 can be used in a way more similar to Apple IIGS.
Optiroc wrote:
Using "B" to refer to DB is rather confusing indeed, since "B" is what WDC calls the upper (hidden) 8-bits of the accumulator.
Does anything refer to bits 15-8 of A as "B" other than the XBA instruction? If not, we could retcon that as "eXchange Bytes of A".
tepples wrote:
Optiroc wrote:
Using "B" to refer to DB is rather confusing indeed, since "B" is what WDC calls the upper (hidden) 8-bits of the accumulator.
Does anything refer to bits 15-8 of A as "B" other than the XBA instruction? If not, we could retcon that as "eXchange Bytes of A".
Programming the 65816 (WDC) refers to it as such where applicable. Here's the register file reference by the way (fig. 4-1, p. 46):
Edit: sorry about the weirdly scaled image... And that would be a good retcon! Although in a case like this I don't think it's warranted, and calling the bank registers "B" and "K" is a bad idea (and so were the official mnemonics).
How about this? (joking)
- Accumulator low, high, full: Instead of calling them A, B, and C, call them AL, AH, and AX
- Program Bank Register: Instead of calling it K, call it CS
- Data Bank Register: Instead of calling it B, call it DS
- Direct Page Register: Instead of calling it D, call it BP
"This is the result of 1584 improvements to the 6502. We call it 8086."
tepples actually got me to LOL :D
Weirdly scaled image, transcribed to be searchable:
Code:
65816 Native Mode Programming Model
(16-bit accumulator and index registers, mx = 00)

 23                  16  15                  8 7                   0
                        [Accumulator (B) (A) or (C) Accumulator (A) ]
[ Data Bank Reg.(DBR)  ]
                        [              X Index|Register (X)         ]
                        [              Y Index|Register (Y)         ]
[   0 0 0 0 0 0 0 0    ][               Direct|Page Register (D)    ]
[   0 0 0 0 0 0 0 0    ][                Stack|Pointer (S)          ]
[Program Bank Reg.(PBR)][              Program|Counter (PC)         ]

Processor Status Register
 7  6  5  4  3  2  1  0
                     [e] - Emulation                    0 = Native Mode
[n][v][m][x][d][i][z][c]
 |  |  |  |  |  |  |  +---- Carry                       1 = Carry
 |  |  |  |  |  |  +------- Zero                        1 = Result Zero
 |  |  |  |  |  +---------- IRQ Disable                 1 = Disabled
 |  |  |  |  +------------- Decimal Mode                1 = Decimal, 0 = Binary
 |  |  |  +---------------- Index Register Select       1 = 8-bit, 0 = 16-bit
 |  |  +------------------- Memory/Accumulator Select   1 = 8-bit, 0 = 16-bit
 |  +---------------------- Overflow                    1 = Overflow
 +------------------------- Negative                    1 = Negative
tepples wrote:
Espozo wrote:
I mean, if the program counter just wraps around and doesn't increment the top 8 bits, then it should try to fit everything in 64KB regions. (banks) I imagine you'd want the same with data, not just code.
The program counter does not increment the top 8 bits. Indexed addressing modes for data (al,x and [d],y) increment the top 8 bits.
Unfortunately it seems DMA wraps within a bank. So it depends what sort of data you're talking about.
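The three behaviours mentioned so far (program counter, long indexed data access, DMA) can be put side by side in a few lines of Python. The function names are invented, and the DMA part simply encodes the "wraps within a bank" observation above as an assumption; it is not a full DMA model.

```python
def next_pc(pbr, pc, length):
    """Program counter after fetching `length` bytes: wraps inside the bank."""
    return pbr, (pc + length) & 0xFFFF


def long_indexed(base, x):
    """al,x style access: the index carries into the bank byte."""
    return (base + x) & 0xFFFFFF


def dma_source_after(bank, addr, count):
    """DMA source after `count` bytes, assuming the bank is never incremented."""
    return bank, (addr + count) & 0xFFFF


# Running off the end of bank $C0 lands back at $C0:0000, not $C1:0000.
print(next_pc(0xC0, 0xFFFF, 1))
# Reading $7EFFFF,x with X=1 reaches $7F0000.
print(f"{long_indexed(0x7EFFFF, 1):06X}")
# A $20-byte DMA starting at $7E:FFF0 ends up back near the start of bank $7E.
print(dma_source_after(0x7E, 0xFFF0, 0x20))
```

So a buffer that straddles a bank boundary is reachable with long indexed reads, but code and (under the assumption above) DMA transfers would have to be split at the boundary.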
I'm under the impression that pretty much nothing should be between banks?
The biggest problem would be if I were to pull through with the original project of this thread, that would be unavoidable. (The buffer is too big to fit in one bank, so you'd have to find a way to draw between them, and the split could be wherever, vertically or horizontally, as it goes by tiles.)
Anyway, I haven't watched the video yet because I've been busy with school work (I'm going to be missing a few days of school and wanted to get the crap done, not that it matters) but I'll try and get to it whenever the time presents itself.
bazz wrote:
You know... If you put your LoRom code as-is into a HiRom model, with the current code going into the upper 32k of the preliminary banks, you could effectively use your code unmodified, and write the rest under the hirom model.
This is pretty much what I ended up doing during my stream. I stuck almost everything under the CODE_UPPER segment ($C08000-FFFF, a.k.a. $008000-FFFF). The code broke elsewhere, but unrelated to my changes. It's likely the "best" way to manage mode 21 -- you leave your lower 32KB ($0000-7FFF) for primarily raw data, and access it either via DMA or 24-bit addressing or whatever other means available.
None of this changes any of my prior points though. What took me 2 hours of struggling and lots of this should've really been a 15-20 minute job tops. But the deeper I went, the more and more madness I kept finding. And the madness made sense when I considered the situation and details (already discussed in past 2-3 pages).
Bring back x816. ;-)
Dang, I finally got through that video (I split it into 3 parts so I wouldn't totally lose interest). I'm totally amazed how you didn't completely lose your shit. (I liked "macro bitch" though.) Yeah, I realize how much of a screw up I was. I had a false sense of security from the fact that all the addressing I'd been doing seemed to work without me thinking about it, and that definitely changed in going into HiROM. I hadn't even considered memory mapping or addressing, I just kind of thought the assembler completely took care of it, which I now realize is impossible. I have no clue why they made the memory map so complicated when they could have just put ram from blank to blank, and rom from blank to blank, but instead they have it all interweaved and mirrored and crap. I guess it's meant to save cycles in not doing 24bit addresses or whatever, but it doesn't seem to help any in HiROM. In fact, do you know of a good illustration of HiROM and LoROM that's not part of the manual? Constantly looking through a PDF is kind of cumbersome. Yeah though, the trying to force "initializesnes" into being a routine was really odd and dumb.
On the object initialization routine, the rts was meant to serve as a jump to the object's code, as the address of the object had been pushed onto the stack. I have no idea why it flipped out that way, and believe it or not, that's actually one of the few things I didn't copy/paste.
A few last things though: What should I do to better organize the code? Group things into fewer files? Not use ".include" and instead do that ".import" and ".export" nuisance? Also, where's the actual code that you fixed? Don't tell me I was supposed to be following along...
Oh yeah, and thank you very much for enduring those two and a half hours of suffering for me.
Espozo wrote:
do you know of a good illustration of HiROM and LoROM that's not part of the manual?
This doc explains memory mapping in general using LoROM as an example.
This section breaks down the memory map
In fact, the section header really sums up the common SNES cart mappings
Code:
$00-3F : System Area
$40-6F : All ROM
$70-7D : SRAM + ROM
$7E-7F : System RAM
$80-FF : Upper Mirror
There is deeper detail in the doc, but that's a nice general rule of thumb.
HiROM is probably identical to LoROM, with the following exceptions:
- "All ROM" section has linear 64K ROM chunks instead of upper 32K
- "System Area" ROM is no longer linear 32K chunks of ROM, but "gapped" upper 32K of each 64K ROM bank.
- SRAM section is 64K instead of SRAM + ROM interleaved (32K + 32K).
---
I like to count my lorom banks from $00+, and my HiROM banks from $40+ which mirror their upper 32K to $00+ (the "upper mirror" is an afterthought, aside from the unique $FE-FF banks).
An interesting note is that in LoROM, "System Area" banks will mirror into "All-Rom" banks ONLY if there isn't enough unique Cart ROM to reach to "All-Rom" banks. But in HiROM, the upper 32K of "All-Rom" banks is always mirrored directly to the upper 32K of "System Area."
This creates the unique issue where you can never have a HiROM cart with completely unique "System Area" ROM from its "All ROM" ROM, and you can never have a LoROM cart with unique 64K in its "All ROM" section. This might cause one to dream of a combination mapper. That would require a more sophisticated mapper, and perhaps this is what is referred to as ExLoRom or ExHiROM. I'm not familiar with those mappings.
---
I want to elaborate how mirrors are generated primarily from two circumstances.
A) Cart peripheral (e.g. memory) does not have as wide an address range as the full SNES address range (and therefore repeats itself when the SNES "transcends" its range). The smaller a peripheral's range, the more times it will repeat.
B) Cart peripheral (eg memory) address lines aren't connected "1:1" to SNES address lines. (eg. lorom)
A cart could be made to not mirror, but it's cheaper to keep it "dumb." It could also be considered a "protection mechanism" (eg. accessing data at different mirrors to circumvent cloning / partial emulation solutions)
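Case B is easier to see with numbers. Below is a simplified Python sketch of how a CPU address could turn into a ROM offset in each mapping, assuming the usual wiring (LoROM skips CPU address line A15, HiROM passes the low 22 bits straight through) and a ROM of at most 4 MB. The function names are invented, and the sketch ignores which regions the console actually decodes as ROM.

```python
def lorom_offset(addr):
    """LoROM: A15 is not wired to the ROM, so each bank contributes 32 KB."""
    bank = (addr >> 16) & 0x7F
    return (bank << 15) | (addr & 0x7FFF)


def hirom_offset(addr):
    """HiROM: address lines go (roughly) 1:1, 64 KB per bank."""
    return addr & 0x3FFFFF


# LoROM: crossing $7FFF -> $8000 inside a bank repeats the same 32 KB...
print(hex(lorom_offset(0x000000)), hex(lorom_offset(0x008000)))
# ...while crossing $00:FFFF -> $01:0000 moves on to the next 32 KB.
print(hex(lorom_offset(0x00FFFF)), hex(lorom_offset(0x010000)))
# HiROM: $C08000 and $008000 are the same byte of ROM.
print(hex(hirom_offset(0xC08000)), hex(hirom_offset(0x008000)))
```

Dropping one address line is what turns a single 32 KB chunk into two images per bank, and masking off the top bits is what makes whole banks repeat.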
"The Doc" URL above fully explains this phenomenon, providing full examples of binary transitions of 7FFF - 8000 and why it creates a mirror in LoROM, and why 00:FFFF-01:0000 calls upon the next 32K section of ROM.
---
I would not be surprised if LoROM was created as an answer to the non-linearity of the HiROM mapping to the "System Area" banks -- which are the most commonly used ones. At least, this makes LoROM an ideal mapping for games that can fit within the System Area banks IMO. But I am quick to retract my opinion, because I know the memory mapping is truly best declared around a project which will call for a certain design. However, for a game that could fit in the System Area, the only attraction I can see to not using LoROM is the ability to use straight 64K data, and this requires a sacrifice of every 32K of ROM data to not reside in the System Area. So, as I said, it's really based on project, preference, coding style, etc.
And yes, I edited this post a shit-ton. I guess the memory map is a passion of mine
Espozo wrote:
In fact, do you know of a good illustration of HiROM and LoROM that's not part of the manual? Constantly looking through a PDF is kind of cumbersome.
No, there really isn't a good depiction of the layout in any format that I consider "worthy". The official developers manual, still to this day, IMO, has the best depiction possible. Re: looking through PDFs: I feel you. What's stopping you from printing out onto paper the pages that are relevant? You want pages 2-21-1 through 2-21-4. I still rely heavily on paper, and in many ways prefer it over digital.
Espozo wrote:
Yeah though, the trying to force "initializesnes" into being a routine was really odd and dumb.
The core complexity is that the routine erases WRAM ("work RAM"). I had completely forgotten that $7E0000-1FFF is the same piece of RAM that's in $000000-1FFF. When the DMA routine zeroes out $7E0000-FFFF and $7F0000-FFFF, it's zeroing out where both direct page and the stack are. Hence, when RTS happens, the return address has been lost.
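Here is a toy Python model of that failure, assuming the stack pointer starts at $1FFF as in the init code. The helper names and the return address value are made up, and the "DMA" is just a loop, so this only demonstrates the aliasing, not the real hardware sequence.

```python
def wram_index(addr):
    """Map a CPU address to an index into the 128 KB of WRAM, or None."""
    bank = (addr >> 16) & 0xFF
    offset = addr & 0xFFFF
    if bank in (0x7E, 0x7F):
        return ((bank - 0x7E) << 16) | offset
    if (bank & 0x7F) < 0x40 and offset < 0x2000:
        return offset            # low 8 KB seen from the system banks
    return None


def return_address_survives_clear():
    wram = bytearray(0x20000)
    sp = 0x1FFF                  # stack lives in bank $00
    return_addr = 0x8122         # hypothetical value pushed by jsr

    # jsr pushes the high byte, then the low byte.
    wram[wram_index(sp)] = return_addr >> 8
    wram[wram_index(sp - 1)] = return_addr & 0xFF

    # The init routine clears $7E0000-$7FFFFF.
    for addr in range(0x7E0000, 0x800000):
        wram[wram_index(addr)] = 0

    # rts pulls the two bytes back from the bank $00 stack.
    pulled = (wram[wram_index(sp)] << 8) | wram[wram_index(sp - 1)]
    return pulled == return_addr


print(wram_index(0x001FFF) == wram_index(0x7E1FFF))  # same byte of RAM
print(return_address_survives_clear())
```

Expanding the init code inline (the macro approach) sidesteps this because nothing has to be pulled back off the stack afterwards.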
So using a macro is one possibility, and I have no problem with macros, but I do have a problem with how ca65 generates code listings in relation to macros. I really don't like how the assembler prints no actual instructions for code in the macro until the macro is found to be used -- and then all you get are about 12 bytes of the assembled results. Ridiculous. (And no, printing more of the bytes is not helpful, because you really need to see the bytecode associated with each instruction/line of code -- in other words, it should be shown when displaying the macro itself, not when the macro is found to be used) IMO, all of this is because ca65/ld65 are tools from the cc65 suite, which is a C compiler. (Macros in C tend to be very short/small)
Generally speaking, it never ceases to amaze me how much time and effort I spend these days fighting with tools -- I spend more time dealing with this crap than I do actually writing code or getting shit done. It wasn't always like this, for the record -- as I put on my crotchety old man hat: things *were* better 20 years ago.
Espozo wrote:
A few last things though: What should I do to better organize the code? Group things into fewer files? Not use ".include" and instead do that ".import" and ".export" nuisance?
This is probably worthy of a 2 hour talk in itself. Honestly most of the organisation/insanity stems from what I already described in a previous post: you're organising things much like how I would in a classic assembler (which is perfectly fine!), but ca65/ld65 really isn't intended to be used like that. (The tools are also highly focused on 6502/65c02 and not so much 65816, which doesn't help either)
Those who use these for SNES programming would act as a better source of information on this subject than me -- I'm used to classic assemblers (no linker, everything is linear, and if you're lucky it might have support for SNES-isms). But I don't know who else has used these tools for SNES work.
Espozo wrote:
Also, where's the actual code that you fixed? Don't tell me I was supposed to be following along...
I have it laying around here, but I'm not uploading it because it really isn't worth uploading. Basically the entire 2+ hours was a waste, because ultimately the "crummy solution" was to segment the lower and upper 32KByte sections of bank $00 (re: CODE_LOWER and CODE_UPPER) and then stick all your code into CODE_UPPER (so that it works from bank $00). bazz hinted at this exact thing. It's something like this:
Code:
MEMORY {
ZEROPAGE: start = $000000, size = $0100; # $0000-00ff -- zero page
# $0100-01ff -- stack (on power-on/reset)
BSS: start = $000200, size = $1e00; # $0200-1fff -- RAM (also $7e0000-7e1fff)
BSS7E: start = $7e2000, size = $e000; # SNES work RAM, $7e2000-7effff
BSS7F: start = $7f0000, size = $10000; # SNES work RAM, $7f0000-7fffff
ROM0_LOWER: start = $C00000, size = $8000, fill = yes; # $C00000-C07FFF
ROM0_UPPER: start = $C08000, size = $8000, fill = yes; # $C08000-C0FFFF (maps to $008000-00FFFF)
ROM1: start = $C10000, size = $10000, fill = yes; # $C10000-C1FFFF
ROM2: start = $C20000, size = $10000, fill = yes; # $C20000-C2FFFF
ROM3: start = $C30000, size = $10000, fill = yes; # $C30000-C3FFFF
ROM4: start = $C40000, size = $10000, fill = yes; # $C40000-C4FFFF
ROM5: start = $C50000, size = $10000, fill = yes; # $C50000-C5FFFF
ROM6: start = $C60000, size = $10000, fill = yes; # $C60000-C6FFFF
ROM7: start = $C70000, size = $10000, fill = yes; # $C70000-C7FFFF
}
# Logical areas code/data can be put into.
SEGMENTS {
CODE: load = ROM0_LOWER, start = $C00000;
RODATA: load = ROM0_LOWER, align = $100;
CODE_UPPER: load = ROM0_UPPER, start = $C08000;
SNESHEADER: load = ROM0_UPPER, start = $C0FFC0;
CODE1: load = ROM1, align = $100, optional = yes;
RODATA1: load = ROM1, align = $100, optional = yes;
CODE2: load = ROM2, align = $100, optional = yes;
RODATA2: load = ROM2, align = $100, optional = yes;
CODE3: load = ROM3, align = $100, optional = yes;
RODATA3: load = ROM3, align = $100, optional = yes;
CODE4: load = ROM4, align = $100, optional = yes;
RODATA4: load = ROM4, align = $100, optional = yes;
CODE5: load = ROM5, align = $100, optional = yes;
RODATA5: load = ROM5, align = $100, optional = yes;
CODE6: load = ROM6, align = $100, optional = yes;
RODATA6: load = ROM6, align = $100, optional = yes;
CODE7: load = ROM7, align = $100, optional = yes;
RODATA7: load = ROM7, align = $100, optional = yes;
ZEROPAGE: load = ZEROPAGE, type = zp;
BSS: load = BSS, type = bss, align = $100, optional = yes;
BSS7E: load = BSS7E, type = bss, align = $100, optional = yes;
BSS7F: load = BSS7F, type = bss, align = $100, optional = yes;
}
...followed by many things modified to use .segment "CODE_UPPER" instead of just CODE.
koitsu wrote:
Espozo wrote:
In fact, do you know of a good illustration of HiROM and LoROM that's not part of the manual? Constantly looking through a PDF is kind of cumbersome.
No, there really isn't a good depiction of the layout in any format that I consider "worthy". The official developers manual, still to this day, IMO, has the best depiction possible.
What's the first thing I could do to improve my illustration?
Quote:
I do have a problem with how ca65 generates code listings in relation to macros. I really don't like how the assembler prints no actual instructions for code in the macro until the macro is found to be used -- and then all you get are about 12 bytes of the assembled results. Ridiculous. ([The listing] should be shown when displaying the macro itself, not when the macro is found to be used)
Say you want to list a macro at definition time. How would you recommend doing this if it has a lot of conditional assembly, such as .if statements? List each branch? And how would you recommend handling long .repeat statements? If you want, you can take your improvement suggestions to a new topic so that we can hammer out the details before formally filing a feature request on GitHub to improve expansion of macros in the listing.
Quote:
It wasn't always like this, for the record -- as I put on my crotchety old man hat: things *were* better 20 years ago.
And why don't we have the tools now that we had then (i.e. x816)? Proprietary software is why.
espozo wrote:
Yeah though, the trying to force "initializesnes" into being a routine was really odd and dumb.
Oh, when I said that, I was referring to the assembler wanting to do that with the colon. I don't feel like starting another argument...
koitsu wrote:
What's stopping you from printing out onto paper the pages that are relevant?
I didn't think of that...
koitsu wrote:
have it laying around here, but I'm not uploading it because it really isn't worth uploading. Basically the entire 2+ hours was a waste, because ultimately the "crummy solution" was to segment the lower and upper 32KByte sections of bank $00 (re: CODE_LOWER and CODE_UPPER) and then stick all your code into CODE_UPPER (so that it works from bank $00).
Instead, you wanted to have it in bank C0? Isn't it mirrored anyway? I'm not sure what the problem is... I just need to look at the memory map again.
It's mirrored, but telling ld65 to link the first half into $C00000-$C07FFF and the second half into $808000-$80FFFF (or equivalently $008000-$00FFFF if you don't care about fast ROM) allows instructions that access the program bank register (JML, JSL, and PHK) to work as expected for B=K operation.
And it's not doing that? So, is this an assembler problem?
tepples wrote:
Say you want to list a macro at definition time. How would you recommend doing this if it has a lot of conditional assembly, such as .if statements? List each branch? And how would you recommend handling long .repeat statements? If you want, you can take your improvement suggestions to a new topic so that we can hammer out the details before formally filing a feature request on GitHub to improve expansion of macros in the listing.
Handle it the same way you would _outside_ of a macro -- for things which are variable or which it cannot determine, ca65 appears to print "rr" for the relevant byte(s). Here's a demonstration of the problem:
Code:
000000r 2 .macro InitializeSNES
000000r 2 ; Register initialisation values, per official Nintendo documentation
000000r 2
000000r 2 sep #$20 ; A=8
000000r 2
000000r 2 lda #$80
000000r 2 sta $2100
000000r 2 stz $2101
000000r 2 stz $2102
000000r 2 stz $2103
000000r 2 stz $2104
000000r 2 stz $2105
000000r 2 stz $2106
000000r 2 stz $2107
000000r 2 stz $2108
000000r 2 stz $2109
000000r 2 stz $210a
000000r 2 stz $210b
000000r 2 stz $210c
000000r 2 stz $210d
000000r 2 stz $210d
000000r 2 stz $210e
000000r 2 stz $210e
000000r 2 stz $210f
000000r 2 stz $210f
000000r 2 stz $2110
000000r 2 stz $2110
000000r 2 stz $2111
...
{snipping for brevity -- just assume there's an .endmacro eventually}
...
000076r 2 A2 FF 1F ldx #$1fff
000079r 2 9A txs
00007Ar 2
00007Ar 2 E2 20 A9 80 InitializeSNES
00007Er 2 8D 00 21 9C
000082r 2 01 21 9C 02
0001E4r 2
...
In other words: listings generation of macros is nearly worthless. What I EXPECT to see are the assembled bytes next to each instruction/line of code in the macro. Showing 12 bytes when the macro is called = might as well not use macros (IMO). This basically defeats the point of a generated listing.
koitsu wrote:
Handle it the same way you would _outside_ of a macro -- for things which are variable or which it cannot determine, ca65 appears to print "rr" for the relevant byte(s).
This is not really possible in any reasonable way because of the possibility of conditionals within the macro (as tepples said). The problem gets even worse when you add parameters to the macro.
Showing the relevant lines of the macro at the macro invocation site is a reasonable request, though. That said, the chances of the feature appearing out of the blue are slim, unless somebody who really wants that feature also implements it. (There's a bit of a community around cc65, but not many people are actively working on new features, it's mostly bugfixes.)
Perhaps one solution is to show the listing for both the .if and .else branches, again with rr replacing things that cannot be determined at macro definition time. But perhaps that'd be too much effort just to illustrate a point to koitsu, as nontrivial macros would end up producing a listing that's almost entirely rr. Sometimes not even the length is known at macro definition time.
koitsu: So that more of us can understand what you're requesting, can you define a branching macro in x816 or another preferred assembler, instantiate it, and show the listing file it produces?
Code:
D:\Console\asm6>type test.asm
MACRO mymacro
lda #$12
ldx #$ff
sta $2001
lda $2002
ENDM
MACRO mymacro2 arg1,arg2
ldx #arg1
ldy #arg2
nop
ENDM
ORG $8000
:- lda #$00
mymacro
jmp :-
:- mymacro2 $aa,$ee
jmp :-
D:\Console\asm6>asm6 -L test.asm test.out test.lst
pass 1..
test.out written (23 bytes).
test.lst written.
D:\Console\asm6>type test.lst
MACRO mymacro
lda #$12
ldx #$ff
sta $2001
lda $2002
ENDM
MACRO mymacro2 arg1,arg2
ldx #arg1
ldy #arg2
nop
ENDM
ORG $8000
08000 A9 00 :- lda #$00
08002 mymacro
08002 A9 12 lda #$12
08004 A2 FF ldx #$ff
08006 8D 01 20 sta $2001
08009 AD 02 20 lda $2002
0800C 4C 00 80 jmp :-
0800F :- mymacro2 $aa,$ee
0800F A2 AA ldx #$aa
08011 A0 EE ldy #$ee
08013 EA nop
08014 4C 0F 80 jmp :-
Remarkably hard, isn't it? *shakes head*
That's a start. How about a macro taking arguments or including some IF statements? (Source: ASM6 1.6 Users Guide)
Just covered it (see edit). So, in other words: ca65's listing generation can suck it when it comes to macros. :P
Thanks. So far the rule I'm guessing is "Generate nothing in the hex column beside a macro definition, but generate hex beside each line of each macro instantiation as if it were written directly in the source file."
If I can see how ASM6 listings treat IF and REPT, I might have enough to bother cc65 Issues.
Listing results are exactly what I'd expect:
Code:
D:\Console\asm6>type test.asm
MYVAR = $44
MYADDR = $e001
RICE = 1
MACRO mymacro3 value,addr
lda value
sta addr
IF RICE > 0
ldx #(RICE + (2 * value))
ELSE
ldx #$16
ldy #RICE
ENDIF
ENDM
ORG $8000
clc
:- mymacro3 MYVAR,MYADDR
nop
i=0
REPT 4
i=i+1
mymacro3 MYVAR+i,MYADDR+i
lda #i
ENDR
RICE = 0
mymacro3 $ff,MYADDR
jmp :-
D:\Console\asm6>asm6 -L test.asm test.out test.lst
pass 1..
test.out written (57 bytes).
test.lst written.
D:\Console\asm6>type test.lst
MYVAR = $44
MYADDR = $e001
RICE = 1
MACRO mymacro3 value,addr
lda value
sta addr
IF RICE > 0
ldx #(RICE + (2 * value))
ELSE
ldx #$16
ldy #RICE
ENDIF
ENDM
ORG $8000
08000 18 clc
08001 :- mymacro3 MYVAR,MYADDR
08001 A5 44 lda MYVAR
08003 8D 01 E0 sta MYADDR
08006 IF RICE > 0
08006 A2 89 ldx #(RICE + (2 * MYVAR))
08008 ELSE
08008 ldx #$16
08008 ldy #RICE
08008 ENDIF
08008 EA nop
08009 i=0
08009 REPT 4
08009 i=i+1
08009 mymacro3 MYVAR+i,MYADDR+i
08009 lda #i
08009 ENDR
08009 i=i+1
08009 mymacro3 MYVAR+i,MYADDR+i
08009 A5 45 lda MYVAR+i
0800B 8D 02 E0 sta MYADDR+i
0800E IF RICE > 0
0800E A2 8A ldx #(RICE + (2 * MYVAR+i))
08010 ELSE
08010 ldx #$16
08010 ldy #RICE
08010 ENDIF
08010 A9 01 lda #i
08012 i=i+1
08012 mymacro3 MYVAR+i,MYADDR+i
08012 A5 46 lda MYVAR+i
08014 8D 03 E0 sta MYADDR+i
08017 IF RICE > 0
08017 A2 8B ldx #(RICE + (2 * MYVAR+i))
08019 ELSE
08019 ldx #$16
08019 ldy #RICE
08019 ENDIF
08019 A9 02 lda #i
0801B i=i+1
0801B mymacro3 MYVAR+i,MYADDR+i
0801B A5 47 lda MYVAR+i
0801D 8D 04 E0 sta MYADDR+i
08020 IF RICE > 0
08020 A2 8C ldx #(RICE + (2 * MYVAR+i))
08022 ELSE
08022 ldx #$16
08022 ldy #RICE
08022 ENDIF
08022 A9 03 lda #i
08024 i=i+1
08024 mymacro3 MYVAR+i,MYADDR+i
08024 A5 48 lda MYVAR+i
08026 8D 05 E0 sta MYADDR+i
08029 IF RICE > 0
08029 A2 8D ldx #(RICE + (2 * MYVAR+i))
0802B ELSE
0802B ldx #$16
0802B ldy #RICE
0802B ENDIF
0802B A9 04 lda #i
0802D
0802D RICE = 0
0802D mymacro3 $ff,MYADDR
0802D A5 FF lda $ff
0802F 8D 01 E0 sta MYADDR
08032 IF RICE > 0
08032 ldx #(RICE + (2 * $ff))
08032 ELSE
08032 A2 16 ldx #$16
08034 A0 00 ldy #RICE
08036 ENDIF
08036
08036 4C 01 80 jmp :-
This is something that bothered me a bit when I first started using ca65, since I was used to asm6. Writing the instructions as they're generated by the macro as if they were explicitly written in the source feels like the logical thing to do IMO, and not hard at all. Modifying an existing assembler to do it might not be so trivial, though.
I've considered actually going through the code in asm6 in attempt to extend it to support 65816. Sounds easy, but likely isn't, especially given syntactical pains and 24-bit addressing.
koitsu wrote:
Remarkably hard, isn't it? *shakes head*
Nobody said that this kind of thing is hard, in fact I said the exact opposite and that it's a reasonable thing to ask for. You, however, were asking for the listing to be generated when the macro is defined, which ASM6 doesn't do either.
I'm sure ca65 already has all the necessary groundwork to implement a feature like that (the debug file it generates receives all of the line information from the compilation, including nested macros), but obviously nobody has wanted it badly enough, yet.
thefox wrote:
Nobody said that this kind of thing is hard, in fact I said the exact opposite and that it's a reasonable thing to ask for. You, however, were asking for the listing to be generated when the macro is defined, which ASM6 doesn't do either.
Fair enough. I should have been more clear: the bytecode has to be shown at "definition time" (with the caveats being macro arguments or certain contents can't be calculated at this phase, hence putting placeholder letters, ex. "rr" and the like), or at "use time" (no caveats I can think of), alongside/next to their relevant instructions (this part is important). Spending an entire page discussing this is exactly what I was trying to avoid in the first place, but I should've known by now it's par for the course around here... heh :P
The way asm6 does it is how I've seen it done in pretty much all assemblers (I believe that includes WLA DX, which has its own set of problems/complexities). asm6 also only expands macros and REPT in listings with the -L option; non-expanded listings are accomplished via -l.
There's irony in all of this -- one of the reasons Espozo moved from WLA DX to ca65 was because listing generation in WLA DX is an even bigger clusterfuck (don't get me started on the interspersed output problem), alongside some actual bugs psychopathicteen found. Yet here we are, discussing ca65's listings generation being somewhat sub-par. The fact that "nobody has wanted this" is, needless to say, quite shocking (to me anyway).
Showing the bytecode at definition time is a weird request IMO, because a lot of things are unknown unless the macro is actually being used. Parameters can be used to make decisions, create loops, and so on, so it's not just a matter of "marking the gaps".
Also, nothing is written to the ROM at definition time, so I don't see any reason to generate any bytecode then.
koitsu wrote:
listing generation in WLA DX is an even bigger clusterfuck (don't get me started on the interspersed output problem), alongside some actual bugs psychopathicteen found.
Do you guys actually document the bugs you find in WLA-DX? Ideally, you would report a bug or issue you find to the WLA-DX Github repository, or more directly the "issues" tab.
What (if any) bugs have been discussed on (S)NESDev but not actually filed to the official repo for fixing? If you are aware of any such bugs, please help document them properly, through an issue on GitHub. At the very least, discuss the issue there. If you actually have time, please continue to read this post.
For many months, the WLA-DX repo has been shipping with a bug_exhibition directory. This directory has a minimal "build system" for each of the consoles that WLA-DX supports. You can use this as a starting point for demonstrating a bug.
I can't fully express how much it would mean to me right now if you show your support, because I still believe in this assembler. A lot of these bugs sound like things that can be fixed, and are probably things I don't want to run into myself. And that doesn't mean I want to hear how big and bad some bug is -- just file the bug, and that's it. As long as it's documented for the people who actually care about it, I'm already happy.
EDIT: This means that IF YOU CAN, please don't start a sub-discussion in this thread about specific WLA-DX bugs. Go direct to the github and file your bugs there! Thank you!!!!!
Honestly, if we're having this much of a problem with macros not being assembled in the listing, referring to my code, I'd suggest just dumping the whole damn thing into header.
I'm still not entirely sure what the problem is with my code... I forgot again, the rom C0 to FF, but where is it also mirrored to? I remember you said you had to put everything in CODE_UPPER, and it appears that that starts at $C08000, while regular CODE starts at $C00000. What's that about? I swear I've heard tepples explain something similar about LoROM.
koitsu wrote:
Spending an entire page discussing this is exactly what I was trying to avoid in the first place
If it would help, PM me a good split point.
Quote:
The way asm6 does it is how I've seen it done in pretty much all assemblers (I believe that includes WLA DX, which has its own set of problems/complexities). asm6 also only expands macros and REPT in listings with the -L option; non-expanded listings are accomplished via -l.
Thanks. Now I think I have enough info to go on for a feature request on ca65's issues page on GitHub if someone else doesn't beat me to it.
Espozo wrote:
I forgot again, the rom C0 to FF, but where is it also mirrored to?
Banks $C0 through $FD are mirrored into $40 through $7D. The second half of each bank $C0 through $FF is also mirrored into $00-$3F and $80-$BF. Specifically in the case of CODE_UPPER, $C08000 and $808000 are mirrors.
I can think of one problem with generating listings for macros at their definition, rather than at expansion.
Using ca65 as an example:
Code:
SOME_CONSTANT = 16
.macro Foo arg
.if arg
rep #$20
.a16
.else
sep #$20
.a8
.endif
lda #SOME_CONSTANT
.endmacro
How should the listing for the lda #SOME_CONSTANT line be generated? As a9 10 00, or a9 10? Whichever you pick, it'll be wrong for one of the macro cases.
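To make the two candidate encodings concrete, here is a rough Python sketch. This is not ca65 code; `encode_lda_immediate` is a made-up helper for illustration. It only models the one fact the example depends on: `lda #imm` is opcode $A9 followed by a one-byte operand when the accumulator is 8-bit and a two-byte little-endian operand when it is 16-bit.

```python
def encode_lda_immediate(value: int, a16: bool) -> bytes:
    """Encode `lda #value` for the 65816 (hypothetical helper).

    a16=True  models the state after `rep #$20` / `.a16` (16-bit accumulator).
    a16=False models the state after `sep #$20` / `.a8`  (8-bit accumulator).
    """
    opcode = 0xA9  # LDA immediate
    if a16:
        # 16-bit immediate operand, stored low byte first
        return bytes([opcode, value & 0xFF, (value >> 8) & 0xFF])
    # 8-bit immediate operand
    return bytes([opcode, value & 0xFF])


SOME_CONSTANT = 16

# Same source line, two different byte sequences depending on the macro argument
print(encode_lda_immediate(SOME_CONSTANT, a16=True).hex(" "))   # a9 10 00
print(encode_lda_immediate(SOME_CONSTANT, a16=False).hex(" "))  # a9 10
```

Since the width is only decided by `arg` at expansion time, a listing printed at definition time has no single correct byte sequence to show for that line.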
bazz wrote:
Do you guys actually document the bugs you find in WLA-DX? Ideally, you would report a bug or issue you find to the WLA-DX Github repository, or more directly the "issues" tab.
Please do. Saying WLA DX has this problem after that problem and then ignoring that it has not been abandoned isn't really helping anyone.
...Just don't bother with the .b, .w and .l stuff. WLA is like xkas and bass, its design just can't accommodate figuring out which opcodes to use for you. I think I've put that problem on my personal list of issues with the assembler that should just be considered a weakness.
The two "moderators" of the repository aren't usually around, though. Maybe I should ask to have more privileges? Unlike them, I don't have much else to do, so.
tepples wrote:
Banks $C0 through $FD are mirrored into $40 through $7D.
So $FD-$FF is not mirrored?
tepples wrote:
The second half of each bank $C0 through $FF is also mirrored into $00-$3F and $80-$BF.
You mean like in this diagram, we take the bottom two blocks on the far left, put them in the upper half (4 rectangles lined up on the top row) of $00 through $3F, then take the upper two blocks on the far left, and put them into the upper half (4 rectangles lined up on the top row) of $80 through $BF? What's even in the blank space in the bottom 4 rectangles of $00 through $3F and $80 through $BF? What's even with all this mirroring anyway? I'd rather have more rom space than the seemingly pointless mirror of $40 through $7D.
When I looked at the M92's memory space, I don't remember it being nearly as screwy as this...
Espozo wrote:
tepples wrote:
Banks $C0 through $FD are mirrored into $40 through $7D.
So $FD-$FF is not mirrored?
$FD is mirrored into $7D, but $7E and $7F are special banks because WRAM overrides anything else that might be decoded there.
Espozo wrote:
What's even with all this mirroring anyway?
Mirroring was cheaper than full decoding, but both were cheaper than filling the entire memory space with unique ROM.
Espozo wrote:
I'd rather have more rom space than the seemingly pointless mirror of $40 through $7D.
Then make an ExHiROM (mode $25) and get 63 Mbit to play with. The first 4 MB go to banks $C0-$FF; the rest of the ROM goes to $40-$7D. It just would have been really expensive back in the day; only Tales of Phantasia and a few games using S-DD1 and SPC7110 ever exceeded 32 Mbit. And even then, ROM was so expensive that a decompression ASIC was still cheaper than a bigger ROM.
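For anyone wondering where the 63 Mbit figure comes from, the arithmetic can be checked in a few lines of Python. This only counts the whole 64 KB banks named above ($C0-$FF and $40-$7D); the variable names are just for illustration.

```python
BANK_SIZE = 64 * 1024  # bytes in one full 64 KB bank

banks_c0_ff = 0xFF - 0xC0 + 1  # 64 banks, i.e. the first 4 MB
banks_40_7d = 0x7D - 0x40 + 1  # 62 banks ($7E-$7F are taken by WRAM)

total_bytes = (banks_c0_ff + banks_40_7d) * BANK_SIZE
total_mbit = total_bytes * 8 / (1024 * 1024)

print(banks_c0_ff, banks_40_7d, total_mbit)  # 64 62 63.0
```

The two banks lost to WRAM at $7E-$7F are why the total is 63 Mbit rather than a round 64.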
tepples wrote:
Mirroring was cheaper than full decoding
Wait, what even is decoding?
tepples wrote:
Then make an ExHiROM and get 63 Mbit to play with.
Oh, yeah, I forgot about that...
tepples wrote:
ROM was so expensive that a decompression ASIC was still cheaper than a bigger ROM.
Yeah, that's pretty sad...
Probably not the case anymore.
Espozo wrote:
tepples wrote:
Mirroring was cheaper than full decoding
Wait, what even is decoding?
Logic circuitry that decides whether or not to respond to an address.
Every bit you need to "decode" takes one more layer of logic.
For example, listening for any address that matches the bit pattern 1XXX1 only requires 2 bits to decode, but listening for just 10001 specifically takes 5. The 'X' represents "don't care", also known as "mirroring".
Mirroring on NESdev Wiki: the basic concepts apply
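The 1XXX1 versus 10001 example can be played with in Python. This is only a software model of the idea, not a description of any real chip; `make_decoder` is a made-up helper that turns a pattern string into a "do I respond to this address?" test and counts how many bits it has to look at.

```python
def make_decoder(pattern: str):
    """Build a chip-select test from a pattern such as '1XXX1'.

    '0' and '1' are bits that must match; 'X' is a don't-care bit.
    Returns (responds, decoded_bits).
    """
    # mask has a 1 wherever the decoder actually looks at the address bit
    mask = int("".join("0" if c == "X" else "1" for c in pattern), 2)
    # want holds the values those inspected bits must have
    want = int("".join("1" if c == "1" else "0" for c in pattern), 2)

    def responds(address: int) -> bool:
        return (address & mask) == want

    return responds, bin(mask).count("1")


partial, partial_bits = make_decoder("1XXX1")
full, full_bits = make_decoder("10001")

# Try every 5-bit address against both decoders
mirrors = [a for a in range(32) if partial(a)]
exact = [a for a in range(32) if full(a)]

print(partial_bits, [format(a, "05b") for a in mirrors])
print(full_bits, [format(a, "05b") for a in exact])
```

The partial decoder checks 2 bits and answers at 8 different addresses (the mirrors); the full decoder checks all 5 bits and answers at exactly one.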
Actually, I've been doing nothing for whatever reason, and just knowing a little more about the SNES memory map, I want to ask something about this:
Quote:
I stuck almost everything under the CODE_UPPER segment ($C08000-FFFF, a.k.a. $008000-FFFF). The code broke elsewhere, but unrelated to my changes.
I probably already asked this (sorry!), but what's so significant about "8000" vs. "0000"?
so it will mirror into "System Area" banks.
$c08000-$c0ffff gets mirrored to $008000-$00ffff. $c00000-$c07fff doesn't get mirrored to bank $00 at all, because $000000-$007fff is taken by RAM/PPU registers/etc., just like in LoROM.
Because bank $00 is still laid out basically the same for LoROM and HiROM ($0000-$7fff is system area, $8000-$ffff is ROM), putting all your code originally written for LoROM in CODE_UPPER is a somewhat hackish way of making it work without modification for HiROM.
I see... Well koitsu, can you upload the code anyway? I imagine I could handle getting it at least past clearing ram. I can delete all that other crap then. Because I lose it every other day, where's the PDF for SNES hardware info that you use? I realize I kind of need to know LoROM in order to know what to try and fix.
These charts are a bit messy because I hand-drew them, but maybe they'll help clear up the memory map for you? Hopefully they don't make things worse, anyway.
The banks in $80-$ff are identical mirrors of $00-$7f, besides the different ROM access speed, and $7e-$7f being devoted to WRAM.
Notice that the only difference between LoROM and HiROM is how the ROM is arranged. Everything else remains the same.
"ROM 0" in this chart is equivalent to your code's CODE segment, whereas "ROM 1" is equivalent to your CODE_UPPER segment. Notice how for HiROM, only "ROM 1" is in bank $00. Your original reset code was in "ROM 0" ($400000-$407fff, or $c00000-$c07fff), but the SNES was starting at that location in bank $00 ($000000-$007fff), where WRAM, etc. are.
koitsu's fix was to move all your code from "ROM 0" to "ROM 1", because then it'd be mapped to the same location it was for LoROM, and behave the same.
Edit: I guess it's also worth mentioning that in your code, because you're labeling them as 64 KB sections, ROM0_LOWER = "ROM 0", ROM0_UPPER = "ROM 1", ROM1 = "ROM 2-3", ROM2 = "ROM 4-5", etc. It does make things a bit confusing; maybe I should've used those labels in my chart...
Nicole wrote:
Hopefully they don't make things worse, anyway.
I think they did.
I'll just have to look for the chart that koitsu was using.
Yeah, I think I get the whole "CODE_UPPER" thing now. It's because originally, CODE didn't start at $000000 where all the hardware registers are, but instead $008000, but now it starts at $000000. So "CODE_UPPER" in HiROM is the same thing as "CODE" in LoROM. I know it's an easy fix, but when you're using the SNES hardware registers, isn't it about the only thing you can do? Or wait, is this where the B or K (I really can't remember which one handles data) register comes in?
If you have the data bank register (B) pointed at a system area bank ($00-$3F or $80-$BF), you can read and write these with 16-bit addressing modes:
- $0000-$1FFF: Mirror of RAM at $7E0000-$7E1FFF
- $2100-$21FF: The B bus, including PPU, APU, and WRAM streaming
- $4016-$4017: Controller port PIO
- $4200-$421F: Memory controller
- $4300-$437A: DMA channels
- $8000-$FFFF: The second half of a ROM bank
The difference between LoROM (mode $20) and HiROM (mode $21) is that all banks in LoROM are second halves. This is why CODE_UPPER, the second half of the first bank in this HiROM linker script, behaves the same way as CODE in a LoROM.
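The list above can be turned into a small lookup if that helps it stick. This is only a sketch in Python: the ranges are copied straight from the list, `describe` and `is_system_bank` are made-up names, and anything not on the list is reported as unlisted rather than guessed at.

```python
# (first address, last address, description), copied from the list above
SYSTEM_AREA_REGIONS = [
    (0x0000, 0x1FFF, "Mirror of RAM at $7E0000-$7E1FFF"),
    (0x2100, 0x21FF, "B bus (PPU, APU, WRAM streaming)"),
    (0x4016, 0x4017, "Controller port PIO"),
    (0x4200, 0x421F, "Memory controller"),
    (0x4300, 0x437A, "DMA channels"),
    (0x8000, 0xFFFF, "Second half of a ROM bank"),
]


def is_system_bank(bank: int) -> bool:
    """True for the system area banks $00-$3F and $80-$BF."""
    return 0x00 <= bank <= 0x3F or 0x80 <= bank <= 0xBF


def describe(bank: int, addr: int):
    """What a 16-bit access at `addr` reaches when B (data bank) is `bank`."""
    if not is_system_bank(bank):
        return None  # not a system area bank; this sketch does not cover it
    for first, last, name in SYSTEM_AREA_REGIONS:
        if first <= addr <= last:
            return name
    return "not in the list above"


print(describe(0x00, 0x2100))  # B bus, via bank $00
print(describe(0x80, 0x2100))  # same B bus, via bank $80
print(describe(0x00, 0x8000))  # second half of a ROM bank
print(describe(0x7E, 0x2100))  # None: $7E is WRAM, not a system area bank
```

The first two calls give the same answer, which is the practical point: the low half looks the same from every system area bank.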
Espozo wrote:
B or K (I really can't remember which one handles data)
@koitsu, here's to all your honorable efforts.
Think you could be a better teacher?
The names are dumb anyway. What the heck does "K" stand for? They could have just called them "PB" for program bank and "DB" for data bank, but as said, they wanted every opcode to be 3 letters no matter what, for whatever reason.
tepples wrote:
If you have the data bank register (B) pointed at a system area bank ($00-$3F or $80-$BF), you can read and write these with 16-bit addressing modes:
- $0000-$1FFF: Mirror of RAM at $7E0000-$7E1FFF
- $2100-$21FF: The B bus, including PPU, APU, and WRAM streaming
- $4016-$4017: Controller port PIO
- $4200-$421F: Memory controller
- $4300-$437A: DMA channels
- $8000-$FFFF: The second half of a ROM bank
Yeah, so "B" (Which I now know is for data) would be $00, while "K" could be whatever else.
Maybe I need to watch the video again (although I hope not), because what's the great difficulty in doing this? Like you said, koitsu, it seems really important to know where you're putting your code and data in your game, which is why I understand you said you liked linear assemblers.
Espozo wrote:
Think you could be a better teacher?
Nah, you really wouldn't want to know what I'm thinking about you (as a learner) at this point.
Although what I can tell you is that ...
Espozo wrote:
Maybe I need to watch the video again
... I don't think that'll help. However, feel free to prove me otherwise. Or better yet -- don't.
Ramsis wrote:
However, feel free to prove me otherwise. Or better yet -- don't.
I'm not even sure I get the joke...
Espozo wrote:
Ramsis wrote:
However, feel free to prove me otherwise. Or better yet -- don't.
I'm not even sure I get the joke...
Let's say Program Bank (K) is $40, Data Bank (B) is $18, and Direct Page (D) is $0300.
Then:
Code:
; Short jumps use program bank (K)
jmp $2000 ; $402000
; Long jumps are absolute
jml $672000 ; $672000
; Direct page (D) is always in bank $00
lda $50 ; $000350
sta $80 ; $000380
and $70 ; $000370
; 16-bit data addresses use data bank (B)
lda $1234 ; $181234
sta $5678 ; $185678
and $9012 ; $189012
; 24-bit data addresses are absolute
lda $101010 ; $101010
sta $7e4444 ; $7e4444
and $808080 ; $808080
The key point here is that the code you're executing does not have to be in the same bank as your data.
If you set PB (K) to $40, and DB (B) to $00, then you could read and write to PPU registers with sta $2120, etc. as you pleased, because those instructions use DB (B), even though you're executing code out of bank $40.
If you wanted to get even trickier, you could set direct page (D) to $2100, and then sta $20 would write to $002120.
So, you've got lots of flexibility in how you want to arrange things.
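The same rules, written out as a little Python model so the numbers can be checked. This is a simplified sketch covering only the plain, non-indexed addressing modes used in the example above; the function names are made up for illustration.

```python
def jump_target(addr16: int, k: int) -> int:
    """`jmp $xxxx`: a 16-bit jump stays in the program bank (K)."""
    return (k << 16) | addr16


def long_target(addr24: int) -> int:
    """`jml $xxxxxx` or a 24-bit data address: used exactly as written."""
    return addr24


def direct_page_address(offset8: int, d: int) -> int:
    """`lda $xx`: direct page (D) plus offset, always in bank $00."""
    return (d + offset8) & 0xFFFF


def data_address(addr16: int, b: int) -> int:
    """`lda $xxxx`: a 16-bit data address uses the data bank (B)."""
    return (b << 16) | addr16


K, B, D = 0x40, 0x18, 0x0300  # the register values from the example

print(f"jmp $2000   -> ${jump_target(0x2000, K):06x}")          # $402000
print(f"jml $672000 -> ${long_target(0x672000):06x}")           # $672000
print(f"lda $50     -> ${direct_page_address(0x50, D):06x}")    # $000350
print(f"lda $1234   -> ${data_address(0x1234, B):06x}")         # $181234

# K=$40, B=$00: code runs from bank $40 but sta $2120 still hits bank $00
print(f"sta $2120   -> ${data_address(0x2120, 0x00):06x}")      # $002120
# D=$2100: sta $20 lands on the same address through direct page
print(f"sta $20     -> ${direct_page_address(0x20, 0x2100):06x}")  # $002120
```

Notice that K never appears in any of the data address calculations, which is the whole point: where the code runs from and where the data lives are chosen independently.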
Nicole wrote:
If you set PB (K) to $40, and DB (B) to $00, then you could read and write to PPU registers with sta $2120, etc. as you pleased, because those instructions use DB (B).
OP has been told this exact same thing ~100 times before, but he's proven like ~1000 times that it's simply beyond him. Kudos to you (Nicole) though for still trying to get OP to learn.
What's with all the passive aggressiveness? I tried to play along, but you won't take a break.
Anyway, it's not a hard concept to understand if it's said properly. K is the bank the code is being read out of, and B is the bank of the data you're loading. (And of course, this is more than "lda", like if we're talking adc $XXXX. The number XXXX is part of the code, but the number at XXXX is the data.)
If you were trying to load $002100 (screen display register) from $E08000 or something and you didn't want to use absolute addressing, then K would be $E0 (or $A0) and B would be $00 (none of the hardware registers are mirrored, right?)
I actually found a way to remember B vs. K: "K" is "Kode".
$000000-$007fff is mirrored across every bank in $00-$3f and $80-$bf.
It doesn't even matter if you access them in the slow banks or the fast banks, because they always have the same access speed. (MEMSEL only affects ROM access speed.)
Espozo, I don't know much about SNES development, but from what I've been following of your thread(s), it does seem like you're not interested in studying how the SNES works so you can do things properly, you're instead randomly trying out things until you magically achieve the desired result. And it seems that once you get there, you don't even care why.
And it's also true that people have tried to explain the same things over and over, and there's a noticeable lack of interest on your part to make good use of that information.
Maybe you're too eager to get to the interesting part, but where has that gotten you so far? You seem to be stuck on things that prevent your program from even booting properly, am I correct? Don't you think maybe it's time to take a step back and grow a better understanding of the platform you're working on so you can make a program with a good foundation on top of which you can code a proper game?
tokumaru wrote:
it does seem like you're not interested in studying how the SNES works so you can do things properly, you're instead randomly trying out things until you magically achieve the desired result.
I wasn't aware there was that much to it, until the more I found out. No one ever went to me and said "here, read this." In fact, I remember when I first tried the Nerdy Nights tutorial for the NES (I looked at the NES first) I thought I remember they just kind of threw you right into the middle of the action without really explaining how everything worked.
tokumaru wrote:
it seems that once you get there, you don't even care why.
tokumaru wrote:
there's a noticeable lack of interest on your part to make good use of that information.
I don't know what you're trying to say.
tokumaru wrote:
Maybe you're too eager to get to the interesting part, but where has that gotten you so far?
That makes more sense.
tokumaru wrote:
You seem to be stuck on things that prevent your program from even booting properly, am I correct?
Well, it's more like it did boot properly, and then I decided I'd do something else, not knowing the consequences of my actions, and then I screwed it up. The problem was me not knowing fundamental things like K and B, because I never learned about them.
I was new to programming before coming here, which is why I don't know basic stuff like that, but instead more SNES specific stuff (the "exciting" things) like how the video hardware works and how to interact with it. I feel like there should be some sort of guide for people like myself who had no programming experience prior to coming here to work on the SNES.
And no, the book does not count. That'd be a huge turn-off. I feel it's more of a reference than a guide. Maybe you could have a list that says what to learn about in there or something and what order to go. It could be like a school, except instead of a teacher, it's a guide, and it makes you refer to the SNES development manual at specific points. I don't know, I just never understand how anyone else got started here, but save nicklausw, I'm probably the youngest user.
In my experience the SNES memory map is the hardest thing to understand about the system. Once you get it, everything else should be relatively easy. (Then again, I had about a decade of C++ and two of Matlab before I started this, so programming wasn't exactly new to me...)
This may not be helpful to you, but it makes sense to me:
I think the key thing to get is the arrangement of ROM areas* and system areas**. Basically, the first and third quarters of the memory map (banks $00-$3F and $80-$BF) are split between system areas in the bottom half and ROM areas in the top half. The system areas are identical in every bank; the ROM areas are not.
The second and fourth quarters of the map ($40-$7F and $C0-$FF) are all ROM, with the exception of $7E and $7F, which are WRAM. The bottom 8 KB of $7E ($0000-$1FFF) is mirrored in all of the system areas, so if your data bank register is anywhere in the first or third quarters of the memory map, you can access that RAM.
LoROM uses 32 KB banks so as to fit in the ROM areas in the first and third quarters (ie: the ROM banks show up in the upper halves of the corresponding SNES banks). HiROM uses 64 KB banks, meaning if a bank is accessed in the first or third quarter of the map, the bottom half is overridden by the system area; you need to access in the second or fourth quarter to be able to see all of the data. (And as you've found, the same is true of the program counter; you can't run code out of ROM below $8000 in the first or third quarter of the map.)
For small programs, it is exceedingly likely that ROM will be mirrored between $00+, $40+, $80+, and $C0+. This means that all four of those locations are identical from $8000-$FFFF in each bank. With LoROM, that corresponds to $0000-$7FFF in your ROM image, which is the whole bank (32 KB). With HiROM, it's $8000-$FFFF, meaning the bottom half of a bank is missing from $00+ and $80+, but it's there at $40+ and $C0+.
*"ROM areas" can also include stuff like SRAM, and can even be open bus. It depends on the cartridge.
**By "system area", I don't mean the cartridge isn't accessed. It certainly can be, and most (all?) special chips use reserved areas in that range. But beginners will generally use it for PPU access, DMA, IRQs, shadow RAM, multiply/divide... internal system stuff.
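Here is the description above as a rough Python model, in case seeing it as arithmetic helps. It is a sketch under stated assumptions: the ROM is a power-of-two size and simply repeats (mirrors) to fill the map, only $8000-$FFFF is treated as ROM for LoROM (what shows up below $8000 in $40-$7D depends on the board and is not modelled), and SRAM and special chips are ignored. The function names are made up.

```python
from typing import Optional

WRAM_BANKS = (0x7E, 0x7F)


def in_system_quarter(bank: int) -> bool:
    """True for banks $00-$3F and $80-$BF (first and third quarters)."""
    return (bank & 0x40) == 0


def lorom_offset(bank: int, addr: int, rom_size: int) -> Optional[int]:
    """ROM image offset seen at bank:addr on a LoROM cart, or None."""
    if bank in WRAM_BANKS or addr < 0x8000:
        return None  # WRAM, system area, or board-dependent space
    # Each SNES bank shows one 32 KB ROM bank in its upper half
    return (((bank & 0x7F) << 15) | (addr & 0x7FFF)) % rom_size


def hirom_offset(bank: int, addr: int, rom_size: int) -> Optional[int]:
    """ROM image offset seen at bank:addr on a HiROM cart, or None."""
    if bank in WRAM_BANKS:
        return None
    if in_system_quarter(bank) and addr < 0x8000:
        return None  # bottom half is overridden by the system area
    # Each SNES bank corresponds to one 64 KB ROM bank
    return (((bank & 0x3F) << 16) | addr) % rom_size


for bank in (0x00, 0x40, 0x80, 0xC0):
    lo = lorom_offset(bank, 0x8000, 32 * 1024)   # one-bank LoROM
    hi = hirom_offset(bank, 0x8000, 64 * 1024)   # one-bank HiROM
    low_half = hirom_offset(bank, 0x0000, 64 * 1024)
    print(f"${bank:02X}: LoROM $8000 -> {lo}, HiROM $8000 -> {hi}, HiROM $0000 -> {low_half}")
```

With a one-bank ROM, all four banks agree at $8000 (image offset $0000 for LoROM, $8000 for HiROM), while the HiROM bottom half at image offset $0000 is only reachable from $40 and $C0, which matches the paragraph above.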
Espozo wrote:
What's with all the passive aggressiveness?
It's not "passive aggressiveness", at least not from my part. Rather, it's active annoyanceness. (Yes, I'm perfectly aware that word doesn't even exist.)
Espozo wrote:
I tried to play along, but you won't take a break.
*Yawn*
You "playing along" is just so entertaining ...
Espozo wrote:
I feel like there should be some sort of guide for people like myself who had no programming experience prior to coming here to work on the SNES.
So some sort of "Easy 65816" counterpart to the Easy 6502 tutorial?
Quote:
I just never understand how anyone else got started here, but save nicklausw, I'm probably the youngest user.
I was writing 6502 assembly on Apple II at age 14 if it counts.
Okay, this doesn't have much to do with the SNES, but on the subject of starting people off...
Maxim's tutorials don't assume much experience from what I remember.
http://www.smspower.org/maxim/HowToProgram/Lesson1
I'm not saying we should start people off in the SNES scene with a tutorial for a Sega console with a Z80 processor, but the tutorial really starts from the basics so I think it'd be a nice basis for a set of tutorials designed for the SNES. You could even learn some from it as is, who knows.
Here's a simplified, high-level look at the SNES's memory map.
Diagram
The top-left diagram shows the memory map as the SNES sees it, basically. The white areas are passed to the cartridge, and the SNES doesn't care about how exactly the cartridge maps ROM to that space.
The second row shows how the mappers for LoROM and HiROM map given addresses to ROM, and the last row shows the complete memory map.
There are ways this can become more complicated (which 93143 delved into), but it's probably best not to get into that right now.
Ramsis wrote:
You "playing along" is just so entertaining ...
Until this happens:
viewtopic.php?f=5&t=14081
I've found it's best to assume everyone on the internet is a complete stranger.
tepples wrote:
So some sort of "Easy 65816" counterpart to the Easy 6502 tutorial?
Possibly even "easier". I don't like how they go to "our first program" before even learning everything. It first gives you a false sense of confidence learning how to do something as easy as that, and then you find out you don't know squat later and it ends up hurting you more than anything. Maybe this is too basic to some people (he talks about learning hexadecimal at one point though) but I feel like it should start off as simple as learning what bits are. It seems right off the bat, he assumes you know how loading and storing values is really the basis of a computer program, which when I started, I had no clue how it worked. Maybe it should be like "if you understand what this means, skip this lesson." I don't know. I'm not a good teacher though, so I don't have a lot of room to talk.
tepples wrote:
I was writing 6502 assembly on Apple II at age 14 if it counts.
Wow... I just started looking over stuff when I was about that age. (I think that's when I first looked at "Nerdy Nights" for the NES.)
93143 wrote:
In my experience the SNES memory map is the hardest thing to understand about the system. Once you get it, everything else should be relatively easy.
And what makes no sense is that it simply doesn't appear to be taught. If I don't know what hexadecimal is, I sure as hell don't understand the concept of a memory map. I always assumed everything on the SNES was kind of somehow able to reach everything, as if something like "FrameCounter" would be the same in every bank, because I sort of thought the SNES was running in a type of absolute addressing. The only difference in what I thought of "regular" addressing (I'm not sure what's the name for regular 16bit addressing, if there even is one) and absolute is that absolute addressing wasn't affected by direct page.
Basically, now knowing the SNES memory map (or even memory mapping in general) makes me realize how big of an idiot I was. That's the one thing I like about the prospect of machine code: If you don't know something, you know you don't know it, unlike me in assembly where I've been blissfully unaware of even basic concepts.
93143 wrote:
I think the key thing to get is the arrangement of ROM areas* and system areas**. Basically, the first and third quarters of the memory map (banks $00-$3F and $80-$BF) are split between system areas in the bottom half and ROM areas in the top half. The system areas are identical in every bank; the ROM areas are not.
Wait, so "sta $2100" is the same in banks $00-$3F as it is in banks $80-$BF? One thing I've noticed though is it looks like in the memory map I found, there's extra random junk near $8000 in banks $00-$0F, $30-$3F, and $80-$8F.
What exactly is open bus? Is it just trying to access ROM where there isn't any? I always wondered though, what happens if you try and access a hardware register that doesn't even exist, like $3000, for example. Or wait, is this space reserved for special chips, which in that case, it behaves just like if you're trying to load from ROM that doesn't exist?
"Open bus" means a place that isn't mapped to anything. AFAIK, what happens when you read from open bus is that you end up with whatever the last written/read byte anywhere was, even if it was a completely different location.
So, if you read $77 from ROM, then read from open bus, you'd just read $77 again. If you wrote $88 to RAM, then read from open bus, you'd read $88. If you wrote $99 to open bus, then read from open bus, I believe you'd read $99.
Nicole wrote:
So, if you read $77 from ROM, then read from open bus, you'd just read $77 again. If you wrote $88 to RAM, then read from open bus, you'd read $88. If you wrote $99 to open bus, then read from open bus, I believe you'd read $99.
Actually, when code and data share the same bus, what you get when you read from open bus is the last byte of the instruction, because that was the last thing on the bus, after the instruction was completely fetched.
Ah, that makes sense. In any case, you don't really need to worry about what exactly you get from reading open bus; just think of it as garbage data.
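For anyone who wants to see the two descriptions side by side, here is a toy Python model of a bus that simply remembers the last byte driven onto it. `ToyBus` and `lda_absolute` are hypothetical names, the addresses are arbitrary, and real hardware is messier than this; treat it as a sketch of the idea, not of any actual console.

```python
class ToyBus:
    """Toy data bus that remembers the last byte driven onto it."""

    def __init__(self, memory):
        self.memory = dict(memory)   # address -> byte, mapped locations only
        self.last = 0x00             # last value seen on the bus

    def read(self, addr):
        if addr in self.memory:      # mapped: the device drives the bus
            self.last = self.memory[addr]
        return self.last             # unmapped: the old value lingers

    def write(self, addr, value):
        self.last = value & 0xFF     # the CPU drives the bus on every write
        if addr in self.memory:
            self.memory[addr] = self.last


def lda_absolute(bus, pc):
    """Fetch and execute a 3-byte `lda $xxxx` located at pc."""
    bus.read(pc)                     # opcode fetch
    lo = bus.read(pc + 1)            # operand low byte
    hi = bus.read(pc + 2)            # operand high byte, last byte fetched
    return bus.read((hi << 8) | lo)  # the actual data read


# $AD $00 $30 encodes `lda $3000`; nothing is mapped at $3000 here.
bus = ToyBus({0x8000: 0xAD, 0x8001: 0x00, 0x8002: 0x30, 0x9000: 0x77})

bus.read(0x9000)                     # read $77 from a mapped location
naive = bus.read(0x3000)             # bare bus access: still sees $77
via_cpu = lda_absolute(bus, 0x8000)  # through the CPU: sees the operand byte
print(hex(naive), hex(via_cpu))
```

The second result is the point of the correction: by the time the CPU performs the data read, the instruction's own bytes have already passed over the bus.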
You generally shouldn't rely on open bus, but some games on the NES do. When you read from hardware registers that don't return information on all 8 bits, the unused bits return open bus. A couple of NES games are known to exploit this when reading the controllers, making comparisons that assume specific values in the unimplemented bits. This ended up breaking these games on the PowerPak, which for some electric reason I'm not capable of explaining, changes the open bus behavior. When making your own programs, you should definitely not rely on open bus, unless you want to use it as a type of copy protection (which is also done in at least one NES game), but even then you're risking breaking something on an obscure hardware revision or something.
tokumaru wrote:
unless you want to use it as a type of copy protection (which is also done in at least one NES game)
If you're referring to Low G Man, I really don't believe there has ever been any good proof that it was really done for copy protection. All cases of relying on open bus behavior that I've seen have been (I believe) due to bugs in code, or misunderstanding of the hardware.
Low G Man is a bug, for sure.
The only thing I can think of is the modified CNROM, maybe?
tokumaru wrote:
You generally shouldn't rely on open bus, but some games on the NES do.
There's good relying on open bus, and then there's bad relying on open bus. The bug in the music engine of Low G Man is the bad kind. But I had planned to use open bus at $4016 to help identify the connected controllers and whether it's an NES or a Famicom. I won't explain here because it's an advanced technique; those who are interested can see Riding the open bus.
Quote:
[Mindscape games' reliance on unused bits of controller ports] ended up breaking these games on the PowerPak, which for some electric reason I'm not capable of explaining, changes the open bus behavior.
Pull-up resistors on D0-D7 was a hack added early on to fix OAM DMA problems in early revisions of the PowerPak.
Even mapper 185 is really just testing for "Contains correct data / does not contain same data", not exactly relying on open bus...
Seems like I was wrong about the copy protection thing, sorry about that!
Espozo wrote:
Wait, so "sta $2100" is the same in banks $00-$3F as it is in banks $80-$BF?
Yes. In all cases it's a write to INIDISP.
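A small Python sketch of that mirroring, for what it's worth. The range list is an assumption made for the example (a simplified set of console-internal areas, leaving out anything the cartridge might map into the lower half), and `canonical` is a hypothetical helper name.

```python
# Assumed, simplified list of address ranges backed by the console itself.
INTERNAL_RANGES = [
    (0x0000, 0x1FFF),  # low 8 KB of work RAM
    (0x2100, 0x21FF),  # PPU/APU registers (INIDISP is $2100)
    (0x4000, 0x43FF),  # joypad, CPU, and DMA registers
]


def in_system_bank(bank):
    """True for banks $00-$3F and $80-$BF."""
    return (bank & 0x7F) <= 0x3F


def canonical(bank, addr):
    """Collapse mirrored internal addresses onto bank $00.

    Anything else (ROM, cartridge hardware, banks without a system
    area) is returned unchanged.
    """
    internal = any(lo <= addr <= hi for lo, hi in INTERNAL_RANGES)
    if in_system_bank(bank) and internal:
        return (0x00, addr)
    return (bank, addr)


# `sta $2100` reaches the same register from any of these data banks.
print(canonical(0x00, 0x2100), canonical(0x3F, 0x2100), canonical(0x80, 0x2100))
# The ROM half of each bank is not mirrored this way.
print(canonical(0x00, 0x8000), canonical(0x01, 0x8000))
```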
Quote:
One thing I've noticed though is it looks like in the memory map I found, there's extra random junk near $8000 in banks $00-$0F, $30-$3F, and $80-$8F.
Yeah, some games undoubtedly put DSP registers or RAM in there, and there's no reason that has to be identical across all banks with system areas. But the internal stuff is, unless I'm sorely misled.
DSP? You mean like the DSP-1 enhancement chip? (That's probably not even close...)
Yes, actually. In a Mode 21 cartridge with a DSP-1, the I/O registers for the DSP-1 are mapped to $6000-$7FFF in banks $00-$0F and $80-$8F. There also appears to be an area of extra RAM in the same address range in banks $30-$3F, but I imagine that's optional...
Just be aware that if you start getting fancy, you can end up with extra stuff mapped in. If you aren't using the extra stuff, you don't need to worry about it yet. I know a fair bit about the Super FX memory map, even features no game ever used, because I think I'll need that stuff. The SA-1? Forget it - if I ever start seriously working on F-Zero SX or something like that, I'll have to learn it, but right now there are better places to focus my energy...
93143 wrote:
if I ever start seriously working on F-Zero SX or something like that
Will you be able to go 3000+km/h with snaking, shift boosting, and MTS? You have to at least add Fat Shark.
Naw, I was thinking I'd back off a little on the unphysical exploit-type maneuvers. The part that makes you go fast is your engines, and the physics model would reflect this. Drifters would work like in the GBA games, where you can pre-steer by compensating with the opposite drifter, but everything would be carefully tuned and tested to make sure there's no way to move any faster than by just driving in a straight damn line.
And I'm not sure the Fat Shark fits in with the rest of the palettes; there are only five or six free and every machine on the screen has to fit into those. In any case I wasn't going to have more than about 12-16 machines total - ROM limits, you see. I came up with a cool idea, and it eats ROM for breakfast... I may have to trash it... (I'd say "we'll see", but quite frankly we almost certainly won't...)
This would be a prequel of sorts to F-Zero X, set before the big accident; accordingly the speeds would be roughly X level at best, Blood Falcon definitely wouldn't be in it, and the driver of the Red Gazelle would be Clinton Gazelle (yes. And if you're curious, also yes)...
None of this is set in stone, obviously. It's basically daydreaming at this point, and likely to remain so.
93143 wrote:
Naw, I was thinking I'd back off a little on the unphysical exploit-type maneuvers. The part that makes you go fast is your engines, and the physics model would reflect this. Drifters would work like in the GBA games, where you can pre-steer by compensating with the opposite drifter, but everything would be carefully tuned and tested to make sure there's no way to move any faster than by just driving in a straight damn line.
Yeah, I think F-Zero GX has become more about executing all these maneuvers without crashing than just driving, if that even makes sense. They're practically required for story mode though. The funny thing though, is that I started to do them all during the Grand Prix races, and even on Master, there was virtually no competition, as I'd finish a whole 30 seconds above anyone else on some courses. That's why it's more fun to play on max speed.
Anyway though, yeah, I have no clue how all those sort of "bugs" exist. I've heard that some of the staff ghosts use them though, like shift boosting in Lateral Shift, which would explain how they go so fast even in some of the worse vehicles.
93143 wrote:
there are only five or six free and every machine on the screen has to fit into those.
Yeah, I've always thought that even the SNES's color count was sorry. It's in a weird spot where it's big enough to where you don't want to have everything share palettes like a Genesis game, but not big enough to where you can safely have everything have its own palette. I still really want to try mid-screen palette swaps for sprites. Really, the number of typical sized sprites per line is about equal to 8 anyway. The biggest problem with this is actually programming it.
93143 wrote:
ROM limits, you see. I came up with a cool idea, and it eats ROM for breakfast... I may have to trash it...
I don't understand why people let ROM restrict them in this day and age, even after hearing people's reasons for it. They talk about trying to make the games authentic, but frankly, I find looking at the "true power" of the system is more interesting. I'm assuming you're not using mode 25?
93143 wrote:
This would be a prequel of sorts to F-Zero X, set before the big accident; accordingly the speeds would be roughly X level at best, Blood Falcon definitely wouldn't be in it, and the driver of the Red Gazelle would be Clinton Gazelle (yes. And if you're curious, also yes)...
Looks like someone knows their F-Zero.
I just thought of something... Are you no longer working on your other game?
Espozo wrote:
I don't understand why people let ROM restrict them in this day and age, even after hearing people's reasons for it.
Because they lack the finances to pay artists enough money to create enough graphics and maps to fill 8 MB, the limit for mode $25 on Super NES and for oversized BNROM on NES. Or because they lack the finances to create a game that's a compelling purchase by itself, and instead seek to be included on a collaborative multicart. It's like writing a short story instead of a 1000 page door-stopper. How is each of these reasons invalid?
tepples wrote:
they lack the finances to pay artists enough money to create enough graphics and maps to fill 8 MB, the limit for mode $25 on Super NES
I mean, they don't have to fill up the whole thing. It's more like telling the artists to stop because they've run out of ROM space.
tepples wrote:
Or because they lack the finances to create a game that's a compelling purchase by itself, and instead seek to be included on a collaborative multicart.
They shouldn't even have to worry about space then.
Espozo wrote:
tepples wrote:
they lack the finances to pay artists enough money to create enough graphics and maps to fill 8 MB, the limit for mode $25 on Super NES
I mean, they don't have to fill up the whole thing. It's more like telling the artists to stop because they've run out of ROM space.
On the NES, adding more memory often means adding more bank bits to the mapper, which means the mapper must ship on a larger CPLD ($). Larger CPLDs and larger flash chips also tend to have voltage incompatibility with the 5.0 V signal environment of an NES or Super NES, requiring level shifters ($), and many lack an 8-bit mode, requiring multiplexers ($).
Has "telling artists to stop" been a problem with your past projects?
Espozo wrote:
tepples wrote:
Or because they lack the finances to create a game that's a compelling purchase by itself, and instead seek to be included on a collaborative multicart.
They shouldn't even have to worry about space then.
They do if the multicart's inclusion policy rejects the entries with the lowest fun-to-space ratio so as to maximize the fun in a given amount of ROM space. Otherwise, the multicart would have to go to a bigger ROM. To take an extreme example, a 1 Mbit 5 V 8-bit DIP flash chip is cheaper than a 64 Mbit 3.3 V 16-bit TSOP flash chip ($) and the additional circuitry to make it work in a 5.0 V 8-bit environment ($) and make it flashable in circuit so as not to need a $4,000 programmer.
tepples wrote:
Has "telling artists to stop" been a problem with your past projects?
You know very well that I have no past projects.
I'd probably try and be my own artist anyway. (Well, the problem is that I'm slower than dirt though.)
tepples wrote:
Quote:
[Mindscape games' reliance on unused bits of controller ports] ended up breaking these games on the PowerPak, which for some electric reason I'm not capable of explaining, changes the open bus behavior.
Pull-up resistors on D0-D7 was a hack added early on to fix OAM DMA problems in early revisions of the PowerPak.
Pull-up resistors were there from the start for voltage level conversion. The later hack was series resistors on the data lines.
Espozo wrote:
93143 wrote:
ROM limits, you see. I came up with a cool idea, and it eats ROM for breakfast... I may have to trash it...
I don't understand why people let ROM restrict them in this day and age, even after hearing people's reasons for it. They talk about trying to make the games authentic, but frankly, I find looking at the "true power" of the system is more interesting. I'm assuming you're not using mode 25?
I'm talking about machine graphics using up nearly all of the 8 MB addressable by the Super MMC (unless someone knows better; it sure looks like 8 MB is the limit), even when tepples' sprite scaling scheme is taken into account. This cool trick of mine roughly triples the amount of space taken by the graphics. Compression is an option, I suppose; unpacking something like LZ4 on the SA-1 might end up a fairly small fraction of the overall time required to get a sprite ready for VRAM...
One cool thing about prerendered graphics is that once you have the 3D model, it's not all that time-consuming to make more frames of animation from it.
Quote:
Are you no longer working on your other game?
It's a bit stalled right now, but my plans haven't changed. I've pretty much satisfied myself that the SNES-side display engine will work, and I've continued planning how best to use the Super FX to plug the holes in the console's feature set. The next major step is a bullet test on the GSU, but it's on hold while I try out a BRR looping trick (which is likely to be useful for the shmup) and then test the cool sprite trick I mentioned above (which isn't). I suppose I could rearrange my priorities...
The big thing holding me back in SNES development is my day job, basically. It's unfortunately not the kind of thing where you can just go there for 8 hours and then come home and be free. I'm trying to finish a Ph.D., and it's sucking up all my mental energy. Worse, it's cognitively similar to SNES development, so any "free time" in which I have enough mental energy to work on this hobby could in principle be better spent on my research...
93143 wrote:
Super MMC
What now? I didn't think there was some sort of special chip involved.
93143 wrote:
even when tepples' sprite scaling scheme is taken into account.
Wait, how? The sprites can't be that big (I imagine about 48x32?) And you don't need that many frames of animation, just turned sprites of the objects, and they can be horizontally flipped for turning in different directions. The only other thing I think you'd need are animations for the vehicle rotating when you press L and R, but that really only needs to be in the head on view. I guess one other thing you could do is have the vehicles flip out before they explode, but that only needs to be a simple animation and you might not even have that anyway, although it's hilarious watching your vehicle bounce around like a pinball and explode, leaving behind a giant raisin.
I'm curious though, but what do you want the SA-1 for anyway? I've always figured you could kind of use mipmapping for objects when they shrink because it takes less CPU time to shrink a smaller object, and it's highly unlikely every object onscreen is going to appear full size. It's a speed for memory tradeoff. I still imagine this would be smaller than any sort of compression.
93143 wrote:
One cool thing about prerendered graphics is that once you have the 3D model, it's not all that time-consuming to make more frames of animation from it.
They also already exist:
http://128bit.me/index.php?topic=27054. ... #msg192165
Yeah though, I really just need to get my initialization routine working with the change to HiROM, and somehow organize things better, although I'm not entirely sure what I'm doing wrong here, which is a problem. Koitsu, would it even be at all beneficial for me to use the code you edited, or should I just try with what I have?
I should really try to do something like Pong first before I do something as ambitious as this thread's title...
Espozo wrote:
I should really try to do something like Pong first before I do something as ambitious as this thread's title...
This is probably the best thing I've seen you post in this (or the other gargantuan snesdev thread) so far. Yes, you can and should do exactly that, and you don't need mode 21 for that either.
If you really want proof of that, go look up some homebrew games like the one called Shoot Your Load (don't let the name fool you into thinking it's some pr0n thing, though the title screen does have something funny) -- that's a 4-player (you heard me right) Asteroids-like game for the SNES -- 4mbit (512KByte) ROM, using mode 20, but only about half the ROM is used (i.e. it's more like a 2mbit/256KByte game). It's PAL (meaning 50Hz/50 fps), but it'll run fine on an emulator. Example gameplay video:
https://www.youtube.com/watch?v=uXEXLFFMV0M
Mode 21 is absolutely the least of your concerns right now. You have way, WAY bigger fish to worry about.
Espozo wrote:
93143 wrote:
Super MMC
What now? I didn't think there was some sort of special chip involved.
It's part of the SA-1. It maps 8 MB of ROM to 4 MB of address space (more or less - the SA-1 map is the most complicated I've ever seen).
Quote:
Wait, how?
Multiaxial rotation. With non-flat tracks, I need views from above and maybe even below. Combined with my memory-eating cool trick, just carelessly proliferating desired viewing angles could easily consume over a megabyte for one machine.
Quote:
I'm curious though, but what do you want the SA-1 for anyway?
Sprite scaling, real-time HDMA table generation for non-flat Mode 7 tracks, map compositing at 4 KB per frame for two-player mode or possibly even more for faking multilayer Mode 7 in single-player mode, software rendering of track past the Mode 7 horizon, software rendering of heavily-fogged terrain below Mode 7 tracks, 3D projection and line drawing for windowing tricks... and of course that cool sprite trick I've been talking about, which is actually very CPU-heavy... not to mention decent A.I. and 3D physics for several racers on screen at once and more offscreen. And most of the graphical elements have to update at 60 fps, otherwise it's not an F-Zero game.
The Super FX is more powerful, but it has a couple of disadvantages:
1) it can only address 2 MB. Anything more has to be installed in parallel, so only the S-CPU can see it. Notice how the above litany of computationally intensive tasks requires access to pretty much all of the game data...
2) it can't share memory with the S-CPU. The SA-1 can generate an HDMA table and then simply leave it in BWRAM for the SNES to use. With the Super FX, the S-CPU would have to DMA it out into WRAM first, or else lock the GSU out of its own RAM, rendering it useless (and unable to update the table). And since DMA and HDMA are incompatible in general on early S-CPUs, this DMA would have to happen during VBlank, restricting bandwidth for other stuff.
Also, the Super Accelerator is sadly underutilized, and it would be neat for it to have a showcase game...
Quote:
I should really try to do something like Pong first before I do something as ambitious as this thread's title...
I was thinking the same thing. So far I've just been exploring the feasibility of this shmup port, and I've done some neat mockups that stretch the system, but before I start development in earnest I want to have a better idea of how games are put together.
Another member here did a (primitive) game per month for a year; I don't know if I want to get that hardcore about it, but it's certainly the right idea...
Espozo wrote:
I should really try to do something like Pong first before I do something as ambitious as this thread's title...
For goodness sake, try to do (i.e., get done) something -- anything! -- before yadayada.
Espozo wrote:
93143 wrote:
even when tepples' sprite scaling scheme is taken into account.
Wait, how?
The technique is essentially a software version of Neo Geo sprite shrinking. You store a mipmap of sprites at power of 2 sizes (48x32, 24x16, 12x8), and then you use a lookup table to shrink each bitplane of each 8x1 pixel sliver. The table shrinks each bitplane aligned to the right; for the right half of a 16x16, you'd use the multiplier register to shift the shrunken sliver to the left.
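A rough Python sketch of the lookup-table idea, to show the shape of it. The details here are guesses for illustration: `build_shrink_table` and `shrink_sliver` are hypothetical names, the table is built from a "keep mask" saying which of the 8 columns survive, and a plain multiply by a power of two stands in for the multiplier register doing the left shift.

```python
def build_shrink_table(keep_mask):
    """Build a 256-entry table that shrinks one bitplane byte of an 8x1 sliver.

    Bit 7 is the leftmost pixel. Columns whose bit is set in keep_mask are
    kept, the rest are dropped, and the result is aligned to the right.
    """
    table = []
    for value in range(256):
        out = 0
        for bit in range(7, -1, -1):          # leftmost column first
            if keep_mask & (1 << bit):
                out = (out << 1) | ((value >> bit) & 1)
        table.append(out)
    return table


def shrink_sliver(planes, table):
    """Shrink every bitplane of a sliver with the same table."""
    return [table[p] for p in planes]


# Keep every other column: 8 pixels become 4, right-aligned.
half = build_shrink_table(0b10101010)

# One 4bpp sliver = four bitplane bytes (made-up pixel data).
sliver = [0b11001010, 0b11111111, 0b00000000, 0b10000001]
shrunk = shrink_sliver(sliver, half)
print([format(p, "08b") for p in shrunk])

# Shifting the 4-pixel result left by 4 columns is a multiply by 16.
shifted = [(p * 16) & 0xFF for p in shrunk]
print([format(p, "08b") for p in shifted])
```

Other scale factors would just be other keep masks, so the cost per sliver stays at one table lookup per bitplane regardless of the size.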
93143 wrote:
Another member here did a (primitive) game per month for a year; I don't know if I want to get that hardcore about it, but it's certainly the right idea...
It seems like if you've done one, you've done them all. Well, they'd have to vary in complexity.
koitsu wrote:
This is probably the best thing I've seen you post in this (or the other gargantuan snesdev thread)
Which one?
Anyway, I've always had this on my mind, but I never liked the prospect of doing something like Pong that I really can't use as a base for anything else, but seeing that I've already started over before and it's already become hopeless again, I really have no choice.
koitsu wrote:
and you don't need mode 21 for that either.
I wanted to switch over to it so I wouldn't have to change everything later, but like I said earlier, I'm probably not even going to use anything from it anyway. So yeah, I'll stay with mode 20 for this. The weird thing to me though is that it never seemed like I had to deal with the memory map there (which is why I said I was so oblivious to it). I found the document with the memory maps on it, and I'm downloading it now.
koitsu wrote:
You have way, WAY WAY, WAY
Actually though, I thought of one thing: Could you upload what you did for me anyway, because I want the batch file then. (Or, you could always just upload that.)
Yeah, I got rid of all the fancy stuff I did and cleaned stuff up a bit (the formatting was all over the place) but I can't even get it past the initialization macro at the beginning, in that I put a breakpoint right after it but nothing happened. I pressed step several hundred times and it was all good until it got into a loop for clearing CGRAM. I don't know if it broke there in an infinite loop, but it had to go through this whole thing for 256 colors. I put a breakpoint a few bytes after where the loop was, but it didn't work. The fact that it doesn't output anything for macros is a bit of a problem... I'm "trying" though.
No, I won't upload what I currently have (I ran it through diff -ruN and it's somewhat of a mess), but I can clean it up so that it's got as few/minimal changes as possible vs. the original yet still experiences the same bug as before (which has nothing to do with my changes). I'm a little disheartened that despite 3+ pages of text and several people you still aren't exactly sure what to do/fix/change though. :\ On the bright side, you're at least persistent and dedicated.
I'll try to do that tonight or tomorrow and upload the results here. I can't make any promises though; I have stuff I need to do tonight and this week for work.
P.S. -- You can use mode 20 to address up to almost 32mbits/4MBytes of ROM. Board type SHVC-8PV5B supports this; banks $00-7D (126 banks) at 32KBytes ($8000-FFFF) each, means 126 * 32768 = 4,128,768 maximum amount of data (just short of 4MBytes; you lose 2 banks because of banks $7E-7F being RAM). Page 1-2-27 describes this memory model. So ask yourself why you're fooling around with mode 21 especially this early in your learning process. If you really need something larger than 32mbits (up to about 63mbits (just short of 64mbit because of banks $7E-7F)), then mode 25 is your only option, and I absolutely downright refuse to go into any of that given what I've just witnessed in this thread. IMO, respectfully, you have absolutely no reason to be focused on any memory model other than mode 20 right now.
koitsu wrote:
You can use mode 20 to address up to almost 32mbits/4MBytes of ROM. Board type SHVC-8PV5B supports this; banks $00-7D (126 banks) at 32KBytes ($8000-FFFF) each, means 126 * 32768 = 4,128,768 maximum amount of data (just short of 4MBytes; you lose 2 banks because of banks $7E-7F being RAM).
"Almost"? Why can't one just use banks $80-$FF with mode $20 and get the entire 32 Mbit?
koitsu wrote:
I'm a little disheartened that despite 3+ pages of text and several people you still aren't exactly sure what to do/fix/change though. :\
Mostly in what to organize.
Actually, screw it though, this is only like 3 actual files so it's not at all hard to figure it out. If anyone has any tips to better organize it, here it is. It's LoROM.
Attachment:
Starting Over, Again....zip [189.2 KiB]
tepples wrote:
koitsu wrote:
You can use mode 20 to address up to almost 32mbits/4MBytes of ROM. Board type SHVC-8PV5B supports this; banks $00-7D (126 banks) at 32KBytes ($8000-FFFF) each, means 126 * 32768 = 4,128,768 maximum amount of data (just short of 4MBytes; you lose 2 banks because of banks $7E-7F being RAM).
"Almost"? Why can't one just use banks $80-$FF with mode $20 and get the entire 32 Mbit?
Yeah, I didn't consider that. Sure, go ahead.
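The arithmetic for both cases, as a quick Python check. `mode20_capacity` is just a name picked for this example; it assumes every bank in the range contributes exactly its $8000-$FFFF window.

```python
BANK_WINDOW = 0x10000 - 0x8000            # $8000-$FFFF = 32768 bytes per bank


def mode20_capacity(first_bank, last_bank):
    """Bytes of ROM reachable through the upper halves of a bank range."""
    banks = last_bank - first_bank + 1
    return banks * BANK_WINDOW


low_banks = mode20_capacity(0x00, 0x7D)   # 126 banks, stops before $7E-$7F
high_banks = mode20_capacity(0x80, 0xFF)  # 128 banks, no RAM in the way

print(low_banks)                          # bytes via banks $00-$7D
print(high_banks, high_banks * 8 // (1024 * 1024), "Mbit")
```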
Here you go. Associated video:
https://www.twitch.tv/koitsu/v/61436037 (a little over an hour long)
What I ended up doing was getting rid of all of your .segment nonsense inside of each individual .asm file -- except for Main.asm, which is what effectively says what file should use what segment. As such, you'll see pretty much everything ends up in CODE_UPPER so that the code (effectively) works the same way as it did in mode 20.
One complication stems from use of the RODATA segment (for things like the zero_fill_byte byte used by InitializeSNES). You only had one thing using that, and one thing that "wanted" to use it but was commented out. I stuck those two things in Main.asm's RODATA segment.
I also got rid of the bootstrap routines I made for the vectors, since sticking everything in CODE_UPPER effectively nullifies the need for that (it only matters if you end up needing to run code outside of the upper 32KByte regions of banks $00-3F).
Near the end of the video, I point you to Rachel Weil's github repository for a simple little NES (not SNES) thing she made. I point you to this as an example of how to organise code -- you'll find ONE ASM FILE. Your existing SNES project has code scattered throughout __TEN FILES__ (not to mention the whole Objects\Object1 stuff), yet the code really is not "complex enough" to remotely justify that many files. It's mindboggling to me. I've done entire projects where I've used maybe at most 3 .asm files, which assemble tens of thousands of lines of code. If you're going to split things into tons of .asm files, you need to understand how to rely on a better build system and not repeated use of .include statements -- the latter is very KISS principle (and I fully support it!) but it isn't a style that ca65/ld65 caters to (it's a style "linear" assemblers cater to).
Now, all that said:
You have a massive/major/serious/catastrophic problem with some piece of code somewhere within start_object_identifier. This has _nothing_ to do with my changes, and I prove that in the video. As such, I've commented out jsr start_object_identifier within InfiniteLoop, and added this comment:
Code:
; Some code in start_object_identifier (or code called within that)
; is doing something bad with either direct page, the D register, or
; possibly the stack. Eventually crazy things start happening, such
; as B=$45 and other nonsense.
This is followed by erroneous behaviour that's quite funny to watch in the debugger.
My gut feeling is that you screwing with D (you set D=$0200) causes major issues because other existing code/routines/setup previously expected direct page at $0000. In other words: I believe you're setting D=$0200, and then continue on without putting it back to where it should be. The next time routines run, they read variables/data from direct page, but all those are now wrong (e.g. lda $00 is going to read from $0200 not $0000 like before).
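The effect is easy to see with the address arithmetic written out. This is a minimal Python sketch, assuming native mode, where a direct page access lands in bank $00 at D plus the operand, wrapping at 64 KB; `direct_page_address` is a made-up helper name.

```python
def direct_page_address(d_register, operand):
    """Effective bank-$00 address of a direct page access such as `lda $00`.

    Assumes native mode: the 8-bit operand is added to the 16-bit D
    register and the result wraps within bank $00.
    """
    return (d_register + (operand & 0xFF)) & 0xFFFF


# The same instruction, before and after D is moved.
before = direct_page_address(0x0000, 0x00)   # what older routines expect
after = direct_page_address(0x0200, 0x00)    # what they get once D=$0200
print(hex(before), hex(after))
```

So a routine written for D=$0000 that runs with D=$0200 is not reading "slightly wrong" data; every one of its direct page variables is 512 bytes away from where it thinks it is.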
Please don't go blindly adding some lda #$0000, tcd statements to some code until it works. You need to understand what it is you're doing with your own code. Work through it, find the bug, and understand it.
Otherwise you have some code that's stomping all over RAM contents (I peeked at the memory contents at one point -- they become a giant mess, usually filled with a lot of zeros), and it's not my job to figure out why. :-) You're the author!
Good luck -- this is as much as I'm willing to do.
koitsu wrote:
Here you go. Associated video:
https://www.twitch.tv/koitsu/v/61436037 (a little over an hour long)
I'll be sure to watch it.
koitsu wrote:
I also got rid of the bootstrap routines I made for the vectors, since sticking everything in CODE_UPPER effectively nullifies the need for that (it only matters if you end up needing to run code outside of the upper 32KByte regions of banks $00-3F).
I thought that was basically the point though.
Anyway though, I'll probably come back to this, but I'll try and get Pong in LoROM first. I value your time, even if it doesn't seem like it.
Even if I don't use the exact files you've given me, I'll (hopefully) have the knowledge to make them.
koitsu wrote:
I point you to this as an example of how to organise code -- you'll find ONE ASM FILE.
Looking at it, it appears that everything code related is in ONE ASM FILE, but all the graphics data and other stuff is in separate files.
Following that concept, I've crammed everything (aside from the header and memory map. I'm not entirely sure how to fix those) into one file. The initialization routine is no longer a macro and is in the code because there's no point in it not being, and the clear vram and clear cgram are separate. I also cleaned other stuff up, as I'm making a big effort now to put everything into my own format.
Attachment:
Main.asm [12.95 KiB]
By some miracle, it actually appears to work in that some values in RAM are affected by the controllers (and RAM is all #$00 aside from these registers).
Just one question, is it possible to set something equal to something else (like RButton= 16) in the "Code" segment if it's not in a procedure? All my stuff is currently at the top of the file.
No -- the point (as I describe in the video) was to get your "stuff" working in mode 21 as easily and with as few changes as possible. I only made one actual "idealistic" change, which was getting rid of all the .segment crap in each file and instead just relied on Main.asm to define it as appropriate.
You can re-visit this problem when 1) you understand the SNES memory map better, and 2) your program code starts exceeding 2MBytes (banks $00-3F = 64 banks, with only the upper 32KBytes of each available: 64 * 32768 = 2,097,152 bytes).
You can still put whatever you want (ex. data) in the lower 32KBytes that's associated with banks $C0-FF, though right now snes.cfg only defines an entry for one of those (lower 32KB in bank $C0); the other banks are essentially undefined. But it's up to you to ensure that you're accessing the data there correctly (using 24-bit addressing, changing B to the appropriate bank, a combo of the two (likely), or using DMA (also likely)).
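As a rough illustration of the 24-bit addressing option (a sketch only: the segment name DATA_C0 and the label Table are hypothetical and would have to match whatever snes.cfg actually defines), absolute long addressing reads the data no matter what B currently holds:
Code:
.p816
.segment "DATA_C0" ; hypothetical segment mapped to the lower 32KB of bank $C0
Table: .byte $11, $22, $33, $44
.segment "CODE_UPPER"
.proc ReadTable
sep #$20 ; A=8
.a8
rep #$10 ; X/Y=16
.i16
ldx #$0002
lda f:Table,x ; f: forces absolute long,X addressing, so B is ignored
rts
.endproc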
Equates, or what ca65 calls "numeric constants", can be set anywhere (in any scope -- including globally), but they need to be defined before they can be used. There are examples of such being used in your InitializeController.asm file for some of the SNES memory-mapped registers (those would be within the scope of the CODE_UPPER segment though, given the changes I made to Main.asm).
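For instance, a global equate can sit at the top of the file, outside any .proc, and be used by code further down. This is only a sketch: the name RButton comes from the question above, JOY1L and ReadR are names made up for illustration, and it assumes B points at a bank where the $4218 register is visible:
Code:
.p816
RButton = $10 ; same as 16; defined at file scope, before first use
JOY1L = $4218 ; joypad 1 data, low byte
.proc ReadR
sep #$20 ; A=8
.a8
lda JOY1L
and #RButton ; result is nonzero if the masked bit is set
rts
.endproc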
With a "linear assembler", all this segment and scope crap doesn't really matter -- it's all "programmers masturbation material". All that really matters is 1) if what ends up in the ROM file is correct (meaning it does what intended/assembled to the correct instructions), and 2) that the code was written with knowledge of what the memory map (mode 20, 21, etc.) was. Seriously. I've said this before, but I spend more time "fighting" with ca65/ld65 than I do actually writing code -- in olden days, it was the exact opposite.
P.S. -- Please do not use that avatar; I thought you were rainwarrior for a moment. I can assure you that you will get banned pulling shit like that. This site has a very simple rule: don't be a dick (Wheaton's Law). The logic is pretty simple: "if you don't know what that encompasses or means, then you're probably being a dick".
koitsu wrote:
I can assure you that you will get banned pulling shit like that.
Yeah, I don't feel like getting banned, so it's gone.
What was wrong with his avatar?
I wonder why there's all this fuss about ca65? Byuu has a very nice assembler on his website, it has none of that segment crap on it. It's pretty much tell the assembler what kind of ROM it is, and what CPU address to start the code at, and that's all.
koitsu wrote:
Otherwise you have some code that's stomping all over RAM contents (I peeked at the memory contents at one point -- they become a giant mess, usually filled with a lot of zeros), and it's not my job to figure out why.
His Vblank routine is messed up:
Code:
.proc VBlank
rep #$30 ;A=16, X/Y=16
pha
phx
phy
phd
phb
php
lda #$0000
tcd
----------------------- snip -----------------------
jsr get_input ;update joypad data
lda $4210 ;clear NMI Flag
plp
plb
pld
ply
plx
pla
rti
.endproc
It pushes a 16-bit value for A, but it only pulls back an 8-bit value. (The get_input subroutine does a php / plp, so when it returns from that routine, A is still 8-bit.) That means that the stack pointer decreases by (at least) one 60 times per second, hence the data mess in RAM. Edit: My mistake, it should indeed pull 16-bit values back, thanks to the plp. Not exactly the most obvious/elegant way to do it, IMHO.
Also worth noting is that the php / plp combo in the Vblank routine is completely useless (Edit: though not in this particular case actually, see above), as the processor status gets pushed onto the stack automatically as soon as NMI fires, and rti pulls it back along with K and PC.
Also, the php occurring after the initial rep #$30 doesn't make any sense at all. (Using php, you'll want to preserve an unchanged processor status and therefore not mess with rep / sep beforehand.)
Anyway, the bug must be lurking elsewhere in the code.
Besides, I have a question: I see many people do pha / pla and phx / plx during VBlank. I know what it is for (I use the stack myself), but is it really necessary for VBlank?
I mean, normally none of your code is running when VBlank gets called, because you are either sitting in a wai or in a loop that waits for VBlank?
Well, you can wait for the VBlank ... But the VBlank can also occurs when you execute code in your main loop. You never know ...
I understand, but in general I always watch the CPU usage, so my code finishes fast enough before VBlank starts.
Kannagi wrote:
I understand, but in general I always watch the CPU usage, so my code finishes fast enough before VBlank starts.
Can you guarantee that, even in a worst-case slowdown situation? It's only about a dozen or so cycles.
Just don't be at 90% CPU (it is a good exercise).
Beyond that, if it turns out that my game loop really is too long, I may reconsider and add the push / pop (I worked on optimization first).
You could disable the interrupts and re-enable it at the end of the frame, just to be on the safe side. You might get a black bar at the top from entering vblank late on busy frames, but it's still a lot more stable than interrupting an unfinished frame.
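In register terms this would mean clearing and later setting bit 7 of NMITIMEN ($4200). A minimal sketch, assuming A is 8-bit, B points at a bank where the registers are visible, and auto-joypad read (bit 0) should stay on; FrameLogic is a placeholder name:
Code:
lda #$01 ; NMI off (bit 7 clear), auto-joypad read on
sta $4200
jsr FrameLogic ; hypothetical per-frame game code, not interrupted by NMI
lda #$81 ; NMI on again (bit 7 set), auto-joypad read on
sta $4200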
Too bad I can't do this because I run software sprite rotation code while it's waiting for vblank. I think I used to do it this way, actually.
Kannagi wrote:
Besides, I have a question: I see many people do pha / pla and phx / plx during VBlank. I know what it is for (I use the stack myself), but is it really necessary for VBlank?
I mean, normally none of your code is running when VBlank gets called, because you are either sitting in a wai or in a loop that waits for VBlank?
Yes, it is necessary. Think about this situation:
The VBlank routine (which is tied to NMI) does this:
Code:
lda #$1234
rti
Main code (outside of VBlank):
Code:
lda #$4444
sta $2122
Now tell me what gets written to $2122 when NMI fires between the two instructions in the main code. :-) Use of wai etc. doesn't change this situation. Hence, pushing A/X/Y/B/D/P and pulling P/D/B/Y/X/A is an extremely standard/common operation in interrupt routines.
I don't really understand your example, but in my code $2122 is $44.
(But $2122 should normally be written during VBlank.)
I changed the code a bit I did this:
Game loop :
Code:
lda $00
sta $2121
rep #$20
lda #$4444
sta $2122
sep #$20
;code game
VBlank :
Code:
;code
rep #$20
lda #$1234
sep #$20
rti
And I didn't see any change.
Kannagi wrote:
I don't really understand your example, but in my code $2122 is $44.
(But $2122 should normally be written during VBlank.)
The whole point is: If NMI should fire right in between the two instructions in the main code, then an immediate value (i.e., the number) of $1234 gets written to the destination (whatever that may be) instead of the intended immediate value of $4444.
What Ramsis said. In other words: I don't think you understand the fact that NMI can actually happen at any time (it's tied to VBlank in most cases, so you can "sort of" know when it's going to happen, but not definitively -- hence tepples asking the question).
When an interrupt occurs, the contents of all registers aren't saved or restored automatically. So in the example I gave, think about what happens. This is LITERALLY what goes on within the processor:
Code:
Main: lda #$1234 ; A=$1234
{NMI begins}
NMI: lda #$4444 ; A=$4444
NMI: rti
{NMI ends}
Main: sta $2122 ; $4444 is written to $2122
Now, if you save/restore the accumulator in your NMI routine, all is fine:
Code:
Main: lda #$1234 ; A=$1234
{NMI begins}
NMI: pha
NMI: lda #$4444 ; A=$4444
NMI: pla ; A=$1234
NMI: rti
{NMI ends}
Main: sta $2122 ; $1234 is written to $2122
So, to recap: "is saving/restoring all the registers in an interrupt routine really needed?" The answer is YES.
I'm well aware that on 65816 you can use wai to pause/wait for NMI (VBlank) to end, but there's going to come a time where you've got code running in a main (non-VBlank) loop and NMI kicks in, and suddenly all your registers change for a reason not immediately apparent. So if your NMI routine uses A/X/Y registers, tweaks B, and/or changes register sizes, then you absolutely need to push A/X/Y/B/P and restore those before doing the rti, otherwise once returning control to your main loop (non-VBlank), previous register contents are trashed.
Yes, I understand. But can the VBlank really occur at any time?
Doesn't it normally come at the end of the final scanline?
Is there an example where VBlank comes earlier?
When does NMI fire besides vblank?
Let's put it this way: yes, you're going to try to do the best you can to make sure your code doesn't take too long and create lag frames. But in case it does happen (and it's likely there will be some edge case where it does), which would you rather happen?
a) The game just lags.
b) The game behaves erratically, writing unpredictable values to unpredictable locations, possibly crashing and burning, drawing garbage to the screen, making branches misfire, sending garbage to the SPC, causing all sorts of unreproducible, nigh-undebuggable problems.
Because b) is what you're risking if you don't preserve everything in your interrupt handlers. It is not worth it to leave those out just to save a few cycles in your handlers, just because you're pretty sure you won't lag.
Kannagi wrote:
Yes, I understand. But can the VBlank really occur at any time?
Doesn't it normally come at the end of the final scanline?
Is there an example where VBlank comes earlier?
When an interrupt occurs, it can happen at any time, including in the middle of an instruction. The interrupt handler won't kick in until the actively-executing instruction completes. In other words, even if NMI occurs in the middle of the lda #$1234 instruction in my example, transfer to the NMI vector won't happen until after the lda #$1234 completes. This is well documented throughout all decent 65xxx documentation (for the 65816, see Western Design Center's documentation). wai can be used to wait for an interrupt (NMI in this case) to occur and finish, but you already know that.
None of the above has any relevance to what's being discussed.
What's being discussed is the fact that register contents are not automatically saved/restored when entering/exiting an interrupt. It's your responsibility to save them. Otherwise when the interrupt exits (rti) and control is returned to the "main" program, register contents are whatever they were prior to rti being executed. This is also documented clearly in 65xxx documentation, but I'll even quote WDC for you (page 277):
Quote:
Interrupt-Handling Code
To correctly handle 65x interrupts, you should generally, at the outset, save all registers and, on the 6502 and in emulation mode, clear the decimal flag (to provide a consistent binary approach to arithmetic in the interrupt handler). Returning from the interrupt restores the status register, including the previous state of the decimal flag.
During interrupt handling, once the previous environment has been saved and the new one is solid, interrupts may be enabled.
At the end of handling interrupts, restore the registers in the correct order. RTI will pull the program counter and status register from the stack, finishing the return to the previous environment, except that in 65802/65816 native mode it also pulls the program bank register from the stack. This means you must restore the mode in which the interrupt occurred (native or emulation) before executing an RTI.
In other words: back up your registers (including P, D, B, or anything else you tweak in your NMI routine), and restore them (in the correct order) before doing rti. The end.
psycopathicteen wrote:
When does NMI fire besides vblank?
NMI is generated on any rising edge of the signal, so anything that causes it to go from low to high can trigger a spurious NMI.
On the NES this can happen if you flip the mask bit via the control register during vblank. I believe SNES has an equivalent thing? Though, this is actually a case where you know exactly where the NMI is going to fire in your code, since it's directly triggered by a register write.
I don't think koitsu was intentionally talking about this, though. I think he just meant to express that it's hard to predict where vblank is going to begin relative to your code, so you should never rely on this without good reason to do so.
koitsu wrote:
In other words: back up your registers (including P, D, B, or anything else you tweak in your NMI routine), and restore them (in the correct order) before doing rti. The end.
Once again, P (processor status) is saved automatically when an interrupt fires, and restored automatically by the rti opcode. In other words, there's normally no need to do php / plp at the beginning and end of your IRQ/NMI handler.
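Putting the advice from this thread together, a handler that saves everything it touches but leaves P to the hardware might look like the following. It is only a sketch in ca65 syntax, assuming native mode and a handler located in bank $00; the body is left as a placeholder:
Code:
.p816
.proc VBlank
rep #$30 ; A=16, X/Y=16, so the full registers get pushed
.a16
.i16
pha
phx
phy
phd
phb
phk
plb ; B = K (bank $00), so $4210 below reaches the register
lda #$0000
tcd ; D = $0000
sep #$20 ; A=8
.a8
lda $4210 ; read RDNMI to acknowledge the NMI
; ... VBlank work goes here ...
rep #$30 ; back to 16-bit so the pulls match the pushes
.a16
.i16
plb
pld
ply
plx
pla
rti ; pulls the P, PC and K that the interrupt pushed
.endproc
Switching to 16-bit registers before the pushes and again before the pulls keeps the stack balanced regardless of what register sizes the interrupted code was using.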
Nicole wrote:
b) The game behaves erratically, writing unpredictable values to unpredictable locations, possibly crashing and burning, drawing garbage to the screen, making branches misfire, sending garbage to the SPC, causing all sorts of unreproducible, nigh-undebuggable problems.
And this doesn't just apply to registers -- you also have to watch your variables.
Here's an example:
In bunnyboy's SNES PowerPak firmware, there's an 8-byte long DP (= Direct Page) variable called "temp". He uses this variable as quick, known-volatile data storage for pretty much everything, including a scrolling subroutine executed during Vblank. What he didn't account for in his software design was the fact that NMI could fire at any time -- and I didn't realize it or encounter any bugs either, at least not until I decided to use temp through temp+7 during active display for filename calculations. I was just stuck wondering why filenames randomly wouldn't show up at all in the file browser -- until I realized my (i.e., his) mistake, and made this commit. Everything's been working fine ever since.
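One way to avoid this kind of clash is to give the NMI handler scratch space of its own. A sketch in ca65 syntax; the segment name ZEROPAGE depends on the linker config, and nmi_temp is a made-up name, not necessarily what the PowerPak commit uses:
Code:
.segment "ZEROPAGE"
temp: .res 8 ; scratch for main-loop code only
nmi_temp: .res 8 ; scratch for code running inside NMI only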
@Nicole
I'm not talking about good practice.
Everyone is free to choose their own technique, but for me, my game code must finish before the VBlank.
So point b) never happens, and yet I'm not making a small game.
It reminds me of the C language: one bad pointer = a program that does not work.
But I want to know what I'm doing, so I wanted confirmation that NMI = VBlank, and so that if my code finishes before VBlank, saving to the stack isn't really needed.
Of course I don't advise it.
Edit:
Ramsis is right, I have to save the temporary variables and make my code around the DMA buffer safer :/
I think that is easier than guaranteeing the code finishes before VBlank.
Ramsis wrote:
koitsu wrote:
In other words: back up your registers (including P, D, B, or anything else you tweak in your NMI routine), and restore them (in the correct order) before doing rti. The end.
Once again, P (processor status) is saved automatically when an interrupt fires, and restored automatically by the rti opcode. In other words, there's normally no need to do php / plp at the beginning and end of your IRQ/NMI handler.
You're absolutely right, I had completely forgotten about/overlooked that fact.
psycopathicteen wrote:
I wonder why there's all this fuss about ca65? Byuu has a very nice assembler on his website, it has none of that segment crap on it. It's pretty much tell the assembler what kind of ROM it is, and what CPU address to start the code at, and that's all.
In my experience with bass:
Accidentally putting white space where it shouldn't be freezes up the program; bass is the only assembler that seems to have this problem.
Opcode sizes have to be watched carefully by the user; ca65 manages this for you.
The error messages don't say a whole lot.
Comments are done with //, so snippets from other people (even yourself) with comments from ANY other assembler will throw you off.
Where's the enum feature? It isn't very convenient, having to map out variable offsets on your own through defines.
I'd say these are enough reasons to keep Espozo with ca65 for now. So much difference in assemblers, I doubt he'd ever get around to actually coding (I probably wouldn't).
From what I hear, it sounds like if you want to, say write to $7f0000, instead of writing this:
lda #$55
sta $7f0000
You'd write this:
lda #$55
sta BSS7F_SEGMENT+$0000
or some crazy stuff like that.
psycopathicteen wrote:
From what I hear, it sounds like if you want to, say write to $7f0000, instead of writing this:
lda #$55
sta $7f0000
You'd write this:
lda #$55
sta BSS7F_SEGMENT+$0000
or some crazy stuff like that.
Assuming BSS 7F is a segment that starts at $7f0000, you'd declare like this:
Code:
.segment "BSS 7F" : far
variable: .res 1 ; ca65 reserves storage with .res
and reference like this:
Code:
lda #$55
sta variable
Or if you don't want any variables:
Code:
lda #$55
sta $7f0000