How big is your object table? (addressing problems)

by Drew Sebastino on 2016-08-23 (#178103)

I realized I'm heading toward disaster. I have everything in bank $00, and that includes everything in ram, in that nothing leaves the first 2KB. If I want my object table to be a good size, let's say 32 bytes x 64 objects = 2048 bytes, that'd be exactly the first 2KB. I was wondering, I know $00-$3F and $80-$BF have 2KB of ram at the beginning, but is it the same 2KB? Because if so, that's not helpful to me.

The main problem is that if I have my object table in banks $7E or $7F, I don't have access to any rom data. Actually, wait, I still have direct page. But wait, it can't move out of bank $00... So if I do that, I'm stuck with only the first 32KB of rom data. Because you're pretty much always going to have to use the object table, you're basically stuck with only using that much data for anything that isn't being DMA or HDMA transferred, so any tables for metasprite or animation information have to fit in there.

I suppose one thing I could do is have some of the object tables in the first 2KB of ram. It'll be nice, because I'll have an overlap of object data that can access rom and the rest of ram, depending on what I have the data bank register set to. Can you only change the bank register via PHB and PLB? (I've never done it before, so I wouldn't know.) Yeah though, I guess this won't be as hopeless as I originally thought. I can even DMA some data over to ram on startup that I'll use as rom. Too bad it'll only be at 2.68MHz though... :lol:

Re: How big is your object table? (addressing problems)
by psycopathicteen on 2016-08-23 (#178105)

Don't you mean 8kb?

Re: How big is your object table? (addressing problems)
by Drew Sebastino on 2016-08-23 (#178106)

Wait, really? Wow, I'm an idiot... False alarm. :lol:

Re: How big is your object table? (addressing problems)
by tokumaru on 2016-08-23 (#178107)

Espozo wrote:

If I want my object table to be a good size, let's say 32 bytes x 64 objects = 2048 bytes

That doesn't sound like a lot, considering this is the SNES we're talking about... The Genesis has half the RAM and Sonic still has 96 64-byte object slots. Even the 8-bit Sonic games dedicate 64 bytes to each object, but they obviously have fewer slots (32 maybe?).

I don't know what kind of game you're making, but I can tell you I'm having a hard time making everything fit in just 32 bytes per slot in my NES game. It's true that I do need more physics stuff than most NES games, but from the stuff I read here it looks like you have a lot of complexity related to sprites and patterns. Are you sure you don't want to use 64 bytes per slot? Even if you don't need all that most of the time, doing this will at least save you the trouble of figuring out which offsets can be reused for different purposes on different object types by analysing the requirements of each one to avoid conflicts.

Re: How big is your object table? (addressing problems)
by psycopathicteen on 2016-08-24 (#178134)

It's funny how Sonic has more objects than the Genesis has sprites. It sounds like they were making sure that Sonic never runs out of objects.

Re: How big is your object table? (addressing problems)
by tepples on 2016-08-24 (#178135)

As for Sonic the Hedgehog series on the Genesis

Among the 96 objects in the object table, only those that overlap the 320x224 pixel camera window need to be included in the list of 80 sprites sent to the VDP. The rest can be clipped out. But now that I think about it, the 2-player mode of Sonic 2 has two camera positions. Does it send a second sprite table during forced blanking in the middle?

As for Super NES RAM capacity

The Super NES has 128 KiB of RAM, but many addressing modes apply only to data in the current data bank. And if you want to access both RAM and ROM using these addressing modes at the same time, you'll need to put RAM and ROM in the same bank. This means you're stuck with the 8 KiB at $7E0000-$7E1FFF, which are mirrored into banks $00-$3F and $80-$BF.
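A minimal ca65 sketch of what that constraint looks like in code. The labels `ObjectType`, `TypeTable` and `FarBuffer` and their addresses are made up for illustration, and a LoROM layout is assumed; the point is only that the two absolute accesses go through the same data bank, while RAM above $7E1FFF needs a long address.

```asm
.p816
.a16
.i16

ObjectType = $0300      ; hypothetical variable in the mirrored low 8 KiB of RAM
TypeTable  = $8000      ; hypothetical table in the ROM half of the current bank
FarBuffer  = $7E2000    ; hypothetical variable outside the mirrored area

  phk                   ; assumes this code runs from a bank in $00-$3F or $80-$BF
  plb                   ; data bank = program bank
  ldx a:ObjectType      ; absolute: low RAM, reached through the data bank
  lda a:TypeTable,x     ; absolute indexed: ROM in that same bank
  sta f:FarBuffer       ; RAM above $7E1FFF needs long (24-bit) addressing here
```

With the data bank set to $7E instead, `FarBuffer` becomes reachable with absolute addressing, but `TypeTable` no longer is.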

Re: How big is your object table? (addressing problems)
by tokumaru on 2016-08-24 (#178137)

tepples wrote:

But now that I think about it, the 2-player mode of Sonic 2 has two camera positions. Does it send a second sprite table during forced blanking in the middle?

I have no idea, I also wonder about that.

Quote:

The Super NES has 128 KiB of RAM, but many addressing modes apply only to data in the current data bank. And if you want to access both RAM and ROM using these addressing modes at the same time, you'll need to put RAM and ROM in the same bank. This means you're stuck with the 8 KiB at $7E0000-$7E1FFF, which are mirrored into banks $00-$3F and $80-$BF.

Are you saying that a typical SNES game has less RAM that it can directly access than an NES game with 8KB of WRAM on the cartridge? SNES memory mapping is really confusing... Why would they use such a convoluted design in a 16-bit console? Are these absurd limitations imposed by the CPU or did Nintendo really screw this up by trying to keep things somewhat similar to how they were on the NES?

Re: How big is your object table? (addressing problems)
by Revenant on 2016-08-24 (#178139)

tokumaru wrote:

Are you saying that a typical SNES game has less RAM that it can directly access than an NES game with 8KB of WRAM on the cartridge?

A SNES game can directly access all 128kb (or more) of RAM at all times. The 8kb of RAM that is mirrored to multiple locations only affects the possible size/speed of some read/write instructions.

Quote:

Are these absurd limitations imposed by the CPU

Yes, sort of.

One reason RAM and ROM are mapped the way they are on the SNES is because the stack has to be in RAM, while the interrupt vector table has to be in ROM, but both the stack and the vector table are always in the 000000-00FFFF range (bank 0).

Quote:

did Nintendo really screw this up by trying to keep things somewhat similar to how they were on the NES?

Also yes. (Because arguably, if they weren't aiming for similarity with the NES then they could have used a different processor entirely.)

Re: How big is your object table? (addressing problems)
by psycopathicteen on 2016-08-24 (#178146)

The only 2 limitations for accessing memory are:

1) The direct page and stack can only be located within bank 0.
2) You can't do long indexing with Y, unless it's indirect.
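A short ca65 sketch of the second point. `FarTable` and `Pointer` are hypothetical names and addresses chosen only for this illustration.

```asm
.p816
.a16
.i16

FarTable = $7F0000      ; hypothetical table in WRAM bank $7F
Pointer  = $10          ; hypothetical 3-byte pointer in direct page

  ; Long indexing works directly with X:
  lda f:FarTable,x      ; absolute long indexed (X only)

  ; With Y, go through a 24-bit pointer stored in direct page:
  lda #.loword(FarTable)
  sta z:Pointer         ; low 16 bits of the address
  sep #$20              ; 8-bit accumulator for the bank byte
.a8
  lda #^FarTable
  sta z:Pointer+2       ; bank byte
  rep #$20              ; back to a 16-bit accumulator
.a16
  lda [Pointer],y       ; direct page indirect long, indexed by Y
```

The pointer setup only has to happen once per table, so inside a loop the Y-indexed form costs little more than the X-indexed one.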

Re: How big is your object table? (addressing problems)
by Stef on 2016-08-24 (#178150)

tokumaru wrote:

tepples wrote:

But now that I think about it, the 2-player mode of Sonic 2 has two camera positions. Does it send a second sprite table during forced blanking in the middle?

I have no idea, I also wonder about that.

Indeed, it completely rewrites the sprite table during forced blanking in the middle of the screen.

Re: How big is your object table? (addressing problems)
by Nicole on 2016-08-24 (#178152)

It is possible to access all of WRAM through the WRAM registers as well ($2180-2183), though that's not really "direct access".
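For reference, a minimal sketch of reading one byte through those registers in ca65 syntax. The register names are the usual ones for $2180-$2183; the target address $7F1234 is an arbitrary example.

```asm
.p816
.a8
.i16

WMDATA = $2180          ; WRAM data port
WMADDL = $2181          ; WRAM address, low byte
WMADDM = $2182          ; WRAM address, middle byte
WMADDH = $2183          ; WRAM address, bit 16 (0 = bank $7E, 1 = bank $7F)

  ; Assumes a data bank in $00-$3F or $80-$BF so that $21xx is visible.
  sep #$20              ; 8-bit accumulator
  lda #$34
  sta a:WMADDL
  lda #$12
  sta a:WMADDM
  lda #$01
  sta a:WMADDH          ; address is now $7F1234
  lda a:WMDATA          ; read the byte; the port address then advances by one
```

Every access goes through the same port, which is why this suits sequential copies far better than random access.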

Re: How big is your object table? (addressing problems)
by Drew Sebastino on 2016-08-24 (#178172)

Nicole wrote:

It is possible to access all of WRAM through the WRAM registers as well ($2180-2183), though that's not really "direct access".

Yeah, that's damn near useless. :lol:

Stef wrote:

Indeed, it completely rewrites the sprite table during forced blanking in the middle of the screen.

That's pretty impressive... 96 objects really does seem excessive for any of the Sonic games though, it's not like it's Contra.

psycopathicteen wrote:

1) The direct page and stack can only be located within bank 0.

I don't care about the stack, but the deal with direct page is a pretty big one. If you're loading from the other 120KB of ram, you're stuck with only using the data in bank $00. I'll probably just DMA some data to ram on startup. Object palettes are an example, because my routine needs access to them and it's not like the palettes are going to take an unrealistic amount of space. Plus, the data is rewritable. Donkey Kong Country has about 64 (Edit: I remembered that wrong, it's more like 128) sprite palettes total. Metal Slug probably has no more than twice that. (It's insane though, it appears Metal Slug does some sort of dynamic palette updating. Why? It never goes past 64 onscreen palettes, ever. :lol:)
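A rough ca65 sketch of that kind of startup copy, using general-purpose DMA channel 0 with the WRAM port at $2180 as the destination. `PaletteSource`, `PaletteDest` and `PaletteSize` are hypothetical names and values, and the code assumes a data bank in which the $21xx and $43xx registers are visible.

```asm
.p816
.a8
.i16

PaletteSource = $018000   ; hypothetical ROM address of the data to copy
PaletteDest   = $7E2000   ; hypothetical destination in WRAM
PaletteSize   = $0200     ; hypothetical size in bytes

  sep #$20                ; 8-bit accumulator
  rep #$10                ; 16-bit index registers
  ldx #.loword(PaletteDest)
  stx $2181               ; WRAM port address, low 16 bits ($2181-$2182)
  lda #((^PaletteDest) & 1)
  sta $2183               ; WRAM port address, bit 16: 0 = $7E, 1 = $7F
  stz $4300               ; channel 0: CPU bus to B-bus, one register, increment source
  lda #$80
  sta $4301               ; B-bus target is $2180 (WRAM data port)
  ldx #.loword(PaletteSource)
  stx $4302               ; source address, low 16 bits
  lda #^PaletteSource
  sta $4304               ; source bank
  ldx #PaletteSize
  stx $4305               ; byte count
  lda #$01
  sta $420B               ; start the transfer on channel 0
```

Once the copy is done, the table can be read with the data bank set to $7E, alongside everything else in that bank.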

Revenant wrote:

if they weren't aiming for similarity with the NES then they could have used a different processor entirely.

It seemed that this is the only reason the 65816 was ever used. I don't know anything else that used it except the Apple IIGS, and it used it there for Apple II compatibility.

I don't think the SNES's problem is the processor itself, but rather, the ram that's not even as fast as the CPU, which isn't exactly known for its horsepower (although I think its lack of power is exaggerated, but that's another story). The memory mapping is silly (like I said, you can't load from any ram past the first 8KB and rom at the same time, it's like, why did they even put the other 120KB of ram in there to begin with?) and the communication with the SPC700 is really poor in that it takes too much CPU time to upload a reasonable amount of data. I know I'm going far overboard (my animation scheme and my palette changing stuff) but I find that most of the CPU time is spent making up for the PPU's shortcomings. I know that's ridiculous to say, but I can honestly see just about any arcade game from the time period running at the speed they do with the 65816, but there are so many background layers and animation and palettes and sprites that unless you're doing all this crap to compensate with the CPU (like sprite multiplexing, dynamic animation, etc.) which would potentially slow down the CPU too much, the PPU won't cut it.

tokumaru wrote:

but from the stuff I read here it looks like you have a lot of complexity related to sprites and patterns.

Basically the only thing I've been doing in terms of programming has been object or dma to vram related. 32 bytes was kind of preposterous though. I'm at something like 26 bytes from all my routines, but I haven't even implemented any AI or physics stuff yet. I'll probably need at least 48 bytes, which with 96 objects (probably what I'd settle with. Maybe higher later, but 128 seems a little ridiculous) would be 4.5KB. However, say I want 64 bytes and 128 objects, I'm using the entire 8KB.
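The size budget can be written down as assemble-time constants so the assembler complains when the table outgrows low RAM. This is only a sketch: `ReservedLowRam` is an assumed figure for direct page plus stack, and the constant names are illustrative.

```asm
ObjectSlotSize  = 48                            ; bytes per object (the estimate above)
ObjectCount     = 96                            ; number of slots
ObjectTableSize = ObjectSlotSize * ObjectCount  ; 4608 bytes = 4.5 KB

LowRamSize      = $2000                         ; 8 KB mirrored into banks $00-$3F/$80-$BF
ReservedLowRam  = $0200                         ; assumed: direct page plus stack

.assert ObjectTableSize <= LowRamSize - ReservedLowRam, error, "object table does not fit in low RAM"
```

With 64 bytes x 128 objects the product is 8192 bytes, so the assertion fails as soon as anything else has to live in the first 8KB.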

I think I figured out my battle plan though. Like I said earlier, whatever variables are only used by routines that don't need any information outside of the first bank will go past the first 8KB of ram. Everything else will be in the first 8KB. Luckily for me, a lot of variables aren't (or aren't expected to be) touched by the actual object code. For example, I have X and Y position variables, but they are just for the total level. Then, I have onscreen X and Y position variables that a routine generates, and then these are used by the metasprite routine and other stuff. Non-onscreen X and Y are really only for object code, while onscreen is for everything else. What's nice (I want to make my engine as all-purpose as possible, just swap out a few routines for different types of games) is that I can have a simple "subtract camera x and y from x and y" or I can do something more complicated that uses multiplication and whatnot for a mode 7 racer.

Edit: (It's not worth double posting for this) Is "bmi" not the same thing as "and #$8000, bne"? Because it's not giving me the same result. I've been trying to speed up my metasprite routine, and I found that this was something minor I could do to help.

Also, what is the difference between "and" and "bit"? They both seem to do the exact same thing, except that "bit" affects an additional "V" flag, which is the "overflow" flag. I didn't know what that meant, so I looked it up, and I still don't get it.

Re: How big is your object table? (addressing problems)
by Drew Sebastino on 2016-08-25 (#178264)

(I felt that this was actually worth another post, obviously, because I did just that:)

Damn it, you know what? I realized that when using direct page, the address can be no more than 1 byte. The reason this is a problem is that I recently split all my tables apart into multiple tables because I was led to believe there was no advantage to doing it the other way. Turns out, I was wrong. :lol:

So, great, upon formatting a table, I need to keep in mind all of the routines that are using it, and with an object table, that's a hell of a lot. I'm considering reverting my object table back to how it was in that everything is together like how it was. I'll just have to suck it up and fit the object table into the first 8KB while still having room to spare. I'll probably do some sort of weird number like 56 bytes x 112 objects for 6272 bytes, a little over 6KB. What's nice though, is I realized that direct page is always faster than x or y indexed, so in terms of processing power, I have about nothing to lose unless I'm trying to load a random variable in the first 8KB of ram, because I'll have to use absolute addressing instead of direct page, but in a routine that deals with the object table, I'm more likely going to load from it than the random variables.

You know what's funny? My old routines were actually closer to where I want to go than I am right now, in that I had the object table being indexed by direct page... :lol:

Yeah, somebody should have written a warning about this. It'd go like...

Quote:

If using direct page, have an array of structs.

If using x and y, have a struct of arrays.
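A small ca65 sketch of the two layouts, matching the reasoning in the post above: with everything for one object kept together, D can point at the slot and each field is a 1-byte direct page offset; with one table per field, the table base is a full address and X selects the object. The struct layout, `ObjectTable` and `ObjectXTable` are hypothetical and only serve the illustration.

```asm
.p816
.a16
.i16

.struct ObjectSlot              ; hypothetical layout
  MetaspriteOffset  .word
  OnscreenXPosition .word
  OnscreenYPosition .word
  Attributes        .word
.endstruct

ObjectTable  = $0200            ; hypothetical: array of structs in low RAM
ObjectXTable = $0800            ; hypothetical: one array per field

  ; Array of structs + direct page: D points at one slot.
  lda #ObjectTable + 2 * .sizeof(ObjectSlot)   ; address of slot 2
  tcd
  lda z:ObjectSlot::OnscreenXPosition           ; field of slot 2 via direct page

  ; Struct of arrays + X: X picks the object within each field table.
  ldx #2 * 2                                    ; object 2, two bytes per entry
  lda a:ObjectXTable,x
```

Note that once D has been moved, ordinary direct page variables are no longer at their usual offsets until D is restored.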

Re: How big is your object table? (addressing problems)
by Drew Sebastino on 2016-08-27 (#178414)

Triple post!

I found out that if I have 64 two-sprite objects, the metasprite routine uses about 7/8 of the CPU time. :shock:

(Well, not quite that; the routine that makes the top 32 bytes of oam takes a decent chunk too.) Although there are some random optimizations I can make here and there, the only real way around it is to have a separate metasprite for whether an object is flipped or not. As I've said before, I'm not too worried about space. If I do this though, I think I'll be able to get the CPU time down to 3/4! It's not too hopeless though, as I'm not even using FastROM yet, which might get it down to 5/8. I don't see myself doing that palette thing though, unfortunately.

Well, I could also just make everything an unrolled loop... :lol:

Re: How big is your object table? (addressing problems)
by psycopathicteen on 2016-08-27 (#178419)

Can you post some code?

Re: How big is your object table? (addressing problems)
by Drew Sebastino on 2016-08-28 (#178492)

Sure. However, I actually already deleted the flipping part of the code. It wasn't very optimized, but at the same time, I think the speed increase would be marginal.

Here's the metasprite code though. I got rid of the single metasprite code because there's no need now. However, I might bring it back, because it'll be marginally faster in that I shouldn't need to add the x or y position from the metasprite, I'd just use the object's x and y positions, so that'll save me two sets of clc and adc! :lol:

I guess the other thing would be that I don't need to check if we're at the end of the table again, but that's even less impressive in terms of saving cpu time. I don't care though, when you're going through a routine (at most) 128 times, any cycle saved helps.

Oh yeah, "big_metasprite" is the exact same thing except that the values for checking if it's out of bounds are different. I figured I'd waste a few cycles (and go against what I said earlier :lol:) to combat overdraw. The comments are pretty much useless, but I think they're accurate. The comments on the metasprite are, I know.

Code:

.proc metasprite_handler
  rep #$30   ;A=16, X/Y=16
  lda #ObjectTable
  tcd
  ldy #$0000
  bra continue_metasprite_finder

metasprite_finder:
  tdc
  clc
  adc #ObjectSlotSize
  tcd
  cmp #ObjectTable+ObjectTableSize
  bne continue_metasprite_finder
  sty a:SpriteCount
  rts

continue_metasprite_finder:
  ldx ObjectSlot::MetaspriteOffset
  beq metasprite_finder
  lda a:$0000,x
  sta a:MetaspriteCount

metasprite_loop:
  lda a:$0006,x
  and #$0001
  sta a:SpriteBuf3+2,y      ;sprite size
  bne big_sprite
  lda a:$0002,x
  clc
  adc ObjectSlot::OnscreenXPosition
  cmp #256
  bcc sprite_x_not_out_of_bounds
  cmp #65528
  bcs sprite_x_not_out_of_bounds
  txa
  clc
  adc #$0008
  tax
  dec a:MetaspriteCount      ;decrement MetaspriteCount by 1
  beq metasprite_finder_branch   ;last entry was offscreen: go on to the next object
  brl metasprite_loop      ;back to the loop...

metasprite_finder_branch:
  bra metasprite_finder

sprite_x_not_out_of_bounds:
  and #$01FF
  sta a:SpriteBuf1,y      ;Store sprite X position in SpriteBuf1+y
  sta a:SpriteBuf3,y      ;Store the same X position in SpriteBuf3+y
  lda a:$0004,x         ;Y offset of this metasprite entry (16-bit word)
  clc
  adc ObjectSlot::OnscreenYPosition
  cmp #224
  bcc sprite_y_not_out_of_bounds
  cmp #65528
  bcs sprite_y_not_out_of_bounds
  txa
  clc
  adc #$0008
  tax
  dec a:MetaspriteCount      ;decrement MetaspriteCount by 1
  beq metasprite_finder_branch   ;last entry was offscreen: go on to the next object
  brl metasprite_loop      ;back to the loop...

sprite_y_not_out_of_bounds:
  sta a:SpriteBuf1+1,y
  lda a:$0006,x
  sta a:SpriteBuf3+2,y      ;sprite size
  lda a:$0008,x
  ora ObjectSlot::Attributes
  sta a:SpriteBuf1+2,y      ;extra/character
  iny
  iny
  iny
  iny
  cpy #$0200         ;sees if all 128 sprites are used up
  bne continue_sprite_y_not_out_of_bounds
  sty a:SpriteCount
  rts

continue_sprite_y_not_out_of_bounds:
  dec a:MetaspriteCount      ;decrement MetaspriteCount by 1
  beq metasprite_finder_branch
  txa
  clc
  adc #$0008
  tax
  brl metasprite_loop      ;back to the loop...

Code:

TestMetasprite:
  .word $0002   ; Number of metasprite table entries below
        ;XPos YPos NextTile/Size Extra/Character
  .word $0000,$0000,$0000,$0001
  .word $0000,$0008,$0101,$4000

Yeah though, I actually got it to only cover "only" half the screen now. Much better though. The rest of my stuff takes up about a fourth of that, and I know that can be optimized more than this can. Additionally, like I said, I'm not using FastROM either. I think I've learned though that anything that doesn't have to be done at runtime (like flipping metasprites), I'm not doing it.

I know I keep going on and on, but I'm confused, if you're doing a "rep" or "sep", if the accumulator is 16 bit, will that add 2 extra cycles? I've been trying to be a bit smarter in terms of the size of the accumulator, and x and y. Unfortunately, I really can't do anything with x and y in the above routine. It's a pain in the ass that x and y can't be different sizes, because you could easily make two (really four now :roll:) different routines that deal with the different 256 byte halves of oam. There's no feasible way to make x 8 bit here. (I suppose you could have a different routine for every 256 bytes... Yeah, put the metasprite data at the beginning of every bank... :lol:) It's also a pain in the ass that direct page can't escape bank 0, because it would be perfect for indexing metasprites as each slot is only a handful of bytes. With the object table, anything you use is just as effective. The reason I'm using direct page on the object table is because it's the fastest, and most of my object routines are probably going to deal with data outside of the first 8KB of ram or the data outside of bank $00.

Oh yeah, one final thing, an obvious optimization I saw with your hioam filling code is that you can use direct page instead of x or y to save a cycle for every "ora". X and y can then just be 8 bit, because you're only indexing 32 bytes instead of 512.

Re: How big is your object table? (addressing problems)
by psycopathicteen on 2016-08-28 (#178494)

You can use a separate routine for big and small sprites, and forget about storing the size bits separately from the x coordinates.

Code:

big_sprite_x_not_out_of_bounds:
  and #$01FF
  sta a:SpriteBuf1,y      ;Store sprite X position SpriteBuf1+y
  ora #$0200               ;Set size bit
  sta a:SpriteBuf3,y      ;Store X position plus size bit in SpriteBuf3+y

Re: How big is your object table? (addressing problems)
by Drew Sebastino on 2016-08-28 (#178500)

I actually do have a separate routine for small and large sprites, but I didn't think about that. Good call!

By doing that and some other stuff I did, I got it down by about 8 scanlines I think. I can think of a couple more fallback things, but they increase the code size tenfold for a measly amount of cycles.

Do you have any clue how to increase the rom speed to FastROM? I don't. I'm just saying, if Rendering Ranger R2 is SlowROM, (and probably even if it weren't) it must be hardcoded to hell. I really don't know how much CPU time games generally use toward creating metasprites. I know that no game on the SNES (maybe aside from Super R-Type on level 2 when it's freaking out) even uses that many sprites though.

Re: How big is your object table? (addressing problems)
by 93143 on 2016-08-29 (#178595)

Using FastROM in the general case involves understanding memory mapping, because you have to be accessing ROM in banks $80-$FF for it to work. FastROM is activated by writing a 1 to $420D, but this only affects the upper half of the memory map.

The method I use - which is fairly standard - is to insert a long jump near the beginning of my code, to jump to $800000 plus the absolute address of the instruction following the jump. That is, label the next instruction, use an assembler expression to add $800000 (or $7F0000 in the glitchy old version of WLA DX I've been using) to the value of the label, and JML to that. This works because in both the HiROM and LoROM standard mappings, bank $00 is the same as bank $80, so you end up executing the same code as if you hadn't jumped. Then just write 1 to $420D, if you haven't already, and you're off to the races.

Oh, and change the data bank to somewhere in $80-$FF too (unless you're using WRAM, or a special feature of the cartridge that isn't the same between halves of the memory map). That way data accesses in ROM will be fast.

Code:

jml $7F0000+high_speed   ; $800000 doesn't work for some reason, but this does - assembler bug? (ancient version of WLA DX)
high_speed:
   lda #$01                ; assumes an 8-bit accumulator, so only $420D is written
   sta $420D
   phk
   plb

Re: How big is your object table? (addressing problems)
by Revenant on 2016-08-29 (#178596)

This should also work for WLA:

Code:

.base $80
    jml +
+   lda #$01
    ...

Re: How big is your object table? (addressing problems)
by 93143 on 2016-08-29 (#178601)

Yeah, should. The version from 2003 that I got with Neviksti's SNES Starter Kit seems to have a buggy base directive... I seem to recall it being case-specific, causing me to have to experiment with the no$sns debugger to get it to work (or was that the method I showed above? In any case Neviksti's .inc template had a comment to the effect that base wasn't working properly)...

I'm sure the latest version is much better, but last time I tried to use it, it not only failed to work at all but somehow managed to make the old version not work either, so I'm not in a hurry to try again... my SNES development environment needs an overhaul if I'm going to do more than noodle around, but it will have to wait because I'm busy with my actual job...

Thanks for mentioning it, though. It's probably the way you're supposed to do it... but now that I think of it, IIRC Espozo's using ca65, and I haven't gotten very far with that yet; there's probably some voodoo that you need to do...

Re: How big is your object table? (addressing problems)
by Drew Sebastino on 2016-08-30 (#178643)

93143 wrote:

IIRC Espozo's using ca65

Yes.

I think I can actually solve the problem right here though, maybe...

Code:

# ca65 linker config for 256 KiB (2 Mbit) sfc file

# Physical areas of memory
MEMORY {
  ZEROPAGE:   start =  $000000, size =  $0100;   # $0000-00ff -- zero page
                                                 # $0100-01ff -- stack
  BSS:        start =  $000200, size =  $1e00;   # $0200-1fff -- RAM
  BSS7E:      start =  $7e2000, size =  $e000;   # SNES work RAM, $7e2000-7effff
  BSS7F:      start =  $7f0000, size = $10000;   # SNES work RAM, $7f0000-$7fffff
  ROM0:       start =  $008000, size =  $8000, fill = yes;
  ROM1:       start =  $018000, size =  $8000, fill = yes;
  ROM2:       start =  $028000, size =  $8000, fill = yes;
  ROM3:       start =  $038000, size =  $8000, fill = yes;
  ROM4:       start =  $048000, size =  $8000, fill = yes;
  ROM5:       start =  $058000, size =  $8000, fill = yes;
  ROM6:       start =  $068000, size =  $8000, fill = yes;
  ROM7:       start =  $078000, size =  $8000, fill = yes;
}

# Logical areas code/data can be put into.
SEGMENTS {
  CODE:       load = ROM0, align =  $100;
  RODATA:     load = ROM0, align =  $100;
  SNESHEADER: load = ROM0, start = $ffc0;
  CODE1:      load = ROM1, align =  $100, optional = yes;
  RODATA1:    load = ROM1, align =  $100, optional = yes;
  CODE2:      load = ROM2, align =  $100, optional = yes;
  RODATA2:    load = ROM2, align =  $100, optional = yes;
  CODE3:      load = ROM3, align =  $100, optional = yes;
  RODATA3:    load = ROM3, align =  $100, optional = yes;
  CODE4:      load = ROM4, align =  $100, optional = yes;
  RODATA4:    load = ROM4, align =  $100, optional = yes;
  CODE5:      load = ROM5, align =  $100, optional = yes;
  RODATA5:    load = ROM5, align =  $100, optional = yes;
  CODE6:      load = ROM6, align =  $100, optional = yes;
  RODATA6:    load = ROM6, align =  $100, optional = yes;
  CODE7:      load = ROM7, align =  $100, optional = yes;
  RODATA7:    load = ROM7, align =  $100, optional = yes;

  ZEROPAGE:   load = ZEROPAGE, type = zp;
  BSS:        load = BSS,   type = bss, align = $100, optional = yes;
  BSS7E:      load = BSS7E, type = bss, align = $100, optional = yes;
  BSS7F:      load = BSS7F, type = bss, align = $100, optional = yes;
}

I'm guessing all I need to do is change "ROM0-7" to bank $80. I'll try it soon. I swear though, if I can't get this down to 1/3 of the screen, I'm (seriously) hardcoding metasprites into each object's code, although I know I'll end up regretting it. The problem is that collision detection probably takes up way more time than creating metasprites, because even if each collision check would take about half the time metasprites take me, if we're doing something like 32x32, that's a whopping 1024 checks. I really hope I don't have to hardcode everything. :lol:

The one thing I really like about having each object have its metasprite hardcoded in is that for something like a bullet or explosion, you're really not doing much of anything to create the metasprite. I also thought about how if you have a multi sprite object, you could just fill out the different x and y positions of each sprite back to back. However, with heavily animated objects that change size often, this setup completely falls apart and pretty much becomes impossible. I honestly have no clue what I'll do.

Edit: Wait a minute, where does the SNES start to read code on power up? $000000? Actually, wait, it couldn't be because that's in ram.

Re: How big is your object table? (addressing problems)
by 93143 on 2016-08-30 (#178654)

It starts at the address defined by the RESET vector, which is given at $FFFC in bank $00. The RESET vector is 16-bit, so the SNES can't start anywhere other than bank $00.
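For reference, a sketch of how the vector area could be laid out in ca65. The `VECTORS` segment name and the handler labels are assumptions (the linker config posted above has no such segment; it would have to be placed at $FFE4 in the first ROM bank), so treat this as an illustration of the layout rather than a drop-in file.

```asm
.p816

.segment "VECTORS"              ; hypothetical segment placed at $FFE4
  ; native-mode vectors ($FFE4-$FFEF)
  .word .loword(UnusedHandler)  ; $FFE4 COP
  .word .loword(UnusedHandler)  ; $FFE6 BRK
  .word .loword(UnusedHandler)  ; $FFE8 ABORT
  .word .loword(NmiHandler)     ; $FFEA NMI
  .word $0000                   ; $FFEC (unused)
  .word .loword(UnusedHandler)  ; $FFEE IRQ
  .res 4                        ; $FFF0-$FFF3 (unused)
  ; emulation-mode vectors ($FFF4-$FFFF)
  .word .loword(UnusedHandler)  ; $FFF4 COP
  .word $0000                   ; $FFF6 (unused)
  .word .loword(UnusedHandler)  ; $FFF8 ABORT
  .word .loword(UnusedHandler)  ; $FFFA NMI
  .word .loword(Main)           ; $FFFC RESET: execution starts here, in bank $00
  .word .loword(UnusedHandler)  ; $FFFE IRQ/BRK

.segment "CODE"
UnusedHandler:
  rti
NmiHandler:
  rti
Main:
  sei
  clc
  xce                           ; leave emulation mode
forever:
  bra forever
```

Because each entry is only 16 bits wide, every handler has to begin in bank $00 (or its mirror); a handler that wants to run elsewhere starts with a long jump.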

Re: How big is your object table? (addressing problems)
by koitsu on 2016-08-30 (#178659)

Amusingly, I went over all this months ago, including on a live twitch stream. RIP.

Re: How big is your object table? (addressing problems)
by Drew Sebastino on 2016-08-31 (#178671)

Amusingly, I forget things.

Anyway, I'll try this when I get home.

Re: How big is your object table? (addressing problems)
by psycopathicteen on 2016-08-31 (#178674)

Quote:

if we're doing something like 32x32, that's a whopping 1024 checks

Do you mean 32 objects colliding with 32 objects, or 32x32 sprites colliding with 32x32 sprites?

Re: How big is your object table? (addressing problems)
by tepples on 2016-08-31 (#178677)

Standard ways to make collision among 32 objects manageable:

- Partition the objects into sets by type, where only certain type pairs will produce a collision. For example, player bullets won't collide with each other, nor will enemy bullets. And in many games, enemies don't collide with each other.
- Sort all objects by their X or Y center point, and reject those outside a certain radius.
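The rejection step of the second technique can be sketched as a one-axis distance test in ca65. `ObjectAX`, `ObjectBX` and `RejectRadius` are hypothetical names and values, and the code assumes positions small enough that their difference fits in a signed 16-bit value.

```asm
.p816
.a16
.i16

ObjectAX     = $0300    ; hypothetical: X center of object A
ObjectBX     = $0302    ; hypothetical: X center of object B
RejectRadius = 16       ; hypothetical: sum of the two half-widths

.proc CheckXOverlap
  lda a:ObjectAX
  sec
  sbc a:ObjectBX        ; A = AX - BX
  bpl positive
  eor #$FFFF            ; negative: negate to get the absolute distance
  inc a
positive:
  cmp #RejectRadius
  bcs too_far           ; |AX - BX| >= radius: skip the full check
  sec                   ; carry set = still a candidate, go on to compare Y
  rts
too_far:
  clc                   ; carry clear = rejected
  rts
.endproc
```

When the list is sorted on that axis, the first rejected neighbor also means every later object on that side can be skipped, which is where most of the savings come from.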

Re: How big is your object table? (addressing problems)
by Drew Sebastino on 2016-08-31 (#178719)

Wait a minute...

93143 wrote:

It starts at the address defined by the RESET vector, which is given at $FFFC in bank $00. The RESET vector is 16-bit, so the SNES can't start anywhere other than bank $00.

How do you have the cartridge not be very large while simultaneously having data in both bank $00 and bank $80? I had thought of this upon editing that code I posted.

psycopathicteen wrote:

Do you mean 32 objects colliding with 32 objects?

This.

Actually, let me break down collision. If the theoretical game is a two player shooter with a bunch of crap going on, you could probably break down total number of objects into 40 player bullets, 48 enemy bullets, 16 enemies, and the rest miscellaneous (explosions, shrapnel, actually the player ships). Because both enemies and bullets are harmful to the player, add them both up for 64, x 2 = 128, then add 40 x 16 for 768 total checks. Better, but still not at all good. If the SNES took that big of a hit from just creating metasprites, I'm screwed. However, I always did plan to hardcode collision detection, because it's easy to do so and collision can be more flexible than a standard routine could offer, like changing the velocity of an object after being hit from a certain direction.
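The arithmetic above, written as ca65 assemble-time constants so it can be rechecked when the counts change. The constant names are illustrative only.

```asm
PlayerCount   = 2
PlayerBullets = 40
EnemyBullets  = 48
Enemies       = 16

PlayerHazardChecks = (EnemyBullets + Enemies) * PlayerCount   ; 64 * 2  = 128
BulletEnemyChecks  = PlayerBullets * Enemies                  ; 40 * 16 = 640
TotalChecks        = PlayerHazardChecks + BulletEnemyChecks   ; 768

.assert TotalChecks = 768, error, "collision budget changed"
```

The bullet-versus-enemy term dominates, so halving either the player bullet count or the enemy count does far more than anything done to the player checks.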

Re: How big is your object table? (addressing problems)
by Revenant on 2016-08-31 (#178720)

Espozo wrote:

How do you have the cartridge not be very large while simultaneously having data in both bank $00 and bank $80? I had thought of this upon editing that code I posted.

Banks $00 and $80 are physically identical on (nearly) all cartridges. That is, you can put your vectors at $80FFE4+ and they will be mirrored to $00FFE4. The only difference is that bank $80 is accessed faster when FastROM is enabled.

Re: How big is your object table? (addressing problems)
by psycopathicteen on 2016-09-01 (#178755)

Quote:

Actually, let me break down collision. If the theoretical game is a two player shooter with a bunch of crap going on, you could probably break down total number of objects into 40 player bullets, 48 enemy bullets, 16 enemies, and the rest miscellaneous (explosions, shrapnel, actually the player ships). Because both enemies and bullets are harmful to the player, add them both up for 64, x 2 = 128, then add 40 x 16 for 768 total checks. Better, but still not at all good. If the SNES took that big of a hit from just creating metasprites, I'm screwed. However, I always did plan to hardcode collision detection, because it's easy to do so and collision can be more flexible than a standard routine could offer, like changing the velocity of an object after being hit from a certain direction.

Are you talking about shmups or run'n'guns? I think run'n'guns typically have less bullets per player, while shmups typically have more bullets per player but don't usually support multiplayer. I think 20 bullets for a one player shmup is what most shmups do.

Re: How big is your object table? (addressing problems)
by Drew Sebastino on 2016-09-01 (#178759)

psycopathicteen wrote:

Are you talking about shmups or run'n'guns?

shmups. I thought I said that, but maybe I didn't. A run and gun would be next to impossible to make with one person because of all the animation and general complexity involved. Shmups are generally a lot less varied, or at least they can get away with it better. Hell, most don't even have BG collision. I'm actually trying to make something else that shouldn't be so CPU intensive, but I'm just curious. Really, a shmup could probably be a lot more hardcoded. Animation is next to none, objects are typically only comprised of a couple of sprites, there's often no BG collision, and objects are generally fairly dumb. What counteracts this is that there's a lot of stuff.

Revenant wrote:

Cool. I actually have a dumb English paper I have to write, but after that (probably not today) I'll see how much of an improvement FastROM is. I probably won't hardcode metasprites for what I'm trying to do now (doesn't involve even close to 128 objects) but I might at some other point. I thought about how me trying to hardcode in metasprites would ultimately be me either creating new code for every frame (impractical) or basically me just copying over the code from my metasprite routine (which would probably even be slower).

Re: How big is your object table? (addressing problems)
by Drew Sebastino on 2016-09-03 (#178871)

I must have done something wrong, because it's not any faster at all. Here's what I did to try and make it work:

Code:

.proc Main

  sei      ;Disable interrupts
  clc      ;Clear carry, used by XCE
  xce      ;Set 65816 native mode
  jml $800000+MainFastROM   ;Needed to set K (bank of PC) properly if MODE 21 is ever used;
                     ;see official SNES developers docs, "Programming Cautions"
.endproc      

;=========================================================================================
;=========================================================================================

.proc MainFastROM
  lda #$01
  sta a:$420D
  phk      ;Push K (PC bank)
  plb      ;Pull B (data bank, i.e. data bank now equals PC bank)

Code:

ROMSPEED_120NS = $10
  .byte MAPPER_LOROM|ROMSPEED_120NS

I thought I was going crazy and didn't implement it, but somehow, this doesn't make it crash:

Code:

  phk
  pla
  cmp #$80
  beq infinite_loop
  stp

Because everything gets mirrored, I didn't bother messing with the file I posted earlier that defined where the banks were.

Re: How big is your object table? (addressing problems)
by Nicole on 2016-09-04 (#178877)

A few problems here:

If your segment is already set to be bank $80, you don't need to write jml $800000+MainFastROM, just jml MainFastROM is sufficient, because ca65 will assemble that as a long jump to bank $80.

Come to think of it, it's even possible that you adding $800000 is putting you in bank $00, because if MainFastROM is already $801234, adding $800000 would make it overflow to $001234. Not sure if that's exactly how ca65 behaves, but it's possible, in which case you'll get no speedup at all.
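
A rough illustration of that overflow, in Python rather than ca65. It assumes the sum is simply truncated to 24 bits. Whether ca65 really truncates or reports a range error instead is not confirmed here, and the helper names are hypothetical.

```python
ADDRESS_MASK = 0xFFFFFF  # 65816 addresses are 24 bits wide


def add_fast_offset(label_address):
    """Add $800000 and keep only the low 24 bits (assumed truncation)."""
    return (label_address + 0x800000) & ADDRESS_MASK


def bank_of(address):
    """Return the bank byte (bits 16-23) of a 24-bit address."""
    return (address >> 16) & 0xFF


# Label linked in bank $00: the addition lands in bank $80 as intended.
print(hex(bank_of(add_fast_offset(0x008000))))  # 0x80

# Label already linked in bank $80: the addition wraps back to bank $00.
print(hex(bank_of(add_fast_offset(0x801234))))  # 0x0
```

So the manual offset only helps while the segment is linked in bank $00. Once the segment itself lives in bank $80, it is unnecessary.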

The other problem is that you're trying to set $420D to #$01 way too early.

For starters, you haven't set the processor flags. I don't know if A is guaranteed to be 8-bit after switching to native mode after startup, but if not, you risk crashing almost immediately, because if A is 16-bit, the CPU will attempt to read an extra byte for lda, causing it and every instruction after it to be completely misinterpreted.

Secondly, you need to set $420D after setting the data bank, not before. If the data bank is uninitialized, it could potentially be one of $40-$7f, or $c0-$ff, where the system area doesn't exist. That would mean $420D doesn't get set at all, because it would try to write somewhere like $50420D.
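
To make the data bank point concrete, here is a small Python sketch of where an absolute-mode store ends up for whatever value the data bank register holds. The function names are invented for this example. The bank ranges are the ones mentioned above ($00-$3f and $80-$bf have the system area, $40-$7f and $c0-$ff don't).

```python
def absolute_target(data_bank, address16):
    """Effective 24-bit address of an absolute-mode access: bank from B, low 16 bits from the operand."""
    return ((data_bank & 0xFF) << 16) | (address16 & 0xFFFF)


def bank_has_system_area(bank):
    """True for banks $00-$3F and $80-$BF, where the I/O registers are mapped."""
    return (bank & 0x7F) <= 0x3F


for data_bank in (0x00, 0x50, 0x80, 0xC0):
    target = absolute_target(data_bank, 0x420D)
    reaches = bank_has_system_area(data_bank)
    print(f"B=${data_bank:02X}: write goes to ${target:06X}, reaches MEMSEL: {reaches}")
```

With B = $50 the store goes to $50420D, matching the example address above, and never touches the register.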

Re: How big is your object table? (addressing problems)
by tepples on 2016-09-04 (#178894)

If you write to $80420D or f:$420D, it'll use absolute long mode, which doesn't depend on the current data bank.

Also make sure the init code isn't writing to $420D later on.

Re: How big is your object table? (addressing problems)
by Drew Sebastino on 2016-09-04 (#178898)

tepples wrote:

Also make sure the init code isn't writing to $420D later on.

Apparently, it was being zeroed later... :lol:

It works fine now, so thanks everybody.

Nicole wrote:

If your segment is already set to be bank $80

No. It's all in bank $00.

So now, I'm using 3/8 the screen instead of 1/2. It's better at least. :lol:

Re: How big is your object table? (addressing problems)
by Nicole on 2016-09-04 (#178917)

Espozo wrote:

No. It's all in bank $00.

That's a problem. You want the segment to be in bank $80, otherwise if you need to long jump to a label in that segment later, you'll end up back in bank $00. You'd end up being forced to add $800000 manually every single time for no good reason to avoid ending up back in slow ROM. Just make it bank $80, and do jml MainFastROM instead of jml $800000+MainFastROM.

Re: How big is your object table? (addressing problems)
by Drew Sebastino on 2016-09-04 (#178920)

Nicole wrote:

Just make it bank $80

I don't know how. I tried editing the earlier file I posted to where "ROM0" was at $808000, but it freaked out.

Re: How big is your object table? (addressing problems)
by Nicole on 2016-09-04 (#178922)

"Freaked out" in what way?

Re: How big is your object table? (addressing problems)
by Drew Sebastino on 2016-09-04 (#178924)

It didn't assemble. It said that it was "too low", if I remember correctly.

Re: How big is your object table? (addressing problems)
by Nicole on 2016-09-04 (#178936)

...You're gonna have to be more helpful than that, but as a guess, this:

Code:

SNESHEADER: load = ROM0, start = $ffc0;

might need to be this:

Code:

SNESHEADER: load = ROM0, start = $80ffc0;

Re: How big is your object table? (addressing problems)
by Drew Sebastino on 2016-09-07 (#179154)

Yeah, that was the problem. I haven't been able to work on this in a while, so I hadn't been able to check what the deal was. I never thought about it, but it kind of sucks that I have to write ".LOWORD" for everything now.

Re: How big is your object table? (addressing problems)
by TOUKO on 2016-09-08 (#179161)

Was 120 ns the ROM speed needed for the SNES's FastROM? :shock:

Re: How big is your object table? (addressing problems)
by 93143 on 2016-09-08 (#179163)

Yes. 200 ns for SlowROM, 120 ns for FastROM. As I understand it, the 5A22 used the classic 65xx half-cycle strobe with 6 master clocks per internal cycle and on-die wait state behaviour, so for a slow cycle (8 master clocks), phi1 would take an internal half-cycle (3 master clocks, or ~140 ns) and phi2 would take the rest of the cycle (5 master clocks, or ~230 ns). Fast cycles (6 master clocks) left just 3 master clocks for phi2 rather than 5, hence the dramatic difference in the speed requirement.
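
The numbers above can be checked with a short Python calculation. The NTSC master clock value of about 21.477 MHz is an assumption added here (it is not stated in the post), and the function names are just for illustration.

```python
MASTER_CLOCK_HZ = 21_477_272  # NTSC master clock, assumed value


def master_clocks_to_ns(clocks):
    """Duration of a number of master clocks, in nanoseconds."""
    return clocks / MASTER_CLOCK_HZ * 1e9


def cpu_rate_mhz(clocks_per_cycle):
    """Effective CPU rate in MHz if every cycle takes this many master clocks."""
    return MASTER_CLOCK_HZ / clocks_per_cycle / 1e6


print(f"3 master clocks: {master_clocks_to_ns(3):.0f} ns")  # phi1, or phi2 of a fast cycle
print(f"5 master clocks: {master_clocks_to_ns(5):.0f} ns")  # phi2 of a slow cycle
print(f"6-clock cycles: {cpu_rate_mhz(6):.2f} MHz")
print(f"8-clock cycles: {cpu_rate_mhz(8):.2f} MHz")
```

Under that assumption the ROM has roughly 140 ns of phi2 to respond in a fast cycle versus roughly 233 ns in a slow one, which lines up with the 120 ns and 200 ns part ratings once some margin is allowed.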

Re: How big is your object table? (addressing problems)
by Drew Sebastino on 2016-09-08 (#179172)

I don't know how, but I'm still having the SNES crash, and I found out that it's happening on "jml [LongJumpLocation]", even though it worked perfectly before.

Code:

.proc object_identifier
  rep #$20
  lda .LOWORD(ObjectTable)
  sta a:ObjectTableOffset

object_identifier_loop:
  tcd
  lda ObjectSlot::Identity
  beq next_object
  sta a:LongJumpLocation
  lda ObjectSlot::Identity+1
  sta a:LongJumpLocation+1
  jml [LongJumpLocation]
next_object:
  lda a:ObjectTableOffset         ;says how many objects have been identified
  clc
  adc #ObjectSlotSize
  sta a:ObjectTableOffset      ;store the result for the next time we go through the loop
  cmp .LOWORD(ObjectTable)+ObjectTableSize
  bne object_identifier_loop      ;if so, quit searching
  rts
.endproc

Just in case I did something wrong in creating the memory map, here it is:

Code:

# ca65 linker config for 256 KiB (2 Mbit) sfc file

# Physical areas of memory
MEMORY {
  ZEROPAGE:   start =  $000000, size =  $0100;   # $0000-00ff -- zero page
                                                 # $0100-01ff -- stack
  BSS:        start =  $000200, size =  $1e00;   # $0200-1fff -- RAM
  BSS7E:      start =  $7e2000, size =  $e000;   # SNES work RAM, $7e2000-7effff
  BSS7F:      start =  $7f0000, size = $10000;   # SNES work RAM, $7f0000-$7fffff
  ROM0:       start =  $808000, size =  $8000, fill = yes;
  ROM1:       start =  $818000, size =  $8000, fill = yes;
  ROM2:       start =  $828000, size =  $8000, fill = yes;
  ROM3:       start =  $838000, size =  $8000, fill = yes;
  ROM4:       start =  $848000, size =  $8000, fill = yes;
  ROM5:       start =  $858000, size =  $8000, fill = yes;
  ROM6:       start =  $868000, size =  $8000, fill = yes;
  ROM7:       start =  $878000, size =  $8000, fill = yes;
}

# Logical areas code/data can be put into.
SEGMENTS {
  CODE:       load = ROM0, align =  $100;
  RODATA:     load = ROM0, align =  $100;
  SNESHEADER: load = ROM0, start = $80ffc0;
  CODE1:      load = ROM1, align =  $100, optional = yes;
  RODATA1:    load = ROM1, align =  $100, optional = yes;
  CODE2:      load = ROM2, align =  $100, optional = yes;
  RODATA2:    load = ROM2, align =  $100, optional = yes;
  CODE3:      load = ROM3, align =  $100, optional = yes;
  RODATA3:    load = ROM3, align =  $100, optional = yes;
  CODE4:      load = ROM4, align =  $100, optional = yes;
  RODATA4:    load = ROM4, align =  $100, optional = yes;
  CODE5:      load = ROM5, align =  $100, optional = yes;
  RODATA5:    load = ROM5, align =  $100, optional = yes;
  CODE6:      load = ROM6, align =  $100, optional = yes;
  RODATA6:    load = ROM6, align =  $100, optional = yes;
  CODE7:      load = ROM7, align =  $100, optional = yes;
  RODATA7:    load = ROM7, align =  $100, optional = yes;

  ZEROPAGE:   load = ZEROPAGE, type = zp;
  BSS:        load = BSS,   type = bss, align = $100, optional = yes;
  BSS7E:      load = BSS7E, type = bss, align = $100, optional = yes;
  BSS7F:      load = BSS7F, type = bss, align = $100, optional = yes;
}
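
As a sanity check on a map like this, the following Python sketch converts a CPU address to an offset in the .sfc file. It assumes a plain LoROM layout (32 KiB of ROM per bank at $8000-$ffff, bank bit 7 ignored, no copier header) and only makes sense for the eight ROM banks defined above. The function name is hypothetical.

```python
BANK_SIZE = 0x8000  # 32 KiB of ROM per LoROM bank


def lorom_file_offset(address):
    """Map a 24-bit CPU address in a LoROM bank to an offset in the ROM file."""
    bank = (address >> 16) & 0x7F  # $80-$FF mirror $00-$7F
    offset_in_bank = address & 0xFFFF
    if offset_in_bank < 0x8000:
        raise ValueError("LoROM maps ROM only at $8000-$FFFF of each bank")
    return bank * BANK_SIZE + (offset_in_bank - 0x8000)


print(hex(lorom_file_offset(0x808000)))  # 0x0, same as $008000
print(hex(lorom_file_offset(0x80FFC0)))  # 0x7fc0, where the header sits
print(hex(lorom_file_offset(0x878000)))  # 0x38000, start of the last bank
```

Moving every ROM area from $00xxxx to $80xxxx therefore changes the addresses the assembler generates but not the layout of the output file.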

Re: How big is your object table? (addressing problems)
by TOUKO on 2016-09-10 (#179275)

93143 wrote:

I understand now why the SNES's CPU was only clocked at 2.68 MHz.
It's a shame that Nintendo (Ricoh, in fact) kept this half-cycle memory access, unlike what Hudson did for its HuC6280.

Re: How big is your object table? (addressing problems)
by Drew Sebastino on 2016-09-10 (#179343)

Espozo wrote:

lda .LOWORD(ObjectTable)

God I'm an idiot... How did I not see I forgot the "#" until just now, and how did I accidentally get rid of it in the first place? :lol:

Re: How big is your object table? (addressing problems)
by 93143 on 2016-09-11 (#179352)

TOUKO wrote:

I understand now why the SNES's CPU was only clocked at 2.68 MHz.
It's a shame that Nintendo (Ricoh, in fact) kept this half-cycle memory access, unlike what Hudson did for its HuC6280.

Technically it's 3.58 MHz (3.55 PAL) with penalties. The internal speed is 6 master clocks per cycle regardless of the state of $420D; it's just that slow access cycles take an extra two master clocks of wait state.

(This has the mildly interesting side effect that counting CPU cycles typically only gets you a rough estimate of the time taken by a procedure. To get an exact number you have to count master cycles.)
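
A tiny Python sketch of that point: the same CPU cycle count can cost different amounts of real time depending on how many of the cycles are slow. The 6 and 8 master clock figures are the ones given above. The routine sizes are invented, and which cycles count as fast or slow depends on the memory region each one touches.

```python
FAST_CLOCKS = 6  # master clocks per fast cycle
SLOW_CLOCKS = 8  # master clocks per slow cycle


def routine_master_clocks(fast_cycles, slow_cycles):
    """Total master clocks for a routine with the given mix of cycles."""
    return fast_cycles * FAST_CLOCKS + slow_cycles * SLOW_CLOCKS


# Three hypothetical routines, each 100 CPU cycles long.
print(routine_master_clocks(100, 0))   # 600
print(routine_master_clocks(40, 60))   # 720
print(routine_master_clocks(0, 100))   # 800
```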

I suppose redesigning the 65C816 so radically was outside their budget... same with using 16-bit system busing with a smart interface to the 8-bit CPU bus, like the SA-1 did with its ROM... or maybe it was because they were going for backward compatibility with the Famicom until it was too late to make big changes. Either way, the CPU is definitely the least impressive of the three major subsystems.

Espozo wrote:

God I'm an idiot... How did I not see I forgot the "#" until just now

I think that happens to every 65xx programmer at least once.

One time I managed to convince myself that the description of $212E (TMW) was backwards, because when I tried to load #$1F (but without the "#") I got $1F, which was initialized to 00h. Attempting to load #$00 got me $00, which some previous code had set to 9Fh (note that $212E only uses bits 0-4)...
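
That anecdote is easy to replay in Python with a toy memory model. The two memory values are the ones reported above. Everything else (the function names, treating the operand as a direct-page address with D = 0) is an assumption made for the sketch.

```python
# Contents of the two direct-page locations, as reported in the anecdote.
memory = {0x1F: 0x00, 0x00: 0x9F}

TMW_MASK = 0x1F  # $212E only uses bits 0-4


def lda_immediate(operand):
    """lda #$nn: the operand itself is loaded."""
    return operand


def lda_direct(operand):
    """lda $nn: the byte stored at that address is loaded."""
    return memory[operand]


for intended in (0x1F, 0x00):
    wanted = lda_immediate(intended) & TMW_MASK
    got = lda_direct(intended) & TMW_MASK
    print(f"wanted ${wanted:02X} in TMW, actually wrote ${got:02X}")
```

With those particular memory contents the two values come out exactly swapped, which is why the register description looked backwards.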

Re: How big is your object table? (addressing problems)
by TOUKO on 2016-09-11 (#179362)

Quote:

I suppose redesigning the 65C816 so radically was outside their budget...

It's really strange, because a small company like Hudson did it.
I think Nintendo knew that they could upgrade the SNES's CPU easily and cheaply if needed, which is why they didn't care about having a better CPU in the stock machine.
Costs were their main priority, and maybe they did not have the time to redo it?

Quote:

I think that happens to every 65xx programmer at least once.

Ahaha, yes.

Re: How big is your object table? (addressing problems)
by Revenant on 2016-09-11 (#179367)

93143 wrote:

I think that happens to every 65xx programmer at least once.

Not just amateur programmers, either.

There are various points in Enix's Soul Blazer where an RNG output is read and then ANDed with some constant to get a random number within a small range, for example:

Code:

LDA $0302
AND #$nn
... (several compares, branches, etc.)

... with the result usually leading to one of four or so different code paths (or more/fewer, depending on what's done with the value at $0302).

Except there's one place in the code where the programmer clearly made a typo:

Code:

LDA #$0302
AND #$03
...

The code then still checks all four "possible" results to determine the "random" outcome, except obviously three of those things will never happen.
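
In Python terms, the difference between the two snippets looks like this. It is only a sketch of the masking: the function names are made up, and the RNG value is modelled as an arbitrary 16-bit number rather than the game's actual generator.

```python
MASK = 0x03


def intended_roll(rng_value):
    """LDA $0302 / AND #$03: mask whatever the RNG left in memory."""
    return rng_value & MASK


def typo_roll():
    """LDA #$0302 / AND #$03: mask the constant $0302 itself."""
    return 0x0302 & MASK


# The intended version can produce all four outcomes...
print(sorted({intended_roll(v) for v in range(0x10000)}))  # [0, 1, 2, 3]

# ...but the typo version always produces the same one.
print(typo_roll())  # 2
```

So of the four branches the code tests for, only the one for a result of 2 can ever be taken.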

I haven't figured out where that particular botched RNG call actually occurs in the game, though, so I'm not sure yet what exactly the player is missing out on because of those three potential "random" outcomes that aren't actually possible to trigger.

Re: How big is your object table? (addressing problems)
by tepples on 2016-09-11 (#179370)

If you do figure it out, TCRF probably wants to know about which options were left on the cutting room floor.

Re: How big is your object table? (addressing problems)
by Revenant on 2016-09-11 (#179371)

That's actually most of the reason I was looking into it in the first place (I'm an admin there), but it will probably be a while before I have enough time to look that deep into the game again.