Actually making progress: tricked out metasprite routine

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
Actually making progress: tricked out metasprite routine
by on (#174513)
I know that the reason I crashed and burned last time I tried to implement my crazy vram setup is that I was trying to do a lot without having any way to test whether what I was doing even worked. At that point I realized it would make the most sense to get my metasprite routine working with my vram setup before making an animation engine or a vram slot finder. (I do have a tile uploader that works, though.) Because I really liked psycopathicteen's linked list idea for vram slots, I built it into my metasprite routine: for each sprite, the routine either stays on the same spot in vram or goes to the next location on the linked list. I imagine how I did it isn't very efficient, partly because I used up x, y, and the direct page, so I had to push and pull x. I actually don't want to have 16x16's and 32x32's for what I'm planning to do (not that many sprites total, but there are a lot of overlaying ones), so I only implemented 16x16 vram slots, with a miniature offset for picking a specific 8x8 inside a 16x16 sized slot.

One problem that I encountered is that a smaller sized sprite flips just like the larger one, so I had the routine check whether the sprite is small or large and add 8 to the sprite's position if it is flipped. I did it in a very lousy way, but I don't know how else to do it. Also, metasprites just flip wherever instead of according to the width, because I don't really know how to program this and don't feel like thinking too hard, considering I pretty much did all of this today.
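
One way to make a flip follow the metasprite's width is to mirror every piece about that width. The sketch below is in Python rather than 65816 to keep it short; `flip_piece_x` and the sizes in the example are hypothetical and not taken from the attached routine.

```python
def flip_piece_x(x_offset, piece_size, metasprite_width):
    """X offset of one piece after the whole metasprite is flipped horizontally.

    x_offset is the piece's left edge relative to the metasprite's left edge
    when unflipped. The hardware flip bit still has to be set on the piece
    itself; this only moves it to the mirrored position.
    """
    return metasprite_width - x_offset - piece_size


# A 32-pixel-wide metasprite built from a 16x16 piece and an 8x8 piece.
print(flip_piece_x(0, 16, 32))   # 16x16 piece moves from x=0 to x=16
print(flip_piece_x(16, 8, 32))   # 8x8 piece moves from x=16 to x=8
```

With this rule the small/large check falls out of the `piece_size` term, so a separate "add 8" case should not be needed, assuming each piece's size is known when the offset is computed.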

Enjoy...

Attachment:
Metasprite.zip [234.41 KiB]

Kind of random, but that strobe-like effect has proven to be pretty useful as a CPU usage meter, and especially for telling me whether what I am programming has crashed or not.
Re: Actually making progress: tricked out metasprite routine
by on (#174563)
Nice. I'm glad you're making progress.
Re: Actually making progress: tricked out metasprite routine
by on (#174619)
Thanks! I had a lot of problems thinking about how to handle things like double buffering, but I think I've got a solution. Instead of having a space in ram for where each metasprite's tiles start, I'll have a separate space for the double buffered area. So, on a 64x64 double buffered object, there will be a slot for each 64x32 spot. The top part will still be linked to the bottom on the vram table, but it will first check whether the bottom part even exists in vram (and if it doesn't, it will upload it). This is also useful if you have a tank or something and need to animate the treads but nothing else: the part that differs (the treads) would lead into the commonly shared part (the tank body). This is somewhat limited, but my original idea was way overcomplicated and had no real purpose. It was kind of like the above, but each slot could lead into any other, which totally screwed up the metasprite routine. If you wanted something that complicated, you would just use another object slot. I edited the object code so that if the identity is #$0001 (#$0000 is nothing), the object identifier won't jump to it, but the object slot searcher also won't overwrite it like it will with #$0000. So if I am animating a tank in my game engine, the body and treads will be one object, but the turret will be a separate object that doesn't actually have any code, as it is really only for visual purposes. Yeah, I'm not entirely sure how I'm going to program my vram idea, but it doesn't seem too hard.
Re: Actually making progress: tricked out metasprite routine
by on (#174926)
Okay, so I've been successful in nearly everything related to making this, except one thing: I can't seem to get it to where I'm uploading from the correct address. So, it is able to find what slots are empty, make the linked list correctly, and upload the tiles in the correct location, but it just isn't able to upload them from the correct location.

I even tried to make it static to where it is loading "LOWORD(Test1Tiles)" and stuff like that, but it still doesn't work, inexplicably. I had noticed that I have direct page somewhere other than 0, (it's at the start of the object it is currently looking at) and so I've put an "a:" in front of anything else. Is this not always the same as loading something normally when direct page is #$0000? I mean, that's all I can think of.
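
On the `a:` question, the two addressing modes can be modelled as follows. This is a sketch in Python of how the 65816 forms the effective address in native mode; the register values in the example are made up, and indexing is left out.

```python
def direct_page_address(d_register, offset):
    """Direct page operand: D + offset, wrapped to 16 bits, always in bank $00."""
    return (d_register + offset) & 0xFFFF


def absolute_address(dbr, addr16):
    """Absolute operand (what a: forces): the bank comes from the data bank register."""
    return ((dbr & 0xFF) << 16) | (addr16 & 0xFFFF)


# With D pointing at an object (hypothetical value $0300), "lda $10" and
# "lda a:$10" hit different bytes.
print(hex(direct_page_address(0x0300, 0x10)))  # 0x310
print(hex(absolute_address(0x00, 0x0010)))     # 0x10
print(hex(absolute_address(0x80, 0x0010)))     # 0x800010
```

So an `a:` access matches a direct page access with D = #$0000 only when the data bank register selects a bank where that address maps to the same memory. On the SNES, low RAM is mirrored into banks $00-$3F and $80-$BF, so this usually holds for variables in the first 8 KB of RAM, but it is worth checking what the data bank register is set to when the routine runs.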

The code is a mess because I kept running out of hardware registers and things are named poorly (which is easy enough to fix though). I already know I'll have to go back and optimize it, but that's for another day. :lol:

This almost looks like gibberish to me, so I don't expect anyone else to be able to understand it, but I figure I might as well post it here. The object's identity is #$0004. #$0000 counts as nothing for objects, and it also counts as nothing for vram slots, which is why most of the tables are offset by -2.

Code:
.proc vram_engine
  rep #$30   ;A=16, X/Y=16
  lda #ObjectTable
  tcd
  ldy #$0002

vram_engine_loop:
  ldx ObjectSlot::RequestedFrame
  beq next_object
  lda a:AnimationFrameSlotUsageTable-2,x
  beq find_vram
  inc a:AnimationFrameSlotUsageTable-2,x
  lda a:AnimationFrameLinkedListTable-2,x
  sta ObjectSlot::VramOffset
  stz ObjectSlot::RequestedFrame

next_object:
  stz ObjectSlot::RequestedFrame
  inx
  inx
  tdc
  clc
  adc #ObjectSlotSize
  cmp #ObjectTable+ObjectTableSize
  bcs vram_engine_done
  tcd
  bra vram_engine_loop

vram_engine_done:
  rts

find_vram:
  lda a:TilesInFrameTable-2,x
  sta a:TilesInFrame
  lda a:VramAddressFrameTable-2,x
  sta a:VramAddressOfFrame
  lda a:VramBankByteFrameTable-2,x
  sta a:BankByteOfFrame

find_vram_loop_1:
  lda a:VramLinkedListTable-2,y
  beq open_slot_found_1
  iny
  iny
  cpy #$0102
  bcc find_vram_loop_1
  bra next_object

open_slot_found_1:
  tya
  sta a:AnimationFrameLinkedListTable-2,x
  sta ObjectSlot::VramOffset

  phy
  ldy a:TileRequestCounter16x16
  lda a:VramAddressToTransferAddressTable-2,x
  sta a:TileRequestTable+VramAddress,y
  lda a:VramAddressOfFrame
  sta a:TileRequestTable+TileAddress,y
  lda a:BankByteOfFrame
  sta a:TileRequestTable+BankNumber,y
  lda a:VramAddressOfFrame
  clc
  adc #$0020
  sta a:VramAddressOfFrame
  lda a:TileRequestCounter16x16
  clc
  adc #$0006
  sta a:TileRequestCounter16x16
  ply

  dec a:TilesInFrame
  beq next_object
  tyx
  iny
  iny
  cpy #$0102
  bcc find_vram_loop_2
  bra next_object

find_vram_loop_2:
  lda a:VramLinkedListTable-2,y
  beq open_slot_found_2
  iny
  iny
  cpy #$0102
  bcc find_vram_loop_2
  bra next_object

open_slot_found_2:
  tya
  sta a:VramLinkedListTable-2,x

  phy
  ldy a:TileRequestCounter16x16
  lda a:VramAddressToTransferAddressTable-2,x
  sta a:TileRequestTable+VramAddress,y
  lda a:VramAddressOfFrame
  sta a:TileRequestTable+TileAddress,y
  lda a:BankByteOfFrame
  sta a:TileRequestTable+BankNumber,y
  lda a:VramAddressOfFrame
  clc
  adc #$0020
  sta a:VramAddressOfFrame
  lda a:TileRequestCounter16x16
  clc
  adc #$0006
  sta a:TileRequestCounter16x16
  ply

  dec a:TilesInFrame
  beq jump_to_next_object
  iny
  iny
  cpy #$0102
  bcc find_vram_loop_2

jump_to_next_object:
  brl next_object
.endproc
Code:
;=========================================================================================
.segment "RODATA"
;=========================================================================================

VramAdressToTileNumberTable:
  .word $0000,$0002,$0004,$0006,$0008,$000A,$000C,$000E
  .word $0020,$0022,$0024,$0026,$0028,$002A,$002C,$002E
  .word $0040,$0042,$0044,$0046,$0048,$004A,$004C,$004E
  .word $0060,$0062,$0064,$0066,$0068,$006A,$006C,$006E
  .word $0080,$0082,$0084,$0086,$0088,$008A,$008C,$008E
  .word $00A0,$00A2,$00A4,$00A6,$00A8,$00AA,$00AC,$00AE
  .word $00C0,$00C2,$00C4,$00C6,$00C8,$00CA,$00CC,$00CE
  .word $00E0,$00E2,$00E4,$00E6,$00E8,$00EA,$00EC,$00EE
  .word $0100,$0102,$0104,$0106,$0108,$010A,$010C,$010E
  .word $0120,$0122,$0124,$0126,$0128,$012A,$012C,$012E
  .word $0140,$0142,$0144,$0146,$0148,$014A,$014C,$014E
  .word $0160,$0162,$0164,$0166,$0168,$016A,$016C,$016E
  .word $0180,$0182,$0184,$0186,$0188,$018A,$018C,$018E
  .word $01A0,$01A2,$01A4,$01A6,$01A8,$01AA,$01AC,$01AE
  .word $01C0,$01C2,$01C4,$01C6,$01C8,$01CA,$01CC,$01CE
  .word $01E0,$01E2,$01E4,$01E6,$01E8,$01EA,$01EC,$01EE
 
VramAddressToTransferAddressTable:
  .word $0000,$0020,$0040,$0060,$0080,$00A0,$00C0,$00E0
  .word $0200,$0220,$0240,$0260,$0280,$02A0,$02C0,$02E0
  .word $0400,$0420,$0440,$0460,$0480,$04A0,$04C0,$04E0
  .word $0600,$0620,$0640,$0660,$0680,$06A0,$06C0,$06E0
  .word $0800,$0820,$0840,$0860,$0880,$08A0,$08C0,$08E0
  .word $0A00,$0A20,$0A40,$0A60,$0A80,$0AA0,$0AC0,$0AE0
  .word $0C00,$0C20,$0C40,$0C60,$0C80,$0CA0,$0CC0,$0CE0
  .word $0E00,$0E20,$0E40,$0E60,$0E80,$0EA0,$0EC0,$0EE0
  .word $1000,$1002,$1004,$1006,$1008,$100A,$100C,$100E
  .word $1200,$1220,$1240,$1260,$1280,$12A0,$12C0,$12E0
  .word $1400,$1420,$1440,$1460,$1480,$14A0,$14C0,$14E0
  .word $1600,$1620,$1640,$1660,$1680,$16A0,$16C0,$16E0
  .word $1800,$1820,$1840,$1860,$1880,$18A0,$18C0,$18E0
  .word $1A00,$1A20,$1A40,$1A60,$1A80,$1AA0,$1AC0,$1AE0
  .word $1C00,$1C20,$1C40,$1C60,$1C80,$1CA0,$1CC0,$1CE0
  .word $1E00,$1E20,$1E40,$1E60,$1E80,$1EA0,$1EC0,$1EE0

;=========================================================================================

TilesInFrameTable:
  .word $0002,$0002

VramAddressFrameTable:
  .word .LOWORD(Test1Tiles)

VramBankByteFrameTable:
  .word .BANKBYTE(Test1Tiles),$00

;=========================================================================================

Test1Tiles:
  .incbin "Test1.pic"

Test2Tiles:
  .incbin "Test2.pic"

;=========================================================================================

Kind of random, but about things that are only used during one routine, I think I'll just have everything use its own space in vram and then just replace the names of everything so that they're all the same, like "Temporary1" or something like that. I have bigger concerns right now though.

Actually, hell, why not, here's the rom:

Attachment:
Vram Engine Test.zip [236.93 KiB]

Dang it, I keep realizing I have things to say... The vram engine doesn't have any sort of double buffering thing because I realized that I really don't need it right now. I'll incorporate one if I ever get to 16x16 and 32x32 sized sprites.
Re: Actually making progress: tricked out metasprite routine
by on (#175012)
Of course, in my "lol plz help omg" moment, I actually realized half of the mistakes I made. (For starters, the frame's identity was 4, but not all the tables even went to that. :roll: )

I actually have it working now after somewhat randomly moving things around and then trying to make sense of the result, except one thing: I don't get it, if you had a 16x16, 4bpp tile at one offset and then several after that follow it, wouldn't you add #$80 to get the address of each additional tile? I mean, 16x16=256/2=128.
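
The arithmetic can be checked with a short Python sketch. `tile_bytes` and `tile_source_offset` are hypothetical helpers, and the offset assumes the file stores each 16x16 tile's data contiguously with no padding.

```python
def tile_bytes(width_px, height_px, bits_per_pixel):
    """Size in bytes of one tile's pixel data."""
    return width_px * height_px * bits_per_pixel // 8


def tile_source_offset(index, width_px=16, height_px=16, bits_per_pixel=4):
    """Byte offset of the index-th tile in a tightly packed source file."""
    return index * tile_bytes(width_px, height_px, bits_per_pixel)


print(hex(tile_bytes(16, 16, 4)))    # 0x80
print(hex(tile_bytes(8, 8, 4)))      # 0x20
print(hex(tile_source_offset(2)))    # 0x100
```

So #$80 per 16x16 tile is right for tightly packed data. If stepping by #$80 lands on the wrong data, the first thing to check is whether each source file really is 128 bytes long.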

For whatever reason, it's not working, and it's leading me to believe that the assembler is causing the problem in how it is arranging data, because if I offset the thing by #$40, it shows half of the first tile.

Yeah, is the data here non-linear?

Code:
TestTiles:
  .incbin "Test1.pic"
  .incbin "Test2.pic"
Re: Actually making progress: tricked out metasprite routine
by on (#175022)
That should work fine as far as I know, assuming those files are the right size.
Re: Actually making progress: tricked out metasprite routine
by on (#175029)
Espozo wrote:
I mean, 16x16=256/2=128.

For whatever reason, it's not working, and it's leading me to believe that the assembler is causing the problem in how it is arranging data, because if I offset the thing by #$40, it shows half of the first tile.

$40 != 128.

EDIT: I misread. Owell.
Re: Actually making progress: tricked out metasprite routine
by on (#175070)
Nicole wrote:
That should work fine as far as I know, assuming those files are the right size.

And apparently, they aren't... They're 512 bytes, for whatever reason: the first fourth is actual graphical data, and the rest is 0 filled. I used pcx2snes.

Yeah, I just now made a 16x32 picture instead of two 16x16's (there was really no point in having it split in the first place), and it works perfectly now. I don't think I need to upload another file, as it's not like it looks any different on the surface. :lol: Speaking of uploading files, where does all this information go anyway? I imagine a server, but who owns it? Okay, yeah, that's irrelevant. :lol:

Anyway, I did find out one thing from all of this that you people might be able to use... It appears pcx2snes just doesn't output any file smaller than 512 bytes. I think we're long overdue for a new tool, but I don't have the kind of skill to make one. (I only know 65816 and a smidge of 80186 assembly.)

Man though, it sucks that it appears absolute addressing always takes one more cycle per instruction, because I'm going to have to fix a lot of my stuff for good performance. :(
Re: Actually making progress: tricked out metasprite routine
by on (#175076)
PVSnesLib has gfx2snes, which can handle .pcx, .tga and .bmp files.
Re: Actually making progress: tricked out metasprite routine
by on (#175078)
I made one in Python that can handle at least BMP and PNG in multiple tile formats, including Super NES 4-bit. It's included with my Super NES project template.
Re: Actually making progress: tricked out metasprite routine
by on (#175079)
Espozo wrote:
it appears absolute addressing always takes one more cycle per instruction

That's not quite true. Direct page takes an extra cycle if it's not page-aligned, meaning it takes just as long as an absolute instruction (assuming you're running in FastROM, so the extra byte fetch in the absolute instruction doesn't take any longer than the internal add in the direct-page instruction). And while indexing adds a cycle to absolute instructions if X/Y are 16-bit, it adds a cycle to direct page instructions regardless of the index register size.

So, for simple load/store/add/whatever instructions (not RMW or anything fancy):

- direct - 3 cycles
- absolute - 4 cycles

- direct non-page-aligned - 4 cycles
- direct indexed - 4 cycles
- direct indexed non-page-aligned - 5 cycles

- absolute 8-bit indexed - 4 cycles
- absolute 16-bit indexed - 5 cycles

Notice that for indexed accesses, if X/Y are 8-bit and the bottom byte of DP is nonzero, absolute is faster.

Add one cycle to all of those if the data is 16-bit. I'll knock off there; see 65c816.txt for further information.

If you want to know how many slow cycles each instruction has, just count the number of byte accesses in slow memory. Everything else is fast.
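
The table above can be turned into a small calculator. This Python sketch only encodes the rules listed in this post for simple load/store style instructions; the function name and its parameters are made up here, and it leaves out the extra cycle that an 8-bit index can add to absolute indexed reads when a page boundary is crossed.

```python
def access_cycles(mode, *, indexed=False, index_16bit=False,
                  dp_low_byte_zero=True, data_16bit=False):
    """Cycle count for a simple (non read-modify-write) instruction."""
    if mode == "direct":
        cycles = 3
        if not dp_low_byte_zero:
            cycles += 1          # D is not page-aligned
        if indexed:
            cycles += 1          # indexing always costs a cycle on direct page
    elif mode == "absolute":
        cycles = 4
        if indexed and index_16bit:
            cycles += 1          # indexing only costs a cycle with 16-bit X/Y
    else:
        raise ValueError("mode must be 'direct' or 'absolute'")
    if data_16bit:
        cycles += 1              # one more byte to move
    return cycles


# D pointing into the middle of an object table, 16-bit A and X/Y:
print(access_cycles("direct", dp_low_byte_zero=False, data_16bit=True))       # 5
print(access_cycles("absolute", data_16bit=True))                             # 5
print(access_cycles("direct", indexed=True, dp_low_byte_zero=False,
                    data_16bit=True))                                         # 6
print(access_cycles("absolute", indexed=True, index_16bit=True,
                    data_16bit=True))                                         # 6
```

Under these rules, a routine that runs with D set to a non-page-aligned object address and 16-bit registers pays the same number of cycles for `a:` accesses as for direct page ones; the remaining cost of absolute addressing is the extra operand byte.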
Re: Actually making progress: tricked out metasprite routine
by on (#175082)
Espozo wrote:
Anyway, I did find out one thing from all of this that you people might be able to use... It appears pcx2snes just doesn't output any file smaller than 512 bytes.

pcx2snes/gfx2snes are known to be buggy, even with bigger files. :|

I'll try out tepples' script shortly. :)
Re: Actually making progress: tricked out metasprite routine
by on (#175126)
93143 wrote:
That's not quite true. Direct page takes an extra cycle if it's not page-aligned, meaning it takes just as long as an absolute instruction (assuming you're running in FastROM, so the extra byte fetch in the absolute instruction doesn't take any longer than the internal add in the direct-page instruction). And while indexing adds a cycle to absolute instructions if X/Y are 16-bit, it adds a cycle to direct page instructions regardless of the index register size.

Man, all that is hard to keep track of. :( It would be so awesome if there was a way to have the cycles per instruction shown while you were typing code, but that would mean this theoretical program would have to assemble the whole file each and every time you did anything. Unfortunately, I don't exactly see SNES (or Apple IIGS) development taking off enough for something this complicated to be made... :lol:

Ramsis wrote:
pcx2snes/gfx2snes are known to be buggy, even with bigger files.

Would you like to have your blank data in the front, or the back? :lol:

Anyway, I got to thinking that my next step in my grand SNES adventure would be to make it where old frames are deleted whenever an object changes frames. I suppose I'll have it to where there's the existing "FrameRequest" thing, but also have a "CurrentFrame". What it will do is see if the frame request is equal to the current frame, and if it is, do nothing. If it isn't, it would upload the frame and copy the frame request into the current frame. It would also get rid of what was then the current frame if nothing else is using it. (There's a counter of how many objects are using a particular frame, so if it's 0, follow the linked list, replacing it with #$0000 on every entry in the linked list as that acts as an empty slot.) I'm not really sure I'll have an animation engine, because it would be a giant mess if I were to implement everything that I want out of it. For example implementing tank treads or tires moving is a major pain: there'd have to be the feature of playing animation at different speeds, and also playing animations backwards. Also if one thing is this fancy, everything has to be, and that could be unnecessarily slow. I think I'll just hardcode everything.
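
The "FrameRequest" / "CurrentFrame" plan with a per-frame usage counter can be modelled like this. It is a Python sketch of the idea, not of the actual engine: the class, the slot numbering from 1, and the `END` marker for the last slot of a frame are all assumptions made for the example.

```python
FREE = 0       # slot is empty and may be overwritten
END = 0xFFFF   # hypothetical marker: last slot of a frame, leads nowhere


class FrameCache:
    def __init__(self, slot_count):
        self.link = {slot: FREE for slot in range(1, slot_count + 1)}
        self.first_slot = {}   # frame id -> first vram slot of that frame
        self.users = {}        # frame id -> number of objects showing it

    def acquire(self, frame, tiles):
        """Return the frame's first slot, allocating slots if nobody uses it yet."""
        if self.users.get(frame, 0) == 0:
            free = [s for s in sorted(self.link) if self.link[s] == FREE][:tiles]
            if len(free) < tiles:
                raise MemoryError("not enough free vram slots")
            for slot, nxt in zip(free, free[1:] + [END]):
                self.link[slot] = nxt      # a tile upload would be queued here
            self.first_slot[frame] = free[0]
        self.users[frame] = self.users.get(frame, 0) + 1
        return self.first_slot[frame]

    def release(self, frame):
        """Drop one user; when the count hits 0, walk the list and free every slot."""
        self.users[frame] -= 1
        if self.users[frame] == 0:
            slot = self.first_slot.pop(frame)
            while slot != END:
                nxt = self.link[slot]
                self.link[slot] = FREE
                slot = nxt


def change_frame(cache, obj, requested, tiles_in_frame):
    """Do nothing if the request matches the current frame; otherwise swap."""
    if requested == obj["current"]:
        return
    if obj["current"] is not None:
        cache.release(obj["current"])
    obj["vram"] = cache.acquire(requested, tiles_in_frame[requested])
    obj["current"] = requested
```

Releasing the old frame before acquiring the new one lets the new frame reuse the slots that were just freed, which matters when vram is nearly full.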
Re: Actually making progress: tricked out metasprite routine
by on (#175127)
Espozo wrote:
It would be so awesome if there was a way to have the cycles per instruction shown while you were typing code, but that would mean this theoretical program would have to assemble the whole file each and every time you did anything.

That might be a job for a profiler. Instead of a breakpoint, you set a profile point on a JSR, and the emulator counts cycles for you.

Espozo wrote:
I suppose I'll have it to where there's the existing "FrameRequest" thing, but also have a "CurrentFrame". What it will do is see if the frame request is equal to the current frame, and if it is, do nothing. If it isn't, it would upload the frame and copy the frame request into the current frame.

So far this sounds like the scheme I used for Haunted: Halloween '85. I was sometimes able to fit two frames' tiles into one slot if they shared many, so that I could get away with this trivial frame request more often.

Espozo wrote:
There's a counter of how many objects are using a particular frame

And there it differs. You're using reference-counted GC, which is quite a bit more complicated than what I used. I just fully double-buffered all actors' cels, which was fine for the number and size of enemies that engine supported but may not be fine for a more detailed game.
Re: Actually making progress: tricked out metasprite routine
by on (#175129)
So I take it you took the more simple but faster method of having a fixed spot in vram for each object? Yeah, I can't think of a single game that is doing anything as complicated as I am, but I am worried about how it will run. However, I won't concern myself with compression because I'll just make the cartridge bigger, so that'll save a good amount of time. My whole thing is trying to fit as much data into the little 16KB of vram available to sprites as possible, because I think one of the main differences you can tell about an SNES game vs a Neo Geo game or something is how much less diverse the backgrounds and especially the sprites are on the SNES because most everything is often crammed into vram and never swapped out. (Often times, the only thing that is is the character.) Heck, most games don't even seem to go anywhere near using the whole vram bandwidth, which isn't even large to begin with.

Anyway, I'll post when I get my slot deletion thing working. The current slot and frame slot approach seems to be the best, which is why it also seems pretty popular.
Re: Actually making progress: tricked out metasprite routine
by on (#175163)
I didn't think I had it in me, but apparently I do. :) (Press A to go to numbers, and B to go to letters)

Attachment:
Vram Engine V2.zip [237.42 KiB]

Just so you know, what's going on here is that the vram engine detects that the requested frame isn't the same as the current frame, so it goes and follows the linked list of the current frame, setting each slot in vram to be #$0000, which means it's empty so it can be overwritten. Then, the requested frame is uploaded to where it'll fit, and because the beginning of vram is now empty (in this case, because there are no other objects except the one) it uploads the tiles there.

I originally had some trouble because the last tile in a frame kept getting overwritten. I then realized that because the last entry in the frame doesn't lead anywhere, it stays #$0000, which means the slot is empty, not that it points to the beginning of vram (it doesn't; #$0002 does, as the table is offset by -2). Luckily, that was an easy fix.
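
The overwritten last tile can be reproduced in a few lines. This Python sketch uses simplified slot numbers (1, 2, 3, ... instead of the byte offsets #$0002, #$0004, ...), and the `0xFFFF` end marker is just one possible fix, not necessarily the one used in the attached ROM.

```python
FREE = 0


def find_free_slots(link_table):
    """Slots are numbered from 1; index 0 is unused so that 0 can mean 'empty'."""
    return [slot for slot in range(1, len(link_table)) if link_table[slot] == FREE]


def link_frame(link_table, slots, end_marker):
    """Chain the given slots together; the last one gets end_marker."""
    for slot, nxt in zip(slots, slots[1:] + [end_marker]):
        link_table[slot] = nxt


# Bug: the last slot of the frame keeps 0, so it still looks empty.
table = [FREE] * 5
link_frame(table, [1, 2], end_marker=FREE)
print(find_free_slots(table))   # [2, 3, 4]

# Fix: give the last slot a value that is neither 0 nor a valid slot.
table = [FREE] * 5
link_frame(table, [1, 2], end_marker=0xFFFF)
print(find_free_slots(table))   # [3, 4]
```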

What I don't get, however, is that if you disable the delete slot feature I implemented (comment out delete_vram_loop, but not delete_vram) but have it to where it thinks that there are no frames of that kind (so the vram available to sprites is completely filled), then this happens:

Attachment:
SNES VRAM 4bpp.png [1.59 KiB]

Yeah, I have no clue what's going on there... :lol: What's interesting is that the tile number is still correct for the sprites, (so they appear blank, because vram is blank there) it's just that the actual tiles are wacked. Every time I'd alternate between A and B (which created that pattern in vram) instead of going forward, it seemed to overwrite itself in that weird way. I don't know if this is the vram engine's fault, or the tile uploader's, but I'll investigate.



When I find out how to fix that, I wonder what I'll do then... I'll probably try and get a tilemap updater created for whenever the level scrolls. I also want to try and find a way to have it to where new tiles are uploaded for BGs whenever the level scrolls, but I don't know how I'm going to accomplish this. It'll be a lot more hardcoded than objects, that's for certain.

Edit: Wow, the aforementioned problem was caused by me making my "VramAddressToTransferAddressTable" incorrect. Here's the correct table, for anyone who cares:

Code:
VramAddressToTransferAddressTable:
  .word $0000,$0020,$0040,$0060,$0080,$00A0,$00C0,$00E0
  .word $0200,$0220,$0240,$0260,$0280,$02A0,$02C0,$02E0
  .word $0400,$0420,$0440,$0460,$0480,$04A0,$04C0,$04E0
  .word $0600,$0620,$0640,$0660,$0680,$06A0,$06C0,$06E0
  .word $0800,$0820,$0840,$0860,$0880,$08A0,$08C0,$08E0
  .word $0A00,$0A20,$0A40,$0A60,$0A80,$0AA0,$0AC0,$0AE0
  .word $0C00,$0C20,$0C40,$0C60,$0C80,$0CA0,$0CC0,$0CE0
  .word $0E00,$0E20,$0E40,$0E60,$0E80,$0EA0,$0EC0,$0EE0
  .word $1000,$1020,$1040,$1060,$1080,$10A0,$10C0,$10E0
  .word $1200,$1220,$1240,$1260,$1280,$12A0,$12C0,$12E0
  .word $1400,$1420,$1440,$1460,$1480,$14A0,$14C0,$14E0
  .word $1600,$1620,$1640,$1660,$1680,$16A0,$16C0,$16E0
  .word $1800,$1820,$1840,$1860,$1880,$18A0,$18C0,$18E0
  .word $1A00,$1A20,$1A40,$1A60,$1A80,$1AA0,$1AC0,$1AE0
  .word $1C00,$1C20,$1C40,$1C60,$1C80,$1CA0,$1CC0,$1CE0
  .word $1E00,$1E20,$1E40,$1E60,$1E80,$1EA0,$1EC0,$1EE0

That's a relief though, I was really concerned for a minute! :lol:
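
The corrected table follows a regular pattern, so it can also be generated instead of typed by hand, which avoids this kind of slip. The Python sketch below reads the pattern off the two tables posted in this thread (8 slots per row, a step of $20 per slot and $200 per row for transfer addresses, 2 per slot and $20 per row for tile numbers); the function names are made up.

```python
def transfer_address(slot):
    """Transfer address of a 16x16 slot, slots counted from 0, 8 per row."""
    row, col = divmod(slot, 8)
    return row * 0x200 + col * 0x20


def tile_number(slot):
    """Sprite tile number of the top-left 8x8 tile of a 16x16 slot."""
    row, col = divmod(slot, 8)
    return row * 0x20 + col * 2


def format_rows(values, per_row=8):
    """Format values as ca65 .word lines, per_row entries per line."""
    return ["  .word " + ",".join(f"${v:04X}" for v in values[i:i + per_row])
            for i in range(0, len(values), per_row)]


table = [transfer_address(slot) for slot in range(128)]
print("VramAddressToTransferAddressTable:")
print("\n".join(format_rows(table)))
```

The ninth generated line is `.word $1000,$1020,$1040,$1060,$1080,$10A0,$10C0,$10E0`, which is the row that differs between the two versions of the table.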
Re: Actually making progress: tricked out metasprite routine
by on (#175287)
You can have a linked list for empty slots, and when you're clearing a metasprite, you can link the last sprite to the beginning of the empty slot list, and use the first sprite as the new entry point for the empty slot list.
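
A minimal Python sketch of that free list idea, with slots numbered from 1 and 0 used as the end marker; the class and method names are invented for the example.

```python
END = 0  # end-of-list marker in this sketch; real slots are numbered from 1


class SlotPool:
    def __init__(self, slot_count):
        # All slots start chained on the free list: 1 -> 2 -> ... -> END.
        self.next = [END] + [s + 1 for s in range(1, slot_count)] + [END]
        self.free_head = 1 if slot_count else END

    def allocate(self, count):
        """Unlink `count` slots from the front of the free list; return (first, last)."""
        first = last = self.free_head
        if first == END:
            raise MemoryError("no free slots")
        for _ in range(count - 1):
            last = self.next[last]
            if last == END:
                raise MemoryError("not enough free slots")
        self.free_head = self.next[last]
        self.next[last] = END
        return first, last

    def free(self, first, last):
        """Splice a whole chain back onto the free list without walking it."""
        self.next[last] = self.free_head
        self.free_head = first


pool = SlotPool(4)
frame_a = pool.allocate(2)   # slots 1 -> 2
frame_b = pool.allocate(1)   # slot 3
pool.free(*frame_a)          # free list is now 1 -> 2 -> 4
```

Freeing takes the same time no matter how many slots the frame used, and there is no search for empty slots when allocating. The cost is that each frame has to remember its last slot as well as its first (or walk the chain once to find it).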
Re: Actually making progress: tricked out metasprite routine
by on (#175288)
This is actually really random, but I got to looking at BG layer information, and I saw how priorities are arranged for BG layers. What's odd though is looking at the information for BG 3 in mode 1:

wiki.superfamicom.org wrote:
The background priority is (from ‘front’ to ‘back’):
BG3 tiles with priority 1 if bit 3 of $2105 is set
Sprites with priority 3
BG1 tiles with priority 1
BG2 tiles with priority 1
Sprites with priority 2
BG1 tiles with priority 0
BG2 tiles with priority 0
Sprites with priority 1
BG3 tiles with priority 1 if bit 3 of $2105 is clear
Sprites with priority 0
BG3 tiles with priority 0

This seems to imply that it's impossible to have BG3 between BG1 and BG2, which I know isn't the case.

Now, the more relevant question: is it impossible to efficiently update a tilemap on its sides using DMA? I mean, the data isn't contiguous like it is on the top or the bottom, so you can't just update it all in one shot.

You know, I think this discussion has already been had, but how would you have the tilemap for a level out in rom? I think I'll have it to where instead of acting like a bunch of 32x32 tilemaps like the system does, it'll just be one giant tilemap instead of a bunch of 32x32's put together.



It appears a new post came in while I was writing this...

psycopathicteen wrote:
You can have a linked list for empty slots, and when you're clearing a metasprite, you can link the last sprite to the beginning of the empty slot list, and use the first sprite as the new entry point for the empty slot list.

What do you mean by clearing? Anyway, I think I'll worry about optimizing this more later when I run into CPU time problems, otherwise I'll just end up freaking myself out. :lol:
Re: Actually making progress: tricked out metasprite routine
by on (#175290)
Espozo wrote:
This seems to imply that it's impossible to have BG3 between BG1 and BG2, which I know isn't the case.

You know it isn't the case? How exactly?

Espozo wrote:
is it impossible to efficiently update a tilemap on its sides using DMA?

No, it's possible to change bits in $2115 (VMAIN) to change how the VRAM address auto-increments. By doing that, you can have the address increment by 32 words each write, allowing you to efficiently update columns.
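
To illustrate, here is a Python model of how the low two bits of VMAIN change where successive word writes land. It models VRAM as a flat list of 32K words and ignores the address remapping bits; `write_words` and the column number are made up for the example.

```python
# Low two bits of VMAIN ($2115) select the address step in words.
VMAIN_STEP = {0b00: 1, 0b01: 32, 0b10: 128, 0b11: 128}


def write_words(vram, start, words, vmain):
    """Write words to VRAM starting at `start`, stepping as VMAIN dictates."""
    step = VMAIN_STEP[vmain & 0b11]
    addr = start
    for word in words:
        vram[addr & 0x7FFF] = word
        addr += step
    return addr & 0x7FFF


vram = [0] * 0x8000
column = list(range(100, 132))          # 32 tilemap entries for one column
# $81: increment after the high byte is written, step of 32 words.
write_words(vram, 0x0005, column, vmain=0x81)
print(vram[0x0005], vram[0x0025], vram[0x0045])   # 100 101 102
```

With VMAIN set this way, one DMA of 64 bytes fills an entire column of a 32x32 tilemap, since each row of the map is 32 words apart.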
Re: Actually making progress: tricked out metasprite routine
by on (#175291)
Espozo wrote:
This is actually really random, but I got to looking at BG layer information, and I saw how priorities are arranged for BG layers. What's odd though is looking at the information for BG 3 in mode 1:

wiki.superfamicom.org wrote:
The background priority is (from ‘front’ to ‘back’):
BG3 tiles with priority 1 if bit 3 of $2105 is set
Sprites with priority 3
BG1 tiles with priority 1
BG2 tiles with priority 1
Sprites with priority 2
BG1 tiles with priority 0
BG2 tiles with priority 0
Sprites with priority 1
BG3 tiles with priority 1 if bit 3 of $2105 is clear
Sprites with priority 0
BG3 tiles with priority 0

This seems to imply that it's impossible to have BG3 between BG1 and BG2, which I know isn't the case.

See attached image for ordering, including how OBJ (sprites) fit into it. Yeah, this one is a little hard to understand, but it should help explain how and "where" BG3 (and OBJ!) fit into the "render layering process" based on if priority bits are set for BG3 and/or in each individual sprite.
Re: Actually making progress: tricked out metasprite routine
by on (#175293)
Nicole wrote:
You know it isn't the case? How exactly?

Look at the factory levels and the cliff side levels in DKC3. Those are the examples that pop up immediately.

Image
Image

Having BG3 in the middle is really useful for things like silhouettes:

Image

Nicole wrote:
No, it's possible to change bits in $2115 (VMAIN) to change how the VRAM address auto-increments. By doing that, you can have the address increment by 32 words each write, allowing you to efficiently update columns.

Oh, cool. The one problem that would still remain though would be the offset for the source, unless there's a way around this too.
Re: Actually making progress: tricked out metasprite routine
by on (#175297)
I looked into it, and yes, BG3 does end up getting rendered in the middle. How it does it is pretty clever, though; what it does is take advantage of the subscreen and color math.

Taking DKC3's factory levels as an example, only BG1 (the foreground), BG3 (the back wall), and the sprites are on the main screen. So, with no color math, the backdrop color would show through the windows.

However, it then has that backdrop color set to black, puts BG2 (the sky) on the subscreen, and uses color math to add the subscreen to only the backdrop layer. The effect is that wherever the backdrop appears, BG2 will appear instead.
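
The trick can be modelled per pixel. This is a simplified Python sketch of additive color math applied only to the backdrop, with 5-bit RGB tuples; it assumes a transparent subscreen pixel contributes black and ignores windows, halving, and subtraction. The layer colors are invented.

```python
def add_colors(a, b):
    """Additive color math on 5-bit RGB, clamped per channel."""
    return tuple(min(x + y, 31) for x, y in zip(a, b))


def composite(main_layers, sub_layers, backdrop, math_on_backdrop=True):
    """main_layers / sub_layers: pixels ordered front to back, None = transparent."""
    main = next((p for p in main_layers if p is not None), None)
    sub = next((p for p in sub_layers if p is not None), (0, 0, 0))
    if main is None:
        # Nothing opaque on the main screen: the backdrop shows through.
        return add_colors(backdrop, sub) if math_on_backdrop else backdrop
    return main


black = (0, 0, 0)
wall = (8, 8, 8)        # BG3 pixel on the main screen
sky = (10, 20, 31)      # BG2 pixel on the subscreen

print(composite([None, wall], [sky], black))   # (8, 8, 8)
print(composite([None, None], [sky], black))   # (10, 20, 31)
```

Because the backdrop is black, adding the subscreen to it reproduces BG2 unchanged, so BG2 effectively appears behind BG3 even though BG3 normally cannot sit between BG1 and BG2.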
Re: Actually making progress: tricked out metasprite routine
by on (#175310)
So I guess based on that, you can't really have a transparent BG3 in the middle? Well, you can't do everything. Also, can BG2, which is in the background, have color 0 showing behind it and not affect BG3? Actually, no, you're saying that BG2 has to use color math, so they make color 0 black and just add the BG layer over it. Can it not be opaque? I'm not entirely sure how the whole main screen and subscreen thing works, but it seems like it's just two screens that go over one another for color math. I know you can't have two color math things over one another and have the effect show through both of them, but can you also not have two different types of transparencies going on at the same time?

But yeah, I know this sounds ridiculous, but I suppose for updating a tilemap on its sides, you could have a sideways large tilemap in rom. :? That, or you'd copy the data into a small buffer using the CPU, and then DMA the data from that small buffer in ram to vram. That, or you could just copy the data without DMA in the first place. :lol: I suppose you could break it up to where you're only partially updating the sides of the tilemap per frame if you're not scrolling at over 8 pixels per frame (or 16, depending on the BG tile size). How do you guys approach this?
Re: Actually making progress: tricked out metasprite routine
by on (#175317)
Espozo wrote:
But yeah, I know this sounds ridiculous, but I suppose for updating a tilemap on its sides, you could have a sideways large tilemap in rom. :?

Haunted: Halloween '85 does exactly this. Its maps are stored as a list of columns compressed with a mixture of RLE and a predictive scheme. Each frame, if the camera code deems it necessary, it decompresses a single column of the nametable to a buffer in RAM, and then it schedules a copy to take place during the next vblank. Many other games' metatile expansion is column-oriented as well.
Re: Actually making progress: tricked out metasprite routine
by on (#175321)
I actually meant having two maps of the same thing, one of rows, and one of columns. I don't know if that's what you meant, but for what I'm doing, I want to be able to scroll both vertically and horizontally. That is kind of stupid though. I'll just have a 128 byte buffer for the sides of the screen, where I'll copy the tilemap data with the CPU and then DMA it, as opposed to the top of the screen, where I won't copy it to a buffer and will instead just go from rom straight to vram.
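
The CPU-side copy for the sides of the screen amounts to gathering one entry per row out of a row-major map. A Python sketch, assuming the level is stored in rom as one large row-major tilemap; the function name and map dimensions are hypothetical.

```python
def read_column(level_map, map_width, column, top_row, height=32):
    """Gather one column of tilemap entries from a row-major level map."""
    return [level_map[(top_row + row) * map_width + column]
            for row in range(height)]


# A made-up 100x40 level where each entry encodes its own position.
map_width, map_height = 100, 40
level = [y * map_width + x for y in range(map_height) for x in range(map_width)]

buffer = read_column(level, map_width, column=7, top_row=2)
print(len(buffer), buffer[0], buffer[1])   # 32 207 307
```

A 32-entry column of 16-bit tilemap entries is 64 bytes, so a 128 byte buffer has room for two such columns. Rows need no gathering because their entries are already adjacent in a row-major map.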

I guess for handling the camera, you'd have a "CameraX" and a "CameraY". There will also be a previous camera X and Y, and depending on if the camera variable is bigger or smaller, the tilemap will be updated on the right or left, top or bottom. I also want to have some BG layer specific controls, like how much should a BG layer be affected (in terms of movement) by the camera moving.
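
A sketch of that camera comparison in Python. The function names are made up, and the per-layer factor is assumed to be an 8.8 fixed-point value, which is one common way to express "how much a layer follows the camera".

```python
def edges_to_update(prev_x, prev_y, cam_x, cam_y, tile_size=8):
    """Which tilemap edges need new data after the camera moved.

    An edge only needs updating when the camera crosses a tile boundary.
    This reports direction only; it assumes at most one tile per frame.
    """
    edges = []
    prev_col, col = prev_x // tile_size, cam_x // tile_size
    prev_row, row = prev_y // tile_size, cam_y // tile_size
    if col > prev_col:
        edges.append("right")
    elif col < prev_col:
        edges.append("left")
    if row > prev_row:
        edges.append("bottom")
    elif row < prev_row:
        edges.append("top")
    return edges


def layer_scroll(camera, factor):
    """Scroll value for one BG layer; factor 0x100 = full speed, 0x080 = half."""
    return (camera * factor) >> 8


print(edges_to_update(7, 0, 8, 0))     # ['right']
print(edges_to_update(3, 3, 4, 4))     # []
print(layer_scroll(100, 0x080))        # 50
```

Each layer would run `edges_to_update` on its own scroll values, since a slower layer crosses tile boundaries less often than the camera does.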

Man though, just thinking, one thing that's going to kind of suck is that a 64x32 tilemap is really two 32x32 tilemaps, so there will be some break up. Kind of annoying, but should be easy enough to work around. I really think I'll just use a 64x32 tilemap, because that covers the whole screen when it's being scrolled to the side. I really wish there were such thing as a 264x232 pixel tilemap. :lol:
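
The break between the two halves is a fixed offset, so it can be folded into the address calculation. A Python sketch for a 64x32 tilemap, giving the word offset from the tilemap's base address; the function name is made up.

```python
def tilemap_word_offset(tile_x, tile_y):
    """Word offset of a tile in a 64x32 tilemap (two 32x32 screens side by side)."""
    x = tile_x & 63
    y = tile_y & 31
    screen = 0x400 if x >= 32 else 0      # second screen starts 32*32 words later
    return screen + y * 32 + (x & 31)


print(hex(tilemap_word_offset(31, 0)))    # 0x1f
print(hex(tilemap_word_offset(32, 0)))    # 0x400
print(hex(tilemap_word_offset(63, 31)))   # 0x7ff
```

Column updates stay inside one 32x32 half, so they are unaffected. A row update that crosses from x = 31 to x = 32 has to be split into two transfers, one per half.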

The one thing that sucks about trying to make a tilemap for the SNES is that there are no tools for making one... I guess I'll have to make a test one manually... :lol:
Re: Actually making progress: tricked out metasprite routine
by on (#175322)
Yes, 64x32 is probably the most practical for most Super NES scrolling games.

If you just want to update a map in columns, no metatile compression or anything, you can read a column out of the map during draw and DMA it to VRAM in vblank. If you're doing metatiles, you'll have to do it that way.

How would your preferred type of map editor work? Would you design the tile set and then the map, or would you convert a level-sized PNG file to a tile set and map?
Re: Actually making progress: tricked out metasprite routine
by on (#175325)
tepples wrote:
you can read a column out of the map during draw and DMA it to VRAM in vblank.

Yeah, like in a buffer. You don't have to do it for rows though, only columns.

tepples wrote:
Would you design the tile set and then the map

Yeah, like this. I wouldn't like the large PNG thing because I want to change the tiles as the level scrolls, and this will make the PNG thing not work.
Re: Actually making progress: tricked out metasprite routine
by on (#175328)
There's also 16x16 tile mode.

For transparency, are there also transparency enable bits that have to be set as well as the sub screen and main screen? I'm surprised the sub screen can be used for reordering priority.

Quote:
What do you mean by clearing? Anyway, I think I'll worry about optimizing this more later when I run into CPU time problems, otherwise I'll just end up freaking myself out.


Freeing up the slots when you have a frame change. You can just have it where all the empty slots are linked up together, and just connect the object's linked list to the empty slot linked list, when there is a frame change.
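A rough Python model of that splice, just to show the pointer shuffling (not SNES code). The class name, the END marker and the idea of tracking each object's head and tail slot are all assumptions made for the sketch.

```python
END = -1  # hypothetical end-of-list marker


class SlotPool:
    def __init__(self, count):
        # Every slot starts out free, chained 0 -> 1 -> ... -> END.
        self.next = [i + 1 for i in range(count)]
        self.next[-1] = END
        self.free_head = 0

    def alloc_chain(self, n):
        """Take n slots off the front of the free list; return (head, tail)."""
        head = tail = self.free_head
        if head == END:
            raise RuntimeError("no free VRAM slots")
        for _ in range(n - 1):
            tail = self.next[tail]
            if tail == END:
                raise RuntimeError("not enough free VRAM slots")
        self.free_head = self.next[tail]
        self.next[tail] = END
        return head, tail

    def free_chain(self, head, tail):
        """Hook an object's whole chain onto the free list in one step."""
        self.next[tail] = self.free_head
        self.free_head = head

    def free_slots(self):
        """Walk the free list and return the slot numbers in order."""
        slots, slot = [], self.free_head
        while slot != END:
            slots.append(slot)
            slot = self.next[slot]
        return slots
```

Freeing costs the same no matter how many slots the old frame used, as long as the object remembers the last slot of its chain.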
Re: Actually making progress: tricked out metasprite routine
by on (#175338)
The subscreen priority trick uses transparency, so to speak.

The SNES has four different color math modes. One adds the subscreen's colors to the main screen's colors and divides by two, averaging them, which is the classic transparency effect. Another doesn't do the division, which means the colors are simply added, which can be used for light effects, as subscreen colors would always brighten the screen. It's this one which is important, because adding a color to black just yields that same color, since black is (0,0,0), and (r+0,g+0,b+0) is just (r,g,b).

It's possible to set it so color math only operates on certain main screen layers, though you can't use different color math modes at once. In this case, it's using color addition only on the backdrop, which, since it's black, means the subscreen appears unaltered on the backdrop. And since BG2 is enabled on the subscreen, it shows up where the backdrop would.

If it used the color averaging mode, the subscreen would appear half as bright as it should, as it would be averaged with black.
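Here is the arithmetic as a tiny Python model, assuming 5-bit color channels (0 to 31) and a result clamped at 31. The function name is made up, and this only models the addition and averaging modes described above.

```python
def add_color(main, sub, halve=False):
    """Add two (r, g, b) colors with 5-bit channels, optionally averaging."""
    out = []
    for m, s in zip(main, sub):
        total = m + s
        if halve:
            total //= 2            # averaging mode: add, then divide by two
        out.append(min(total, 31))  # clamp to the 5-bit maximum
    return tuple(out)
```

With a black backdrop, `add_color((0, 0, 0), (20, 10, 31))` gives back `(20, 10, 31)` unchanged, while passing `halve=True` gives `(10, 5, 15)`, which is the half-brightness result.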

(Something interesting to note is that, with the color averaging mode, any transparent areas on the subscreen will only cause the subscreen backdrop color (set with COLDATA) to be added to the main screen, not averaged. This is to avoid halving the color of the screen outside of, say, sprites you want to appear transparent.)
Re: Actually making progress: tricked out metasprite routine
by on (#175344)
So, whenever you use the subscreen, color math is always enabled? It does kind of stink though, because color 0 can't necessarily be whatever you want it to be. I've found that DKC3 does a couple of cool tricks with the video hardware that the other games didn't, like I really like how the tree levels put a window layer behind each tree to have it to where each BG tile can look like it's partially over sprites instead of all or nothing:

Image

About camera movement, wouldn't this be the way to detect what direction the camera has moved in? (assuming it doesn't move #$8000 in one frame :lol: )

Code:
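  ; Assumes a 16-bit accumulator.
  lda CameraX
  sec
  sbc PreviousCameraX       ; A = CameraX - PreviousCameraX (wraps at 16 bits)
  beq camera_x_not_moved    ; zero difference: no horizontal movement
  cmp #$8000
  bcc camera_moved_right    ; difference below $8000: camera moved right
  ; otherwise the difference is negative: camera moved left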
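The same check as a host-side Python model, in case it helps to test the wraparound cases off the console. The function name and the returned strings are made up for the sketch; it assumes 16-bit camera values and less than $8000 of movement per frame.

```python
def camera_direction(camera_x, previous_camera_x):
    """Classify horizontal camera movement from two 16-bit positions."""
    delta = (camera_x - previous_camera_x) & 0xFFFF  # 16-bit wrapping subtract
    if delta == 0:
        return "none"
    # A difference below $8000 is a positive signed value, so the camera moved right.
    return "right" if delta < 0x8000 else "left"
```

This still gives "right" when the camera wraps from $FFFE to $0002, since the wrapped difference is 4.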
Re: Actually making progress: tricked out metasprite routine
by on (#175346)
Not always enabled, since the subscreen is also used for hi-res; that said, color math and hi-res are the only two ways you can have the subscreen show up on-screen.

However, I did remember something; color 0 does not necessarily have to be black, because you can additionally have color math force main screen layers to be black. Using that, you could effectively force display of only the subscreen for those layers.

That said, when using this particular trick, I don't think there's a case where you'd need color 0 to not be black, because there's no circumstance where it would be displayed. Transparent areas on the subscreen would end up showing the subscreen backdrop color (COLDATA), not color 0.
Re: Actually making progress: tricked out metasprite routine
by on (#175347)
I didn't know that the subscreen had a "different color 0".

Yeah, so effectively, BG3 in the center has all the advantages of any other background layer, it's just that it can't be used for color math.

Actually, no, because I'm pretty sure Dracula X does this somehow:

Image
Re: Actually making progress: tricked out metasprite routine
by on (#175348)
Yeah, it's outside of the normal 256-color palette, which is a little strange. Can be useful for some things though, like using color math with a solid color that needs to be different from the backdrop color.

Dracula X isn't using the same trick; it simply puts BG3 (the fire) on the subscreen, then applies color math to BG2 (the background), adding the fire to it.

The trick DKC3 uses only needs to be done if you want BG3 to not be transparent.
Re: Actually making progress: tricked out metasprite routine
by on (#175352)
Yeah, from all this, it seems you can pretty much arrange the BG layers however you want to. Fine with me! :lol:

I'll try and get started on my DMA tilemap thing now, that just stood out to me.

I actually had a genius idea of using a GBA tilemap editor, having 8 of the palettes look exactly the same, and using the last palette bit as the priority bit, but I found out that the order of the bits isn't the same. :(

GBA: PPPPVHCCCCCCCCCC
SNES: VHOPPPCCCCCCCCCC

"O" is priority, the rest is self explanatory.

You know what, it would be much easier to just make a GBA to SNES tilemap converter than to deal with making one from scratch.
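A converter like that could be sketched in Python along these lines, going by the two bit layouts above. The function name is made up, and treating the top GBA palette bit as the SNES priority bit is just the convention from the idea above, not anything an editor enforces.

```python
def gba_to_snes_entry(gba):
    """Convert one 16-bit GBA tilemap entry to the SNES bit layout.

    GBA:  PPPPVHCCCCCCCCCC
    SNES: VHOPPPCCCCCCCCCC
    """
    tile = gba & 0x03FF            # CCCCCCCCCC, same place on both
    hflip = (gba >> 10) & 1
    vflip = (gba >> 11) & 1
    palette = (gba >> 12) & 0xF
    priority = palette >> 3        # top palette bit doubles as priority
    snes_palette = palette & 7     # low three bits pick the SNES palette
    return ((vflip << 15) | (hflip << 14) | (priority << 13)
            | (snes_palette << 10) | tile)
```

For example, GBA entry $B805 (palette 11, vertical flip, tile 5) comes out as $AC05: vertical flip, priority set, palette 3, tile 5.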
Re: Actually making progress: tricked out metasprite routine
by on (#175355)
Nicole wrote:
you can't use different color math modes at once
Nicole wrote:
color math and hi-res are the only two ways you can have the subscreen show up on-screen

And here we have the exception that proves the rule. In pseudo-hires mode, you can blend the main screen and subscreen at 50:50, like the add and divide by two option, and also use colour math with the subscreen backdrop.

The 50:50 blending in this case is only an approximation, since what it's actually doing is just alternating half-dots and betting on the TV being blurry enough for the result to look good. The COLDATA math is real though.
Re: Actually making progress: tricked out metasprite routine
by on (#175357)
So the subscreen only shows up on main screen backgrounds with color math enabled. What happens if a sub screen layer has color math enabled? Does that mean that layer shows up transparent everywhere?
Re: Actually making progress: tricked out metasprite routine
by on (#175358)
All in all, the SNES is incredibly flexible in the stuff you can do with its graphics settings. It's nothing compared to modern hardware of course, but I do feel like there's a lot of potential that hasn't necessarily been exploited in a lot of games.

psycopathicteen wrote:
So the subscreen only shows up on main screen backgrounds with color math enabled. What happens if a sub screen layer has color math enabled? Does that mean that layer shows up transparent everywhere?

Color math only affects the main screen, essentially. If BG1 was in the main screen, BG2 was in the subscreen, and color math was enabled for BG1 (so BG2 was added to BG1), enabling color math for BG2 as well would make no difference whatsoever.

However, it is possible to have a layer on both the main screen and the subscreen, so it's possible to apply a layer to itself in that case. What that does depends on the mode: addition would double the brightness of the color, since it's adding the color to itself; averaging would leave the color unchanged; subtraction would make the layer black; subtraction with halving would also make the layer black.
Re: Actually making progress: tricked out metasprite routine
by on (#175363)
Nicole wrote:
All in all, the SNES is incredibly flexible in the stuff you can do with its graphics settings. It's nothing compared to modern hardware of course, but I do feel like there's a lot of potential that hasn't necessarily been exploited in a lot of games.

psycopathicteen wrote:
So the subscreen only shows up on main screen backgrounds with color math enabled. What happens if a sub screen layer has color math enabled? Does that mean that layer shows up transparent everywhere?

Color math only affects the main screen, essentially. If BG1 was in the main screen, BG2 was in the subscreen, and color math was enabled for BG1 (so BG2 was added to BG1), enabling color math for BG2 as well would make no difference whatsoever.

However, it is possible to have a layer on both the main screen and the subscreen, so it's possible to apply a layer to itself in that case. What that does depends on the mode: addition would double the brightness of the color, since it's adding the color to itself; averaging would leave the color unchanged; subtraction would make the layer black; subtraction with halving would also make the layer black.


What happens if only BG2 has color math enabled?

On the topic of animation, I think animation is pretty simple to program once you've already figured out what works and what doesn't. When I first started homebrewing I tried uploading every dynamically animated 16x16 sprite at a fixed 30 fps, in the same order they appeared in OAM. I ended up wasting vblank time, being limited to half the available sprites, and being stuck with inflexible frame rates.
Re: Actually making progress: tricked out metasprite routine
by on (#175364)
psycopathicteen wrote:
being limited to half the available sprites

What now?
Re: Actually making progress: tricked out metasprite routine
by on (#175365)
psycopathicteen wrote:
What happens if only BG2 has color math enabled?

Color math only affects the main screen. If only BG1 is on the main screen, all you'll see is BG1. Color math doesn't do anything if the layer it's enabled on isn't on the main screen.
Re: Actually making progress: tricked out metasprite routine
by on (#175369)
Espozo wrote:
psycopathicteen wrote:
being limited to half the available sprites

What now?


Back in 2010, my first attempt at dynamic animation, I did page switching on the bottom 8kB of sprite patterns to double buffer the sprites. I uploaded 4kB one frame, and 4kB the next, and then switched pages. Everything had to be animated at 30fps, and the oam was delayed a frame to sync up with the vram.

This caused a lot of problems. The first was having trouble getting it all in vblank. I wasted time trying to make an unrolled loop, but it still didn't work out. So instead I had it copy the sprite patterns to work RAM, and then DMA them from there. This was so frustrating for me back then that I had to take a break from programming.
Re: Actually making progress: tricked out metasprite routine
by on (#176022)
Espozo wrote:
So, whenever you use the subscreen, color math is always enabled? It does kind of stink though, because color 0 can't necessarily be whatever you want it to be. I've found that DKC3 does a couple of cool tricks with the video hardware that the other games didn't, like I really like how the tree levels put a window layer behind each tree to have it to where each BG tile can look like it's partially over sprites instead of all or nothing:

Image

About camera movement, wouldn't this be the way to detect what direction the camera has moved in? (assuming it doesn't move #$8000 in one frame :lol: )

Code:
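  ; Assumes a 16-bit accumulator.
  lda CameraX
  sec
  sbc PreviousCameraX       ; A = CameraX - PreviousCameraX (wraps at 16 bits)
  beq camera_x_not_moved    ; zero difference: no horizontal movement
  cmp #$8000
  bcc camera_moved_right    ; difference below $8000: camera moved right
  ; otherwise the difference is negative: camera moved left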


You can probably do that to fake a background with Mode-7. Having a wall one color, with sprite bricks on the edges, and having the sky behind it another color.
Re: Actually making progress: tricked out metasprite routine
by on (#176033)
I've thought about putting a window layer behind BG3 to where you could potentially have 5 colors in one 8x8 block, like this could work great for clouds or mountains or something like that. You could also combine this with scrolling the BG layer vertically every couple of lines and also then changing the tilemap for 5 colors in an 8xWhatever sized area.
Re: Actually making progress: tricked out metasprite routine
by on (#176043)
Or fake a couple mode-7 clouds.

Random news: I'm actually working on adding duplicate checks to my game. I already added an animation number to every metasprite so it can use a "super long list" to point to a CHR table. I will eventually use a linked list too, but I need to have something working first.
Re: Actually making progress: tricked out metasprite routine
by on (#176045)
psycopathicteen wrote:
I'm actually working on adding duplicate checks to my game. I already added an animation number to every metasprite so it can use a "super long list" to point to a CHR table. I will eventually use a linked list too, but I need to have something working first.
:wink:
Re: Actually making progress: tricked out metasprite routine
by on (#176046)
...and I broke it. It feels like I'm trying to pull out a table cloth from under the dishes. Maybe I'll try giving each object a register pointing to its index on the list so it doesn't have to keep looking into its metasprite data for it, and designate frame 0 as no frame. I guess you can say I should've had the frame number point to the metasprite data, and not the other way around.
Re: Actually making progress: tricked out metasprite routine
by on (#176051)
Quote:
Maybe I'll try giving each object a register pointing to its index on the list so it doesn't have to keep looking into its metasprite data for it

A bit related as this is exactly what i'm doing in my sprite engine :p

So 2 months ago I released a new version of my SGDK library for MD. OK, it's for Megadrive, but I think the Sprite Engine part can be interesting for you or at least give you some ideas :)
In this release I almost entirely rewrote it, as the old implementation was too slow to be usable in real game conditions.
The new implementation is faster but still done 100% in C, so not blazing fast... I plan to eventually convert the critical parts to assembly when the code is mature enough (it still needs some polish and probably a few bug fixes here and there).

The core sprite engine implementation is here :
https://github.com/Stephane-D/SGDK/blob ... rite_eng.c

But it relies also on the VDP sprite unit (low level sprite access) and VRAM unit (Video memory management).

The Sprite engine basically gives you high level Sprite capabilities, handling meta sprites, vram and hardware sprite allocation for you.
It also handles automatic sprite visibility to limit the sprite per scanline usage (on MD, even if the sprite is outside the horizontal screen range, it will consume a scanline sprite slot). All these features cost CPU time, and fortunately you can disable / enable them depending on your case (you can for instance have automatic hardware sprite allocation while having manual VRAM allocation).

I know the code is complex, probably too much for your case, as I intended to have something really generic, flexible and still "fast enough" for SGDK. Still, I believe you can grab some ideas from the implementation and the structures :)
Re: Actually making progress: tricked out metasprite routine
by on (#176087)
Got it to work. Might as well get rid of the special cases like explosions and fireballs, and instead animate them like everything else in the game, now that there's no limit on how many times a generally animated object can be copied on-screen.
Re: Actually making progress: tricked out metasprite routine
by on (#176115)
Good job! :) It's funny, because explosions are the main reason I want to use the system I am. It kind of works that what I'm doing currently uses 8x8 and 16x16 sized sprites, because it's easier, and if people ever figure out how to multiplex sprites on the SNES, I'm definitely doing that.
Re: Actually making progress: tricked out metasprite routine
by on (#176174)
The downside is I have to fix slowdown again.
Re: Actually making progress: tricked out metasprite routine
by on (#176452)
I got the VRAM linked list working. Yay! And it only took a couple of minutes at midnight when the house was quiet, plus a couple of minutes between church and work.

Now the next thing I want to improve is animation frame rate consistency. Certain moves look choppy when there is a multijointed boss on-screen because of the rotating DMA priorities. I'll redesign it so the main character always has priority over other objects on-screen when it comes to DMA updates so the player's animation always has perfect timing.
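As a rough sketch of what "player first, everyone else rotating" could look like, here is a Python model (not the game's actual code). The function name, the per-frame byte budget and the object ids are all made up for illustration.

```python
def schedule_uploads(requests, budget, start, player_id=0):
    """Pick which objects get their tiles uploaded this frame.

    requests: list of (object_id, bytes_needed) in object order.
    budget:   bytes that fit in one vblank (hypothetical figure).
    start:    rotating index, advanced by the caller every frame.
    """
    sizes = dict(requests)
    chosen = []
    # The player always goes first so its animation timing never slips.
    if player_id in sizes and sizes[player_id] <= budget:
        chosen.append(player_id)
        budget -= sizes[player_id]
    # Everything else takes turns, starting at the rotating index.
    others = [obj for obj, _ in requests if obj != player_id]
    for i in range(len(others)):
        obj = others[(start + i) % len(others)]
        if sizes[obj] <= budget:
            chosen.append(obj)
            budget -= sizes[obj]
    return chosen
```

Objects that don't fit simply wait for a later frame, so only the non-player animations can ever get choppy.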
Re: Actually making progress: tricked out metasprite routine
by on (#176475)
I didn't know you'd have rotating DMA priorities. I'd have thought you'd have it to where a certain animation frame would always want to be updated. The only problem is that this could cause you to try to update more than you can in one frame, and in that case, I'm just going to have black spill over the screen. :lol: