BBBB is pretty much stuck with what it has. I'm just speculating about what kind of animation system my next game would have, since I am on a never-ending quest to find the perfect animation system. What I just thought about is that having 8 pixels of forced blank on the top and bottom of the screen will give me 54 lines of DMA time, and up to 9207 bytes per frame. That's just enough time to update 50% of the sprite patterns, plus OAM and BG scrolling.
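For anyone who wants to check the 9207-byte figure, here is a back-of-the-envelope sketch in Python (a PC-side calculation, not SNES code). It assumes NTSC timing: 262 scanlines per frame, 224 of them visible, 1364 master clocks per scanline, and DMA moving one byte per 8 master clocks. The constant and function names are made up for illustration. It ignores DMA setup overhead and the short DRAM refresh pause on each line, so the real budget would be somewhat lower.

```python
# Rough per-frame DMA budget (NTSC timing assumed; all names are illustrative).
TOTAL_LINES = 262
VISIBLE_LINES = 224
MASTER_CLOCKS_PER_LINE = 1364
MASTER_CLOCKS_PER_DMA_BYTE = 8

def dma_budget_bytes(blank_top, blank_bottom):
    """Return (lines available for DMA, bytes transferable in that time)."""
    vblank_lines = TOTAL_LINES - VISIBLE_LINES
    dma_lines = vblank_lines + blank_top + blank_bottom
    total_clocks = dma_lines * MASTER_CLOCKS_PER_LINE
    return dma_lines, total_clocks // MASTER_CLOCKS_PER_DMA_BYTE

lines, budget = dma_budget_bytes(8, 8)
print(lines, budget)  # prints: 54 9207
```

Under these assumptions an 8 KB (8192-byte) pattern upload fits with roughly 1000 bytes left over for OAM and scroll updates.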
Sounds like a good compromise. 8 pixels on top and bottom doesn't sound like it'd eat a lot of screen space if it helps get better animations.
However, I wonder if this could cause problems with starting HDMA, if you use any. Perhaps you'd have to start the HDMA channels manually if you use forced blanking at the top of the screen. I might be wrong though.
Before I start programming anything, I need to figure out what I should do if the game needs more than 64 16x16 sprites. Should I run all animation at 30fps? I can't remember if the explosions in Busty Baby were 30fps or 60fps. The explosions in Secret Agent Insane Maniac were animated at 60fps, but I had to stop development on that game because I maxed out VRAM and DMA bandwidth.
EDIT: I just checked BBBB's code, and the explosions are at 30fps. I guess I'll animate all sprites at 30fps, since I can't tell the difference between the explosion animation speeds in BBBB and SAIM.
Given the number of widescreen HDTVs out there by now, a 256x176 letterbox might even be a good idea. That way pushing the zoom button to make the game fill the screen wouldn't cut much off. Umihara Kawase was way ahead of its time in putting the HUD in the 16:9 safe area.
I Googled "Umihara Kawase." I'm impressed by that fishing line trick. Is the fishing line drawn on BG3 in Mode 1?
Anyway, I thought about widescreen, but I'm afraid it would appear too pixelated and cramped.
Okay, I thought a little about how to implement it. In order to get 8kB of animated sprite patterns into that time, I need as little CPU work as possible during v-blank. DMAing every sprite individually takes too much CPU time during v-blank, so I'm going to use a sprite VRAM buffer. Because the SNES stores sprite patterns in a 2D tile formation, I am going to use 2 VRAM buffers and 2 DMA channels, and alternate between the two every group of 16 8x8 tiles.
If you're using a buffer anyway, why not lay out your sprite data in such a way that a block copy is possible?
The Super NES requires that the data for 16x16 and 32x32 pixel sprites be stored discontiguously in VRAM. It's like a 128x128 pixel sprite sheet or like 2D mode on the GBA. Unless you copy eight 16x16 cells in one block, you have to copy the top and bottom halves separately.
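To make that layout concrete, here is a small Python sketch (a model for illustration, not SNES code) that computes where the two halves of a 16x16 cell land. It assumes 4bpp tiles of 32 bytes each and a sheet 16 tiles wide. The function name and the byte-offset convention are my own.

```python
TILE_BYTES = 32                          # one 8x8 tile at 4bpp
TILES_PER_ROW = 16                       # the sheet is 128 pixels = 16 tiles wide
ROW_BYTES = TILE_BYTES * TILES_PER_ROW   # 512 bytes per row of tiles

def cell_16x16_runs(cell_col, cell_row):
    """Return (byte_offset, length) for the top and bottom halves of a 16x16 cell."""
    top = (2 * cell_row) * ROW_BYTES + (2 * cell_col) * TILE_BYTES
    bottom = top + ROW_BYTES             # same columns, one tile row further down
    return [(top, 2 * TILE_BYTES), (bottom, 2 * TILE_BYTES)]

print(cell_16x16_runs(0, 0))  # [(0, 64), (512, 64)]
print(cell_16x16_runs(1, 0))  # [(64, 64), (576, 64)]
```

Eight cells side by side fill two complete tile rows (8 x 64 = 512 bytes each), which is why only a group of eight can go out as one contiguous 1024-byte block.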
tepples wrote:
Unless you copy eight 16x16 cells in one block, you have to copy the top and bottom halves separately.
This is what I was referring to. He talks about uploading 8KB of sprite patterns at once. Shouldn't you organize your data in such a way that a block copy can be performed?
Psycopathicteen's engines typically use what some of us have called the "Battletoads" method: they load in a sprite cel just as it's about to be used, rather than keeping all of a character's sprite cels in VRAM at once. Unless your sprites are fighting game sized, each frame of animation typically won't span eight different 16x16 pixel sprite cels.
Could you describe or draw a diagram of how you recommend to organize the data?
Well, to me it sounded like he wanted to update half of the sprites every other frame. I might have misunderstood his original requirements though.
I interpreted it as the following:
- I have a 16x16 area in sprite VRAM for each object on screen
- I want to animate each sprite at a rate of 30Hz
- During every VBlank I want to 1) Update OAM 2) Update BG map (probably a single row/col) 3) Upload 8KB of sprite data.
The easiest solution I can think of is the following:
Divide the sprite table into two parts of the same size and update each one at a rate of 30Hz (alternating every other frame). This could be done easily with a block copy, couldn't it? It always updates half of the sprite table regardless of whether anything actually changed, though.
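A rough model of that schedule in Python (illustration only, not SNES code). The 16 KB table size is an assumption that follows from uploading 8KB per frame as half of the table, and the function name is hypothetical.

```python
SPRITE_TABLE_BYTES = 16 * 1024           # assumed: two halves of 8 KB each
HALF_BYTES = SPRITE_TABLE_BYTES // 2

def block_to_copy(frame_counter):
    """Return (vram_byte_offset, length) of the half refreshed on this frame."""
    half = frame_counter & 1             # even frames: first half, odd frames: second
    return half * HALF_BYTES, HALF_BYTES

for frame in range(4):
    print(frame, block_to_copy(frame))
```

Each half is refreshed every second frame, so at 60 frames per second every sprite pattern is updated at 30Hz with a single block transfer per VBlank.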
Quote:
The Super NES requires that the data for 16x16 and 32x32 pixel sprites be stored discontiguously in VRAM.
Then why not use a metasprite made of several 8x8 sprites?
This way any important object can be bulk-copied from ROM directly during VBlank, without the need to split in halves or anything.
Less important objects that would not need to be constantly uploaded could use larger HW sprites.
Or alternatively you could do a smart buffering system that copies ROM -> a temporary area in RAM during the frame, and then does a bulk transfer from RAM to VRAM during VBlank.
Unfortunately I don't know enough about the SNES to be of much help.
While I was working on this, I figured out an easy way to get around the 16x16 sprite problem.
Code:
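lda !pattern_address_pointer   ; 16-bit accumulator assumed
and #$0200                     ; isolate bit 9 (set when on an odd tile row)
clc
adc !pattern_address_pointer   ; adds $0200 on an odd row, 0 otherwise
sta !pattern_address_pointer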
If bit 9 of pattern_address_pointer is set, the pointer is on an odd row, so the code adds $0200 and automatically jumps an entire row. That way all sprites land on an even row.
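The same arithmetic can be tried out on a PC with a few lines of Python. This is only a model of the snippet, assuming the pointer is a 16-bit byte offset, so that one row of sixteen 4bpp tiles is $0200 bytes. The function name is made up.

```python
def skip_odd_row(pointer):
    """Model of: lda ptr / and #$0200 / clc / adc ptr / sta ptr (16-bit accumulator)."""
    return (pointer + (pointer & 0x0200)) & 0xFFFF

for p in (0x0000, 0x01C0, 0x0200, 0x0240, 0x0400):
    print(hex(p), "->", hex(skip_odd_row(p)))
```

Pointers already on an even row (bit 9 clear) come back unchanged, while $0200 becomes $0400 and $0240 becomes $0440.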
Now I want to know how games like Street Fighter 2 use HDMA along with forced blanking, if there are a lot of hardware bugs with using DMA and HDMA at the same time, or starting HDMA late.
Okay, I tried using a RAM buffer, but it is taking way too much CPU time. I'll try to see if the existing code can be optimized a little more, and then I'll program it to adapt to the number of sprites onscreen, instead of copying 64 sprites per frame. I might as well use shorter DMA lengths when fewer sprites are onscreen too, just to get as much CPU time as possible.
Well, I don't know, but you'd have to be sure that your engine still works fine even in the worst case (when the maximum number of animated objects is on screen).
Using smaller 8x8 sprites to build bigger sprites might really be the best trick if you want to DMA the patterns for each metasprite separately.
I guess I'll bring it up to 12 pixels on top and bottom, because v-blank is already saturated with 1 big 8kB block, and there needs to be enough CPU time left to run the game.
http://wiki.superfamicom.org/snes/show/Registers On this website I found this under registers $2180-$2183
Quote:
This means you could use DMA mode 4 to $2180 and a table in ROM to write any sequence of RAM addresses. The value does not wrap at page boundaries on increment.
Is this true? Can you DMA from ROM into WRAM with these registers? If so, then using a CPU RAM buffer would be possible. The only problem I see is that it wouldn't work for decompressed sprites, because you can't do WRAM-to-WRAM DMAs.
Well, I can't answer your questions, but as for "is it true", the only really reliable document for the SNES is Anomie's. Most other/older documents are totally inaccurate and sometimes even complete crap.
I'm finished with my animation engine. Here is a demo showing Red's running animation from Gunstar Heroes.
I realized that instead of using HDMA to create the top and bottom bar, a better way would be to use BG3 to create the top and bottom bar. This way would be better because the CPU can keep the PPU in forced-blank until DMAing is finished, even when it finishes late.