Would it be feasable to put everything on one 8bpp BG layer?

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
by on (#135027)
How do I put this... Basically, I am asking whether you can take all the data you would normally use for other backgrounds and sprites, make it all overlap on one layer, and update that layer every frame. Example:
[Attachment: example 1.png]
+
[Attachment: example 2.png]
=
[Attachment: example 3.png]
The way you would set this all up is to use a large portion of VRAM for a 32x28 grid of unique 8x8-pixel tiles and a much smaller portion for a 32x32 tilemap. By the way, are the pictures working, and if not, how do you add pictures? :wink:
Re: Would it be feasable to put everything on one 8bpp BG la
by on (#135028)
You can't use [img] to refer to images on your computer's hard drive. Upload them as attachments.

Anyway, yes, you can put everything on a big bitmap, but it'll be dog slow if you're updating a significant number of pixels on the screen.
Re: Would it be feasable to put everything on one 8bpp BG la
by on (#135031)
Is that why most 3d games for the Super Nintendo suffer frame rate issues, or is it just because the 3d calculations take a lot of time?
Re: Would it be feasable to put everything on one 8bpp BG la
by on (#135033)
Both.
Re: Would it be feasable to put everything on one 8bpp BG la
by on (#135037)
Also another problem is that a 32×28 tiles 8bpp bitmap would take up 56KB of VRAM... Do any of the 8bpp modes even allow that? Mode 7 definitely doesn't, even if you switched on the fly mid-screen you'd have 32KB at most, although at least in that case you can use fewer tiles and stretch it to cover the entire screen =P
Re: Would it be feasable to put everything on one 8bpp BG la
by on (#135041)
Sik wrote:
Also another problem is that a 32×28 tiles 8bpp bitmap would take up 56KB of VRAM... Do any of the 8bpp modes even allow that?

Yes. The tilemap in a normal BG mode (ie: 0-6) is 16-bit, composed of two flip bits, a priority bit, three palette selector bits, and ten bits to denote the actual tile. This means you have up to 1024 tiles, which in 8bpp will completely fill the VRAM.
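
As a rough illustration of the 16-bit tilemap entry described above, here is a minimal Python sketch. The bit positions (vertical flip in bit 15, horizontal flip in bit 14, priority in bit 13, palette in bits 12-10, tile number in bits 9-0) are the commonly documented layout and are not spelled out in the post; the function and constant names are made up for this example.

```python
def pack_entry(tile, palette=0, priority=False, hflip=False, vflip=False):
    """Pack one 16-bit BG tilemap entry (modes 0-6). Hypothetical helper."""
    if not 0 <= tile < 1024:
        raise ValueError("tile number must fit in 10 bits")
    if not 0 <= palette < 8:
        raise ValueError("palette selector must fit in 3 bits")
    return (
        (int(vflip) << 15)
        | (int(hflip) << 14)
        | (int(priority) << 13)
        | (palette << 10)
        | tile
    )


def unpack_entry(entry):
    """Split a 16-bit tilemap entry back into its fields."""
    return {
        "tile": entry & 0x03FF,
        "palette": (entry >> 10) & 0x7,
        "priority": bool(entry & 0x2000),
        "hflip": bool(entry & 0x4000),
        "vflip": bool(entry & 0x8000),
    }


BYTES_PER_8BPP_TILE = 8 * 8      # 64 pixels, one byte of data per pixel
VRAM_BYTES = 64 * 1024

# Ten tile-number bits address 1024 tiles, which at 8bpp is all of VRAM.
full_tileset_bytes = (1 << 10) * BYTES_PER_8BPP_TILE

# The layout proposed in the first post: 32x28 unique tiles plus a 32x32 map.
screen_tiles_bytes = 32 * 28 * BYTES_PER_8BPP_TILE
tilemap_bytes = 32 * 32 * 2

print(full_tileset_bytes == VRAM_BYTES)
print(screen_tiles_bytes, tilemap_bytes, screen_tiles_bytes + tilemap_bytes)
```

So the proposed screen does fit in VRAM, but only with a few kilobytes to spare.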

Mode 7 uses an 8-bit tilemap with no extra features, so you can only have 256 tiles, but since the tilemap is the same size as the tile data (16 kB), the resulting BG layer is huge - which is of course helpful when you're zooming out and viewing it in perspective and such...

...

It's true that a 56 kB background would be impossible to page, though, and the DMA is nowhere near fast enough to update it in one VBlank, so you'd get tearing during updates (on top of a horrible frame rate) unless you were careful to limit the number of tiles changed. Updating the whole screen at 8bpp takes more than 9 frames in the best case, unless you use force blank to get more DMA time.

[Interestingly, Starfox had a small enough playfield (224x190x4bpp) to update at 30fps given the extra VBlank. I wonder if a later-model GSU at 21 MHz might have allowed it to hit that number at times during gameplay, with appropriate programming...?]
Re: Would it be feasable to put everything on one 8bpp BG la
by on (#135048)
The kind of game for which one big 8bpp plane works well is Zoop. Because of the size of its playfield (18x14 cells), it needs to use 12x14 pixels in each cell. It could have used the Puyo Pop (GBA) solution in mode 1 with two 16-color planes scrolled 12 pixels apart and then chop off the bottom row of each 8x8 pixel tile with scroll HDMA, but then that'd leave a single 4-color plane for the playfield backdrop. So instead, it renders in software to a plane that covers most of the screen. But this being an advancing-block puzzle game, most of the field isn't moving at once, except for one advancing row or column of blocks.
Re: Would it be feasable to put everything on one 8bpp BG la
by on (#135118)
93143 wrote:
It's true that a 56 kB background would be impossible to page, though, and the DMA is nowhere near fast enough to update it in one VBlank, so you'd get tearing during updates (on top of a horrible frame rate) unless you were careful to limit the number of tiles changed. Updating the whole screen at 8bpp takes more than 9 frames in the best case, unless you use force blank to get more DMA time.

[Interestingly, Starfox had a small enough playfield (224x190x4bpp) to update at 30fps given the extra VBlank. I wonder if a later-model GSU at 21 MHz might have allowed it to hit that number at times during gameplay, with appropriate programming...?]


I am just thinking about how 93143 said that it would take over 9 frames to update an entire 8bpp screen, but some Super FX games I know, like Doom, are 8bpp and run at a (somewhat) reasonable speed. Does the Super FX chip have anything to do with the fast DMA transfers, or am I crazy and they actually only run at about 10 frames per second?
Re: Would it be feasable to put everything on one 8bpp BG la
by on (#135121)
SuperFX is replacing the 3Mhz 65816 with a 10Mhz 65816, and later 21Mhz, so that's why games run better and needed it for the complex graphics and 3D math.
Re: Would it be feasable to put everything on one 8bpp BG la
by on (#135125)
That helps with the 3D calculations, but what about transferring data to VRAM?
Re: Would it be feasable to put everything on one 8bpp BG la
by on (#135127)
SNES Doom definitely isn't rendering to the entire screen though, I imagine that gets rid of a rather big chunk of the memory needed. Also stupid question, but in what mode does Doom show the graphics? I mean, it isn't full resolution so in theory scaling in mode 7 would work (thereby reducing memory usage even further), but no idea if Doom is doing that.

EDIT: oh, also I suppose mosaic can be used to make the SNES skip every other pixel, so that'd mean even less bandwidth needed if you can make it transfer into every other byte, right?
Re: Would it be feasable to put everything on one 8bpp BG la
by on (#135131)
Espozo wrote:
I am just thinking about how 93143 said that it would take over 9 frames to update an entire 8bpp screen, but some Super FX games I know, like Doom, are 8bpp and run at a (somewhat) reasonable speed. Does the Super FX chip have anything to do with the fast DMA transfers, or am I crazy and they actually only run at about 10 frames per second?

My "9 frames" number is for a standard full-screen display, where the active region is 256 dots wide and takes up 224 lines out of 262, giving you about (262-224)*(1364-40)/8 = 6,289 bytes of DMA per frame to transfer 57,344 bytes of graphical data. If you want to refresh the entire screen, that's about six and a half frames per second, with visible tearing because the screen is too big for any paging strategy.

Doom seems to run at 216x176, with a 32-pixel-high status display at the bottom reducing the actual rendered area to 216x144. This means that in principle up to (262-176)*1324/8 = 14,233 bytes of DMA per frame is available to transfer 31,104 bytes of graphical data. That's plenty for 20 frames per second, with ample headroom for OAM and CGRAM updates, so you can use 5/3 buffering regardless of frame rate as long as the Super FX has enough RAM to hold up its end (a 30 kB layer still seems a little big to naïvely double buffer if you want anything else on screen)...
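
The two estimates above can be reproduced with a short Python sketch. The constants are the ones used in the post (262 lines per NTSC frame, 1364 master clocks per line, 40 clocks subtracted per line, 8 clocks per DMA byte); the nominal 60 Hz frame rate and all names are assumptions made for this example, and OAM/CGRAM traffic is ignored.

```python
import math

LINES_PER_FRAME = 262        # NTSC scanlines per frame
CLOCKS_PER_LINE = 1364       # master clocks per scanline
LOST_CLOCKS_PER_LINE = 40    # subtracted per line, as in the post
CLOCKS_PER_DMA_BYTE = 8
FRAMES_PER_SECOND = 60       # nominal NTSC rate (assumption)


def dma_bytes_per_frame(active_lines):
    """Bytes that fit into the lines not used for active display."""
    blank_lines = LINES_PER_FRAME - active_lines
    usable_clocks = CLOCKS_PER_LINE - LOST_CLOCKS_PER_LINE
    return blank_lines * usable_clocks // CLOCKS_PER_DMA_BYTE


def update_rate(data_bytes, active_lines):
    """Return (budget per frame, frames per update, updates per second)."""
    budget = dma_bytes_per_frame(active_lines)
    frames_exact = data_bytes / budget
    # A new picture can only be shown on a whole-frame boundary.
    whole_frames = math.ceil(frames_exact)
    return budget, frames_exact, FRAMES_PER_SECOND / whole_frames


full_screen = update_rate(32 * 28 * 64, active_lines=224)
doom_window = update_rate(216 * 144, active_lines=176)

print("full screen:", full_screen)   # budget is 6289 bytes per frame
print("Doom window:", doom_window)   # budget is 14233 bytes per frame
```

Rounding up to whole frames gives 10 frames per full-screen update, slightly worse than the 9.1 the raw division suggests, while the Doom-sized window needs 3 frames per update.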

3gengames wrote:
SuperFX is replacing the 3Mhz 65816 with a 10Mhz 65816, and later 21Mhz, so that's why games run better and needed it for the complex graphics and 3D math.

Uh, the Super FX was a custom RISC chip by Argonaut. The SA-1 was a 65816 at 10.74 MHz, but it was never (so far as I know) upgraded to 21 MHz as the Super FX was.

Also, while I believe Super FX games tended to leave the SNES CPU mostly idle, the SA-1 was designed to cooperate with the CPU rather than outright replace it.

Sik wrote:
in theory scaling in mode 7 would work (thereby reducing memory usage even further), but no idea if Doom is doing that.

I think it's using Mode 3. Actually, I've seen no evidence that the Super FX's PLOT instruction even knows about Mode 7; it seems to be only for bitplanes. You could always render normally in software, I suppose...

Quote:
I suppose mosaic can be used to make the SNES skip every other pixel, so that'd mean even less bandwidth needed if you can make it transfer into every other byte, right?

That's not how bitplanes work, unfortunately. Each byte contains data for all eight pixels in the row.
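
To make the planar layout concrete, here is a minimal Python sketch that encodes one 8x8 tile of packed pixel values into the 64-byte planar form. It assumes the commonly documented SNES 8bpp arrangement (bitplanes stored in pairs 0/1, 2/3, 4/5, 6/7, 16 bytes per pair, leftmost pixel in bit 7), which the post does not spell out; the function name is made up for this example.

```python
def encode_8bpp_tile(pixels):
    """Convert 8 rows of 8 pixel values (0-255) to 64 planar bytes."""
    out = bytearray(64)
    for y, row in enumerate(pixels):
        for plane in range(8):
            byte = 0
            for x, value in enumerate(row):
                if (value >> plane) & 1:
                    byte |= 0x80 >> x          # leftmost pixel lands in bit 7
            # Each pair of planes occupies 16 bytes, interleaved row by row.
            offset = (plane // 2) * 16 + y * 2 + (plane & 1)
            out[offset] = byte
    return bytes(out)


# A tile that is blank except for one fully lit pixel in the top-left corner.
tile = [[0] * 8 for _ in range(8)]
tile[0][0] = 0xFF
encoded = encode_8bpp_tile(tile)

touched = [i for i, b in enumerate(encoded) if b]
print(touched)
```

A single pixel is spread across eight different bytes, one per bitplane, and each of those bytes also carries one bit for the other seven pixels in the row. Skipping every other pixel therefore does not let you skip any bytes.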

As a matter of fact, the VRAM port is two bytes wide, and you can set the word address to increment after writing either the low or high byte. This doesn't really help with the cunning plan I had a while back re: DMAing only the first two bitplanes of a sprite table (if anyone has any great ideas on how to do that without having to reset the DMA every 16 bytes, I'm all ears), but it does mean you can write to either the tilemap or the tiledata in Mode 7 without wasting bandwidth writing to both...
Re: Would it be feasable to put everything on one 8bpp BG la
by on (#135133)
93143 wrote:
I think it's using Mode 3. Actually, I've seen no evidence that the Super FX's PLOT instruction even knows about Mode 7; it seems to be only for bitplanes. You could always render normally in software, I suppose...

The docs say "256-color mode" without ever mentioning which mode it's referring to =/ Seriously, Nintendo's documentation is atrocious.

Also, PLOT only advances forward horizontally (more specifically, after PLOT is executed the X coordinate advances by 1 while the Y coordinate is untouched), so using PLOT would mean having to subtract 1 from R1 and add 1 to R2 after every pixel. I'd imagine PLOT isn't even used in the first place; it'd just be wasteful.

93143 wrote:
That's not how bitplanes work, unfortunately. Each byte contains data for all eight pixels in the row.

Ouch, right, for some reason I was under the impression that 8bpp modes were always packed instead of planar (I know that's definitely the case in mode 7).
Re: Would it be feasable to put everything on one 8bpp BG la
by on (#135145)
Since we had good luck changing BG modes mid-scanline, would it be possible to use force blank during H-blank and use DMA then?
Re: Would it be feasable to put everything on one 8bpp BG la
by on (#135150)
93143 wrote:
Sik wrote:
in theory scaling in mode 7 would work (thereby reducing memory usage even further), but no idea if Doom is doing that.

I think it's using Mode 3.

Correct, Doom for the SNES/SFC uses mode 3. I verified it in SNES9x's debugger. All of Doom's scaling, as I remember it, is purely 100% software; there isn't any "HDMA trickery" or anything else that I know of. Using NO$SNS check out the VRAM, tab "Tiles 8bpp", and watch the window when moving/panning around (even a slight bit) -- all software.
Re: Would it be feasable to put everything on one 8bpp BG la
by on (#135153)
psycopathicteen wrote:
Since we had good luck changing BG modes mid-scanline, would it be possible to use force blank during H-blank and use DMA then?


If that is possible, then wouldn't you be able to DMA twice as much graphics?
Re: Would it be feasable to put everything on one 8bpp BG la
by on (#135154)
Even if forced blanking during horizontal blanking is possible, this would mean no sprites.
Re: Would it be feasable to put everything on one 8bpp BG la
by on (#135158)
tepples wrote:
Even if forced blanking during horizontal blanking is possible, this would mean no sprites.


I don't plan on having any sprites, because I want to put everything on BG1, and I would not even have enough VRAM for a full 8bpp screen and sprites. Just thinking, couldn't you actually update the whole screen in one frame with DMA and several HDMA channels? If I am correct, you would load as much of the graphics as possible for the top of the screen with DMA, and starting at the top of the screen you would also have several HDMA channels loading the graphics for the rest of the screen that DMA does not cover. For every 8 scanlines, you could update 1 row of pixels using HDMA, because each row of pixels in 8bpp mode is 32 bytes and HDMA can transfer 4 bytes per scanline, so 32/4 = 8. Every additional HDMA channel would then decrease the number of scanlines needed for each row of graphics written. (Sorry for my poorly worded and generally confusing post, I just had trouble writing what I wanted to say.)
Re: Would it be feasable to put everything on one 8bpp BG la
by on (#135161)
Not enough bandwidth. You've only got 224 scanlines; with all 8 HDMA channels firing on every line, that's 7,168 bytes. Add that to the 6 kB or so from VBlank, and you're still well short of what would be necessary to update the whole screen in one frame. And since the data is still too big to page, you'll still get tearing.
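
The bandwidth argument above works out as follows in a minimal Python sketch. It takes the post's figures at face value (8 HDMA channels, up to 4 bytes per channel per scanline, roughly 6 kB of VBlank DMA from the earlier estimate) and, like the post, assumes the force blank trick works at all; the names are made up for this example.

```python
import math

ACTIVE_LINES = 224
HDMA_CHANNELS = 8
BYTES_PER_CHANNEL_PER_LINE = 4
VBLANK_DMA_BYTES = 6289            # earlier full-screen VBlank estimate
FULL_SCREEN_BYTES = 32 * 28 * 64   # 8bpp, 32x28 tiles

hdma_bytes = ACTIVE_LINES * HDMA_CHANNELS * BYTES_PER_CHANNEL_PER_LINE
total_per_frame = hdma_bytes + VBLANK_DMA_BYTES
shortfall = FULL_SCREEN_BYTES - total_per_frame
frames_needed = math.ceil(FULL_SCREEN_BYTES / total_per_frame)

print(hdma_bytes, total_per_frame, shortfall, frames_needed)
```

Even with every channel busy on every line, a full-screen update would still be spread over several frames.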

This is all assuming you can actually do this stunt with force blank and HDMA at all...

Your best bet, if you insist on a flat-rendered 8bpp display, is to make it smaller. This simultaneously increases the DMA bandwidth and reduces the amount of data you need to transfer with it. Getting 60 frames per second would require an unreasonably small display, but you can probably use an adequate amount of the screen at 20 frames per second without the HDMA trick, or 30 with it.

...

NB: Everything I've said thus far assumes an NTSC system. The numbers will be different for PAL; somewhat better, though the same general conclusions should hold.
Re: Would it be feasable to put everything on one 8bpp BG la
by on (#135163)
Oh, I was being an idiot... I was thinking that every pixel was 1 bit, not 1 byte. :roll:

93143 wrote:
And since the data is still too big to page, you'll still get tearing


What do you mean by paging and tearing?
Re: Would it be feasable to put everything on one 8bpp BG la
by on (#135165)
Paging means using half the VRAM for the frame being displayed and half for the next frame as it is being loaded into video memory, then switching the VRAM bank once the frame is completely loaded. At least Elite and Tank Demo use a form of paging.

Tearing means showing a picture on the screen that is only partially updated. Tetris for NES has some visible tearing when updating the matrix after a line clear. And during some particularly fast maneuvers, I've even got Zoop for Super NES to visibly tear.
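
Whether a layer can be paged in the sense described above comes down to whether two copies of it fit in the 64 KiB of VRAM. A minimal Python sketch, using the frame sizes quoted earlier in the thread; `can_page` and `other_bytes` are hypothetical names for this example, and the 4096-byte figure is an arbitrary stand-in for tilemaps or other graphics.

```python
VRAM_BYTES = 64 * 1024


def can_page(frame_bytes, other_bytes=0):
    """True if two full copies of the frame plus other data fit in VRAM."""
    return 2 * frame_bytes + other_bytes <= VRAM_BYTES


full_screen = 32 * 28 * 64   # 8bpp, full 256x224 display
doom_window = 216 * 144      # 8bpp, Doom-sized rendered area

print(can_page(full_screen))                     # False: one copy is 56 KiB
print(can_page(doom_window))                     # True, with little to spare
print(can_page(doom_window, other_bytes=4096))   # False once anything else is added
```

Without room for a second copy, the visible frame is overwritten while it is being displayed, which is what produces tearing.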
Re: Would it be feasable to put everything on one 8bpp BG la
by on (#135166)
Yeah, even with general DMA and HDMA combined there isn't enough time in a single frame to do this. You'd have to settle for actual 30fps. This is how and why arcade games with dedicated hardware for massive sprite or BG overlays and/or lots of ROM space were made. Why does this whole thing make me think of Metal Slug? *cringe* :P

Edit: hey, if every pixel was 1 bit (meaning black-and-white quite literally), you could pull it off. Bad Apple demo for SNES. (I'm certain Tepples will latch on to this, LOL)
Re: Would it be feasable to put everything on one 8bpp BG la
by on (#135167)
93143 wrote:
NB: Everything I've said thus far assumes an NTSC system. The numbers will be different for PAL; somewhat better, though the same general conclusions should hold.

Vblank time in PAL is gigantic compared to vblank time in NTSC; it's not just somewhat better, it's a lot better. The obvious downside is the lower framerate =P (to put it simply: if you were to trim 16 pixels from the top and the bottom, you'd be nearly doubling blanking time in NTSC without it really being noticeable on the screen size - it's that bad)

Also on top of all this: how much processing time is left to render to such a bitmap in the first place? I guess this would have to fall back on a coprocessor, because otherwise I don't see how one could reach such a high framerate.