Max colour output - NESdev BBS

Max colour output
by Molive on 2018-09-22 (#226262)

Hey guys.

I've been trying to think about what the limits are on colours on the snes, and I couldn't quite work it out.
Using just colour math, and possibly hdma, what is the limit on the colours per frame on snes?
I worked out maybe 4096 colours was possible, but can you hit the 32768 colour limit?

Thanks,
Molive.

Re: Max colour output
by lidnariq on 2018-09-22 (#226263)

GoodSNES has a "32768 Color Demo by Joshua Cain".

Clever abuse probably could invent a few more colors using the display brightness control, since reportedly that has its own 4 bit DAC that multiplies the RGB channels' 5 bit DACs in analog.

Re: Max colour output
by tepples on 2018-09-22 (#226265)

Combining color math with pseudo-hires can generate (2*sub+main+COLDATA)/4 and (3*sub+main)/4, in theory allowing for even more intermediate colors.

Re: Max colour output
by Molive on 2018-09-22 (#226266)

That 32768 colour demo seems to use hdma to force the same R colour per scanline, which means that it can't be used for arbitrary images :/
Are there other ways?

Re: Max colour output
by tepples on 2018-09-22 (#226267)

The biggest limit you'll hit in full screen is VRAM. An 8 bit per pixel 256x224 pixel image with all unique tiles already takes 58K of the 64K of VRAM: 56K for the tiles and 2K for the map.
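
As a rough check of that budget, here is a minimal sketch in Python. The 32x32 map of two-byte entries is an assumption used for illustration, and `tile_bytes` and `map_bytes` are hypothetical helper names.

```python
# VRAM budget for a full-screen 8bpp image made entirely of unique tiles.
VRAM_BYTES = 64 * 1024


def tile_bytes(width, height, bpp):
    """Bytes of tile data for a width x height image at bpp bits per pixel."""
    return width * height * bpp // 8


def map_bytes(cols=32, rows=32, bytes_per_entry=2):
    """Bytes for one tilemap (assumed: 32x32 entries, two bytes each)."""
    return cols * rows * bytes_per_entry


tiles = tile_bytes(256, 224, 8)
tilemap = map_bytes()
left_over = VRAM_BYTES - tiles - tilemap
print(tiles // 1024, tilemap // 1024, left_over // 1024)  # 56 2 6
```

So roughly 6K is left for everything else once the image and its map are loaded.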

If you letterbox to 256x160, you can go 12 bits per pixel (4 bits per channel) by using color math in mode 3 to add an 8bpp "direct color" image (3 bits red, 3 bits green, 2 bits blue) with a 4bpp low bits image (1 bit red, 1 bit green, 2 bits blue). This takes 60K for the tiles and 4K for the map, leaving nothing for sprites. You can use software sprites. If you want a larger image or some VRAM for hardware sprite tiles (which are limited to 4bpp), you can repeat tiles in the tilemap.
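
The split into a high layer and a low-bits layer can be sketched as below. This only models the bit fields named above (R3G3B2 plus R1G1B2) and checks that adding them back reproduces every R4G4B4 value. Where those fields land in the hardware's 5-bit channels is not modelled, and `split_rgb444` and `recombine` are hypothetical names.

```python
def split_rgb444(r, g, b):
    """Split 4-bit channels into a high layer (R3G3B2) and a low layer (R1G1B2)."""
    high = (r >> 1, g >> 1, b >> 2)
    low = (r & 1, g & 1, b & 3)
    return high, low


def recombine(high, low):
    """Additive blend: put the high bits back in place and add the low bits."""
    (rh, gh, bh), (rl, gl, bl) = high, low
    return ((rh << 1) + rl, (gh << 1) + gl, (bh << 2) + bl)


# VRAM use for the letterboxed case: 8bpp + 4bpp tiles plus two 2K maps.
tile_data = 256 * 160 * (8 + 4) // 8
maps = 2 * 2048
print(tile_data // 1024, maps // 1024)  # 60 4
```

Every one of the 4096 inputs round-trips, which is the property the color-math trick relies on.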

Re: Max colour output
by lidnariq on 2018-09-22 (#226269)

There's the DMA hack that 93143 invented, allowing displaying a 64x224x15bpp image (4x1 pixel blocks).

Basically the SNES analog of the equivalent Genesis DMA hack (that displays 160x224x9bpp).

But honestly, you're going to get perfectly adequate results with just an 8bpp paletted image; you might need to hunt down a good quantizer. (Bisqwit's animmerger includes a wide variety)

Re: Max colour output
by psycopathicteen on 2018-09-22 (#226271)

Under normal circumstances it's 256 colors. It can be up to 32768 colors when using transparency or HDMA tricks.

Re: Max colour output
by dougeff on 2018-09-22 (#226273)

I feel like with color math, you should be able to push an additional 3 bits of color variation onto a max palette of 256.

256x2x2x2 = 2048.

Then adjust the screen brightness 16 times a frame.

2048x16=32768...

actually, screen brightness zero = black so... 15

2048x15+black=30721

Re: Max colour output
by lidnariq on 2018-09-22 (#226274)

That's basically the method that tepples pointed out. Start out with a R3G3B2 image, use color math to add a R1G1B2 layer on top to get R4G4B4. But you run out of space to store pixel data: 65536 bytes is only enough to store 43k pixels at 12bpp. Hence his comment about letterboxing it.
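
The 43k figure falls out of a one-line calculation. A small sketch (it ignores tilemap space, and `max_pixels` is a hypothetical helper):

```python
VRAM_BITS = 64 * 1024 * 8


def max_pixels(bpp):
    """How many pixels fit in VRAM at a given depth, ignoring tilemaps."""
    return VRAM_BITS // bpp


full_screen = 256 * 224
letterboxed = 256 * 160
print(max_pixels(12), full_screen, letterboxed)  # 43690 57344 40960
```

A full 256x224 screen needs more pixels than fit at 12bpp, while 256x160 squeezes in.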

Re: Max colour output
by dougeff on 2018-09-22 (#226275)

Thought. If someone makes a retro console in the future.

5 bits per color is really more than the human eye can differentiate. They should have gone 4 bits per color, and used the extra 4 bits for alpha (transparency). Then they could have eliminated all that color math difficulty.

Re: Max colour output
by psycopathicteen on 2018-09-22 (#226276)

I can still tell shades apart in 15-bit color most of the time.

Re: Max colour output
by lidnariq on 2018-09-22 (#226278)

In the Not-Quite-So-Bad-Old-Days, image viewing software often had an option to dither its output for 15- or 16- bit displays. Color banding is really obvious otherwise.

Re: Max colour output
by tepples on 2018-09-22 (#226279)

As an experiment, I scaled "Nethil and Sarah's bunny", a photo of two plushies, to 256x224 pixels and bit-crushed it using twelve settings: six depths (R5G5B5, R4G4B4, R3G3B3, R3G3B2, 8bpp with optimized palette, and 4bpp with optimized palette) and two dithering types (off or Bayer). I can see the banding on R5G5B5, particularly on the white area in the background.
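
Bit-crushing with and without Bayer dither might look like the sketch below. It is not the software used for the experiment: the 4x4 threshold matrix and the `crush` helper are assumptions chosen for illustration, and it works on one 8-bit channel value at a time.

```python
# Standard 4x4 Bayer index matrix (values 0..15).
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]


def crush(value, bits, x=0, y=0, dither=False):
    """Reduce an 8-bit channel value to `bits` bits, re-expanded to 0..255."""
    levels = (1 << bits) - 1
    scaled = value * levels / 255  # position on the reduced scale
    if dither:
        # Position-dependent rounding threshold in (0, 1).
        threshold = (BAYER_4X4[y % 4][x % 4] + 0.5) / 16
    else:
        threshold = 0.5  # plain rounding
    level = min(int(scaled + threshold), levels)
    return round(level * 255 / levels)


# A flat mid-grey crushed to 1 bit: dithering keeps the average brightness.
block = [crush(128, 1, x, y, dither=True) for y in range(4) for x in range(4)]
print(sum(block) / len(block))  # 127.5
```

Without dither the same flat grey would snap to a single level, which is where the visible banding comes from.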

In my subjective opinion, the 8bpp with optimized palette version looks as good as R4G4B4. The 4bpp version doesn't look quite as good as the R3G3B2 version, but better is possible by choosing different 15-color palettes for different parts of the picture. (My software doesn't do that.)

If you're curious about the lore behind this photo, let me know and I'll explain the back story of Nethil (the one in the yellow shirt).

Re: Max colour output
by 93143 on 2018-09-22 (#226280)

dougeff wrote:

Then they could have eliminated all that color math difficulty.

The N64 tried replacing "color math" with alpha, and look where it got them. They had to shove in additive blending at the last second, and it ended up basically useless anyway because they hadn't bothered to clamp the output of the blender, on the grounds that alpha blending with clamped sources can't overflow. This is why glow effects sucked on the N64.

Also, with alpha blending you still have to worry about what you're blending it with. The SNES could do exactly one blend operation per pixel. By "retro console" I assume you mean something with a linear constant-time renderer like the S-PPU, as opposed to a framebuffer blitter like the RCP; in that context it seems to me you'd have to devote a fair amount of silicon to pixel blending if you wanted to avoid the main screen/subscreen limitation and blend anything with anything else.

I wonder if it would have been feasible to allow individual sprites to be sent to either the main screen or the subscreen (basically having two linebuffers). It's a pretty harsh limitation to never be able to blend sprites with each other at all, ever... It would probably have cost more silicon than they wanted to spend on a feature that was already way beyond anything their competition could do...

psycopathicteen wrote:

I can still tell shades apart in 15-bit color most of the time.

Even with 24-bit colour you can see banding under certain circumstances. The human eye is way better at telling the difference between shades when they're right next to one another.

Re: Max colour output
by Molive on 2018-09-23 (#226285)

Tepples:

That image may be quite good under just 256 colours: I think it looks better even, as you're in a 5-5-5 colour space compared to the 4-4-4.
However if you really wanted to show off the 4-4-4 colour space you'd have to use a really high colour image which ends up using more than 256 colours from that space. Currently the bunny one probably uses fewer than the one in palette mode.

That's why this scene from Overdrive looks so weird, there's a huge colour range in it.

Re: Max colour output
by dougeff on 2018-09-23 (#226286)

Yeah. Even with dither the 4x4x4 picture isn't nearly as good as 5x5x5. I guess I was wrong.

Re: Max colour output
by lidnariq on 2018-09-23 (#226293)

Molive wrote:

However if you really wanted to show off the 4-4-4 colour space you'd have to use a really high colour image which ends up using more than 256 colours from that space. Currently the bunny one probably uses fewer than the one in palette mode.

The bunny picture, when scaled down to 256x224 and truncated to 15bpp, has 2k unique colors in it. Yes, that's "only" 1/16th of the full gamut. It's also 8x256.

From the other extreme, you can look at the decade of PC demo effects that used the VGA (and SVGA)'s paletted video modes to know that wide breadths of hues don't have any correlation with color depth. In fact, the biggest problem is huge varieties of desaturated colors, nothing like the garish image there from Overdrive 1.

Quite frankly, I've put a lot of effort into figuring out how to get good photographs displayed on the SNES, and the 8bpp paletted mode is close to 99% of the image quality for 5% of the effort and 2% compromise. Dithering is required regardless of whether it's an 8bpp palette or 15bpp directcolor.

Re: Max colour output
by Drew Sebastino on 2018-09-23 (#226300)

This is hypothetical and thus not useful, but the sad thing is, if the SNES had 128KB of VRAM as originally intended, the S-PPU could have been made to support something like a 16bpp bitmap mode, as it has just enough graphics bandwidth. (Although you'd only have 8KB for sprites, assuming the bitmap is 256x224.)

Edit: 16KB for sprites; I forgot that while you'd only have 1/8 of total VRAM, you would have 128KB instead of 64KB.

93143 wrote:

It would probably have cost more silicon than they wanted to spend on a feature that was already way beyond anything their competition could do...

Including everything but the most powerful arcade machines even; it's weird how feature packed the SNES was in comparison to most everything else for how starved for bandwidth it is...

psycopathicteen wrote:

I can still tell shades apart in 15-bit color most of the time.

Yup; just look at how the sky gradient in the first level of DKC jitters up and down for blending. 12-bit color is much better than 9-bit color, but even it can be pretty ugly. Just look at how the only upgrade besides sound hardware that Capcom made with the CPS2 was upgrading the color from 12-bit to 16-bit. (12-bit RGB with an additional 4 bits of brightness; probably easier to implement or something...)

Re: Max colour output
by Molive on 2018-09-23 (#226311)

Ah, sorry lidnariq, I didn't know. Thanks for telling me.
That makes a lot of sense, and I think I'll do most of what I need in 8bpp, with some in mode 5 and like one or two really high colour images.

Thanks guys,
Molive

Re: Max colour output
by 93143 on 2018-09-24 (#226314)

You can of course change up to 8 arbitrary CGRAM entries per scanline with HDMA. A few years ago I wrote a scheduler in Matlab that starts with the top 256 colours (or fewer if I need space for something else) and goes down the picture changing palette entries that aren't currently needed when out-of-palette colours show up. If it can't find a free HDMA slot, or a single line has too many colours for the palette, it fails and I have to go re-quantize to a lower total colour count and try again. Obviously it would be better to have a tool that simultaneously optimizes the image and schedules the HDMA, but that seems like a lot of work...
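
A greedy scheduler along those lines could be sketched as follows. This is not the Matlab tool described above. The input format (one set of colours per scanline), the first-appearance initial palette, and the "evict whatever is needed furthest in the future" rule are all assumptions made for illustration.

```python
def schedule_palette(lines, palette_size=256, max_changes=8):
    """Greedy per-scanline palette schedule.

    lines: list of sets, the colours each scanline needs.
    Returns (initial_palette, changes), where changes[y] lists the
    (slot, colour) writes applied before line y, or None on failure.
    """
    # Initial palette: colours in order of first appearance from the top.
    palette = []
    for needed in lines:
        for colour in sorted(needed):
            if colour not in palette and len(palette) < palette_size:
                palette.append(colour)
    initial = list(palette)

    def next_use(colour, start):
        """First line at or after `start` that needs `colour`."""
        for y in range(start, len(lines)):
            if colour in lines[y]:
                return y
        return len(lines)  # never used again

    changes = []
    for y, needed in enumerate(lines):
        missing = sorted(needed - set(palette))
        if len(missing) > max_changes:
            return None  # more new colours than per-line writes
        writes = []
        for colour in missing:
            # Only slots whose colour is not needed on this line may be reused.
            free = [s for s, c in enumerate(palette) if c not in needed]
            if not free:
                return None  # the line needs more colours than the palette holds
            slot = max(free, key=lambda s: next_use(palette[s], y + 1))
            palette[slot] = colour
            writes.append((slot, colour))
        changes.append(writes)
    return initial, changes


demo = schedule_palette([{1, 2}, {2, 3}, {3, 4}, {1, 4}], palette_size=2, max_changes=1)
print(demo)
```

In the toy run a two-entry palette ends up showing four colours. A `None` result corresponds to the "re-quantize to a lower colour count and try again" case.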

With the above method, I managed 417 colours in a photograph and nearly 600 in a converted title screen backdrop. IMO both looked much closer to the 15-bit RGB version than to the 256-colour version. Performance depends heavily on the material, and also on the quantizer and its settings. IIRC it tends to perform worse on dithered images, which makes sense. I could probably improve the scheduling algorithm if I tried, but I'm really busy and it's good enough for my purposes...

This is admittedly a lot of effort for a modest gain. I developed this method because I'm working on a project that involves a small number of high-colour images, and I wanted them to look as good as possible.

...

Note that the quantizer can make a big difference in the quality of the picture even when just using 256 colours. I use a free standalone program called "Color quantizer" for most things because if you tweak it right you can usually get better results than with the GIMP. There may be a better program out there that I'm unaware of or don't own (Photoshop?).

Also note that, at least in the software I'm used to, reducing the bit depth to 555 from 888 is a separate operation from quantizing to a certain palette size, and your results will vary depending on which one you do first (and which one you do with dither - Color quantizer can load custom .pal files, so you can use its features when bit-reducing to 555 by loading a palette file that corresponds to 15-bit RGB). If you do palettization first, you may end up wasting a lot of palette space on a bunch of colours that end up identical to each other once bit-reduced; if you do the bit reduction first, you have to either dither at the same time (meaning you'll be palettizing a pre-dithered image) or get stuck with banding. And then you may have to bit-reduce again to make sure your image is still 555 (Color quantizer allows you to prevent colour mixing when performing adaptive quantization, but this can harm image quality). If your software can bit-reduce, dither, and palettize all at the same time, tell me what you're using and where I can get some.
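
The "wasted palette space" problem is easy to demonstrate. The sketch below assumes bit reduction is a plain truncation of each 8-bit channel to its top 5 bits, which may not match what a given tool does. `to_555` and `wasted_entries` are hypothetical names.

```python
def to_555(colour):
    """Truncate an (r, g, b) triple of 8-bit channels to 5 bits each."""
    return tuple(c >> 3 for c in colour)


def wasted_entries(palette_888):
    """Palette entries that become duplicates once reduced to 15-bit RGB."""
    return len(palette_888) - len({to_555(c) for c in palette_888})


palette = [(255, 0, 0), (250, 0, 0), (0, 8, 0), (0, 15, 0), (0, 16, 0)]
print(wasted_entries(palette))  # 2
```

Five distinct 24-bit entries collapse to three 15-bit ones here, so two palette slots would be spent on nothing.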

...

It should also be possible to change at least 15 and possibly as many as 20 colours per line with regular DMA in an H-IRQ. But since the palette indices have to be sequential, it's less flexible than HDMA unless you're using 4bpp [cough]Mode5[/cough]. (Sadly, it is not possible to use HDMA to change more than 8 colours per line regardless of their arrangement in CGRAM, because there is no mode that sends four bytes to one address.)

Re: Max colour output
by Señor Ventura on 2018-09-26 (#226402)

Drew Sebastino wrote:

This is hypothetical and thus not useful, but the sad thing is, if the SNES had 128KB of VRAM as originally intended, the S-PPU could have been made to support something like a 16bpp bitmap mode, as it has just enough graphics bandwidth. (Although you'd only have 8KB for sprites, assuming the bitmap is 256x224.)

Edit: 16KB for sprites; I forgot that while you'd only have 1/8 of total VRAM, you would have 128KB instead of 64KB.

Tiles of 128 bytes, for a machine that ends up with less than 6KB of bandwidth (6.32KB, but WRAM, OAM, CRAM), seem a little insufficient to me.

All that stuff is a big domino effect. If you want a 65816 at a higher frequency, you need WRAM at a higher frequency, and the rest of the hardware working accordingly too.

128KB of VRAM implies the need for more bandwidth, and in this architecture you can't get that without higher frequencies across the whole pipeline (cpu->wram->vram).

Suddenly I feel the need to open a thread about this topic (in terms of hardware and its availability at the time).

Re: Max colour output
by creaothceann on 2018-09-26 (#226406)

Wasn't the restriction to 2.68MHz because of the slow ROM chips of the day + the 8-bit data bus?

A 16-bit data bus is needed, but that doesn't seem possible with a stock 65c816 CPU.

Re: Max colour output
by Señor Ventura on 2018-09-26 (#226420)

creaothceann wrote:

Wasn't the restriction to 2.68MHz because of the slow ROM chips of the day + the 8-bit data bus?

Yes, presumably the WRAM forces the DMA to adapt its frequency to 2.68MHz; I don't know if the DMA actually works at 3.58MHz as stock.

creaothceann wrote:

A 16-bit data bus is needed, but that doesn't seem possible with a stock 65c816 CPU.

The SA-1 in cartridges has a 16-bit data bus to communicate with the cartridge's own internal RAM.

Probably an SA-1 inside a SNES could have had a 16-bit data bus to communicate with the WRAM, but I don't know if it would alter the rest of the DMA buses, or the continuity of the bus with the rest of the components.

Re: Max colour output
by CypherSignal on 2018-09-26 (#226422)

93143 wrote:

You can of course change up to 8 arbitrary CGRAM entries per scanline with HDMA. A few years ago I wrote a scheduler in Matlab that starts with the top 256 colours (or fewer if I need space for something else) and goes down the picture changing palette entries that aren't currently needed when out-of-palette colours show up. If it can't find a free HDMA slot, or a single line has too many colours for the palette, it fails and I have to go re-quantize to a lower total colour count and try again. Obviously it would be better to have a tool that simultaneously optimizes the image and schedules the HDMA, but that seems like a lot of work...

I actually got a pretty decent processor going with several days of work that does that harmonious quantization you're wondering about.

The idea I went with was to originally quantize images based on bucketing all of the pixels, determining a bounding volume across the RGB 'axes', and then splitting the bucket with the widest bounding volume into two buckets. Repeat until the max palette size is hit. More or less how most modern palettizing processes operate these days.
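
The base quantizer described above resembles a median-cut scheme. A minimal sketch, assuming the split happens at the median along the widest axis and that each bucket becomes its per-channel mean (neither detail is stated in the post, and all names are hypothetical):

```python
def bounding_extent(bucket):
    """Return (widest range, axis index) over the R, G, B axes of a bucket."""
    extents = [
        max(p[a] for p in bucket) - min(p[a] for p in bucket) for a in range(3)
    ]
    widest = max(extents)
    return widest, extents.index(widest)


def split_bucket(bucket, axis):
    """Split a bucket at the median along one axis (assumed split rule)."""
    ordered = sorted(bucket, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return ordered[:mid], ordered[mid:]


def quantize(pixels, max_colours):
    """Split the widest bucket repeatedly until max_colours buckets exist."""
    buckets = [list(pixels)]
    while len(buckets) < max_colours:
        # Only buckets with some colour spread are worth splitting.
        candidates = [b for b in buckets if bounding_extent(b)[0] > 0]
        if not candidates:
            break
        target = max(candidates, key=lambda b: bounding_extent(b)[0])
        buckets.remove(target)
        buckets.extend(split_bucket(target, bounding_extent(target)[1]))
    # One palette colour per bucket: the per-channel mean.
    return [tuple(sum(p[a] for p in b) // len(b) for a in range(3)) for b in buckets]


pixels = [(0, 0, 0), (0, 0, 0), (255, 0, 0), (255, 0, 0)]
print(quantize(pixels, 2))  # [(0, 0, 0), (255, 0, 0)]
```

Asking for more colours than the image contains simply stops early, because no remaining bucket has any spread left to split.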

The extension to support HDMA I did was to continue splitting buckets, sort of treating the scanline as an additional axis to split on:

-Once the number of buckets is more than the size of the base palette, identify the first and last scanline that each bucket of pixels has
-Across the current list of buckets, identify all potential HDMA candidates by seeing if one bucket's first scanline was greater than some other bucket's last scanline, flagging each one as such
-If the number of buckets that don't get roped in via HDMA is less than the max palette size, then split based on colour, and start again, to re-evaluate potential HDMA candidates (i.e. splitting on colour may have resulted in new HDMA possibilities)
-If we're at the max palette size, though, then over the full list of buckets, find one with a gap of scanlines where a pixel does not exist, and find some other bucket that could be a candidate for hdma, e.g. checking the following

Code:

   // check if hdmaCandidate's final scanline is before our first possible scanline
   // this only takes into account bucket pairs that are like so:
   // -|||-------|||----- <- Bucket - split this along the scanline
   // ------|||---------- <- hdmaCandidate?

   // check if hdmaCandidate's first non-gap sequence (that we can see) is within 
   // bucket's gap. this takes into account bucket pairs like so:
   // -|||-------|||----- <- Bucket 
   // ------|||------|||- <- hdmaCandidate?

...if an hdmaCandidate was found, then the bucket being evaluated is split across the scanline. Afterwards, start this loop again.

And repeat until no more hdma candidates are discovered.
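
The scanline bookkeeping in those steps can be sketched as below. This is an illustration and not the code from the tool: buckets are reduced to the set of scanlines they touch, only the first of the two diagrammed cases is checked, and every name is hypothetical.

```python
def scanline_span(rows):
    """First and last scanline on which a bucket has pixels."""
    return min(rows), max(rows)


def hdma_candidates(bucket_rows):
    """Names of buckets that start after some other bucket has already ended.

    bucket_rows: dict mapping a bucket name to its set of scanlines.
    """
    spans = {name: scanline_span(rows) for name, rows in bucket_rows.items()}
    flagged = set()
    for name, (first, _) in spans.items():
        if any(o != name and last < first for o, (_, last) in spans.items()):
            flagged.add(name)
    return flagged


def largest_gap(rows):
    """Longest run of unused scanlines strictly inside a bucket's span, or None."""
    ordered = sorted(rows)
    best = None
    for a, b in zip(ordered, ordered[1:]):
        gap = (a + 1, b - 1)
        if b - a > 1 and (best is None or gap[1] - gap[0] > best[1] - best[0]):
            best = gap
    return best


def fits_in_gap(candidate_rows, gap):
    """First diagrammed case: the candidate lives entirely inside the gap."""
    first, last = scanline_span(candidate_rows)
    return gap is not None and gap[0] <= first and last <= gap[1]


bucket = {1, 2, 3, 11, 12, 13}   # -|||-------|||-----
candidate = {6, 7, 8}            # ------|||----------
print(largest_gap(bucket), fits_in_gap(candidate, largest_gap(bucket)))
```

When `fits_in_gap` holds, the bucket is the one that would be split along the scanline so the candidate can borrow its palette slot in between.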

It's still a little slower than I'd like, in particular due to one chunk of the algo that basically runs in O(n^3) with the number of buckets, but with maxcolours of 256 and maxhdmachannels of 8, I'm able to run it over the Kodak image library at an average of ~1.9 seconds per img (single-threaded; ofc when processing them in bulk, it's trivial to parallelize). Also, the calculation of the colour deltas for buckets could be a lot better. The colour deltas are all computed in 8b per channel, not 5b, which is why even the "0 hdma" case shown below doesn't actually have 256 colours in use: there were apparently some duplicates after final quantization, so we must have done some bucket splits that were redundant and would have been better spent elsewhere.

The following PSNR values are compared against versions of the source images that were scaled down to SNES' max height or width, corrected for the different pixel size, and direct quantization to R5G5B5. The colour counts are all measuring unique quantized colours as well; there's almost assuredly some cases where one colour may get HDMA'd out and then get HDMA'd in again, but that still counts as one colour.
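
For reference, a PSNR measured against an R5G5B5-quantized source can be computed roughly like this. The sketch assumes truncation to 5 bits with bit replication back to 8 bits, and a peak value of 255. The tool's actual conventions are not stated here, so treat those as guesses.

```python
import math


def quantize_555(value):
    """Truncate an 8-bit channel to 5 bits and expand it back to 0..255."""
    v5 = value >> 3
    return (v5 << 3) | (v5 >> 2)


def psnr(reference, output, peak=255):
    """PSNR in dB between equal-length, non-empty lists of (r, g, b) pixels.

    The reference is quantized to R5G5B5 on the fly; the output is
    taken as-is.
    """
    total = 0
    for ref, out in zip(reference, output):
        for a, b in zip(ref, out):
            total += (quantize_555(a) - b) ** 2
    mse = total / (3 * len(reference))
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


print(round(psnr([(0, 0, 0)], [(1, 1, 1)]), 2))  # 48.13
```

Quantizing the reference during the comparison mirrors the description above of measuring against a directly quantized R5G5B5 version of the source.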

and some of the more notable comparisons (left-to-right: 'original', 15bpp quantized original, 8-channel hdma, 0-channel hdma):

#03:

#15:

#23:

If anyone is interested, I'm willing to post the source code up somewhere for perusal. There are some bits of ISPC code and the aforementioned parallelism is provided via Msft's ConcRT library, so it's not super platform agnostic, but if anyone is interested in a different take on this problem, I'd be happy to oblige.

Re: Max colour output
by lidnariq on 2018-09-26 (#226424)

Is the "original" truncated to 15bpp? Or just scaled down to 256x224?

Re: Max colour output
by CypherSignal on 2018-09-26 (#226425)

Those "originals" are still at 24bpp, so you can't directly check my math on that

I don't have the 15bpp versions handy, as the PSNR-calc just does the quantization against the original at the same time whilst it is comparing against the output image data.

Re: Max colour output
by lidnariq on 2018-09-26 (#226427)

I mean, for quantization purposes that's clearly correct. But I was just hoping for something for visual comparison

Re: Max colour output
by Señor Ventura on 2018-09-26 (#226430)

Is that 15bpp really possible on the SNES? How many KB does it occupy?

Re: Max colour output
by CypherSignal on 2018-09-26 (#226431)

Señor Ventura wrote:

Is that 15bpp really possible on the SNES? How many KB does it occupy?

The 15bpp images are only provided for reference/comparison, as per lidnariq's request.

Re: Max colour output
by Señor Ventura on 2018-09-26 (#226432)

CypherSignal wrote:

Señor Ventura wrote:

Is that 15bpp really possible on the SNES? How many KB does it occupy?

The 15bpp images are only provided for reference/comparison, as per lidnariq's request.

So the two SNES ones are the two on the right, right?

Re: Max colour output
by Drew Sebastino on 2018-09-26 (#226434)

The SNES ones are the two on the bottom; the bottom left is with 8 channel HDMA, the one in the bottom right is without.

Fantastic job by the way; it looks closer to the 15bpp version for the most part. The shadow of the face is the only thing that looks noticeably poor. With the 8bpp version, I'd take the palette given and sacrifice some other color in order to have an extra shade of brown. It's a shame you can't really do that with the HDMA image.

Re: Max colour output
by CypherSignal on 2018-09-26 (#226437)

Drew Sebastino wrote:

With the 8bpp version, I'd take the palette given and sacrifice some other color in order to have an extra shade of brown. It's a shame you can't really do that with the HDMA image.

Yeah, it is ironic that the HDMA allows for a ton of little details and small colours to come out, meanwhile huge splotches of colour banding are left relatively unfettered, only improving through more headroom provided via the HDMA stuff. Across the rest of the Kodim examples, sky gradients and other face shots tend to suffer as well. The easiest solution is probably just throw some simple dithering in there late in the process and call it a day, but for my own purposes, it's not a priority anyway - the next big target in my sights for this is to support 16c tiled output for mode 1/3 display, similar to what Khaz discussed in the thread that 93143 linked to, but with improved processing time and expanding usable color palettes through HDMA.

Re: Max colour output
by Señor Ventura on 2018-09-26 (#226439)

Drew Sebastino wrote:

The SNES ones are the two on the bottom; the bottom left is with 8 channel HDMA, the one in the bottom right is without.

Fantastic job by the way; it looks closer to the 15bpp version for the most part. The shadow of the face is the only thing that looks noticeably poor. With the 8bpp version, I'd take the palette given and sacrifice some other color in order to have an extra shade of brown. It's a shame you can't really do that with the HDMA image.

So we are talking about more KB of VRAM than is dedicated to backgrounds. It has to use sprites too, right?

Re: Max colour output
by lidnariq on 2018-09-26 (#226440)

CypherSignal wrote:

the next big target in my sights for this is to support 16c tiled output for mode 1/3 display, similar to what Khaz discussed in the thread that 93143 linked to, but with improved processing time and expanding usable color palettes through HDMA.

One of my first idle thoughts in response to your first post was wondering how a 512x224 16-color image (i.e. the simpler problem of ignoring all palettes beyond the first) would work out.

Re: Max colour output
by CypherSignal on 2018-09-26 (#226441)

lidnariq wrote:

One of my first idle thoughts in response to your first post was wondering how a 512x224 16-color image (i.e. the simpler problem of ignoring all palettes beyond the first) would work out.

16-colours across the entire image? It's fairly unremarkable. Across the entire kodim set, there are zero to two colours added through any use of HDMA. e.g.

The way it's set up right now, with only 16 buckets of colours, they will all likely cover a very large portion of the screen and have overlapping scanline coverage, and there will be few opportunities for a colour to be evicted, and few slots for colours to jump back in.

Re: Max colour output
by Drew Sebastino on 2018-09-26 (#226442)

@Señor Ventura What are you trying to say? Both SNES images take the same amount of memory in VRAM; 56KB excluding the tilemap. Backgrounds can use 1024 unique tiles, which for an 8bpp background, is the entire 64KB of VRAM.

The problem I've seen with image palettization (?) programs is that they don't distribute color based off screen area well enough, or something like that. You'll see a small object get as many palette entries dedicated to it as an entire backdrop because it has more contrast. I'm not really sure how it works.

Re: Max colour output
by 93143 on 2018-09-26 (#226443)

CypherSignal wrote:

93143 wrote:

Obviously it would be better to have a tool that simultaneously optimizes the image and schedules the HDMA, but that seems like a lot of work...

I actually got a pretty decent processor going with several days of work that does that harmonious quantization you're wondering about.

Well, this is certainly a nice surprise to wake up to...

Quote:

If anyone is interested, I'm willing to post the source code up somewhere for perusal.

I'm interested. Though I'm pretty busy right now, so I probably won't get to it right away...

CypherSignal wrote:

The easiest solution is probably just throw some simple dithering in there late in the process and call it a day

You'd have to reference the 24-bit original, I assume.

I wonder if the way the SNES handles the video signal would cause issues with extensive checkerboard-type dither, as is sometimes seen in NES games; perhaps a more random style like Floyd-Steinberg might produce better results...

CypherSignal wrote:

16-colours across the entire image? It's fairly unremarkable. Across the entire kodim set, there are zero to two colours added through any use of HDMA.

Hmm. I wonder if there's a way around this...

Aside from the full-screen 8bpp multichannel stuff, I've got a bunch of smaller high-colour images that need to look good in single-palette 4bpp with only one HDMA channel, and iterative static quantization is not an ideal solution when resources are that thin. I've done a small one-channel 4bpp image by hand/eye, and it works great (32 unique colours in a 28-line range, and looks lovely), but as one might imagine it is somewhat tedious, and I wasn't looking forward to doing a whole lot more of it...

Drew Sebastino wrote:

The SNES ones are the two on the bottom

I suspect you're using a narrower screen than some of us. I see all four in a row.

lidnariq wrote:

One of my first idle thoughts in response to your first post was wondering how a 512x224 16-color image (i.e. the simpler problem of ignoring all palettes beyond the first) would work out.

Although I haven't actually tried it, it certainly seems like you should be able to use ordinary DMA to change out a whole 4bpp palette each line. You might have to watch the timing, maybe even use a jitter reduction technique in the H-IRQ if you need to run code at the same time, but it should fit.

On the other hand, if you were to quantize each line separately it might look funny because the lines would be uncoupled... How did DreamGrafix handle this?

Re: Max colour output
by CypherSignal on 2018-09-26 (#226447)

As an aside, on a lark I also fed the Wii kids image you had through my processor, and got this:

...which looks okay. It's similar to what your script generates - arguably worse because of some large gradients getting messed up (e.g. the wall on the left and the table in the background have more visible bands than your versions), but funnily enough, the final tally for # of colours ended up being a new record across anything I tested: 1265 (PSNR of 32.12 dB, fwiw)

...but that was so many colours that it identified a mild issue where, because I'm not doing any compression, the total binary data size of the map, clr, tile, and 8 channels of HDMA data being thrown at the assembler totals up to 65.1KB, so it can't fit in one bank anymore :lol:

Thankfully an easy fix because it's easy enough to cap the number of buckets to a lower number, but I was a bit stunned because I never bothered running the math of, "what's the total binary size of everything maxed out?"

Re: Max colour output
by Señor Ventura on 2018-09-26 (#226451)

Drew Sebastino wrote:

@Señor Ventura What are you trying to say? Both SNES images take the same amount of memory in VRAM; 56KB excluding the tilemap. Backgrounds can use 1024 unique tiles, which for an 8bpp background, is the entire 64KB of VRAM.

But the PPU1 doesn't give more than 45KB for backgrounds, so I don't get why it can use 64KB for a background :?:

Re: Max colour output
by CypherSignal on 2018-09-27 (#226475)

Well, did the big optimization I wanted to do, so processing is back down to ~200ms per image at 256c/8hdma. There's still a couple other big hotspots of activity that could be tightened up, but I'm pretty okay with it as-is.

If anyone wants to look over the code or try it yourself, it's hosted at https://github.com/CypherSignal/background-processor (and the code you're probably interested in the most is over in https://github.com/CypherSignal/backgro ... rocess.cpp )

I've got some other things I have to take care of so I won't be getting back to it to work on Background Mode 1/2-style output for awhile.

Señor Ventura wrote:

But the PPU1 doesn't give more than 45KB for backgrounds, so I don't get why it can use 64KB for a background :?:

Hmm, I'm not sure where you ever got 45KB from. The total RAM available for the PPU is 64KB in size, and all background, sprite, and tile data can be addressed in that space without any restriction.

Re: Max colour output
by Señor Ventura on 2018-09-27 (#226504)

CypherSignal wrote:

Hmm, I'm not sure where you ever got 45KB from. The total RAM available for the PPU is 64KB in size, and all background, sprite, and tile data can be addressed in that space without any restriction.

I've read from some programmers that the SNES has a set amount of memory for sprites and a set amount for backgrounds, so it is preallocated.

Re: Max colour output
by CypherSignal on 2018-09-27 (#226505)

Señor Ventura wrote:

I've read from some programmers that the SNES has a set amount of memory for sprites and a set amount for backgrounds, so it is preallocated.

Ah. That information may pertain more to game-specific memory budgets, then - if you were doing modifications to an existing game like Super Mario World, that would be useful information. Tile data for sprites and backgrounds have to share the same 64KB of space, and for all intents and purposes, a developer would not want to mix the memory used for sprites and bg's. So, a developer would have to make a choice at some point to declare that they want to use some portion of the available VRAM for sprites (or some types of sprites - player sprites may have more memory allocated than sprites for enemies, for example) and some portion for backgrounds.

However, the examples I'm talking about here are independent of any game, and are just focused on displaying a single image. In this case, the entire VRAM is available to work with, and there are no restrictions on how memory can be allocated one way or another.

Re: Max colour output
by lidnariq on 2018-09-27 (#226508)

CypherSignal wrote:

16-colours across the entire image? It's fairly unremarkable. Across the entire kodim set, there are zero to two colours added through any use of HDMA. [...] The way it's set up right now, with only 16 buckets of colours, they will all likely cover a very large portion of the screen and have overlapping scanline coverage, and there will be few opportunities for a colour to be evicted, and few slots for colours to jump back in.

Hm, that's disappointing.

93143 wrote:

On the other hand, if you were to quantize each line separately it might look funny because the lines would be uncoupled... How did DreamGrafix handle this?

The last time tomaitheous asked, I came up with this quick and dirty hack and he was rightly disappointed in it.

On the other hand, coming back to this after another 3 years I can see trivially how to do subpalette generation more easily (namely, don't use ppmquant, instead use pnmcolormap on the colorspace-reduced reference, and use pnmremap from the highcolor original, with floyd-steinberg dithering)

Comparing this:

[Attachment: rgb9si.png]

to the above linked attempt: functioning dithering makes horizontal banding much less obvious. (Simulated DAC here has 9bpp, just like the PC Engine).

And here's the same technique applied to kodim23:

[Attachment: kodim512x224_4_of_15_aspect_corrected.jpg (File comment: #23)]

(Image was simulated at 512x224, 15bpp, scaled vertically nearest-neighbor; scaled horizontally cubic)

Re: Max colour output
by Señor Ventura on 2018-09-27 (#226512)

CypherSignal wrote:

Señor Ventura wrote:

I've read from some programmers that the SNES has a set amount of memory for sprites and a set amount for backgrounds, so the split is preselected.

Ah. That information may pertain more towards game-specific memory budgets, then - if you were doing modifications to an existing game like Super Mario World, that would be useful information. Tile data for sprites and backgrounds have to share the same 64KB of space, and for all intents and purposes, a developer would not want to mix the memory used for sprites and bg's. So, a developer would have to make a choice at some point to declare that they want to use some portion of the available VRAM for sprites (or some types of sprites - player sprites may have more memory allocated than sprites for enemies, for example) and some portion for backgrounds.

However, the examples I'm talking about here are independent of any game, and are just focused on displaying a single image. In this case, the entire VRAM is available to work with, and there are no restrictions on how memory can be allocated one way or another.

No, no, I mean that I've read from programmers that the SNES has a delimited, preselected amount of memory for sprites and backgrounds.

It was a demo of Gunstar Heroes for the SNES, and the programmer said that it could be impossible on that machine because the original game uses more memory for sprites than the SNES dedicates.

P.S.: The demo no longer seems to be on YouTube.

Edit: Sorry for going off topic; I only wanted to mention that.

Re: Max colour output
by 93143 on 2018-09-27 (#226514)

Señor Ventura wrote:

No, no, I mean that I've read from programmers that the SNES has a delimited, preselected amount of memory for sprites and backgrounds.

It doesn't. You misunderstood, or else the person who said it was wrong.

Quote:

It was a demo of Gunstar Heroes for the SNES, and the programmer said that it could be impossible on that machine because the original game uses more memory for sprites than the SNES dedicates.

The SNES allows 16 KB for sprites at any one time. That's in two 8 KB chunks that can be anywhere in VRAM (in fact you can change where they are between frames or even between scanlines), and they can overlap BG data.

[Was this a Treasure programmer or just a random dude? I'd want to see an example of what they were talking about before I conclude that Gunstar Heroes on SNES was really impossible. 16 KB isn't exactly tiny (it's over half a screen of completely unique tiles, equivalent to 20 KB on a Mega Drive in H40 mode), and a lot of situations where you'd run out of room are amenable to workarounds.]
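The "over half a screen" and "20 KB on a Mega Drive" figures can be checked with a little arithmetic. The sketch below assumes 8x8 tiles at 4bpp (32 bytes each) on both machines, a 256x224 SNES screen and a 320x224 Mega Drive H40 screen; the variable names are made up for illustration.

```python
# Rough check of the sprite budget comparison, assuming 8x8 tiles at 4bpp.
KB = 1024
BYTES_PER_4BPP_TILE = 8 * 8 * 4 // 8        # 32 bytes per tile

def tiles_on_screen(width, height):
    """Number of 8x8 tiles needed to cover a screen of the given size."""
    return (width // 8) * (height // 8)

snes_sprite_tiles = 16 * KB // BYTES_PER_4BPP_TILE   # tiles that fit in 16 KB
snes_screen_tiles = tiles_on_screen(256, 224)        # SNES screen
md_screen_tiles = tiles_on_screen(320, 224)          # Mega Drive, H40 mode

# Fraction of a SNES screen that the sprite budget covers with unique tiles.
screen_fraction = snes_sprite_tiles / snes_screen_tiles

# The same fraction of a Mega Drive H40 screen, expressed in bytes.
md_equivalent_tiles = snes_sprite_tiles * md_screen_tiles // snes_screen_tiles
md_equivalent_bytes = md_equivalent_tiles * BYTES_PER_4BPP_TILE

print(snes_sprite_tiles, snes_screen_tiles, round(screen_fraction, 3))
print(md_equivalent_bytes // KB, "KB")
```

Under these assumptions, 16 KB holds 512 tiles out of the 896 on a SNES screen, and the same share of an H40 screen comes to 20 KB.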

Each BG layer gets 1024 tiles (which is 16, 32, or 64 KB depending on bit depth) which are contiguous in VRAM and can be placed anywhere. Tileset regions for different layers can overlap one another, just like sprites can overlap BG data, and all of this can overlap tilemaps. There are no mutual exclusion restrictions in SNES VRAM mapping, and all of it is controlled by writable registers instead of hardwired.
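As a quick illustration of those sizes, here is a minimal sketch that computes the footprint of a 1024-tile set at each bit depth and tests whether two VRAM regions overlap. The helper names and the example addresses are hypothetical; the hardware does not forbid any of these overlaps, so the check is only a planning aid.

```python
KB = 1024

def tileset_bytes(bits_per_pixel, tiles=1024):
    """Size in bytes of a set of 8x8 tiles at the given bit depth."""
    return tiles * 8 * 8 * bits_per_pixel // 8

def regions_overlap(start_a, size_a, start_b, size_b):
    """True if two byte ranges in VRAM share at least one byte."""
    return start_a < start_b + size_b and start_b < start_a + size_a

# 2bpp, 4bpp and 8bpp tilesets of 1024 tiles each.
sizes_kb = {bpp: tileset_bytes(bpp) // KB for bpp in (2, 4, 8)}
print(sizes_kb)

# Hypothetical layout: a 4bpp BG tileset at byte 0 and an 8 KB sprite chunk
# placed inside it. Allowed by the hardware; the data is simply shared.
bg_start, bg_size = 0x0000, tileset_bytes(4)
sprite_start, sprite_size = 0x6000, 8 * KB
print(regions_overlap(bg_start, bg_size, sprite_start, sprite_size))
```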

The only exception to the no-hardwiring rule is Mode 7, where the interleaved graphics/map data is hardwired to the bottom of VRAM and cannot be moved. This is why it's tricky (though IMO not impossible) to do a 2-player F-Zero game - Super Mario Kart just places both players on the same map, but F-Zero moves too fast to just use a static map for the whole race without zooming in too far and making it stupidly chunky and wobbly. But even with Mode 7, nothing says you can't use part of the Mode 7 region for sprites - I do this in my shmup port because I'm using 40 KB of mixed sprite and Mode 1 tileset/tilemap data (in fact I have to switch sprite data locations partway down the screen, which is how I know it's possible) and Mode 7 covers the entire bottom 32 KB of VRAM. Just gotta make sure the part of the map you've repurposed doesn't show up on screen...
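The 32 KB figure and the need to share the Mode 7 region can be worked out as follows. This is a sketch that assumes the usual Mode 7 layout of a 128x128 one-byte tilemap plus 256 tiles at one byte per pixel (these sizes come from general SNES documentation, not from the post); the 40 KB figure is the one quoted above.

```python
KB = 1024
VRAM_BYTES = 64 * KB

# Assumed Mode 7 layout: 128x128 map entries and 256 8x8 tiles, one byte each.
MODE7_MAP_BYTES = 128 * 128
MODE7_TILE_BYTES = 256 * 8 * 8
mode7_region_bytes = MODE7_MAP_BYTES + MODE7_TILE_BYTES

# Sprite and Mode 1 tileset/tilemap data mentioned in the post.
other_data_bytes = 40 * KB

# VRAM left outside the Mode 7 region, and how much of the other data
# therefore has to live inside the Mode 7 region.
free_outside_mode7 = VRAM_BYTES - mode7_region_bytes
min_overlap_bytes = max(0, other_data_bytes - free_outside_mode7)

print(mode7_region_bytes // KB, "KB Mode 7 region")
print(min_overlap_bytes // KB, "KB must overlap it")
```

With these numbers at least 8 KB of the non-Mode 7 data has to sit inside the Mode 7 region, which matches the description of repurposing part of the map.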

CypherSignal wrote:

...which looks okay. [...] arguably worse

Some of the bigger nearly-uniform areas are pretty bad, but you really nailed that bottle of Mountain Dew.

I wonder if this is partly due to differences in the underlying quantization methods... I spent a while fiddling with Color quantizer to get a good result on that image. Among other things, it wanted to not bother with the bright yellow triangle...

CypherSignal wrote:

If anyone wants to look over the code or try it yourself, it's hosted at https://github.com/CypherSignal/background-processor

Thanks. Like I said, I'm a little swamped, so it might be a while before I can look at it.

lidnariq wrote:

On the other hand, coming back to this after another 3 years I can see trivially how to do subpalette generation more easily (namely, don't use ppmquant, instead use pnmcolormap on the colorspace-reduced reference, and use pnmremap from the highcolor original, with floyd-steinberg dithering)

I understood some of those words... I'm still very userspace on this topic, but I'm sure it would make more sense if I read up on those resources.

But it does seem to work okay. You can still see some artifacting in 15-bit, but it really looks nice for 4bpp. Once I get some free time I should see if I can get the H-IRQ/DMA method running on real hardware, if no one's gotten to it before me.

Re: Max colour output
by lidnariq on 2018-09-27 (#226523)

93143 wrote:

lidnariq wrote:

On the other hand, coming back to this after another 3 years I can see trivially how to do subpalette generation more easily (namely, don't use ppmquant, instead use pnmcolormap on the colorspace-reduced reference, and use pnmremap from the highcolor original, with floyd-steinberg dithering)

I understood some of those words... I'm still very userspace on this topic, but I'm sure it would make more sense if I read up on those resources.

Yeah, sorry, what I said was all jargon.

Unpacking it:

subpalette = the 16-color palette I'm using on each scanline
tools from the "netpbm" software suite:
ppmquant = a tool that generates a colormap, remaps colors, and handles dithering all in one program.
pnmcolormap = a tool that just generates that colormap
pnmremap = a tool that just remaps colors and handles dithering
pnmquant = a convenience wrapper that runs both pnmcolormap and pnmremap. Doesn't handle pipes.

Pertinently, my previous attempt:
1- Started with the original image
2- Converted it to X bpp (X=9 for TG16, 15 for SNES)
3- Divided this depth-reduced version into single-scanline-high slices
4- Converted each depth-reduced scanline on its own into the final result (without dithering)
5- Combined all of those slices.

This time, I instead
4a- only used the depth-reduced scanline for its resulting palette,
4b- went back to the original image and extracted the corresponding full-depth single scanline,
4c- and reconverted that full-depth scanline using the palette generated in 4a (with dithering)

Using Floyd-Steinberg dithering (or any other error propagating type) is really limited by only being able to propagate error within its own scanline, as is happening here. Positional dither might be an easier way to get better results.
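The revised steps can be sketched in plain Python without netpbm. This is only an illustration under loud assumptions: the palette step is a naive "most common colours" pick standing in for pnmcolormap, and the dithering is a simplified one-dimensional error diffusion that pushes all of the error to the next pixel on the same scanline rather than true Floyd-Steinberg weights. All function names are made up for the example.

```python
from collections import Counter

def reduce_depth(pixel, bits=5):
    """Truncate each 8-bit channel to `bits` bits, then expand back to 0..255."""
    shift = 8 - bits
    levels = (1 << bits) - 1
    return tuple(((c >> shift) * 255) // levels for c in pixel)

def scanline_palette(line, size=16, bits=5):
    """Step 4a (stand-in): the `size` most common depth-reduced colours."""
    counts = Counter(reduce_depth(p, bits) for p in line)
    return [colour for colour, _ in counts.most_common(size)]

def nearest(palette, colour):
    """Palette entry with the smallest squared RGB distance to `colour`."""
    return min(palette, key=lambda p: sum((a - b) ** 2 for a, b in zip(p, colour)))

def remap_line(line, palette):
    """Steps 4b/4c: remap the full-depth line, carrying error to the right only."""
    out = []
    error = (0.0, 0.0, 0.0)
    for pixel in line:
        wanted = tuple(min(255.0, max(0.0, c + e)) for c, e in zip(pixel, error))
        chosen = nearest(palette, wanted)
        out.append(chosen)
        error = tuple(w - c for w, c in zip(wanted, chosen))
    return out

def convert(image, size=16, bits=5):
    """Build a palette per scanline, then dither that scanline against it."""
    return [remap_line(line, scanline_palette(line, size, bits)) for line in image]

# Usage: a single grey-ramp scanline of full-depth (8 bits per channel) pixels.
ramp = [(x, x, x) for x in range(256)]
result = convert([ramp])
print(len(set(result[0])), "distinct colours on the scanline")
```

Because the error never crosses into the next scanline, each line is dithered in isolation, which is the limitation described above.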

But I can't see any way to apply these techniques to something where the palettes across scanlines have to have some correlation, which makes CypherSignal's idea so interesting.

Re: Max colour output
by psycopathicteen on 2018-09-27 (#226527)

Señor Ventura wrote:

CypherSignal wrote:

Señor Ventura wrote:

I've read from some programmers that the SNES has a set amount of memory for sprites and a set amount for backgrounds, so the split is preselected.

Ah. That information may pertain more towards game-specific memory budgets, then - if you were doing modifications to an existing game like Super Mario World, that would be useful information. Tile data for sprites and backgrounds have to share the same 64KB of space, and for all intents and purposes, a developer would not want to mix the memory used for sprites and bg's. So, a developer would have to make a choice at some point to declare that they want to use some portion of the available VRAM for sprites (or some types of sprites - player sprites may have more memory allocated than sprites for enemies, for example) and some portion for backgrounds.

However, the examples I'm talking about here are independent of any game, and are just focused on displaying a single image. In this case, the entire VRAM is available to work with, and there are no restrictions on how memory can be allocated one way or another.

No, no, I mean that I've read from programmers that the SNES has a delimited, preselected amount of memory for sprites and backgrounds.

It was a demo of Gunstar Heroes for the SNES, and the programmer said that it could be impossible on that machine because the original game uses more memory for sprites than the SNES dedicates.

P.S.: The demo no longer seems to be on YouTube.

Edit: Sorry for going off topic; I only wanted to mention that.

I later figured out how to get around the 16kB sprite problem with my original homebrew game. I had to do a lot of trial and error.