Snes emulation image processing

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
Snes emulation image processing
by on (#202921)
Image


Hello there, I hope my simple newbie question is welcome here. This question is specific to Snes9x, but I think it applies to most emulators that I'm aware of.

The SNES is able to rotate, scale, shift palettes, shift hue, apply transparency, make the image look brighter or darker, etc. I guess emulators must support these functions to allow games to produce such output.

Also, as far as I know, most emulators work by means of software blitting. Some might not, but in this thread I want to focus on the ones that work that way. On my old notebook, at least, it could not have worked through shaders, for example. So, unless I'm mistaken here, I'll take this for granted.

The thing is, as a little experiment, I tried to reproduce some of these effects in software too. Snes9x runs at a steady 30-50% CPU usage. But, to my surprise, when I try to replicate some of these basic operations myself, my code ends up consuming all the CPU. This makes me think that I'm doing something very wrong.

The question then is, how do emulators process images so fast? Do they avoid floating-point operations? I'm interested in knowing this, and maybe having a look at some code, because to me it is no different from magic. Thanks for your time.
Re: Snes emulation image processing
by on (#202923)
I do not understand what you're asking.

I did not reverse-engineer Seiken Densetsu 3 or anything, but the most likely case is that they use two separate, pre-calculated palettes to display during day and during night. It is extremely unlikely the game and/or the emulator does any calculation to do this.

For some games, perhaps the earlier games on the platform, they could use colour subtraction (called "Transparency effects" in SNES9x) to get such effects (with less flexibility in the result). This is obviously NOT what SD3 uses, as you can see the windows lighting, something that wouldn't be possible with colour subtraction alone. SNES9x allows "Transparency effects" to be disabled dynamically, so you should be able to tell immediately which games use it and which don't.
Re: Snes emulation image processing
by on (#202926)
They're asking how emulators produce 60fps images with transparency, scaling, rotation, etc. so quickly on PC hardware without hardware acceleration, not how SNES games do it (with the SNES' hardware acceleration, of course).

I've never written an emulator myself, so I'm afraid I could only guess.
Re: Snes emulation image processing
by on (#202938)
Emulators don't use floating-point math, for one thing. (Neither does the compositor in the actual S-PPU.) And they use an API optimized for speed rather than one optimized for genericity and colorimetric accuracy. It's not getpixel()/putpixel(); it's get a reference to an array of integers representing a row of pixels and write to that array.
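
To illustrate the difference between the two API styles, here is a minimal sketch. It is not taken from Snes9x or any real emulator; the names `WIDTH`, `HEIGHT`, `framebuffer`, `put_pixel`, and `fill_row` are hypothetical, and the 256x224 size is only an assumed SNES-like resolution.

```python
from array import array

WIDTH, HEIGHT = 256, 224  # assumed SNES-like output size

# One flat buffer of 16-bit integers, one entry per pixel.
framebuffer = array("H", [0]) * (WIDTH * HEIGHT)

def put_pixel(x, y, color):
    """Generic per-pixel API: bounds check and index math on every call."""
    if 0 <= x < WIDTH and 0 <= y < HEIGHT:
        framebuffer[y * WIDTH + x] = color

def fill_row(y, colors):
    """Row-oriented API: compute the row offset once, then copy in bulk."""
    start = y * WIDTH
    framebuffer[start:start + WIDTH] = colors

row = array("H", [0x7FFF] * WIDTH)  # a row of white pixels in 15-bit colour
fill_row(10, row)                   # one call writes 256 pixels
put_pixel(0, 11, 0x001F)            # one call writes a single pixel
```

The row version pays the offset calculation once per scanline instead of once per pixel, which is the kind of saving described above.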
Re: Snes emulation image processing
by on (#202942)
Yeah, as part of telling you why your code is slow it'd help if we could see your code :)
Re: Snes emulation image processing
by on (#202950)
Yes, it's quite possible to write software renderers that run well / fast. Emulators were even doing this back in the 90s on much less powerful machines.

If you've got to draw 100,000 or 1,000,000 pixels per frame, though, it's still usually something you need to approach with care, even with today's power. The sheer number of pixels to draw puts a huge multiplier on any inefficiency in your code for it.

...and if writing something that's not an emulator, it's probably better to stick to stuff you can do with a modern GPU if you can.
Re: Snes emulation image processing
by on (#202961)
rainwarrior wrote:
...and if writing something that's not an emulator, it's probably better to stick to stuff you can do with a modern GPU if you can.

What's the best practice for palette swapping per team color in modern pixel shaders?
Re: Snes emulation image processing
by on (#202962)
tepples wrote:
rainwarrior wrote:
...and if writing something that's not an emulator, it's probably better to stick to stuff you can do with a modern GPU if you can.

What's the best practice for palette swapping per team color in modern pixel shaders?

There are a lot of ways to approach this problem, but one way to do it is to have a sprite layer containing just the coloured parts of the uniform in white/greyscale and then using a vertex colour (or shader constant) as a multiplier.
Re: Snes emulation image processing
by on (#202967)
tepples wrote:
Emulators don't use floating-point math, for one thing.

Well, that makes me respect emu-devs more. I need to have a look at how they accomplish that.
adam_smasher wrote:
Yeah, as part of telling you why your code is slow it'd help if we could see your code :)

There are a lot of quite generic functions. For example, to make the screen look darker (a common operation with snes games), i do:
Code:
//f in 0.0 - 1.0
//Pseudocode, each channel is actually a byte.
for each pixel:
  pixel.red *= f;
  pixel.green *= f;
  pixel.blue *= f;

However, this operation over the whole screen is quite costly.
rainwarrior wrote:
...and if writing something that's not an emulator, it's probably better to stick to stuff you can do with a modern GPU if you can.

Yes, with a GPU you can accomplish a lot, but the learning curve is also huge. Just by looking at these emulated games you can tell how many great effects you can accomplish without knowing OpenGL/Vulkan.
Re: Snes emulation image processing
by on (#202975)
OneQuestionPlease wrote:
There are a lot of quite generic functions. For example, to make the screen look darker (a common operation with snes games), i do:
Code:
//f in 0.0 - 1.0
//Pseudocode, each channel is actually a byte.
for each pixel:
  pixel.red *= f;
  pixel.green *= f;
  pixel.blue *= f;

However, this operation over the whole screen is quite costly.

It doesn't need to be done on the entire screen. Games change just 256 values (the palette), and everything else on the screen refers to that palette. And because division by 256 is so fast, games will scale things by multiplying by a fraction 1/256 through 255/256 by first multiplying by 1 through 255 and then dividing by 256. There are even shortcuts for that multiplication.
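
As a rough sketch of the multiply-then-divide-by-256 idea, the snippet below scales 8-bit channels with integer math only and applies it to a small palette rather than to every pixel. The function names `scale_channel` and `darken_palette` and the sample palette are made up for illustration.

```python
def scale_channel(value, factor_256):
    """Scale an 8-bit channel by factor_256/256 using only integer math."""
    return (value * factor_256) >> 8  # >> 8 is an integer division by 256

def darken_palette(palette, factor_256):
    """Return a new palette with every (r, g, b) entry scaled."""
    return [tuple(scale_channel(c, factor_256) for c in rgb) for rgb in palette]

palette = [(255, 128, 0), (64, 64, 64)]  # hypothetical two-entry palette
dark = darken_palette(palette, 128)      # 128/256 = one half
print(dark)  # [(127, 64, 0), (32, 32, 32)]
```

The float factor `f` from the pseudocode corresponds to `factor_256 = int(f * 256)`, computed once outside the loop.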
Re: Snes emulation image processing
by on (#202978)
The SNES has a screen brightness register; that's what most games would probably use to darken the screen.

Nevertheless, the idea is the same: the emulator precomputes the darkened palette and uses it to draw the screen rather than doing a floating point multiply on each pixel after-the-fact.
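
A minimal sketch of that idea, assuming an indexed frame and 5-bit colour channels: the darkened palette is built once, and each pixel then costs a single table lookup. The `level / 15` scaling is an assumption chosen for illustration, not a claim about how the hardware or Snes9x computes brightness, and `build_brightness_palette` and `render` are hypothetical names.

```python
def build_brightness_palette(palette, level):
    """Scale each 5-bit channel by level/15 (level in 0..15), integers only.

    The exact scaling rule is an assumption made for this sketch.
    """
    return [tuple(c * level // 15 for c in rgb) for rgb in palette]

def render(indexed_pixels, palette):
    """Turn palette indices into colours: one table lookup per pixel."""
    return [palette[i] for i in indexed_pixels]

palette = [(0, 0, 0), (31, 31, 31), (31, 0, 0)]  # hypothetical 3-colour palette
frame = [0, 1, 2, 1]                             # palette indices, not colours
dim = build_brightness_palette(palette, 7)       # computed once per change
out = render(frame, dim)
print(out)  # [(0, 0, 0), (14, 14, 14), (14, 0, 0), (14, 14, 14)]
```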
Re: Snes emulation image processing
by on (#202981)
Oh, and
OneQuestionPlease wrote:
The question then is, how do emulators process images to end up being that fast? Do they avoid float point operations?

I am not extremely familiar with the hardware, but I'm pretty sure floating point is not slower than fixed point on PCs. It is slower on embedded systems without an FPU. FPUs have been built into PCs for a long time.
Re: Snes emulation image processing
by on (#202982)
Bregalad wrote:
I am not extremely familiar with the hardware, but I'm pretty sure floating point is not slower than fixed point on PCs. It is slower on embedded systems without an FPU. FPUs have been built into PCs for a long time.

It's slower in a lot of ways, maybe not so much on a per-instruction basis anymore, but when the whole pipeline is involved it's still pretty significant.

SIMD instructions can help a lot with floating point efficiency too, but that's a whole field in itself.
Re: Snes emulation image processing
by on (#203006)
tepples wrote:
And because division by 256 is so fast, games will scale things by multiplying by a fraction 1/256 through 255/256 by first multiplying by 1 through 255 and then dividing by 256

I didn't realize that, and after a small adjustment the difference is now noticeable. I did not think the float/int difference would be that big. Now I find myself looking for places to apply this optimization.
adam_smasher wrote:
The SNES has a screen brightness register; that's what most games would probably use to darken the screen.

But i guess that at some point you have to apply the brightness anyway.
tepples wrote:
Games change just 256 values (the palette), and everything else on the screen refers to that palette.

adam_smasher wrote:
Nevertheless, the idea is the same: the emulator precomputes the darkened palette and uses it to draw the screen rather than doing a floating point multiply on each pixel after-the-fact.

This makes a lot of sense. I counted the number of colors in the picture above and it is roughly 100, which is very low and common in pixel art. Processing just 100 colors is a lot faster than computing each pixel. But thinking about it, applying transparency might be nightmarish, mightn't it? I imagine the workflow to be:
Code:
for each color in palette2:
  if it exists in palette1: update index in image 2
  else: add new color and update index in image 2

  for each pixel in image2:
    copy index to image1

Image
Now, if you have to take into account the new colors produced by alpha blending, having to locate every new color (probably it didn't exist before) can be a slow operation.
Also, does the SNES employ a different palette for every sprite/tile?
Re: Snes emulation image processing
by on (#203023)
The S-PPU has 8 palettes for backgrounds and 8 palettes for sprites. A palette can be assigned to each 8x8 pixel tile or each sprite. Blending on the S-PPU uses the formula (r1+r2), (r1-r2), or (r1+r2)/2. See Please consolidate all info.
Re: Snes emulation image processing
by on (#203024)
I feel like everyone is forgetting that the SNES has a 15-bit RGB (as in non-indexed) colorspace and that this whole colorspace is used for color math. So you can't really apply color indexing optimizations here unless you want really big lookup tables (For any single CGRAM state, theoretically up to 65536 entries, with many of them regenerated every time there's a palette change, and even then you'd basically just be caching addition which will probably give you very limited gains if any).
Re: Snes emulation image processing
by on (#203029)
HihiDanni wrote:
theoretically up to 65536 entries

ie: twice the size of the master palette...

Quote:
addition

Good point. SNES hardware color math doesn't involve multiplication; it's just addition or subtraction with an optional right shift (basically - I'm not sure what the exact hardware implementation of the division by two is). It's done once per pixel, as the PPU is generating the signal to send to the TV, and the result is not stored.
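
As a sketch of what "addition or subtraction with an optional right shift" can look like in software, the snippet below works on 15-bit colours with red in bits 0-4, green in bits 5-9, and blue in bits 10-14. It only models the arithmetic on one pixel pair; the halve-then-clamp order and the function names are assumptions for illustration, not a statement about the exact hardware implementation.

```python
def unpack(c):
    """Split a 15-bit colour into (r, g, b), each 0..31."""
    return c & 31, (c >> 5) & 31, (c >> 10) & 31

def pack(r, g, b):
    """Combine three 5-bit channels into one 15-bit colour."""
    return r | (g << 5) | (b << 10)

def color_math(main, sub, subtract=False, half=False):
    """Add or subtract two colours per channel, optionally halve, then clamp."""
    out = []
    for a, b in zip(unpack(main), unpack(sub)):
        v = a - b if subtract else a + b
        if half:
            v >>= 1                      # the optional right shift
        out.append(max(0, min(31, v)))   # clamp to the 5-bit range
    return pack(*out)

added = color_math(pack(20, 20, 20), pack(20, 0, 0))
print(unpack(added))  # (31, 20, 20)
```

Note that there is no multiplication or floating-point step anywhere in the per-pixel path.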

Software color math on SNES is (I assume) much rarer and mostly done with precomputed palettes stored in ROM, like the day/night idea mentioned earlier.

The screen brightness register is a whole other matter; I believe the function is actually analog, operating on the post-DAC video signal. Of course it's only ever applied once per pixel, and if you're willing to use sixteen 96 KB palette tables you could skip it...
Re: Snes emulation image processing
by on (#203037)
HihiDanni wrote:
I feel like everyone is forgetting that the SNES has a 15-bit RGB (as in non-indexed) colorspace and that this whole colorspace is used for color math. So you can't really apply color indexing optimizations here unless you want really big lookup tables (For any single CGRAM state, theoretically up to 65536 entries, with many of them regenerated every time there's a palette change, and even then you'd basically just be caching addition which will probably give you very limited gains if any).

I do not understand why pre-computing a 256x256 palette for colour math would be an optimisation over computing 256x240 screen pixels and applying colour math to them. This would be an optimisation in hi-res mode but I'm fairly sure the SNES can't do colour math (at least not as intended) when it is in hi-res mode.

The fact that colour math is potentially CPU intensive might be why SNES9x can optionally disable it; on older computers it might have been necessary to do that in order to run games at full speed.
Re: Snes emulation image processing
by on (#203054)
OneQuestionPlease wrote:
how do emulators process images to end up being that fast?

They just emulate the graphics pipeline of the emulated system.

The day/night cycle in SD3 is simply done via palette changes. Take a look at the palettes in these pictures: day, night
All that happened is that the colors used by certain backgrounds and sprites are darkened.

SNES screens can use up to 4 background layers, plus one sprite layer (whose pixels are individually combined with the other layers). Here are the ones used by SD3: BG1, BG2, BG3, OBJ

For each pixel on the screen, the hardware (or the emulator) determines which tile is used from each background and which pixels in these tiles are used. SD3 uses graphics mode 1, in which BG1 and BG2 have 16 colors per tile and BG3 has 4 colors per tile. If the tile data for a tile's pixel is zero, it is transparent and another layer's pixel is used.

This process is done twice, resulting in two screens. Only now is the tile data converted to actual colors (using it as indices into the global 256-color palette, in which every entry is a 15-bit color). The generated screens can be added (or subtracted) if necessary. This is how layers that are enabled only in one screen can appear transparent. (But SD3 doesn't use color math for this effect in the pictures above.)

So, in summary: the hardware/emulator does very simple operations on a few bytes. The real magic is how the game programmers used these facilities to create impressive effects.
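
The per-pixel selection described above can be sketched roughly as follows. This is a simplified illustration, not how any particular emulator is written: it assumes the candidate pixels are already ordered front to back and ignores priority bits, windows, and the second screen. The names `composite_pixel`, `candidates`, and `cgram` are hypothetical.

```python
def composite_pixel(candidates, cgram):
    """Pick the visible colour for one screen pixel.

    candidates: (tile_pixel, palette_base) pairs ordered front to back.
    cgram: the global palette, indexed by palette_base + tile_pixel.
    """
    for tile_pixel, palette_base in candidates:
        if tile_pixel != 0:                  # 0 means transparent
            return cgram[palette_base + tile_pixel]
    return cgram[0]                          # nothing opaque: backdrop colour

cgram = [i * 3 for i in range(256)]          # dummy palette with distinct values
# The front layer is transparent here, so the second layer wins.
visible = composite_pixel([(0, 16), (5, 32), (3, 0)], cgram)
print(visible)  # 111, i.e. cgram[37]
```

The transparency test happens on the small tile values, before any colour is looked up.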

OneQuestionPlease wrote:
tepples wrote:
Emulators don't use floating-point math, for one thing.

Well, that makes me respect emu-devs more. I need to have a look at how they accomplish that.

Until very recently, working with floating-point numbers was much slower than with integer numbers.

adam_smasher wrote:
The SNES has a screen brightness register; that's what most games would probably use to darken the screen.

Unfortunately it has only 16 steps. (In the scene above, SD3 always keeps this register at the maximum brightness.)

OneQuestionPlease wrote:
I counted the number of colors in the picture above and it is roughly 100, which is very low and common in pixel art. Processing just 100 colors is a lot faster than computing each pixel. But thinking about it, applying transparency might be nightmarish, mightn't it? I imagine the workflow to be:
Code:
for each color in palette2:
  if it exists in palette1: update index in image 2
  else: add new color and update index in image 2

  for each pixel in image2:
    copy index to image1

[...] Now, if you have to take into account the new colors produced by alpha blending, having to locate every new color (probably it didn't exist before) can be a slow operation.
Also, does the snes employ a different palette for every sprite/tile?

Emulators do not go through the picture looking for colors to change - the games simply generate slightly different source data (different palette entries) and the hardware/emulator generates a new screen from that initial data.

You can read more about how the SNES renders the screen in this document (among others).
Re: Snes emulation image processing
by on (#203090)
I'm glad everybody is taking the time to give thorough answers, including links and pictures. I'm learning immensely.

tepples wrote:
The S-PPU has 8 palettes for backgrounds and 8 palettes for sprites. A palette can be assigned to each 8x8 pixel tile or each sprite. Blending on the S-PPU uses the formula (r1+r2), (r1-r2), or (r1+r2)/2

93143 wrote:
SNES hardware color math doesn't involve multiplication; it's just addition or subtraction with an optional right shift (basically - I'm not sure what the exact hardware implementation of the division by two is). It's done once per pixel, as the PPU is generating the signal to send to the TV, and the result is not stored.

But i was under the impression the snes actually supported different grades of transparency, for example https://youtu.be/XNQm9idpB1M?t=5m48s (clouds) or https://youtu.be/986vcULjMq8?t=25s (Boss).
creaothceann wrote:
All that happened is that the colors used by certain backgrounds and sprites are darkened.

I was under that impression, the day-night cycle is handled more gracefully in DQIII, probably with "handmade" palettes.
Image
Quote:
Emulators do not go through the picture looking for colors to change - the games simply generate slightly different source data (different palette entries) and the hardware/emulator generates a new screen from that initial data.

But i suppose at some point it must transform this array of indexes into an actual rgba array. In that case, it must compare if the pixel is 0 or not.

By the way, this thread improved my code, mostly thanks to avoiding floats in some places and processing the unique colors and then replacing them, rather than processing the whole image. It also gave me a closer insight into how the SNES works. Sometimes technical documents are too technical. But I'll have a look at these links; the blending formulas are simple yet effective, judging by the results.
Re: Snes emulation image processing
by on (#203095)
OneQuestionPlease wrote:
But i was under the impression the snes actually supported different grades of transparency

No, transparency is just half transparency. Multiple grades of transparency can be simulated by using additive color math and fading the palette to black. Example: the Great Fairy in LTTP (you'll notice there's kind of a "jump" as it switches to additive blending but the rest of the fade is smooth).
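
A small sketch of the second idea, additive blending with a palette that is faded toward black in steps, is shown below. The colours, the step count, and the names `fade` and `add_clamped` are made up for illustration; channels are 5-bit values.

```python
def fade(rgb, step, steps):
    """Scale a colour toward black: step 0 is black, step == steps is full."""
    return tuple(c * step // steps for c in rgb)

def add_clamped(a, b):
    """Additive colour math on 5-bit channels, clamped at 31."""
    return tuple(min(31, x + y) for x, y in zip(a, b))

background = (4, 8, 12)   # hypothetical background colour
sprite = (24, 16, 8)      # hypothetical sprite colour at full strength

# Each frame uses a differently faded palette entry for the sprite.
frames = [add_clamped(background, fade(sprite, s, 4)) for s in range(5)]
print(frames)
# [(4, 8, 12), (10, 12, 14), (16, 16, 16), (22, 20, 18), (28, 24, 20)]
```

At step 0 the sprite contributes nothing and the background shows through unchanged; each later step adds a bit more of the sprite, which reads as a gradual change in opacity.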
Re: Snes emulation image processing
by on (#203099)
OneQuestionPlease wrote:
But i suppose at some point it must transform this array of indexes into an actual rgba array.

No alpha needed.

OneQuestionPlease wrote:
In that case, it must compare if the pixel is 0 or not.

That's done before converting the tile data to actual colors.

If you look at anomie's document that I linked above, you'll see this:

Code:
Color Math
----------

Each main-screen BG (and the color-0 backdrop, and the sprites (although
sprites with palettes 0-3 never participate)) may be marked in register $2131
to participate in color math. If the visible pixel is from a layer/OBJ
participating in color math, we perform one of 8 operations on the pixel,
depending on $2130 bit 1 and $2131 bits 6-7.

  0 00: Add the fixed color. R, G, and B are added separately, and clipped to
        the max.
  0 01: Add the fixed color, and divide the result by 2 before clipping (unless
        the Color Window is clipping colors here).
  0 10: Subtract the fixed color from the pixel. For example, if the pixel is
        (31,31,0) and the fixed color is (0,16,16), the result is (31,15,0).
  0 11: Subtract the fixed color, and divide the result by 2 (unless CW etc).
  1 00: Add the corresponding subscreen pixel, or the fixed color if it's the
        subscreen backdrop.
  1 01: Add the subscreen pixel and divide by 2 (unless CW etc), or add the
        fixed color with no division.
  1 10: Subtract the subscreen pixel/fixed color.
  1 11: Subtract the subscreen pixel and divide by 2 (unless CW etc), or sub
        the fixed color with no division.


I also suggest you watch these playlists:
https://www.youtube.com/watch?v=57ibhDU ... _lvGwfn6GV
https://www.youtube.com/watch?v=Tfh0ytz ... KdXygMO71Z
Re: Snes emulation image processing
by on (#203103)
In short, transparency can either be on or off, and any other level of transparency is just transparency + changing the palette.

If you ever see a smooth complete transition between invisible to non-transparent, it's probably done using a duplicate BG with a darkened palette on the subscreen with color math applied to sprites.
Re: Snes emulation image processing
by on (#203202)
I get it. I must say in favor of the SNES that the blending formulas are good enough to fool me. It is not much of a limitation if it works as intended. The docs and videos were pretty interesting; I will spend some time searching for more on YouTube.

I saw that rotation and scaling are performed in mode 7, which was a question that i had too, but i have seen a 'wave' effect in many games which makes me suspect it is also a snes function. Am i wrong or are these things performed 'by hand'? https://www.youtube.com/watch?v=3TdziNzMtaM
Re: Snes emulation image processing
by on (#203203)
Ripple waving like the one in the video is done easily enough on systems that can do it (Sega Master System is one that can't): the code manually changes the vertical scroll on each scanline, usually through an interrupt, but can be done with timed code on other systems.

Changing the horizontal scroll on each scanline gives the familiar wavy/fire effect seen in Thunderforce III, etc.
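
In an emulator or a software renderer, the per-scanline horizontal scroll trick can be sketched like this. The sine table is built once with floats and the per-pixel work is integer only; `wave_offsets` and `render_with_line_scroll` are hypothetical names, and the wrap-around mimics a repeating tilemap.

```python
import math

def wave_offsets(height, amplitude, wavelength, phase=0):
    """Precompute one horizontal scroll value per scanline."""
    return [round(amplitude * math.sin(2 * math.pi * (y + phase) / wavelength))
            for y in range(height)]

def render_with_line_scroll(image, offsets):
    """Draw each row shifted by its own scroll value, wrapping at the edge."""
    out = []
    for row, dx in zip(image, offsets):
        w = len(row)
        out.append([row[(x + dx) % w] for x in range(w)])
    return out

image = [[0, 1, 2, 3]] * 3                       # three identical test rows
wavy = render_with_line_scroll(image, [0, 1, -1])
print(wavy)  # [[0, 1, 2, 3], [1, 2, 3, 0], [3, 0, 1, 2]]
```

Changing `phase` every frame makes the wave move.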
Re: Snes emulation image processing
by on (#203213)
In addition, the SNES and Mega Drive can both do offset-per-tile, or column scrolling. This can result in a wavy effect like this, or a fake rotation effect when combined with line scroll, like this, or this (note that only the backdrop is using this trick; the polygons are rendered at the proper angle). This scene combines horizontal and vertical line scrolling in the backdrop layer with column scrolling in the foreground, plus what looks like addition with the constant colour, and mosaic when you touch a Fuzzy.

The Mega Drive can use scroll offsets for every 16-pixel-wide column. The SNES can do it in 8-pixel columns, but you have to sacrifice a background layer - in Mode 1 you have two 4bpp layers and one 2bpp layer, but you can't do column scrolling; in Mode 2 you can do column scrolling but the 2bpp layer is gone. (Similar sacrifices are made going from Mode 3 to 4 and from 5 to 6, for the same reason, that being VRAM bandwidth.) On the Mega Drive there is no third layer, just the two 4bpp layers, meaning Mode 2 is a reasonably close match to the Mega Drive's feature set.
Re: Snes emulation image processing
by on (#203224)
Another real neat trick you can do by carefully adjusting the scroll during hblank interrupts is vertical scaling.
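
As a rough illustration of vertical scaling through per-scanline scroll values, the sketch below picks a source line for each output line with fixed-point math and expresses the choice as a scroll offset. The function names and the 256-based scale factor are assumptions for this example.

```python
def scroll_values_for_scale(height, scale_256):
    """Per-scanline vertical scroll values; scale_256 = 256 means 1:1."""
    return [((y * scale_256) >> 8) - y for y in range(height)]

def render_with_vertical_scroll(image, scroll_values):
    """Output line y shows source line y + scroll, wrapping like a tilemap."""
    height = len(image)
    return [image[(y + s) % height] for y, s in enumerate(scroll_values)]

image = ["a", "b", "c", "d"]                   # four source scanlines
scroll = scroll_values_for_scale(4, 128)       # 128/256: stretch to double height
stretched = render_with_vertical_scroll(image, scroll)
print(scroll)     # [0, -1, -1, -2]
print(stretched)  # ['a', 'a', 'b', 'b']
```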
Re: Snes emulation image processing
by on (#203239)
OneQuestionPlease wrote:
i have seen a 'wave' effect in many games which makes me suspect it is also a snes function
HDMA

adam_smasher wrote:
Another real neat trick you can do by carefully adjusting the scroll during hblank interrupts is vertical scaling.
Prime example: Axelay
Re: Snes emulation image processing
by on (#203344)
Thanks, the SNES pipeline seems to be complex, given all the stages you have at hand to alter the output.

Well, I'm glad I started this thread. I got a lot from it, and I'd really like to continue, but I don't know what more to ask. My experiment was a success. I did not know whether emulators pull other tricks to make up for the lack of custom hardware, but just knowing what is going on behind the scenes on the SNES gave me many ideas to optimize the engine.

The gap in knowledge between the average user here and me is huge, but I'd like to keep lurking to read more interesting threads. I thank everyone who answered and provided me with examples to ease my understanding. I'd say this thread will come in handy for many people in my situation. Maybe this place is off the radar for those in my shoes, but regardless, the thread is here waiting for whoever arrives at it.