HD mode 7 - NESdev BBS

HD mode 7
by byuu on 2019-04-16 (#237587)

Okay so this absolutely blew up last night.

https://www.reddit.com/r/emulation/comm ... 71_beta_1/
https://www.resetera.com/threads/bsnes- ... es.111715/
https://www.pcgamer.com/this-snes-emula ... ng-results

I implemented hires mode 7 in bsnes, where I render at double resolution, which is the same thing ZSNES and Snes9X used to do. It didn't make a large difference, so I just filed it as a novelty option and moved on. u/DerKoun on Reddit came along and upped it to 16x resolution for jaw-dropping results:

Frankly, I'm stunned we didn't realize this level of detail was possible for 20+ years.

So, I intend to offer this in bsnes, but I'm curious if Snes9X and/or Mesen-S are going to be interested or not.
I know Snes9X recently removed the feature, so ... maybe it's too soon?
But if others want to implement this, I think we should coordinate on what we name all of the various options and how they work, mathematically, so we can provide a consistent user experience instead of a mass of confusion. I won't hold back anyone's progress: if someone comes up with something cool, I'll implement it as well.

Current considerations:

* supersampling: if we scale the resulting image back down, it makes the pixel resolution match better without causing too great a disparity between sprite pixel sizes and the background detail. Example: https://i.imgur.com/DTBoOTN.gif
* prescaling: we can perform filtering like HQ2x onto the mode 7 background first to smooth the jaggedness of the output, but that probably won't matter unless we go crazy and output at 4K mode.
* interpolation: we can interpolate on the output pixels to smooth things out, and also introduce more than 256 colors from the source image in doing so.
* maximum scaling factor: do we stop at 1024x1024, or go up even higher?
* perspective correction: affine texture mapping (eg what mode 7 actually is) has a really obvious problem with distortion. We could offer an option to correct for this. Example: https://en.wikipedia.org/wiki/Texture_m ... apping.jpg

If no one's interested in coming up with some standard terms and features for this, that's okay too, and I'll just experiment on my own for now.

Re: HD mode 7
by 93143 on 2019-04-16 (#237593)

As a stickler for accuracy, I doubt I'd use this feature much, but it's still kind of neat to see. It stands to reason, once you think about it - the direct mapping between stored pixels and screen pixels doesn't apply for Mode 7, so why not go nuts?

byuu wrote:

* perspective correction: affine texture mapping (eg what mode 7 actually is) has a really obvious problem with distortion. We could offer an option to correct for this.

Maybe I'm missing something, but unless you're just talking about corrections for the extra in-between lines you're adding, I don't see how this makes sense.

1) It's only affine if you don't change the matrix line by line. Affine distortion isn't really a thing in perspective Mode 7; all you get is the distortion from people taking shortcuts when computing the coefficients, or using unrealistic viewport parameters, or correcting for fisheye (Mode 7 perspective is essentially a raycaster, and you need to take similar measures). There's the constant-distance effect on single scanlines, but that's only a problem if you're doing a tilted camera, which I'm pretty sure no one did.

F-Zero with affine perspective distortion would look like this:

[Attachment: affinezero.png]

2) How would you get the information? The actual 3D coordinate/camera/viewport data is gone, if it ever existed; all you have are the affine matrix coefficients, and there's no way to back out enough information from those to implement any sort of perspective improvements. You'd need game-specific hacks, and even that might not be feasible in some cases.

Re: HD mode 7
by byuu on 2019-04-16 (#237596)

> Affine distortion isn't really a thing in perspective Mode 7

I wasn't aware it actually got rid of the distortion just by setting the coordinates through HDMA. Very interesting.

Being perfectly honest, the underlying math of mode 7 never made the most sense to me. I've written affine texture mapping in C code where it's nice and simple. Thankfully anomie figured out the underlying math, so it was pretty straightforward to just implement it. The few mode 7 demos I've written were very simple affairs that didn't really do HDMA updates.

> How would you get the information?

My thought was that you'd analyze the mode 7 coordinates at each scanline to extract the "vertexes" trying to be displayed. It would obviously not work in all games that do silly things. But any that just treat it as "rotozoom the mode 7 background around onscreen as one big whole piece" (eg as if the square tilemap is just a quad to be rasterized from 3D->2D) should theoretically be quite doable.

I figured that was why games like Terranigma and Hyper Zone were breaking in HD, because the games are being silly and repeating the screen mirrored on the top and bottom using HDMA.

Re: HD mode 7
by ccovell on 2019-04-16 (#237598)

I tried this out, and yeah, it's absolutely great! The emu doesn't run nearly full speed on my PC, but oh well.

Naturally, I tried out Mohawk and Headphone Jack, and the floor textures appear to have errors in hires rendering. Is this normal?

Re: HD mode 7
by mkwong98 on 2019-04-16 (#237599)

My concern is how to mix this with upscale filtering of other objects on the screen. To my understanding, the whole screen is sent to the filter as a single picture and the filter will upscale the screen as a whole. This won't work now that the screen becomes two parts with different resolutions. The "non mode 7" part will not cover the whole screen and the empty area will be filled with transparent pixels. I think the filters are not capable of handling transparency, and calculations with transparency are not simple.

Re: HD mode 7
by Sour on 2019-04-16 (#237601)

This is definitely something I'd like to implement, eventually. Not quite far enough on the actual emulation aspect of things to start adding extra features like these just yet, though (it will probably be some more months still). I haven't actually read how it's implemented, though, mind you.

In terms of resolution, going much higher than 1024x1024 may get pretty rough in terms of CPU usage (e.g HD Packs in Mesen have no scale limit, but trying to run a 10x HD pack (2560x2400) can't even hit 60fps iirc despite all of it being done in its own thread.)

Re: HD mode 7
by tepples on 2019-04-16 (#237603)

The screenshots that byuu posted on Twitter upscale only the tmapped output, leaving the input texture as nearest neighbor. I'd like to see what Scale4x on the 1024x1024 pixel plane before tmapping does.

Re: HD mode 7
by creaothceann on 2019-04-16 (#237605)

byuu wrote:

Being perfectly honest, the underlying math of mode 7 never made the most sense to me.

https://www.youtube.com/watch?v=3FVN_Ze7bzw
https://www.youtube.com/watch?v=K7gWmdgXPgk

...easy, isn't it? /s

EDIT:

AlexFromRussia Smack-Fu Master, in training wrote:

Math in the base of Mode 7 is very simple.
It is based on six numbers - let's call them PA, PB, PX, PC, PD and PY.
The video chip goes through the pixels of each scanline of the screen, iterating over two current coordinates, TX and TY.
TX starts with PX and increments by PA with every pixel and by PB with every scanline.
TY starts with PY and increments by PC with every pixel and by PD with every scanline.
That is, the formula for an arbitrary pixel is:
TX = PA * x + PB * y + PX
TY = PC * x + PD * y + PY
But a 16-bit video chip cannot do several multiplications for every pixel drawn, so it just increments the current values of TX and TY: only 2 additions per pixel.
For every pixel of the screen, TX and TY are used as "texture coordinates" into the mode 7 background layer of tiles. That is "texture fetching" in modern terms. But a 16-bit video chip can't do texture filtering, so it just gets the nearest pixel from the layer, which is why the image is granular.
However, this formula (which is known as an "affine transform") can do scaling and rotation but cannot do the perspective projection that is needed for 3D-like planes.
So, the second trick is to change the P-params on every scanline of the screen to modify the magnification of the plane properly and imitate perspective projection. This can be done with a special mode of the system's DMA controller, which can feed several bytes of data to the video chip's ports on every scanline automatically, without using the CPU.
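The simplified description above can be sketched in a few lines. This is only an illustration of the quoted six-number model: the function names are made up, the values are treated as plain integers (for example 8.8 fixed point), and the real PPU's centre and scroll registers are assumed to be folded into PX and PY.

```python
def mode7_direct(pa, pb, pc, pd, px, py, x, y):
    """Texture coordinate of screen pixel (x, y) using the closed-form formula."""
    tx = pa * x + pb * y + px
    ty = pc * x + pd * y + py
    return tx, ty


def mode7_incremental(pa, pb, pc, pd, px, py, width, height):
    """Same coordinates, but computed with additions only, as the quote describes."""
    coords = []
    row_tx, row_ty = px, py          # coordinate at the start of the current scanline
    for _y in range(height):
        tx, ty = row_tx, row_ty
        row = []
        for _x in range(width):
            row.append((tx, ty))
            tx += pa                 # step one pixel to the right
            ty += pc
        coords.append(row)
        row_tx += pb                 # step one scanline down
        row_ty += pd
    return coords
```

Both functions agree for every pixel, which is why additions alone are enough within a frame where the parameters stay constant.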

Re: HD mode 7
by byuu on 2019-04-18 (#237633)

Alright, I implemented DerKoun's work, made some improvements for mosaic, and allowed bumping the resolution to true 4K (2160p, x9 multiplier, x81 pixel density.)

So the big trick here is that this is taking advantage of my multi-threaded scanline renderer. Around H=512 of each frame, I capture the entire PPU I/O register state plus CGRAM to a line buffer. Each line holds all of this state. There's some extra logic to detect when games force blank to get more VRAM transfer headroom as well, and it will split off batches as needed. I use this data to then parallelize the rendering of the frame, which doesn't yield a huge speedup on its own, but it helps.

The HD mode 7 trick relies on it being true 99.99999% of the time that the line render functions are called in one giant batch per screen. Basically, no game is going to force blank in the middle of the screen, that's ridiculous. It then scans the adjacent line buffers to inspect their mode 7 parameters to perform vertical interpolation. This can look pretty good, but it's still not really that sharp due to rounding errors when SNES games try to do perspective correction of their own.
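A rough sketch of that vertical interpolation, assuming each captured line buffer exposes its M7A/B/C/D values as a tuple. The names `lines` and `subline_params` are hypothetical, and blending toward the next scanline (rather than the previous one, or both) is an assumption, not something stated about bsnes.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def subline_params(lines, y, sub, scale):
    """A/B/C/D for sub-row `sub` (0..scale-1) inside original scanline `y`.

    `lines` is a list of (a, b, c, d) tuples, one per original scanline.
    The last scanline has no successor, so it is blended with itself.
    """
    cur = lines[y]
    nxt = lines[min(y + 1, len(lines) - 1)]
    t = sub / scale
    return tuple(lerp(p, q, t) for p, q in zip(cur, nxt))
```

With a scale of 4, sub-rows 0 to 3 step a quarter of the way from one line's parameters toward the next line's each time.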

So the 3D perspective option instead scans to find the first and last scanline of each mode 7 block (you could imagine a mode 7->1->7 change for a text box on the screen, for instance), and then presumes the game is trying to draw a 3D quad, and will interpolate between the first and last vertical mode 7 A/B/C/D parameters, which looks stunning, but breaks when games get silly and do even cooler distortions with HDMA like repeating the screen in Hyper Zone, or showing whatever the heck at the top in Terranigma.
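One way to picture the block scan and the first-to-last interpolation is the sketch below. The function names and the per-line `bg_modes` / `params` lists are hypothetical, and plain linear interpolation of the raw A/B/C/D values is the simplest reading of the description; the thread does not say whether bsnes interpolates the raw values or some transformed quantity.

```python
def mode7_blocks(bg_modes):
    """Return inclusive (first, last) scanline ranges where the BG mode is 7."""
    blocks = []
    start = None
    for y, mode in enumerate(bg_modes):
        if mode == 7 and start is None:
            start = y                       # a mode 7 block begins here
        elif mode != 7 and start is not None:
            blocks.append((start, y - 1))   # block ended on the previous line
            start = None
    if start is not None:
        blocks.append((start, len(bg_modes) - 1))
    return blocks


def perspective_params(params, first, last, y_hd, scale):
    """Interpolate A/B/C/D between the block's first and last scanline.

    `y_hd` is the row index in the upscaled image; `params[y]` is the
    (a, b, c, d) tuple captured for original scanline y.
    """
    if last == first:
        return params[first]
    t = (y_hd / scale - first) / (last - first)
    t = max(0.0, min(1.0, t))               # sub-rows of the last line clamp to it
    return tuple(p + (q - p) * t for p, q in zip(params[first], params[last]))
```

A mode 7 -> 1 -> 7 change, like the text box example, simply produces two blocks that are interpolated independently.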

I believe that someone smarter than me should be able to analyze the A/B/C/D values of each line, and detect algorithmically when they change too drastically to break off the 3D perspective correction option.
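As a starting point for that kind of detection, here is a deliberately crude sketch: it splits a run of scanlines wherever one parameter jumps by more than a fixed amount between adjacent lines. The function name and the fixed `threshold` are assumptions for illustration only; a usable heuristic would likely need to look at all four parameters and adapt the threshold per game.

```python
def split_on_jumps(values, threshold):
    """Split per-scanline values into inclusive (first, last) index segments.

    A new segment starts whenever the value changes by more than
    `threshold` from one scanline to the next.
    """
    if not values:
        return []
    segments = []
    start = 0
    for y in range(1, len(values)):
        if abs(values[y] - values[y - 1]) > threshold:
            segments.append((start, y - 1))
            start = y
    segments.append((start, len(values) - 1))
    return segments
```

For example, a sign flip in M7A halfway down the screen (as a mirrored ceiling and floor might produce) would yield two segments, each of which could then be perspective-corrected on its own.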

That and a supersampling option would make this really something special.

...

So the bad news ... implementing this into existing emulators is going to be very difficult. You'll have to implement a scanline buffering system as I did (you need to know scanline 239's M7A/B/C/D values before you can start drawing scanline 1), and honestly if you do that you might as well parallelize it.

The obvious result of this is that you can basically forget about supporting this with a pixel-accurate raster PPU core like higan and Mesen-S use. Unless of course Sour *really* wants to show off and he manages to build some kind of state delta system for the entire frame, heheh.

As a result, Snes9X seems the most likely candidate to get this support one day in the future, by replacing their PPU core with a new one. Mesen-S will be up to whether Sour wants to maintain two separate PPU cores like bsnes is doing. And you can pretty much forget about seeing this on the Super Nt.

But then again, that weird RA gimmick where they idle for 75% of each frame to try and poll gamepads closer to the video rendering will definitely not like anything to do with this. Nor will runahead like it. So, this may just be a novelty for screenshots and casual gaming.

...

The good news is that if you're gonna scale up mode 7 to true 4K (81x the pixels of the SNES), you're gonna be very grateful to have a multi-threaded PPU core for that. I can just barely exceed 60fps at 4K on a Ryzen 5 2600. Realistically, 4x scale (16x the pixels) gets you 95% of the benefits, so that's probably good enough and will only rule out running this on very low-powered devices like ARM cores.
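The scale arithmetic is easy to check, taking the 256x240 base output mentioned elsewhere in the thread. The helper name is made up for this illustration.

```python
BASE_W, BASE_H = 256, 240   # base SNES output size assumed in this thread


def hd_size(scale):
    """Return (width, height, pixel_multiplier) for an integer scale factor."""
    return BASE_W * scale, BASE_H * scale, scale * scale
```

A scale of 9 gives 2304x2160 (2160p, 81x the pixels), and a scale of 4 gives 1024x960 (16x the pixels).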

...

Now I have to decide whether to keep or scrap my older 512x240 hires mode 7 renderer. Seems pretty quaint by comparison, but it does fit into the existing SNES renderer way better and isn't quite as extreme. Some people may prefer it ...

Re: HD mode 7
by calima on 2019-04-18 (#237634)

That supersampled image looks like a very convoluted way of texture filtering. Surely actual texture filtering would be way faster than rendering at 16x and downsampling.

As for "when can I use this", I know you're not fond of such solutions, but a game whitelist would work fine.

Re: HD mode 7
by tepples on 2019-04-18 (#237636)

"analyze the A/B/C/D values of each line, and detect algorithmically when they change too drastically to break off the 3D perspective correction option"
That's what I initially thought was being done: some sort of low-pass on the A/B/C/D values before rendering each 2 scanlines as a textured quad. I'll have to try to figure out (as a tech demo outside of a Super NES emulator but using the same principle) how to detect out-of-line A/B/C/D values and smooth those that are in line.

Re: HD mode 7
by tepples on 2019-04-18 (#237640)

The distance from the camera to each texture row is proportional to √(M7A^2 + M7C^2). If I can fit a linear curve to the reciprocal of distance over any given range of scanlines, I can turn this range into a polygon to allow high resolution mode 7 to work even when there are hills. But to get started on actually making the heuristics for splitting a mode 7 playfield into polygons, I'll first need some representative test data for both nice scenes and problematic ones (like HyperZone, Terranigma, and Super Castlevania IV).

- Dump of CGRAM, tiles, and map
- Log of writes to all matrix registers during a frame
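The line-fitting idea can be sketched as an ordinary least-squares fit of 1/distance against scanline number. Everything here is an assumption for illustration: the function names, the use of floating point, and the requirement that M7A and M7C are not both zero and that at least two distinct scanlines are supplied.

```python
import math


def inv_distance(m7a, m7c):
    """Reciprocal of the (relative) camera distance for one scanline."""
    return 1.0 / math.hypot(m7a, m7c)   # assumes m7a and m7c are not both zero


def fit_line(ys, vs):
    """Least-squares slope and intercept of vs as a function of ys."""
    n = len(ys)
    mean_y = sum(ys) / n
    mean_v = sum(vs) / n
    sxx = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((y - mean_y) * (v - mean_v) for y, v in zip(ys, vs))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_y


def max_residual(ys, vs, slope, intercept):
    """Largest absolute deviation of the data from the fitted line."""
    return max(abs(v - (slope * y + intercept)) for y, v in zip(ys, vs))
```

A range of scanlines whose maximum residual stays small could be treated as one flat polygon; a large residual would suggest splitting the range.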

Re: HD mode 7
by Bregalad on 2019-04-18 (#237643)

This high-resolution mode 7 only makes sense for games that shrink the image. Those that zoom in on the image are better served by good old SNES9x's bilinear mode 7 feature, already implemented decades ago. Ideally a "perfect" emulator would allow both simultaneously.

High resolution, then filtering to actual pixel size (called supersampled here), is basically applying a comb filter before downsampling, to avoid aliasing, which is the way proper low-resolution rendering should be done. The SPC700 does this correctly for sound, but somehow the PPU isn't able to keep up for the image and uses the cheapest "nearest neighbour" algorithm when rendering mode 7 images.

Also, neither bsnes_hm7_b2.exe nor bsnes_hm7_b1.exe works on my computer. It just says "error in executing program !" without any further information. My Windows install is in French so I bet it's the program itself displaying that.

Re: HD mode 7
by creaothceann on 2019-04-18 (#237644)

tepples wrote:

But to get started on actually making the heuristics for splitting a mode 7 playfield into polygons, I'll first need some representative test data for both nice scenes and problematic ones (like HyperZone, Terranigma, and Super Castlevania IV)

And perhaps these.

Re: HD mode 7
by koitsu on 2019-04-18 (#237648)

This came in on Discord, with my response. Not a question I particularly care for (read: the answer is kind of irrelevant in the grand scheme of things), but I'll pose it anyway.

Code:

[1:46 AM] Quantam: What is the reason bsnes' HD mode 7 took so long to implement? Hasn't this kind of thing been implemented in PSX and other emulators for many years?
[4:04 AM] koitsu: i think that might be a question better posed on the forum for byuu
[4:05 AM] koitsu: i suspect the answer is "maybe nobody cared"
[4:05 AM] koitsu: a large percentage of bsnes/higan's user base is about "true accuracy", which that enhancement certainly is not

Re: HD mode 7
by nesrocks on 2019-04-18 (#237649)

I could swear I saw this option for hi-res mode7 on the first versions of zsnes, but I'm certainly not understanding what this is all about

Re: HD mode 7
by byuu on 2019-04-18 (#237651)

Quote:

Surely actual texture filtering would be way faster than rendering at 16x and downsampling.

The way it works is you originally had a 256x240 output, right? So to get to 1024x960, we generate 4x4 output pixels where there was only one pixel (1x1) originally. We pick the values of those 16 pixels by interpolating along the X/Y axis and the A/B/C/D mode 7 parameters for either the most immediately adjacent scanlines (no 3D perspective) or the first and last mode 7 scanlines (3D perspective.)

My idea for naive supersampling was to add all 16 pixels together, divide by 16, and that's the output value. This would not need a giant buffer and could be done in place, but would need a separate mode 7 renderer.
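A minimal sketch of that naive box average, assuming some function `sample(hx, hy)` that returns the (r, g, b) colour of one high-resolution sub-pixel. Both the callback and the function name are hypothetical.

```python
def supersample_pixel(sample, x, y, scale=4):
    """Average the scale*scale sub-pixels that cover output pixel (x, y)."""
    count = scale * scale
    total = [0, 0, 0]
    for sy in range(scale):
        for sx in range(scale):
            r, g, b = sample(x * scale + sx, y * scale + sy)
            total[0] += r
            total[1] += g
            total[2] += b
    # Integer division keeps the result in the same 0..255 range as the input.
    return tuple(channel // count for channel in total)
```

Because only one running sum per output pixel is kept, no full-size intermediate buffer is needed.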

But if supersampling requires inspecting nearby input pixels (eg other 4x4 blocks), then that information won't be available until the image is generated (the entire frame is being generated in parallel via OpenMP.) We could only realistically be confident that pixels to the left of the current pixel are valid, and certain that pixels to the right are not. Above and below would be race conditions. So in that case, we would have to render a 1024x960 texture, and then downsample that to 256x240, which would be brutally painful.

Quote:

Also, neither bsnes_hm7_b2.exe nor bsnes_hm7_b1.exe works on my computer.

Try my official release.
https://twitter.com/byuu_san/status/1118762842453241856

Quote:

[1:46 AM] Quantam: What is the reason bsnes' HD mode 7 took so long to implement? Hasn't this kind of thing been implemented in PSX and other emulators for many years?

I can't speak for other emudevs, but in my case ... I frankly didn't know the results would look this good.

I implemented hires mode 7, using twice the precision on the horizontal axis for 512x240. The result was decent, but barely perceptible. I felt that increasing the resolution higher would yield rapidly diminishing returns. The reality was the complete opposite. I wish I had come up with this. I am jealous that DerKoun beat me to this idea. But, I'm very happy he found this, and he deserves all the credit for this.

I can also say that no emulator had the framework to do this until recently. bsnes' multi-threaded PPU renderer is what made this possible. And I don't say that to take credit from DerKoun, I didn't design the new PPU with this idea in mind. But as I said before, scanline 1 needs to know about the mode 7 parameters on scanline 239 to render optimally with 3D perspective. Doing that requires all of the work needed to parallelize the SNES PPU.

If not for the very recently written bsnes parallel PPU, this mod would not have been possible. And as such, it's never been possible in ZSNES, SNES9X, higan, etc. So that's probably why it wasn't done before.

If DerKoun or anyone else had to rewrite the entire PPU to try out this idea, not even knowing if it would work, I doubt anyone would have bothered.

Quote:

I could swear I saw this option for hi-res mode7 on the first versions of zsnes, but I'm certainly not understanding what this is all about :P

Yeah, I need to post an article with screenshot comparisons. The difference is truly night and day. bsnes already had hires mode 7 as well. It's not impressive in the least, unfortunately.

Quote:

I'm not familiar with this ... assumed Snes9X was the same kind of horizontal sampling precision increase.

Can someone show me some examples of this versus bsnes' hires mode 7, and explain the technical details?

I'm on-board implementing whatever works well into bsnes.

Quote:

[4:05 AM] koitsu: a large percentage of bsnes/higan's user base is about "true accuracy", which that enhancement certainly is not

Yeah, bsnes is very much off-brand for me. The key point of bsnes though is that you can turn off every last enhancement and get higan's level of obsessive accuracy if that's your dig. bsnes is about options and having fun.

I really screwed up with how I handled things in the past. I wanted an emulator that was as perfect as possible. And now that I've made one, I don't mind maintaining a fork that focuses on quality of life improvements.

Hopefully this will go a ways toward showing that I can be reasonable about things, while also giving me more freedom to experiment with higan. And boy, higan is about to get weird with the next release. The MSX really broke every expectation for how my original bsnes project worked. I don't expect anyone will like it; it'll very much be avant garde.

Re: HD mode 7
by tepples on 2019-04-18 (#237652)

I think high resolution mode 7 on older emulators may have done one or more of three things:

A. 512 pixels per line by stepping the texel coordinate by half AC per half pixel
B. Draw the top half of a scanline using start and end texel coordinates 50/50 interpolated between those of the previous line and this line, then draw the bottom half normally
C. Bilinear filtering

Technique B is analogous to this technique, except using a separate quad per scanline. Where it breaks down, however, is at the higher magnification levels near the bottom of the floor area (or the top of the ceiling in HyperZone), as the limited precision for the parameters causes adjacent scanlines to use the same parameters. This is what I plan to fix with a smoothing technique once I have some dumps of test data.
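For reference, the third technique in the list (bilinear filtering) looks roughly like this when written out for a single-channel texture. The function name is made up, and wrapping at the texture edges is an assumption chosen to keep the sketch short.

```python
import math


def bilinear(tex, u, v):
    """Sample 2D list `tex` (rows of numbers) at fractional coordinate (u, v)."""
    height = len(tex)
    width = len(tex[0])
    x0 = math.floor(u)
    y0 = math.floor(v)
    fx = u - x0                      # horizontal blend weight
    fy = v - y0                      # vertical blend weight
    x1 = (x0 + 1) % width            # right and lower neighbours, wrapped
    y1 = (y0 + 1) % height
    x0 %= width
    y0 %= height
    top = tex[y0][x0] * (1 - fx) + tex[y0][x1] * fx
    bottom = tex[y1][x0] * (1 - fx) + tex[y1][x1] * fx
    return top * (1 - fy) + bottom * fy
```

This blends the four texels around the sample point, which smooths magnified textures but does nothing about aliasing when the texture is shrunk.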

Re: HD mode 7
by rainwarrior on 2019-04-18 (#237653)

The question about PSX or other emulators can be simply answered that it's massively more straightforward when you're already in a 3D rendering context. Triangles are already a resolution agnostic format to begin with, so in a lot of cases it's a relatively trivial extension.

On the SNES you have a completely different rendering paradigm for everything else except this one thing which is sort of a textured quad rasterization, sort of. Kinda unfair to compare systems where everything is natively triangles to a system where everything but this one thing isn't.

Though on the subject of using hardware to do it, if you rendered the source memory to a texture, you could use 3D hardware to just draw a quad with texels from that buffer, at least across scanlines that don't have a change. Otherwise you'd have to generate a separate quad at each scanline split... though I'm not sure if any of that would be worthwhile vs. just doing it in software, especially because of the first step of generating the texture.

Another question I have is where your texel coordinates are supposed to be... like is the 0,0 pixel of a 4x4 oversampling block a match for the original resolution's pixel? Or would you shift that to 2,2? Would it make an improvement in the alignment/feel of where sprites appear on the screen relative to the BG layer?
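The two alignments in question can be written down as sub-pixel offsets relative to the point the original renderer samples. This is only a way to make the question concrete; the function name is hypothetical and neither option is claimed to be what bsnes does.

```python
def subpixel_offsets(scale, centered):
    """Offsets (in original pixels) of each sub-sample along one axis.

    centered=False: sub-pixel 0 coincides with the original sample point.
    centered=True:  the sub-samples are spread evenly around it.
    """
    if centered:
        return [(i + 0.5) / scale - 0.5 for i in range(scale)]
    return [i / scale for i in range(scale)]
```

At 4x, the first option gives offsets 0, 0.25, 0.5, 0.75 and the second gives -0.375, -0.125, 0.125, 0.375, so the centred variant shifts the background by a fraction of a pixel relative to sprites.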

The thought of trying to smoothly interpolate the extrapolation parameters to accommodate smoothly changing scanline parameter variations is interesting. You have to decide what's a disconnected scanline or not, I suppose, but I suspect existing games would play nice with something like that?

Bregalad wrote:

This high-resolution mode 7 only makes sense for games that shrink the image.

I don't think I agree. When magnified, you can still get better defined edges of the large texel squares, and that's worthwhile.

Though on the subject of oversampling when minified, maybe consider doing this at the texel lookup level (like a modern GPU would, e.g. mipmaps or anisotropic filtering) rather than generating a high resolution image then downsampling.

Re: HD mode 7
by creaothceann on 2019-04-18 (#237655)

byuu wrote:

SNES9x 1.42 has a bilinear Mode7 feature (in addition to its bilinear OpenGL output driver) that was removed somewhere before 1.52.

In Super Metroid's title screen intro it's immediately noticeable.

Re: HD mode 7
by rainwarrior on 2019-04-18 (#237659)

rainwarrior wrote:

Though on the subject of oversampling when minified, maybe consider doing this at the texel lookup level (like a modern GPU would, e.g. mipmaps or anisotropic filtering) rather than generating a high resolution image then downsampling.

Just to rephrase this more simply, if that was too jargonny:

It might be more efficient to generate 4 texels and blend them before storing to your output buffer, instead of storing all 4 and then blending them in a second pass.

(Was making the suggestion because that's what GPUs do, i.e. texture filtering during rasterization is much cheaper to implement than anti-aliasing as a post process.)

Re: HD mode 7
by tepples on 2019-04-18 (#237660)

In particular, you can do Scale2x in shader language nowadays to make the source data even higher res.

Re: HD mode 7
by 93143 on 2019-04-18 (#237667)

rainwarrior wrote:

Wouldn't that require you to do the downsampling operation on the source image in order to obtain the desired mipmap? Alternately you could do it at load time instead of at render time, but the quality would be worse, and I'm not sure you'd save time because to be effective you'd need multiple mipmaps to fade between.

Modern GPUs use mipmapping because the multitexture can be prebaked.

Re: HD mode 7
by ccovell on 2019-04-18 (#237668)

I'm not an emu author or expert or anything, but in looking at some graphics errors (3-D correction is OFF here), I notice that there is a single scanline of erroneous Mode 7 BG when the BG modes are switched (eg Mode 0/1/2 -> 7) so perhaps you need to reset the "interpolation" anew whenever Mode 7 is deactivated/activated.

Also, for games that also have H-Sync scrolling changes (think Actraiser's title fade-in or Pugsley's Scavenger Hunt) could you also interpolate the X-scroll values per rendered line so that there isn't a jerky change of the scroll offsets every 4 lines (when in 2x or 3x mode, etc.)?

Re: HD mode 7
by rainwarrior on 2019-04-18 (#237670)

93143 wrote:

Well, I tried to clarify what I was really suggesting in my followup post.

I mentioned mipmaps because they are related to the goal of anti-aliasing a minified texture. Dynamic mipmaps could be done somewhat efficiently I think, but that's not really what I was suggesting so I won't try to describe that here. My main point was more generally doing the anti-aliasing at the texture lookup stage, and not as a post-process on the output. Mipmaps are an example of that general concept.

Re: HD mode 7
by 93143 on 2019-04-19 (#237673)

Okay, I think I see. I read your post too fast and assumed you meant doing the same thing the N64 does, which is pick four adjacent texels and interpolate between them. That method requires mipmapping to avoid Nyquist folding.

What I now think you meant is to draw a single screen pixel's worth of high-resolution subpixels, and just average them right away rather than waiting until the end of the frame. This would have an identical result to byuu's method, assuming simple block averaging.

Is that right?

Re: HD mode 7
by Pokun on 2019-04-19 (#237675)

byuu wrote:

Hopefully this will go a ways toward showing that I can be reasonable about things, while also giving me more freedom to experiment with higan. And boy, higan is about to get weird with the next release. The MSX really broke every expectation for how my original bsnes project worked. I don't expect anyone will like it; it'll very much be avant garde.

On the one hand, parts of the MSX standard are defined loosely on purpose so that makers could have some freedom in how to make their computers while software would still be compatible with any MSX. On the other hand, programmers didn't always follow the rules and relied on all sorts of assumptions that only work on certain MSX systems. And I guess 100% universal compatibility isn't realistic since there are so many little things that may result in unexpected behaviour on certain hardware, even though the software seems to be following all the rules. I'm not sure if anyone can expect bsnes level of accuracy on MSX unless you emulate every MSX model out there.

Re: HD mode 7
by creaothceann on 2019-04-26 (#237916)

new version: https://www.reddit.com/r/emulation/comm ... idescreen/

Re: HD mode 7
by Bregalad on 2019-04-28 (#237993)

creaothceann wrote:

new version: https://www.reddit.com/r/emulation/comm ... idescreen/

Still doesn't work for me...

Re: HD mode 7
by creaothceann on 2019-04-28 (#237995)

Try a VM with an English Windows? Or a different account with an English display language.