Ultrasonic pulse behaviors for 2a0x and MMC5.

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
Ultrasonic pulse behaviors for 2a0x and MMC5.
by on (#221506)
https://cdn.discordapp.com/attachments/352252932953079813/469390090184032266/ultrasonic_tests.zip (This was uploaded in the NESdev Discord.)

Looks like the way in which emulators and players emulate the MMC5 ultrasonic pulses differs.

In the URL above there is an NSF, an 0CC source module, and a hardware render from an MMC5 cart by ImATrackMan.

First, the NSF starts pulse generation on all four pulse channels at 0x0D in the low timer and makes decremental writes down to 0x00 on each of them. (High timer bits 0-2 are not set... However interestingly enough the 0CC-Famitracker engine does underflow them to $FF and plays an A1 on 2a0x and an A2 on MMC5.) The next frame only has 2a0x and then repeats with only MMC5; then it loops back to the beginning.

The expected behavior of the 2a0x pulses is to stop tone generation when the low timer hits 0x07. Tone generation is supposed to stop with MMC5 when it hits 0x00. The hardware render verifies this if you count the ticks starting at 0x0D.

Current documented behaviors on certain emulators/players:

Mesen (latest): Stops tone generation of MMC5 at 0x07. (INACCURATE!)
Nintendulator (latest): Plays tones on both channels and freezes sound generation. (BUGGED!)

NSFPlay/NSFplug (latest): Stops tone generation of MMC5 at 0x00. (ACCURATE! _PASS_)
VirtuaNSF (latest v1601): Stops tone generation of MMC5 at 0x07. (INACCURATE!)

However it seems that the frequencies of the tones generated do not equate to the frequencies generated by hardware.

The tones generated by the MMC5 from the hardware render do seem to be ultrasonic... ImATrackMan says that he's able to hear the frequencies of the hardware render clip up to 0x05. I can hear them up to 0x07.

Generally speaking, there doesn't seem to be a single emulator that emulates either 2a0x or MMC5 ultrasonic frequencies the way that hardware does. A comparison with the hardware render output will verify this.
Re: Ultrasonic pulse behaviors for 2a0x and MMC5.
by on (#221526)
Already it can create a BadBios with NES room
Re: Ultrasonic pulse behaviors for 2a0x and MMC5.
by on (#221532)
Yes, this behaviour is documented on the Wiki's MMC5 audio page.

B00daW wrote:
However it seems that the frequencies of the tones generated do not equate to the frequencies generated by hardware.

The tones generated by the MMC5 from the hardware render do seem to be ultrasonic... ImATrackMan says that he's able to hear the frequencies of the hardware render clip up to 0x05. I can hear them up to 0x07.

Generally speaking, there doesn't seem to be a single emulator that emulates either 2a0x or MMC5 ultrasonic frequencies the way that hardware does. A comparison with the hardware render output will verify this.

When you're talking about recording ultrasonic frequencies, the hardware you recorded it with starts to matter as much as the hardware in the Famicom. This is kind of past being an "emulation" issue and becomes more of a "digital DSP" issue. Ultrasonic frequencies are difficult to filter out. The higher and stronger they are, the more likely they are to leak some sidebands through your oversampling filter. ...this is not easy to combat properly without a lot more CPU usage (oversampling + filtering).

So, yes if you emulate an ultrasonic frequency, you're very likely to hear some aliasing in an emulator that isn't quite the same sound. It's still emulating the correct sound underneath, but the downsampling step ends up reflecting some of the sound at another frequency, unfortunately.
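
A minimal sketch of that reflection, assuming an ideal sampler with no filter in front of it at all (real emulators do filter, so whatever leaks through is much quieter than this implies). The function name `alias_frequency` and the 44.1kHz output rate are illustrative assumptions, not anything taken from NSFPlay:

```python
def alias_frequency(f_hz, sample_rate_hz):
    """Frequency that a pure tone at f_hz folds to when it is sampled
    at sample_rate_hz with no anti-alias filter (idealised assumption)."""
    f = f_hz % sample_rate_hz            # sampling cannot tell f from f + k*rate
    return min(f, sample_rate_hz - f)    # anything above Nyquist mirrors back down


if __name__ == "__main__":
    # Timer=3 pulse fundamental on NTSC: 1789773 / (16 * 4), about 27.97kHz.
    print(alias_frequency(1789773 / 64, 44100))
```

Under these assumptions the timer=3 fundamental would land at roughly 16.1kHz, which is inside the range many people can hear.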

This doesn't just affect MMC5, it's every unit that generates frequencies in this range. 2A03 triangle, the timbre of periodic noise, VRC6, etc.


Implementing an extra high quality filter (and a ~2MHz oversampling mode) is on my to-do list for NSFPlay. How well does Blargg's bandlimited synthesis method deal with this, btw?

Also, when you say "recorded from hardware" in a case like this, you really need to specify which hardware was used (Famicom RF? Famicom AV? direct audio out mod? TNS cart port?), because that matters. Even the recording device used has an effect here. Though, I don't think this is something that really needs hardware recordings to verify against at this point: the underlying emulation is correct and known, the problem is just how to build a good filter, not really about knowing what it should do.


...though, a hack that mutes the channel in the emulated output before filtering if the frequency is known to be above human hearing is also possible. That's what the "mute triangle at pitch 0" option in NSFPlay does. Maybe there should be a blanket option for muting all too-high frequency emulation like that. Of course this just hides the filter problem (which will still exist, just with a more subtle effect).

Of course, the reason that muting hack was put there for the triangle is that some games out there actually do this (e.g. Silver Surfer). This is not the case with MMC5/VRC6/etc.
Re: Ultrasonic pulse behaviors for 2a0x and MMC5.
by on (#221543)
B00daW wrote:
The tones generated by the MMC5 from the hardware render do seem to be ultrasonic... ImATrackMan says that he's able to hear the frequencies of the hardware render clip up to 0x05. I can hear them up to 0x07.
Timer=5 (period 6) is 19kHz. I can basically guarantee you that no-one older than about 20 can hear that.
Timer=7 (period 8) is 14kHz. This is plausibly audible (in favorable testing conditions), but unlikely.
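
For reference, those two figures appear to follow from the usual pulse formula f = CPU / (16 × (timer + 1)). A quick sketch, assuming the NTSC CPU clock of 1789773 Hz; the names `CPU_HZ` and `pulse_frequency` are made up for this example:

```python
CPU_HZ = 1789773  # NTSC 2A03 CPU clock in Hz (assumption: PAL would differ)


def pulse_frequency(timer):
    """Pulse channel frequency in Hz for an 11-bit timer value."""
    return CPU_HZ / (16 * (timer + 1))


if __name__ == "__main__":
    # Walk the same range the test NSF uses, 0x0D down to 0x00.
    for timer in range(0x0D, -1, -1):
        print(f"timer=0x{timer:02X}  {pulse_frequency(timer):9.1f} Hz")
```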

rainwarrior wrote:
this is not easy to combat properly without a lot more CPU usage (oversampling + filtering).
[...]
...though, a hack that mutes the channel in the emulated output before filtering if the frequency is known to be above human hearing is also possible. That's what the "mute triangle at pitch 0" option in NSFPlay does. Maybe there should be a blanket option for muting all too-high frequency emulation like that. Of course this just hides the filter problem (which will still exist, just with a more subtle effect).
This is actually the correct audio synthesis method anyway. There's no reason to bother to emulate purely ultrasonic sound and then run it through a filter that will completely filter it out. You may as well memoize the result (namely, a DC value that is the product of the duty cycle and the nominal volume) and just emit that instead.
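
A minimal sketch of that DC value, assuming the standard 8-step 2A03 duty sequences and looking only at the channel's raw 0-15 output level (the 2A03's nonlinear mixer is ignored here). `DUTY_SEQUENCES` and `dc_level` are hypothetical names for this example:

```python
# 8-step duty sequences as commonly documented for the 2A03 pulse channels.
DUTY_SEQUENCES = {
    0: (0, 1, 0, 0, 0, 0, 0, 0),  # 12.5%
    1: (0, 1, 1, 0, 0, 0, 0, 0),  # 25%
    2: (0, 1, 1, 1, 1, 0, 0, 0),  # 50%
    3: (1, 0, 0, 1, 1, 1, 1, 1),  # 75% (25% negated)
}


def dc_level(duty, volume):
    """Average output of one full duty cycle: duty fraction times volume."""
    sequence = DUTY_SEQUENCES[duty]
    return volume * sum(sequence) / len(sequence)


if __name__ == "__main__":
    print(dc_level(2, 15))  # 50% duty at full volume
```

An emulator taking this shortcut would emit `dc_level(duty, volume)` for the channel whenever its period is above the chosen cutoff, instead of stepping the sequencer.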
Re: Ultrasonic pulse behaviors for 2a0x and MMC5.
by on (#221545)
lidnariq wrote:
This is actually the correct audio synthesis method anyway. There's no reason to bother to emulate purely ultrasonic sound and then run it through a filter that will completely filter it out. You may as well memoize the result (namely, a DC value that is the product of the duty cycle and the nominal volume) and just emit that instead.

Yes, that's correct output for these particular cases that are only generating frequencies above that range, but there are also significant high frequencies like this being generated elsewhere (esp. the noise unit) that are harder to address in this way... like it's probably not practical to store a pre-filtered version of every possible periodic noise loop.

However, and this was kind of the last point I was making, there's not really much reason to use these values on the MMC5 anyway. If they're working correctly you won't hear anything. Use of this would most plausibly be a bug/accident. The reason I implemented that workaround for triangle was just that it actually came up in some games. Ignoring the problem for the others seems like a reasonably simple and effective solution too, since it doesn't actually come up. Fewer special cases for the code and fewer cycles spent that way. ;P

I mean, even though this is a known thing on my to-do list to think about improving, I also consider the current "no solution" solution to be a good indicator/diagnostic of what the MMC5 part of the hardware does; even though it's producing a spurious sound, that sound lets you know that it does in fact function in that way... so there's a side effect benefit that it shows you something you can't hear directly from your TV.


(And of course, the emulators that stop MMC5 at 8 as if it had the sweep unit are just getting that wrong, but that's still kind of a moot point w.r.t. real world usage, just a little bit less moot.)
Re: Ultrasonic pulse behaviors for 2a0x and MMC5.
by on (#221630)
I was going to program an NSF that coerces dogs to bark and interact... ;X
Re: Ultrasonic pulse behaviors for 2a0x and MMC5.
by on (#221634)
rainwarrior wrote:
(esp. the noise unit) that are harder to address in this way... like it's probably not practical to store a pre-filtered version of every possible periodic noise loop.
Tangenting, because this is kinda interesting to me.

There are ceil(32767÷93) = 353 different unique bitstreams that the tonal noise LFSR can generate. You only need to care about aliasing (and filtering) when the sample rate of the noise channel approaches the output sample rate. Last time I looked into this, it turned out there were 208 different tone colors it could generate.
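
The loop count can be checked by brute force. This is a sketch that assumes the commonly documented short-mode LFSR (15 bits, feedback = bit 0 XOR bit 6, shifted right with the feedback entering at bit 14). The function names are made up for this example:

```python
from collections import Counter


def step_short_mode(state):
    """One clock of the 15-bit noise LFSR in short ("tonal") mode."""
    feedback = (state ^ (state >> 6)) & 1
    return (state >> 1) | (feedback << 14)


def cycle_lengths():
    """Count the distinct loops among all 32767 nonzero states, by length."""
    seen = bytearray(1 << 15)
    lengths = Counter()
    for start in range(1, 1 << 15):
        if seen[start]:
            continue
        length = 0
        state = start
        # The step is invertible, so the walk always returns to `start`.
        while not seen[state]:
            seen[state] = 1
            state = step_short_mode(state)
            length += 1
        lengths[length] += 1
    return lengths


if __name__ == "__main__":
    print(cycle_lengths())
```

With that feedback rule the states split into 352 loops of 93 steps plus a single loop of 31 steps, since 352 × 93 + 31 = 32767.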

So assuming 44kHz mixing, you can probably assume you only care about LFSR sample rates at 22kHz and above (i.e. the fastest 5). At this point, the repeat frequency is high enough that you shouldn't need to worry about subsonic (rhythmic) effects, and I bet phase doesn't matter either.
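
A quick way to see which rates those are, assuming the NTSC noise period table as documented on the wiki and the NTSC CPU clock (the constant and function names are illustrative):

```python
CPU_HZ = 1789773  # NTSC CPU clock in Hz (assumption)

# NTSC noise timer periods in CPU cycles, indexed by the 4-bit rate value.
NOISE_PERIODS_NTSC = (4, 8, 16, 32, 64, 96, 128, 160,
                      202, 254, 380, 508, 762, 1016, 2034, 4068)


def noise_rates():
    """LFSR clock rate in Hz for each of the 16 period settings."""
    return [CPU_HZ / period for period in NOISE_PERIODS_NTSC]


def rates_at_or_above(threshold_hz):
    """The subset of LFSR rates at or above threshold_hz."""
    return [rate for rate in noise_rates() if rate >= threshold_hz]


if __name__ == "__main__":
    for rate in rates_at_or_above(22050):
        # A 93-step loop repeats at rate / 93.
        print(f"{rate:10.1f} Hz LFSR clock, {rate / 93:8.1f} Hz loop repeat")
```

The slowest of the five (period 64, about 28kHz) gives a 93-step loop that repeats at roughly 300Hz.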

Whether lookup tables from the 32767 states to the 208 tone colors, and one from those 208 tone colors to the lowest 33 harmonics (since 10kHz÷300 = 33), and then using straight additive synthesis (for the subset of harmonics that are in the nominally-audible band) makes any sense... well, I dunno. Probably not.
Re: Ultrasonic pulse behaviors for 2a0x and MMC5.
by on (#221639)
B00daW wrote:
Nintendulator (latest): Plays tones on both channels and freezes sound generation. (BUGGED!)
The NSF enables Frame IRQs, which the Nintendulator NSF player is not expecting. Try my Nintendulator-NRS build. (I used the occasion to also add the MMC5 hardware multiplier to the MMC5 NSF playback code, necessary for Just Breed.)
Re: Ultrasonic pulse behaviors for 2a0x and MMC5.
by on (#221655)
lidnariq wrote:
There are ceil(32767÷93) = 353 different unique bitstreams that the tonal noise LFSR can generate. You only need to care about aliasing (and filtering) when the sample rate of the noise channel approaches the output sample rate.

Ah, yes I hadn't considered that there's 93 trivial rotations of the 2^15 starting states. That probably does lower it to a somewhat practical problem. :P

I was thinking about the other cases where muting can't solve the problem, or maybe where it shouldn't.

Like, yeah I can mute frequencies above a certain point, but there's a continuum going right up to there where the aliasing gets fairly bad. That's more or less a problem for all the channels that can go to a high frequency (most of them), but it's just pretty rare to see them used in practice because that sound sucks to listen to. The noise channel on the other hand gets up in there all the time, so I'd been giving it some cheap extra oversampling anyway.

Also, realizing that B00daW's recordings were 96kHz reminds me that NSFPlay uses the exact same kind of filter for 96kHz as it does for 48/44.1kHz, and it really shouldn't. I'm sure I could make a more robust filter with the wider falloff from the human range that 96kHz permits, and it would also permit analysis tools to see that extended spectrum that we can't hear. (Also on my to-do list is investigating Blargg's bandlimited synthesis method. I'm still curious how well it does with the high end frequencies like this.)

Changing samplerates also means that with muting it's maybe appropriate to do it >= the nyquist frequency, which is going to be dependent on that sample rate. Even more logic for this rare special case. ;) I do get the point that it's silly to intentionally generate frequencies you can't render, and then throw a powerful downsampling filter at it, but at the same time special case muting turns a clean and continuous implementation into something with a lot of bumps. :S

Anyhow, I'd wanna weigh my filter options again carefully before I decide to add dozens of special case high frequency mutes to the code. The triangle mute special case I already implemented was specifically only for period value 0, which isn't based on nyquist/human-hearing but just that it's the one value that a developer would actually use to silence the triangle.
Re: Ultrasonic pulse behaviors for 2a0x and MMC5.
by on (#221656)
rainwarrior wrote:
at the same time special case muting turns a clean and continuous implementation into something with a lot of bumps. :S
Absolutely agreed! For no good reason I just want additive synthesis to actually turn out to be useful. Somewhere. At all.
Re: Ultrasonic pulse behaviors for 2a0x and MMC5.
by on (#221691)
Just generate 1.6 MHz audio and shove it into a resampler.

Seriously, on a modern PC the resampling isn't the bottleneck. The 40+ tap, stereo, double-precision floating point resampling in my VGM player accounts for less than half of an already tiny CPU budget. And it all fits in 240 lines of plain C, so it's not like it's hard to do. If you want to support low-end hardware, well, there are plenty of smarter ways to write a resampler.
Re: Ultrasonic pulse behaviors for 2a0x and MMC5.
by on (#221694)
Rahsennor wrote:
Just generate 1.6 MHz audio and shove it into a resampler.

Seriously, on a modern PC the resampling isn't the bottleneck. The 40+ tap, stereo, double-precision floating point resampling in my VGM player accounts for less than half of an already tiny CPU budget. And it all fits in 240 lines of plain C, so it's not like it's hard to do. If you want to support low-end hardware, well, there are plenty of smarter ways to write a resampler.

"A resampler" isn't very descriptive, there's a million ways to resample and they're far from equivalent (...except when extended to extremes where everything is the same). The problem I'm having isn't deciding to use a resampler (there has to be one), it's which one is most appropriate for the situation. But a step back from that, really the problem is just that I haven't taken the time to do that work to review this yet... partly because the one that's already there is doing a pretty good job, and nobody until now has been intentionally throwing un-hearable frequencies at it and complaining about them. It's a subtle problem that isn't particularly pressing.

"Less than half" of your CPU usage sounds like a very significant part of it to me?

I said above that I do plan to implement an all-cycles-oversampling mode into NSFPlay. That oversampling definitely isn't cheap, though, and is of course separate from resampling too. The amount of oversampling is already a setting in NSFPlay, and you can see how much impact it makes on performance by adjusting it.

Several people have complained to me that NSFPlay is too CPU intensive, and I do think it deserves a good performance pass over the whole code, but again not something I've gotten around to yet. (I've had other priorities... and am coming back from several years hiatus.)

So... no, I don't really agree that this is trivially solved, or that performance is a non-issue with this, but I absolutely do agree that a CPU-frequency oversampling mode is an important option to have.

I would, though, be interested to know how your implementation of this sounds with the given test, and what resampling method it uses.
Re: Ultrasonic pulse behaviors for 2a0x and MMC5.
by on (#221773)
rainwarrior wrote:
"A resampler" isn't very descriptive, there's a million ways to resample and they're far from equivalent (...except when extended to extremes where everything is the same). The problem I'm having isn't deciding to use a resampler (there has to be one), it's which one is most appropriate for the situation. But a step back from that, really the problem is just that I haven't taken the time to do that work to review this yet... partly because the one that's already there is doing a pretty good job, and nobody until now has been intentionally throwing un-hearable frequencies at it and complaining about them. It's a subtle problem that isn't particularly pressing.

Sorry, I was unclear - this is exactly what I was getting at. People were talking about prefiltering tables and silencing channels on certain periods and other hacks. What I meant to say is just that - they're all hacks. Doing anything other than emulating the chip's actual behaviour at its actual clock rate is a hack. You need a resampler in any case, and this is literally its entire job.

rainwarrior wrote:
"Less than half" of your CPU usage sounds like a very significant part of it to me?

It's a VGM player. The only other thing running is the audio chip emulation. Total CPU is low enough that I can't get an exact figure on it due to CPU frequency scaling. It's in the same ballpark as similar programs so I haven't bothered optimizing anything yet. In that context, it's not significant, to me anyway.

rainwarrior wrote:
Several people have complained to me that NSFPlay is too CPU intensive, and I do think it deserves a good performance pass over the whole code, but again not something I've gotten around to yet. (I've had other priorities... and am coming back from several years hiatus.)

Ah. If NSFPlay is too CPU intensive then I'm in deep shit, and the hack camp may have a point. All a matter of perspective I guess. :|

rainwarrior wrote:
So... no, I don't really agree that this is trivially solved, or that performance is a non-issue with this, but I absolutely do agree that a CPU-frequency oversampling mode is an important option to have.

This is the pedant in me talking, but please stop calling it 'oversampling'. It's the native sample rate of the chip. Anything else is undersampling. That's kind of the point I was trying to make.

Unless of course you do mean higher than native sample rate, but I don't get why you'd want to...?

rainwarrior wrote:
I would, though, be interested to know how your implementation of this sounds with the given test, and what resampling method it uses.

Sure. I haven't got around to implementing the NES chipset in my new player, so this is with repeat. Polyphase FIR filter, 16 taps, Nuttall windowed sinc. It runs "backwards", scattering instead of gathering samples, the table is pre-integrated, and the input signal is differentiated, making it a simple table-driven BLEP resampler.

My newer resampler dumps the BLEP nonsense and uses double-precision floating-point, which is faster (yes, really) and has a vastly lower noise floor. It does still operate 'backwards', making it technically a BLIP resampler, but only because I was too lazy to change it. If you put a differentiator before it and a leaky integrator afterwards, you'd get a BLEP resampler, which might be worth a speed boost, but my new player is mostly about FM chips so I haven't tried it.
Re: Ultrasonic pulse behaviors for 2a0x and MMC5.
by on (#221776)
Rahsennor wrote:
People were talking about prefiltering tables and silencing channels on certain periods and other hacks.

Oh that, well, I don't think myself or lidnariq was taking the pre-filtered periodic noise table idea as a serious option. It was just interesting to think about out loud.

Rahsennor wrote:
This is the pedant in me talking, but please stop calling it 'oversampling'. It's the native sample rate of the chip. Anything else is undersampling. That's kind of the point I was trying to make.

No, I'm going to continue to call it oversampling, but yes it is very much undersampling for the NES. It's also oversampling for the output device. We could debate the semantics of this but I don't think it'd be an interesting debate.

Actually, to give you an idea why this affects performance so much for NSFPlay, how often it has to jump back and forth between CPU emulation and generating samples really adds a lot of overhead, and as a side effect negatively affects code caching / branch prediction / etc. at the same time. One of my planned ideas for performance improvement will be to institute the concept of CPU<->audio sync points so it can operate on longer buffers at a time, but that's going to be a big overhaul of how the original code worked. I suspect eventually, the idea of needing to undersample the NES might even be able to disappear, but for now it's deeply rooted in how it works. (If you look at other old NSF players, like NotSoFatso, this was pretty commonly done.)

As for integer vs float... yeah I mean we have good vector hardware for floats these days. I'm not sure if NSFPlay will ever make it to float, but it's something that could be done in theory at least. If this were a starting-from-scratch situation it would be a lot easier to write something from the beginning that would be easier to switch to one or the other with a #define. (...maybe for NSFPlay 3.)

Thanks for the recording and description of your method. Always good to have something to compare to and consider.
Re: Ultrasonic pulse behaviors for 2a0x and MMC5.
by on (#221782)
I feel obliged to point out that I get a factor of 2 change in processor load between using FCEUX with "low" quality sound emulation and "very high" quality sound emulation.

On my previous computer, this was enough to be noticeable. (Like 15% vs 30%)


Also, it's not actually correct to say that converting the output of the noise channel into its spectral components is a "hack". It is quite literally just precalculating and caching the results. The only question is whether the complexity gets any performance benefit (unclear) or if the maintenance cost overwhelms any performance benefit (quite likely)
Re: Ultrasonic pulse behaviors for 2a0x and MMC5.
by on (#221784)
rainwarrior wrote:
It's also oversampling for the output device.

No. Oversampling is when you sample a signal at a higher rate than the signal requires, not whatever your desired output is. To quote Wikipedia:
Quote:
In signal processing, oversampling is the process of sampling a signal with a sampling frequency significantly higher than the Nyquist rate. [...] The Nyquist rate is defined as twice the highest frequency component in the signal.
The Nyquist rate of the NES's audio is also, surprise surprise, its sample rate. The fact that it contains frequencies all the way up to half that rate is exactly why a resampler is needed.

But you're right, arguing semantics isn't helpful. I just don't want people to hear "oversampled" and think it's either a) something unnecessary that the NES itself doesn't do, or b) some kind of enhancement that makes your implementation better than other emulators (or actual hardware) that don't "oversample".

rainwarrior wrote:
Actually, to give you an idea why this affects performance so much for NSFPlay, how often it has to jump back and forth between CPU emulation and generating samples really adds a lot of overhead, and as a side effect negatively affects code caching / branch prediction / etc. at the same time. One of my planned ideas for performance improvement will be to institute the concept of CPU<->audio sync points so it can operate on longer buffers at a time, but that's going to be a big overhaul of how the original code worked. I suspect eventually, the idea of needing to undersample the NES might even be able to disappear, but for now it's deeply rooted in how it works. (If you look at other old NSF players, like NotSoFatso, this was pretty commonly done.)

Yep, repeat does that too. One cycle at a time. One word: ouch.

That's why I stopped working on it, in favour of a VGM player that uses logs and batching to run hundreds or thousands of cycles of each channel before moving onto the next. Once I get around to re-implementing the NES audio chips, I'll rip out my NES CPU emulator, make it spit out a write log, and let the audio emulation run in batches as large as it likes. But that's a ways off yet.

rainwarrior wrote:
As for integer vs float... yeah I mean we have good vector hardware for floats these days. I'm not sure if NSFPlay will ever make it to float, but it's something that could be done in theory at least. If this were a starting-from-scratch situation it would be a lot easier to write something from the beginning that would be easier to switch to one or the other with a #define. (...maybe for NSFPlay 3.)

Would it be helpful to anyone if I put my resampler code up somewhere? It needs a good cleanup, but I feel obliged to put my money (or rather code) where my mouth is, so to speak.
Re: Ultrasonic pulse behaviors for 2a0x and MMC5.
by on (#221796)
Rahsennor wrote:
Once I get around to re-implementing the NES audio chips, I'll rip out my NES CPU emulator, make it spit out a write log, and let the audio emulation run in batches as large as it likes.

That's the paradigm I've been recommending for quite a while: one process emulates the CPU, outputting a stream of writes to another process that emulates the APU signal generation. I concede that parts of the APU would need to be emulated on the CPU side of the pipe as well, such as length counters if the program reads $4015.
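
A toy sketch of the write-log idea, assuming a single hypothetical "channel" whose output is simply the last value written to it (nothing like a full APU). The CPU side is represented only by its output, a list of `(cycle, value)` pairs, and `render_from_write_log` is an invented name:

```python
def render_from_write_log(writes, total_cycles, initial=0):
    """Render one level per CPU cycle from a cycle-stamped write log.

    writes: list of (cycle, value) pairs, sorted by cycle.
    Each stretch between writes is produced in one batch rather than
    by stepping the CPU and the audio side cycle by cycle.
    """
    out = []
    level = initial
    position = 0
    for cycle, value in writes:
        cycle = min(cycle, total_cycles)
        out.extend([level] * (cycle - position))  # batch up to the write
        position = max(position, cycle)
        level = value
    out.extend([level] * (total_cycles - position))  # tail after last write
    return out


if __name__ == "__main__":
    log = [(2, 5), (4, 1)]  # as if the CPU wrote 5 at cycle 2, then 1 at cycle 4
    print(render_from_write_log(log, 6))
```

A real implementation would replace the "repeat the level" step with running the channel's timer and sequencer for that many cycles, and would still need the CPU-side state mentioned above for reads such as $4015.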
Re: Ultrasonic pulse behaviors for 2a0x and MMC5.
by on (#222529)
Accounting for all cycles and, when resampling, an accurate lowpass filter are needed for accurate results. Some box car interpolation can speed up the process, but the larger the ratio, the more inconsistent the attenuation. In my own tests, however, I've found that FIR alone can deliver messy results. So I've found a balance between the two.

Attached is 96 kHz, a box car of 4:1, followed by FIR (Blackman-Nuttall, 160 taps). The results are pretty clean, although the mixed channel portions differ from the hardware render (due to the nature of my MMC5 emulation).

Edit: corrected spelling of Nuttall. Also, the "messy results" I mentioned for FIR alone are probably because large fixed-point coefficient windows don't seem to respond well to raw rectangular pulses.

Update: it turns out the messy results I was getting were due to the limited precision used in interpolating the polyphase coefficients. The results are now cleaner. Now down to 139 taps using a Kaiser window.
Re: Ultrasonic pulse behaviors for 2a0x and MMC5.
by on (#223768)
This is why you can't use boxcar alone (see attachment), as some emulators may do. While 2:1 is mathematically sound, no ratio provides 1:1 frequency response, and a pattern emerges the higher the ratio.
Re: Ultrasonic pulse behaviors for 2a0x and MMC5.
by on (#223776)
A boxcar filter in time is a sinc in frequency, and vice versa. (Your graphs are "just" of abs(sinc(x)) on a semilog plot)
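
For anyone who wants to reproduce such a curve, here is a sketch of the discrete-time version of that response (a periodic sinc) for an N-tap moving average. The function name and the 48kHz/4-tap figures are assumptions for illustration only:

```python
import math


def boxcar_gain(freq_hz, rate_hz, taps):
    """Magnitude response of a `taps`-long moving average running at rate_hz."""
    x = math.pi * freq_hz / rate_hz
    if math.isclose(math.sin(x), 0.0, abs_tol=1e-12):
        return 1.0  # limit at DC and at multiples of the sample rate
    return abs(math.sin(taps * x) / (taps * math.sin(x)))


if __name__ == "__main__":
    for freq in (0, 6000, 12000, 18000, 24000):
        gain = boxcar_gain(freq, 48000, 4)
        db = 20 * math.log10(gain) if gain > 1e-9 else float("-inf")
        print(f"{freq:6d} Hz  gain {gain:.4f}  ({db:.1f} dB)")
```

With these example numbers there is a null at 12kHz (rate ÷ taps), but the first side lobe at 18kHz is only about 11dB down, which is the weak stopband being discussed.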

The only robust way to use a boxcar filter is to use it as a close-but-not-quite 1st order lowpass.
Re: Ultrasonic pulse behaviors for 2a0x and MMC5.
by on (#223813)
Yes, boxcar is rectangular interpolation, in that it lines up with a rectangular sinc window. And likewise it has no side lobe suppression nor any flattening of the top in terms of frequency response. But it can make a good soft lowpass at 2:1 or an okay first order filter. Or, as used in my emulator, it reduces computation time and the memory required for storing samples (since the first stage uses the "WAVE[cycle>>boxcarshift]+=out" approach found in FCE Ultra).
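
A minimal sketch of that accumulate-by-shift first stage, written from the one-line description above rather than from FCE Ultra's source, so the function name and buffer handling are assumptions:

```python
def boxcar_accumulate(cycle_samples, boxcar_shift):
    """Sum per-cycle output levels into bins of 2**boxcar_shift cycles."""
    bin_size = 1 << boxcar_shift
    bins = (len(cycle_samples) + bin_size - 1) >> boxcar_shift
    wave = [0] * bins
    for cycle, out in enumerate(cycle_samples):
        wave[cycle >> boxcar_shift] += out  # WAVE[cycle>>boxcarshift] += out
    return wave


if __name__ == "__main__":
    # Eight cycles of a square-ish signal reduced 4:1.
    print(boxcar_accumulate([1, 1, 0, 0, 1, 1, 1, 1], 2))
```

The bins hold sums rather than averages, so a later stage would divide by `1 << boxcar_shift` (or fold that factor into the FIR gain) before output.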