Hi, so when playing Rockman 2 (Megaman 2), I am getting a buzzing sound all over the place. This can be fixed with band-limited resampling or sinc resampling, but those things are rather intensive.
aliaspider recently noted that adding "if(period == 0) return 0;" to the top of my APU::Triangle::clock() function makes the buzzing go away completely, even with a simple Hermite audio filter (many, many times less CPU-intensive).
Could anyone please look at my triangle emulation code and let me know if that is a proper fix (i.e. how the hardware really works), or if the game is really supposed to create this buzzing sound without proper audio resampling?
(Note that I use a variable-length integer class. So a uint5 type, when the value is 31, and you add one, will wrap to zero automatically. Much like your traditional uint8 type does now from 255->0. This is why there aren't as many bit-masks in the below code.)
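For readers without such a class, here is a minimal sketch of what a wrapping N-bit unsigned type could look like. UintN, uint5_sketch and uint11_sketch are hypothetical names for illustration only; this is not the actual class used in the emulator.

```cpp
#include <cstdint>

// Hypothetical stand-in for a variable-length integer class:
// an unsigned integer that keeps only its low `Bits` bits.
template<unsigned Bits>
struct UintN {
  static_assert(Bits >= 1 && Bits <= 31, "this sketch supports 1..31 bits");
  static constexpr uint32_t mask = (uint32_t(1) << Bits) - 1;

  uint32_t value = 0;

  UintN() = default;
  UintN(uint32_t v) : value(v & mask) {}

  // Read the stored value as a plain integer.
  operator uint32_t() const { return value; }

  // Every write is masked, so callers need no explicit bit-masks.
  UintN& operator=(uint32_t v) { value = v & mask; return *this; }
  UintN& operator++() { value = (value + 1) & mask; return *this; }
  UintN operator++(int) { UintN old = *this; ++*this; return old; }
  UintN& operator--() { value = (value - 1) & mask; return *this; }
};

using uint5_sketch = UintN<5>;    // wraps 31 -> 0
using uint11_sketch = UintN<11>;  // wraps 0x7ff -> 0
```

With a type like this, an assignment such as (period & 0x00ff) | (data << 8) discards the bits above bit 10 on its own, which is why the register writes below can skip the masks.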
Many thanks in advance!
Code:
struct Triangle {
  unsigned length_counter;

  uint8 linear_length;
  bool halt_length_counter;

  uint11 period;
  unsigned period_counter;

  uint5 step_counter;
  uint8 linear_length_counter;
  bool reload_linear;

  void clock_length();
  void clock_linear_length();
  uint8 clock();

  void power();
  void reset();
} triangle;
Code:
void APU::Triangle::clock_length() {
  if(halt_length_counter == 0) {
    if(length_counter > 0) length_counter--;
  }
}

void APU::Triangle::clock_linear_length() {
  if(reload_linear) {
    linear_length_counter = linear_length;
  } else if(linear_length_counter) {
    linear_length_counter--;
  }

  if(halt_length_counter == 0) reload_linear = false;
}

uint8 APU::Triangle::clock() {
  uint8 result = step_counter & 0x0f;
  if((step_counter & 0x10) == 0) result ^= 0x0f;
  if(length_counter == 0 || linear_length_counter == 0) return result;

  if(--period_counter == 0) {
    step_counter++;
    period_counter = period + 1;
  }

  return result;
}

void APU::Triangle::power() {
  reset();
}

void APU::Triangle::reset() {
  length_counter = 0;

  linear_length = 0;
  halt_length_counter = 0;

  period = 0;
  period_counter = 1;

  step_counter = 0;
  linear_length_counter = 0;
  reload_linear = 0;
}
Code:
void APU::write(uint16 addr, uint8 data) {
  const unsigned n = (addr >> 2) & 1; //pulse#

  switch(addr) {
  ...
  case 0x4008:
    triangle.halt_length_counter = data & 0x80;
    triangle.linear_length = data & 0x7f;
    break;

  case 0x400a:
    triangle.period = (triangle.period & 0x0700) | (data << 0);
    break;

  case 0x400b:
    triangle.period = (triangle.period & 0x00ff) | (data << 8);

    triangle.reload_linear = true;

    if(enabled_channels & (1 << 2)) {
      triangle.length_counter = length_counter_table[(data >> 3) & 0x1f];
    }
    break;

  case 0x4015:
    ...
    if((data & 0x04) == 0) triangle.length_counter = 0;
    break;
  }
}

void APU::clock_frame_counter() {
  frame.counter++;

  if(frame.counter & 1) {
    pulse[0].clock_length();
    pulse[0].sweep.clock(0);
    pulse[1].clock_length();
    pulse[1].sweep.clock(1);
    triangle.clock_length();
    noise.clock_length();
  }

  pulse[0].envelope.clock();
  pulse[1].envelope.clock();
  triangle.clock_linear_length();
  noise.envelope.clock();

  if(frame.counter == 0) {
    if(frame.mode & 2) frame.divider += FrameCounter::NtscPeriod;
    if(frame.mode == 0) {
      frame.irq_pending = true;
      set_irq_line();
    }
  }
}
The hardware triangle is NOT halted by 0 in the period register.
It's very normal to halt it in emulation because of the reason you mentioned. It's really hard to implement a filter that can cut it out as nicely as your analog hardware does.
You can verify this on hardware by setting the period to 0 and turning the triangle on and off. You'll hear a popping where the triangle starts and stops.
My choice for NSFPlay was to halt on period 0 by default, but I gave the option to turn it off (even though my oversampling process is not good enough to suppress it entirely, it still comes through as a high pitched aliased ringing).
One hack that's probably close enough is to output a constant 7 while the period value is less than 2.
As everyone else has said, you've run into the world's worst aliasing problem. In practice, you're probably safest using tepples's suggestion, possibly even for periods as low as 0-3, since all of those are ultrasonic for most people.
AFAIK the only one that matters in practice is period 0. I've seen many games that use period 0 instead of halt (e.g. Mega Man 2, Silver Surfer) but I've never seen any try to use other ultrasonic periods for anything. There's no reason for anyone to do so, really.
Alright, thank you very much for the information.
Was hoping for an easy and correct fix =(
Only cheap solutions seem to be to falsely halt on period 0, or to simply decimate the audio 32x prior to resampling.
I suppose I'll keep looking for a more efficient / simplistic resampling algorithm than windowed sinc / band-limited synthesis that can remove the aliasing ...
Tepples's suggestion (just have the DAC output a constant value of 7.5 when ultrasonic) is actually what correct antialiasing would result in, so I'm not certain it's actually a hack. Especially if you keep the phase running for whenever the game brings the pitch back down to audible frequencies.
It's what correct anti-aliasing would do when the triangle state is not changing. It's not correct during any transition, however. As I mentioned, the triangle can still be halted in its high frequency state, and this absolutely makes audible noise on the hardware. This isn't important to emulating any game I know of, however. Possibly Mega Man 2's triangle has a bit more bite if you correctly emulate this, though aesthetically it may be worse off for it.
I prefer to halt the triangle instead of jumping to 7.5 because it very slightly reduces the popping noise of a halt by freezing at current level instead of jumping to 7. It's perhaps a tradeoff, though, since the triangle position also influences the nonlinear mix. Though, the standard nonlinear mix curve we currently use isn't totally accurate to begin with, not to mention the huge variability in hardware mix between the two APU pins, so that kind of subtlety is a bit lost on the NES as a general rule. The games that do this are really just trying to silence the triangle anyway, so it's not a big deal either way. Both methods result in the expected silent triangle, and the difference is quite subtle. (I doubt I could pick them apart in a blind test.)
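To make the two compromises concrete, here is a minimal sketch of a triangle clock that can run accurately, freeze at the current level, or output a fixed mid level while the phase keeps running. TriangleSketch, UltrasonicMode and the running flag are hypothetical names, the threshold of 2 is an assumption taken from the suggestions above, and the integer 7 stands in for 7.5 because the 4-bit level cannot represent the half step.

```cpp
#include <cstdint>

// Which compromise to apply while the period is ultrasonic (hypothetical names).
enum class UltrasonicMode { Accurate, Freeze, ConstantMid };

struct TriangleSketch {
  uint16_t period = 0;          // 11-bit timer reload value
  uint16_t period_counter = 1;  // counts down once per CPU cycle
  uint8_t  step = 0;            // 5-bit sequencer position (0..31)
  bool     running = true;      // stands in for both length counters being nonzero
  UltrasonicMode mode = UltrasonicMode::Freeze;

  // 4-bit DAC level for the current sequencer position: 15..0, then 0..15.
  uint8_t level() const {
    uint8_t r = step & 0x0f;
    if((step & 0x10) == 0) r ^= 0x0f;
    return r;
  }

  // Clock once per CPU cycle; returns the level to feed to the mixer.
  uint8_t clock() {
    const uint8_t result = level();
    const bool ultrasonic = period < 2;  // assumed threshold

    // Freeze: hold the sequencer where it is, so the output does not jump.
    if(ultrasonic && mode == UltrasonicMode::Freeze) return result;

    if(running && --period_counter == 0) {
      step = (step + 1) & 0x1f;
      period_counter = uint16_t(period + 1);
    }

    // ConstantMid: the phase keeps advancing, only the output is overridden.
    if(ultrasonic && mode == UltrasonicMode::ConstantMid) return 7;
    return result;
  }
};
```

Freeze avoids a level jump at the moment the period goes ultrasonic but loses phase; ConstantMid keeps phase but can jump by up to 8 levels.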
Quote:
This can be fixed with band-limited resampling or sinc resampling, but those things are rather intensive
Is the resampling performance really that big of a deal? (unless you're targeting really old hardware). I'm using Blargg's Blip_Buffer in a chiptune player I'm writing for mobile phones. As an example, my GBS player uses a total of 8 Blip_Synths (4-channel stereo), each clocked at 1 MHz, and this runs just fine on my Galaxy S2+.
As I understand it, the performance of BLEP resampling algorithms like blargg's is O(n) in the number of transitions. Doing it without hacks is fine for square and DMC because they change less often. DMC is limited to how fast writes to $4011 can occur, and square is limited to two transitions per 144 cycles because it shuts off when the period value becomes less than 8. But triangle and noise can have a transition every 1 or 2 cycles. Do GBS files routinely drive the wavetable at period $7FF (inverted vs. NES) and noise at maximum rate?
Quote:
Do GBS files routinely drive the wavetable at period $7FF (inverted vs. NES) and noise at maximum rate?
Probably not. My point was that I have 8 synths (which I know is overkill) and still get performance that is good enough for a mobile phone, and you'd really only need 1 or 2 synths for NES audio (ignoring expansion audio).
byuu wrote:
This can be fixed with band-limited resampling or sinc resampling, but those things are rather intensive.
Aren't you famous for your emulators' CPU requirements? Surely, a little more wouldn't hurt.
I wrote a low CPU utilization SSE filter (512 tap FIR. IIRC, it accounts for about 0.3% CPU utilization on my 1.86GHz Core 2 Duo laptop). If you (or anyone else) are interested, I'll post the code.
FWIW, you can get a massive performance boost in a bandlimited polyphase resampler if you don't care if the sample rate conversion ratio is slightly off (fewer impulse response phases need to be stored, and you don't have to interpolate between phases anymore).
Like so:
1789772.72 * 7 / 261 = 48001.5672
1789772.72 * 12 / 487 = 44101.1759
...and by using 16-bit samples and impulse response coefficients, and creating four copies of each phase's impulse response with zero-padding on the front (and end for SIMD multiply granularity) to account for possible alignments of the input samples read position, you can utilize MMX to make a decent-quality resampler that'll run significantly faster than realtime even on a Pentium II.
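The ratio arithmetic above can be checked with a small sketch. resampled_rate and rate_error_ppm are hypothetical helper names, and the input clock is the 1789772.72 Hz figure from the post.

```cpp
// NTSC CPU/APU clock in Hz, as used in the figures above.
constexpr double kNtscCpuHz = 1789772.72;

// Output rate when the input is resampled by the rational factor up/down.
constexpr double resampled_rate(double input_hz, unsigned up, unsigned down) {
  return input_hz * up / down;
}

// How far the achieved rate is from the nominal target, in parts per million.
constexpr double rate_error_ppm(double actual_hz, double target_hz) {
  return (actual_hz - target_hz) / target_hz * 1e6;
}
```

For example, rate_error_ppm(resampled_rate(kNtscCpuHz, 7, 261), 48000.0) comes out at roughly 30 ppm, which is the "slightly off" being traded for needing only 7 (or 12) stored phases.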
In the current release of nemulator, I decimate the APU output by 40 to get a ~44671 Hz sample rate. I then adjust the buffer's playback rate accordingly and let Windows deal with resampling it to the sound card's playback rate. I wish I knew exactly what it was doing under the hood...
Anyway, decimating by 40 (using floats with appropriately sized filter and sample buffers) allows everything to be aligned on 16-byte boundaries, so the SSE implementation is really fast.
I'm currently working on an arbitrary sample rate converter (that doesn't require the OS to perform sample rate conversion). It's working well, but results in a few % additional CPU utilization. I think I'll be able to get it within a couple of points of the current implementation, though.
James wrote:
Aren't you famous for your emulators' CPU requirements?
Probably, but I try not to entirely waste CPU cycles. They usually go to emulating fine details. The sinc filter cuts the framerate in half in NES and GB mode. When emulating the SNES+GB at the same time with that filter, it gets pretty bogged down.
The MM2 triangle popping is really easy to hear in Crash Man's stage: https://dl.dropboxusercontent.com/u/200 ... ycrash.mp3. Here's a version with the triangle isolated: https://dl.dropboxusercontent.com/u/200 ... lepops.mp3. The period is set to 0 to "silence" the triangle.
Those samples are using band-limited resampling via blip_buf btw, so that alone won't fix the problem. Pretty sure I've heard it in a playthrough done on a real NES as well.
I silence (halt) the triangle when the period is less than two, corresponding to the ultrasonic frequencies (0 => ~56 kHz, 1 => ~28 kHz). I also silence it when the period is greater than 0x7FD after a tip from Kevtris, as some games apparently "disable" the triangle channel by setting the period to 0x7FF/0x7FE. Not sure how that would sound with overtones from the steppiness and all, but he indicated it probably wouldn't be deliberate.
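As a rough check of those figures: the triangle has 32 steps per cycle and advances one step every period + 1 CPU cycles, so its fundamental is the CPU clock divided by 32 * (period + 1). A small sketch, with hypothetical function names and assuming the NTSC clock of 1789772.72 Hz:

```cpp
// NTSC CPU clock in Hz (assumed; PAL differs).
constexpr double kNtscCpuHz = 1789772.72;

// Fundamental frequency of the triangle for a given 11-bit period value:
// 32 sequencer steps per cycle, one step every (period + 1) CPU cycles.
constexpr double triangle_hz(unsigned period) {
  return kNtscCpuHz / (32.0 * (period + 1));
}

// The muting rule described above: ultrasonic periods 0 and 1,
// plus the two lowest pitches that some games use as a fake "mute".
constexpr bool should_silence(unsigned period) {
  return period < 2 || period > 0x7fd;
}
```

triangle_hz(0) is about 55.9 kHz and triangle_hz(1) about 28.0 kHz, while triangle_hz(0x7ff) is only about 27 Hz.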
You could use a LUT for the triangle steps and premultiply the values for non-linear mixing as a microoptimization btw. The code still comes out pretty obvious.
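A minimal sketch of that table idea, assuming the commonly cited approximation for the triangle/noise/DMC mixer, 159.79 / (1 / (t/8227 + n/12241 + d/22638) + 100). The function names are hypothetical, and as noted earlier in the thread the curve itself is only approximate.

```cpp
#include <array>

// Build a 32-entry table mapping sequencer step -> triangle term of the
// nonlinear mixer (level / 8227), so the per-sample divide disappears.
inline std::array<double, 32> make_triangle_mix_lut() {
  std::array<double, 32> lut{};
  for(unsigned step = 0; step < 32; step++) {
    unsigned level = step & 0x0f;
    if((step & 0x10) == 0) level ^= 0x0f;  // first half counts down 15..0
    lut[step] = level / 8227.0;
  }
  return lut;
}

// Mix the premultiplied triangle term with raw noise (0..15) and DMC (0..127).
inline double mix_tnd(double triangle_term, unsigned noise, unsigned dmc) {
  const double sum = triangle_term + noise / 12241.0 + dmc / 22638.0;
  if(sum == 0.0) return 0.0;  // avoid dividing by zero when all channels are silent
  return 159.79 / (1.0 / sum + 100.0);
}
```

The clock function would then return the sequencer step (or the table entry directly) instead of the raw 4-bit level.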
ulfalizer wrote:
I also silence it when the period is greater than 0x7FD after a tip from Kevtris, as some games apparently "disable" the triangle channel by setting the period to 0x7FF/0x7FE. Not sure how that would sound with overtones from the steppiness and all, but he indicated it probably wouldn't be deliberate.
Tetris (J) uses a low pitch on triangle in the intro to "Technotris". The prominent 31st and 33rd harmonics of a low-pitch triangle add a noticeable metallic character to the intro. Later in the song, it gets silenced properly.
Bullet Proof's Tetris uses the triangle at a pitch of $686 as a low fill or something, kinda unclear how it's supposed to work musically (though it's an appropriate bass note C, kinda). It wouldn't be caught by silencing $7FF though.
In the high frequency case silencing the top notes is a compromise that produces a more accurate sound (despite less accurate underlying emulation) due to the limitations of your resampling process.
In the low frequency case you're not getting more accurate emulation OR sound, so both of those goals are defeated. Maybe it sounds nicer though for games that do it. What games are these, btw? I haven't run into that problem yet.
rainwarrior wrote:
In the low frequency case you're not getting more accurate emulation OR sound, so both of those goals are defeated. Maybe it sounds nicer though for games that do it. What games are these, btw? I haven't run into that problem yet.
Yeah, the only point would be "nicer" sound for games that attempt to silence the triangle that way. Found a log fragment:
Code:
<@kevtris> yah
<@kevtris> oh yeah
<@kevtris> add a few more
<@kevtris> if triangle period is > 7fdh mute
<@kevtris> some games set it to 7ff (max) to "mute" it
<@kevtris> that solitaire game "mutes" the triangle by setting it to 7ff
<@kevtris> so you can hear the low pitch growl if you listen
<@kevtris> you can hear its aliasing
I'd be more concerned with Bubble Bobble. I'm pretty sure it sets it to something extremely high to silence it in the main theme melody.
Hmm, Bubble Bobble is unusual. It uses halts properly in the intro, but the short bass notes are terminated by setting the period to 1.
Thanks for the tip, I'd never noticed anything using 1 to silence the triangle before. Kinda bizarre.
rainwarrior wrote:
Hmm, Bubble Bobble is unusual. It uses halts properly in the intro, but the short bass notes are terminated by setting the period to 1.
Thanks for the tip, I'd never noticed anything using 1 to silence the triangle before. Kinda bizarre.
Maybe they noticed that 0 sounded bad, switched to 1 and noticed that it sounded less bad, and called it a day.
What you should be doing is checking whether the period of the channel generates a waveform that completes a full oscillation in less than 2 samples. So in other words, if you're emulating the sound and outputting at 44100 Hz, then a triangle wave (or any wave, really) that is higher pitched than 22050 Hz should be ignored (and optionally, you can output the DC bias of the waveform instead, since that's basically what the output would turn into at super-high frequencies).
This is the technically "correct" way to handle this kind of problem, but simply checking whether the period setting is low enough seems to be good enough for most games.
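A minimal sketch of that check, assuming the NTSC clock and the usual 32 steps per triangle cycle. The function names are hypothetical, and 7.5 is the DC bias of the 0..15 triangle.

```cpp
// NTSC CPU clock in Hz (assumed).
constexpr double kCpuHz = 1789772.72;

// Fundamental of the triangle for a given period register value.
constexpr double triangle_fundamental_hz(unsigned period) {
  return kCpuHz / (32.0 * (period + 1));
}

// True when one full oscillation takes less than two output samples,
// i.e. the fundamental lies above the Nyquist frequency.
constexpr bool above_nyquist(unsigned period, double output_rate_hz) {
  return triangle_fundamental_hz(period) > output_rate_hz / 2.0;
}

// Level to hand to the mixer: the DC bias when the wave cannot be
// represented at this output rate, otherwise the raw 4-bit level.
constexpr double output_level(unsigned period, unsigned raw_level, double output_rate_hz) {
  return above_nyquist(period, output_rate_hz) ? 7.5 : double(raw_level);
}
```

At 44100 Hz and 48000 Hz this flags exactly periods 0 and 1, which matches the simpler "period less than 2" test; at 96000 Hz only period 0 is flagged.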
Drag wrote:
What you should be doing is checking whether the period of the channel generates a waveform that completes a full oscillation in less than 2 samples. So in other words, if you're emulating the sound and outputting at 44100 Hz, then a triangle wave (or any wave, really) that is higher pitched than 22050 Hz should be ignored (and optionally, you can output the DC bias of the waveform instead, since that's basically what the output would turn into at super-high frequencies).
This is the technically "correct" way to handle this kind of problem, but simply checking whether the period setting is low enough seems to be good enough for most games.
My signal-fu is a bit weak, so bear with me, but wouldn't outputting the DC bias be the same thing you get with bandlimited resampling out of the box? As far as I've understood, the triangle pops in e.g. MM2 happen on the real thing too, so getting rid of them is not just a fix for emulators but also an improvement over the baseline (if you prioritize "nice" over "accurate" in that case).
The motivation for stopping the triangle instead of e.g. setting the output level to 0 was to avoid jumps in the output level. One downside is that it changes the phase of the triangle. Not sure how much of a problem that is in practice.
This might be a stupid idea, but what if you let the triangle run until its level matches the DC bias and then halt it? That way you'd get both the correct volume level with non-linear mixing (I think) and also no jumps in the output level.
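One way that idea could look, as an untested sketch: since the 4-bit level can never equal 7.5 exactly, this version assumes "matches the DC bias" means reaching level 7 or 8. SettlingTriangle is a hypothetical name and the length counters are left out for brevity.

```cpp
#include <cstdint>

struct SettlingTriangle {
  uint16_t period = 0;          // 11-bit timer reload value
  uint16_t period_counter = 1;  // counts down once per CPU cycle
  uint8_t  step = 0;            // 5-bit sequencer position (0..31)

  // 4-bit DAC level for the current sequencer position: 15..0, then 0..15.
  uint8_t level() const {
    uint8_t r = step & 0x0f;
    if((step & 0x10) == 0) r ^= 0x0f;
    return r;
  }

  // Clock once per CPU cycle. While the period is ultrasonic, keep running
  // only until the level sits next to the 7.5 DC bias, then hold it there.
  uint8_t clock() {
    const uint8_t result = level();
    const bool settled = period < 2 && (result == 7 || result == 8);
    if(settled) return result;

    if(--period_counter == 0) {
      step = (step + 1) & 0x1f;
      period_counter = uint16_t(period + 1);
    }
    return result;
  }
};
```

With period 0 the sequencer needs at most 16 CPU cycles to reach 7 or 8, so the run-out is far shorter than one output sample.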
Yeah, the DC bias is the technically correct output, since a filter just removes all the frequencies above Nyquist, leaving the 0Hz DC alone, which is equivalent to 7.5 on the DAC.
As for enhancing the output to reduce clicks, you could do a quick cross-fade between the triangle still playing and the DC bias. Experiment with the duration, as too quick will still have some high-frequency content that's noticeable. Or simpler, just put an IIR low-pass filter that you turn on when stopping the triangle, which smoothly moves the amplitude to the DC bias level. e.g. level = current DAC, then for each sample, level = (level * 0.95) + (dc_bias * 0.05)
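The one-pole smoothing described above could be sketched like this. DcFade is a hypothetical name, and the 0.05 coefficient is just the example value from the post; it would need tuning against the output sample rate.

```cpp
// One-pole IIR fade from the last DAC level toward the triangle's DC bias.
struct DcFade {
  double level;           // starts at the DAC level when the triangle stops
  double dc_bias = 7.5;   // midpoint of the 0..15 triangle
  double alpha = 0.05;    // example coefficient; larger means a faster fade

  explicit DcFade(double current_dac_level) : level(current_dac_level) {}

  // Call once per output sample while the triangle is silenced.
  double next() {
    level = level * (1.0 - alpha) + dc_bias * alpha;
    return level;
  }
};
```

Starting from level 15, the first call returns 14.625 and the value then decays geometrically toward 7.5 without ever overshooting it.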