Trying to Understand BRR

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
View original topic
Trying to Understand BRR
by on (#115181)
Hi guys,

in order to thoroughly understand BRR, I am coding my own BRR encoder. I am starting simple, by just using a filter of 0 for all packets.

I started with this Veldt article: http://emureview.ztnet.com/developersco ... PU/spc.htm
If you scroll down to the section "SPC-700 SAMPLES" I followed his directions there.

I dont really know much about waveforms but I found a simple WAV file, stripped it to RAW, and started coding my encoder. experimenting a lot with it because it just isn't offering a clear sample.
I went and found BRRTools and, that program is doing things above my scope, however when I use it to encode the same sample with only filter 0, I get a pretty similar looking resulting brr file as from my encoder. In fact, practically every byte in the job is only different by a range of 0-2.

I can't speak much accurately from this point forward, which is why I came here seeking help. I am posting up a directory on my webspace with the WAV file, raw file, brr file from my encoder, brr file from BRRTools with filter set to 0 only (resampling n1.0, which should not alter anything at all), as well as my current source code. OK I would very much appreciate you cool dudes' insight onto this issue of mine.

Thanks bros!

Bazz

p.s. Files: http://www.cs.umb.edu/~bazz/mybrr/

Note: I manually pre-prepended 10 bytes of 0x00 to my encoder's output to more easily match with results from BRRTools.

p.p.s I have a the SNES source code and SPC700 source, I didn't bother to upload it.. Hmm I'll do that now.
Everything is in the snes_side directory and can be compiled with WLA and Tasm. I'm not going to get into instructions. Good_music.smc is with the BRRTools, and bad_music.smc is with my brr encoder. Of course, with filtering, the good_music sample can be even better. Tried it.
Re: Trying to Understand BRR
by on (#115182)
I would really love to find a thorough explanation behind the picking and choosing of range and step values. When you look at my code you might understand my logic in how I chose. (Apparently I have no knowledge in this area DSP, but I did look at some pages of the signature DSP articles that Tepples always referred to, I believe from years ago: XXX page (not caring to find it )

I also haven't really investigated into ADPCM, perhaps that research may shed light on what I am looking for in understanding waveforms in English. And performing the comparisons of results using different range values and step values and how to find the best one.
Re: Trying to Understand BRR
by on (#115187)
The simplest approach for a BRR encoder would be to not use any of the filters (i.e. use filter 0).

The next simplest approach would be to, for every block in the sample, try all possible filter settings and compare the result to the original audio data (e.g. using the mean of squared errors, or some more advanced measurement if you prefer) and use the filter setting that results in the closest match. You could expand this to take neighboring blocks into account.
Re: Trying to Understand BRR
by on (#115190)
I think I'm going to try encoding at different ranges, and decode, get absolute deviation for all points, get an average deviation. After doing this for all levels, compare for smallest average deviation and select that range, use it's precalculated values for the BRR packet.
Re: Trying to Understand BRR
by on (#115220)
Well. It turns out that even my first attempt which brought me here actually 'works' the only issue was that I was using a ungetc() function it sucks balls to say the least.

so I got rid of that :'(

But if it wasn't for this, I wouldn't have learned about sum of deviation. I dunno what that is, I made it up! Oh :mrgreen:

I'm so tired.. I've been up all night:|

Well this exercise helped me in learning a lot about waves. i lerned a lot like a crapload a boutthe wavs. I watched some obscure videos about it too. check this out: http://www.youtube.com/watch?v=8x0ZynWlmDI Im surprised there's no Video bbcode, but I like that there isn't. Nesdev is my safe haven from that bullcrap on facebook and memes and shit.

Now, I don't completely understand how DPCM / ADPCM correlates to the BRR format, maybe because I learned non-traditionally. Once I started learning more about DPCM, I said OH, it's the delta's between adjacent samples (or involving multiple past samples). However!! Low and behold, as the SNES Dev Manual kinda illustrated to me, the steps all stem from 0. I originally interpreted it this way, but the learning of DPCM made me think maybe the step was from the last sample step and so on. I don't know if I'm misunderstanding how this Delta Farce comes in the situation, but I'm not seeing in the BRR yet, nope. And just because filtering will use the past samples doesn't have anything to do with the delta of the steps, does it?

YUGH, IDK.

Hey well if you would like to see my roughly completed source code, and by rough I mean I added support for WAV files but not much processing of them at the moment TEHE.

Another thing I don't know the purpose of, for instance in BRRTools, you can resample your wave file, and interpolate it (don't know what this means yet). Yowza. Don't know what the means and why you'd want to do it.

Then there's the filters to learn about...
Another interesting, mildly confusing document on the adpcm. Hold, this first:

MSDN states:
Quote:
ADPCM stores the value differences between two adjacent PCM samples and makes some assumptions that allow data reduction. Because of these assumptions, low frequencies are properly reproduced, but any high frequencies tend to get distorted.


but in BRR, where is this 'value differences' between 2 adjacent PCM samples?? I suppose it usesthe adjacent samples to get the range value. yeah that's it. Stores/Uses, it's all the same balogna (baloney), but I needed to hear *uses*.

Other useful info on this stuff:
http://ww1.microchip.com/downloads/en/a ... 00643b.pdf
=> Right (t)here baby, it includes a ADPCMEncode() function in C along with detailed comments on the process. That was useful and you could see the light shine a bit on some of that mysterious logic in BRRTools' rip from adpcm.c of SoX, I'm still getting tricked up with it.

I never did link my updated source code. EH i'm too lazy for it.
Well I'll do it anyways: http://www.cs.umb.edu/~bazz/mybrr2/
Re: Trying to Understand BRR
by on (#115243)
BRR is a (proprietary) form of ADPCM. IMA, as used in DS games, is another (standardized) form of ADPCM. There are a few key conceptual differences between the two. Unlike BRR:
  • IMA predicts the step index (that is, range) value for the next samples from the residual values for the previous samples instead of storing it explicitly.
  • IMA uses a fixed filter pred[t] = y[t - 1].
  • IMA stores end and loop points separately.

As for picking which filter to use, try filtering with all four filters and picking the one that gives the smallest peak value. Then choose the range based on this peak value.

As for having to stay within a 15-bit range in order not to screw up Gaussian filtering, there are a few ways to do this.
  • Scale all samples down by 50% (-6 dB).
  • Load the whole sample, find the maximum and minimum, and normalize the sample to fit in +/- 16368.
  • Provide amplification as an option.

Finally, Gaussian filtering strongly attenuates the high frequency. You may want to provide an EQ option to boost the treble at encoding time to compensate for this.
Re: Trying to Understand BRR
by on (#115277)
Quote:
As for picking which filter to use, try filtering with all four filters and picking the one that gives the smallest peak value. Then choose the range based on this peak value.


Can you explain to me why picking the smallest peak value is more preferred than picking from what is closest matched to the original sample values? So it's just to compress the waveform?

Quote:
As for having to stay within a 15-bit range in order not to screw up Gaussian filtering, there are a few ways to do this.
  • Scale all samples down by 50% (-6 dB).
  • Load the whole sample, find the maximum and minimum, and normalize the sample to fit in +/- 16368.
  • Provide amplification as an option.

Finally, Gaussian filtering strongly attenuates the high frequency. You may want to provide an EQ option to boost the treble at encoding time to compensate for this.


Thanks for the recommendations
Re: Trying to Understand BRR
by on (#115279)
bazz wrote:
Quote:
As for picking which filter to use, try filtering with all four filters and picking the one that gives the smallest peak value. Then choose the range based on this peak value.


Can you explain to me why picking the smallest peak value is more preferred than picking from what is closest matched to the original sample values? So it's just to compress the waveform?


Hm well I just tried something which I thought is to the effect or what you are recommending. Since I'm still stuck on Filter 0, this is what I tried:

I already have code that for every range value, for every sample value, finds the 2 steps that the sample lies between and calculates the one with the shortest distance. An array of all of these distances is stored for every range. To select the optimal range, I add the deviations for all samples in a range for all ranges, and I choose the one with the smallest accumulative deviation.

So what I altered in order to do something like what you were saying, is that my code also stores all the nibble step values. What I did was find the biggest step value (perhaps incorrectly) for each range, and then use the range with the smallest biggest nibble'd step value. This resulted in a poorer sample quality, as I compared with my original code.

Perhaps am I confusing biggest nibble with Peak nibble. Is it not just the highest value in a BRR packet, is it specifically a packet that has the nibbles for a section of the wave where it hits a peak. How can one find a peak.
Re: Trying to Understand BRR
by on (#115290)
bazz wrote:
Can you explain to me why picking the smallest peak value is more preferred than picking from what is closest matched to the original sample values? So it's just to compress the waveform?

Each block specifies a filter and a step size for the prediction residues. Each step size imposes a maximum absolute value for the prediction residues: no more than 7 steps away from zero. The smaller the absolute value, the smaller the step size you can use without saturating, and the smaller the step size, the closer you can match the original waveform.

Quote:
So what I altered in order to do something like what you were saying, is that my code also stores all the nibble step values. What I did was find the biggest step value (perhaps incorrectly) for each range, and then use the range with the smallest biggest nibble'd step value. This resulted in a poorer sample quality, as I compared with my original code.

By "biggest", are you referring to most positive or largest absolute value? Or perhaps you're allowing some saturation in an individual block. Could you upload wav files of the original sample, converted to BRR and back to wav using your suggestion, and converted to BRR and back to wav using your interpretation of my suggestion?

Ultimately the fastest way to avoid saturation is to find the largest absolute step value in a block, find the smallest range that fits that value, and encode with that range.