Treatice draft for background music

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
Treatice draft for background music
by on (#202000)
Disclaimer: I've been meaning to make a blog post about this, perhaps with example files. This is a rough draft for a treatise on mixing NES bgm (as opposed to hit singles / chip tunes). It's not fully formulated, but in the end i might add a 'standard' with the goal of keeping various titles at a similar volume. Note that not all video game compositions from the commercial era follow these "rules" (often to the games' disadvantage, IMO).

=====

I've noticed that ftm/nsf files often have tonal "instruments" peaking at full-blown volume. This seems to be fairly common in the chiptune scene, and for a reason: your instinct is likely to want the track to make an impression and grab attention.

For video game background/soundtrack music, i strongly advise against it. Actually, i'd advise against it overall, but especially for bgm. Here's why.

Our ears (or rather the whole hearing apparatus, brain included) get tired when exposed to constant or loud sound sources. And your songs will loop indefinitely, or at least for a period determined by the player's input.

For a rather long period, the music/record industry compressed songs in the mastering process so that they were maxed out. When played as a single between radio talk etc, this makes the song force itself on the listener. You can't avoid it. It helps the song win/grab attention (regardless of its content). This is also true for commercials, which go to the absolute extreme since they don't need to think of audio quality the same way. That's what you want, at least commercially, for a "hit single". But listen for more than a few minutes, or to a few songs in a row, or to a loop going perpetually, and the ears will grow tired quickly.

There was, for a period, especially late 90s/early 00s, a plague on CDs (Metallica's St. Anger album is a "nice" example) where the volume was maxed out like this, completely ruining the experience in the long term. Hopefully online distributors have since remastered CDs from that period.

Anyway, the ideal is having a significant dynamic range between short peaks and the average mass of sound.

The complete mix/master should 'look' something like this:
Code:
 __/\__/\._/\_._
/               \

Where the peaks are percussion, mainly. "." are lower melody peaks.

Not like this:

Code:
  ._.___._._.___
 /               \
/                  \
Where the absolute peaks are a tiring mix of melody peaks and percussion peaks; and/or the summed sound floor is up close to the roof.


With the right blend, you can evoke emotions and grab attention (where appropriate), without tiring out ears.


Here's a bullet list on goals:
-The "sound bed" should be low enough to let the ears rest, but strong enough to not vanish.
-Peaks are primarily for percussion and secondarily for rhythm (especially rhythm bass, not so much mid/treble).
-Melody peaks should be significantly lower than percussion peaks.
-pads/recorders/flutes/reeds and other constant-sustain instruments should blend smoothly at 'sound bed' level.
-Bright and mid-bright sounds should often be mixed in quieter than low and low-mid sounds. That's because the ear is more sensitive to such sounds.
-Short percussion sounds can be as strong as you want/need. Prolonged noise sounds (long splash 80s snare hits, sweeps) should be toned down a little.
-Since bass sounds take more energy to be perceived as the same level as mid-brights, it's no problem if a tri bass is at full volume. It is actually perceived as a more even and natural mix.

Practically, this means a few things in famitracker:
-Squares, which most often serve as leads, pads, and melodious stabs, should generally be lower in volume.
-Tri volume can't be changed, and the channel should be reserved for bass and percussion
-If your melodious instruments peak at F, you should set the volume channel to somewhere between 8 (weak) and C/D (very strong).
-If your driver doesn't support the volume channel or you need to use it for something else, keep your instrument peaks somewhere in the mid range.
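To put numbers on the volume channel suggestion above, here is a minimal Python sketch of how a channel volume might scale an instrument envelope. It assumes the scaling is envelope * channel / 15, rounded down, with a floor of 1 when both inputs are non-zero. That is my understanding of how FamiTracker combines the two, but the formula and the name `mix_volume` are assumptions for illustration, not the tracker's or any driver's actual code.

```python
def mix_volume(envelope, channel):
    """Scale a 0-15 instrument envelope value by a 0-15 channel volume.

    Assumed formula (not from the original post):
    floor(envelope * channel / 15), with a minimum of 1 when both
    inputs are non-zero so that quiet notes don't vanish entirely.
    """
    if not (0 <= envelope <= 15 and 0 <= channel <= 15):
        raise ValueError("volumes must be in the range 0-15")
    if envelope == 0 or channel == 0:
        return 0
    return max(1, envelope * channel // 15)


# An instrument peaking at F (15) under the channel volumes suggested above.
for channel in (0x8, 0xC, 0xD, 0xF):
    print(f"channel volume {channel:X}: peak becomes {mix_volume(15, channel):X}")
```

Under this assumed formula, an instrument that peaks at F simply peaks at whatever the channel volume is, so the 8 to C/D range leaves a few steps of headroom below the fixed-volume triangle and any full-volume percussion.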

As a bonus, this also gives you more control of sfx sounds: do you want it to stand out or melt in, or somewhere in between? The choice is more readily available now.

Side notes:
-Young ears are generally more tolerant of constant volume masses.
-But less tolerant of super-bright noises
-people in the ADHD and ADD spectra have different tolerances for constant, repeating and changing sounds than people outside the ADHD/ADD spectra.
Re: Treatice draft for background music
by on (#202042)
I think you might be recapitulating the loudness war, in chiptune format...
Re: Treatice draft for background music
by on (#202053)
I've written and deleted a few responses to this. Something about this post is hitting a nerve in a way I do not like. It sounds like you are, in the end, just giving advice about making instruments that don't stay maxed out all the time. Fine, that's valuable, but I think it's one of the first things a reasonable composer thinks of.

These "goals" seem pretty restrictive, since they sound more like rules.
Re: Treatice draft for background music
by on (#202062)
I apologize, i didn't mean to step on anyone's toes here. It's not meant to tell what's right and wrong; it's meant more as optional guidelines on the theme. It's also a draft, so the tone is maybe not quite right.

For example, i recognize that the volume resolution gets truncated this way. That's not a problem for fast moving envelopes, but for slow decays/swells, the jump from one level to another becomes more audible the fewer steps you have at your disposal, and so you must either hide those slow changes when the opportunity arises in the mix, or live without them. Just that point in itself means these guideline drafts aren't a "one size fits all" solution.
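The truncation mentioned above can be counted with a small Python sketch. It assumes the same kind of scaling as a tracker volume column, floor(envelope * channel / 15) with a floor of 1 for non-zero inputs; that formula and the helper names `scaled` and `distinct_levels` are assumptions for illustration only.

```python
def scaled(envelope, channel):
    """Assumed scaling: floor(envelope * channel / 15), minimum 1 if non-zero."""
    if envelope == 0 or channel == 0:
        return 0
    return max(1, envelope * channel // 15)


def distinct_levels(envelope_steps, channel):
    """Count how many different output volumes an envelope actually reaches."""
    return len({scaled(v, channel) for v in envelope_steps})


# A slow decay that visits every envelope value from F down to 0.
slow_decay = list(range(15, -1, -1))
for channel in (0xF, 0xC, 0x8):
    print(f"channel volume {channel:X}: "
          f"{distinct_levels(slow_decay, channel)} distinct levels")
```

With these assumptions, a full 16-step decay keeps 16 levels at channel volume F, 13 at C and only 9 at 8, which is why slow swells get audibly steppier the lower the channel volume is set.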

But generally, it's not really restrictive in the end. Rather, it's a technique to let bass and drums shine through; something loudness all around would restrict.

Experience of sound is very subjective. But one thing about sound is at the same time not, as it has been measured in quantitative studies and been ISO-standardized: the equal-loudness contour. At the same level, we perceive bass as quieter than mid-highs and highs.

Back to the subjective: even with proper headphones or speakers, tri bass sometimes has trouble punching through the mix if there are other loud elements that we (or at least i) are more sensitive to.
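As a rough illustration of that sensitivity difference, here is a Python sketch of the A-weighting curve. Note the swap: A-weighting is a separate, simpler standard curve that only approximates the ear's reduced sensitivity to bass at moderate levels; it is not the ISO equal-loudness contour itself, and the choice to use it here is an assumption made for brevity. The function name `a_weighting_db` is made up for this example.

```python
import math


def a_weighting_db(freq_hz):
    """Return the standard A-weighting gain in dB at freq_hz.

    Used here only as a stand-in for the equal-loudness contours:
    negative values mean the ear is less sensitive at that frequency
    than at 1 kHz.
    """
    f2 = freq_hz ** 2
    numerator = (12194.0 ** 2) * f2 ** 2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20.0 * math.log10(numerator / denominator) + 2.00


# Compare a typical bass-range note with mid and bright frequencies.
for freq in (55, 110, 440, 1000, 3000):
    print(f"{freq:5d} Hz: {a_weighting_db(freq):6.1f} dB")
```

The curve sits near 0 dB around 1 kHz and drops by roughly 19 dB at 100 Hz, which is one way to see why a full-volume tri bass can still sound balanced against much quieter squares.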

Quote:
It sounds like you are, in the end, just giving advice about making instruments that don't stay maxed out all the time.

That's not it. I'm primarily advocating that bright peaks (max between attack and decay) in an envelope be lower in volume than dark and percussive/atonal ones. Same thing with peaks followed by slow decays as opposed to peaks with short ones. That's not the same as the sustain level being reasonable. Though even that is not guaranteed in some commercial NES titles.

Quote:
loudness war

For reference, here's the wikipedia article on that. https://en.wikipedia.org/wiki/Loudness_war
Interestingly, it notes Metallica's "Death Magnetic" as the prime example. I haven't even listened to that. One album on that list that i have listened to and found very fatiguing is Depeche Mode's "Playing the Angel". Expression to add to my english vocabulary: "crest factor".
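For anyone else new to the term: crest factor is the ratio of a signal's peak level to its RMS (average) level, usually given in dB. A minimal Python sketch follows; the helper name `crest_factor_db` and the toy test signals are made up for illustration and are not measurements of any real track.

```python
import math


def crest_factor_db(samples):
    """Return the peak-to-RMS ratio of a sample list, in dB."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        raise ValueError("crest factor is undefined for pure silence")
    return 20.0 * math.log10(peak / rms)


n = 1000
# A full-scale square wave: always at its peak, so no dynamic range at all.
square = [1.0 if i < n // 2 else -1.0 for i in range(n)]
# One cycle of a full-scale sine wave.
sine = [math.sin(2 * math.pi * i / n) for i in range(n)]
# A toy "drum hit over a quiet bed": a short loud burst, then a low level.
hit_over_bed = [1.0] * 10 + [0.1] * 990

print(f"square:       {crest_factor_db(square):.1f} dB")  # 0.0 dB
print(f"sine:         {crest_factor_db(sine):.1f} dB")    # about 3.0 dB
print(f"hit over bed: {crest_factor_db(hit_over_bed):.1f} dB")
```

The higher the number, the more the short peaks stand above the average mass of sound, which is the shape the first ASCII sketch in the opening post is aiming for.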
Re: Treatice draft for background music
by on (#202072)
Full volume squares usually aren't the best choice even without taking sound effects into consideration, just because of the fixed volume and relative dullness of the triangle channel.

I often like to think that the triangle channel is the soul of the NES sound, because it's the most rigid of the channels, and when I'm composing everything else has to bend to accommodate it.


On SFX, there's a related discussion on how they should interrupt the music:
  • 1. SFX interrupts channel for its duration, resumes immediately when finished. (Simple, but the returning sound is abrupt and off-beat.)
  • 2. Like 1 but channel resumes with a fade in. (Kind of feels like "ducking" for talking over music, which may or may not be an objectionable feel.)
  • 3. SFX interrupts channel then goes silent, resumes music at next note event. (Returning sound is on beat, but the music should be composed to accommodate, e.g. long held notes should be sustained by a series of connected/"legato" notes.)
  • 4. SFX interrupts channel whenever its volume is louder than the music. (Loud music can completely mask the SFX or interact weirdly with it. SFX basically have to be universally loud to accommodate, can't really fade off, and music return tends to be abrupt.)

I'm more or less 100% in camp 3, but opinions vary.
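To make the difference between options 1 and 3 concrete, here is a small frame-by-frame Python sketch of a single channel. Everything in it (the function `render_channel`, the event format, the frame numbers) is hypothetical and only meant to show the resume behaviour, not how any particular sound driver is written.

```python
def render_channel(music_events, sfx_start, sfx_length, total_frames,
                   resume_on_next_note):
    """Return what the channel outputs on each frame.

    music_events: list of (frame, note) pairs; a note sustains until the next.
    Each output entry is "SFX", a note name, or None for silence.
    resume_on_next_note=False models option 1, True models option 3.
    """
    events = dict(music_events)
    current_note = None  # the note the music engine believes is sounding
    muted = False        # True while waiting for a new note after the SFX
    output = []
    for frame in range(total_frames):
        new_note = frame in events
        if new_note:
            current_note = events[frame]  # music keeps running under the SFX
        if sfx_start <= frame < sfx_start + sfx_length:
            muted = resume_on_next_note
            output.append("SFX")
        else:
            if new_note:
                muted = False  # a fresh note event lets the music back in
            output.append(None if muted else current_note)
    return output


music = [(0, "C"), (8, "E")]
print(render_channel(music, sfx_start=2, sfx_length=3, total_frames=10,
                     resume_on_next_note=False))
print(render_channel(music, sfx_start=2, sfx_length=3, total_frames=10,
                     resume_on_next_note=True))
```

In the first run the held C comes back mid-note as soon as the effect ends; in the second the channel stays silent until the E arrives on frame 8, which is why long held notes need to be written as a series of connected notes for option 3 to work well.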

Previous discussions of that topic:
Re: Treatice draft for background music
by on (#202077)
Quote:
(...) triangle channel is the soul of the NES sound, because it's the most rigid of the channels, and when I'm composing everything else has to bend to accommodate it.

I subscribe to that. Since it's so rigid, it should be the reference point.

It's also a good point that the tri channel is relatively muffled/not as rich in overtones (apart from that shrill overtone noise i've learned to accept and love), so it can't really compete on the same terms unless you push its base frequency up a few octaves (generally not very pleasing, IMO).

Quote:
2. Like 1 but channel resumes with a fade in. (Kind of feels like "ducking" for talking over music, which may or may not be an objectionable feel.)
I suppose the trick is getting the fade-in timed just right. Too quick: abrupt. Too slow: interfering. The question is whether there's a golden blend or just a bit of both traits you'd want to avoid. Should louder tones take longer to fade in, or should the time be constant and the steps steeper? A blend of the two? Lots of playroom. Just not as intuitive as turning one or a few knobs on a compressor/ducker.
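The two fade-in alternatives asked about above can be sketched in Python like this. Both functions, their names and the linear ramps are assumptions for illustration; a real driver might well use a table or a different curve.

```python
def fade_constant_time(target, frames):
    """Reach the target volume in a fixed number of frames.

    Louder targets get steeper steps; the fade always lasts `frames` frames.
    """
    if frames <= 0:
        raise ValueError("frames must be positive")
    return [target * (i + 1) // frames for i in range(frames)]


def fade_constant_rate(target, step):
    """Raise the volume by a fixed step per frame until the target is reached.

    Step size is constant; louder targets take longer to fade in.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    levels = []
    volume = 0
    while volume < target:
        volume = min(target, volume + step)
        levels.append(volume)
    return levels


for target in (4, 12):
    print(f"target {target:2d}: constant time {fade_constant_time(target, 4)}, "
          f"constant rate {fade_constant_rate(target, 3)}")
```

A blend of the two could, for instance, cap the constant-rate fade at a maximum number of frames, but where that cap should sit is exactly the kind of thing that has to be tuned by ear.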