Music transposing/altering and saving ROM space

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
View original topic
Music transposing/altering and saving ROM space
by on (#30447)
I hesitated to post this in the NESdev section, but in fact altrough gamedev related it's more music related, so I post it here.

I'm currently composing music for my game, and coding it into sequenced format. Because I don't use any kind of PRG bankswitching, I want music data to be as small as possible. And I noted music quickly takes a considerable amount of space, even in compact format.

Since the format I use uses "octave up" and "octave down" commands, I found me always use an insane amount of them at some place where the melody have a lot of transition between notes just above C and just below C, and a lot of bytes could be saved by transposing the song up or down so that the melody would be always above C or below C on those part, thrun not crossing an octave boundary, saving bytes. But by doing this I'm forced to transpose and rewrite the entiere song, and as it may decrease octave crosses somewhere it can incrase them somewhere else. I wonder if there is any way to find the "ideal" tuning of a song so that there is as less octave crosses as possible.
Another big space wasting thing is "short notes". I mean using a note then a silence to ensure the note is short. This will make the song sound much more dynamic than if all notes are played full, where the song will sound blatant boring. But it also significantly incrase the number of bytes needed to code the song, by placing note-silent-note-silent instead of just note-note-note. A solution is to decay the note, but even long decayed notes sounds much less dynamic than short notes.

So I wonder if music data should be composed as it, then once the song is finished and completed consider it "untouchable" and code it as it, or if, as being the same person as the composer and coder, can take advantage of this. Should I transpose my song to have an insane amounts of sharps and flats in it just to save a dozen of bytes ? Should I use longer notes so that it takes less encoding space even if this doesn't sound the same ? Or should I just encode the song as if someone else made it without even considering altering it ? I think that's an interesiting, but somewhat difficult chose to take.

As I consider music transposing barely affect it (only sound lower or higher), I think this can be considered if it saves a lot of bytes. However some people consider key is important, as older pieces of musics are refered as "Bb concerto, F concerto", etc... this makes the key in which they were composed very important. However since this isn't the case any longer today, I think I shoudn't mind transposing.
However, alter the sounds used or lenghts of notes, or even the melody to save a couple of bytes will end up probably a bad idea, as the song won't sound as it was intended. As this isn't dramatic, in the long run it can be a lot of trouble to save a couple of bytes who will keep empty at the end of the ROM anyways.

by on (#30449)
Interesting topic!

The musician in me says to let the song be heard how it was written and intended to be heard. A song will sound pretty different when it's played lower or higher in the chord structure than it was originally written.

On the other hand, it sounds like a great exercise in coding : )

I'd say back up your files, and do the coding to try and save bytes just to do it as an exercise. I'd expect you wouldn't be as happy with the final product if the music didn't sound right to you any longer.

by on (#30450)
I'd think one note per nibble wouldn't be too bad. You'd maybe have the 12 notes, octave up, octave down, and octave set. Maybe also something to re-center the notes so that you don't need octave changes when going from C to Bb.

Edit: I just realized how stupid this post was, sorry.

by on (#30453)
First off, don't try to use the same format for music on your development machine as in the ROM; have an easy-to-enter-and-edit format for development, and a program which converts this to whatever the ROM will have. This way you can experiment with different encodings, and use ones that aren't easy to enter by hand. For example, you could use a format which encoded the delta from the previous note, using fewer bits for small deltas.

by on (#30455)
In fact I've already build my own format for long, and I always first write everything in MIDI and then convert it manually to the format I use. I use one byte per note, and I'm pretty sure that's the best I can go. One tyble for tone and one nybble for lenght (higher tones above 12 are reserved for controls). The delta thing seems interesting, as rarely more than one octave of delta is seen, but +- one octave means range -12..+12 which needs at least 5 bits. With only 4 bits we're stuck with -8...+7, which isn't really good, as you'd have a way to go out of this interval, and would require a more complex format to barely save space at all. More importantly, this would be much more pain to convert manually. That wasn't the point anyways.

Quote:
The musician in me says to let the song be heard how it was written and intended to be heard. A song will sound pretty different when it's played lower or higher in the chord structure than it was originally written.

I guess you're right here. I'm not good with the chord stuff, I usually start with a good melody and then try to put stuff with it that fit well and eventually contine with variants of it. Sometimes on the contrary I start with a baseline and place stuff on the top of it, but I often end up with weird results here. I guess that works pretty well, but this certainly isn't the best one can do.
However, to be honnest, if the song is transposed one or two semitones higher or lower that won't make a really big difference, exept for the insane amount of sharps/flats that will be required to play it (in the case of moving the thing a semitone), which isn't really important since that's no problem for the computer.
And eventually they song will never sound *exactly* as intended as everyone knows no instrument nor synthetiser can reproduce a sound that come close to the NES, you only have to imagine it. However I guess it's good if it sound pretty close to what you wanted.
Re: Music transposing/altering and saving ROM space
by on (#30471)
Bregalad wrote:
Another big space wasting thing is "short notes". I mean using a note then a silence to ensure the note is short. This will make the song sound much more dynamic than if all notes are played full, where the song will sound blatant boring. But it also significantly incrase the number of bytes needed to code the song, by placing note-silent-note-silent instead of just note-note-note. A solution is to decay the note, but even long decayed notes sounds much less dynamic than short notes.


I think what you're talking about is removing the rests. I would forget about decay, and do volume envelopes. The easy way would be to set the volume for each frame (until the envelope loops or ends). The other way would be to write code that generates them (ADSR - attack, decay, sustain, release), but that's probably only good if you're really want a lot of different envelopes.

If you kept the rests also, you could for example in the volume envlope have a start, loop, and ending, and the ending would play when the note length ends.

by on (#30478)
I'm not sure what you think is a rest.
Quote:
I think what you're talking about is removing the rests. I would forget about decay, and do volume envelopes. The easy way would be to set the volume for each frame (until the envelope loops or ends). The other way would be to write code that generates them (ADSR - attack, decay, sustain, release), but that's probably only good if you're really want a lot of different envelopes.

In fact I'm doing it quite differently to both methods you describe. I just made my own method, although it's rather limited (but I wanted it to be simple, I'm thinking of more complex things for my next version of my sound code). It allows to do an attack, and then to decay the note at varied sppeds, but even the fastest speed isn't really fast, because one frame is really a long unit of time when it comes to music/SFX. I wish it would be possible to have a more precise basetime for sound code pratically, but it's not. Unless you have a mapper with IRQs it's hard to do. Of course I could call it twice per frame, one at the beginning and once at the end but that'd be rather weird and won't be a fixed interval between calls. The only thing that can perform faster are hardware sweeps and hardware decays. I can also use hardware decays, but I didn't really use them. I can even use a combination of hardware and software decays, so that the hardware decaying rate is going smaller due to software decay, resulting in a slow-then-fast decaying note. Maybe I could made up something better that can allow notes to be decayed faster ?

by on (#32036)
Key is rather important.

A better idea is to just have a transpose command for your format, that lets you move the base around. Essentially, you remap c->C to say, e->E, and thus stay within the octave.

Crap, didn't notice the date.

by on (#32060)
Yeah it wasn't that long ago.

Anyway, I've noted there is no way to really save a significant amount of bytes by transposing. Where you save a few octave crosses on one place, you'll automatically waste a considerable amount of them at another place of the tune, so there is hardly an "ideal" scale for a tune so that it have minimal octave crosses. Unless you make a programm that searches this, there is no way to tell which scale would be better, so overall it's better to just avoid transposing. (since most people seems to think key is important).

On one other hand if you try to "guess" which key a tune you know well will be played before you actually play it, chances are that you were completely wrong (or at least for me).

by on (#32067)
What I was referring to wasn't transposing per se, as you still ended up with the same audio output.

Picture a system, where notes are packed into 8 bits, 4 bit tone, 4 bit length/cmd.

Tone 0-E would pass through, F would indicate a rest.
lengths 0-C would refer to whatever set you wanted at the time, they aren't important. length D would be octave down, length E octave up, length F would be an escape for lengthier commands.

Tones that pass through are offsets from a base tone, the sum of which is the actual absolute tone played (0-128). Octave up and down just add or subtract 12 to that base.

What I was talking about would be the case of an extended command, that could set the tone base to anything, thus giving you compact access to any contiguous stretch of 14 notes in the scale.

by on (#32070)
But do we need to have 16 distinct lengths for notes? If we assume that a row is a sixteenth note, we can use an 8-entry table of note lengths:
1 sixteenth
2 eighth
3 dotted eighth
4 quarter
6 dotted quarter
8 half
12 dotted half
16 whole
Then allocate only 3 bits to the length and 5 bits to the offset from the bottom note:

Code:
7654 3210  NoteOn/NoteOff command ($00-$D7)
|||| ||||
|||| |+++- Number of rows to wait after this command, from the table
|||| |     [1, 2, 3, 4, 6, 8, 12, 16]
++++-+---- 0-24: Offset from base note in semitones
           25: Hold existing note on this voice
           26: Release existing note on this voice

This would allow spreads much larger than an octave, which should accommodate any non-pathological melody. And it would leave $D8-$FF open for other commands.

What is pathological, you might ask? Listen to these tracks, and ask yourself how your music encoding would handle them:
  • OpenTris 02 triangle
  • OpenTris 06 triangle
  • OpenTris 07 squares (first 10 s or so)
  • OpenTris 08 triangle (first 30 seconds)
  • OpenTris 09 squares (first 15 seconds)
  • OpenTris 10 triangle
  • Covers vol. 1 03 squares
  • Covers vol. 1 09 triangle
  • Covers vol. 1 10 triangle
  • Covers vol. 1 11 squares
  • Covers vol. 1 13 everything
  • Covers vol. 1 15 triangle (in "tetris" part)
  • Comic Bakery/You Spin Me Round triangle
  • Running in the 90s square backing
  • What Is Love? triangle
  • Emerald Hill Zone square bass
  • lj-contrib Korobeiniki channel 1
  • lj-contrib Loginska channel 3, and channel 2 in the overly happy sounding part
  • Most of Tengen Tetяis, for that matter

by on (#32072)
Your format seems interesting, however I doubt I'll adopt it because I'm too lazy to completely re-encode my music data and adapt replay code just to save maybe a couple of bytes.
I don't like too much the idea to have only 8 lenghts, I have 16 but that's not directly the lenght, I use a lockup table. I'm not confortable to how note lenghts works in english I only know them in french. But your sheme seems to disallow triplets, something that's somewhat limitating.

by on (#32073)
That assumes one is using rows, rather than an MCK style input, but your point is of course valid.

That would be why bloopageddon has *another* extended command telling it how many bits are note, and how many are length. Length goes through a lookup table, and for repeated sequences, can apply a per-note macro to *that*.

It carved the command escape out of note range for commands though, as length wasn't guaranteed to exist, since you could set the length cutoff to 0. I got bored and moved to other projects while thinking up a redesign.

It tended to result in really tiny data, but it was a mess to edit without a cheat sheet.

by on (#32075)
American English uses German-style terminology for note durations. But Wikipedia has links between articles in English and corresponding articles in French. So let me translate the table into French:
  • sixteenth note = 1/16 whole note = double croche
  • eighth note = 1/8 whole note = croche
  • quarter note = 1/4 whole note = noire ("black note")
  • half note = 1/2 whole note = blanche ("white note")
  • whole note = ronde ("round note")
  • dot = point de prolongation

In simple meters (e.g. 4/4, 2/4, 2/2), my engine would handle triplets the same way Scream Tracker does: with a "delay noteon by x ticks" command. Triplets would be even easier in compound meters such as 6/8, as the row would be an 8th or 16th note.

ReaperSMS wrote:
That assumes one is using rows, rather than an MCK style input

Tidbit: MIDI specifies song positions in units equivalent to a tracker row. The Song Position Pointer is expressed in units of 16th notes, which are divided into six "MIDI clocks". Smells like mod/s3m/xm, no?

ReaperSMS wrote:
It tended to result in really tiny data, but it was a mess to edit without a cheat sheet.

Or without an assembler. I once made a music playback engine for the GBA, and I wrote what amounts to an assembler to translate textual input to music bytecode.

by on (#32076)
For this one I have a set of macros, but only for the commands, not the note data.

The quality of your triplet timing generally comes down to your tempo and time signature, and how well 60 divides into it

by on (#32081)
Oh thanks tepples this helps a lot. I can now understand note lenghts in english.

Yeah your system is good as long as no triplets are used. I guess what you guys are talking about "macros" is what I call a special command.

And yeah I have a mck-style input (done with all .db with definied variables), it's pretty quickly converted and is flexible and work well. What I dislike in trackers is their lack of flexibility. While they work fine for most regular 4/4 tunes, doing anything else is a headache (okay, I'll admit all songs I've composed are 4/4 right now but I guess I'll try something else someday).