Another person needing music/sound engine help!

by Roth on 2009-04-06 (#45229)

Despite my love for using NerdTracker2, I've decided that it is too hard for me to use in a game. The only way I can figure out how to get sound effects to play while using the NT2 engine is to completely write out all the accesses to the noise channel. That really limits the sounds I can do, and the music as well :/

So I have started on a music engine for myself, but I am still in the learning stages of using the sound registers. Here is a pastey of what I have at this point (I am planning to rewrite the whole thing, because I realize this is horrible):

http://pastey.net/111765

http://robertlbryant.com/temp/music_engine_test.nes

What I am asking for is advice on how to do this stuff. I have a few ideas from a couple of IRC folk, but I figured it'd be nice to get everyone's input from here, as well. For instance, data access and storage, initialization stuff... I'm really lost when it comes to sound on the NES, but I think I'm off to a decent start here.

Thanks guys!

by Celius on 2009-04-06 (#45242)

Creating a music engine can be a challenging task. Basically, the main goal in creating a good music engine is to make it use as little space and CPU time as possible while keeping it as flexible as possible.

The most important thing in making a music engine for the NES is cutting down on space consumption. It is easy to make a fast and flexible engine, like the one you demonstrated. If everything is just a bunch of raw values to store into all of the sound registers, you have probably as much flexibility as you can get. However, in the long run that will take A LOT of space, if you do everything you want to do. So you'll have to come up with a different plan.

For a music engine, I currently have the music data separate for each channel, so each channel gets their own "sheet" of music. In this "sheet", I tell the music engine all it needs to know to make music with a particular channel. This sheet usually contains a "command byte" (or command bytes) followed by data bytes. The command byte tells the music engine everything like "change note" or "change length of note", and the data bytes that follow say what note to play, or how long I should play it. The command byte uses each bit as a command:

;Example square wave sheet command byte

7 - Loop Song
6 - Change note
5 - Change length of note
4 - Change instrument
3 - blah blah
2 - etc.

For example, I would have a line that looks like this:

.db $60, $07, $40

The first byte is the command byte. Since bit 5 and 6 are set, I'm saying I want to change the note and the length of the note. The bytes that follow are data bytes. I'm saying I want to change to note "$07" and play it for "$40". When I say I want to play note $07, I'm actually giving the engine an index to a look up table that contains 11-bit entries, which are the pitch values to store into $4002/$4003. So the 8th entry ($07 represents the 8th entry) would give the engine the pitch to play an E on the lowest octave. This saves me space to define notes with 1 byte instead of 2 by creating a notes table. Then when I say "play for $40", I'm saying play for 64 tempo ticks. I think from looking at your example code, you know probably what I mean by that. If not, a tempo tick is the smallest measurement of time in music, like a pixel is to the screen. There is no such thing as "half of a tick". It's either a tick, or it's not. In my code, I specify the tempo as an 8 bit value, and I add it to another 8-bit value every frame. If that value wraps around, there's a tick. So if I specify the tempo as $F0, here's what will happen:

Tempo = $F0
Counter = $00

Frame 0: $00 (Counter) + $F0 (Tempo) = $F0 ;no tick
Frame 1: $F0 (Counter) + $F0 (Tempo) = $E0 ;tick
Frame 2: $E0 + $F0 = $D0 ;tick
Frame 3: $D0 + $F0 = $C0 ;tick
Frame 4: $C0 + $F0 = $B0 ;tick
...
Frame 14: $20 + $F0 = $10 ;tick
Frame 15: $10 + $F0 = $00 ;tick
Frame 16: $00 + $F0 = $F0 ;no tick

With the tempo at $F0, there is nearly a tick every frame. All music logic happens ONLY if a tick happens. Everything is counted in ticks. Like I said, it's the pixel of time. If you want a tick to happen every frame, set the tempo value to $FF. But when adding it to the counter over and over, set the carry before adding instead of clearing. This will guarantee a tick every frame. But yeah, that's how that works.
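A minimal Python model of the tempo counter described above, for anyone who wants to experiment with tempo values before writing the 6502 version. It is a sketch of the logic only; `count_ticks` and the `force_carry` parameter are names made up for illustration (the latter stands in for doing SEC instead of CLC before the ADC).

```python
def count_ticks(tempo, frames, counter=0, force_carry=False):
    """Simulate the 8-bit tempo accumulator for `frames` video frames.

    Returns (number_of_ticks, final_counter).
    """
    ticks = 0
    for _ in range(frames):
        # ADC with the carry either cleared (normal) or set (force_carry)
        total = counter + tempo + (1 if force_carry else 0)
        if total > 0xFF:          # 8-bit overflow: carry out means one tick
            ticks += 1
        counter = total & 0xFF    # keep only the low 8 bits
    return ticks, counter


# Tempo $F0 over frames 0-15: every frame except frame 0 produces a tick.
print(count_ticks(0xF0, 16))
# Tempo $FF with the carry set before adding: a tick on every frame.
print(count_ticks(0xFF, 60, force_carry=True))
```

Because the counter keeps its remainder between frames, tempo values in between (say $80 for a tick every other frame) need no special handling.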

That's usually the system for making music data. You have a collection of command bytes followed by data. It's up to you how you want to set the command byte up though. You could do it like me, and have each bit represent a different command. But I've also seen people squeeze a note and note length into a single byte, sacrificing certain values as command bytes. For example, values $00-$DF represent different note/length combinations, and $E0-$FF represent individual commands. This might be a better approach to doing things for real space conservation.
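The bit-per-command layout can be decoded roughly as in the Python sketch below. It assumes the data bytes follow the command byte in order from the highest set bit to the lowest (note, then length, then instrument), which matches the `.db $60, $07, $40` example, and that the loop bit takes no data byte. Both are assumptions, and `decode_command` is a hypothetical name.

```python
# Bit masks taken from the example command byte layout in the post.
LOOP, NOTE, LENGTH, INSTRUMENT = 0x80, 0x40, 0x20, 0x10


def decode_command(data, pos):
    """Decode one command byte plus its data bytes starting at `pos`.

    Returns (event_dict, next_pos).
    """
    cmd = data[pos]
    pos += 1
    event = {"loop": bool(cmd & LOOP)}
    # Assumed order: one data byte per set bit, highest bit first.
    for mask, name in ((NOTE, "note"), (LENGTH, "length"), (INSTRUMENT, "instrument")):
        if cmd & mask:
            event[name] = data[pos]
            pos += 1
    return event, pos


event, next_pos = decode_command(bytes([0x60, 0x07, 0x40]), 0)
print(event, next_pos)
```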

You might also be interested in using the concept of instruments. Of course, you're only given the square waves, triangle waves, noise and limited sampling, but you can simulate different instruments with these channels. For example, you will manipulate the volume differently for an electric guitar-like sound rather than a piano-like sound. I use instruments in my engine. All you can do for defining these "instruments" is really define what kind of pitch distortion or volume envelopes they have (and duty cycle), and apply these to every note that's played. I'm actually creating an engine with Bregalad that uses more than one channel for some instruments. This engine will be really awesome, hopefully, but it's getting pretty complicated by that point.

Also it's pretty important that you do everything yourself. Okay, by this I mean don't let the hardware do things for you, such as volume decay. Don't use hardware volume decay! In my engine, I set up the registers for the square waves to play one big endless note at the same volume until the end of time. But of course, I want some volume decay, so I manually update the volume registers to do that. Also, I would stay away from sweeps and manually adjust the pitch each frame to get the same effect. Get as much control as you can get; don't rely on the hardware to do very much for you.

I can't think of anything else to say right now. I hope this helps some, and ask if anything is too confusing.

by Bregalad on 2009-04-06 (#45244)

I love writing music engines so if you have a specific question go ahead and ask. You don't seem to ask specific questions here so it's hard to answer.

If the question is "how do I make a sound engine", that takes a long time to answer, so I don't want to go into it if that wasn't the problem...

Celius and I are currently working on a portable standard and guideline for sound engines. You have to obey some cross-platform standards, but functional algorithms would be given in pseudo-code and any programmer could write their own driver for any machine.

Although I'm really very, very busy these days (I have to pass my first year at university this summer and this is extremely hard), so I'm not really progressing on it, nor on my NES game.

by tepples on 2009-04-06 (#45245)

Step 1: Make a lookup table with the periods corresponding to each note of the chromatic scale, and then make a subroutine to load this period into $4002+x and $4003+x.

Step 2: Make a subroutine that does this for each channel x=0, 4, 8, 12:
1. Call instrument subroutine
2. Call sound effects subroutine
3. Copy the desired volume-duty into $4000+x, and use the subroutine of step 1 to set the pitch. (Only write to $4003+x if it has changed, or if you are starting a new note.)

The instrument and sound effects subroutine update two variables in zero page: desired volume-duty and desired pitch. You might want to get the sound effects done first.

Step 3: Make a counter to advance through a table of note pitches, one table for each channel. When you get to a note number, start a note on the instrument. When you get to the value for "rest", end the note.
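The register-update part of Step 2 can be modelled in Python as below. This is only a sketch of the "write $4003+x only when it changes" rule: `Apu` is a stand-in that records writes rather than making sound, `update_channel` and `last_hi` are made-up names, and the sketch ignores the length-counter bits that share $4003+x with the high period bits.

```python
class Apu:
    """Stand-in for the sound hardware: records (address, value) writes."""

    def __init__(self):
        self.writes = []

    def write(self, addr, value):
        self.writes.append((addr, value & 0xFF))


def update_channel(apu, x, volume_duty, period, last_hi):
    """Write one channel's registers; x is the register offset 0, 4, 8 or 12.

    last_hi holds, per channel, the last value written to $4003+x.
    """
    apu.write(0x4000 + x, volume_duty)
    apu.write(0x4002 + x, period & 0xFF)      # low 8 bits of the period
    hi = (period >> 8) & 0x07                 # high 3 bits of the period
    slot = x // 4
    if hi != last_hi[slot]:                   # skip the write if unchanged
        apu.write(0x4003 + x, hi)
        last_hi[slot] = hi


apu = Apu()
last_hi = [0xFF] * 4        # $FF never matches a 3-bit value, so the first write always happens
update_channel(apu, 0, 0xBF, 0x1AB, last_hi)
update_channel(apu, 0, 0xBF, 0x1AC, last_hi)  # same high bits: no $4003 write
print([(hex(a), hex(v)) for a, v in apu.writes])
```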

by Drag on 2009-04-06 (#45246)

As someone who's written a music engine himself, I think I could help you out.

The barebones basic concept that you need to know is that music, in terms of programming, is just simply a list of pitches and durations. "Play pitch xx for yy frames". For example, my music engine's data format is <note> <duration> <note> <duration> etc etc.

So, in order to iterate through this list, for each channel, you're going to need:
- Two bytes, for a pointer that points to your current location within the list
- One byte, for a counter that keeps track of the duration the note needs to play for.

Other bytes can be for added features, like looping, wavetable-based envelopes, and other complex stuff that you shouldn't worry about for your first time programming a music engine.

Next, each song is going to need a header, which contains the pointers to the data for each of the channels. The idea is that you pass the pointer to a song's header, and then from that header, you grab the pointers for the channels.
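Here is a small Python model of that layout: a header holding one pointer per channel, and per channel a data pointer plus a duration counter. It assumes 2-byte little-endian pointers, durations of at least 1 frame, and no end-of-song marker; `Channel`, `load_song` and `read_ptr` are hypothetical names, and `rom` is just a byte array standing in for ROM.

```python
def read_ptr(rom, addr):
    """Read a 16-bit little-endian pointer (assumed header format)."""
    return rom[addr] | (rom[addr + 1] << 8)


class Channel:
    def __init__(self, ptr):
        self.ptr = ptr        # the "two bytes": current location in the list
        self.counter = 0      # the "one byte": frames left on the current note
        self.note = None

    def tick(self, rom):
        """Run once per frame."""
        if self.counter == 0:             # previous note finished: fetch next pair
            self.note = rom[self.ptr]
            self.counter = rom[self.ptr + 1]
            self.ptr += 2
        self.counter -= 1


def load_song(rom, header_addr, channels):
    """Build channel state from a header of per-channel pointers."""
    return [Channel(read_ptr(rom, header_addr + 2 * i)) for i in range(channels)]


# Header at 0 (pointers to 4 and 8), then <note> <duration> pairs.
rom = bytes([4, 0, 8, 0,   10, 2, 12, 1,   20, 3])
song = load_song(rom, 0, channels=2)
for _ in range(3):
    for ch in song:
        ch.tick(rom)
print([ch.note for ch in song])
```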

Ok, so for the notes, like Tepples said, you'll need a table of the periods, to feed to the second and third registers of each of the channels. I personally have a table for the lowest A, A#, B, C, C#, ..., G, G#, and then I just LSR those entries X times, to achieve octave X.
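A Python sketch of generating such a one-octave table and shifting it for higher octaves. It assumes an NTSC CPU clock of 1789773 Hz, the usual square-channel relation period = clock / (16 * frequency) - 1, and 55 Hz as the "lowest A"; those numbers and the function names are illustrative choices, not something stated in the post.

```python
CPU_HZ = 1789773  # assumed NTSC CPU clock


def base_periods(lowest_a_hz=55.0):
    """11-bit periods for A, A#, B, C, ..., G# in the lowest octave."""
    return [
        round(CPU_HZ / (16 * lowest_a_hz * 2 ** (n / 12))) - 1
        for n in range(12)
    ]


def period_for(note, octave, table):
    """Halve the period once per octave (the LSR trick)."""
    return table[note] >> octave


table = base_periods()
print([hex(p) for p in table])
print(hex(period_for(0, 1, table)))   # A one octave up
```

Shifting drops the low bits, so upper octaves can end up very slightly off compared with a table computed per note; whether that is audible is something to judge by ear.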

Definitely feel free to ask me about this on IRC, I'll be willing to help you further. (especially if I'm being too confusing in this post :S)

by MetalSlime on 2009-04-06 (#45255)

When the main game engine tells my sound engine to load a new song, it passes it a value which you could think of as a song number. This value is asl'd and used as an index into a table of pointers. These pointers point to the header information for each song.

The header has basic information about the song/sfx: how many streams (Celius used the word "sheet" for what I am calling a "stream") it has, what channels those streams use, initial values for each stream (tempo, volume envelope, etc), and most importantly a pointer to that stream's data.

Stream data consists of 3 (well, 4) types:

1) note values (these signal you to look in your note lookup table and pull an 11-bit value)
2) note lengths (I also have these laid out in a table, but some games just use the number directly)
3) opcodes/control codes (these tell the engine to do things that 1) and 2) can't do. Like loop, jump, change the volume envelope, etc)
3.5) operands/arguments for opcodes. These are values that come after the opcodes, which the opcode will read to perform its function.
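To make the three-and-a-half byte types concrete, here is a Python sketch of a stream reader. The value ranges ($00-$7F notes, $80-$9F lengths, $A0 and up opcodes) and the opcode table are invented for the example; the post does not say how the engine actually divides up the byte values.

```python
# Hypothetical opcode table: value -> (name, number of operand bytes)
OPCODES = {
    0xA0: ("loop", 0),
    0xA1: ("jump", 2),
    0xA2: ("set_envelope", 1),
}


def parse_stream(data):
    """Split a stream into (kind, value) events."""
    events, i = [], 0
    while i < len(data):
        b = data[i]
        i += 1
        if b < 0x80:                       # 1) note: index into the note table
            events.append(("note", b))
        elif b < 0xA0:                     # 2) note length: index into a length table
            events.append(("length", b - 0x80))
        else:                              # 3) opcode, followed by 3.5) its operands
            name, argc = OPCODES[b]
            events.append((name, tuple(data[i:i + argc])))
            i += argc
    return events


print(parse_stream(bytes([0x81, 0x07, 0xA2, 0x03, 0x09, 0xA0])))
```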

Then you just read from the streams in a loop, branching and doing stuff and updating your pointers along the way. The way I have it now, I have memory reserved in 6's:

Code:
stream_curr_sound: .res 6 ;what song/sfx number the stream is playing
stream_channel: .res 6 ;what channel the stream is updating
stream_ptr_LO: .res 6 ;ptr to current location in the stream
stream_ptr_HI: .res 6 ;same

stream_tempo: .res 6 ;tempo of the stream
stream_ticker: .res 6 ;current tick counter position for stream
etc..

Each variable "set" holds the data for up to 6 streams. Songs use the first four (i.e. stream_curr_sound+0, +1, +2, +3) and the 5th and 6th are reserved for SFX. This is nice because I can do a little x loop from 5 to 0 and then do stuff like:

Code:
lda (temp_ptr), y
sta stream_tempo, x ;x is the stream number
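The same "one array per variable, indexed by stream number" idea in Python, for comparison. The array names follow the post; the overflow-style ticker is borrowed from Celius's description earlier in the thread and is only an assumption about how this engine counts ticks, and `advance_all` is a made-up name.

```python
NUM_STREAMS = 6                       # streams 0-3 for music, 4-5 for SFX

stream_tempo = [0] * NUM_STREAMS      # like "stream_tempo: .res 6"
stream_ticker = [0] * NUM_STREAMS     # like "stream_ticker: .res 6"


def advance_all():
    """Loop x from 5 down to 0 and return the streams that ticked this frame."""
    ticked = []
    for x in range(NUM_STREAMS - 1, -1, -1):
        total = stream_ticker[x] + stream_tempo[x]
        stream_ticker[x] = total & 0xFF
        if total > 0xFF:              # 8-bit overflow counts as a tick (assumed)
            ticked.append(x)
    return ticked
```

Laying the data out this way means one index register selects the whole state of a stream, which is why the 6502 version can use plain `sta stream_tempo, x`.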

If you have any more specific questions, just ask.

by Roth on 2009-04-08 (#45347)

Thanks for all the input so far guys!

It's funny, but before anything was posted, this was an idea that was pitched to me by someone else, and I had decided to try my hand at it:

Celius wrote:
But I've also seen people squeeze a note and note length into a single byte, sacrificing certain values as command bytes.

So far it's working out... mmm... not too bad. I have a general question though. Is there anything special that I need to know about the triangle channel, or any other channel, for that matter? You know, along the lines of the way that you should only write to $4003 when it needs to be changed. Any little quirks for any others?

Thanks again for all the input, this is some good stuff to read!

by Memblers on 2009-04-08 (#45360)

If you're interested in memory efficiency, one of the rather interesting articles I've read was in the C=Hacking zine (a C64 dev publication). There was an Ultima-type game that won one of the minigame compos (the first one?), and it had nice music; the article in C=Hacking by the author was entirely about memory-optimized sound engine ideas.

http://www.ffd2.com/fridge/chacking/
issue #21

Man, I've got tons and tons of ideas for a sound engine. I wrote it once, but only implemented it on one channel. It was more complex than trackers or MML (it's much like a combination of the two), but it could do some wild stuff. With clever usage of transposing, and combined channel instruments, the data hopefully could've stayed fairly small too.

by Drag on 2009-04-08 (#45363)

Roth wrote:
Is there anything special that I need to know about the triangle channel, or any other channel, for that matter? You know, along the lines of the way that you should only write to $4003 when it needs to be changed. Any little quirks for any others?

You can use the same period table for the triangle as you do for the squares, but the triangle channel will produce tones that are an octave lower than the squares. You *could* silence the triangle channel by setting the period to 0, but that results in an ugly pop, so you should instead silence the channel by flipping its bit off in $4015. It's a good idea to use this method to silence any of the channels. I *think* you could get away with flipping the channel off, and then back on right after, and it'll still silence, unless there's another stupid undiscovered quirk.

The noise channel will need a different table, though.

by Bregalad on 2009-04-09 (#45393)

Yeah, the same frequencies can be used by all channels except noise, knowing that the triangle will automatically play things one octave lower (no need to adjust, just set your octave values one higher to compensate for that).

The "proper" way to silence the triangle channel is to write $00 or $80 to $4008 (using $4015 is correct too, but it's often best to keep all channels enabled via $4015 and to change their first register to change the volume).
So for the square/noise channels, take the volume and OR it with the duty cycle before writing to $4000, $4004 or $400C. For the triangle channel, OR it with $80 instead, so a volume of zero automatically results in silence. You don't have to do that of course, but I find that "forcing" the triangle duty cycle to $80 is a nice trick.
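The OR trick is small enough to show in a few lines of Python. This is a sketch, assuming the "duty" byte already carries whatever upper-nybble bits the engine wants in the first register; `square_reg` and `triangle_reg` are made-up names.

```python
def square_reg(duty_bits, volume):
    """First-register value for square/noise: upper nybble from duty_bits, lower from volume."""
    return (duty_bits & 0xF0) | (volume & 0x0F)


def triangle_reg(volume):
    """Value for $4008: treat $80 as the triangle's 'duty', so volume 0 gives $80 (silence)."""
    return 0x80 | (volume & 0x0F)


print(hex(square_reg(0xB0, 0x0C)))
print(hex(triangle_reg(0)), hex(triangle_reg(5)))
```

The triangle has no real volume control, so any non-zero volume simply means "on"; the benefit is that the same volume-handling code can drive every channel.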

by tepples on 2009-04-09 (#45394)

Another tip: if you're tracking the last $4003+x value for each channel in a variable, set it to $FF whenever the volume is 0 so that the note restarts properly. My own music engine wouldn't play consecutive notes on the triangle channel until I fixed this.

by Drag on 2009-04-09 (#45415)

Bregalad wrote:
The "proper" way to silence the triangle channel is to write $00 or $80 to $4008

When I tried that, I kept running into problems where the triangle channel wouldn't silence exactly when I needed it to. It'd play for a bit of (random) extra time, and I couldn't use that. Using the $4015 method worked, especially since all I'm doing is literally:
Code:
LDA #$0B
STA $4015
LDA #$0F
STA $4015

and that works the way I want it to, unless there's some kind of quirk on the actual hardware that keeps that from working properly.

by tepples on 2009-04-09 (#45419)

Drag wrote:
When I tried that, I kept running into problems where the triangle channel wouldn't silence exactly when I needed it to.

I ran into that. I think I fixed it by resetting the linear counter first: writing $80 then $00, or $81 then $00, or something.

by Memblers on 2009-04-09 (#45420)

Many games do seem to use $4008 for triangle control. When I wrote my NSF player, the triangle channel was fine in 90% of NSFs when I only enabled/disabled it based on that.

NT2 I believe uses $4015. Sometimes you'll see some really strange stuff, like Recca and how it disables the triangle channel by setting the frequency bits to all 1s. Instead of going silent, it just plays the lowest possible frequency. I found it very annoying once I was aware of it (sorry, heheh).

by tepples on 2009-04-09 (#45432)

Memblers wrote:
Sometimes you'll see some really strange stuff, like Recca and how it disables the triangle channel by setting the frequency bits to all 1s. Instead of going silent, it just plays the lowest possible frequency. I found it very annoying once I was aware of it (sorry, heheh).

Some old BPS games do the same thing. I've heard this problem in at least Tetris for Famicom (especially in the first six seconds of Technotris) and Hatris.

by Roth on 2009-04-12 (#45515)

Things have been kinda slow for me in the past couple of days because I have been doing other stuff, but I sat down and decided exactly how I think I am going to have the song data stored now. I have been doing it in a byte, but I have now changed some of the upper nybble stuff. I haven't implemented this yet. I just wanted to see if this sounds alright, or if there is something I may be forgetting in terms of the commands in the upper nybble that might be useful:

Code:
Hi nybble Lo nybble
--------- ---------
_________________
0 - A \
1 - Bb \
2 - B \
3 - C \
4 - Db \
5 - D \
6 - Eb |--------- Duration of Note
7 - E /
8 - F /
9 - Gb /
a - G /
b - Ab /
c - Silence Channel ___/
d - Octave Change --------------------- 0-7 (choose offset of LUT)
e - Instrument Change ----------------- 0-* (choose offset of LUT)
f - Loop Command ---------------------- $f0-$f*
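A Python sketch of a decoder for the layout in the table above, mainly to check that every byte value lands somewhere sensible. The tuple shapes and the name `decode_byte` are made up; how the low nybble maps to an actual duration, octave or instrument is left to the lookup tables the table mentions.

```python
NOTE_NAMES = ["A", "Bb", "B", "C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab"]


def decode_byte(b):
    """Split one song byte into (kind, ...) according to the nybble table."""
    hi, lo = b >> 4, b & 0x0F
    if hi <= 0xB:
        return ("note", NOTE_NAMES[hi], lo)   # lo = duration index
    if hi == 0xC:
        return ("silence", lo)                # lo = duration index
    if hi == 0xD:
        return ("octave", lo)                 # lo = octave LUT offset (0-7 in the table)
    if hi == 0xE:
        return ("instrument", lo)             # lo = instrument LUT offset
    return ("loop", lo)                       # $f0-$f*: loop command variant


print(decode_byte(0x73), decode_byte(0xD2), decode_byte(0xF4))
```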

I still don't have the instrument and loop stuff planned out, but I remember when I was checking out the music code for Friday the 13th, and they had used f4 as a place where the beginning of a loop is, and f5 as the point where it loops back. I was thinking maybe I could do it kinda like that. I dunno, it's on my mind to do it similarly, though.

What do you think? Does this setup look alright?

Oh, and Bregalad, I see why you love writing music/sound engines. It can be pretty fun, but hair pulling at times hehe

by Bregalad on 2009-04-12 (#45518)

Yeah, exactly as you say, it can be pretty fun and hair pulling at the same time.

Your format is extremely similar to "mine" (I'd rather say the format I use because it's not entirely mine), so that's proof it's definitely a good format. I'm glad to see that :p
The only difference is that it starts from A and goes up to Ab, and mine starts from C, although I have considered making it start from A in another thread, but Tepples pointed out that while the NES channels "start" on A-1, the GB/GBC/GBA square wave channels "start" on C-1.

You don't need 16 different command bytes for looping; just one is enough. You place that byte followed by the address to loop to (if you're very concerned about ROM space: place only one byte that says how much you should subtract from the current address to get to the loop point, but then you can only go 256 bytes backwards).

You also probably need another byte that allows you to stop the channel without looping. Looping a command on itself could make your engine crash into an infinite loop depending on how you implement it.

That way you have 14 more bytes available for various commands.

BTW Celius and I are now building a sound engine standard using that format, and the channels can optionally select which scale they use if that module is implemented. Although right now I'm trying to fix bugs in the SPC prototype of my engine, which is supposed to be global and multi-platform; the SPC is a good platform to test on because it's powerful and flexible, and a good representation of what more modern systems are capable of.

by Memblers on 2009-04-12 (#45519)

I think you could cut that in half in a lot of cases. There are lots of times where the duration, and especially the octave and instrument will remain unchanged between notes. If you don't mind working with nybbles, you could just move on to the next nybble if the current one isn't D, E, or F.

Dal segno al fine
by tepples on 2009-04-12 (#45523)

Bregalad wrote:
You don't need 16 different command bytes for looping; just one is enough. You place that byte followed by the address to loop to (if you're very concerned about ROM space: place only one byte that says how much you should subtract from the current address to get to the loop point, but then you can only go 256 bytes backwards).

Yesterday, I implemented a conductor track into my own music engine. I use two commands: one to copy the current position to the loop point ("CON_SEGNO"), and one to copy the other way, jumping to the loop point ("CON_DALSEGNO"). (If you don't know what a segno is, you can look it up.)
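A Python sketch of the two-command loop scheme. The command names come from the post, but the byte values $FE/$FF, the idea that every other byte is an "event", and the function `play` are assumptions made for the example.

```python
CON_SEGNO = 0xFE      # hypothetical value: remember the current position
CON_DALSEGNO = 0xFF   # hypothetical value: jump back to the remembered position


def play(track, max_events):
    """Return the first max_events non-command bytes the track produces."""
    out, pos, loop_point = [], 0, 0
    # Hard cap on steps so a SEGNO directly followed by DALSEGNO cannot hang.
    for _ in range(max_events * 4):
        if len(out) == max_events or pos >= len(track):
            break
        b = track[pos]
        pos += 1
        if b == CON_SEGNO:
            loop_point = pos          # copy current position to the loop point
        elif b == CON_DALSEGNO:
            pos = loop_point          # copy the loop point back to the position
        else:
            out.append(b)
    return out


print(play(bytes([1, CON_SEGNO, 2, 3, CON_DALSEGNO]), 6))
```

Compared with storing an address after the loop command, this costs one byte of RAM per track for the loop point but no operand bytes in the song data.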

by tokumaru on 2009-04-25 (#46190)

Since I'm also designing my sound engine now, I'd like to ask something from the experienced ones: Can anyone give me a quick (yet comprehensive) list of the things that cause clicks and pops so that I can avoid them? Thanks a lot!

by Celius on 2009-04-25 (#46191)

The only thing I know of is writing to $4003/$4007 every frame. You should only write to those regs when the value changes. It isn't the case for the triangle, I believe (you should be able to write to $400B every frame). Also, you want to write to $4003/$4007 if you disable/re-enable the square waves via $4015.

by Disch on 2009-04-25 (#46193)

1) Abrupt volume changes. "Fade out" and "fade in" instead of setting volume to F on startup, and down to 0 to silence it. More specifically, employ a type of simple ADSR (search Wikipedia if unfamiliar)

2) Don't write to $4003/$4007 unless necessary, as this resets the phase

3) Don't silence the triangle by setting its period to 0. Use the linear counter to shut it off, or if you really need it to be silent NOW for some reason, flip off $4015.2 (and then flip it back on right away)

That's all I can think of

by Drag on 2009-04-25 (#46194)

tokumaru wrote:
Since I'm also designing my sound engine now, I'd like to ask something from the experienced ones: Can anyone give me a quick (yet comprehensive) list of the things that cause clicks and pops so that I can avoid them? Thanks a lot!

What Celius said above me, PLUS:

Do not silence the Triangle by setting its period to 0. It does silence it, but it does so with an ugly popping noise. (Some NSF Players remove this pop to make Megaman 1 and 2 sound better).

Instead, either use the linear counter, the length counter, or just play with $4015 to silence it.

In my experience, all of the emulators I've tried seem to allow you to pull stunts like:
Code:
LDA #$0B
STA $4015
LDA #$0F
STA $4015

...in order to silence the triangle channel. Could someone please test this to see if it works on the actual hardware too?

Also, disable the DMC channel if you don't use it.

Make sure to write $08 to the square sweep registers if you're not using them, otherwise the squares won't be able to play really low pitches. This doesn't make a pop, but I figured it's useful knowledge anyway.

Edit: Damn! Ninja'd by Disch. I need to type faster.

by Celius on 2009-04-25 (#46196)

In order to silence the triangle, I just have a copy of $4015 in RAM and in the music reading code, do:

lda $4015Copy
and #$FB
sta $4015Copy

to silence, and

lda $4015Copy
ora #4
sta $4015Copy

to enable it. Then of course, after music reading is done:

lda $4015Copy
sta $4015

It works just fine for me. But of course, I do this after that:

lda $4000Copy
sta $4000
lda $4001Copy
sta $4001
...
lda $4008Copy
sta $4008
lda $400ACopy
sta $400A
lda $400BCopy
sta $400B

So updating all those registers might be necessary after silencing/re-enabling the triangle channel.

by tepples on 2009-04-25 (#46197)

Code like the following works in my music engine, both on Nestopia and on an NES:
Code:
triangle_on:
lda #$C0
sta $4008
lda freqLo
sta $400a
lda freqHi
sta $400b
rts

triangle_off:
lda #$00
sta $4008
rts

by tokumaru on 2009-05-04 (#46526)

OK guys, there are a couple more things that I couldn't find definitive answers for anywhere in these forums that I hope you can shed some light on.

One of them is instruments. At first I thought of using specialized routines to handle each instrument, but it seems there are not that many parameters you can work with when creating them, so I imagine that defining these parameters (volume, pitch, duty cycle) in compressed lists might be better. Do you agree?

Well, if I use lists of parameters to modify the note while it plays, how would I work with different durations? Should each instrument definition be divided in 4 parts (ADSR, like Disch suggested) so that A, D and R are constant but the pattern defined for S is repeated for as long as the note lasts? This is the only solution I could think of.

Oh, one more thing: sound effects. We've discussed enough about how to play them, but what would be a good way to define them? Is the same format used to store music usually enough to create good sound effects, or do games use something more specialized?

by tepples on 2009-05-05 (#46535)

Tetramino defines its sound effects as
HEADER: 2 bytes, length and number of frames per step
EACH ELEMENT: 2 bytes, duty+volume and pitch. But the way I specify pitch might not work for a platformer because it's limited to semitones.

Right now, it uses a very simplistic, Game Boy-style instrument definition. Each instrument is 1 byte for duty+initial volume and 1 byte for decay rate. But I too am interested in what parameters I need to use to make it sound less Game Boy-ish.

by Drag on 2009-05-05 (#46557)

In my engine, instruments are just lists of "settings" that are written to $4000, $4004, and $400C every frame.

For example:
54 58 5C 5F 5E 5D 5C 5C 5B 5B 5A 5A 5A 4A 5B 5B 5C 5D 5E 5E 5C 5A 58 56 54 52 51 50

Or
9F 4D 1B 19 17 15 13 11 10

Stepping through those lists every frame, starting at the beginning when new notes are played, creates complex volume envelopes (+ duty cycle envelopes, like in the second example).

(For the record, this method cannot be done on the gameboy, since "setting the volume" of gb channels resets the phase of the waveform.)

by Celius on 2009-05-05 (#46561)

My sound effects are made of code, not data. Each one is so unique that code gives more flexibility than a universal handler that's fed data. Though you could make a universal routine that's fed data, it would take lots of data to define the sound you want. And when you think of it, code given a chunk of RAM can do A LOT.

by tokumaru on 2009-05-05 (#46564)

Celius and tepples, thanks for the suggestions on sound effects.

Drag wrote:
Stepping through those lists every frame, starting at the beginning when new notes are played, creates complex volume envelopes (+ duty cycle envelopes, like in the second example).

OK, but how do these lists work with notes of different durations? I mean, the lists are made for a certain number of frames, but notes with different durations take a different number of frames. How do you handle that?

by tepples on 2009-05-05 (#46567)

Drag's engine appears to handle the duty+volume side of instruments the same way my engine handles the duty+volume side of sound effects. In this case, you'd maintain a pointer to the duty+volume stream for each channel, and every time you start a note, you'd just reset the pointer to the start of the stream for a given instrument.

To put it another way: If a note is longer than the duty+volume stream, repeat the last element(s) of that stream until key-off. If a note is shorter, stop reading that stream.
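In Python terms, "repeat the last element until key-off" comes down to clamping the frame index. The sketch below uses the second envelope list from Drag's post; `envelope_value` and `render_note` are made-up names, and repeating only the single last element is the simplest reading of "last element(s)".

```python
def envelope_value(envelope, frame):
    """Value to write on the given frame of a note: hold the last entry once the list runs out."""
    return envelope[min(frame, len(envelope) - 1)]


def render_note(envelope, note_frames):
    """All the duty+volume bytes written over a note lasting note_frames frames."""
    return [envelope_value(envelope, f) for f in range(note_frames)]


ENV = [0x9F, 0x4D, 0x1B, 0x19, 0x17, 0x15, 0x13, 0x11, 0x10]

print([hex(v) for v in render_note(ENV, 3)])    # shorter note: list is cut off
print([hex(v) for v in render_note(ENV, 12)])   # longer note: last value repeats
```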

by tokumaru on 2009-05-05 (#46573)

I see... But what if you want to have volume, duty, whatever, changes at the end of the note?

To handle this I've thought of instrument definitions divided in 4 parts: attack, decay, sustain and release. Attack and decay information would be used as the note starts, the data defined for sustain would be repeated over and over, and as the note is about to end the data for release is used. Hasn't anyone tried something similar?

by Celius on 2009-05-05 (#46574)

When stuff needs to happen towards the end of a note, I usually calculate when it needs to happen based on how long the note is, and then have it happen at that point. Say that 4 "ticks" before a note ends, I want it to decay. So I know the note is 15 ticks long, so by the time 11 ticks have happened, I need to start the volume decay. I needed to implement this for the Triangle Wave so that right before a note ended I could silence it for a moment to make each note more distinct. That's about as far as I ever went for something like that.
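The arithmetic in that example (15-tick note, decay over the last 4 ticks, so start at tick 11) can be written out as a short Python sketch. The one-volume-step-per-tick decay and the function names are assumptions for illustration, not a description of the actual engine.

```python
def release_start(note_ticks, release_ticks):
    """Tick at which the end-of-note effect should begin."""
    return max(note_ticks - release_ticks, 0)


def volume_at(tick, note_ticks, release_ticks, sustain_volume):
    """Volume on a given tick: hold, then drop one step per tick (assumed decay shape)."""
    start = release_start(note_ticks, release_ticks)
    if tick < start:
        return sustain_volume
    steps_into_release = tick - start + 1
    return max(sustain_volume - steps_into_release, 0)


print(release_start(15, 4))
print([volume_at(t, 15, 4, 8) for t in range(15)])
```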

by Drag on 2009-05-05 (#46577)

tokumaru wrote:
I see... But what if you want to have volume, duty, whatever, changes at the end of the note?

To handle this I've thought of instrument definitions divided in 4 parts: attack, decay, sustain and release. Attack and decay information would be used as the note starts, the data defined for sustain would be repeated over and over, and as the note is about to end the data for release is used. Hasn't anyone tried something similar?

I don't use bits %00110000 of $4000 for the lists, since the software always sets them to a constant. So instead, I have those bits signify a command for the envelope.

For example, you'd need a command for "stop the envelope here until key-off", or "loop back to position xx until key-off" in order to get what you're talking about.