Writing a NSF player

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
Writing a NSF player
by on (#230025)
Long story short, I'm trying to write an NSF player to handle music and sound effects in a project I'm making, and I need some help understanding the file structure and how NSF players commonly work. Do NSF players generate the sound (aside from DMC samples), or do they use prerecorded samples?
I probably don't need to implement all expansion chips and features either, since VRC6 is the chip I'm using for the music. I don't plan on using PAL mode either.

I did try to look into NSFPlay's source code, but it's written in C++, which isn't my strongest language, so I don't think I can use that as a reference either.
Re: Writing a NSF player
by on (#230036)
An NSF file is essentially a ROM fragment. You need to emulate the CPU, APU and any expansion audio chips as you would for a full NES emulator. On the other hand, you don't have to worry about the PPU or mappers (the hard parts) and you don't really need to be cycle-accurate unless you want to play PCM. Full details of the format can be found on the wiki.

You can use pre-recorded samples, but unless you're building patches for an existing synthesizer you're probably in for a similar amount of work to writing a basic APU emulator, so you may be better off just pre-recording the entire soundtrack at that point.

What exactly does this project of yours need? What languages do you use/prefer? Is there a specific reason why you can't use an existing player?
Re: Writing a NSF player
by on (#230037)
If by emulating the CPU you mean defining what each opcode does and handling the program counter, then that shouldn't be too hard. I'm not sure how easy the APU would be as I have not done much audio processing. The wiki does seem to have all the algorithms, but I'm not entirely sure of how exactly you output the sound.

If by playing PCM you mean DMC samples, then yes, that is something I was planning on using. Is there a particular reason why it has to be cycle-accurate, and does that apply to only the CPU, the APU, or both?

The project I'm making is a game, to be specific. That is why I don't really want to use an existing player unless it is free to use. NSFPlay would have been, but the programming language is not correct. As I mentioned, it's C++ and I'm using C#. The engine I'm using only supports C# and JavaScript as far as I know.

EDIT: By the way, since I'm using FamiTracker, is there already code that can be used to play NSF files exported from it? Famitone is a good example, although it is for NES games specifically.
Re: Writing a NSF player
by on (#230040)
SusiKette wrote:
The wiki does seem to have all the algorithms, but I'm not entirely sure of how exactly you output the sound.

The simplest - though not necessarily fastest - method is to emulate the APU cycle by cycle, producing one sample of output per CPU cycle, and then feed that into a resampler. You can fit a usable BLIP/BLEP resampler in a few hundred lines of code, or you can find a good FFT-based resampling library and use that.

I don't recommend using a 'traditional' resampler like SRC, as they require a huge number of taps and run quite slowly when downsampling from about 1.79 MHz to 48 kHz.

SusiKette wrote:
If by playing PCM you mean DMC samples, then yes, that is something I was planning on using. Is there a particular reason why it has to be cycle-accurate, and does that apply to only the CPU, the APU, or both?

By PCM I mean timed $4011 writes, for which you obviously need accurate timing throughout. The DMC channel is no harder than the other APU channels if you don't need that kind of accuracy.

SusiKette wrote:
The engine I'm using only supports C# and JavaScript as far as I know.

I think Mesen is written in C#, so you might try looking there. It's a rather heavy-duty, high-accuracy emulator so I doubt it will work for you directly even if the license is good, but the code should (hopefully) be readable.

SusiKette wrote:
EDIT: By the way, since I'm using FamiTracker, is there already code that can be used to play NSF files exported from it? Famitone is a good example, although it is for NES games specifically.

Not that I'm aware of. And my experience suggests it's probably easier to emulate the NSFs than mess around with Famitracker's internals and documentation. It's not really designed to target anything else, after all.

You could convert the NSFs to a log-based format like VGM and play that, if you want to skip the CPU emulation, but I'm not familiar with any tools for the conversion. I only know they exist somewhere, because there's plenty of NES VGMs out there.
Re: Writing a NSF player
by on (#230060)
Do you have any recommendations on the FFT library? Although if writing a resampler that has the necessary functionality for this project isn't hard to make and has usable documentation on how to make one then that can be an option as well.

As for Mesen, it seemed to be written in C++ as well, so that won't be useful either.
Re: Writing a NSF player
by on (#230100)
SusiKette wrote:
EDIT: By the way, since I'm using FamiTracker, is there already code that can be used to play NSF files exported from it? Famitone is a good example, although it is for NES games specifically.

Have you considered using the Qt FamiTracker library I provide as part of nesicide? I built a "front-end player" and a WinAmp plugin that demonstrate its usage. Basically with that you can play regular FamiTracker modules without having to export them or find a different player. The library is essentially full-blown FamiTracker with exposed APIs for playing, pausing, changing tracks, etc.
https://github.com/christopherpow/nesic ... amitracker -- the Qt FamiTracker library
https://github.com/christopherpow/nesic ... famiplayer -- the Qt FamiTracker player GUI
https://github.com/christopherpow/nesic ... ibs/in_ftm -- the WinAmp input plugin
Re: Writing a NSF player
by on (#230125)
That would most likely work if it weren't in C++.
I already have a general idea on how I should structure the code for the CPU part, but what would be the best way to implement the RAM that the code in the NSF file uses? Should I just make a byte array that goes from $0000 to $07FF or is there a better way to do this? Mostly the size of the RAM area is what I'm concerned about, since I can't know where FamiTracker's NSF files try to store variables and how many bytes of RAM they need.

EDIT: Also, do FTM files have any support for sound effects and some sort of priority system to decide what to play?
Re: Writing a NSF player
by on (#230128)
FTM doesn't have sound effects.

Though, the FTM format is kind of a time grid of note events, and each note plays an "instrument" at a specific pitch. The instrument is sort of like a sound effect... but they don't overlay on the music, they are how you make the music. No priority system, each note plays on the channel where it's placed.
Re: Writing a NSF player
by on (#230133)
SusiKette wrote:
Do you have any recommendations on the FFT library? Although if writing a resampler that has the necessary functionality for this project isn't hard to make and has usable documentation on how to make one then that can be an option as well.

As for Mesen, it seemed to be written in C++ as well, so that won't be useful either.

Oh. I've never used C# so I'm afraid I don't really have any other suggestions that would be useful to you. Sorry.

Writing a BLEP resampler is fairly straightforward; the only complicated mathematics required are for generating the filter table. Once you have that, it's a simple convolution: read a sample, multiply by a row of the table, add to the output. Since most PSG output is stepwise, if you differentiate the signal first then most of the samples going into the resampler will be zero, meaning you can skip them to save CPU. Since the rate of transitions is (roughly) limited to the audible range regardless of sample rate, the performance is as well. Reverse the differentiation with a leaky integrator afterwards and you're done.
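The differentiate / leaky-integrate bookkeeping described above can be sketched as follows. This is only a minimal illustration and assumes hypothetical names (`DeltaSketch`, `Differentiate`, `Integrate`); it leaves out the band-limited filter table entirely, so it is not a resampler on its own. In a real BLEP resampler each non-zero delta would be multiplied by a row of the filter table and added into the lower-rate output before the integration step.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not taken from any existing player.
public static class DeltaSketch
{
    // Turn a stepwise signal into (index, delta) pairs, skipping zero deltas.
    // For PSG-style output most samples produce no entry at all.
    public static List<(int Index, double Delta)> Differentiate(double[] signal)
    {
        var deltas = new List<(int Index, double Delta)>();
        double previous = 0.0;
        for (int i = 0; i < signal.Length; i++)
        {
            double d = signal[i] - previous;
            if (d != 0.0) deltas.Add((i, d));
            previous = signal[i];
        }
        return deltas;
    }

    // Rebuild the signal with a running sum. A leak slightly below 1.0 slowly
    // pulls the output back toward zero; a leak of exactly 1.0 is lossless.
    public static double[] Integrate(List<(int Index, double Delta)> deltas, int length, double leak)
    {
        var output = new double[length];
        double acc = 0.0;
        int next = 0;
        for (int i = 0; i < length; i++)
        {
            acc *= leak;
            if (next < deltas.Count && deltas[next].Index == i)
            {
                acc += deltas[next].Delta;
                next++;
            }
            output[i] = acc;
        }
        return output;
    }
}
```

The point of the split is that the work done per second depends on the number of transitions, not on the 1.79 MHz input rate.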

I can't remember any good tutorials off the top of my head, though. I've been meaning to write one but never found the time.

SusiKette wrote:
I already have a general idea on how I should structure the code for the CPU part, but what would be the best way to implement the RAM that the code in the NSF file uses? Should I just make a byte array that goes from $0000 to $07FF or is there a better way to do this? Mostly the size of the RAM area is what I'm concerned about, since I can't know where FamiTracker's NSF files try to store variables and how many bytes of RAM they need.

The NSF wiki page lists all the required address ranges, under "Summary of Addresses". RAM when not using FDS or MMC5 is at $0000-$07FF and $6000-$7FFF, just like a normal NES ROM with WRAM. For a stripped-down NSF player you can map RAM from $0000 to $7FFF if you want to keep it simple. The only readable port in that range (assuming no FDS, MMC5 or 163 expansions) is $4015, so as long as you special-case that, any conforming NSF should work fine.
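A minimal sketch of that stripped-down memory map, assuming a non-bankswitched NSF with no FDS, MMC5 or 163 expansion. The class name and the two hooks (`ReadApuStatus`, `WriteRegister`) are hypothetical placeholders for whatever APU and expansion-audio emulation the player ends up with.

```csharp
using System;

// Hypothetical flat memory map: RAM at $0000-$7FFF, NSF data at $8000-$FFFF.
public sealed class SimpleNsfMemory
{
    private readonly byte[] ram = new byte[0x8000];
    private readonly byte[] rom = new byte[0x8000];

    // Hypothetical hook: the APU emulation supplies the $4015 status byte.
    public Func<byte> ReadApuStatus = () => 0;

    // Hypothetical hook: receives APU writes ($4000-$4017) and any write to
    // $8000 and up, which is where expansion audio registers such as the
    // VRC6's live.
    public Action<ushort, byte> WriteRegister = (address, value) => { };

    // Copy non-bankswitched NSF data to its load address ($8000 or higher).
    public void LoadRom(byte[] data, ushort loadAddress)
    {
        Array.Copy(data, 0, rom, loadAddress - 0x8000, data.Length);
    }

    public byte Read(ushort address)
    {
        if (address == 0x4015) return ReadApuStatus(); // the one special case
        if (address < 0x8000) return ram[address];
        return rom[address - 0x8000];
    }

    public void Write(ushort address, byte value)
    {
        if (address >= 0x4000 && address <= 0x4017)
        {
            WriteRegister(address, value);
        }
        else if (address < 0x8000)
        {
            ram[address] = value;
        }
        else
        {
            WriteRegister(address, value); // ROM itself is never modified
        }
    }
}
```

Treating everything below $8000 as one array trades away the RAM mirroring of a real NES for simplicity, which matches the "keep it simple" suggestion above.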
Re: Writing a NSF player
by on (#230159)
So if I only want to support VRC6 I only need RAM addresses from $0000 to $07FF? One way to implement the $6000 to $7FFF range could be to create a separate array and see if bits %0110 0000 0000 0000 are set to see that the address is above $6000. If this is the case, the $6000-$7FFF range array is used and then you subtract $6000 from the address to get the relative address within the array (i.e. $6003 would be index 3 of the array).

And if I understood correctly the loading, initializing and playing the tune that is described on the wiki page are something that is already in the NSF and not something that I have to worry about?

EDIT: Do NSF players and emulators commonly keep the CPU status flags as individual bool values and put them on a byte if you need to push them to the stack etc. or kept as a byte like they are on the NES?
Re: Writing a NSF player
by on (#230161)
SusiKette wrote:
So if I only want to support VRC6 I only need RAM addresses from $0000 to $07FF? One way to implement the $6000 to $7FFF range could be to create a separate array and see if bits %0110 0000 0000 0000 are set to see that the address is above $6000. If this is the case, the $6000-$7FFF range array is used and then you subtract $6000 from the address to get the relative address within the array (i.e. $6003 would be index 3 of the array).

RAM is present at $6000-7FFF for all expansions (or no expansions). Many NSFs will not use it, but it is available there per the spec for all NSFs.

FDS has some additional RAM requirements (and $6000-7FFF becomes bankable), but it's the only expansion that alters the specified RAM.

SusiKette wrote:
And if I understood correctly the loading, initializing and playing the tune that is described on the wiki page are something that is already in the NSF and not something that I have to worry about?

No, that describes the situation you need to provide before calling the INIT subroutine in the NSF.

SusiKette wrote:
EDIT: Do NSF players and emulators commonly keep the CPU status flags as individual bool values and put them on a byte if you need to push them to the stack etc. or kept as a byte like they are on the NES?

Yes I think that is common.
Re: Writing a NSF player
by on (#230166)
There might still be hope for using some of the C++ projects that have been suggested. I figured that if the necessary bits are converted to a .dll, they can be imported into the program I'm using. Although I'm not exactly sure how the conversion works, I guess it's worth a try. I don't think there is any point in writing a completely new player if an existing one can be used.
Re: Writing a NSF player
by on (#230186)
SusiKette wrote:
EDIT: Do NSF players and emulators commonly keep the CPU status flags as individual bool values and put them on a byte if you need to push them to the stack etc. or kept as a byte like they are on the NES?

I keep them as a byte. I found it simpler that way. YMMV.
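A minimal sketch of the byte approach, with helpers so that individual flags still read like bools. The class and method names are hypothetical; the bit positions are the standard 6502 ones (C, Z, I, D, B, unused, V, N from bit 0 upward).

```csharp
using System;

// Hypothetical status register wrapper around a single byte.
public sealed class StatusRegister
{
    public const byte C = 0x01, Z = 0x02, I = 0x04, D = 0x08,
                      B = 0x10, U = 0x20, V = 0x40, N = 0x80;

    public byte Value = U | I;

    public bool Get(byte flag) => (Value & flag) != 0;

    // Every flag-changing instruction either sets or clears, never ORs.
    public void Set(byte flag, bool on)
    {
        if (on) Value |= flag;
        else Value &= (byte)(~flag & 0xFF);
    }

    // Most loads, transfers and ALU operations update Z and N together.
    public void SetZN(byte result)
    {
        Set(Z, result == 0);
        Set(N, (result & 0x80) != 0);
    }

    // Byte pushed by PHP/BRK (fromInstruction = true) or by a hardware
    // interrupt (false): bit 5 is always set, bit 4 only for PHP/BRK.
    public byte ForPush(bool fromInstruction)
    {
        return (byte)(Value | U | (fromInstruction ? 0x10 : 0x00));
    }
}
```

With this layout PHP and PLP become a single byte copy, while code such as ADC can still call `Set(C, ...)` without touching bit masks directly.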

rainwarrior wrote:
FDS has some additional RAM requirements (and $6000-7FFF becomes bankable), but it's the only expansion that alters the specified RAM.

Nitpick: MMC5 adds EXRAM at $5C00. Because why not.
Re: Writing a NSF player
by on (#230207)
Using .dll seemed to be useless effort.
As for the player, do I have to map the contents of the NSF file to the $8000 - $FFFF region similarly how I map RAM to $0000 - $07FF and $6000 - $7FFF?
Re: Writing a NSF player
by on (#230208)
Yes. See the section on loading and bankswitching.

The NSF data can be up to 1MB in size, mapped into 8 x 4k banks.

If it's not a bankswitching NSF, you simply load the data from the file at the LOAD address specified in the header (somewhere within $8000-FFFF).
Re: Writing a NSF player
by on (#230216)
So in other words if bank swapping is not used the loading part consists only of loading the actual file to memory so that the code can read it?
In one NSF I have exported the load address is $98D5. I'm assuming that this is the address in ROM where LOAD is located (but what is actually there and do I need to run that code at some point?). Also, is the 0x80 address in the NSF file that is the start of music and program data the start point of the "ROM" in it which I should start mapping from $8000 onward?
Re: Writing a NSF player
by on (#230223)
SusiKette wrote:
Using .dll seemed to be useless effort.

In what sense?
Re: Writing a NSF player
by on (#230225)
cpow wrote:
SusiKette wrote:
Using .dll seemed to be useless effort.

In what sense?


The program sort of didn't know it was there. It did display in the project tree and it did recognize some information about it, but you couldn't use it for some reason. It just didn't recognize it when I tried to actually include it into a script. I'm not sure if it was a bug or if it wasn't fully compatible.
Re: Writing a NSF player
by on (#230290)
So here is a question about the carry flag. If I'm testing for overflow to set the carry flag, do I have to clear it if the condition to set it was not met? (This might be relevant if carry was already set before the operation.) Of course with ADC the carry flag would be used in the addition if it was set (is it cleared after?).

EDIT: Does the carry flag stay set if an overflow happens after adding the carry flag?

Code:
//cv.A = Accumulator
//cv.A_temp = Value of accumulator before operation

    public void OverflowTest()
    {
        if(cv.A_temp > cv.A)
        {
            Set_C();
        }
        else    //Is this part necessary?
        {
            Clear_C();
        }
    }


Secondly, do I have to code interrupts to the player? If I have to, which interrupts do I include and how should they be handled?
Re: Writing a NSF player
by on (#230296)
SusiKette wrote:
cpow wrote:
SusiKette wrote:
Using .dll seemed to be useless effort.

In what sense?


The program sort of didn't know it was there. It did display in the project tree and it did recognize some information about it, but you couldn't use it for some reason. It just didn't recognize it when I tried to actually include it into a script. I'm not sure if it was a bug or if it wasn't fully compatible.


Yeah looks like I've some work to do since I only use it in C++ applications. :roll:
Re: Writing a NSF player
by on (#230303)
SusiKette wrote:
If I'm testing for overflow to set the carry flag, do I have to clear it if the condition to set it was not met

Yes. Every instruction that changes a flag can set it to 1 or clear it to 0. There aren't really any instructions where the flag output from an instruction is OR'd with the existing flag value. But many instructions leave a flag unchanged: off the top of my head, only ADC, SBC, BIT, CLV, PLP, and RTI affect V. If an instruction does not modify a particular flag, such as AND and ORA not affecting C, then programs will expect its value to be preserved across the instruction.

SusiKette wrote:
Does the carry flag stay set if an overflow happens after adding the carry flag?

Yes. This means that if carry is set, ADC #$FF or SBC #$00 leaves both A and carry unchanged.
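One way to get both behaviors right is to do the addition in a wider type and derive carry from bit 8 of the result, rather than comparing the accumulator before and after. A minimal sketch, assuming hypothetical names (`Alu`, `Adc`, `Sbc`) and binary mode only, since the NES CPU has no decimal mode:

```csharp
using System;

// Hypothetical ALU helpers; carry is both an input and an output.
public static class Alu
{
    public static byte Adc(byte a, byte operand, ref bool carry, out bool overflow)
    {
        int sum = a + operand + (carry ? 1 : 0);
        carry = sum > 0xFF;               // always written: set OR cleared
        byte result = (byte)(sum & 0xFF);
        // Signed overflow: both inputs share a sign that the result lacks.
        overflow = ((a ^ result) & (operand ^ result) & 0x80) != 0;
        return result;
    }

    // SBC is ADC with the operand's bits inverted.
    public static byte Sbc(byte a, byte operand, ref bool carry, out bool overflow)
    {
        return Adc(a, (byte)(operand ^ 0xFF), ref carry, out overflow);
    }
}
```

A before/after comparison of A misses the ADC #$FF case with carry set, because A ends up unchanged even though the sum overflowed 8 bits.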

SusiKette wrote:
Secondly, do I have to code interrupts to the player?

Only for the experimental timer support in NSF version 2. Otherwise, the only thing resembling an interrupt is the one that calls the PLAY routine.
Re: Writing a NSF player
by on (#230306)
tepples wrote:
Only for the experimental timer support in NSF version 2. Otherwise, the only thing resembling an interrupt is the one that calls the PLAY routine.


Do I have to detect the ending of the PLAY (and INIT) routines somehow or is there an infinite loop at the end that waits until I call the PLAY routine again? (basically change the program counter there or is there some other method to do it?)
Re: Writing a NSF player
by on (#230309)
The INIT and PLAY routines end with an rts instruction. But because they may themselves call subroutines, you need to wait for an rts instruction to move the stack pointer into the area of the stack that you have reserved for the player.
Re: Writing a NSF player
by on (#230310)
SusiKette wrote:
So in other words if bank swapping is not used the loading part consists only of loading the actual file to memory so that the code can read it?
In one NSF I have exported the load address is $98D5. I'm assuming that this is the address in ROM where LOAD is located (but what is actually there and do I need to run that code at some point?). Also, is the 0x80 address in the NSF file that is the start of music and program data the start point of the "ROM" in it which I should start mapping from $8000 onward?


A LOAD of $98D5 means put the data from the NSF at $98D5. (If not bankswitching.)

Yes, the data begins at $80 in the file.
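For the non-bankswitched case this can be sketched roughly as below. The class and method names are hypothetical. The header offsets used (magic `NESM` plus $1A at $00, LOAD at $08, INIT at $0A, PLAY at $0C, bankswitch bytes at $70-$77) are from the NSF header layout on the wiki, and the sketch assumes a load address of $8000 or above.

```csharp
using System;

// Hypothetical loader for a non-bankswitched NSF.
public static class NsfLoader
{
    // Header words are little-endian.
    public static ushort ReadWord(byte[] file, int offset)
    {
        return (ushort)(file[offset] | (file[offset + 1] << 8));
    }

    // Returns a 64 KiB image with the NSF data copied to its load address.
    public static byte[] LoadFlat(byte[] file, out ushort load, out ushort init, out ushort play)
    {
        if (file.Length < 0x80 || file[0] != 'N' || file[1] != 'E' ||
            file[2] != 'S' || file[3] != 'M' || file[4] != 0x1A)
            throw new ArgumentException("Not an NSF file.");

        // Any non-zero byte at $70-$77 means the NSF uses bankswitching.
        for (int i = 0x70; i < 0x78; i++)
            if (file[i] != 0)
                throw new NotSupportedException("Bankswitched NSF.");

        load = ReadWord(file, 0x08);
        init = ReadWord(file, 0x0A);
        play = ReadWord(file, 0x0C);
        if (load < 0x8000)
            throw new NotSupportedException("Load address below $8000.");

        var memory = new byte[0x10000];
        // Everything from file offset $80 onward is program and music data.
        int count = Math.Min(file.Length - 0x80, 0x10000 - load);
        Array.Copy(file, 0x80, memory, load, count);
        return memory;
    }
}
```

Nothing special lives at the LOAD address itself; it is only where the first data byte goes. The code that actually gets executed is whatever the INIT and PLAY addresses point to.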
Re: Writing a NSF player
by on (#230313)
tepples wrote:
...you need to wait for an rts instruction to move the stack pointer into the area of the stack that you have reserved for the player.


Can you explain this a bit further?
Re: Writing a NSF player
by on (#230315)
An NSF player is allowed to reserve the end of the stack page for memory used by the NSF player itself as opposed to memory used by the NSF being played. This is done to allow for NSF players that run on an NES, such as the NSF player in the PowerPak, which need to keep track of state such as current song, controller presses, and/or elapsed time. Suppose for example that an NSF player reserves memory addresses $01F1 through $01FF for its own use. Then when calling the INIT or PLAY routine of the NSF being played, the NSF player will push a return address within the NSF player's code, which address is stored at $01F0 (high byte) and $01EF (low byte), leaving the stack pointer (register S) at $EE before the CPU jumps to the INIT or PLAY routine. When the INIT or PLAY routine executes the rts instruction, it pulls the program counter from $01EF and $01F0, leaving the stack pointer at $F0. This would cause an NES-based NSF player to enter a routine that performs tasks such as input reading and waiting for the next opportunity to run the PLAY routine. An emulator-based NSF player, such as the one you are writing, would instead detect the end of the INIT or PLAY routine in one of two ways: the stack pointer (register S) has entered the area reserved for the player's use or the PC has taken on a particular reserved value denoting completion. You might choose $4100 or the like for this value.

In the preceding, which was the first word you failed to understand?
Re: Writing a NSF player
by on (#230340)
You don't have to put anything on the stack for an NSF player emulator, maybe let's take one step back from this.

INIT and PLAY are both subroutines, that could normally be called via JSR. The subroutine runs, possibly calling its own subroutines, but eventually it finishes with an RTS like all subroutines do.

When a subroutine is finished, the stack register will be returned to the position it started at by that RTS. Your NSF player could detect that the routine has finished just by checking the stack position, but there are other ways to do it as well.


Some NSF players, especially hardware ones, push a bit of extra stuff to the stack to help manage this. This is what tepples is talking about. Because according to the spec the NSF is allowed to use all RAM in the system the way it likes, except the stack region, the stack is the only safe place for a hardware NSF player to use for its own memory storage. Emulators don't have to do this, though they can. (It can be convenient for the implementation.)


One more thing: the PLAY routine does not necessarily have to return. Usually this is for NSFs that play recorded PCM samples over the DMC channel. Not all NSF players support this properly, but to support it, the NSF player should continue to produce sound in step with the CPU's actions, whether or not PLAY returns.
Re: Writing a NSF player
by on (#230380)
I think I got the basic idea on how this works now. I think what I'll do is that when I call INIT or PLAY I'll push a return address to the stack, change the program counter to the routine's address and enable the CPU. The return address points to a part of the memory the program counter normally doesn't have any business being at, and this is tested after each opcode is executed. If it is within that area, I'll disable the CPU until I need to call PLAY again.
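That plan could look roughly like this. It is a sketch under assumptions: the class name, the `stepOneInstruction` callback (standing in for the CPU core) and the `maxSteps` guard are all hypothetical, and $4100 is just the example sentinel suggested earlier in the thread. The pushed value is the sentinel minus one because RTS adds 1 to the address it pulls.

```csharp
using System;

// Hypothetical wrapper that calls INIT or PLAY and detects the final RTS.
public sealed class RoutineCaller
{
    public const ushort Sentinel = 0x4100; // PC value meaning "routine done"

    public byte[] Memory = new byte[0x10000];
    public ushort PC;
    public byte S = 0xFF;

    public void Push(byte value)
    {
        Memory[0x0100 + S] = value;
        S = (byte)((S - 1) & 0xFF);
    }

    public byte Pull()
    {
        S = (byte)((S + 1) & 0xFF);
        return Memory[0x0100 + S];
    }

    // Runs until PC reaches the sentinel or maxSteps instructions have
    // executed. The guard matters because PLAY is allowed to never return.
    public int Call(ushort routine, Action stepOneInstruction, int maxSteps)
    {
        ushort returnAddress = (ushort)(Sentinel - 1);
        Push((byte)(returnAddress >> 8));   // high byte first, like JSR
        Push((byte)(returnAddress & 0xFF));
        PC = routine;

        int steps = 0;
        while (PC != Sentinel && steps < maxSteps)
        {
            stepOneInstruction();
            steps++;
        }
        return steps;
    }
}
```

A player that wants to support non-returning PLAY routines would count CPU cycles instead of instructions here and keep generating audio while the loop runs.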
Re: Writing a NSF player
by on (#230410)
Can I run one frame's code (one PLAY routine) all at once, or should I be waiting the appropriate amount of time between every instruction based on how long each instruction takes to process (559 ns per cycle iirc)? The player has to wait for the next PLAY call anyway, even if it finishes the routine sooner than it would with per-instruction waits. Could this cause any serious issues with the player?
Re: Writing a NSF player
by on (#230415)
Non-returning PLAY routines, particularly those that play raw PCM through writes to $4011, depend on each instruction taking as many cycles as it actually takes.
Re: Writing a NSF player
by on (#230504)
Do NSF players usually run on a single thread or on multiple threads? If they run on one, should the player really take much processing power at all?
Re: Writing a NSF player
by on (#230511)
The only computationally intensive thing is resampling the original audio stream at 1.8MHz down to whatever is supported by the sound card, and that only if you synthesize audio that way in the first place.
Re: Writing a NSF player
by on (#230717)
Unless some error occurred in my emulation code, FamiTracker's NSF files seem to use some illegal opcodes. I have currently set it so that when encountering one, a message is printed to console and the player is halted. The specific opcode is LAX, zpg ($A7). Since I have to implement these, how do the status flags behave on the ones that are usable? Some are mentioned to affect no flags, though. At least for LAX, both A and X will have the same value, so it probably doesn't matter which register you check the flags with, unless the status flags behave differently for some reason
Re: Writing a NSF player
by on (#230723)
Regular Famitracker does not use illegal opcodes AFAIK, so that could indicate another issue, though I couldn't say whether the 0CC fork uses them.

I think I learned most of what I needed to know about the illegal opcodes from this document, and by inspecting source code of a few emulators:
http://www.oxyron.de/html/opcodes02.html
Re: Writing a NSF player
by on (#230729)
The original cause of the issue was JMP incrementing PC after it was set to the target value. Next problem seems to be either with JSR or RTS. I'm not sure which one it is though.

Just in case there is some obvious mistake that I just can't see here is how I have implemented the said instructions: https://pastebin.com/GpFKfKQf
I'm not sure how easy it is to read, but I added some comments to hopefully make it a bit easier to understand.

(EDIT: cv. and sr. are references to other files)
Re: Writing a NSF player
by on (#230734)
JSR is supposed to add 2 to PC before pushing it, and RTS is supposed to add 1 after pulling PC. Some NSFs rely on this as a means of jumping through a jump table.
Re: Writing a NSF player
by on (#230755)
I did a few tests some time ago before implementing these instructions, but just to be sure:
-JSR pushes the high byte to stack first and then the low byte
-RTS pulls the low byte first and then the high byte
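Putting the push order and the +2 / +1 adjustments together, a minimal sketch might look like this. The class is hypothetical and assumes `PC` holds the address of the JSR or RTS opcode when each method is called.

```csharp
using System;

// Hypothetical fragment of a CPU core showing only JSR and RTS.
public sealed class JsrRtsSketch
{
    public byte[] Memory = new byte[0x10000];
    public ushort PC;      // address of the opcode about to execute
    public byte S = 0xFF;  // stack pointer, offset into page $01

    private void Push(byte value)
    {
        Memory[0x0100 + S] = value;
        S = (byte)((S - 1) & 0xFF);
    }

    private byte Pull()
    {
        S = (byte)((S + 1) & 0xFF);
        return Memory[0x0100 + S];
    }

    // JSR pushes the address of its own last byte (opcode address + 2),
    // high byte first, then jumps to the operand.
    public void Jsr()
    {
        int lo = Memory[(PC + 1) & 0xFFFF];
        int hi = Memory[(PC + 2) & 0xFFFF];
        int returnAddress = (PC + 2) & 0xFFFF;
        Push((byte)(returnAddress >> 8));
        Push((byte)(returnAddress & 0xFF));
        PC = (ushort)((hi << 8) | lo);
    }

    // RTS pulls low byte, then high byte, and resumes one byte past it.
    public void Rts()
    {
        int lo = Pull();
        int hi = Pull();
        PC = (ushort)((((hi << 8) | lo) + 1) & 0xFFFF);
    }
}
```

Because RTS always adds 1, code that pushes a table entry minus one and then executes RTS lands exactly on the table entry, which is the jump-table trick mentioned above.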
Re: Writing a NSF player
by on (#231937)
I took a look at bankswitching and I don't understand the meaning of the "padding bytes" that you get from load_addr AND #$0FFF. Can someone explain this a bit further?
Re: Writing a NSF player
by on (#232014)
An NSF that doesn't use bankswitching can be loaded directly into NES memory based on the load address, and one that does needs to be loaded as a (potentially larger) ROM that gets mapped into NES memory.

If I understand it correctly (I'll need to implement this myself soon), basically you're building a ROM out of 4kB banks. AND #$0FFF keeps the padding within that 4kB size; the upper 4 bits of the load address are ignored. The NSF load address becomes the beginning load address for your ROM. Like this:

NSF header says load address is $8456 (or $A456, $F456 would all be the same)
[ROM at 0x000000] bank 0
[padding up until 0x000456]
[load NSF data from here on]
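The layout above can be sketched in code as follows. The class and member names are hypothetical, and wrapping an out-of-range bank number with a modulo is an assumption of this sketch rather than something the thread specifies. The eight 4kB slots at $8000-$FFFF are selected by writes to $5FF8-$5FFF, with initial values taken from header bytes $70-$77.

```csharp
using System;

// Hypothetical bankswitched ROM: padded image plus eight 4 KiB bank slots.
public sealed class BankedRom
{
    private readonly byte[] rom;
    public readonly byte[] Banks = new byte[8];

    public int BankCount { get { return rom.Length / 0x1000; } }

    public BankedRom(byte[] nsfData, ushort loadAddress, byte[] initialBanks)
    {
        // Only the low 12 bits of the load address matter here.
        int padding = loadAddress & 0x0FFF;
        // Round the image up to a whole number of 4 KiB banks (at least one).
        int size = Math.Max(0x1000, (padding + nsfData.Length + 0x0FFF) & ~0x0FFF);
        rom = new byte[size];
        Array.Copy(nsfData, 0, rom, padding, nsfData.Length);
        Array.Copy(initialBanks, Banks, 8);
    }

    // CPU read from $8000-$FFFF.
    public byte Read(ushort address)
    {
        int slot = (address - 0x8000) >> 12;
        int bank = Banks[slot] % BankCount; // assumption: wrap bad bank numbers
        return rom[bank * 0x1000 + (address & 0x0FFF)];
    }

    // CPU write to $5FF8-$5FFF selects the bank for one slot.
    public void WriteBankRegister(ushort address, byte value)
    {
        Banks[address - 0x5FF8] = value;
    }
}
```

With a load address of $8456, the first data byte ends up at ROM offset 0x456, so it is visible at $8456 whenever bank 0 is mapped into the first slot.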