iNES based NES music format

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
by on (#66349)
blargg in another thread wrote:
Yeah, the basic idea is that an iNES music file is a totally standard iNES ROM whose reset handler reads the track number to play, etc. from zero-page bytes. This way an emulator doesn't need any special handling, other than doctoring the initial values of those zero-page locations before powering up the emulated NES. To change tracks, you repeat the above process. So you never need to know anything about how the music code works, where its routines are, etc. and the music code has full flexibility in how it works, and what mapper it uses. This of course would supplement NSF, rather than replace it, since NSF works fine for most music.

That covers the essential functionality. You could also have the info strings at the beginning of PRG data, so that they are always at offset $10 in the file. And you could have some way of plugging in a NES-based UI, so that it can be booted standalone on a NES. Perhaps the reset code would check for a signature in zero-page; if not present, it jumps to an area where the NES-based front-end can be plugged in, and also specifies routines similar to NSF for changing the track while running.

I want to make this happen. I have been trying to get my tracker (Pornotracker) to a state where I could release it (I was supposed to release it 3 years ago). It supports raw PCM so exporting to NSF alone isn't going to work. I'll ship the specification with it so hopefully more people will adopt the format in the future.

Let's see what we have here:
- the emulator (or PowerPak or whatever) and the ROM communicate through zero page variables
- the music header sits at the beginning of the ROM ($10 in the file)
- before reset the emulator writes the track number and a signature to zero page, allowing the ROM to know if the emulator supplies an interface for playback control
- when the user changes tracks, the CPU is reset (after the signature and track number are rewritten)
- use the .nes file extension or define a new one?

What's in the header:
- a signature string (to tell the emulator it's a music file)
- version number
- number of songs
- starting song
- PAL/NTSC/both flag
- title of the song/artist/other info in string format
- track names/length information

Some of the header info should be optional.

Suggestions/comments? Can you think of something else that should be included in the header?

by on (#66356)
iNES music file
$00 Standard iNES header, whatever mapper you desire
$10 Standard NSF header
...

Player
Load NES ROM normally.
Check for NSF header at $10.
If NSF, put following in zero-page:

$00 4-byte signature, not sure what it'll be
$04 track number

Then emulate ROM as normal, no further interaction with it. To select NTSC/PAL mode, change timing of APU/PPU accordingly.
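A minimal sketch of the player side described above, in Python. The "NESM" zero-page signature is an assumption (the post leaves it undecided), the function names are made up for illustration, and the file is assumed to have no 512-byte trainer, so PRG data starts at file offset $10. The two magic strings are the standard iNES and NSF ones.

```python
INES_MAGIC = b"NES\x1a"    # first four bytes of every iNES file
NSF_MAGIC = b"NESM\x1a"    # first five bytes of a standard NSF header
ZP_SIGNATURE = b"NESM"     # assumption: the thread had not fixed a signature yet
INES_HEADER_SIZE = 16


def is_music_rom(rom: bytes) -> bool:
    """True if rom is an iNES file with an NSF header at file offset $10."""
    if rom[:4] != INES_MAGIC:
        return False
    start = INES_HEADER_SIZE
    return rom[start:start + len(NSF_MAGIC)] == NSF_MAGIC


def seed_zero_page(ram: bytearray, track: int) -> None:
    """Write the signature to $00-$03 and the track number to $04."""
    ram[0x00:0x04] = ZP_SIGNATURE
    ram[0x04] = track & 0xFF


def power_up_ram(rom: bytes, track: int) -> bytearray:
    """Return the 2 KiB of CPU RAM as it should look just before reset."""
    ram = bytearray(0x800)
    if is_music_rom(rom):
        seed_zero_page(ram, track)
    return ram
```

Changing tracks would be the same call with a new track number, followed by a reset of the emulated CPU.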

Program in music file
See if 4-byte signature is at $00.
If signature is there, start specified track.
If not, run normal player that controls music via controller.
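The decision the ROM's reset handler makes can be modeled in a few lines. This is a Python model of the logic only; the real thing would be 6502 code in the ROM, and the "NESM" signature and the function name are assumptions.

```python
from typing import Optional

SIGNATURE = b"NESM"  # assumed; must match whatever the player writes to $00-$03


def requested_track(zero_page: bytes) -> Optional[int]:
    """Return the track the player asked for, or None if no player set one up."""
    if bytes(zero_page[0:4]) == SIGNATURE:
        return zero_page[4]
    return None  # fall back to the controller-driven player
```

On real hardware RAM contents at power-up are not guaranteed, so a 4-byte signature keeps an accidental match unlikely.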


By using a standard NSF header, current NSF parsing code can be reused with minor modifications. Might simplify things. Obviously some of the fields wouldn't be used by the player.

For NSF players being modified to support this, it could be allowed that the PPU only supports bit 7 of $2000 (and possibly $2002), nothing more, so the NSF player doesn't have to implement a lot of extra stuff. Might also specify a set of core mappers that every player should support.

To help the OS tell them apart, music ROMs could have a different file extension, but .nes ones should still be supported. The benefit of the .nes extension is that current NES emulators can play them just fine.

by on (#66359)
blargg wrote:
For NSF players being modified to support this, it could be allowed that the PPU only supports bit 7 of $2000 (and possibly $2002), nothing more, so the NSF player doesn't have to implement a lot of extra stuff.

Which would rule out using sprite 0 for 120 Hz updates.

The extension could be "iNSF", and the signature could be "NESM" which matches the first bytes of PRG ROM.

Dual-compatible NTSC/PAL songs would have to wait a frame to detect the video system.

How would timing work? Perhaps $2007 could be repurposed as a way for the program to tell the player software the name of the playing track and how long the track is. (This info is not in the NSF header.)

Game soundtracks ripped into iNSF would have to relocate some code if they already have something at $8000-$807F.

by on (#66367)
Quote:
[minimal PPU support] would rule out using sprite 0 for 120 Hz updates.

You're right. A full player must support the PPU. We could have a header flag noting which PPU features the music requires, so that a reduced-PPU player can reject such files rather than hang on them.

Quote:
The extension could be "iNSF", and the signature could be "NESM" which matches the first bytes of PRG ROM.

The file must be a standard iNES ROM.

Quote:
Dual-compatible NTSC/PAL songs would have to wait a frame to detect the video system.

Yes. A player can easily remove any initial silence from the output, so there's no delay.

Quote:
How would timing work? Perhaps $2007 could be repurposed as a way for the program to tell the player software the name of the playing track and how long the track is. (This info is not in the NSF header.)

I'm very much against this. The emulator never communicates with the song. It emulates it exactly as a normal iNES ROM. To change tracks, it just resets the ROM with the new track number in zero-page. Meta-information goes in the header. Yes, it lacks per-track names and times, as NSF does. If that is needed, then let's add it to the header (and perhaps in a way that NSF can also support the same additions). Separate issue though, unrelated to the core one: NES music files that can use all NES features.

Quote:
Game soundtracks ripped into iNSF would have to relocate some code if they already have something at $8000-$807F.

This isn't a replacement of NSF, at least not right now.

I desire to keep this really minimal and simple. Basically, how can we support all NES features for music, with minimal emulator changes and complexity?

by on (#66368)
blargg wrote:
Quote:
The extension could be "iNSF", and the signature could be "NESM" which matches the first bytes of PRG ROM.

The file must be a standard iNES ROM.

The first four bytes of the file are NES^Z, but the first four bytes of PRG ROM are NESM because, as thefox said, "the music header sits at the beginning of the ROM ($10 in the file)".

Quote:
Meta-information goes in the header. Yes, it lacks per-track names and times, as NSF does. If that is needed, then let's add it to the header

Then you might have to investigate how NSFE stores track metadata.

by on (#66369)
Oh sorry, I see you meant the signature it puts into zero-page could be NESM. I like that as a signature.

As for track names, I personally think the ROM format shouldn't have any meta-information. Leave that to a game music format-agnostic wrapper of some sort. The ROM should only serve one purpose: store the music player code in a way that can be easily emulated.

by on (#66379)
blargg wrote:
By using a standard NSF header, current NSF parsing code can be reused with minor modifications. Might simplify things. Obviously some of the fields wouldn't be used by the player.

I don't really think this is a good idea. First of all, there's a lot of useless information. Secondly, supporting this format will always require a lot of changes in a music player which previously only had to handle the APU and a single mapper... parsing the header is one of the easier things to do. :)

blargg wrote:
To help OS, music ROMs could have a different file extension, but .nes ones should be supported. Benefit of .nes extension is that current NES emulators can play them just fine.

Yeah emulators definitely should support it regardless of the extension. It would be mostly for organizing purposes, so that you could differentiate application ROM files from music files by looking at the filename.

As for the meta-information... I'm undecided. On the one hand I would like the format to be simple, but then again leaving it out might just complicate things in the long run, since we would have to standardize yet another format and get authors to adopt it. If such info is supported, it should be optional.

blargg wrote:
Quote:
Game soundtracks ripped into iNSF would have to relocate some code if they already have something at $8000-$807F.

This isn't a replacement of NSF, at least not right now.

True, most game tracks should still be ripped as NSFs, but there are some that would work better in this format.

A lot of tough decisions to make.

by on (#66405)
If the meta-information is in some standard format, then the ROM's internal player (used when run as iNES instead of iNSF) can read that same block of data when drawing its UI.

by on (#66410)
tepples, excellent point about meta-information. This was partly one reason I suggested an embedded NSF header, so that an embedded player could use the init/play addresses from it.

thefox, point taken about the header format, so it will not be in NSF format then. Let's keep two concerns separate: the information a player needs to play the music, and the meta-information (track length, name, etc.). That way we can focus first on what the player needs, that is, whatever would prevent playback if it were missing.

For the required NES features, I realized that the default should be that an iNES music file requires a fully-accurate NES. Then we have flags that specify that it will work with lesser systems, for example a PPU that only supports $2000 bit 7, or $2002 bit 7.
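One way to picture the "full NES by default, flags for lesser systems" rule, sketched in Python. The flag names and bit values are hypothetical; nothing in the thread assigns them. None stands for "full NES" on both sides.

```python
from enum import IntFlag
from typing import Optional


class PpuFeature(IntFlag):
    """Hypothetical bits for the reduced PPU features named in the post."""
    NMI_ENABLE = 0x01   # $2000 bit 7
    VBL_FLAG = 0x02     # $2002 bit 7


def can_play(needs: Optional[PpuFeature], player: Optional[PpuFeature]) -> bool:
    """needs=None: file requires a fully accurate NES (the default).
    player=None: player is a fully accurate NES."""
    if player is None:
        return True          # a full emulator plays everything
    if needs is None:
        return False         # reduced player must reject full-NES files
    return (needs & player) == needs
```

A reduced player would call can_play() when loading and refuse the file instead of letting it hang.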

by on (#66565)
Have you guys seen the player code for the NESM spec? The code was converted by Chris Covell, and I in turn recently converted it to ASM6. You can take any NSF and use the tune switching code with it. It's mapper 0 and only has enough graphics in it to display the number of the tune. You change tunes by pressing the start button.

I don't know about raw PCM working in this player, but I suppose it's possible since it's not strictly an NSF anymore.

I happen to use this quite often.

by on (#66644)
Gil-Galad wrote:
Have you guys seen the player code for the NESM spec? The code was converted by Chris Covell, and I in turn recently converted it to ASM6. You can take any NSF and use the tune switching code with it. It's mapper 0 and only has enough graphics in it to display the number of the tune. You change tunes by pressing the start button.

Never heard of it. Link?

blargg wrote:
For the required NES features, I realized that the default should be that an iNES music file requires a fully-accurate NES. Then we have flags that specify that it will work with lesser systems, for example a PPU that only supports $2000 bit 7, or $2002 bit 7.

Very true. Probably sprite 0 hit ($2002 bit 6) too. Anything else?

by on (#66656)
Here is the asm file that I converted to ASM6. You can get the original copy on Chris Covell's site inside the NSF ripping document.

NESM player

by on (#66661)
thefox wrote:
blargg wrote:
For the required NES features, I realized that the default should be that an iNES music file requires a fully-accurate NES. Then we have flags that specify that it will work with lesser systems, for example a PPU that only supports $2000 bit 7, or $2002 bit 7.

Very true. Probably sprite 0 hit ($2002 bit 6) too. Anything else?

That is, a PPU that only supports sprite 0 hit where the upper-left pixel of the sprite is always hitting a background pixel. Trivial to implement while giving the full benefit.

Sprite overflow where the first 9 sprites must be the ones that trigger it, allowing a simple implementation. Also require that sprite transfers be done with DMA and always with SPRADDR=0 when beginning DMA, further simplifying without limiting anything.

CHR ROM/RAM and $2006/$2007. I could imagine a music engine using this as a general-purpose buffer with automatic increment. Especially useful as an 8K sample buffer (or even 10K if you spilled into nametable). Couple with a mapper that switches CHR banks and you have space for PCM samples separate from PRG.

$2002 bit 7 that is set each frame and cleared when read, but without any special cases or NMI suppression since those take a significant amount of code to implement properly.

$2000 bit 7, where enabling it in the middle of VBL while the flag is still set doesn't cause a second NMI.
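The two simplified registers just described ($2002 bit 7 and $2000 bit 7) fit in a very small model. This is a sketch in Python with made-up class and method names; it deliberately has no NMI suppression and never raises a late NMI when bit 7 of $2000 is turned on mid-VBL.

```python
class MinimalPpu:
    """Reduced PPU: only the VBL flag and the NMI enable bit exist."""

    def __init__(self) -> None:
        self.nmi_enabled = False
        self.vbl_flag = False

    def start_vblank(self) -> bool:
        """Call once per frame. Returns True if an NMI should be raised."""
        self.vbl_flag = True
        return self.nmi_enabled

    def end_vblank(self) -> None:
        self.vbl_flag = False

    def write_2000(self, value: int) -> None:
        # Simplification: enabling NMI while the flag is set raises nothing.
        self.nmi_enabled = bool(value & 0x80)

    def read_2002(self) -> int:
        # Bit 7 reflects the VBL flag and is cleared by the read itself.
        value = 0x80 if self.vbl_flag else 0x00
        self.vbl_flag = False
        return value
```

The only place an NMI can originate is start_vblank(), which is what keeps the implementation short.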

One feature that especially a full-accuracy iNES music player should support is hooking the audio to the composite video output :)

by on (#67311)
blargg wrote:
$2002 bit 7 that is set each frame and cleared when read, but without any special cases or NMI suppression since those take a significant amount of code to implement properly.

$2000 bit 7, where enabling it in the middle of VBL while the flag is still set doesn't cause a second NMI.

I'm not really sure how many helpers like this we should provide...

blargg wrote:
CHR ROM/RAM and $2006/$2007. I could imagine a music engine using this as a general-purpose buffer with automatic increment. Especially useful as an 8K sample buffer (or even 10K if you spilled into nametable). Couple with a mapper that switches CHR banks and you have space for PCM samples separate from PRG.

Why didn't I think of this before. :)

by on (#67312)
Why didn't I think of it? arfink gets credit for the idea, for cases where you need an 8K buffer but don't have any WRAM on the cartridge or don't want to alter it. The auto-incrementing is a big plus. If you have a large CHR ROM, it's a perfect place for lots of PCM samples.
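To make the auto-increment point concrete, here is a rough Python model of streaming bytes out of CHR through $2006/$2007. The class and helper names are made up, only the 8K pattern table range is modeled, the increment is fixed at 1, and the address latch is simplified (on hardware it is shared with $2005 and reset by reading $2002). It does include the one-byte read buffer, which is why the first read after setting the address is thrown away.

```python
class ChrPort:
    """Model of $2006/$2007 access to 8 KiB of CHR ROM/RAM."""

    def __init__(self, chr_data: bytes) -> None:
        assert len(chr_data) == 0x2000
        self.mem = bytearray(chr_data)
        self.addr = 0
        self.expect_high = True   # next $2006 write is the high byte
        self.read_buffer = 0

    def write_2006(self, value: int) -> None:
        if self.expect_high:
            self.addr = ((value & 0x3F) << 8) | (self.addr & 0x00FF)
        else:
            self.addr = (self.addr & 0x3F00) | (value & 0xFF)
        self.expect_high = not self.expect_high

    def read_2007(self) -> int:
        # Reads are delayed by one: return the old buffer, then refill it.
        value = self.read_buffer
        self.read_buffer = self.mem[self.addr & 0x1FFF]
        self.addr = (self.addr + 1) & 0x3FFF
        return value

    def write_2007(self, value: int) -> None:
        # Only meaningful when the cartridge has CHR RAM.
        self.mem[self.addr & 0x1FFF] = value & 0xFF
        self.addr = (self.addr + 1) & 0x3FFF


def read_block(port: ChrPort, start: int, count: int) -> bytes:
    """Set the address once, discard the stale byte, then stream count bytes."""
    port.write_2006(start >> 8)
    port.write_2006(start & 0xFF)
    port.read_2007()
    return bytes(port.read_2007() for _ in range(count))
```

After the address is set, each sample costs a single read of $2007, with no pointer arithmetic on the CPU side.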

Quote:
I'm not really sure how many helpers like this we should provide...

As few as possible. I just figured I'd enumerate each distinct useful thing. Then we can go over them and decide which ones require significant work to implement in an emulator, and only then consider giving them separate flags.

The less stuff the better. Featuritis lurks behind every corner.

by on (#67314)
If I was designing an NSF format based on NES files, I'd just make it put an initial value into register A to select the song, and otherwise act exactly like a regular NES emulator.
Obviously emulate the PPU's NMI. Maybe emulate the rest of the PPU as well just for the sprite 0 hit, but that's probably overkill, and would waste a bunch of CPU time.
Use whatever mappers exist. Might lead to mapper hell though.
No built-in custom playback speed, use one of the cycle counting mappers (like FME7, or VRC) if you want that.
And if you need to save on CPU usage for some reason, you can always use PocketNES style speedhacks to detect an idle CPU state.

by on (#67315)
Quote:
If I was designing an NSF format based on NES files, I'd just make it put an initial value into register A to select the song, and otherwise act exactly like a regular NES emulator.

One goal was for these ROMs to be usable on a NES or a normal NES emulator, if the author wanted that.

Quote:
And if you need to save on [host] CPU usage for some reason, you can always use PocketNES style speedhacks to detect an idle CPU state.

I hadn't thought about the host CPU issue, which is relevant. It would be simplest if the player only optimized, for example, the case of an infinite jump loop.
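Detecting that one case is cheap. A minimal sketch in Python, assuming the emulator exposes some read(address) callback for CPU memory (the function name and callback are illustrative): the loop in question is a JMP absolute (opcode $4C) whose target is its own address.

```python
from typing import Callable

JMP_ABSOLUTE = 0x4C


def is_infinite_jump(read: Callable[[int], int], pc: int) -> bool:
    """True if the instruction at pc is a JMP to its own address."""
    if read(pc) != JMP_ABSOLUTE:
        return False
    low = read((pc + 1) & 0xFFFF)
    high = read((pc + 2) & 0xFFFF)
    return ((high << 8) | low) == pc
```

When this returns True, the emulator could skip ahead to the next NMI or IRQ instead of executing the 3-cycle jump over and over.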