Lossless NES-specific video encoding/stream format thoughts

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
Lossless NES-specific video encoding/stream format thoughts
by on (#148887)
I was musing about this and had a preliminary notion of how to do it, like MIDI (but actually being a compliant IFF)
NES video format
extensions: .nvf, .nesv, .nesvideo
big-endian (for little-endian long values, begin with "RIFF" instead)
[Reminder: AIFF = 4-char chunkname, ULONG chunklength, then chunkdata, except for the FORM chunk which excludes the filetype. Even-length chunks only.]

always: "FORM" + ULONG(4B): bytes in file (excl. this 12) + "NESV"
always: "NVhd" [+ ULONG(4B): bytes in header]
-NTSC? PAL? DEND?
-Mapper? (what kind of sounds)

usually: "NAME" [+len] + string of title. Null terminator unnecessary.
'AUTH' /* chunkID for Author Chunk */
'(c) ' /* chunkID for Copyright Chunk */
'ANNO' /* chunkID for Annotation Chunk */
'COMT' : comments chunk (standard AIFF)
'MARK' : markers chunk (standard AIFF)
'ROM ' : what rom is it on (for use with? to get CHR?)
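
A minimal sketch of the container layout above, to make the byte arithmetic concrete. Assumptions, none of which the post settles: the NVhd payload here is two placeholder bytes (TV system, mapper), the NViD chunk is left empty, and the FORM length follows the usual IFF convention of counting the 4-byte "NESV" type (so it is file size minus 8, not minus 12 as the outline suggests). The helper names chunk and form_nesv are made up for illustration.

```python
import struct

def chunk(chunk_id, data):
    """Build one IFF chunk: 4-char ID, big-endian ULONG length, data, pad byte if odd."""
    if len(chunk_id) != 4:
        raise ValueError("chunk ID must be exactly 4 bytes")
    # The pad byte keeps chunks even-length; it is not counted in the length field.
    pad = b"\x00" if len(data) % 2 else b""
    return chunk_id + struct.pack(">I", len(data)) + data + pad

def form_nesv(chunks):
    """Wrap chunks in a FORM container of type NESV.

    ASSUMPTION: the length counts the 4-byte form type, as in standard IFF.
    """
    body = b"NESV" + b"".join(chunks)
    return b"FORM" + struct.pack(">I", len(body)) + body

# Hypothetical header payload: byte 0 = TV system (0 taken to mean NTSC), byte 1 = mapper.
header = chunk(b"NVhd", bytes([0, 0]))
name = chunk(b"NAME", b"Intro")      # 5 bytes of text, so one pad byte is added
events = chunk(b"NViD", b"")         # event data omitted in this sketch
blob = form_nesv([header, name, events])
print(len(blob), blob[:12])
```

Switching to the little-endian variant would mean writing "RIFF" and packing lengths with "<I" instead.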

always: "NViD" [+len] + any number of events, in sequential order.
Byte of scanline to execute event (0 = "prerender", 241+ = postrender), then bytecode of event, then data.
-00 Frame break: VWF # frames to skip (0 = "done for this frame")
-xx "fine"/Mid-line-event: Which pixel specifically to execute the next event, rather than before the scanline.

-$20-$27 PPU Register hits (one can, in theory, use only this + framebreaks)
--PPUCTRL,PPUMASK,PPUSTATUS,OAMADDR,OAMDATA,PPUSCROLL,PPUADDR,PPUDATA
--then databyte
--$28: PPU_SCROLL_WHOLE: both bytes
--$29: PPU_ADDR_WHOLE: both bytes
-$40-$57 APU/I/O register hits
--$58: DMC_WHOLE: the waveform itself, rather than pointer/length
--$59: OAMDMA_WHOLE: then whole OAM contents, rather than the byte of address to copy from
-xx Palette change (whole palette? redundant with...)
-xx PPURAM data block (incl. PT, NT, AT)
--address to start at (2 byte), length of block (2 byte), then data.
-xx Fine palette block
--NT address of tile to start at, number of tiles to hit, then each's palette, four to a byte
[actual ExGrafx-style would require including the whole of CHR/referring to the proper ROM- which is probably desirable space-wise, rather than just acting like there's RAM and overwriting to emulate a bankswitch]
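
A rough sketch of how a few of the events above might be serialized, assuming each event is a scanline byte, then the bytecode, then its data. Everything not spelled out in the outline is a guess: bytecodes $20-$27 are taken to map one-to-one onto PPU registers $2000-$2007, and the frame-break skip count is written as a single byte rather than the "VWF" the outline mentions. The function names are hypothetical.

```python
FRAME_BREAK = 0x00        # frame break; data = number of frames to skip
PPU_REG_BASE = 0x20       # ASSUMPTION: $20-$27 correspond to $2000-$2007 in order
PPU_SCROLL_WHOLE = 0x28   # both PPUSCROLL bytes in one event

def ppu_write(scanline, reg, value):
    """Single-byte write to a PPU register given as its CPU address ($2000-$2007)."""
    if not 0x2000 <= reg <= 0x2007:
        raise ValueError("not a PPU register address")
    return bytes([scanline, PPU_REG_BASE + (reg - 0x2000), value])

def scroll_whole(scanline, x, y):
    """PPU_SCROLL_WHOLE event carrying both scroll bytes."""
    return bytes([scanline, PPU_SCROLL_WHOLE, x, y])

def frame_break(scanline, skip=0):
    """Frame break. ASSUMPTION: skip count stored as one byte, not a variable-width field."""
    return bytes([scanline, FRAME_BREAK, skip])

# One frame: set PPUMASK and scroll before rendering, change scroll at
# scanline 32, then end the frame in the post-render range (241+).
frame = (
    ppu_write(0, 0x2001, 0x1E)
    + scroll_whole(0, 0, 0)
    + scroll_whole(32, 64, 0)
    + frame_break(241)
)
print(frame.hex())
```

This frame comes to 14 bytes, which gives a feel for how small a mostly static frame would be in such a format.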


As the original notion was also about streaming, the events could also be sent as a stream (again, MIDI-style) (scanline-positions, frame-terminators would still be present, but would be less likely to include more frames of inactivity).

Thoughts/revisions/suggestions welcome.
Re: Lossless NES-specific video encoding/stream format thoug
by on (#148889)
Well, the existing method is just ROM + starting savestate + input stream, which works pretty well for a lot of purposes, and has relatively compact data.

If you want to make a movie-maker application for people to create an NES video, and then export it to a ROM or something like that, I figure the format of your file should be based around the technical needs of that tool and compiled ROM, not some general idea of everything the NES can do. This approach is a little bit inside-out for that, I think.

What other applications are you thinking of? What's the intended use of this file format?
Re: Lossless NES-specific video encoding/stream format thoug
by on (#148891)
rainwarrior wrote:
Well, the existing method is just ROM

Video game publishers are historically more likely to use copyright to shut down a stream that begins with a ROM file than a stream that contains only video.

Quote:
What other applications are you thinking of? What's the intended use of this file format?

Based on the previous topic, I think it's to stream video to the public.
Re: Lossless NES-specific video encoding/stream format thoug
by on (#148893)
(TASvideos, formerly NESvideos, could also have a use for it in encodes of NES games, as might speedrunners...or just people who want to make demos or mockups?) It could also just be for encoding NES screenshots, even, as that would be a single frame, though I doubt there'd be much if any advantage over existing image formats. (Note to self: add compression option of some form)

rainwarrior wrote:
If you want to make a movie-maker application for people to create an NES video, and then export it to a ROM or something like that, I figure the format of your file should be based around the technical needs of that tool and compiled ROM, not some general idea of everything the NES can do. This approach is a little bit inside-out for that, I think.
That's because it's more about capturing video from a played ROM (like NSF is for audio) and encoding it in a way that reduces/removes desynch from the equation without removing any of the video data. Obviously not every event is compatible with export-to-ROM, and one could in theory make a mapper that supports a lot of it, arguably all of what commercial games did.

Strictly speaking, one could do with just...PPUScroll, PPUMask, PPUCtrl, PPURAM-block, and OAM_Whole (and audio) events, with frame delineation. (It allows for iframes or delta just fine.)

The problem mentioned here about having to develop a new client-side thing would be sidestepped, as one would basically be making a pseudo-mapper (again à la NSF) though strictly-speaking the file format can do things the NES can't (really, just a full-palette change in one pixel). As long as an emulator renders by-scanline, it's basically a scanline-interrupt (possibly pixel-interrupt, though I can't actually think of any legitimate, desirable effects that require it, except maybe MMC2-style bankswitch) mapper with cart-side always-accessible nametables, dense paletting, and expansion audios in the truly-general case. (A very smart mapper wouldn't need to bother running much at all through the CPU except for palettes, OAM, 2A03 audio...but that's a slightly different pie-in-the-sky)

(Now why did I calculate that a feeder mapper could do 4963 straight register writes per NTSC frame...?) LDA #xx, STA $yyyy: obviously you'd only need two bytes per write, as there aren't so many registers... One could just point the NMI vector at the start of the 32k, do 4963 writes, then some NOPs to last until the NMI hits again, with some room for an init routine...
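
For what it's worth, the 4963 figure can be reproduced from the usual NTSC timing numbers. This is a back-of-the-envelope check, assuming 341 PPU dots by 262 scanlines per frame with one dot skipped every other frame while rendering is on, 3 PPU dots per CPU cycle, and a write costing LDA immediate (2 cycles, 2 bytes) plus STA absolute (4 cycles, 3 bytes). NMI entry overhead is ignored, and the variable names are only illustrative.

```python
from fractions import Fraction

DOTS_PER_SCANLINE = 341
SCANLINES_PER_FRAME = 262

# With rendering enabled, one dot is skipped on every other frame,
# so the average frame is half a dot short.
ppu_dots = DOTS_PER_SCANLINE * SCANLINES_PER_FRAME - Fraction(1, 2)
cpu_cycles = ppu_dots / 3            # 3 PPU dots per CPU cycle on NTSC

CYCLES_PER_WRITE = 2 + 4             # LDA #xx (2) + STA $yyyy (4)
BYTES_PER_WRITE = 2 + 3              # LDA #xx (2) + STA $yyyy (3)

writes_per_frame = cpu_cycles // CYCLES_PER_WRITE
code_bytes = writes_per_frame * BYTES_PER_WRITE
spare_bytes = 32 * 1024 - code_bytes

print(float(cpu_cycles), writes_per_frame, code_bytes, spare_bytes)
```

So the unrolled writes would take about 24.2 KiB of the 32k window, leaving roughly 7.8 KiB for the NOP tail and an init routine.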
Re: Lossless NES-specific video encoding/stream format thoug
by on (#148896)
TASvideos actually takes a lot of submissions in the existing compact form of savestate + input stream (and you find the ROM yourself, obviously). If copyright is a concern, obviously storing the CHR data is going to be a sticky problem, which this existing method already works around.


If you're trying to facilitate high quality streaming with low bandwidth, a different approach might be to try to implement a lossless RGB video codec directed toward emulated games in general (not just NES). Something like Lagarith, but with features like multiple planes/layers and indexed colours w/palette. It could be integrated with emulators for efficient encoding, and an available codec could make it easy to add support from streaming services. Or even just an NES codec if that's the approach you'd rather take, but I think you could get a lot more versatile by getting away from the specifics of NES.
Re: Lossless NES-specific video encoding/stream format thoug
by on (#148900)
I know input-files [+ROM-identity] is how the submissions work at TASvideos, because it makes for easy verifiability.

Savestate-starts are generally banned at TASvideos. Did you mean save-data for the relatively-rare from-save plays, such as for when a mode isn't available on first-play?

This would be worthless for submissions because it could easily be "cheated", i.e. made to show things the game can't do (or that have nothing to do with a game! Video game hoaxes made easy!). It would be more for encodes as a possibility, though I suppose it would be an odd intermediate...

I was kind of disappointed back when I learned how NSFs work.

(/tangent)

I suppose including the CHR data proper falls into that "Hmm, copyright?" territory, but at some level it's going to be there anyway in a video, minus unused tiles, unless one goes the "oh and get the ROM too" way...
Re: Lossless NES-specific video encoding/stream format thoug
by on (#148903)
<tangent>
Myask wrote:
I was kind of disappointed back when I learned how NSFs work.
As opposed to, what, a more MIDI-like format like VGMs?

There's a fairly well established tradition of using "strip out the non-music code" for chiptune formats: SID, SNDH, AY, &c &c.
Project AY wrote:
The main benefits of the AY file format (when compared to logged file formats, for example) are as follows:
  • Multiple tunes can be stored in a single AY file.
  • The original Z80 code remains completely intact meaning that unused tunes can be played, as can samples.
  • AY files are small - 8KB per file on average.
  • Perfect tune looping is maintained where tunes are programmed to loop.
  • AY files can be loaded back into an original Spectrum/CPC computer.
</tangent>
Re: Lossless NES-specific video encoding/stream format thoug
by on (#148907)
Myask wrote:
Savestate-starts are generally banned at TASvideos. Did you mean...

I was referring to the FCEUX movie file format used, which can include savestates as well as the input stream, and some other miscellaneous bits. Whether or not TASvideos would allow a savestate seems unimportant to this discussion. My point was this format is established and used already by them and others, and it works very well for the purpose of delivering low-bandwidth perfect-quality NES video.

(It's just not very good for integration into existing streaming forms, or for people who don't want to hunt down ROMs, etc.)