cool, didn't expect this much discussion for this forum.
@byuu: Mode 3 would've been way easier to implement than what I showed in the TMNT demo, I just didn't like the savage ROM requirements since I wanted something to show that I could present with the resources currently available. The 256MB S-DD1 ROM idea sounds so out there, but it'd actually be pretty fun to mess around with. I'd definitely use it, and even though the audio doesn't take up as much space it'd help out there too. I assume you could just extend the mmc[] array in your S-DD1 code to not have the &3? I don't know what else you'd have to do to get a 256MB ROM loaded...
About the PAL vs NTSC issue, I really don't want to make anything tied to one format. Just personal preference, I mean if I buy another console to replace my busted one it'd definitely be an NTSC one, I never want to target PAL for homebrew, but I don't want to eliminate chances of compatibility with both. 240x176 in NTSC sounds good to me actually. It would be good for 20FPS anyway, but I might have to shave off a line or two to fit the palette upload, no big deal.
@tokumaru: I don't think these videos (the ones from me, anyway) have any practical purpose in games beyond novelty. Wouldn't it feel weird having an 8MB ROM with a small intro video and 300kb of actual game content? I say 300kb as an example because that's how much is vacant in the TMNT video. I don't think much can be done without special chips to make it fit practically in any game, and even then it's a major stretch.
------
Speaking of the audio end of it, I would've liked to make all of this powered by just the spc + '816, no fantasy hardware or anything complicated added to it. I wanted to make something that if you were to give a snes oodles of ROM and a simple bank switching setup, nothing more than that, it'd work no problem. I want the original hardware to do most of the heavy lifting.
I currently use 22kHz mono. I don't think software-driven 32kHz stereo is far out of reach, since the SPC routine I have already spends a decent amount of time waiting for the next 2 sample bytes (1 byte for control, 1 byte for audio which I don't need at all).
So there are roughly 64 SPC cycles between scanlines over on the SNES end, and I only transfer every 2 lines. The SPC routine looks like this:
Code:
MainLoopX:
cmp x,$f6 ;wait for matching index
bne MainLoopX
inc x ;protocol count++
mov a,$f4 ;get data byte #1
mov ($00)+y,a ;store to ARAM at appropriate spot
incw $00 ;increment entire 16bit pointer
mov a,$f5 ;get data byte #2
mov ($00)+y,a
incw $00
bbs0 $03,skip ;playback started? skip if so
mov a, $00 ;after 256 bytes uploaded, key on
bne skip
mov $03,#$01 ;set bit 0 of playback started flag, then key on ONE TIME ONLY
mov $f2,#$4c ;KON CH0
mov $f3,#$01
skip:
;cmp $00,#$00 ;once next byte pointed to is $C000, then the last address written was $BFFF
;bne UpdateVolume ;unnecessary due to high byte changing immediately upon reaching the end
cmp $01,#$C0
bne UpdateVolume
mov a,$BFF7 ;read header byte of final block
or a,#$03 ;set loop/end
mov $BFF7,a
mov $00,#$00 ;reset pointer back to $3000 for another bunch of blocks
mov $01,#$30
jmp MainLoopX
UpdateVolume: ;help balance the load between cases
mov $f2,#$0c ;update mvoll
mov a,$f7
mov $f3,a
mov $f2,#$1c ;mvolr
mov $f3,a
jmp MainLoopX
It's very generic, and the bbs0 could be removed just by having another routine that operates without that check. It's not pressed for time or anything though, so I left it as is. It takes in 2 bytes every 2 scanlines.
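For anyone who doesn't read SPC700 assembly, here is a rough Python model of the pointer handling in that routine. It is only a sketch of the logic as posted, not something that runs on hardware: the `StreamBuffer` class and its method names are made up for illustration, and the key on is reduced to a flag. The BRR header bits (bit 0 = end, bit 1 = loop) are the standard ones.

```python
BUF_START = 0x3000      # start of the streaming buffer, as in the routine
BUF_END = 0xC000        # one past the last byte written ($BFFF)
BRR_BLOCK = 9           # BRR block = 1 header byte + 8 data bytes
LOOP_END_FLAGS = 0x03   # header bit 0 = end, bit 1 = loop


class StreamBuffer:
    """Hypothetical stand-in for ARAM plus the $00/$01 pointer and $03 flag."""

    def __init__(self):
        self.aram = bytearray(0x10000)
        self.ptr = BUF_START
        self.playing = False

    def receive_pair(self, byte1, byte2):
        """Model one pass of MainLoopX: store two bytes, then run the checks."""
        for value in (byte1, byte2):
            self.aram[self.ptr] = value
            self.ptr += 1
        # Key on exactly once, when the low pointer byte wraps to 0
        # (256 bytes buffered).
        if not self.playing and (self.ptr & 0xFF) == 0:
            self.playing = True
        # High byte reached $C0: flag the final block and wrap to $3000.
        if (self.ptr >> 8) == (BUF_END >> 8):
            self.aram[BUF_END - BRR_BLOCK] |= LOOP_END_FLAGS   # header at $BFF7
            self.ptr = BUF_START
```

Feeding it $9000 bytes (4096 BRR blocks) brings the pointer back to $3000 with the loop/end bits set on the header at $BFF7, which is what the assembly is meant to do.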
For 32kHz stereo, I'd make something resembling this:
Code:
;$F7 is for sync (A only) sometimes and the rest are for data
;$00 and $02 point to two samples that are 9 * 8 * 256 bytes large ($4800 each)
;assuming $6000 and $A800
;(3 bytes)
ScanlineA:
cmp x,$f7 ;wait for matching index
bne ScanlineA
inc x
mov a,$f4 ;data byte #1
mov ($00)+y,a
inc y ;saves 4 cycles over incw $00, no page boundary cross checking required as long as it's incremented by power of two between checks
mov a,$f5 ;data byte #2
mov ($02)+y,a
inc y
mov a,$f6 ;data byte #3
mov ($00)+y,a
inc y ;43 cycles passed since the CMP loop
;need to stall for time a little bit so might as well throw this check in here
bbs0 $04,wait ;playback already started? then skip the key on
bbc0 $01,wait ;fewer than 256 bytes buffered so far? keep waiting
mov $f2,#$4c ;KON CH0/CH1
mov $f3,#$03
mov $04,#$01 ;only do this once
jmp ScanlineB ;71 cycles elapsed including this jump
wait:
push a ;waste 20 cycles before allowing further stuff (68 altogether, including 5 cycle branch taken bbs0)
pop a
push a
pop a
push a
pop a
nop
;(4 bytes)
ScanlineB:
mov a,$f4 ;data byte #4
mov ($02)+y,a
inc y
mov a,$f5 ;data byte #5
mov ($00)+y,a
inc y
mov a,$f6 ;data byte #6
mov ($02)+y,a
inc y
mov a,$f7 ;data byte #7
mov ($00)+y,a
inc y ;48 cycles for this block
;*insert meaningful activity here*
push a ;18 cycles + 48 = 66
pop a
push a
pop a
nop
;scanline C (1 byte):
ScanlineC:
mov a,$f4 ;data byte #8, $f7 now contains the next control byte for next iteration
mov ($02)+y,a
inc y
bne ScanlineA ;no page crossing means just loop again
inc $01 ;advance page pointers and check if the end is reached
inc $03
cmp $01,#$A8 ;once high byte of first sample points to the start of the second, it has reached the end (for both)
bne ScanlineA
mov a,$A7F7 ;final header first sample..
or a,#$03
mov $A7F7,a
mov a,$EFF7 ;likewise second sample
or a,#$03
mov $EFF7,a
jmp ScanlineA ;52 cycles after this block
Not tested or used or anything. Over 3 scanlines, it will accept 8 bytes of data. 8 bytes so I only need to test for Y crossing page boundary at the end of all of this.
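A quick sanity check on the timing, as a sketch only. The clock figures are assumptions that are not in this post (the commonly cited NTSC master clock, 1364 master clocks per scanline, and a 1.024MHz SPC700), and the per-block cycle totals are copied straight from the comments in the listing, so they are as untested as the listing itself.

```python
# Assumed hardware figures (not from the post).
MASTER_CLOCK_HZ = 21_477_272      # NTSC master clock
MASTER_CLOCKS_PER_LINE = 1364     # master clocks in a normal scanline
SPC_CLOCK_HZ = 1_024_000          # SPC700 cycles per second

# Worst-case totals taken from the comments in the routine (untested).
BLOCK_CYCLES = {"ScanlineA": 71, "ScanlineB": 66, "ScanlineC": 52}


def spc_cycles_per_scanline():
    """SPC700 cycles that elapse during one scanline."""
    line_rate_hz = MASTER_CLOCK_HZ / MASTER_CLOCKS_PER_LINE
    return SPC_CLOCK_HZ / line_rate_hz


def slack_over_three_lines():
    """Cycles left over after all three blocks, out of a 3-scanline budget."""
    return 3 * spc_cycles_per_scanline() - sum(BLOCK_CYCLES.values())
```

Under those assumptions a scanline is about 65 SPC cycles, so the three blocks fit in three scanlines with only a handful of cycles to spare. That spare time is what the cmp/bne wait at the top of ScanlineA would have to absorb.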
224 / 3 * 8 = 592 bytes a frame after rounding down (74 whole passes of 8 bytes), which is 8 short (or 2.6 short without the rounding) of the 600 I need for 64000 samples a sec. Don't really care, I'll make it JUST a tad lower in rate and you won't tell the difference. Of course there is overscan, but meh.
My current 22kHz mono code takes 10-11 scanlines per frame, and tripling that gives 33, which still leaves plenty of time for any videos to run. With the cycle counting and everything I'll probably destroy any shreds of compatibility this ever had with zsnes/snes9x, but I don't care. I'll be debugging this with simultaneous spc+cpu trace logs in bsnes anyway.
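The arithmetic behind those numbers, as a small sketch. The function names are made up for illustration, and the 60 frames a second is an assumption (NTSC is slightly off that). BRR packs 16 samples into each 9-byte block.

```python
BRR_BLOCK_BYTES = 9       # 1 header byte + 8 data bytes
BRR_BLOCK_SAMPLES = 16    # samples decoded from one block


def bytes_per_frame(visible_lines=224, lines_per_pass=3, bytes_per_pass=8):
    """Bytes uploaded per frame, counting only whole passes of the routine."""
    return (visible_lines // lines_per_pass) * bytes_per_pass


def samples_per_second(frame_bytes, fps=60):
    """Total BRR samples per second (both channels) for a given upload size."""
    return frame_bytes * fps / BRR_BLOCK_BYTES * BRR_BLOCK_SAMPLES


def rate_per_channel(frame_bytes, channels=2, fps=60):
    """Playback rate of each channel in Hz."""
    return samples_per_second(frame_bytes, fps) / channels
```

With these, 600 bytes a frame works out to 64000 samples a second (32kHz per channel), and 592 bytes a frame lands a bit under 31.6kHz per channel.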
I don't modify volume in the new routine I just posted, I can just assume max volume until the end where the snes just starts feeding it silence.
------
Also byuu, since you sound interested in videos and all of that, have you made any of these before? This must all be small potatoes to you, so I assume you've done it many times over.
Also, I don't feel like touching anything related to the S-DD1 compression, because I feel it'd be really challenging trying to make a compressor for something that obscure. It would help massively though... Regardless, a 128mbit powerpak with potential MMC support would be my ideal target cartridge, except I don't own one, or an NTSC SNES, or a SNES that actually functions.
edit: I realise there's tons of issues with the spc code I posted although I don't want anyone to go out of their way to fix something I already covered, disregard the horrors in that snippet.