So, the PPU has a lot of cycles in which it's not actively using its bus. On each rendering scanline, of the 341 pixels (= cycles) per scanline, 170 are spent reading the bus, 1 is idle (on almost all scanlines), and 170 are spent latching the lower 8 bits of the address bus (because A0..7 and D0..7 are multiplexed; I'll call these ALE cycles because that's the 8051 term). ALE cycles are one pixel long, or ~186ns on NTSC, so it's cheap to get RAM fast enough to fit a whole extra access into each one.
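To put rough numbers on that, here's a back-of-the-envelope sketch. The 341 and 170 figures are the ones above; the NTSC master clock, the divide-by-4, and counting only the 240 visible scanlines are my assumptions, and it optimistically assumes one byte moved per ALE cycle.

```python
# Rough DMA budget for the ALE half of each PPU fetch.
# Assumed NTSC constants (not taken from the post itself):
MASTER_CLOCK_HZ = 21_477_272   # NTSC master clock
PPU_DIVIDER = 4                # PPU dot clock = master / 4
VISIBLE_SCANLINES = 240        # only visible rendering scanlines counted

# Figures from the post:
CYCLES_PER_SCANLINE = 341
ALE_CYCLES_PER_SCANLINE = 170  # one potential DMA byte each (assumption)

# Length of one PPU cycle (one pixel) in nanoseconds.
pixel_ns = 1e9 * PPU_DIVIDER / MASTER_CLOCK_HZ

# Upper bound on DMA slots while rendering, per frame.
slots_per_frame = ALE_CYCLES_PER_SCANLINE * VISIBLE_SCANLINES

print(f"pixel period: {pixel_ns:.1f} ns")
print(f"DMA slots per frame: {slots_per_frame}")
```

Under those assumptions that's about 40 thousand slots per frame, several times the size of an 8 kB CHRRAM, so the slot count itself is unlikely to be the bottleneck.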
So we could build a mapper that could DMA data to or from the CHRRAM by disconnecting CHRRAM from the PPU on these ALE cycles. Plus we could do this while rendering is enabled, although we'd have to be careful about not tearing.
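As a sketch of the arbitration, here's a toy model of who would own CHRRAM on each cycle of a rendering scanline. `chr_bus_owner` is a made-up name, and the layout (cycle 0 idle, then odd cycles are ALE and even cycles are the PPU's actual read) is my simplifying assumption; a real mapper would just watch the ALE pin rather than count cycles.

```python
from collections import Counter

CYCLES_PER_SCANLINE = 341  # figure from the post


def chr_bus_owner(cycle: int) -> str:
    """Return who may use CHRRAM on a given PPU cycle of a rendering scanline.

    Assumed layout (hypothetical simplification): cycle 0 is the idle
    cycle; after that each fetch is an ALE cycle (odd) followed by the
    PPU's data read (even).
    """
    if not 0 <= cycle < CYCLES_PER_SCANLINE:
        raise ValueError("cycle out of range")
    if cycle == 0:
        return "idle"
    # ALE cycle: the PPU is only latching the address, so the mapper can
    # disconnect CHRRAM from the PPU and do its own access.
    return "dma" if cycle % 2 == 1 else "ppu"


# Tally how the scanline is split between the PPU and the mapper.
owners = Counter(chr_bus_owner(c) for c in range(CYCLES_PER_SCANLINE))
print(dict(owners))
```

The tally comes out to the same 170/1/170 split as described above.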
The downsides:
* We could only write to RAM that was on the cart itself, so we'd either have to supply our own RAM for the nametables, or not be able to DMA to them.
* (for PRG->CHR) We have no way to force the CPU to stop reading, so one option is to piggyback on its existing read cycles (and feed the CPU fake data like "ORA #0").
* (for CHR->PRG) We'd have to force PRGRAM off the bus while we were writing to it, i.e. code that read/wrote/ran from it during DMA wouldn't work.
* (either direction) The other option is to have fast PRGRAM+ROM on the cart (~70ns) and do our reads/writes in the first quarter of ph2. Regardless, external hardware can't rapidly read or write the 2kB of RAM inside the NES, so if we're not just blitting data out of ROM (and if we are, why not use CHRROM with nice bankswitching?) we need PRGRAM too.
In any case, the point of all of this is: I've been trying and failing to brainstorm how you'd use this ability. Any ideas?