Quick question... ^_^;; How many CPU cycles does a sprite DMA transfer (4014h) take? I was told 512 cycles, but is there something more...?
Not including the 4 cycles used for the "STA $4014", sprite DMA takes exactly 513 cycles. This has been timed on a real NES by using a strictly cycle-timed video effect (namely, enabling grayscale mode for a square region in the center of the screen such that it stays reasonably still even over thousands of frames).
Thanks. Two things now...
1. What's the reason for the "extra" cycle? I know there's 1 cycle to read and 1 to write per byte, making 512 cycles for 256 bytes. Why 513?
2. Should I clock the 4 cycles (of STA) before or after the DMA?
And yes, the STA instruction should complete before the DMA begins.
Apparently the extra cycle is required for the 2A03 I/O logic to halt the 6502 core for long enough to reliably start the DMA without interrupting a write. (Imagine if someone hypothetically did INC $4014 or ASL $4014; you'd get a read-write-write.) The DMC playback takes longer than it "should" to fetch bytes as well (four cycles rather than one) because if an NMI or IRQ occurs, it has to wait for up to three stack writes to complete before interrupting the CPU.
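For anyone wiring this into an emulator, the accounting described in this thread could be sketched roughly as below. This is only an illustration using the figures quoted above (4 cycles for the STA, then 513 for the DMA). The function names, the flat 64 KB `cpu_ram` array, and the assumption that the copy starts at sprite RAM address 0 are all hypothetical simplifications, not taken from any particular emulator.

```python
# Hypothetical cycle accounting for a write to $4014, based on the numbers
# quoted in this thread. All names here are made up for illustration.

STA_ABS_CYCLES = 4     # STA $4014 (absolute addressing)
DMA_HALT_CYCLES = 1    # the "extra" cycle discussed above
DMA_BYTES = 256        # one full page is copied to sprite RAM
CYCLES_PER_BYTE = 2    # one read cycle + one write cycle


def sprite_dma_cycles():
    """Cycles taken by the DMA itself, not counting the STA."""
    return DMA_HALT_CYCLES + DMA_BYTES * CYCLES_PER_BYTE


def sta_4014(cpu_ram, oam, page, cycle_counter):
    """Emulate STA $4014 with `page` in A; return the new cycle count."""
    # The STA instruction completes first...
    cycle_counter += STA_ABS_CYCLES
    # ...then the DMA copies $xx00-$xxFF into sprite RAM.
    # Assumption: the sprite RAM address starts at 0.
    base = page << 8
    for i in range(DMA_BYTES):
        oam[i] = cpu_ram[base + i]
    cycle_counter += sprite_dma_cycles()
    return cycle_counter
```

Clocking the STA before the DMA, as done here, matches the order given in the answer above; the total cost of the write is then the 4 instruction cycles plus the 513 DMA cycles.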