question about sprite dma

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
question about sprite dma
by on (#8033)
I still have doubts about how sprite DMA is accomplished. I know that the value written to $4014 is the "hi" byte and that $2003 is the "low" byte, which together form a 16-bit address, and that the transfer goes from that location in CPU RAM to the 256-byte sprite RAM.

But I have been reading that this whole process takes 512 CPU cc, so which is more convenient: copying all 256 bytes to SPR-RAM directly from location "HHLL" in one go, or, since it takes 512 CPU cc, using those cycles to copy byte by byte to SPR-RAM, incrementing the SPR-RAM address and continuing until the 256th byte is copied?

Some questions: does DMA stop the CPU's fetch-decode-execute? While DMA is "copying" bytes to SPR-RAM, do those cycles count towards PPU emulation, i.e. must 512 CPU cc * 3 PPU cc be executed? Or maybe not exactly 512 * 3, but whatever each read and write takes, which I think is why it comes to 512 cc.

Thanks.
Re: question about sprite dma
by on (#8034)
Anes wrote:
I still have doubts about how sprite DMA is accomplished. I know that the value written to $4014 is the "hi" byte and that $2003 is the "low" byte, which together form a 16-bit address, and that the transfer goes from that location in CPU RAM to the 256-byte sprite RAM.


I'm not sure this is right -- there are some games which write 0 to $2003, write to $2004 (making $2003=01), then do DMA from there. If you offset by that 1 when DMA'ing, the sprites get all garbled.

I'm currently just doing DMA from HH00 (where HH is the value written to $4014) -- although I'm not sure that's entirely correct.


Quote:
But I have been reading that this whole process takes 512 CPU cc


513. There seems to be a dummy cycle in there somewhere.

Quote:
does DMA stop the CPU's fetch-decode-execute?


Yes. CPU is effectively frozen for the 513 CPU cycles it takes to DMA. PPU, APU and everything else continue normally.

If you're doing the typical method of emulation where you have a timestamp for each system (CPU/APU/PPU etc) to sync them up -- this can easily be simulated by just adding 513 to the CPU timestamp on DMA.
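
To make the timestamp idea concrete, here is a minimal sketch in C. All names (Timestamps, on_oam_dma, ppu_catch_up) are made up for illustration and are not from any particular emulator. It assumes the 513-cycle figure above and the NTSC ratio of 3 PPU dots per CPU cycle.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; not taken from any real emulator. */
enum {
    OAM_DMA_CPU_CYCLES     = 513, /* figure quoted in this thread */
    PPU_DOTS_PER_CPU_CYCLE = 3    /* assumption: NTSC timing */
};

typedef struct {
    uint64_t cpu_cycles; /* CPU timestamp, in CPU cycles */
    uint64_t ppu_dots;   /* PPU timestamp, in PPU dots */
} Timestamps;

/* Called when the CPU writes to $4014: the CPU simply loses 513 cycles. */
static void on_oam_dma(Timestamps *t)
{
    t->cpu_cycles += OAM_DMA_CPU_CYCLES;
}

/* Advance the PPU timestamp until it matches the CPU timestamp.
 * Returns how many dots the PPU had to run to catch up. */
static uint64_t ppu_catch_up(Timestamps *t)
{
    uint64_t target = t->cpu_cycles * PPU_DOTS_PER_CPU_CYCLE;
    assert(target >= t->ppu_dots); /* the PPU must never be ahead of the CPU */
    uint64_t advanced = target - t->ppu_dots;
    /* A real emulator would render dot by dot here. */
    t->ppu_dots = target;
    return advanced;
}
```

With this layout the PPU is never frozen: it just finds itself 513 * 3 = 1539 dots behind after a DMA and runs them the next time it is synced.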

by on (#8035)
That was me -- keep getting logged out

by on (#8037)
Thanks!! Now Battletoads runs. It seems the PPU needed those DMA cycles.
I freeze everything else when DMA is "ON", but I still have doubts about this.
I don't know too much about computer DMA, but I heard that it is done to let the CPU work while another device is accessing memory. Maybe this doesn't apply to the NES, but if the docs say that it takes an amount of CPU cycles, it must be that it freezes the CPU.

ANES: Battletoads READY 8)

by on (#8040)
So it's really just a memcpy(). It takes 1 cycle to let the CPU bus clear out and then 256 read-write pairs. Well at least Nintendo's being consistent, as what is called "DMA" on the Game Boy Advance is just a memcpy() as well.
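
As a rough sketch of that memcpy() view, in C: one dummy cycle, then 256 read-write pairs. The function name and the flat 64 KiB cpu_mem array are assumptions for illustration (a real emulator would go through its memory mapper), and the copy is assumed to start at sprite RAM index 0, i.e. $2003 = 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of sprite DMA as a cycle-counted memcpy().
 * cpu_mem: flat 64 KiB CPU address space (assumption for this sketch).
 * page:    value written to $4014, so the source is $HH00-$HHFF.
 * oam:     the 256-byte sprite RAM.
 * Returns the number of CPU cycles consumed. */
static unsigned oam_dma_copy(const uint8_t *cpu_mem, uint8_t page, uint8_t *oam)
{
    assert(cpu_mem != NULL && oam != NULL);

    unsigned cycles = 1; /* one cycle to let the CPU bus clear out */
    for (unsigned i = 0; i < 256; i++) {
        uint16_t src = (uint16_t)(((unsigned)page << 8) | i);
        uint8_t value = cpu_mem[src]; /* read cycle */
        cycles++;
        oam[i] = value;               /* write cycle */
        cycles++;
    }
    return cycles; /* 1 + 256 * 2 */
}
```

Counting this way gives the 513 figure mentioned earlier in the thread.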

by on (#8042)
Nowadays (in today's computers) DMA allows direct access to memory without needing to interrupt what the CPU is doing. This requires the CPU to be able to execute without having to go to memory - as soon as the CPU needs to access memory, it must halt until any pending DMA action is completed. Since today's CPUs have caches, they do not need to access physical memory all the time, which leaves the data bus free for periods of time. It is during these free periods that a DMA can be accomplished without halting the CPU.

The NES's CPU performs a physical memory access on EVERY clock cycle. Therefore, DMA will always halt the CPU. However, the benefit of DMA still exists in the sense that time is saved - performing 256 LDA/STA instruction pairs takes much longer than performing sprite DMA.
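
To put rough numbers on that comparison, here is a small C sketch that adds up standard 6502 instruction timings. The function names are made up for illustration. It assumes no page crossings, absolute addressing for the write to $2004, and the 513-cycle DMA figure from earlier in the thread.

```c
#include <assert.h>

/* Standard 6502 cycle counts, assuming no page crossings. */
enum {
    LDA_ABS       = 4,  /* LDA $02nn   (unrolled copy) */
    LDA_ABS_X     = 4,  /* LDA $0200,X (looped copy)   */
    STA_ABS       = 4,  /* STA $2004 */
    INX_CYCLES    = 2,
    BNE_TAKEN     = 3,
    BNE_NOT_TAKEN = 2,
    OAM_DMA       = 513 /* figure quoted in this thread */
};

/* Fully unrolled: one LDA/STA pair per byte, no loop overhead. */
static unsigned unrolled_copy_cycles(unsigned bytes)
{
    return bytes * (LDA_ABS + STA_ABS);
}

/* Loop of LDA $0200,X / STA $2004 / INX / BNE; the last BNE falls through. */
static unsigned looped_copy_cycles(unsigned bytes)
{
    assert(bytes > 0);
    return bytes * (LDA_ABS_X + STA_ABS + INX_CYCLES)
         + (bytes - 1) * BNE_TAKEN + BNE_NOT_TAKEN;
}
```

Under these assumptions, even the fully unrolled copy of 256 bytes costs 2048 cycles, about four times the cost of the DMA, and the looped version costs more still.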