Beginner questions, CGRAM/OAM/VRAM DMA, Program flow

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
Beginner questions, CGRAM/OAM/VRAM DMA, Program flow
by on (#162067)
Hello, I've been attempting SNES development again. I'm using the official dev docs, Fullsnes docs, wikis, Yoshi docs, these forums etc for info but there are lots of things I don't understand.
Lots of questions:

1. The flow chart in chapter 23 in the dev docs doesn't mention shadow OAM. Is it commonly used in SNES games? I reserved space for my OAM data from WRAM page 2 anyway. I guess a VRAM buffer and a CGRAM buffer would be good to have too.

2. There seem to be three CPU speeds. I guess the 1.79 MHz mode is used in emulation mode for compatibility with the 6502? It also seems to be used for the controllers.

3. According to the dev docs, the low 8 kB of WRAM can be accessed from any bank between $00~$3F and $80~$BF. If you access it from bank $80+, wouldn't it gain the benefit of high speed mode, or are the chips too slow?

4. When doing DMA for CGRAM, I couldn't load the colours unless I picked source bank 1 for DMA. But I put the data in bank 7! What's going on? I'll upload my code soon.

5. When using DMA, do I need to wait a bit before using the same channel again so it won't interfere with the first transfer? The dev docs flow chart says to initialize CGRAM and OAM "using 2 channels DMA". What do they mean by that? I just put two DMA transfers after each other, using channel 0 for CGRAM and channel 1 for OAM. But in that case, wouldn't you need at least three channels for BG, sprites and a tilemap when initializing VRAM? The dev docs are a bit confusing on this part.

6. The flow chart also says to make a loop for the VRAM DMA transfers like this:
Code:
loop:
"VRAM address H/L INC" settings
"VRAM address Sequence Mode" settings
"VRAM address" settings
register 2115h 2116h 2117h

VRAM transfer by DMA

OBJ, BG data, BG SC data transferred?
If No: goto loop
If Yes: next

How would this be done in practice? I'd guess I'd read the VRAM using $2139 and $213A and somehow compare it with my data in ROM to see if it's been transferred properly. I haven't seen anyone do this though.

7. Although SEI was used at the beginning of the program, it seems I don't need to use CLI to enable interrupts again like on NES. I don't understand why.

8. The flow chart in chapter 23 says to enable NMI and controllers at $4200 in the beginning of every main program loop. Is this really needed? Haven't seen anyone enabling it more than once.

Thanks in advance!
Re: Beginner questions, CGRAM/OAM/VRAM DMA, Program flow
by on (#162068)
Pokun wrote:
2. There seem to be three CPU speeds. I guess the 1.79 MHz mode is used in emulation mode for compatibility with the 6502? It also seems to be used for the controllers.

As far as I can tell, the ultra-slow (1.79 MHz) mode is used only for the controller ports, with everything else at 2.68 or 3.58 MHz.
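
For reference, the three speeds fall out of dividing the master clock. Here is a small sketch of that arithmetic in Python, assuming an NTSC console with a master clock of about 21.477 MHz and the commonly documented divisors of 12, 8 and 6 master cycles per CPU cycle. The names are made up for this illustration.

```python
# NTSC master clock in Hz (assumption: NTSC console; PAL uses a different clock).
MASTER_CLOCK_HZ = 21_477_272

# Master cycles per CPU cycle for each access speed (labels are illustrative).
DIVISORS = {"ultra-slow (controller ports)": 12, "slow": 8, "fast": 6}

def cpu_speed_mhz(divisor):
    """Effective CPU clock in MHz for a given master-clock divisor."""
    return MASTER_CLOCK_HZ / divisor / 1_000_000

for name, divisor in DIVISORS.items():
    print(f"{name}: {cpu_speed_mhz(divisor):.2f} MHz")
```

The speed is picked per memory access according to the address being touched, which is why low WRAM stays at the slow rate no matter which bank it is reached through.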

Quote:
3. According to the dev docs, the low 8 kB of WRAM can be accessed from any bank between $00~$3F and $80~$BF. If you access it from bank $80+, wouldn't it gain the benefit of high speed mode, or are the chips too slow?

It's still slow RAM. If you want a small amount of fast RAM for a tight loop, set D to $4300 and use unused DMA registers as your direct page.

Quote:
5. When using DMA, do I need to wait a bit before using the same channel again so it won't interfere with the first transfer? The dev docs flow chart says to initialize CGRAM and OAM "using 2 channels DMA". What do they mean by that? I just put two DMA transfers after each other, using channel 0 for CGRAM and channel 1 for OAM. But in that case, wouldn't you need at least three channels for BG, sprites and a tilemap when initializing VRAM? The dev docs are a bit confusing on this part.

Perhaps the idea is that you can set up the channels during draw time, when writing to video memory is forbidden, and then just activate them all once vblank hits so as not to waste valuable vblank CPU cycles on poking in a new address.

Quote:
7. Although SEI was used at the beginning of the program, it seems I don't need to use CLI to enable interrupts again like on NES. I don't understand why.

If you use only the vblank NMI, not NES's mapper or DMC IRQ or the Super NES's HV IRQ, you don't need to CLI on either platform.

Quote:
8. The flow chart in chapter 23 says to enable NMI and controllers at $4200 in the beginning of every main program loop. Is this really needed? Haven't seen anyone enabling it more than once.

I'm guessing it means before the loop, not during the body of the loop. A game might turn off NMI and controller reading while sending new data to the S-PPU and S-SMP.
Re: Beginner questions, CGRAM/OAM/VRAM DMA, Program flow
by on (#162074)
Ah thanks for these answers. I'll upload my code in the attachment of this post.

tepples wrote:
Quote:
5. When using DMA, do I need to wait a bit before using the same channel again so it won't interfere with the first transfer? The dev docs flow chart says to initialize CGRAM and OAM "using 2 channels DMA". What do they mean by that? I just put two DMA transfers after each other, using channel 0 for CGRAM and channel 1 for OAM. But in that case, wouldn't you need at least three channels for BG, sprites and a tilemap when initializing VRAM? The dev docs are a bit confusing on this part.

Perhaps the idea is that you can set up the channels during draw time, when writing to video memory is forbidden, and then just activate them all once vblank hits so as not to waste valuable vblank CPU cycles on poking in a new address.

This part is before the main loop when forced blanking is still in effect. I'm pretty sure it's about copying data to VRAM for drawing the initial screen. But I guess it's enough to fill buffers and let the next NMI take care of the copying to VRAM.

Quote:
Quote:
7. Although SEI was used at the beginning of the program, it seems I don't need to use CLI to enable interrupts again like on NES. I don't understand why.

If you use only the vblank NMI, not NES's mapper or DMC IRQ or the Super NES's HV IRQ, you don't need to CLI on either platform.
I see, I always thought SEI/CLI affected all interrupts on the CPU side.

Quote:
Quote:
8. The flow chart in chapter 23 says to enable NMI and controllers at $4200 in the beginning of every main program loop. Is this really needed? Haven't seen anyone enabling it more than once.

I'm guessing it means before the loop, not during the body of the loop. A game might turn off NMI and controller reading while sending new data to the S-PPU and S-SMP.

In the flow chart it's clear that it's in the main loop (display period). Maybe it's just a safety measure in case the registers get randomly corrupted for whatever reason? But that would apply to all the registers set at the beginning too.

Here is the flow chart I spoke of (it's just 3 pages so it should be OK). The diagram itself is pretty clear, but the language is a bit weird. I think the whole document was a bit badly translated sometimes.
http://s2.postimg.org/ga4ygkxjd/23_1.png
http://s22.postimg.org/lyhen52wh/23_2.png
http://s15.postimg.org/xja13i68r/23_3.png


And here's my code at the bottom. I got CGRAM loaded (but with seemingly random results), but OAM and VRAM still don't show up in the VRAM viewer of the no$sns emulator. I'd appreciate any tips. I'm trying out bass (included in the archive) for assembling at the moment, which has a somewhat odd syntax, sorry about that.
Re: Beginner questions, CGRAM/OAM/VRAM DMA, Program flow
by on (#162083)
Pokun wrote:
tepples wrote:
Quote:
5. When using DMA, do I need to wait a bit before using the same channel again so it won't interfere with the first transfer?

Perhaps the idea is that you can set up the channels during draw time, when writing to video memory is forbidden, and then just activate them all once vblank hits so as not to waste valuable vblank CPU cycles on poking in a new address.

This part is before the main loop when forced blanking is still in effect. I'm pretty sure it's about copying data to VRAM for drawing the initial screen. But I guess it's enough to fill buffers and let the next NMI take care of the copying to VRAM.

You can only copy about 5 to 6 KiB to VRAM in one NTSC vblank. I haven't had a problem performing consecutive transfers to tile VRAM, map VRAM, and CGRAM because the CPU pauses until completion while a DMA copy is running.
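
A rough back-of-the-envelope check of that figure, written as a small Python sketch. It assumes 262 scanlines per NTSC frame, 1364 master cycles per scanline, vblank running from the line after the last visible line to the end of the frame, and DMA moving one byte per 8 master cycles. It ignores DMA setup overhead, DRAM refresh and the time the NMI handler spends on other work, so the real budget is somewhat lower. The function name is made up for this example.

```python
LINES_PER_FRAME = 262           # NTSC, non-interlaced (assumption)
MASTER_CYCLES_PER_LINE = 1364   # nominal scanline length (assumption)
MASTER_CYCLES_PER_DMA_BYTE = 8  # general-purpose DMA rate

def max_dma_bytes_per_vblank(visible_lines=224):
    """Upper bound on bytes one vblank can move by DMA, ignoring all overhead."""
    # Line 0 is a blank pre-render line, so visible lines are 1..visible_lines
    # and vblank covers lines visible_lines+1 .. 261.
    vblank_lines = LINES_PER_FRAME - (visible_lines + 1)
    return vblank_lines * MASTER_CYCLES_PER_LINE // MASTER_CYCLES_PER_DMA_BYTE

print(max_dma_bytes_per_vblank())     # 224-line mode
print(max_dma_bytes_per_vblank(239))  # 239-line (overscan) mode
```

Under these assumptions the ceiling in 224-line mode is a little over 6 KiB, which fits the "about 5 to 6 KiB" rule of thumb once the OAM transfer and handler overhead are subtracted.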

Quote:
Here is the flow chart I spoke of

This is what it's probably trying to say:
Code:
Power on: JMP ($FFFC)

Set forced blank by writing $8F to $2100
Initial settings:
  Clear each register (chapter 26)
  Main register settings:
    Set background mode and tile size ($2105)
    Set pattern table base addresses ($210B-$210C)
    Set nametable base addresses and sizes ($2107-$210A)
    Set sprite tile base address and sizes ($2101)
  Set other settings up through $212D
  OAM and CGRAM data settings:
    Set OAM address to 0 through $2102-$2103
    Set CGRAM address to 0 through $2121
    Use two channels of DMA copy to load initial OAM and CGRAM data
  VRAM data settings:
    Repeat until all tile and map data is transferred:
      Set increment amount and increment trigger address ($2115)
      Set initial word address ($2116-$2117)
      Use DMA copy to load VRAM data
    Set up registers for initial screen
Turn off forced blank by writing $0F to $2100
While not done:
  During display period:
    Prepare new data to be copied to tile, map, OAM
    Prepare new values of PPU registers (such as scrolling)
    Enable NMI and controller autoreading by writing $81 to $4200
    Wait for NMI to signal the beginning of vertical blanking
  During vblank:
    Use DMA copy to load OAM data
    Copy tile and map updates to VRAM
    Set new values of PPU registers
  Wait until at least 215 microseconds after NMI for autoread to complete
  Read controller data from $4218-$421F

Find the parallels to how things are set up on the NES:
Code:
Power on: JMP ($FFFC)

Set forced blank by writing $00 to $2001
Wait 2 frames through $2002
Initial settings:
  Clear each register
  Main register settings:
    Set pattern table base addresses ($2000 and mapper)
    Set nametable base addresses and sizes ($2000 and mapper)
    Set sprite tile base address and size ($2000)
  OAM and CGRAM data settings:
    Set OAM address to 0 through $2003
    Use DMA copy to load initial OAM data
    Set VRAM address to $3F00 through $2006
    Load CGRAM data through $2007
  VRAM data settings:
    Repeat until all map data is transferred:
      Set increment amount ($2000)
      Set initial address ($2006)
      Write VRAM data through $2007
    If using CHR RAM, repeat for tile data
    Set up registers for initial screen
Turn off forced blank by writing $1E to $2001
While not done:
  During display period:
    Prepare new data to be copied to tile, map, OAM
    Prepare new values of PPU registers (such as scrolling)
    Enable NMI by writing $80+ to $2000
    Wait for NMI to signal the beginning of vertical blanking
  During vblank:
    Use DMA copy to load OAM data
    Copy tile and map updates to VRAM
    Set new values of PPU registers
  Read controller data through $4016-$4017
Re: Beginner questions, CGRAM/OAM/VRAM DMA, Program flow
by on (#162090)
tepples wrote:
You can only copy about 5 to 6 KiB to VRAM in one NTSC vblank. I haven't had a problem performing consecutive transfers to tile VRAM, map VRAM, and CGRAM because the CPU pauses until completion while a DMA copy is running.

Oh of course I can't update the whole screen in vblank, I forgot. So the CPU pauses during DMA, then what is the meaning of 2 channel DMA for OAM and CGRAM if they can't be used at the same time?


tepples wrote:
Code:
    Repeat until all tile and map data is transferred:
      Set increment amount and increment trigger address ($2115)
      Set initial word address ($2116-$2117)
      Use DMA copy to load VRAM data

Aha so it's not like they mean to use an actual loop in your code, just that you should keep adding DMA transfers to your code until you have covered all your initial data you want to upload to VRAM.


Do they mean anything with "Recognize beginning of VBLANK period by NMI"? The NMI flag at $4210 is automatically cleared at the end of vblank, and according to Fullsnes you normally don't need to clear it (although in Bazz tutorial they do clear it at the end of NMI).
Re: Beginner questions, CGRAM/OAM/VRAM DMA, Program flow
by on (#162093)
Pokun wrote:
So the CPU pauses during DMA, then what is the meaning of 2 channel DMA for OAM and CGRAM if they can't be used at the same time?

You don't have to use two channels if you don't want. Not being a licensed developer, I can only speculate why they used 2 channels instead of 1. Perhaps it saved having to activate DMA twice.

Quote:
Do they mean anything with "Recognize beginning of VBLANK period by NMI"? The NMI flag at $4210 is automatically cleared at the end of vblank, and according to Fullsnes you normally don't need to clear it (although in Bazz tutorial they do clear it at the end of NMI).

Likewise on the NES, the NMI flag at $2002 is automatically cleared at the end of vblank, and according to Everynes you normally don't need to clear it (although in a few tutorials they do clear it at the end of NMI).

So anyway, per NMI on the wiki, what normally happens on the NES or Super NES is that the NMI handler sets a flag somewhere in memory, and the main thread spins on that to see whether NMI has occurred.
Code:
.proc vsync
  lda f:nmis  ; fetch the count of NMIs
loop:
  wai         ; turn off the CPU until next interrupt
  cmp f:nmis  ; if the NMI handler didn't update the count,
  beq loop    ; the interrupt must have been something other than NMI
  rts
.endproc

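.proc nmi_handler
  rep #$20    ; 16-bit accumulator for the push, load and store below
  pha         ; preserve the main thread's A
  lda f:nmis  ; INC has no long (24-bit) addressing mode, so go through A
  inc a
  sta f:nmis  ; signal to main thread that NMI count has increased
  pla
  rti         ; RTI restores the interrupted code's status flags
.endproc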

This is the "main thread only" option described at NMI thread. It's also possible to run your VRAM update code or even your entire game in the NMI handler.

(For those with NES experience: wai is a 65816 instruction that pauses the CPU until the next IRQ or NMI. It reduces power consumption on hardware, reduces host CPU usage on emulators, and reduces jitter on either one. The f: prefix in ca65 forces a "far" (24-bit) address, making the code independent of the value in the data bank register B. The rep #$20 instruction clears the M flag in the processor status register so that instructions dealing with register A or with read-modify-write operations on memory handle 16-bit, not 8-bit, values.)
Re: Beginner questions, CGRAM/OAM/VRAM DMA, Program flow
by on (#162097)
Pokun wrote:
tepples wrote:
You can only copy about 5 to 6 KiB to VRAM in one NTSC vblank. I haven't had a problem performing consecutive transfers to tile VRAM, map VRAM, and CGRAM because the CPU pauses until completion while a DMA copy is running.

Oh of course I can't update the whole screen in vblank, I forgot. So the CPU pauses during DMA, then what is the meaning of 2 channel DMA for OAM and CGRAM if they can't be used at the same time?


If you enable multiple DMA channels at once, each one will start as soon as the previous one finishes, and the CPU will only start executing instructions again after all of them are done.
Re: Beginner questions, CGRAM/OAM/VRAM DMA, Program flow
by on (#162100)
tepples wrote:
As far as I can tell, the ultra-slow (1.79 MHz) mode is used only for the controller ports, with everything else at 2.68 or 3.58 MHz.

Also if I recall correctly the controllers are the only thing in the SNES that is still mapped to the registers the NES used. So I wouldn't be surprised if it was indeed originally for compatibility (obviously pointless later on).
Re: Beginner questions, CGRAM/OAM/VRAM DMA, Program flow
by on (#162109)
Yup, they even have the same addresses, $4016/$4017. Apparently, if auto-controller-reading is turned off, the old registers can be used instead, the same way as on the NES. Also, in order to use special peripherals like the multitap or mouse, the old registers must be used.


Alright I figured out the problem! I was misunderstanding how $420B works. It enables DMA channels 0~7 by setting the corresponding bit 0~7, of course! Now everything works as expected!
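
To spell out the bit layout for anyone hitting the same mistake: channel n is started by setting bit n of $420B, so channel 1 is the value $02, not $01. A small illustration in Python (the helper name is made up for this example and is not part of any tool):

```python
def mdmaen_mask(channels):
    """Value to write to $420B to start the given DMA channels (0-7)."""
    mask = 0
    for ch in channels:
        if not 0 <= ch <= 7:
            raise ValueError(f"no such DMA channel: {ch}")
        mask |= 1 << ch  # bit n corresponds to channel n
    return mask

print(f"${mdmaen_mask([0]):02X}")     # channel 0 only
print(f"${mdmaen_mask([1]):02X}")     # channel 1 only
print(f"${mdmaen_mask([0, 1]):02X}")  # channels 0 and 1 in one write
```

Setting several bits in a single write queues those channels to run back to back, lowest channel number first, with the CPU paused until the last one finishes.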


It's just question 4 that remains unanswered. You'd have to download my code and check the DMA settings for CGRAM, VRAM etc in "main.asm". There I'm using bank 1 as source bank, yet my data is supposed to be placed in bank 7. It might just be something with bass I just don't understand.