PPU scrolling help/PPU tips

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
PPU scrolling help/PPU tips
by on (#99301)
Hey guys, I've been plugging away all weekend at getting my PPU closer to bug-free for NROM games, and I'm really stuck on implementing Loopy's method for scrolling. I've been staring at this for way too long, so I'm hoping it's just a couple of dumb mistakes and some of you can point me in the right direction.

Basically I started learning the PPU by building a nametable viewer, nothing really more than that, so its rendering was really sloppy. After enough cycles for VBlank to begin I would render every tile in the nametable in one fell swoop. I was actually able to get Donkey Kong, Donkey Kong Jr, and Balloon Fight very very playable and almost totally cool by using that method. Now I'm trying to break up my run loop into a more cycle/scanline accurate way so that I can add scrolling (Ice Hockey and Excitebike are my current testing targets). What I've done is break up the tile rendering into a process that occurs every 8 scanlines on the last cycle of the scanline. This way I still do rendering tile-by-tile but it happens in a manner that allows things to happen mid-frame.

I've left the original sloppy render function in place, here is what I get (note the palette error is something new, I was playing with a new method of picking palettes last night):

Image

When I flip to the row-of-tiles method I get:

Image

The functions in question are renderNametable() and renderTileRow(). The code is available here: https://gist.github.com/3687553

I'm hoping I'm just missing some minute detail, but like I said, I've been staring at this way too long. Thanks for any help guys!
Re: PPU scrolling help/PPU tips
by on (#99306)
What happens to the VRAM address while rendering is off? My first guess is that the title screen loads wrong because some of the incrementing logic remains active even while rendering is disabled with a write of #$00 to PPUMASK ($2001).
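
As a rough illustration of the suggestion above, here is a minimal sketch in Go that only lets the rendering logic touch the VRAM address while rendering is enabled. The `Ppu` fields mirror names used elsewhere in this thread, but `renderingEnabled` and `endOfTile` are hypothetical helpers, not taken from the poster's code.

```go
package main

import "fmt"

// Ppu holds only the fields this sketch needs; the layout is an assumption.
type Ppu struct {
	ShowBackground bool
	ShowSprites    bool
	VramAddress    int
}

// renderingEnabled reports whether either PPUMASK show bit is set.
func (p *Ppu) renderingEnabled() bool {
	return p.ShowBackground || p.ShowSprites
}

// endOfTile stands in for the per-tile address increment done while drawing
// (hypothetical name). With rendering off it leaves the address untouched,
// so the game's own $2006/$2007 traffic is not disturbed.
func (p *Ppu) endOfTile() {
	if !p.renderingEnabled() {
		return
	}
	p.VramAddress++
}

func main() {
	p := &Ppu{VramAddress: 0x2000}
	p.endOfTile()
	fmt.Printf("rendering off: %#04x\n", p.VramAddress)

	p.ShowBackground = true
	p.endOfTile()
	fmt.Printf("rendering on:  %#04x\n", p.VramAddress)
}
```

The same guard would presumably apply to any end-of-scanline or pre-render address updates as well.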
Re: PPU scrolling help/PPU tips
by on (#99312)
Aha! Thank you!

Image

Any thoughts on what might be going wrong with the palette at this point?
Re: PPU scrolling help/PPU tips
by on (#99313)
fergus_maximus wrote:
Any thoughts on what might be going wrong with the palette at this point?

Are you properly implementing palette mirroring? Some games like to write to the mirrors, for whatever reason.
Re: PPU scrolling help/PPU tips
by on (#99319)
tokumaru wrote:
Are you properly implementing palette mirroring? Some games like to write to the mirrors, for whatever reason.

I am, and I believe it's correct:
Code:
func (p *Ppu) writeMirroredVram(a int, v Word) {
   if a >= 0x3F00 && a < 0x3F20 {
      // Palette table entries
      p.PaletteRam[a-0x3F00] = v
   } else if a >= 0x3F20 && a < 0x3F40 {
      // Palette table entries
      p.PaletteRam[a-0x3F20] = v
   } else if a >= 0x3F40 && a < 0x3F80 {
      // Palette table entries
      p.PaletteRam[a-0x3F40] = v
   } else if a >= 0x3F80 && a < 0x3FC0 {
      // Palette table entries
      p.PaletteRam[a-0x3F80] = v
   } else {
      p.Vram[a-0x1000] = v
   }
}

The method I'm currently using to compute the palette is a hodgepodge of my own ideas and tips I've found on these forums and in other emulators. Here's how I currently render a tile:
Code:
attrAddr := 0x23C0 | (p.VramAddress & 0xC00)
shift := p.AttributeShift[p.VramAddress&0x3FF]

attr := p.Vram[attrAddr+((x&0x1F)>>2)+((x&0x3E0)>>7)*8]
attr = (attr >> shift) & 0x03

t := p.bgPatternTableAddress(p.Vram[p.VramAddress + 0x2000])
p.decodePatternTile(t, x*8, p.Scanline - 8, p.bgPaletteEntry(attr), nil)

p.VramAddress++

Note that the above code works fine for rendering the nametable all at once at the end of a frame, which I'm thinking is just dumb luck with the games I'm testing on. I've also tried the method described here but that results in this:
Image
And here is that code:
Code:
for i := range p.AttributeShift {
   x := uint(i)
   p.AttributeShift[i] = ((x >> 4) & 0x04) | (x & 0x02)
   p.AttributeLocation[i] = ((x >> 2) & 0x07) | (((x >> 4) & 0x38) | 0x3C0)
}

...

shift := p.AttributeShift[p.VramAddress&0x3FF]
attr := ((p.Vram[p.AttributeLocation[p.VramAddress&0x3FF]] >> shift) & 0x03) << 2

t := p.bgPatternTableAddress(p.Vram[p.VramAddress + 0x2000])
p.decodePatternTile(t, x*8, p.Scanline - 8, p.bgPaletteEntry(attr), nil)

p.VramAddress++
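
For comparison, the same attribute lookup can be computed directly from the VRAM address without precomputed tables. This is only a sketch: it assumes the address uses the usual layout (bits 0-4 tile column, bits 5-9 tile row, bits 10-11 nametable select), and the function names are hypothetical.

```go
package main

import "fmt"

// attributeAddress returns the PPU address of the attribute byte that
// covers the tile addressed by v (one byte per 4x4 group of tiles).
func attributeAddress(v uint16) uint16 {
	return 0x23C0 | (v & 0x0C00) | ((v >> 4) & 0x38) | ((v >> 2) & 0x07)
}

// attributeShift returns how far to shift that byte right so the two
// palette bits for the tile's 2x2 quadrant end up in bits 0-1.
func attributeShift(v uint16) uint {
	return uint(((v >> 4) & 0x04) | (v & 0x02))
}

// paletteBits extracts the 2-bit background palette number for tile v.
func paletteBits(attr byte, v uint16) byte {
	return (attr >> attributeShift(v)) & 0x03
}

func main() {
	// Nametable 0, tile row 3, tile column 6.
	v := uint16(3<<5 | 6)
	fmt.Printf("attribute byte at %#04x, shift %d\n",
		attributeAddress(v), attributeShift(v))
	fmt.Println("palette:", paletteBits(0xC0, v))
}
```

Because the nametable bits are folded into the address, the lookup would not need a separate case for each nametable.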
Re: PPU scrolling help/PPU tips
by on (#99388)
Um, I think that the palette is only 32 bytes long, so why is there an interval of 64 bytes for the last 2 mirror bits?
Re: PPU scrolling help/PPU tips
by on (#99391)
Alegend45 wrote:
Um, I think that the palette is only 32 bytes long, so why is there an interval of 64 bytes for the last 2 mirror bits?

Bah yes, good call. That was the result of trying everything I could think of at 1am.
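
One way to avoid enumerating mirror ranges by hand is to mask the address down to the 32 bytes of palette RAM. The sketch below assumes a 32-byte `PaletteRam` array and uses `byte` in place of the thread's `Word` type; `paletteIndex` is a hypothetical helper. It also folds $3F10/$3F14/$3F18/$3F1C onto $3F00/$3F04/$3F08/$3F0C, which is how the hardware mirrors those entries.

```go
package main

import "fmt"

// Ppu holds only the palette RAM for this sketch.
type Ppu struct {
	PaletteRam [32]byte
}

// paletteIndex folds any address in $3F00-$3FFF onto palette RAM.
func paletteIndex(addr int) int {
	i := addr & 0x1F // the 32-byte palette repeats through $3FFF
	// Sprite palette entry 0 of each group mirrors the background entry.
	if i >= 0x10 && i&0x03 == 0 {
		i -= 0x10
	}
	return i
}

// writePalette stores v at the mirrored palette location.
func (p *Ppu) writePalette(addr int, v byte) {
	p.PaletteRam[paletteIndex(addr)] = v
}

func main() {
	p := &Ppu{}
	p.writePalette(0x3F30, 0x0F) // lands on the same byte as $3F00
	p.writePalette(0x3FE5, 0x21) // lands on the same byte as $3F05
	fmt.Printf("%#02x %#02x\n", p.PaletteRam[0x00], p.PaletteRam[0x05])
}
```

Reads would go through the same `paletteIndex` so that both directions see identical mirroring.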
Re: PPU scrolling help/PPU tips
by on (#99507)
Okay guys, been digging through a ton. I've rewritten the PPU to render backgrounds per scanline which also meant I had to get each PPU tick behaving a bit more naturally. I know I'm on the right trail but I have one weird issue that I can't seem to isolate:

Image

Similar to my original question, except that it goes away if I change the end-of-vblank value to 321 scanlines from 261:

Image

With this hack, Super Mario Bros. also loads up with the correct palettes and tile layouts, but other games obviously break (Lode Runner doesn't show the title screen anymore, Dig Dug is just all hell). I know it's some kind of timing issue between the CPU and PPU because I've nailed down the fact that it's rendering the first frame of the title screen before the game has finished filling up the nametables. I know this is the case because I'm seeing scanline -1 get triggered, and the line to reset the VRAM register from the latch is getting fired way too early. Is there a delay in the power-up state for the PPU that I'm missing? Maybe an extra vblank that has to occur? Here's my step code for a PPU cycle:

Code:
func (p *Ppu) Step() {
   switch {
   case p.Scanline == 241:
      if p.Cycle == 1 {
         // We're in VBlank
         p.setStatus(StatusVblankStarted)
         // Request NMI
         cpu.RequestInterrupt(InterruptNmi)

         // TODO: per scanline for sprites too
         if p.ShowSprites {
            p.renderSprites()
         }

         // Go lingo for "send the pixels to SDL"
         p.Output <- p.Framebuffer
         p.Cycle++
      }
   case p.Scanline == 261: // End of vblank
      if p.Cycle == 341 {
         p.Scanline = -1
         p.Cycle = 0
         return
      }
   case p.Scanline < 240 && p.Scanline > -1:
      if p.Cycle == 341 {
         p.Cycle = 0
         if p.ShowBackground && p.ShowSprites {
            p.renderTileRow()
            p.updateEndScanlineRegisters()
         }
         p.Scanline++
         return
      }
   case p.Scanline == -1:
      if p.Cycle == 304 {
         // Copy scroll latch into VRAMADDR register
         p.VramAddress = p.VramLatch
      } else if p.Cycle == 1 {
         // Clear VBlank flag
         p.clearStatus(StatusVblankStarted)
         p.clearStatus(StatusSprite0Hit)
      }
   }

   if p.Cycle == 341 {
      p.Cycle = 0
      p.Scanline++
   }

   p.Cycle++
}