In-Hardware Pixel Changing Possible?

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
by on (#23948)
I read that if PPU rendering is off, the palette address selected through $2006 determines the color rendered on screen, and that color can be changed on screen through $2007. Would it be possible to do something similar, with PPU rendering off and $2006 set to a palette address, but with hardware inside the cartridge changing the PPU data bus?

by on (#23950)
Maybe if there's a way to take over the CPU bus and write to the PPU registers (when phase2 clock is low, would that be possible?). The palette memory is internal to the PPU, so AFAIK the PPU doesn't touch the data bus while the screen is off.

I was thinking if you have to increment the address to change the color, then you'll be cycling through color #0 which would end up drawing bars on the screen. But I guess that's no problem if color #0 is always changing.

by on (#23952)
Memblers wrote:
Maybe if there's a way to take over the CPU bus and write to the PPU registers (when phase2 clock is low, would that be possible?).


I don't think it would be possible for hardware to update the PPU registers quickly enough. The CPU R/W signal is output only, so I don't think hardware inside the cartridge could directly force a write signal to the CPU. Maybe the CPU bus could be hijacked at that instruction-fetching cycle to force a STA $2007, but since STA takes 4 CPU cycles, that's 12 PPU cycles, so STA would be too slow to update individual pixels. Perhaps someone could test a program that sets a palette address at $2006 and have hardware fix a value at the PPU data bus to check if the PPU does anything with it while rendering is off.
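The cycle arithmetic in this post can be checked with a small sketch. It assumes NTSC timing (3 PPU dots per CPU cycle) and a 4-cycle absolute-addressed STA, as stated above; the function names are made up for illustration.

```python
# Assumption: NTSC timing, where the PPU runs 3 dots (pixels) per CPU cycle.
PPU_DOTS_PER_CPU_CYCLE = 3
# STA with absolute addressing (e.g. STA $2007) takes 4 CPU cycles.
STA_ABSOLUTE_CYCLES = 4


def dots_per_write(cpu_cycles: int = STA_ABSOLUTE_CYCLES) -> int:
    """PPU dots that pass while one store instruction executes."""
    return cpu_cycles * PPU_DOTS_PER_CPU_CYCLE


def max_writes_per_scanline(visible_dots: int = 256) -> int:
    """Upper bound on back-to-back stores during the visible part of a line."""
    return visible_dots // dots_per_write()


print(dots_per_write())           # 12 dots per STA
print(max_writes_per_scanline())  # 21 stores across 256 visible dots
```

So even with nothing but back-to-back stores, a color change could land at best once every 12 pixels, which is why per-pixel updates through $2007 look out of reach.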

by on (#23953)
The PPU reads the color value from internal palette RAM, so it seems the only way to change this is to write to palette RAM normally or change the PPU address via $2006 or $2007. Doesn't seem workable to me. The MMC5's method seems the only available, where you feed it customized data for each tile and attribute read.
what about if....
by on (#23973)
what about if you do this... though it probably won't work 'cause i thought of it. hehe

set all internal vram locations to $00, init PPU to show the background using $2000, and then toggle a latch in the cart to pull the read signal for the VRAM high to disable its output

on the cartridge, have a programmable chip output certain values when certain addresses are shown on the bus.

example:
$2000 = $01
$3f00 - $3f03 = current color you want on screen

$0000-$000f = pattern of colors 1 and 2 alternating:
12121212
12121212
12121212
12121212
12121212
12121212
12121212
12121212


$0010-$001f = pattern of colors 1 and 2 alternating as in previous tile, but with color 0 in the top left position:
02121212
12121212
12121212
12121212
12121212
12121212
12121212
12121212
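For reference, here is a rough sketch of how the two tiles above would be stored as pattern data. It assumes the usual NES 2-bits-per-pixel planar layout (8 bytes of low bitplane followed by 8 bytes of high bitplane, leftmost pixel in bit 7); `encode_tile` is a made-up helper name.

```python
def encode_tile(rows):
    """Encode eight 8-character rows of color indices (0-3) as 16 bytes.

    Bytes 0-7 hold bit 0 of each pixel, bytes 8-15 hold bit 1.
    """
    plane0, plane1 = [], []
    for row in rows:
        lo = hi = 0
        for ch in row:
            color = int(ch)
            lo = (lo << 1) | (color & 1)         # low bitplane
            hi = (hi << 1) | ((color >> 1) & 1)  # high bitplane
        plane0.append(lo)
        plane1.append(hi)
    return bytes(plane0 + plane1)


tile0 = encode_tile(["12121212"] * 8)                 # $0000-$000f
tile1 = encode_tile(["02121212"] + ["12121212"] * 7)  # $0010-$001f

print(tile0.hex())
print(tile1.hex())
```

With this layout the alternating rows come out as $AA in the low plane and $55 in the high plane, and the only difference in the second tile is its first low-plane byte, $2A.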


by monitoring the switch between colors 0, 1, and 2, you can increase one of two counters when the colors change. one counter should count from 0 to 255 and the other should count from 0 to 239. when the first counter reaches 256, increase the second counter by one and reset the first counter. this should give you an X,Y location on the screen. when color 0 shows on the address bus, reset both counts to 0,0

now, depending on where the X,Y location is, we can pull the color information for a given pixel from our own VRAM located on the cart and spit that info out when locations $3f00 - $3f03 are read by the PPU
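The counting logic described here can be modeled in a few lines. This is only a sketch of the counters, assuming the cart could actually observe each pixel's color index, which the replies in this thread say it cannot; the class and method names are made up.

```python
class PixelCounter:
    """Tracks an (x, y) screen position from observed color indices."""

    WIDTH = 256   # first counter wraps when it reaches 256
    HEIGHT = 240  # second counter covers scanlines 0-239

    def __init__(self):
        self.x = 0
        self.y = 0

    def observe(self, color: int) -> None:
        if color == 0:
            # Color 0 only appears at the top-left pixel: resynchronize.
            self.x = 0
            self.y = 0
            return
        # Colors 1 and 2 alternate, so every change is one pixel step.
        self.x += 1
        if self.x == self.WIDTH:
            self.x = 0
            self.y += 1


counter = PixelCounter()
counter.observe(0)                 # top-left pixel
for i in range(255):               # rest of the first scanline
    counter.observe(2 if i % 2 == 0 else 1)
print(counter.x, counter.y)        # 255 0
counter.observe(1)                 # first pixel of the next scanline
print(counter.x, counter.y)        # 0 1
```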

by on (#23974)
hrm. after re-reading blargg's post... that wouldn't work. hehe

by on (#23976)
IF you could externally provide data for $3F00-$3FFF (which you cannot, to my best knowledge), you could just disable PPU rendering then feed the data as you please each time the PPU read $3F00.

EDIT: clarified feasibility of approach.

by on (#23977)
What happens to the CPU bus when a KIL/JAM undocumented opcode is executed?

by on (#23991)
It seems that you can't really feed individual pixels to the PPU without modifying the NES. :( I did think about some ideas for a mapper to enhance the graphical abilities of the NES.
  • Every 8x8 pixels would have its own attributes, similar to the MMC5. But unlike the MMC5, these extra attributes would fit for all 4 nametable spaces to be used.
  • Attribute flickering to create illusions of different colors, something similar is mentioned in Brad Taylor's hardware guide
  • Selecting CHR banks with specific BG tiles and sprites
  • Support for both CHR ROM and CHR RAM
  • Maybe around 32-64KB of dual-ported ExRAM for an extra OAM copy, extra nametables, extra tile attributes, scratch RAM for some special effects, and CHR RAM. The ExRAM would be specified at certain points in the 2A03's memory map and would be able to be written directly by the program.
  • Individual BG tiles would have the ability to have higher priority than certain sprites rather than sprites having lower priority than the whole BG. This would be handled by a mapper evaluating a copy of OAM in extra RAM inside the cartridge for every scanline, similar to how the PPU itself evaluates its own internal OAM. The mapper would determine whether the sprite pixels on a high-priority BG tile should be displayed or not. If they shouldn't, the mapper would blank out those sprite pixels when the pattern data for the sprite is fetched by the PPU.
  • OAM cycling and setting up sprite layers with the extra OAM copy inside the cart
  • More versatile priority settings for sprites (under whole BG, under high-priority BG tiles, and on top of BG)
  • BG and sprite tile modifications, including pixel replacement (for example, changing pixels that are color 2 to those of color 1), pixel masks, tile pasting, tile flipping, and tile shifting. Pixel replacement, tile flipping, and tile shifting would be applied as the PPU fetches the pattern data, or could be used to draw new tiles to CHR RAM.
  • For pixel masks, pixels of affected tiles, affected section of the whole BG, or affected section of the screen covered by the mask would turn to a certain pixel.
  • For tile pasting, certain tiles could also be pasted onto other tiles to draw new tiles to CHR RAM.
  • For tile shifting, tiles can shift either independently or along with other tiles. This could be used to fake parallax scrolling.
  • With another processor inside the cartridge, memory could be sent to ExRAM by DMA ordered by the program.
  • Fill mode like the MMC5 would be supported as an extra nametable
  • Nametable mirroring would be very versatile, with the ability to use the NES's usual internal nametables, the in-cartridge nametables, fill mode, and CHR ROM banks to assign each of the NES's 4 nametable spaces.
  • Multiple vertical screen splits similar to the MMC5's single vertical screen split, with the tile limitations in mind, but with the ability to use different nametables
  • Support in attribute tile data for specifying animating banks
  • Support for automatic nametable mirroring changes, attribute changes, effect changes, and animation changes at individual pixels on a scanline.
  • Special pre-programmed IRQs (spIRQs) specified for changing the scroll or a single palette color at a scanline. The IRQs would go to somewhere pre-specified by the mapper. Here would be example code:
    Code:
    ;For changing the scroll...(equations are based on the Everynes site)
    setupstuff: ;somewhere in the program
     lda horiz
     sta horizvalue ;mapper register for the horizontal scroll value
     lda vert
     sta vertvalue ;mapper register for the vertical scroll value
     lda screenoff
     sta turnoffvalue ;mapper register for the PPU off value
     lda pal
     sta paltoupdate ;mapper register for the palette to update
     lda color
     sta colortouse ;mapper register for the color to update palette with
     lda ppumaskltemp
     sta turnonvalue ;mapper register for the PPU turn on value
     lda ppuctrltemp
     sta ctrlvalue ;mapper register to store state of PPUCTRL
    scrollirq: ;at specific area in memory map, $4100 for example
     pha
     bit $2002 ;reset flip-flop
     sta $2006 ;hijack data bus with 4(horizvalue/256) + 8(vertvalue/240) + 4(ctrlvalue AND 7Fh)
     sta $2005 ;hijack data bus with (vertvalue MOD 240) AND C7h
     sta $2005 ;hijack data bus with horizvalue AND 7h
     sta $2006 ;hijack data bus with (horizvalue AND F8h)/8 + 4((Y MOD 240) AND 38h)
     pla
     rti
    scrollirqfine: ;scanline after scrollirq, $4112 for example
     pha
     bit $2002 ;reset flip-flop
     sta $2005 ;hijack data bus with horizvalue
     pla
     rti
    updateonecolor: ;$411B for example
     pha
     bit $2002 ;reset flip-flop
     sta $2001 ;hijack data bus with turnoffvalue
     sta $2006 ;hijack data bus with $3F
     sta $2006 ;hijack data bus with paltoupdate
     sta $2007 ;hijack data bus with colortouse
     sta $2006 ;hijack data bus with 4(horizvalue/256) + 8(vertvalue/240)
     sta $2006 ;hijack data bus with (horizvalue AND F8h)/8 + 4((Y MOD 240) AND 38h)
     sta $2001 ;hijack data bus with turnonvalue
     pla
     rti
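As a rough check of the four values the mapper would drive onto the data bus in `scrollirq`, the formulas from the comments can be written out as below. This is a sketch, not tested on hardware: it assumes `horizvalue` ranges over 0-511 and `vertvalue` over 0-479, reads "Y" as `vertvalue`, and leaves out the `ctrlvalue` term of the first write because its intent is not clear from the post. The function name is made up.

```python
def scroll_write_values(horiz: int, vert: int):
    """Return the bytes for the $2006, $2005, $2005, $2006 write sequence.

    Assumptions: horiz is 0..511, vert is 0..479, and the ctrlvalue term
    from the first comment is omitted.
    """
    y = vert % 240
    first_2006 = 4 * (horiz // 256) + 8 * (vert // 240)  # nametable select
    first_2005 = y & 0xC7                                # fine Y, top coarse Y bits
    second_2005 = horiz & 0x07                           # fine X
    second_2006 = (horiz & 0xF8) // 8 + 4 * (y & 0x38)   # coarse X, low coarse Y bits
    return first_2006, first_2005, second_2005, second_2006


values = scroll_write_values(264, 250)
print([f"${v:02X}" for v in values])
```

For a scroll position of (264, 250), this gives $0C, $02, $00, $21: the bottom-right nametable, one tile in from the left, ten pixels down.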

All of this (not including normal mapper features) would probably need at least some RAM, some logic chips, a small ROM chip, and a processor (faster than the NES PPU). Does anybody know about how much such a mapper would cost?

by on (#23993)

by on (#23994)
strangenesfreak wrote:
It seems that you can't really feed individual pixels to the PPU without modifying the NES. :( I did think about some ideas for a mapper to enhance the graphical abilities of the NES.

A lot of your changes (dual ported VRAM, tile priority, cutout window, parallax scrolling, scroll splits, etc.) sound like features of the Game Boy Advance video hardware. So why not just take a GBA motherboard and run its video output through a video encoder that feeds 8x1 pixel slivers in red, green, blue, or gray to the PPU, and use the CPU just to read the controllers and copy audio out to the DAC at $4011?

Quote:
All of this (not including normal mapper features) would probably need at least some RAM

GBA has 32 KiB of IWRAM, 256 KiB of slower EWRAM, and 96 KiB of memory-mapped pseudo-dual-ported VRAM. The video translation would need a frame buffer of roughly 10 KiB to even out the scan rate difference between the NES and GBA.

Quote:
some logic chips

Dunno.

Quote:
a small ROM chip

8 KiB.

Quote:
and a processor (faster than the NES PPU)

Would ARM7TDMI work?

Quote:
as the mapper would have to do a lot of complex work

You know, like running the game.

Quote:
Does anybody know about how much such a mapper would cost?

Nintendo was able to sell Game Boy Player for $50.

by on (#24006)
Don't try to overdo PPU enhancements. No matter what features you want to implement, you must take into account the PPU's limitations that can never be overcome, most notably regarding the palette (the original topic here).

My personal idea for PPU enhancement would have 16-bit nametable entries. The lower 12 bits would contain the tile ID, bits 12-13 would have the color attribute, and bits 14-15 would have H/V flip control (H-Flip would be implemented by reversing the bit order of pattern fetches, while V-Flip would involve a simple XOR of CHR A0-A2 during pattern fetches).
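A minimal sketch of that entry layout and the two flips follows. The post does not say which of bits 14-15 is H and which is V, so the assignment below (bit 14 = H-flip, bit 15 = V-flip) is an assumption, as are the function names and the 16-bytes-per-tile CHR layout.

```python
def decode_entry(entry: int):
    """Split a 16-bit nametable entry into (tile, palette, hflip, vflip)."""
    tile = entry & 0x0FFF             # bits 0-11: tile ID
    palette = (entry >> 12) & 0x3     # bits 12-13: color attribute
    hflip = bool(entry & 0x4000)      # bit 14 (assumed): horizontal flip
    vflip = bool(entry & 0x8000)      # bit 15 (assumed): vertical flip
    return tile, palette, hflip, vflip


def reverse_bits(value: int) -> int:
    """Reverse the bit order of one byte (the H-flip of a pattern row)."""
    return int(f"{value:08b}"[::-1], 2)


def fetch_row(chr_data: bytes, tile: int, row: int, plane: int,
              hflip: bool, vflip: bool) -> int:
    """Fetch one bitplane byte of a tile row, applying the flips."""
    addr = tile * 16 + plane * 8 + row
    if vflip:
        addr ^= 0x07                  # XOR CHR A0-A2: row 0 <-> 7, 1 <-> 6, ...
    value = chr_data[addr]
    return reverse_bits(value) if hflip else value


print(decode_entry(0x9005))           # (5, 1, False, True)
print(hex(reverse_bits(0x2A)))        # 0x54
```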

I'd also like to see VRAM updating through DMA. This will require special RAM in the cart that would contain the VRAM update buffer. The game would write to this buffer during its normal frame execution, then when VBlank comes, it would write to a mapper register to start DMA. The mapper would then block CPU access to the special RAM as it parses the buffer and updates VRAM. (The 2KB of VRAM in the PPU would be unused, with all VRAM cart-loaded so the PPU bus doesn't need to be utilized.)

Those are the two main enhancements I'd like to see personally.

by on (#24007)
tepples wrote:
A lot of your changes (dual ported VRAM, tile priority, cutout window, parallax scrolling, scroll splits, etc.) sound like features of the Game Boy Advance video hardware. So why not just take a GBA motherboard and run its video output through a video encoder that feeds 8x1 pixel slivers in red, green, blue, or gray to the PPU, and use the CPU just to read the controllers and copy audio out to the DAC at $4011?

I guess those 8x1 pixel slivers would sort of work, though the (probably needed) flickering between shades of red, green, and blue might be a bit noticeable.

loopy wrote:
This is exactly why I'm anxious for the PowerPak ... all kinds of potential for making your own hardware enhancements.

I guess I kind of forgot about the PowerPak there. I can't wait to see what kinds of projects and games people can do with the PowerPak. :)

dvdmth wrote:
Don't try to overdo PPU enhancements. No matter what features you want to implement, you must take into account the PPU's limitations that can never be overcome, most notably regarding the palette (the original topic here).

Yeah, I could see how too many PPU enhancements could make the graphics look way too far ahead of their time for the NES, more like something from a newer system.

dvdmth wrote:
I'd also like to see VRAM updating through DMA. This will require special RAM in the cart that would contain the VRAM update buffer. The game would write to this buffer during its normal frame execution, then when VBlank comes, it would write to a mapper register to start DMA. The mapper would then block CPU access to the special RAM as it parses the buffer and updates VRAM. (The 2KB of VRAM in the PPU would be unused, with all VRAM cart-loaded so the PPU bus doesn't need to be utilized.)

This would be great for quickly making huge nametable updates, like for large animations, huge bosses, etc. It could possibly allow for very large moving BG-rendered objects while on a regular BG (instead of a black screen like most NES games with moving BG bosses) without using too much of the CPU's time. Faking large layers could also be possible with tiles pre-drawn for the effect without using too much time.

by on (#24009)
Quote:
My personal idea for PPU enhancement would have 16-bit nametable entries. The lower 12 bits would contain the tile ID, bits 12-13 would have the color attribute, and bits 14-15 would have H/V flip control (H-Flip would be implemented by reversing the bit order of pattern fetches, while V-Flip would involve a simple XOR of CHR A0-A2 during pattern fetches).

That's somewhat similar to the MMC5 exgraphics mode, except it doesn't have tile flipping (at least it's not known to have it). Vertical flipping could definitely be done pretty simply in hardware.

Quote:
I'd also like to see VRAM updating through DMA. This will require special RAM in the cart that would contain the VRAM update buffer. The game would write to this buffer during its normal frame execution, then when VBlank comes, it would write to a mapper register to start DMA. The mapper would then block CPU access to the special RAM as it parses the buffer and updates VRAM. (The 2KB of VRAM in the PPU would be unused, with all VRAM cart-loaded so the PPU bus doesn't need to be utilized.)

Doing it as you describe is impossible, because no mapper can halt the CPU. If that were possible, you wouldn't even need any RAM other than the internal one.
Actually, one possibility would be a mapper that maps some RAM on your cartridge to the CPU bus, then switches it to the PPU bus, alternating between 2 banks (one on the PPU side and the other on the CPU side, both swappable). This would be tedious to design, not to mention to route.