Well, it's that complicated because they wanted it to be.
As for what goes on inside, here's my guess at it:
PPU1 handles sprites and addressing. PPU2 snoops the data bus to pick up BG data.
PPU2 handles backgrounds, and all the actual screen composition.
PPU1 resolves priority among the sprites themselves, then feeds the 4-bit color, 3-bit palette select, and 2-bit priority for the current pixel to PPU2 via the CHR, PRIO, and COLOR lines, as labeled on the schematics at neviksti's site.
From there, the ordering isn't that crazy. There are two main cases to worry about: modes with 3 or 4 BG layers (3/4 screen) and modes with 1 or 2 (1/2 screen). Both lists below run from back to front.
3/4 screen:
Background color
BG3/4 Pri 0
OBJ 0
BG3/4 Pri 1
OBJ 1
BG1/2 Pri 0
OBJ 2
BG1/2 Pri 1
OBJ 3
BG3 Pri 1 (only in the $2105.d3 override case)
The $2105.d3 override would be for placing a HUD in BG3 (which is 2bpp in mode 1, so 4 colors per palette) and having it appear over sprites. Handy for fighting games or whatnot.
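To make the list concrete, here's a rough Python sketch of it as a back-to-front table. The names `ORDER_34`, `order_34`, and `topmost` are made up for illustration, and it assumes that in the override case the "BG3/4 P1" slot holds only BG3 (mode 1 has no BG4), so the whole slot just moves to the front.

```python
# Back-to-front layer order for the 3/4-screen case, as listed above.
ORDER_34 = [
    "backdrop",
    "BG3/4 P0", "OBJ 0",
    "BG3/4 P1", "OBJ 1",
    "BG1/2 P0", "OBJ 2",
    "BG1/2 P1", "OBJ 3",
]

def order_34(bg3_override=False):
    """Return the back-to-front list, optionally with the d3 override applied."""
    order = list(ORDER_34)
    if bg3_override:
        # Assumption: the slot is just BG3 here, so move it in front of OBJ 3.
        order.remove("BG3/4 P1")
        order.append("BG3/4 P1")
    return order

def topmost(order, opaque):
    """Pick the frontmost entry of `order` that is opaque at this pixel."""
    for layer in reversed(order):
        if layer in opaque:
            return layer
    return "backdrop"
```

With a Pri 1 BG3 pixel and an OBJ 3 pixel both opaque, `topmost(order_34(), ...)` picks the sprite, while `topmost(order_34(True), ...)` picks the BG3 slot.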
1/2 screen:
Background color
BG2 Pri 0
OBJ 0
BG1 Pri 0
OBJ 1
BG2 Pri 1
OBJ 2
BG1 Pri 1
OBJ 3
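One way to read this list: number the four BG slots 0 to 3 from back to front, and a sprite pixel with priority p ends up in front of BG slot s exactly when p >= s. The Python below is only a sanity check of that reading against the list. `ORDER_12`, `bg_level`, and `obj_in_front` are hypothetical names, not anything from the hardware.

```python
# Back-to-front layer order for the 1/2-screen case, as listed above.
ORDER_12 = [
    "backdrop",
    "BG2 P0", "OBJ 0",
    "BG1 P0", "OBJ 1",
    "BG2 P1", "OBJ 2",
    "BG1 P1", "OBJ 3",
]

# Number the BG slots 0..3 in back-to-front order.
bg_slots = [name for name in ORDER_12 if name.startswith("BG")]
bg_level = {name: level for level, name in enumerate(bg_slots)}

def obj_in_front(obj_prio, bg_name):
    """True if a sprite of this priority sits in front of the BG slot in the list."""
    return ORDER_12.index(f"OBJ {obj_prio}") > ORDER_12.index(bg_name)

def check_rule():
    """Check that 'OBJ priority >= BG slot number' matches the list everywhere."""
    for prio in range(4):
        for name, level in bg_level.items():
            assert obj_in_front(prio, name) == (prio >= level)
    return True
```

Calling `check_rule()` walks all 16 sprite/BG pairings and raises if the comparison ever disagrees with the list.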
Internally, PPU2 has 2 priority decoders, one for the main screen, one for the sub screen. The inputs to each of these are the set of BGs and the OBJ, after filtering down based on which ones are enabled for that screen, and based on the window logic.
There's a number of ways the priority logic could be implemented, but a couple of observations simplify things:
Aside from the d3 exception, BG1/2 are always over BG3/4.
Priority 1 BGs are in front of Priority 0 BGs.
First we magic up some signals: BGx is 1 if that BG is not transparent at this pixel, BGxP is that BG's priority bit, and m1ex is 1 if we're in mode 1 and bit 3 of $2105 is set. Then we come up with a 2-bit select, which will also serve as the priority.
For 2-screen mode:
sel[1] = (BG2 & BG2P) | (BG1 & BG1P)
sel[0] = (BG1 & BG1P) | (BG1 & ~(BG2 & BG2P))
Feed these in as the selector lines for a 4:1 multiplexer, with the inputs for sel = 0 through 3 set to (BG2, BG1, BG2, BG1). Feed the output of this mux to the next stage, along with the transparency flag for the selected BG.
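The two select equations are easy to check by brute force. Here's a Python sketch that evaluates them for all 16 input combinations and compares against the 1/2-screen list. It assumes the mux inputs are ordered for sel = 0..3, and `select_2screen`, `reference`, and `check_all` are names I made up.

```python
from itertools import product

MUX_INPUTS = ("BG2", "BG1", "BG2", "BG1")  # assumed order for sel = 0..3

def select_2screen(bg1, bg1p, bg2, bg2p):
    """Apply the two select equations; all arguments are 0 or 1."""
    bg2_hi = bg2 & bg2p
    sel1 = bg2_hi | (bg1 & bg1p)
    sel0 = (bg1 & bg1p) | (bg1 & (bg2_hi ^ 1))
    sel = (sel1 << 1) | sel0
    chosen = MUX_INPUTS[sel]
    solid = bg1 if chosen == "BG1" else bg2  # transparency flag of the pick
    return sel, chosen, solid

def reference(bg1, bg1p, bg2, bg2p):
    """Walk the 1/2-screen list front to back; None if both BGs are transparent."""
    slots = [
        (3, "BG1", bg1 and bg1p),
        (2, "BG2", bg2 and bg2p),
        (1, "BG1", bg1 and not bg1p),
        (0, "BG2", bg2 and not bg2p),
    ]
    for level, name, hit in slots:
        if hit:
            return level, name
    return None

def check_all():
    """Compare the equations with the list for every input combination."""
    for bits in product((0, 1), repeat=4):
        sel, chosen, solid = select_2screen(*bits)
        want = reference(*bits)
        if want is None:
            assert solid == 0, bits
        else:
            assert (sel, chosen, solid) == (want[0], want[1], 1), bits
    return True
```

`check_all()` raising nothing means sel doubles as the slot number in the list, which is what lets the later stage compare it directly against the sprite priority.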
For 4-screen mode, we have two of those, one for BG1/2 and one for BG3/4. Call their outputs fsel, ft, bsel, and bt, for the select and the transparency flag (1 = solid, 0 = transparent); f is for BG1/2, b is for BG3/4. Then:
sel[1] = ft
sel[0] = (ft & fsel[1]) | (~ft & bsel[1])
The color output comes from the f mux if ft is set, and from the b mux if not. Send the color output, sel, and ft | bt as the transparency flag forward.
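Same brute-force treatment for the 4-screen stage. This Python sketch assumes the BG3/4 mux is wired like the BG1/2 one, with BG3 in the BG1 role, and checks that the combined sel lands on 0 to 3 for (BG3/4 Pri 0, BG3/4 Pri 1, BG1/2 Pri 0, BG1/2 Pri 1). `pair_select`, `combine`, `reference`, and `check_all` are hypothetical names.

```python
from itertools import product

def pair_select(hi, hi_p, lo, lo_p):
    """2-screen stage for one pair (hi = BG1 or BG3, lo = BG2 or BG4)."""
    lo_hi = lo & lo_p
    sel1 = lo_hi | (hi & hi_p)
    sel0 = (hi & hi_p) | (hi & (lo_hi ^ 1))
    solid = hi if sel0 else lo  # mux inputs (lo, hi, lo, hi) for sel = 0..3
    return (sel1 << 1) | sel0, solid

def combine(fsel, ft, bsel, bt):
    """Merge the f (BG1/2) and b (BG3/4) stages into one select."""
    sel1 = ft
    sel0 = (ft & (fsel >> 1)) | ((ft ^ 1) & (bsel >> 1))
    source = "f" if ft else "b"
    return (sel1 << 1) | sel0, source, ft | bt

def reference(bg1, bg1p, bg2, bg2p, bg3, bg3p, bg4, bg4p):
    """Expected slot number from the 3/4-screen list, None if all transparent."""
    if bg1 or bg2:
        return 2 + int(bool((bg1 and bg1p) or (bg2 and bg2p)))
    if bg3 or bg4:
        return int(bool((bg3 and bg3p) or (bg4 and bg4p)))
    return None

def check_all():
    """Run all 256 input combinations through both stages."""
    for bits in product((0, 1), repeat=8):
        fsel, ft = pair_select(*bits[:4])
        bsel, bt = pair_select(*bits[4:])
        sel, _source, solid = combine(fsel, ft, bsel, bt)
        want = reference(*bits)
        if want is None:
            assert (sel, solid) == (0, 0), bits
        else:
            assert (sel, solid) == (want, 1), bits
    return True
```

Note that sel comes out as 0 whenever every BG is transparent, so a sprite of any priority wins the later comparison in that case.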
Screen color logic would be something like:
Code:
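if (BG3 & BG3P & m1ex)
    color <= BG3          // mode 1 override: BG3 Pri 1 goes over everything
else if (OBJ & (OBJP >= sel))
    color <= OBJ          // sprite priority is at or above the selected BG
else if (BGT)             // BGT: transparency flag from the BG stage (1 = solid)
    color <= BG
else
    color <= background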
There's probably some sneaky way of building a priority encoder that would reduce this stuff down a bit, but the above should be reasonably efficient in hardware.
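For poking at individual cases, the screen color logic transcribes more or less directly into Python. This is just a sketch of the pseudocode, not a claim about the real gates. The function name `screen_color` is made up, the arguments mirror the signal names above, and it returns the name of the winning source rather than an actual color.

```python
def screen_color(bg3, bg3p, m1ex, obj, objp, sel, bgt):
    """Return which source wins the pixel.

    bg3, bg3p, m1ex, obj, bgt are 0/1 flags; objp and sel are 0..3.
    """
    if bg3 and bg3p and m1ex:
        return "BG3"          # mode 1 override case
    if obj and objp >= sel:
        return "OBJ"          # sprite is at or above the selected BG slot
    if bgt:
        return "BG"           # selected BG is solid
    return "background"
```

For example, a priority 1 sprite over a BG with sel = 2 loses to the BG, but the same sprite over sel = 1 wins.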