What's going on with the MMC5 counter?

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.

by Drag on 2011-04-13 (#76682)

Might as well have this too.

Blargg did some RE work on it
Here's the wiki page for it
Disch's mapper docs are valuable too

So unless there's been some kind of breakthrough that hasn't been posted or documented anywhere, I'm going to assume nobody knows how the IRQ counter specifically detects scanlines, since these sources all say they don't know.

The current theory is that the MMC5 monitors the PPU address lines to detect scanlines. The MMC5 already has to monitor the address lines so that it knows when to inject its own extended tile/attribute data into the PPU.

The current theory is that the MMC5 watches for the two dummy nametable reads that occur at the end of each scanline. The easiest way I can think of to get this to work is to just check to see if the PPU fetches the same address twice in a row.

In theory, the PPU will never fetch from an address twice in a row. (This is unless the garbage nametable fetches during sprite fetches can be the same, but if the sprite fetches use the same circuit as the bg fetch, wouldn't it be a garbage nametable fetch, and then a garbage attribute fetch? That would always be two different addresses) The only time it does fetch from the same address twice is during those two dummy fetches at the end of the scanline.

However, there's also another potential way to get this to work:

Conveniently, A13 toggles for each tile, because the PPU fetches nametable/attribute data, and then pattern table data. A13 is high during the nametable fetch and the attribute fetch, and it's low during the pattern table fetches.

Since the MMC5 is likely already monitoring A13 to detect tiles, maybe it's also counting how many times A13 rises per scanline? A13 will rise exactly 43 times each scanline: 32 times for the BG tiles, 8 times for the sprites, 2 times for the two tiles of the next scanline, and then once for the two dummy nametable reads at the end of the scanline.
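The rise count above can be checked with a small model. This is only a sketch: the addresses are placeholders (only bit 13 matters), `count_a13_rises` and `model_scanline` are made-up names, and it assumes A13 is low just before the first fetch of the line.

```python
def count_a13_rises(addresses, a13_before=0):
    """Count low-to-high transitions of PPU A13 over a list of fetch addresses."""
    rises = 0
    prev = a13_before
    for addr in addresses:
        a13 = (addr >> 13) & 1
        if a13 and not prev:
            rises += 1
        prev = a13
    return rises


# Placeholder addresses: nametable/attribute fetches have A13 high,
# pattern table fetches have A13 low.
NT, AT, PAT_LO, PAT_HI = 0x2000, 0x23C0, 0x0000, 0x0008


def model_scanline():
    """One scanline following the breakdown in the post (an idealised model)."""
    group = [NT, AT, PAT_LO, PAT_HI]
    fetches = []
    fetches += group * 32  # background tiles
    fetches += group * 8   # sprite fetches (garbage NT/AT, then pattern)
    fetches += group * 2   # first two tiles of the next scanline
    fetches += [NT, NT]    # the two dummy nametable reads
    return fetches


print(count_a13_rises(model_scanline()))  # prints 43
```

The total depends on the starting assumption: passing `a13_before=1` to the same model gives 42.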

This probably isn't how it's done though, because the MMC5 seems to know exactly when a scanline starts, such that it can set the in-frame flag on the next scanline, when the PPU rendering is activated mid-scanline that is.

by qbradq on 2011-04-14 (#76690)

I have to agree that watching for those two consecutive read accesses to the $2000-$2FFF address range seems to be how it is done. Thinking from the hardware side I can see a very convenient (and low pin-count) way of detecting this.

You will need to capture the CHR A11, CHR A12 and CHR /RD signals. This should only add two input pins as CHR A11 is already used for name table mirroring.

Here is what the MyHDL code might look like:
Code:
@always(chr_rd_n.negedge)
def irq_clock():
    if chr_a12 == 0 or chr_a11 == 1:
        # Not a qualifying fetch: forget any earlier one.
        flags_irq_nt_fetch_last_frame.next = 0
    elif flags_irq_nt_fetch_last_frame == 0:
        # First qualifying fetch: remember it.
        flags_irq_nt_fetch_last_frame.next = 1
    else:
        # Second qualifying fetch in a row: clock the scanline counter.
        flags_irq_nt_fetch_last_frame.next = 0
        flags_irq_in_frame.next = 1
        irq_counter.next = irq_counter + 1
        if irq_counter + 1 == irq_target:
            flags_irq_pending.next = 1
            out_irq_n.next = 0

I actually like this better than the MMC3 IRQ counter's implementation

by Drag on 2011-04-14 (#76713)

qbradq wrote:
You will need to capture the CHR A11, CHR A12 and CHR /RD signals. This should only add two input pins as CHR A11 is already used for name table mirroring.

Why CHR A11? That's the vertical nametable select bit.

I was thinking you'd need to latch A6-A9 when A13 is high and A12 is low, then check to see if A6-A9 is the same for two consecutive fetches. A6-A9 is 1111 during attribute fetches, and less than 1111 during "normal" nametable fetches. However, if you set the internal PPU registers improperly (as in, set the Y scrolling >= $F0, or set the scrolling by writing $03C0 to $2006), then the nametable could fetch attribute data as nametable data, in which case, the MMC5 would see a whole bunch of double %xxxxx1111xxxxxx fetches. Depending on how the MMC5 compares the reads, this could either cause 34 extra scanline clocks (if the MMC5 simply checks to see if A6-A9 is the same) for up to 16 scanlines, or the scanline counter would be delayed by up to 16 scanlines (if the MMC5 specifically checks for two reads where A6-A9 != 1111)

However, if the MMC5 checks the entire PPU address to see if it accesses the exact same byte twice, then improperly setting the scrolling as I've stated above will cause exactly one extra scanline clock on up to 8 scanlines, as the PPU accesses %10xx1111111111 twice during visible scanline rendering (once as a nametable fetch, and once as an attribute fetch).
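To see where a tile fetch and its attribute fetch land on the same byte, here is a minimal sketch using the commonly documented way the PPU forms both addresses from its 15-bit VRAM address `v`. The function names are made up for illustration, and fine Y (bits 12-14 of `v`) is ignored because it does not affect either address.

```python
def tile_address(v):
    """Nametable byte address for VRAM address v."""
    return 0x2000 | (v & 0x0FFF)


def attribute_address(v):
    """Attribute byte address for the same VRAM address v."""
    return 0x23C0 | (v & 0x0C00) | ((v >> 4) & 0x38) | ((v >> 2) & 0x07)


def self_duplicating_tiles():
    """Every tile position whose nametable and attribute fetch coincide."""
    return [tile_address(v) for v in range(0x1000)
            if tile_address(v) == attribute_address(v)]


print([hex(a) for a in self_duplicating_tiles()])
# prints ['0x23ff', '0x27ff', '0x2bff', '0x2fff']
```

Under these formulas the coincidence only happens at the last byte of each nametable, which is why an out-of-range Y scroll is needed to reach it.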

This is pretty cool, because these are all situations that are easily testable on an MMC5 cart, using just software, and if any of those quirks occur, then we know for certain that it looks for duplicate sequential reads.

by qbradq on 2011-04-14 (#76717)

Ah, now I see why folks have been saying it looks for three consecutive reads from the $2000-$2FFF range. We can use the same method if we revise the code to clock on the third consecutive read to that address range.

This does not address the issue of setting the in_frame flag, but I am not worried about that for my purposes

by qbradq on 2011-04-15 (#76761)

I've hit a snag. How does the MMC5 detect the end of the frame? Perhaps by listening for the idle bus during scanline 241? If so, how would this be accomplished? Some sort of counter maybe? Clocked by M2?

by tokumaru on 2011-04-15 (#76763)

Do you really have to stick to the exact same technique used by an actual MMC5? As long as the scanline counter is reliable and the same registers are used to interface with it, the exact internal workings shouldn't matter, right?

Can anyone think of a decent way to count scanlines that will not break with different pattern table and sprite size configurations? I don't think it has to be the exact same solution used by the MMC5, especially considering the MMC5 doesn't work on clones (not sure if the scanline counter has anything to do with this, but still).

by tepples on 2011-04-15 (#76766)

tokumaru wrote:
Do you really have to stick to the exact same technique used by an actual MMC5?

If we want programs to be testable in existing emulators during development, we have to use either A. the same technique or B. another technique where all behavioral differences between the techniques are documented. That's one reason why we want to know exactly how the MMC5 behaves: so we can document the differences between it and a subset clone. Otherwise, the maker of these cartridge boards will have to arrange for mapper plug-ins to be made available for some emulator on each major PC platform. We don't want to develop programs that rely on emulator behaviors that differ from the actual hardware.

by tokumaru on 2011-04-15 (#76770)

What I mean is that as long as the same registers are used to interface with the mapper, and the overall timing is the same (i.e. interrupts fire at the same time in the scanline) the actual scanline counting method doesn't matter. Emulators will not be able to tell the difference... as far as they are concerned the software will appear to be using an MMC5.

by qbradq on 2011-04-15 (#76774)

I didn't know the MMC5 does not work on clone systems. Now I finally have an excuse to grab a copy of Laser Invasion or Uncharted Waters and try it

Anyhow, I am open to trying different scan line detection methods. My major problem is I do not have a logic analyzer or scope, so I have to guess at what the NOAC is actually doing. Perhaps it does not make the dummy NT reads at the end of the scan line?

by Bregalad on 2011-04-15 (#76780)

I have a Gemfire card I was planning to sacrifice one day anyways, so if someone can make test ROMs for the scanline counter but can't test them because of the lack of the possibility to test them on a MMC5, I could handle that.

Quote:
What I mean is that as long as the same registers are used to interface with the mapper, and the overall timing is the same (i.e. interrupts fire at the same time in the scanline) the actual scanline counting method doesn't matter. Emulators will not be able to tell the difference... as far as they are concerned the software will appear to be using an MMC5.

If both methods are extremely accurate and work every time, and give an IRQ on the same cycle, then yes. Otherwise....

For example, at first glance you could say that with the MMC3's counter. However :
- it will count if you do $2006 writes that will clock it
- it will screw up if you use sprites from the BG pattern table
- it will screw up if you enter into forced blanking mode
etc... etc...

That's why it's important to get the exact behavior of the mapper.

Quote:
I didn't know the MMC5 does not work on clone systems.

I think most clone systems have CIRAM /CE and CHR /A13 wired together internally. So any cart that tries to do something fancy with nametables, such as 4-screen mirroring, using VROM for nametables, or adding an extra nametable, will not work.

In addition to that the MMC5 might use the CHR A13 or CHR /A13 lines for counting because it toggles accurately between each pattern and nametable reads, no matter how the chip is programmed (the only condition for this is that the PPU is not in forced VBlank mode).

by qbradq on 2011-04-15 (#76785)

Bregalad wrote:
I have a Gemfire card I was planning to sacrifice one day anyways, so if someone can make test ROMs for the scanline counter but can't test them because of the lack of the possibility to test them on a MMC5, I could handle that.

I can write up the test ROM. I am working on one anyway. Do you have a clone system to test on as well? That would help me out a ton.

What exactly do you want tested?

Bregalad wrote:
In addition to that the MMC5 might use the CHR A13 or CHR /A13 lines for counting because it toggles accurately between each pattern and nametable reads, no matter how the chip is programmed (the only condition for this is that the PPU is not in forced VBlank mode).

Yuck, that really bites if you're a fan of CV3. The sub-set I am interested in does not do anything of the sort though, so I might be OK on that front. Thanks for the info.

by qbradq on 2011-04-15 (#76793)

Speculation on Detecting PPU Idle Time

Ok, so here's the revised plan for checking for PPU idling. I'll use HDL pseudo-code as that seems appropriate.

Code:
On the falling edge of CHR /RD:
   Set flag_ppu_reading True
   If CHR A11 == 0 and CHR A12 == 1: // Reading from $2000-$2FFF, or $6000-$6FFF, etc.
      Increment ppu_nt_read_count
   Else:
      Set ppu_nt_read_count to 0
   If ppu_nt_read_count >= 3:
      // CLOCK IRQ COUNTER HERE

On the rising edge of PRG M2:
   If flag_ppu_reading is False:
      If flag_ppu_not_reading_last_clock is True:
         Set flag_ppu_idle True
      Else:
         Set flag_ppu_not_reading_last_clock False
         // RESET IRQ STATE MACHINE HERE
   Set flag_ppu_reading False

With this logic we would detect the PPU idle state in a minimum of 6 PPU cycles (NTSC) or a maximum of 9 (I think). This should skip over single idle PPU cycles flawlessly.

Can someone find the flaw here?

As for testing if this is how the MMC5 detects the end of frame, I don't think we can.

by Drag on 2011-04-15 (#76797)

tokumaru wrote:
Do you really have to stick to the exact same technique used by an actual MMC5? As long as the scanline counter is reliable and the same registers are used to interface with it, the exact internal workings shouldn't matter, right?

No, we don't have to use the absolutely exact implementation as the real MMC5. However, if we know what the real hardware implementation of the MMC5 is, it's helpful for how we approximate it. For example, if we know the exact workings of the MMC5, we know exactly which pixel the IRQ fires on (versus a rough estimate), we know about any oddball quirks (such as the 3 I mentioned in my previous post), and just stuff like that.

tokumaru wrote:
Can anyone think of a decent way to count scanlines that will not break with different pattern table and sprite size configurations?

Yeah, use the Gameboy.
haha, I'm kidding. However, once again, I'd say the MMC5's scanline counter is probably the best we'll get for the time being. I mean, it avoids the exact problems you just mentioned. From the sounds of it, the MMC5 is very robust.

tokumaru wrote:
I don't think it has to be the exact same solution used by the MMC5, especially considering the MMC5 doesn't work on clones (not sure if the scanline counter has anything to do with this, but still).

Even if we don't know how the MMC5 counter really works, we have a number of hypotheses as to how it could work, and any of those theories could be implemented in a custom mapper right now and work just fine.

by Drag on 2011-04-15 (#76799)

Disch's Mapper Doc wrote:
The IRQ will trip at the *start* of the desired scanline. Or, more precisely, near the very end of the previous scanline (closest I can figure is dot 336). That is... if the trigger line is set to 1, the IRQ will trip on dot 336 of scanline 0.

PPU cycle 336 is when the first dummy nametable fetch is made, 339 is when the second dummy fetch ends. 340 is the last cycle of the scanline.
If this is the case, then there's no way the MMC5 can be looking for 3 duplicate fetches (as in, the two dummy nametable fetches, and then the first nametable fetch of the next scanline).

So if the MMC5 looks for duplicate fetches, it'd only be looking for the two at the end of the scanline.

by tepples on 2011-04-15 (#76801)

MMC5 has to count horizontal fetches to tell sprite banks from background banks and figure out at what point on the scanline to vertically split.

by qbradq on 2011-04-15 (#76802)

tepples wrote:
MMC5 has to count horizontal fetches to tell sprite banks from background banks and figure out at what point on the scanline to vertically split.

Dang Tepples, that was totally lost on me! Thanks for pointing that out!

I was just thinking in the John that a more robust way of detecting scanlines is to look for the idle cycle at the end of one for synchronization, then just count the CHR /RD edges until you hit 336. If you see more than one idle PPU cycle you know rendering is disabled, and you can reset the state machine.

That should approximate the behavior we see. When you disable the PPU through a register write there are at least 2 PPU cycles between the end of the write and the execution of the next CPU instruction byte, during which time the PPU should be idle.

by Drag on 2011-04-15 (#76817)

MMC5 test rom (Edit: Redownload, I forgot to properly initialize sprites)
This rom ought to test for the quirks I mentioned a while back. I don't have the means to put this on an actual MMC5 cart, so could someone here do it?

When the screen scrolls up, you should see a bunch of 0s at the top of the screen. That's attribute data being rendered as nametable data, by setting the Y scrolling to $FF. Underneath this is a row of down arrows, that's just to indicate the top of the nametable.

The PPU will incidentally make a duplicate read all on its own, when it goes to render the "tile" at 23FF (also happens at 27FF, 2BFF, and 2FFF), because it'll fetch 23FF as a tile, and then fetch 23FF as an attribute. If the scanline counter indeed looks for a duplicate fetch, then the ruler thing at the bottom of the screen should move when the invalid data scrolls onto the screen. It could:
Shift down 8 pixels
Shift up 8 pixels
Cover the screen
Do nothing at all

If the ruler moves at all, how it moves will determine what kind of duplicate read the MMC5 looks for.

If the ruler stays still the whole time, then we can't quite conclude anything.

Press start to pause/unpause the scrolling at any point, and use the left/right keys to scroll the screen left and right (this changes where the incidental double-read occurs). So, if the ruler doesn't immediately freak out when the invalid data scrolls on screen, try pausing it, and then moving the screen left and right.

by Bregalad on 2011-04-16 (#76848)

Well I said I'd help but I didn't expect to have to do it so soon. I think I only have one 256 kb Flash ROM and one 32kb permanent SRAM at the moment. I hope I'll be able to make a devcart using those next week so I can report the result on a real MMC5.

by qbradq on 2011-04-16 (#76853)

I could also make a test cart, but I'd have to get some new ROM chips first. I got a Willem PCB5 and it is not working with the 29F020 chips I have.

by Drag on 2011-04-16 (#76856)

I put all of the code in a single 16k bank, so you can use whatever size rom you want, just mirror the data.

by Drag on 2011-05-05 (#77722)

So has anyone had a chance to test this? I'm personally interested in whether or not any of the quirks I mentioned are happening.

by loopy on 2011-12-13 (#87442)

I tried your test rom. The ruler does indeed move, if you scroll to the left a little bit. Interesting! Not what I was expecting.

http://youtu.be/dGhVxdFJVuY

by Zepper on 2012-02-19 (#90205)

Just tried this ROM, but something's strange. It has 16k of PRG ROM, but it writes $00 to $5100, setting the PRG mode to 32k. My emu crashes because there's no 32k page available.

by tepples on 2012-02-19 (#90207)

Mirror, mirror on the wall...

by Zepper on 2012-02-19 (#90209)

Non-standard settings. Why? I had to duplicate the 16k page in the file and change the iNES header to declare 2 PRG banks of 0x4000 bytes. It worked.
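The workaround described above can be sketched as follows. This assumes the standard iNES layout (16-byte header, byte 4 holding the PRG size in 16 KiB units, an optional 512-byte trainer flagged by bit 2 of byte 6); `duplicate_single_prg_bank` is a made-up helper name, not part of any existing tool.

```python
INES_MAGIC = b"NES\x1a"
PRG_BANK = 0x4000  # one 16 KiB PRG bank


def duplicate_single_prg_bank(rom: bytes) -> bytes:
    """Return a copy of a 1-bank iNES image with the PRG bank stored twice."""
    if rom[:4] != INES_MAGIC:
        raise ValueError("not an iNES file")
    if rom[4] != 1:
        raise ValueError("expected exactly one 16 KiB PRG bank")

    header = bytearray(rom[:16])
    trainer_len = 512 if header[6] & 0x04 else 0
    prg_start = 16 + trainer_len
    prg = rom[prg_start:prg_start + PRG_BANK]
    if len(prg) != PRG_BANK:
        raise ValueError("file is shorter than its header claims")

    header[4] = 2  # now two 16 KiB PRG banks
    rest = rom[prg_start + PRG_BANK:]  # CHR data and anything after it
    return bytes(header) + rom[16:prg_start] + prg + prg + rest
```

The result behaves like the hardware mirroring a 16 KiB chip across the 32 KiB window, but spelled out explicitly for emulators that do not mirror undersized PRG.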

by Drag on 2012-02-22 (#90308)

Sorry about that, I think I originally coded this to be 32k in size, but then I later changed it to 16k, probably without updating my code to explicitly set up a 16k + 16k bank setup. Then again, I was under the understanding that if your ROM chip is smaller than 32k, it'll always be mirrored to 32k because the ROM chip won't ever see the upper address lines from the NES.

Thank you so incredibly much for testing this, loopy! I'll need to reread my ramblings in this thread in order to figure out what my train of thought from 7 months ago was, but I think this confirms that the MMC5's scanline counter detects duplicate PPU fetches in order to count scanlines.

The reason I can conclude this is because of the trick I mentioned, where I'm giving the PPU an invalid Y-scroll value, and tricking it into reading $23FF twice consecutively (once as a tile and then again as an attribute byte). The MMC5 detects this duplicate read, and clocks the scanline counter on each scanline that the trick-tile occupies, which is why the ruler moves up when it scrolls onto the screen.

Since it's detecting my trick duplicate read, that means it must also be detecting the duplicate read that occurs at the end of each scanline (cycles 336-339), and since it clocks the scanline counter when it detects these duplicate reads, that must be how it's counting scanlines.

So, that's one MMC5 mystery solved.

by infiniteneslives on 2012-02-22 (#90313)

Drag wrote:
The reason I can conclude this is because of the trick I mentioned, where I'm giving the PPU an invalid Y-scroll value, and tricking it into reading $23FF twice consecutively (once as a tile and then again as an attribute byte). The MMC5 detects this duplicate read, and clocks the scanline counter on each scanline that the trick-tile occupies, which is why the ruler moves up when it scrolls onto the screen.

Since it's detecting my trick duplicate read, that means it must also be detecting the duplicate read that occurs at the end of each scanline (cycles 336-339), and since it clocks the scanline counter when it detects these duplicate reads, that must be how it's counting scanlines.

So, that's one MMC5 mystery solved.

NICE! Nifty little way to get inside and figure out how it's detecting what's going on. So the logic involved here seems pretty simple for detecting this. The only costly thing is the number of I/O, but that's needed for the EXRAM anyways.

by Bregalad on 2012-02-22 (#90315)

Yahooooo !
Let's toast on this !

I think those extra unnecessary fetches at the end of each scanline are really weird/undocumented. I didn't even know it was a duplicate fetch.
This also explains/confirms why the MMC5's counter doesn't work on clones at all, clones which obviously have another fetching sequence.

What sucks, though, is that you can't intentionally display the attribute table as a nametable on the MMC5 while using the SL counter, as it will be clocked twice in that part (apparently). Displaying the attribute table as a nametable is the only way to fully use the ExGraphix mode without any waste, as you can get a 32x32 tilemap with every tile having single color.
Not a big issue though, since to do that you'd only need to trigger an IRQ after displaying the last true nametable row ($23a0-$23bf) to force the counter to $23c0, so after this line the IRQ counter is not necessary any longer.

Are there any other "mysteries" in the MMC5 left ? Like how it detects forced blanking, how exactly the ExRAM is accessed (dual port), how fast is the multiplier, etc...

by Drag on 2012-02-22 (#90323)

An interesting thing with this video is that the effect doesn't start until the trick tile is actually the third tile on the scanline (from the left). As in, the ruler didn't move until there were 3 columns of dots on the screen.

The reason this is interesting is due to the fetches the PPU makes to render a scanline, which there's a handy reference for here.

The first two tiles of a scanline are fetched at the end of the PREVIOUS scanline, just before the duplicate read. However, when the trick tile is within the first two columns of the screen (when there are only 2 columns of dots in the video), the ruler is unaffected.

With one column of dots on the screen, the fetches should look something like this:

23FF -- First tile for next scanline --
23FF <- Trick duplicate read
0000
0008
23E0 -- Second tile for next scanline --
23F8
0000
0008
23E1 -- The end-of-scanline duplicate read, MMC5 detects this --
23E1
-- Idle Cycle --
23E1 -- Beginning of next scanline, third tile for this scanline --
23F8
0000
0008

However, the MMC5 doesn't seem to be recognizing this. Same for when there are two columns of dots on the screen:

23FE -- First tile for next scanline --
23FF
0000
0008
23FF -- Second tile for next scanline --
23FF <- Trick duplicate read
0000
0008
23E0 -- The end-of-scanline duplicate read, MMC5 detects this --
23E0
-- Idle Cycle --
23E0 -- Beginning of next scanline, third tile for this scanline --
23F8
0000
0008

but with three columns of dots on the screen, the MMC5 detects the trick and clocks the scanline counter one extra time.

23FD -- First tile for next scanline --
23FF
0000
0008
23FE -- Second tile for next scanline --
23FF
0000
0008
23FF -- The end-of-scanline duplicate read, MMC5 detects this --
23FF
-- Idle Cycle --
23FF -- Beginning of next scanline, third tile for this scanline --
23FF <- Trick duplicate read
0000
0008

I don't know what happens when the tile is on other parts of the scanline, but why does this trick only work when the tile is on the third column of the screen?

by loopy on 2012-02-22 (#90325)

Because it's looking for *3* reads, not 2.

1 column:
Code:
23FF
23FF
0000
0008

23E0
23F8
0000
0008

23E1 \
23E1  \ 1
      /
23E1 /
23F8
0000
0008

2 columns:
Code:
23FE
23FF
0000
0008

23FF
23FF
0000
0008

23E0 \
23E0  \ 1
      /
23E0 /
23F8
0000
0008

3 columns:
Code:
23FD
23FF
0000
0008

23FE
23FF
0000
0008

23FF \
23FF  \ 1 \
      /    \ 2
23FF /     /
23FF      /
0000
0008

That's how I've implemented it on the Powerpak anyway, and the results look the same.
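A minimal behavioural sketch of the "three reads" rule, run against the three fetch sequences listed earlier in the thread. It assumes the counter is clocked whenever a read matches both of the two reads before it, and that the idle cycle produces no read at all; `count_clocks` is a made-up name, and this is not loopy's PowerPak code.

```python
def count_clocks(reads):
    """Count reads whose address equals both of the two preceding reads."""
    clocks = 0
    older = newer = None  # the two most recently seen addresses
    for addr in reads:
        if addr == newer == older:
            clocks += 1
        older, newer = newer, addr
    return clocks


# Fetch sequences from the thread, with the idle cycle left out.
ONE_COLUMN = [0x23FF, 0x23FF, 0x0000, 0x0008, 0x23E0, 0x23F8, 0x0000, 0x0008,
              0x23E1, 0x23E1, 0x23E1, 0x23F8, 0x0000, 0x0008]
TWO_COLUMNS = [0x23FE, 0x23FF, 0x0000, 0x0008, 0x23FF, 0x23FF, 0x0000, 0x0008,
               0x23E0, 0x23E0, 0x23E0, 0x23F8, 0x0000, 0x0008]
THREE_COLUMNS = [0x23FD, 0x23FF, 0x0000, 0x0008, 0x23FE, 0x23FF, 0x0000, 0x0008,
                 0x23FF, 0x23FF, 0x23FF, 0x23FF, 0x0000, 0x0008]

for name, seq in [("1 column", ONE_COLUMN), ("2 columns", TWO_COLUMNS),
                  ("3 columns", THREE_COLUMNS)]:
    print(name, count_clocks(seq))
# prints "1 column 1", "2 columns 1", "3 columns 2"
```

Only the three-column case produces a second clock, which matches the point in the video where the ruler starts to move.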

by Drag on 2012-02-22 (#90329)

Oh nifty, that makes sense then. I'm just trying to think about how it would do this on a hardware level. For example, it might have two latches, and each read gets pushed into those latches, and if the PPU fetches a particular address, and that same address is in both of the latches, then it clocks the scanline counter. That would mean that each additional read after the first two would clock the scanline counter. However, the only way to test this would be to pull a Kevin Horton and feed the MMC5 some custom "PPU" signals directly.

That's a good find though, if the quirk only happens when the trick tile is on the third column of the screen specifically, then that means the MMC5 would indeed be looking for 3 (or more) identical fetches.

by Bregalad on 2012-02-22 (#90330)

Oh, I'm not too sure about this.

On a "normal" scanline (not one which uses attribute table as nametable), when do 3 consecutive fetches happen ?
Do the 2 dummy fetches at the end of each scanline always read the same address as the actual fetch that was before it ? I don't remember reading about it.

Does this mean it's possible to clock the MMC5's counter artificially by tweaking $2006 and $2007 manually ?

How does the MMC5 know when it's VBlank and when it's outside of VBlank (I'm pretty sure there is a flag like that you can read) ?

by loopy on 2012-02-22 (#90339)

Bregalad wrote:
Oh, I'm not too sure about this.

On a "normal" scanline (not one which uses attribute table as nametable), when do 3 consecutive fetches happen ?
Do the 2 dummy fetches at the end of each scanline always read the same address as the actual fetch that was before it ? I don't remember reading about it.

No. The 2 dummy fetches are from the address the _next_ real fetch will be at. It's documented here. And as I said earlier, I did it this way on the powerpak and it does work.

Bregalad wrote:
Does this mean it's possible to clock the MMC5's counter artificially by tweaking $2006 and $2007 manually ?

I don't think so. You could only do that in vblank and the line counter isn't operating then. It would be an easy theory to test though.

Bregalad wrote:
How does the MMC5 know when it's VBlank and when it's outside of VBlank (I'm pretty sure there is a flag like that you can read) ?

I don't think this is known yet.

by tepples on 2012-02-22 (#90343)

Bregalad wrote:
Do the 2 dummy fetches at the end of each scanline always read the same address as the actual fetch that was before it ?

I think that's the key. The fetches at x=337 and x=339 of one line have the same address as the fetches at x=1 of the next line.

Now we just need to figure out how to do the same thing with fewer I/Os. I'd bet just watching for several consecutive reads with bit 13 set ($2000-$3FFF) would do it. The most consecutive fetches you get from $2000-$3FFF during a scanline is two, and the end of a scanline has four: x=337, x=339, x=1, x=3.
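The A13-only idea can be sketched as a run-length counter: clock once when a run of consecutive reads with bit 13 set reaches three. This is a rough model of the suggestion, not tested hardware behaviour; `count_a13_run_clocks` and the `threshold` parameter are made up, and the idle cycle is assumed not to break a run.

```python
def count_a13_run_clocks(reads, threshold=3):
    """Clock once each time a run of A13-high reads reaches `threshold`."""
    clocks = 0
    run = 0
    for addr in reads:
        if addr & 0x2000:
            run += 1
            if run == threshold:
                clocks += 1
        else:
            run = 0
    return clocks


# Two of the fetch sequences from earlier in the thread (idle cycle omitted).
ONE_COLUMN = [0x23FF, 0x23FF, 0x0000, 0x0008, 0x23E0, 0x23F8, 0x0000, 0x0008,
              0x23E1, 0x23E1, 0x23E1, 0x23F8, 0x0000, 0x0008]
THREE_COLUMNS = [0x23FD, 0x23FF, 0x0000, 0x0008, 0x23FE, 0x23FF, 0x0000, 0x0008,
                 0x23FF, 0x23FF, 0x23FF, 0x23FF, 0x0000, 0x0008]

print(count_a13_run_clocks(ONE_COLUMN))     # prints 1
print(count_a13_run_clocks(THREE_COLUMNS))  # prints 1
```

In this model the trick tile does not cause an extra clock, because the counter only looks at how many nametable-range reads occur in a row, not at whether their addresses match.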

Quote:
Does this mean it's possible to clock the MMC5's counter artificially by tweaking $2006 and $2007 manually ?

I remember reading somewhere that the MMC5 watches writes to $2001: if the background and sprites are disabled, the counter doesn't get clocked. Or it could detect the post-render scanline somehow.

by Bregalad on 2012-02-23 (#90405)

Quote:
That's how I've implemented it on the Powerpak anyway, and the results look the same

I don't remember ever booting a MMC5 game successfully on the powerpak. This made me try again but my memory was right : MMC5 doesn't work yet on the power pak (even with the latest mappers), even for CV3 which only uses a small part of MMC5's capabilities.

by loopy on 2012-02-23 (#90406)

http://home.comcast.net/~olimar/NES/mmc5.zip

I didn't link to it on my webpage because it's not complete. It was mentioned over in this thread.

by Bregalad on 2012-02-23 (#90413)

Oh okay. I didn't think this thread was all that serious but apparently it ended this way.

It looks great for a beta mapper ! You don't have to worry about it being incomplete, apparently most things work great, it's just the sound that is lacking.
And it's probably quite simple to implement since it's like the 2A03 but with the most complex part (the sweep) out.

If only I was able to play with .MAP files I would sure do some experiments... but I have no idea how to achieve this.

Back on the subject of how the MMC5 detects the VBlank/Frame, I don't know but I have a feeling that it's something dead simple nobody ever thought of.
Like a particular fetch the PPU does at the beginning which simply enables the "frame" mode - which would be easy to replicate by reading $2007 during VBlank and trick the MMC5 into thinking the frame has begun.

by Drag on 2012-02-23 (#90438)

It'd be pretty nice to know if the PPU's fetches during the prerender scanline are unique in some way. As in, because the fetched data is just thrown away, is it valid nametable/attribute data for the scanline that would be displayed? Or is the PPU fetching some bullshit addresses (similar to how the PPU fetches the same nametable byte twice at the end of the scanline to waste time)? If the PPU is doing something like fetching $0000 over and over again, the MMC5 could be using that to detect the start of a frame.

As for how it detects the end of the frame, it could just simply be using the scanline counter and counting until the last scanline is rendered. To test for this, you could use my test rom again, but wait in a loop for the in-frame flag to be unset, and have some kind of onscreen indication for when it happens (such as setting the monochrome bit). Perform the ruler-moving-up quirk, and see if it causes a monochrome section of the screen to slide up too.

by krzysiobal on 2012-04-14 (#92488)

I made a programmable cart for the Famicom (to be precise, for the PEGASUS, a Polish Famicom clone with a single chip inside, the um6561):
http://lh6.ggpht.com/_tbPUSWhIUVo/S8CfJ ... ackage.jpg

and for the last few days I have been implementing MMC5 on it.
I built it following the documentation here:
http://wiki.nesdev.com/w/index.php/MMC5
and here:
http://wiki.nesdev.com/w/index.php/INES_Mapper_005
and it works really well (tested on Castlevania 3). There are some minor CHR glitches, but they might be due to a mistake in assigning banks, I need to check it again.

When I find three consecutive accesses by the PPU to the same address, I treat it as the beginning of a new scanline.
When I find that the CPU is accessing $FFFA, it means that the PPU has finished generating the frame.
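The end-of-frame rule described above can be modelled in a few lines. This is a behavioural sketch only, not the FPGA design: `FrameTracker` and its method names are made up, and scanline detection is reduced to a single call so the $FFFA rule stands out.

```python
NMI_VECTOR_LOW = 0xFFFA  # low byte of the 6502 NMI vector


class FrameTracker:
    """Tracks an in-frame flag from scanline events and CPU bus accesses."""

    def __init__(self):
        self.in_frame = False
        self.scanlines_seen = 0

    def scanline_detected(self):
        """Call when the scanline detector (e.g. three equal reads) fires."""
        self.in_frame = True
        self.scanlines_seen += 1

    def cpu_access(self, addr):
        """Call for every CPU bus access; a fetch of $FFFA ends the frame."""
        if addr == NMI_VECTOR_LOW:
            self.in_frame = False
            self.scanlines_seen = 0


tracker = FrameTracker()
for _ in range(240):
    tracker.scanline_detected()
tracker.cpu_access(0x8000)   # ordinary code fetch: still in frame
print(tracker.in_frame)      # prints True
tracker.cpu_access(0xFFFA)   # CPU takes the NMI: frame is over
print(tracker.in_frame)      # prints False
```

The model also shows the weak spot: if NMI generation is disabled, $FFFA is never fetched and the flag stays set.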

Guys, try disabling NMI generation by the PPU and see if the MMC5 still correctly detects the end of the frame.

by infiniteneslives on 2012-04-14 (#92492)

All I can say is WOW! You should make a separate post about your cart to share it, I don't know how many people will look at this thread and see your board in all its glory. That is one HELL of a devcart... You should win some kind of award for most complex through hole design. I love how the board kept growing with the added on PCBs. And you've got MMC5 working to some degree, wowzers.

by 80sFREAK on 2012-04-14 (#92498)

Looks scary but if it's working, it's working.
Couple questions:
1)RAM IC's above flash memory could be used instead of flash?
2)all mappers sits in the "big black square chip"?

by tokumaru on 2012-04-14 (#92499)

krzysiobal wrote:
When I find three consecutive accesses by the PPU to the same address, I treat it as the beginning of a new scanline.

Sounds like a simple way to count scanlines, that could be used in new mappers developed by the community for future homebrews.

by krzysiobal on 2012-04-16 (#92535)

Hello again!
I will describe it soon if you wish
Yea, the black-square box inside is a Spartan XC3S200 FPGA, which emulates mappers. There is currently support for NROM, UNROM (UOROM), CNROM, BNROM, GNROM, ANROM, MMC1, MMC3, MMC5, CAMERICA (71 & 232), JALECO (18). The FPGA utilization level is 62%.
PRG ROM 512KB, CHR ROM 512KB, PRG RAM 32KB, CHR RAM 32KB, support for 4-screen mirroring. Games are loaded from MMC card - full fat16 implementation for 6502 from scratch ;]

The design is ready for PRG ROM chips up to 4MB, but I've never found such large flash ROMs (I'm using the 29F040, which is 512KB. The 29F080 is 1MB, but it is rare).
4 x 29F040 can be connected in parallel with LS319 decoding, but there would be too many wires and too much noise.

However, I need to ask for some help! I implemented MMC5 (it is almost complete, with all features except the vertical split).
I tested the features on home-made mmc5 test roms (for example - the extended tile/attributes, extra nametable) and it works OK.

The one game that I am focused on is Castlevania 3. It works really well except for one minor flaw in the intro. A few days ago I thought it was because of some mistakes in assigning banks, but I checked everything 3 times and it seems to be OK.

I also thought it might be because the game uses extended tile/attributes or the extra nametable, but those are implemented now and the flaw is still there. As far as I can tell, Castlevania does not use split scrolling - the only STA $5200 opcode (8D0052 in hex) is at offset $3E100 in the NES file, and it's part of this routine:

so storing #00 here definitely turns off the splitting.
Also checked for any STXes and STYes at this address and none was found.

Here is a video of how it looks on my cart (for the impatient: it's at 1:30).

http://youtu.be/OrYAC_1bNSk

See that the title screen is correct for a second, then something gets switched and it breaks.

Inspecting how it plays in the FCEUX emulator and taking a look at the nametables and pattern tables, I see exactly the same behaviour (in the name table window), but in the game window everything is OK of course.

That's just a second before the fatal switch:

and that's after:

When I played it in Nesticle, it behaved the same way as on my dev-cart!

The strange thing is that on the second screen, the big Castle Vania text is displayed correctly in the game window, but there aren't any pattern tables for it!

Very similar behaviour can be seen in the MMC3 game - Doki Doki Yuuenchi (Doki Doki Amusement Park):

The scrollbar with the START/PASSWORD text is buggy in the name table window and there aren't any fonts in the pattern tables!
However, it is working because this game uses the IRQ counter - it sets the IRQ counter to the scanline just before the START/PASSWORD text, and when it fires, it quickly swaps the CHR banks so that the text is rendered using different pattern tables.
But this game runs correctly on my dev-cart!

Similar tricks might be used in Castlevania 3, but I also implemented the IRQ counter in MMC5 and it seems to work correctly (I tested it, for example, on the MMC5 game Laser Invasion - when the IRQ counter was not implemented, this game glitched and blinked). Also, Castlevania 3's title screen wasn't displayed at all without the IRQ counter.

Have you got any guesses?

by infiniteneslives on 2012-04-16 (#92541)

krzysiobal wrote:
However, it is working because this game uses the IRQ counter - it sets the IRQ counter to the scanline just before the START/PASSWORD text, and when it fires, it quickly swaps the CHR banks so that the text is rendered using different pattern tables.
But this game runs correctly on my dev-cart!

Similar tricks might be used in Castlevania 3, but I also implemented the IRQ counter in MMC5 and it seems to work correctly (I tested it, for example, on the MMC5 game Laser Invasion - when the IRQ counter was not implemented, this game glitched and blinked). Also, Castlevania 3's title screen wasn't displayed at all without the IRQ counter.

Have you got any guesses?

I'm just guessing here, but is it possible your IRQ counter is off by a scanline or a few dots on your MMC5? Your test using Laser Invasion might not be conclusive that your IRQ counter is working properly if that game isn't sensitive to being off by a scanline or a few dots. I would think it's probable that you're on the right scanline but the wrong dot. If that's the case I would guess some games won't care, but others will, which might be your issue.

I'm not sure when the MMC5 fires an IRQ exactly but if you can't figure it out I can *try* to get a sample using my logic analyzer for CV3 on that screen.

by krzysiobal on 2012-04-17 (#92581)

Hello again!
Thanks for the reply - the problem is solved!
It was indeed connected with the IRQs.

However, it really does not matter if the IRQ fires a scanline or a dot late (the banks will be switched one scanline too late, but that won't be very visible).

I have examined in detail what Castlevania 3 does on the title screen:

Code:

nmi:
...
lda $5204 ;acknowledge irq
lda #$80
sta $5204 ;turn irqs on
lda #8
sta $5203 ;set irq to be fired at scanline 8
...
rti

irq:
...
lda $5204 ;acknowledge IRQ
lda bank ;check if current chr-bank is 1 or 0
beq scanline_8
bne scanline_d0

scanline_8:
lda #1
sta $512b ;switch bank
sta bank

lda #$D0
sta $5203 ;set irq to be fired on $d0 scanline
jmp irq_end

scanline_d0:
lda #0
sta $512b ;switch bank
sta bank
lda #$D3 ;set irq to be fired on $d3 scanline - don't know why, but the irqs are turned off so maybe it even is ignored by the cpu
sta $5203
lda #0 ;turn off irqs
sta $5204
irq_end:
...
rti

So it sets the IRQ to fire at scanline 8 (this is constant) - when you look at the screen in PAL mode, you see glitches above that scanline. In NTSC they are invisible. Because the PEGASUS is an NTSC console with forced PAL (probably the same as the Russian DENDY), they can be seen on my TV.

Anyway, after that it changes the CHR bank.
The next IRQ is set to fire at scanline $D0 (this is variable - as the title screen advances, this value is decreased, so the next IRQ fires earlier and earlier). After that, the second CHR bank switch takes place.

I am writing my dev-cart in VHDL and have two separate processes (one which is responsible for the CPU part and one for the PPU).

I also have signals such as mmc5_ppu_in_frame, mmc5_irq_pending, and some others (they don't matter here).

However, those two signals need to be changed by both of the processes:

mmc5_irq_pending must be cleared after a $5204 read - the "CPU" process does that - but it must also be set when the PPU notices that the current scanline number matches the one at $5203 - the "PPU" process does that.

Similar with the second signal.

Because one signal cannot be driven by two processes, I had to use a `trick` - I added some more signals:
mmc5_irq_pending_set_req (set request)
mmc5_irq_pending_set_ack (set acknowledge)
So the CPU drives the irq_pending signal as it wants, but when the PPU wants to set it high, it sets mmc5_irq_pending_set_req to '1'.

When the CPU notices that mmc5_irq_pending_set_req equals '1', it
sets mmc5_irq_pending to '1' and sets mmc5_irq_pending_set_ack to '1'.

When the PPU notices that mmc5_irq_pending_set_ack is '1', it clears
mmc5_irq_pending_set_req to '0'.
And when the CPU notices that mmc5_irq_pending_set_req is '0', it clears mmc5_irq_pending_set_ack to '0'.

So it is a kind of handshake.
The CPU_PROCESS is activated on the rising edge of M2.
The PPU_PROCESS is activated on the falling edge of PPU_!RD or the falling edge of PPU_!WR.
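
The request/acknowledge scheme described above can be modelled in a few lines of Python. This is only a behavioural sketch under assumptions, not the actual VHDL: the class `IrqHandshake` and the method names `ppu_tick` and `cpu_tick` are invented for this example, and the order in which a $5204 read and a pending request are handled within one CPU tick is a guess.

```python
class IrqHandshake:
    """Behavioural model of the set_req / set_ack handshake.

    Each flag is written by exactly one side, mirroring the rule that
    a signal may only be driven by one process.
    """

    def __init__(self):
        self.irq_pending = False  # driven only by the CPU side
        self.set_req = False      # driven only by the PPU side
        self.set_ack = False      # driven only by the CPU side

    def ppu_tick(self, scanline_match=False):
        """Runs on a PPU read/write edge (never during vblank)."""
        if self.set_ack:
            self.set_req = False   # CPU saw the request; withdraw it
        elif scanline_match:
            self.set_req = True    # ask the CPU side to raise the flag

    def cpu_tick(self, read_5204=False):
        """Runs on every rising edge of M2, including during vblank."""
        if read_5204:
            self.irq_pending = False   # acknowledge (assumed to go first)
        if self.set_req and not self.set_ack:
            self.irq_pending = True
            self.set_ack = True
        elif not self.set_req and self.set_ack:
            self.set_ack = False       # handshake complete
```

Because only `cpu_tick` ever touches `irq_pending`, an acknowledge through $5204 takes effect immediately even when no PPU edges are arriving.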

However, before it started to work, the roles were reversed - the PPU process was driving the mmc5_irq_pending signal, and the CPU process was setting mmc5_irq_pending_clear_req when it wanted to pull it down.
And that didn't work as well, because the PPU process is not active during VBLANK, since PPU_!RD and PPU_!WR are high there! So if the CPU wanted to pull it low during VBLANK, it was pulled low by the PPU_PROCESS only after the VBLANK.

The CPU_PROCESS is active all the time (M2 keeps going from low to high all the time).

So it was my ingenious idea to swap the roles.
Sorry for such a long description; I thought someone might need it in the future.

BTW. I love reading DISCH's notes about mappers, for example:
Code:
Detailed Operation:

The IRQ counter is an up counter, rather than a down counter (like MMC3). Every time the MMC5 detects a
scanline, it does the following:

- If In Frame Signal is clear...
a) Set In Frame signal
b) Reset IRQ counter to 0
c) Clear IRQ pending flag (automatically acknowledging IRQ)

- otherwise...
a) Increment IRQ counter
b) If IRQ counter now equals the trigger value, raise IRQ pending flag

Such a detailed, algorithmic approach leaves no ambiguities, and implementing it is just a piece of cake! This guy should be awarded a medal.
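
For illustration, the steps quoted from Disch's notes translate almost line for line into code. The sketch below is a Python model under assumptions: the class name `Mmc5IrqCounter` and its attribute names are made up here, `trigger` stands for the value written to $5203, and how the in-frame signal gets cleared again is deliberately not modelled, since the quoted notes don't cover it.

```python
class Mmc5IrqCounter:
    """Model of the per-scanline steps from the quoted notes."""

    def __init__(self, trigger=0):
        self.in_frame = False
        self.counter = 0
        self.irq_pending = False
        self.trigger = trigger  # value written to $5203

    def on_scanline(self):
        """Call once each time a scanline is detected."""
        if not self.in_frame:
            self.in_frame = True      # a) set In Frame signal
            self.counter = 0          # b) reset IRQ counter to 0
            self.irq_pending = False  # c) clear IRQ pending flag
        else:
            self.counter += 1         # a) increment IRQ counter
            if self.counter == self.trigger:
                self.irq_pending = True  # b) raise IRQ pending flag
```

With `trigger=8`, the first detected scanline only sets the in-frame state, so the pending flag goes up on the ninth call to `on_scanline()`.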

by infiniteneslives on 2012-04-18 (#92593)

Well I'm glad I was able to help point you in the right direction at least.

Congrats on getting it running, I might have to pick your brain when I get around to my implementation of MMC5.

by TmEE on 2012-04-19 (#92620)

That is very very awesome

by Drag on 2012-04-20 (#92673)

Bregalad wrote:
Oh okay. I didn't think this thread was all that serious but apparently it ended this way.

I know this is an old post, but why did you think this thread wasn't meant to be serious? Or did I just misinterpret what you wrote?

Bregalad wrote:
Back on the subject about how MMC5 detects the VBlank/Frame, I don't know but I have some feeling that it's something dead simple nobody ever thought.
Like a particular fetch the PPU does at the beginning which simply enables the "frame" mode - which would be easy to replicate by reading $2007 during VBlank and trick the MMC5 into thinking the frame has begun.

I believe the prerender scanline makes the same triple-fetch that other scanlines do, so I'd hypothesize that the MMC5 looks for the first triple-fetch to see when the PPU starts rendering.

The only other way to triple-fetch an address (aside from how the PPU naturally does it) is to do it intentionally, by writing the same address to 2006 three times (reading from 2007 each time), so it seems plausible to me. It'd even be easy to implement the in-frame flag this way; whenever the scanline counter is clocked, set the in-frame flag.

That's how it could detect the start of a frame, but how it detects the end of a frame is beyond me. Krzysiobal's earlier post mentioned that he watches for a read from CPU$FFFA, which would happen when the vblank NMI happens. Right now, I'd be most willing to wager on that being how the MMC5 does it.

This is all just speculation though. I don't have the same amount of free time I did when I first made this thread, otherwise I'd be more gung-ho about testing things. :\

by infiniteneslives on 2012-04-20 (#92674)

Drag wrote:

I know this is an old post, but why did you think this thread wasn't meant to be serious? Or did I just misinterpret what you wrote?

He was talking about this thread I believe:

loopy wrote:
http://home.comcast.net/~olimar/NES/mmc5.zip

I didn't link to it on my webpage because it's not complete. It was mentioned over in this thread.

by Bregalad on 2012-04-21 (#92678)

Yes, the thread I didn't think was serious was the "OMG I invented the MMC7" one, not this thread.

I guess I should have said "I didn't think that thread ..." instead of "this thread" to be more precise - in French (my language) we only have a single word for "this" and "that", so it's hard for me to know which one to pick when I'm writing in English.

When it comes to detecting the end of the frame, wouldn't it be as simple as counting 240 scanlines? Of course if you enable rendering late (Battletoads style) it will not work, but who ever said the MMC5 counter worked properly in this setting?

Detecting reads from $fffa can be clever, but it will only work if NMI is enabled. We all agree it's almost always the case, but you could decide for some reason not to always use NMIs, and then this would make the MMC5 counter fail as well?

I think there is at least one NROM game, Portopia, which relies entirely on $2002 polling and never uses NMI. This game hardly uses any animations, so randomly missing VBlanks has no visible effect on the game.

Anyways this should be extremely simple to test by reading $fffa manually. If the MMC5 however looks for a $fffa read followed by a $fffb read on the next cycle we can't do that with code so we'd have to test this by disabling NMIs and see if the counter still works or if it's stuck in its "in frame" state.

by loopy on 2012-04-21 (#92684)

The in-frame flag ($5204.6) also goes low if you disable screen rendering. I suppose you could also watch $2001 writes in addition to $FFFA+FFFB reads, but this seems unnecessarily complicated.

On Powerpak, it's detected when the CHR RD pin goes inactive (low for X cpu clocks).
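
A timeout of this kind can be sketched as a small idle counter. The Python below is a rough model under assumptions, not the Powerpak's actual logic: the class `InFrameDetector`, its method names, and the default `idle_limit=3` are placeholders, since the real value of "X" is not given here.

```python
class InFrameDetector:
    """Clears the in-frame flag after a run of CPU clocks with no PPU reads."""

    def __init__(self, idle_limit=3):
        self.idle_limit = idle_limit  # stands in for the unspecified "X"
        self.idle = 0
        self.in_frame = False

    def note_scanline(self):
        """Called by whatever scanline detector is in use."""
        self.in_frame = True
        self.idle = 0

    def cpu_clock(self, ppu_read_seen):
        """Called once per CPU clock with whether a PPU read occurred."""
        if ppu_read_seen:
            self.idle = 0
        else:
            self.idle += 1
            if self.idle >= self.idle_limit:
                self.in_frame = False  # PPU has gone quiet: frame is over
```

This covers both vblank and rendering being switched off through $2001, since in either case the PPU stops fetching.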

by Drag on 2012-04-21 (#92710)

loopy wrote:
The in-frame flag ($5204.6) also goes low if you disable screen rendering. I suppose you could also watch $2001 writes in addition to $FFFA+FFFB reads, but this seems unnecessarily complicated.

On Powerpak, it's detected when the CHR RD pin goes inactive (low for X cpu clocks).

If it watches for CHR RD to settle down, then the in-frame flag should go low before the NMI fires, because there's supposedly an idle scanline before the PPU actually sends the NMI.

Also, continually alternating between reading 2007 and 5204 would hypothetically cause the in-frame to stay high (since reading 2007 would generate activity on CHR RD), even when it's supposed to be low.

Again, speculation, but easily testable.

by tepples on 2012-04-21 (#92717)

Bregalad wrote:
I guess I should have said "I didn't think that thread ..." instead of "this thread" to be more precise - in French (my language) we only have a single word for "this" and "that", so it's hard for me to know which one to pick when I'm writing in English.

English used to have three: this, that, and yon. Spanish still does (este/esta, ese/esa, aquel/aquella), as does Japanese (kore, sore, are). I think this merger might have something to do with the process that turned the Vulgar Latin words that became Spanish ser and estar into French être.

Quote:
Anyways this should be extremely simple to test by reading $fffa manually. If the MMC5 however looks for a $fffa read followed by a $fffb read on the next cycle we can't do that with code

Not even JMP ($FFFA)?

by tokumaru on 2012-04-21 (#92722)

tepples wrote:
English used to have three: this, that, and yon. Spanish still does (este/esta, ese/esa, aquel/aquella), as does Japanese (kore, sore, are).

Just wanted to add Portuguese to the list: it also has 3 in theory (este/esta, esse/essa, aquele/aquela), but "este" is hardly used in spoken form, where "esse" is used in both cases.

by infiniteneslives on 2012-06-04 (#94997)

tepples wrote:
Bregalad wrote:
Do the 2 dummy fetches at the end of each scanline always read the same address as the actual fetch that came before them?

I think that's the key. The fetches at x=337 and x=339 of one line have the same address as the fetches at x=1 of the next line.

Now we just need to figure out how to do the same thing with fewer I/Os. I'd bet just watching for several consecutive reads with bit 13 set ($2000-$3FFF) would do it. The most consecutive fetches you get from $2000-$3FFF during a scanline is two, and the end of a scanline has four: x=337, x=339, x=1, x=3.

Did some probing around with my logic analyzer to try and prove the validity of your idea, Tepples. It might be even easier than that; I'll have to test this at some point, but it appears that a simple flip-flop could sense scanlines if the timing was properly established.

Watching the CHR A13 line you see the four consecutive reads like you discussed. Below the blue trace is CHR A12 where you can see the sprite fetches for the scanline. Yellow is CHR /RD, Red is CHR A13, and Green is CHR /WR (not much to see). This is coming from the intro of "To the Earth" for anyone curious.

[EDIT: attaching photo at bottom since Dropbox changed link urls and nesdev image attaching has improved...]

But looking at the traces above, one other thing sticks out. CHR A13 and CHR /RD always rise WITH each other, but there's ONE exception. Between scanlines CHR A13 falls low for a small period of time, and when it rises again for the 3rd read, CHR /RD has 'hung' for that same period of time. So it almost looks like you could just have a counter that was allowed to be clocked when CHR /RD was high and then clocked with CHR A13. This looks like a great solution with the above discrete traces. In actuality CHR A13 lags CHR /RD slightly, as expected, which wouldn't allow the idea above to work if the logic was reasonably fast. One way to get around that would be to add some extra delay to CHR /RD with a few gates or something. I wouldn't expect this to be too stable; I just thought it was interesting how CHR A13 toggling could be taken advantage of.

Your idea to merely check for 4 consecutive reads seems pretty sound. You could even reduce the counting logic from 3 to 2 bits and just check for 3 consecutive reads, I'd think. The IRQs would be delayed compared to MMC3, but you'd lose the mess of goofing things up with $2006/7 only using TWO inputs. BEAUTIFUL
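
To make the counting idea concrete, here is a small Python sketch of a saturating counter that watches only whether A13 is set on each PPU read. It is a software model under assumptions, not tested hardware: the function name `detect_scanlines_a13` and the use of a list of read addresses as input are invented for this example.

```python
def detect_scanlines_a13(read_addrs, threshold=3):
    """Return the indices of reads at which a scanline would be flagged.

    Hypothetical model: count consecutive reads with A13 set
    ($2000-$3FFF) and fire once when the run reaches `threshold`.
    The counter saturates at `threshold`, so threshold=3 fits in 2 bits.
    """
    events = []
    run = 0
    for i, addr in enumerate(read_addrs):
        if addr & 0x2000:            # nametable or attribute fetch
            if run < threshold:
                run += 1
                if run == threshold:
                    events.append(i)
        else:                        # pattern table fetch resets the run
            run = 0
    return events
```

Feeding it two ordinary tile fetch groups, the two dummy nametable reads, and then the first fetches of the next line flags the scanline on the first nametable fetch of the new line with `threshold=3`, and one fetch later with `threshold=4`.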