CPU-test: cpu_exec_space [DONE]

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
by on (#91816)
In case anyone ever gets the urge to write "JSR $2001" in their NES code, this test verifies that everything works as it should.

Download test at: http://bisqwit.iki.fi/src/nes_tests/cpu_exec_space.zip
Code:
NES Memory Execution Tests
----------------------------------
These tests verify that the CPU can execute code from any possible
memory location, even if that is mapped as I/O space.

In addition, two obscure side effects are tested:

1. The PPU open bus. Any write to PPU will update the open bus.
   Reading from 2002 updates the high 3 bits. Reading from 2007
   updates 8 bits. The open bus is shown in any address/bit
   that the PPU does not write to. Read from 2000, you get open bus.
   Read from 2006, ditto. Read from 2002, you get that in low 5 bits.
   Additionally, the open bus decays automatically to zero in about one
   second if not refreshed.
   This test requires that a value written to $2003 can be read back
   from $2001 within a time window of one or two frames.

2. One-byte opcodes must issue a dummy read to the byte immediately
   following that opcode. The CPU always does a fetch of the second
   byte, before it has even begun executing the opcode in the first
   place.

Additionally, the following PPU features must be working properly:

1. PPU memory writes and reads through $2006/$2007
2. The address high/low toggle reset at $2002
3. A single write through $2006 must not affect the address used by $2007
4. NMI should fire sometimes to salvage a broken program, if the JSR/JMP
   never reaches its intended destination. (Only required in the
   test IF the CPU and/or open bus are not working properly.)

The test is done FIVE times: Once with JSR $2001, again with JMP $2001,
and then with RTS (with target address of $2001), and then with a JMP 
that expects to return with an RTI opcode. Finally, with a regular     
JSR, but the return from the code is done through a BRK instruction.   

Tests and results:

         #2: PPU memory access through $2007 does not work properly. (Use other tests to determine the exact problem.)
         #3: PPU open bus implementation is missing or incomplete: A write to $2003, followed by a read from $2001 should return the same value as was written.
         #4: The RTS at $2001 was never executed. (If NMI has not been implemented in the emulator, the symptom of this failure is that the program crashes and outputs neither "Fail" nor "Passed".)
         #5: An RTS opcode should still do a dummy fetch of the next opcode.  (The same goes for all one-byte opcodes, really.)
         #6: I have no idea what happened, but the test did not work as it was supposed to. In any case, the problem is in the PPU.
         #7: A jump to $2001 should never execute code from $8001 / $9001 / $A001 / $B001 / $C001 / $D001 / $E001.
         #8: Okay, the test passed when JSR was used, but NOT when the opcode was JMP. I definitely did not think any emulator would trigger this result.
         #9: Your PPU is broken in mind-defyingly random ways.
         #10: RTS to $2001 never returned. This message never gets displayed.
         #11: The test passed when JSR was used, and when JMP was used, but NOT when RTS was used. Caught ya! Paranoia wins.
         #12: Your PPU gave up reason at the last moment.
         #13: JMP to $2001 never returned. Again, this message never gets displayed.
         #14: An RTI opcode should still do a dummy fetch of the next opcode.  (The same goes for all one-byte opcodes, really.)
         #15: An RTI opcode should not destroy the PPU. Somehow that still appears to be the case here.
         #16: IRQ occurred uncalled
         #17: JSR to $2001 never returned. (Never displayed)
         #18: The BRK instruction should issue an automatic fetch of the byte that follows right after the BRK. (The same goes for all one-byte opcodes, but with BRK it should be a bit more obvious than with others.)
         #19: A BRK opcode should not destroy the PPU. Somehow that still appears to be the case here.



Expected output:
        TEST: test_cpu_exec_space
        This program verifies that the
        CPU can execute code from any
        possible location that it can
        address, including I/O space.

        In addition, it will be tested
        that an RTS instruction does a
        dummy read of the byte that   
        immediately follows the
        instructions.

        JSR test OK
        JMP test OK
        RTS test OK
        JMP+RTI test OK
        BRK test OK

        Passed
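
To illustrate the open-bus requirement from the readme (a value written to $2003 must be readable from $2001), here is a minimal sketch in Python. The class name, the `end_frame` hook and the 60-frame decay constant are assumptions made for illustration, not part of the test. Reads of $2002, $2004 and $2007 are deliberately left out, and the decay is simplified to clear the whole latch at once.

```python
class PpuOpenBus:
    """Toy model of the PPU open bus latch (hypothetical class, not from the test)."""

    DECAY_FRAMES = 60  # assumption: "about one second" at roughly 60 frames/s

    # Registers that the PPU never drives on a read.
    WRITE_ONLY = {0, 1, 3, 5, 6}

    def __init__(self):
        self.latch = 0x00
        self.frames_since_refresh = 0

    def write(self, addr, value):
        # Any write to $2000-$2007 refreshes all 8 bits of the latch.
        self.latch = value & 0xFF
        self.frames_since_refresh = 0

    def read(self, addr):
        reg = addr & 7
        if reg not in self.WRITE_ONLY:
            raise NotImplementedError("readable registers are not modelled here")
        # Nothing drives the bus, so the CPU sees the stale latch value.
        return self.latch

    def end_frame(self):
        # Simplification: the whole latch decays at once instead of bit by bit.
        self.frames_since_refresh += 1
        if self.frames_since_refresh >= self.DECAY_FRAMES:
            self.latch = 0x00
```

With this model, writing $60 (the RTS opcode) to $2003 and then fetching from $2001 within a frame or two yields $60, which is what lets "JSR $2001" find an RTS there.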

by on (#91817)
Many thanks, Bisqwit. With this test I discovered that I had forgotten to implement the dummy fetch in RTS. It's fixed, and now I pass the test without problems.

by on (#91819)
Nice. I updated the test and also added tests for the RTI and BRK opcodes.

I have no idea why, but Nintendulator fails #18.

by on (#91822)
Failed #5 here... :) Fixing...

EDIT: RTS, RTI, BRK... It's OK now. Anyway, it's very interesting that a readvalue(cpu->PC) makes a difference, instead of just clocking the PPU. The same "rule" probably applies to the NMI/IRQ/RESET interrupts.

by on (#91823)
Zepper wrote:
Anyway, it's very interesting that a readvalue(cpu->PC) makes a difference, instead of just clocking the PPU.

Only in such a contrived example as this. But technically, if the cartridge can monitor reads, it can also act on whether the CPU does this stuff or not.

Zepper wrote:
The same "rule" probably applies to the NMI/IRQ/RESET interrupts.

I was pondering the same. Basically, any and all instructions begin with two sequential fetches from the current PC location. (Which are actually performed at the end of the previous instruction.) It's just that the one-byte instructions discard the results of the second fetch, and do not increment PC (so the same byte is fetched twice).
As I understand it, when an NMI/IRQ/RESET occurs, the first byte already fetched will be re-interpreted as 0x00, but no extra fetches occur at the location of the current code.
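
A rough sketch of the two fetches described above, in Python. `LoggingBus` and `step_one_byte_opcode` are hypothetical names used only for illustration, and only the first two cycles of a one-byte instruction are modelled. The point is that the second read really reaches the bus, so it matters when the next address is a register with read side effects.

```python
class LoggingBus:
    """Hypothetical bus that records every address the CPU reads."""

    def __init__(self, memory):
        self.memory = memory  # dict: address -> byte
        self.reads = []

    def read(self, addr):
        self.reads.append(addr)
        return self.memory.get(addr, 0x00)


def step_one_byte_opcode(bus, pc):
    """Run the first two cycles of a one-byte instruction at pc.

    Returns (opcode, pc_after_fetch).
    """
    opcode = bus.read(pc)            # cycle 1: opcode fetch
    next_pc = (pc + 1) & 0xFFFF
    bus.read(next_pc)                # cycle 2: dummy read, result discarded
    return opcode, next_pc           # PC does not move past the dummy byte


bus = LoggingBus({0x2001: 0x60})     # pretend $2001 reads back as RTS
opcode, pc = step_one_byte_opcode(bus, 0x2001)
print([hex(a) for a in bus.reads])
```

With an RTS at $2001, the dummy read lands on $2002, so an emulator that merely burns a cycle instead of performing the read behaves differently from one that does the fetch.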

Now, to keep with the theme of this test, I should also try to execute APU I/O ports. It is a bit more difficult because, if I am right, the APU register space shares the same open bus that is used for instruction fetches and also for stack fetches. That means the last "valid" byte loaded before the jump to the APU registers is the one that is found in the APU register space as well. An RTS might work if RAM did not also use the same bus. With RAM on the same bus, the values read from the stack must also end in a $60, which is not true for $40xx addresses... But hmm, $40 is RTI. Maybe I can do this after all.
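
The reasoning above can be sketched as follows, assuming the CPU data bus simply keeps the last byte that was driven onto it. `CpuBus` and `fetch_after_jmp` are hypothetical names, and the ROM contents are made up for the example.

```python
class CpuBus:
    """Toy CPU data bus: unmapped reads return the last byte seen on the bus."""

    def __init__(self, rom):
        self.rom = rom          # dict: address -> byte (hypothetical backing store)
        self.last_value = 0x00

    def read(self, addr):
        if addr in self.rom:
            self.last_value = self.rom[addr]
        # Otherwise nothing drives the bus, so the previous value lingers.
        return self.last_value


def fetch_after_jmp(bus, pc):
    """Fetch 'JMP abs' at pc, then fetch the opcode found at the jump target."""
    if bus.read(pc) != 0x4C:
        raise ValueError("expected JMP absolute ($4C)")
    lo = bus.read(pc + 1)
    hi = bus.read(pc + 2)       # last driven byte: high byte of the target
    target = (hi << 8) | lo
    return target, bus.read(target)


rom = {0x8000: 0x4C, 0x8001: 0x18, 0x8002: 0x40}   # JMP $4018
target, opcode = fetch_after_jmp(CpuBus(rom), 0x8000)
print(hex(target), hex(opcode))
```

In this model the byte fetched at $4018 is $40, the high byte of the jump target, which happens to be the RTI opcode.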

by on (#91825)
All right, I added a test in which the APU space ($4010..$4017) is also tested (with the exception of $4015, which is a readable port), as well as the unallocated space from $4018 to $40FF. The link in the first post has both tests.

by on (#91826)
Passed. :)

by on (#91827)
Zepper wrote:
Passed. :)

When I enable the jmp to $4015, your emulator (and also puNES, and Nintendulator, and Nestopia) still passes the test even though they should not. I wonder how that is possible. For that to happen, $4015 would have to yield a $40 value, which happens when a frame IRQ is pending, but that is not the case.

by on (#91828)
OK, you probably know this info, but I'm double-checking my code.
The only way of reading bit 6 ($40) set is activating frame IRQ by writing to $4017:$40. Then, it's triggered after 29828 cycles, and for the next 3 ones.

Plus...
Code:
$4017 = $00 (frame irq enabled) [power up sequence]
APU mode in $4017 was unchanged [after RESET]

by on (#91830)
Zepper wrote:
OK, you probably know this info, but I'm double-checking my code.
The only way of reading bit 6 ($40) set is activating frame IRQ by writing to $4017:$40.

My bad. I thought I had APU IRQs disabled, but I only had the CPU's I flag set, which is why I was not noticing them.
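
A small sketch of why the I flag alone does not help, assuming bit 6 of $4017 acts as the frame IRQ inhibit (consistent with the "$4017 = $00 (frame irq enabled)" note above). The class and constant names are hypothetical, the 29828-cycle figure is the one quoted in this thread, and the sequencer is reduced to a single repeating period with no 5-step mode.

```python
FRAME_IRQ_CYCLE = 29828  # figure quoted in this thread; treated as an assumption


class FrameCounter:
    """Toy model of the APU frame IRQ flag as seen through $4015."""

    def __init__(self):
        self.irq_inhibit = False      # power-up $4017 = $00 leaves the IRQ enabled
        self.frame_irq_flag = False
        self.cycle = 0

    def write_4017(self, value):
        self.irq_inhibit = bool(value & 0x40)
        if self.irq_inhibit:
            self.frame_irq_flag = False
        self.cycle = 0

    def clock(self, cycles):
        # Simplification: one flag-setting point per period, nothing else modelled.
        self.cycle += cycles
        if self.cycle >= FRAME_IRQ_CYCLE:
            self.cycle %= FRAME_IRQ_CYCLE
            if not self.irq_inhibit:
                self.frame_irq_flag = True

    def read_4015(self):
        value = 0x40 if self.frame_irq_flag else 0x00
        self.frame_irq_flag = False   # reading $4015 clears the frame IRQ flag
        return value
```

Note that nothing in this model looks at the CPU's I flag: the flag in $4015 gets set whether or not the CPU is willing to service the interrupt, so a fetch from $4015 can still come back as $40.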

by on (#92194)
Just for the record, the dummy fetch should happen with all of the following opcodes (all are single byte).
Code:
                40 RTI  60 RTS
08 PHP  28 PLP  48 PHA  68 PLA
88 DEY  A8 TAY  C8 INY  E8 INX
18 CLC  38 SEC  58 CLI  78 SEI
98 TYA  B8 CLV  D8 CLD  F8 SED
0A ASL  2A ROL  4A LSR  6A ROR
8A TXA  AA TAX  CA DEX  EA NOP
1A*NOP  3A*NOP  5A*NOP  7A*NOP
9A TXS  BA TSX  DA*NOP  FA*NOP

With these opcodes, the first cycle of the opcode fetches the next byte, and the last cycle fetches the next opcode, which may or may not be the same byte as the one fetched earlier. If it is, the same byte is fetched twice.
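
For emulator authors who want the table above in code form, here is one possible encoding in Python. The names `ONE_BYTE_OPCODES` and `needs_dummy_fetch` are made up for this sketch. It relies on the pattern visible in the table: every opcode ending in 8 or A, plus RTI ($40) and RTS ($60).

```python
# Opcodes from the table above, all one byte long.
# Entries marked * in the table ($1A, $3A, $5A, $7A, $DA, $FA) are unofficial NOPs.
ONE_BYTE_OPCODES = frozenset(
    [0x40, 0x60]                                  # RTI, RTS
    + [(hi << 4) | 0x08 for hi in range(16)]      # 08, 18, 28, ... F8
    + [(hi << 4) | 0x0A for hi in range(16)]      # 0A, 1A, 2A, ... FA
)


def needs_dummy_fetch(opcode):
    """True if the opcode is in the one-byte table and so discards its second fetch."""
    return opcode in ONE_BYTE_OPCODES
```

BRK ($00) is left out here because it is technically a two-byte instruction, even though it also fetches the byte that follows it.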

Without a custom mapper, I currently know how to test only those opcodes that divert the program flow, which are RTI, RTS and BRK (though BRK is technically two-byte).

There might be a way to do it with precise NMI / IRQ timing, but I have not yet been able to create a testing framework that can reliably predict the occurrence of an NMI / IRQ with cycle accuracy even when the emulator that runs it does not obey a particular standard (read: hardcoded) PPU/APU timing. I am also not sure whether the NMI / IRQ is checked _after_ fetching the opcode byte anyway, which would make the test pointless. (I don't really know at which points exactly the NMI / IRQ is checked.)