First steps in writing an emulator
by janzdott on 2013-10-05 (#118963)

Hey guys, I'm new here. Glad I found the place

I'd say I'm a pretty good programmer. I've been programming for about 5 years. Started with Python, then JavaScript, then C#, and just recently I've done a few things in C++. I've been interested in emulators since I found out they existed. I downloaded an N64 emulator and started playing old N64 games from my childhood. The N64 was my favorite, and the nostalgia of playing those games again was great.

Afterwards, I thought it would be interesting to write an N64 emulator. I did some research for about an hour until I came to the conclusion that it would be ridiculously hard haha. So I decided I'd try emulating a simpler system. I did some research on the Chip-8, and programmed my own, all in one sitting. I learned a lot, but I like a challenge. The Chip-8 was far too easy, so yesterday I started an NES emulator. I've been doing a LOT of reading, and a bit of programming. After what I've read and programmed so far, I can say this is definitely more challenging!

I'm using C++. So far, I wrote a NES class that holds all the other components. I made a CPU class, and started a cpuCycle() function, which has a switch statement for the opcodes. I plan to write a function for each opcode, and a function for each addressing mode. That way, I'd have far fewer functions than if I wrote a function for each unique combination of opcode and addressing mode. That should work, right? I also wrote a Memory class, which doesn't do much besides have methods for reading and writing. I believe I understand mirroring, but it seems strange. When you write to an address in the CPU memory, it gets mirrored to the corresponding mirror addresses. Can I program this explicitly into the CPU Memory class? Can anything change how the CPU memory is mirrored? I also wrote a ROM loader. It reads the header, and loads the 1 or 2 program banks into the CPU PRG memory. I'm only going to worry about ROMs with 1 or 2 program banks, until I get it running.

So now, my NES reads the ROM and loads the program banks into memory. Now what? I'm a little confused. The program counter starts at 0, right? Which corresponds to the very first byte in the CPU memory? What gets loaded there when the ROM loads? Or does the program counter start at the lower PRG bank?

I know, it's quite a few questions. But I'm finding it hard to find answers to simple questions like these. Trust me, I've been doing a LOT of reading. I'd like to have the CPU memory set up first, so I can start programming the opcodes. That should be my next step, right? Or are there other things I have to get set up first? Could someone give me a basic step-by-step order I should program things in? Some help would be GREATLY appreciated. Thanks

EDIT: Also, does anyone know where I could find source code for a very simple and organized emulator? I looked at the Nintendulator source, and it's crazy. I'm a freak about commenting, organizing, and simplifying my code. It makes it very very easy for someone else to read and understand.

Re: First steps in writing an emulator
by Shiru on 2013-10-05 (#118965)

Take a look at SIDE and PIE.

A function per opcode and per addressing mode will work and can keep things clearly organized, but don't forget that it could hurt performance a lot as well, with many thousands of function calls per frame (so think about inlining, a call table, etc.).

Read/write handlers is the place to implement mirroring, so yes, Memory class is the place.

When you're writing an emulator, one of the very first things to do is to take a look at the memory map. If you do so, you'll realize the NES can't start from address 0, because there is RAM there. The answer is in the 6502 docs, though: read about the so-called 'vectors' (namely the reset vector). They tell the CPU where to start and where the interrupt handlers are located.

And as a kind of off-topic response: I've been a programmer for about 20 years now and have written a few emulators, but I still wouldn't say I'm a pretty good programmer.

Re: First steps in writing an emulator
by janzdott on 2013-10-05 (#118966)

Thank you. Yeah, I'm going to inline the functions. I'm not too worried about performance yet. I just want to get it up and running before I worry about optimization. So the reset vector tells the program counter where to start? If I have the program banks loaded into memory, and get memory mirroring set up, and set the program counter to the reset vector, can I start programming opcodes and letting the CPU cycle?

Re: First steps in writing an emulator
by Shiru on 2013-10-05 (#118967)

Yes, the program counter is loaded with the value from the reset vector at reset.

I'd say that writing and debugging a 6502 emulator and a NES emulator at once is a much more difficult thing to do than doing them separately, because you could make mistakes in either part without being sure where the mistake is. A better approach to start with would be to write just the 6502 emulator, with a very simple abstract system, like plain 64K RAM where you manually put some test code. Just keep your 6502 emulation code well separated from everything else, and you'll be able to both debug it in the test environment (when you need to catch a bug) and use it in the complete NES emulator environment.
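One way to get that separation is to have the CPU core talk only to a small bus interface. The following is a rough sketch, not anyone's actual emulator: the names Bus, FlatRamBus and Cpu are made up for illustration, and the core only knows two opcodes (LDA #imm and STA zp) with flag updates left out.

```cpp
#include <array>
#include <cstdint>

// Hypothetical interface: the CPU core only ever talks to this.
struct Bus {
    virtual ~Bus() = default;
    virtual std::uint8_t read(std::uint16_t addr) = 0;
    virtual void write(std::uint16_t addr, std::uint8_t value) = 0;
};

// Test system: plain 64K of RAM with no mirroring and no I/O registers.
struct FlatRamBus : Bus {
    std::array<std::uint8_t, 0x10000> ram{};
    std::uint8_t read(std::uint16_t addr) override { return ram[addr]; }
    void write(std::uint16_t addr, std::uint8_t value) override { ram[addr] = value; }
};

// Deliberately tiny CPU core, just enough to show the structure.
struct Cpu {
    explicit Cpu(Bus& b) : bus(b) {}

    // Executes one instruction; returns false for opcodes this sketch lacks.
    bool step() {
        const std::uint8_t opcode = bus.read(pc++);
        switch (opcode) {
        case 0xA9:  // LDA #imm (flag updates omitted in this sketch)
            a = bus.read(pc++);
            return true;
        case 0x85:  // STA zp
            bus.write(bus.read(pc++), a);
            return true;
        default:
            return false;
        }
    }

    Bus& bus;
    std::uint16_t pc = 0;
    std::uint8_t a = 0;
};
```

In a test you would poke a few bytes into FlatRamBus::ram, set pc, and call step(). A NES bus would be another class implementing the same interface, so the CPU code itself would not change.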

Re: First steps in writing an emulator
by tokumaru on 2013-10-05 (#118980)

janzdott wrote:

When you write to an address on the CPU memory, it gets mirrored to the corresponding mirror addresses. Can I program this explicitly into the CPU Memory class?

The best way to handle memory mirroring is doing it like the hardware does: partially decoding addresses. The reason $0000-$07FF is mirrored all the way up to $1FFF is because the NES only has 2KB of memory to fill a space of 8KB. Once the NES detects that $0000-$1FFF is being accessed (detecting $0000-$07FF would need more hardware, because it would have to watch more address lines), it uses the 11 bits it takes to address 2KB and ignores the remaining 2 address lines it would take to access 8KB. Mirroring is just a side effect of ignoring some address lines, because it's cheaper to do so. You can do the same thing in software.
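In software, "ignoring the upper address lines" comes down to a bit mask. A minimal sketch, assuming a hypothetical CpuRam class that is only asked about addresses in $0000-$1FFF:

```cpp
#include <array>
#include <cstdint>

// Hypothetical 2KB internal RAM. Only the low 11 address bits are decoded,
// so $0800, $1000 and $1800 all land on the same byte as $0000.
class CpuRam {
public:
    std::uint8_t read(std::uint16_t addr) const { return ram_[addr & 0x07FF]; }
    void write(std::uint16_t addr, std::uint8_t value) { ram_[addr & 0x07FF] = value; }

private:
    std::array<std::uint8_t, 0x0800> ram_{};
};
```

Nothing is copied to the mirror addresses. There is only one 2KB array, and every mirror is just another way of reaching it.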

Quote:

Can anything change how the CPU memory is mirrored?

Carts can map anything they want from $4020 to $FFFF. Even though this is not terribly common, it's certainly possible.

Quote:

I also wrote a ROM loader. It reads the header, and loads the 1 or 2 program banks into the CPU PRG memory. I'm only going to worry about ROMs with 1 or 2 program banks, until I get it running.

Keep in mind that the NES doesn't "load" anything, no data is copied anywhere. It instantaneously sees the ROM when it's powered on. Even when bankswitching is used nothing is ever copied anywhere, what happens is that the address lines are manipulated to make different sections of a larger memory chip visible in the small window that the CPU can see.
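An emulator can mimic this by keeping the whole PRG ROM in one buffer and computing an offset on every read, instead of copying banks around. This is only a sketch: BankedPrg is a made-up class with a single switchable 16KB window, which does not correspond to any particular mapper, and it assumes the ROM size is a non-zero multiple of 16KB.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical view of PRG ROM through one switchable 16KB window.
// Assumes rom.size() is a non-zero multiple of 16KB.
class BankedPrg {
public:
    explicit BankedPrg(std::vector<std::uint8_t> rom) : rom_(std::move(rom)) {}

    std::size_t bankCount() const { return rom_.size() / kBankSize; }

    // "Bankswitching": only the offset changes, no bytes are moved.
    void selectBank(std::size_t bank) { bank_ = bank % bankCount(); }

    // The low 14 bits of the CPU address pick the byte inside the bank.
    std::uint8_t read(std::uint16_t addr) const {
        return rom_[bank_ * kBankSize + (addr & 0x3FFF)];
    }

private:
    static constexpr std::size_t kBankSize = 0x4000;
    std::vector<std::uint8_t> rom_;
    std::size_t bank_ = 0;
};
```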

Quote:

The program counter starts at 0, right?

Nope. It starts at the address pointed to by the RESET vector. The 6502 looks for 3 special addresses at the end of the addressing space: $FFFA-$FFFF contain the addresses the CPU is supposed to jump to in case of an NMI ($FFFA), RESET ($FFFC) or IRQ ($FFFE). If the RESET vector points to $0000 then the CPU will try to execute code from there, but that wouldn't really work because $0000 is RAM, and right after power on its contents are undefined (NMI and IRQ can safely point to RAM if the program puts the code to handle these interrupts there though).
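Each vector is a 16-bit address stored low byte first. A small sketch of reading one, where FlatMemory and readVector are hypothetical names standing in for whatever memory class the emulator has:

```cpp
#include <cstdint>
#include <vector>

constexpr std::uint16_t kNmiVector   = 0xFFFA;
constexpr std::uint16_t kResetVector = 0xFFFC;
constexpr std::uint16_t kIrqVector   = 0xFFFE;

// Stand-in for the emulator's memory: a flat 64KB address space.
struct FlatMemory {
    std::vector<std::uint8_t> bytes = std::vector<std::uint8_t>(0x10000, 0);
    std::uint8_t read(std::uint16_t addr) const { return bytes[addr]; }
};

// Reads a little-endian 16-bit address from the given vector location.
inline std::uint16_t readVector(const FlatMemory& mem, std::uint16_t vector) {
    const std::uint8_t lo = mem.read(vector);
    const std::uint8_t hi = mem.read(static_cast<std::uint16_t>(vector + 1));
    return static_cast<std::uint16_t>(lo | (hi << 8));
}
```

At reset the emulated CPU would then do something like pc = readVector(mem, kResetVector).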

Quote:

What gets loaded there when the ROM loads?

Nothing, which means the CPU would try to execute a "random" sequence of undefined bytes as if they were code... that's a certain crash!

Quote:

I'd like to have the CPU memory set up first, so I can start programming the opcodes. That should be my next step, right? Or are there other things I have to get set up first? Could someone give me a basic step-by-step order I should program things in?

The CPU is a good place to start. Keep in mind that you must have the CPU, PPU and APU running in parallel at all times, each one doing their thing and interacting with each other, so you'll have to program these components in a way that they can either run little by little one at a time (often called "cycle-by-cycle", which is somewhat slow depending on the machine that's running the emulator), or predict when the next interaction with other components will be and emulate up until that point (this is significantly harder!).

Quote:

I looked at the Nintendulator source, and it's crazy. I'm a freak about commenting, organizing, and simplifying my code.

Nintendulator is one of the most accurate NES emulators out there, so it's no surprise that its source code is pretty complex. I'm not sure how simple you can keep things if you plan on making accurate emulators, because there are all sorts of little hardware quirks that make it impossible to solve problems with straightforward solutions. If you just want to get games running (as opposed to faithfully emulating all hardware aspects), things get easier and you can make use of game-specific hacks (this makes your emulator suck as a game development tool though, since it's specifically tailored for existing games). I believe most N64 emulators are like that though, since cycle-by-cycle emulation gets dramatically slower as the complexity of the system increases.

Re: First steps in writing an emulator
by janzdott on 2013-10-06 (#119030)

Thanks guys. That answered all my basic questions. So, for memory mirroring, the NES has to know when memory gets accessed, right? Would it work if I kept an array of function pointers that get called when the memory is read from or written to? And about the whole CPU testing... I found a website that had programs that test the CPU. It might have been this website actually, I don't remember. But that'll definitely be my next step. Thanks guys

Re: First steps in writing an emulator
by Dwedit on 2013-10-06 (#119033)

Also, this really should be moved to the NESemdev section...

Re: First steps in writing an emulator
by janzdott on 2013-10-06 (#119036)

Well it says NESemdev now, but someone must've moved it. Sorry! I'll post any further questions I have in this thread when they come up

Re: First steps in writing an emulator
by cpow on 2013-10-06 (#119037)

janzdott wrote:

Would it work if I kept an array of function pointers that get called when the memory is read from or written to?

Yep that would work.
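A rough sketch of that idea, with one handler pair per 256-byte page. HandlerBus and mapRange are invented names, std::function is used instead of raw function pointers so that handlers can capture state, and unmapped pages simply read as 0, which is a simplification rather than real open-bus behavior.

```cpp
#include <array>
#include <cstdint>
#include <functional>

// Hypothetical bus that dispatches each access through a per-page handler.
class HandlerBus {
public:
    using ReadFn  = std::function<std::uint8_t(std::uint16_t)>;
    using WriteFn = std::function<void(std::uint16_t, std::uint8_t)>;

    HandlerBus() {
        // Default handlers: reads return 0, writes are ignored (a simplification).
        mapRange(0x00, 0xFF,
                 [](std::uint16_t) { return std::uint8_t{0}; },
                 [](std::uint16_t, std::uint8_t) {});
    }

    // Installs handlers for pages firstPage..lastPage (a page is 256 bytes).
    void mapRange(unsigned firstPage, unsigned lastPage, ReadFn r, WriteFn w) {
        for (unsigned page = firstPage; page <= lastPage && page < 256; ++page) {
            readers_[page] = r;
            writers_[page] = w;
        }
    }

    std::uint8_t read(std::uint16_t addr) { return readers_[addr >> 8](addr); }
    void write(std::uint16_t addr, std::uint8_t value) { writers_[addr >> 8](addr, value); }

private:
    std::array<ReadFn, 256> readers_;
    std::array<WriteFn, 256> writers_;
};
```

The RAM handlers would cover pages $00-$1F and mask the address with $07FF, the PPU register handlers would cover $20-$3F, and so on.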

janzdott wrote:

And about the whole CPU testing... I found a website that had programs that test the CPU. It might have been this website actually, I don't remember. But that'll definitely be my next step. Thanks guys

I gathered oodles of them and put them up on GitHub here.

Re: First steps in writing an emulator
by ulfalizer on 2013-10-07 (#119052)

For what it's worth, I do "slow" CPU emulation with interrupt polling in each instruction and no prediction, and CPU emulation accounts for about 4-5% of the runtime in my emulator (out of a total of using about 40% of one core on my two-year-old Core-i7 2600K). Most of that is in the read() and write() routines. My PPU code is very slow due to rendering pixel-for-pixel and doing sprite/bg pixel selection and sprite evaluation like the real PPU.

The point is that you should know your range of target systems before optimizing, and not optimize parts of your program prematurely on a guess that they'll be significant. For modern desktop systems, you can usually get away with doing the straightforward thing in an NES emulator.

Re: First steps in writing an emulator
by ulfalizer on 2013-10-07 (#119054)

When it comes to inlining, one thing you should definitely do is whole-program/link-time optimization. When you're calling lots of functions per tick, letting the compiler inline across compilation units can help a lot.

Re: First steps in writing an emulator
by janzdott on 2013-10-07 (#119072)

ulfalizer wrote:

For what it's worth, I do "slow" CPU emulation with interrupt polling in each instruction and no prediction

Hmm, I was going to write an interrupt function that just sets the program counter whenever it's called. How did you set it up to check for interrupts every cycle? I would imagine that's how the CPU actually handles interrupts, but I'm not very knowledgeable about this stuff... yet :wink:

Though I do have about half of my opcodes written now. But I can't test it until they're all done. It's probably gonna be riddled with bugs. I'll stay up late working on it, and I'll find out tonight!

Oh, and one more dumb question to clarify something haha. Each memory read/write uses one cycle, right? I've been looking at a table that shows how many cycles each opcode uses. Do I just waste dummy cycle(s) for the PPU and APU to catch up when the number of cycles in my opcode doesn't match the table? I'm a little confused by this, and obviously keeping the timing correct is very important.

Re: First steps in writing an emulator
by ulfalizer on 2013-10-07 (#119074)

janzdott wrote:

ulfalizer wrote:

For what it's worth, I do "slow" CPU emulation with interrupt polling in each instruction and no prediction

The CPU checks for interrupts each instruction. The precise details are at http://wiki.nesdev.com/w/index.php/CPU_interrupts, though it's overkill to get the timing exactly right when starting out (you could just check for pending interrupts between instructions). In addition to setting the program counter, an interrupt also saves the old program counter and the flags on the stack.
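A simplified sketch of the "check between instructions" approach. MiniCpu and its members are hypothetical names, the exact polling timing from the wiki page is ignored, and the B flag and bit 5 handling follows the usual description of how the status byte is pushed for a hardware interrupt (B clear, bit 5 set).

```cpp
#include <array>
#include <cstdint>

// Hypothetical minimal CPU state, just enough to show interrupt entry.
struct MiniCpu {
    static constexpr std::uint8_t kFlagI = 0x04;       // interrupt disable
    static constexpr std::uint8_t kFlagB = 0x10;       // "break" bit, only exists on the stack
    static constexpr std::uint8_t kFlagUnused = 0x20;  // always pushed as 1

    std::array<std::uint8_t, 0x10000> mem{};
    std::uint16_t pc = 0;
    std::uint8_t sp = 0xFD;
    std::uint8_t p = 0x24;
    bool nmiPending = false;  // set by the PPU side at the start of VBlank
    bool irqLine = false;     // level-sensitive IRQ input

    void push(std::uint8_t value) { mem[0x0100 + sp] = value; --sp; }

    // Saves the return address and flags, then jumps through the vector.
    void serviceInterrupt(std::uint16_t vector) {
        push(static_cast<std::uint8_t>(pc >> 8));
        push(static_cast<std::uint8_t>(pc & 0xFF));
        push(static_cast<std::uint8_t>((p & ~kFlagB) | kFlagUnused));
        p |= kFlagI;
        pc = static_cast<std::uint16_t>(mem[vector] | (mem[vector + 1] << 8));
    }

    // Call once between instructions; returns true if an interrupt was taken.
    bool pollInterrupts() {
        if (nmiPending) { nmiPending = false; serviceInterrupt(0xFFFA); return true; }
        if (irqLine && !(p & kFlagI)) { serviceInterrupt(0xFFFE); return true; }
        return false;
    }
};
```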

janzdott wrote:

Oh, and one more dumb question to clarify something haha. Each memory read/write uses one cycle, right? I've been looking at a table that shows how many cycles each opcode uses. Do I just waste dummy cycle(s) for the PPU and APU to catch up when the number of cycles in my opcode doesn't match the table? I'm a little confused by this, and obviously keeping the timing correct is very important.

Yup, each read/write is one cycle. In fact, every cycle executed by the 6502 is either a read or a write cycle, with some being dummy reads/writes that don't do any useful work. Those can still be significant for some games due to reads/writes to certain addresses having side effects though - see Cobra Triangle in http://wiki.nesdev.com/w/index.php/Tric ... late_games.

You can look in http://nesdev.com/6502_cpu.txt to see what reads/writes are done for different instructions. Implementing the instructions like in that doc is feasible, and makes the timing work out "automagically" without tables. You can also factor out the fetch of the opcode and the byte after that, since all instructions do it.
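A sketch of what "automagic" timing can look like: if every bus access advances a cycle counter, an instruction written as its sequence of reads and writes takes the right number of cycles without a table. TickingBus and incZeroPage are hypothetical names, and flag updates are left out. The access sequence for a zero-page read-modify-write instruction (opcode fetch, address fetch, read, dummy write of the old value, write of the new value) follows the charts in that doc.

```cpp
#include <array>
#include <cstdint>

// Hypothetical bus where every access costs exactly one CPU cycle.
struct TickingBus {
    std::array<std::uint8_t, 0x10000> mem{};
    std::uint64_t cpuCycles = 0;

    // A real emulator would also advance the PPU and APU here.
    void tick() { ++cpuCycles; }

    std::uint8_t read(std::uint16_t addr) { tick(); return mem[addr]; }
    void write(std::uint16_t addr, std::uint8_t value) { tick(); mem[addr] = value; }
};

// INC zp written as its five bus accesses (N and Z flag updates omitted).
inline void incZeroPage(TickingBus& bus, std::uint16_t& pc) {
    bus.read(pc++);                             // cycle 1: opcode fetch
    const std::uint8_t zp = bus.read(pc++);     // cycle 2: zero-page address
    const std::uint8_t value = bus.read(zp);    // cycle 3: read old value
    bus.write(zp, value);                       // cycle 4: dummy write of old value
    bus.write(zp, static_cast<std::uint8_t>(value + 1));  // cycle 5: write new value
}
```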

Re: First steps in writing an emulator
by janzdott on 2013-10-07 (#119082)

ulfalizer wrote:

You can look in http://nesdev.com/6502_cpu.txt to see what reads/writes are done for different instructions. Implementing the instructions like in that doc is feasible, and makes the timing work out "automagically" without tables. You can also factor out the fetch of the opcode and the byte after that, since all instructions do it.

Thanks, that's a pretty helpful page. I hadn't read that one before. But I'm a little confused. It says, "The processors also use a sort of pipelining. If an instruction does not store data in memory on its last cycle, the processor can fetch the opcode of the next instruction while executing the last cycle." So only the opcodes that don't store data in memory on the last cycle do that, or do all of them? Is it necessary to emulate that behavior?

And by the way, thanks for helping guys. I wasn't expecting this forum to be as active as it is! Time to get crackin' on the keyboard and load up a test program and get this CPU up and running tonight :twisted:

I'm expecting bugs up the wazzoo though haha

Re: First steps in writing an emulator
by janzdott on 2013-10-08 (#119097)

Well I tried out a CPU test ROM, and had a hard time getting it to work, but there was a problem with my ROM loading. I got it working now, and it executes a lot of code until it hits a loop. I'm not sure if that's intended or not. I do have 4 opcodes that aren't implemented yet. I wrote a function that dumps the memory to a text file. My next step is to write one that dumps the opcode and all the registers for each cycle, so I can compare it to the correct log file and see what problems I have to fix.

Re: First steps in writing an emulator
by ulfalizer on 2013-10-08 (#119099)

janzdott wrote:

ulfalizer wrote:

It's invisible above the hardware level and does not need to be emulated. The most useful part of that doc is the timing charts near the end.

In case you're interested though, it has to do with the 6502 being able to overlap the final cycles of one instruction with the opcode and operand fetch for the next instruction. This is possible for instructions that change state internally within the CPU during their final cycles and don't need to access memory. For example, LDA #constant really takes three cycles internally rather than two, but the move of the fetched value (from an internal register) into the A register is overlapped with the opcode fetch for the next instruction, making it effectively a two-cycle instruction. From an emulation standpoint it's not important exactly when A is set though, as long as it happens before the next instruction can use the value, so it's easier to just set A before fetching the next opcode.

Re: First steps in writing an emulator
by tokumaru on 2013-10-08 (#119101)

janzdott wrote:

I got it working now, and it executes a lot of code until it hits a loop.

The PPU needs some time to "warm up" and become usable (roughly a frame), so programs often have loops to wait a couple of VBlanks before using the PPU. You will remain stuck in these loops until you implement (or fake) the VBlank flag in register $2002 (PPUSTATUS).
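A minimal way to fake it before a real PPU exists, assuming a hypothetical FakePpuStatus object that the $2002 read handler calls into. Bit 7 of PPUSTATUS is the VBlank flag, and reading the register clears it.

```cpp
#include <cstdint>

// Hypothetical stand-in for the PPU that only models the VBlank flag.
struct FakePpuStatus {
    bool vblank = false;

    // Call once per emulated frame, e.g. every ~29780 CPU cycles on NTSC.
    void enterVBlank() { vblank = true; }

    // Handler for CPU reads of $2002: bit 7 reports VBlank, and the read clears it.
    std::uint8_t readStatus() {
        const std::uint8_t value = vblank ? 0x80 : 0x00;
        vblank = false;
        return value;
    }
};
```

With this in place a typical "wait for VBlank" loop (BIT $2002 / BPL) will fall through once per fake frame.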

Re: First steps in writing an emulator
by janzdott on 2013-10-08 (#119110)

@ulfalizer So by overlapping, you mean they happen at the same time? There's so many quirks that make this stuff hard to understand. I'm definitely gonna need to keep reading up on this stuff.

@tokumaru I checked the correct log file and compared it to mine. There's a bug in an opcode somewhere, but I'm not sure where yet. It's gonna help when I have the CPU log the opcodes and registers for each cycle in a file.

Re: First steps in writing an emulator
by ulfalizer on 2013-10-08 (#119115)

janzdott wrote:

@ulfalizer So by overlapping, you mean they happen at the same time? There's so many quirks that make this stuff hard to understand. I'm definitely gonna need to keep reading up on this stuff.

Yup, the CPU carries out the final cycles of some instructions at the same time that it fetches the next instruction (a simple form of pipelining). Another example is ADC #immediate, which seems to actually be a four-cycle instruction internally, but overlaps the last two cycles with fetches for the next instruction (the opcode byte and the byte after that). The reason this is safe is that no other instruction looks at the value of A within the first two cycles.

This is just trivia though, and not something you will need to be aware of when writing an emulator (I learned how it works pretty recently). You can pretend that A holds the sum after the second cycle of ADC #immediate, and that's what all emulators do in practice.

Re: First steps in writing an emulator
by janzdott on 2013-10-09 (#119158)

Well, it's good I don't have to worry about emulating that then.

I spent a lot of time on a class for logging CPU instructions. It can be enabled or disabled, because I suspect it'll affect performance down the road. For each instruction executed, the program counter, instruction, and register values are pushed to a circular buffer of a fixed length. The current buffer index is kept track of, so when I dump to a file, it writes them in the correct order.

I used a circular buffer so I could limit the number of instructions that are dumped to the file. If my emulator is running for 10 minutes and I want to dump the instructions to a file, it won't dump 100,000,000 instructions. Instead it'll only dump the 1000 most recent ones. It works great, and outputs to a nice format like this...
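For anyone wanting to do something similar, here is a rough sketch of such a buffer. It is not my actual class: TraceEntry and TraceRing are illustrative names, and the file dump is replaced by a snapshot() that returns the entries oldest first.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// One logged instruction (illustrative fields).
struct TraceEntry {
    std::uint16_t pc;
    std::uint8_t opcode, a, x, y, p, sp;
};

// Fixed-size ring that keeps only the N most recent entries.
template <std::size_t N>
class TraceRing {
public:
    void push(const TraceEntry& entry) {
        entries_[next_] = entry;          // overwrite the oldest slot when full
        next_ = (next_ + 1) % N;
        if (count_ < N) ++count_;
    }

    // Returns the stored entries in execution order, oldest first.
    std::vector<TraceEntry> snapshot() const {
        std::vector<TraceEntry> out;
        out.reserve(count_);
        const std::size_t start = (next_ + N - count_) % N;
        for (std::size_t i = 0; i < count_; ++i) {
            out.push_back(entries_[(start + i) % N]);
        }
        return out;
    }

private:
    std::array<TraceEntry, N> entries_{};
    std::size_t next_ = 0;
    std::size_t count_ = 0;
};
```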

Code:

C000 4C F5 C5 JMP $C5F5 A:00 X:00 Y:00 P:24 S:FD

It was a pain to convert the bytes into text that shows correct assembly instructions, but I felt it was necessary. I'm running nestest and looking at my log file. It seems to be executing most instructions correctly (surprisingly). Though, it does go off track at some point. I'll have to spend a lot of time looking through the log files to find where the problems are. I'm excited though. Things are coming along well.

Re: First steps in writing an emulator
by 3gengames on 2013-10-09 (#119159)

Why in the world would you not output the same format as the other logs, compare them, and fix the ones that go off? It's not that hard at all to get CPU running, honestly, it's cake compared to PPU and audio.

Re: First steps in writing an emulator
by janzdott on 2013-10-09 (#119160)

3gengames wrote:

Isn't my format the same as the others? I'm comparing it to a log from Nintendulator and it's the same, except I left out the cycle number, and one other number which I don't know the meaning of. But I decided I'm going to write a program to compare the logs for me, and just give me the first address where they differ. I would rather spend my time fixing bugs than reading through log files haha

Re: First steps in writing an emulator
by 3gengames on 2013-10-09 (#119161)

You won't find the bugs if you don't compare the logs though. See how this is a needed step? I mean, I haven't written an emu, but I've written enough programs to know how to design them.

Re: First steps in writing an emulator
by tepples on 2013-10-09 (#119163)

When I was verifying a Python 6502 simulator against nestest, I just read register values, PC, and CYC from each line of the Nintendulator log and compared those. That way I don't need to worry about disassembly.
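The same approach sketched in C++, since that is what the thread is using. This is not the Python code mentioned above: RegSnapshot, parseHexField and parseLogLine are hypothetical names, the field tags are taken from the Nintendulator log lines posted in this thread, and CYC is left out to keep the sketch short.

```cpp
#include <cstddef>
#include <exception>
#include <string>

// Registers pulled out of one log line.
struct RegSnapshot {
    unsigned pc = 0, a = 0, x = 0, y = 0, p = 0, sp = 0;
    bool operator==(const RegSnapshot& o) const {
        return pc == o.pc && a == o.a && x == o.x && y == o.y && p == o.p && sp == o.sp;
    }
};

// Finds the last occurrence of tag (e.g. " A:") and parses the two hex digits after it.
inline bool parseHexField(const std::string& line, const std::string& tag, unsigned& out) {
    const std::size_t pos = line.rfind(tag);
    if (pos == std::string::npos) return false;
    try {
        out = static_cast<unsigned>(std::stoul(line.substr(pos + tag.size(), 2), nullptr, 16));
    } catch (const std::exception&) {
        return false;
    }
    return true;
}

// Parses PC (first four characters) and the register fields; ignores the disassembly.
inline bool parseLogLine(const std::string& line, RegSnapshot& out) {
    if (line.size() < 4) return false;
    try {
        out.pc = static_cast<unsigned>(std::stoul(line.substr(0, 4), nullptr, 16));
    } catch (const std::exception&) {
        return false;
    }
    return parseHexField(line, " A:", out.a) && parseHexField(line, " X:", out.x) &&
           parseHexField(line, " Y:", out.y) && parseHexField(line, " P:", out.p) &&
           parseHexField(line, "SP:", out.sp);
}
```

A comparison tool would parse both logs line by line and stop at the first pair of snapshots that are not equal.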

Re: First steps in writing an emulator
by janzdott on 2013-10-09 (#119164)

3gengames wrote:

You won't find the bugs if you don't compare the logs though. See how this is a needed step? I mean, I haven't written an emu, but I've written enough programs to know how to design them.

But I AM comparing the logs. I just said I was writing a program to do it for me, so I don't have to read through the files by hand. My computer reads much faster than I do :wink:

tepples wrote:

^^This is exactly what I'm doing, except I added disassembly for readability

Re: First steps in writing an emulator
by janzdott on 2013-10-10 (#119169)

I'm utterly stuck

Could someone with their own emulator PLEASE help me out? I've been trying, and there's no way for me to find the source of my problem. This is from Nintendulator's log of nestest.

Code:

CFBF  85 FF     STA $FF = 00                    A:00 X:55 Y:69 P:27 SP:FB CYC: 16 SL:1
CFC1  A9 04     LDA #$04                        A:00 X:55 Y:69 P:27 SP:FB CYC: 25 SL:1
CFC3  85 00     STA $00 = 00                    A:04 X:55 Y:69 P:25 SP:FB CYC: 31 SL:1
CFC5  A9 5A     LDA #$5A                        A:04 X:55 Y:69 P:25 SP:FB CYC: 40 SL:1
CFC7  8D 00 02  STA $0200 = 00                  A:5A X:55 Y:69 P:25 SP:FB CYC: 46 SL:1
CFCA  A9 5B     LDA #$5B                        A:5A X:55 Y:69 P:25 SP:FB CYC: 58 SL:1
CFCC  8D 00 03  STA $0300 = 00                  A:5B X:55 Y:69 P:25 SP:FB CYC: 64 SL:1
CFCF  A9 5C     LDA #$5C                        A:5B X:55 Y:69 P:25 SP:FB CYC: 76 SL:1
CFD1  8D 03 03  STA $0303 = 00                  A:5C X:55 Y:69 P:25 SP:FB CYC: 82 SL:1
CFD4  A9 5D     LDA #$5D                        A:5C X:55 Y:69 P:25 SP:FB CYC: 94 SL:1
CFD6  8D 00 04  STA $0400 = 00                  A:5D X:55 Y:69 P:25 SP:FB CYC:100 SL:1
CFD9  A2 00     LDX #$00                        A:5D X:55 Y:69 P:25 SP:FB CYC:112 SL:1
CFDB  A1 80     LDA ($80,X) @ 80 = 0200 = 5A    A:5D X:00 Y:69 P:27 SP:FB CYC:118 SL:1
CFDD  C9 5A     CMP #$5A                        A:5A X:00 Y:69 P:25 SP:FB CYC:136 SL:1
CFDF  D0 1F     BNE $D000                       A:5A X:00 Y:69 P:27 SP:FB CYC:142 SL:1
CFE1  E8        INX                             A:5A X:00 Y:69 P:27 SP:FB CYC:148 SL:1
CFE2  E8        INX                             A:5A X:01 Y:69 P:25 SP:FB CYC:154 SL:1
CFE3  A1 80     LDA ($80,X) @ 82 = 0300 = 5B    A:5A X:02 Y:69 P:25 SP:FB CYC:160 SL:1
CFE5  C9 5B     CMP #$5B                        A:5B X:02 Y:69 P:25 SP:FB CYC:178 SL:1
CFE7  D0 17     BNE $D000                       A:5B X:02 Y:69 P:27 SP:FB CYC:184 SL:1
CFE9  E8        INX                             A:5B X:02 Y:69 P:27 SP:FB CYC:190 SL:1
CFEA  A1 80     LDA ($80,X) @ 83 = 0303 = 5C    A:5B X:03 Y:69 P:25 SP:FB CYC:196 SL:1
CFEC  C9 5C     CMP #$5C                        A:5C X:03 Y:69 P:25 SP:FB CYC:214 SL:1
CFEE  D0 10     BNE $D000                       A:5C X:03 Y:69 P:27 SP:FB CYC:220 SL:1
CFF0  A2 00     LDX #$00                        A:5C X:03 Y:69 P:27 SP:FB CYC:226 SL:1
CFF2  A1 FF     LDA ($FF,X) @ FF = 0400 = 5D    A:5C X:00 Y:69 P:27 SP:FB CYC:232 SL:1
CFF4  C9 5D     CMP #$5D                        A:5D X:00 Y:69 P:25 SP:FB CYC:250 SL:1

The instruction at CFBF stores the value 00 at the address FF. In my emulator, this is the ONLY time this address is written to, up until my problem. My problem is the instruction at CFF2. X = 0, so this instruction looks up an address at FF and 0100. The thing is, FF is 0, and 0100 is never written to in my emulator. If both of them were 00, that points to address 0000, which is 04. That's the number my emulator loads into the accumulator. It's not right, because Nintendulator's log says it should be 5D.

I've tried and tried, and there's no way for me to find the source of the problem. It could be an instruction writing the wrong value to an address, or it could be an instruction writing a value to the wrong address. Since it's a problem with an instruction that uses indirect x addressing, that makes it even harder to find the problem. I've tracked the memory reads and writes, and that didn't help. Basically I need someone with a working emulator to help me. If someone could just tell me one thing...

1. Which instructions write to the addresses 0000, 00FF, and 0100?

If someone could just add a check for writes to those addresses before CFF4, that would help me SO MUCH. I'm horribly stuck and I don't know how to find the source of the problem

EDIT: I just thought of something... Does indirect addressing wrap around back to 00 when you go over FF? If so, that could be my problem. I'll have to try it out tomorrow because I'm exhausted. I'll let you guys know when I find out.

Re: First steps in writing an emulator
by tepples on 2013-10-10 (#119170)

Zero page indexing wraps within zero page. ($80,X) with X=$7F reads the low byte from $00FF and the high byte from $0000.
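In code, the wrap falls out naturally if the pointer arithmetic is done in 8 bits. A sketch, with Memory and indexedIndirectAddress as hypothetical names:

```cpp
#include <array>
#include <cstdint>

using Memory = std::array<std::uint8_t, 0x10000>;

// Effective address for (zp,X) addressing. Both the indexed pointer and the
// fetch of its high byte stay inside zero page.
inline std::uint16_t indexedIndirectAddress(const Memory& mem, std::uint8_t operand,
                                            std::uint8_t x) {
    const std::uint8_t ptr = static_cast<std::uint8_t>(operand + x);  // wraps at $FF
    const std::uint8_t lo = mem[ptr];
    const std::uint8_t hi = mem[static_cast<std::uint8_t>(ptr + 1)];  // $FF + 1 -> $00
    return static_cast<std::uint16_t>(lo | (hi << 8));
}
```

With $00 stored at $00FF and $04 stored at $0000, as in the test above, both ($FF,X) with X=$00 and ($80,X) with X=$7F resolve to $0400.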

Two of the writes that set up this test are at $CFBF and $CFC3.

Re: First steps in writing an emulator
by janzdott on 2013-10-10 (#119174)

tepples wrote:

Zero page indexing wraps within zero page. ($80,X) with X=$7F reads the low byte from $00FF and the high byte from $0000.

Two of the writes that set up this test are at $CFBF and $CFC3.

Sure enough, that was the problem

I've made it almost halfway through the test, and now I'm stuck on another one which should be simple, but doesn't seem to make any sense at all. My problem is at DBB5, the instruction JMP ($02FF). The value at the address 02FF is A900. I know this is correct, because that's the value in my emulator, and the Nintendulator log conveniently shows the value after the instruction like so, "JMP ($02FF) = A900".

Correct me if I'm wrong, but doesn't that mean jump to A900? My emulator correctly executes an indirect jump at DB7B just before this... I know my indirect addressing is working correctly, and my emulator jumps to A900. The Nintendulator log shows that it should jump to 0300. I don't understand...

Code:

DB7B  6C 00 02  JMP ($0200) = DB7E              A:DB X:07 Y:00 P:E5 SP:FB CYC:326 SL:62
DB7E  A9 00     LDA #$00                        A:DB X:07 Y:00 P:E5 SP:FB CYC:  0 SL:63
DB80  8D FF 02  STA $02FF = 00                  A:00 X:07 Y:00 P:67 SP:FB CYC:  6 SL:63
DB83  A9 01     LDA #$01                        A:00 X:07 Y:00 P:67 SP:FB CYC: 18 SL:63
DB85  8D 00 03  STA $0300 = 89                  A:01 X:07 Y:00 P:65 SP:FB CYC: 24 SL:63
DB88  A9 03     LDA #$03                        A:01 X:07 Y:00 P:65 SP:FB CYC: 36 SL:63
DB8A  8D 00 02  STA $0200 = 7E                  A:03 X:07 Y:00 P:65 SP:FB CYC: 42 SL:63
DB8D  A9 A9     LDA #$A9                        A:03 X:07 Y:00 P:65 SP:FB CYC: 54 SL:63
DB8F  8D 00 01  STA $0100 = 00                  A:A9 X:07 Y:00 P:E5 SP:FB CYC: 60 SL:63
DB92  A9 55     LDA #$55                        A:A9 X:07 Y:00 P:E5 SP:FB CYC: 72 SL:63
DB94  8D 01 01  STA $0101 = 00                  A:55 X:07 Y:00 P:65 SP:FB CYC: 78 SL:63
DB97  A9 60     LDA #$60                        A:55 X:07 Y:00 P:65 SP:FB CYC: 90 SL:63
DB99  8D 02 01  STA $0102 = 00                  A:60 X:07 Y:00 P:65 SP:FB CYC: 96 SL:63
DB9C  A9 A9     LDA #$A9                        A:60 X:07 Y:00 P:65 SP:FB CYC:108 SL:63
DB9E  8D 00 03  STA $0300 = 01                  A:A9 X:07 Y:00 P:E5 SP:FB CYC:114 SL:63
DBA1  A9 AA     LDA #$AA                        A:A9 X:07 Y:00 P:E5 SP:FB CYC:126 SL:63
DBA3  8D 01 03  STA $0301 = 00                  A:AA X:07 Y:00 P:E5 SP:FB CYC:132 SL:63
DBA6  A9 60     LDA #$60                        A:AA X:07 Y:00 P:E5 SP:FB CYC:144 SL:63
DBA8  8D 02 03  STA $0302 = 00                  A:60 X:07 Y:00 P:65 SP:FB CYC:150 SL:63
DBAB  20 B5 DB  JSR $DBB5                       A:60 X:07 Y:00 P:65 SP:FB CYC:162 SL:63
DBB5  6C FF 02  JMP ($02FF) = A900              A:60 X:07 Y:00 P:65 SP:F9 CYC:180 SL:63
0300  A9 AA     LDA #$AA                        A:60 X:07 Y:00 P:65 SP:F9 CYC:195 SL:63

My emulator correctly executes DB7B "JMP ($0200) = DB7E" by jumping to DB7E. At DBB5, my emulator executes "JMP ($02FF) = A900" and jumps to A900 instead of 0300 like the Nintendulator log says it should. I'm absolutely clueless here. Is the Nintendulator log correct? There has to be an explanation for this...

EDIT: The guys on EFnet explained the "hardware quirk" with the indirect jump. I've gotta say, it amazes me how active and helpful the nesdev community is. Thank you guys for helping out a noob like me
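The post above doesn't spell the quirk out. The one that matches this log is the well-documented 6502 indirect JMP page-wrap behavior: when the pointer's low byte is $FF, the high byte of the target is fetched from the start of the same page instead of from the next page. A sketch with made-up names (Memory, jmpIndirectTarget):

```cpp
#include <array>
#include <cstdint>

using Memory = std::array<std::uint8_t, 0x10000>;

// Target of JMP (ptr). The pointer's high byte is never incremented, so
// JMP ($02FF) reads its low byte from $02FF and its high byte from $0200.
inline std::uint16_t jmpIndirectTarget(const Memory& mem, std::uint16_t ptr) {
    const std::uint16_t hiAddr =
        static_cast<std::uint16_t>((ptr & 0xFF00) | ((ptr + 1) & 0x00FF));
    return static_cast<std::uint16_t>(mem[ptr] | (mem[hiAddr] << 8));
}
```

In the log above, $02FF holds $00, $0200 holds $03 and $0300 holds $A9, so a naive 16-bit read gives $A900 while the hardware jumps to $0300.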

Re: First steps in writing an emulator
by janzdott on 2013-10-10 (#119177)

A little after halfway through nestest, I hit some weird NOPs that I wasn't expecting (opcodes 04, 44, and 64). Those are unofficial opcodes, right? I wasn't planning on implementing unofficial ones until my emulator is running. If I'm hitting those, is it safe to say my official ones are working and I can move on?

Re: First steps in writing an emulator
by tepples on 2013-10-10 (#119180)

There are both homebrew and commercial games that use unofficial opcodes. Among them are Puzznic, Super Cars, Driar, and STREEMERZ: Super Strength Emergency Squad Zeta. They're used because they're useful.

Re: First steps in writing an emulator
by janzdott on 2013-10-10 (#119182)

^^I know that home made games use them, but no official games do, right? (EDIT: I totally missed the part where you said commercial games do... I didn't know that.) I'm just starting out with official mapper 0 games like Super Mario Bros 1. I'll implement unofficial opcodes once my emulator is actually up and running. Is Super Mario Bros 1 an alright ROM to start with? My CPU runs the title screen just fine as far as I can tell. I know enough OpenGL, so I can create a window and get the PPU working with just the title screen first. At least, that's what I was planning on doing. Here's what I think my next steps are...

- Start a PPU class with registers
- Create memory callbacks from 2000 to 3FFF to emulate mirroring, and redirect reads/writes to PPU registers
- Create PPU memory class (inherits from my generic memory class)
- Create memory callbacks for PPU class for mirroring
- Start working on the PPU
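
The register mirroring in the second step can be sketched as a decode on the low three address bits, since the eight PPU registers repeat every 8 bytes across $2000-$3FFF. The `PpuReg` enum and the `decodePpuRegister` name are illustrative assumptions; the enumerator order follows the $2000-$2007 register layout.

```cpp
#include <cassert>
#include <cstdint>

// One enumerator per PPU register, in address order ($2000 through $2007).
enum class PpuReg { Ctrl, Mask, Status, OamAddr, OamData, Scroll, Addr, Data };

// Map any CPU address in $2000-$3FFF onto the register it mirrors.
PpuReg decodePpuRegister(std::uint16_t addr) {
    assert(addr >= 0x2000 && addr <= 0x3FFF);
    return static_cast<PpuReg>(addr & 0x0007);
}
```

A read/write callback for the whole $2000-$3FFF range can then switch on the result instead of registering separate handlers for every mirror.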

So far, I have very little knowledge of how the PPU works. I've only been focusing on the CPU so far. I'll go back to reading once I get to that point. Hopefully I can get something to display on the screen soon. I only started working on this 6 days ago, so I think I'm making good progress

Oh, and one question... For each CPU cycle, the PPU and APU cycle 3 times, right?

Re: First steps in writing an emulator
by lidnariq on 2013-10-10 (#119183)

Conventional wisdom is that SMB1 is actually one of the hardest "simple" games to get working. Starting off with something simpler, like Donkey Kong, might be preferable.

(See also: nesdevwiki:Tricky-to-emulate games and nesdevwiki:Game bugs)

Re: First steps in writing an emulator
by janzdott on 2013-10-10 (#119184)

Oh I didn't know that. I'll start with Donkey Kong then. What are the best resources for writing the PPU?

Re: First steps in writing an emulator
by tokumaru on 2013-10-10 (#119185)

janzdott wrote:

So far, I have very little knowledge of how the PPU works.

The PPU, just like the CPU, is constantly running, but instead of fetching and executing instructions it repeats the process of generating a video signal, which consists of: 20 scanlines of VBlank (70 for PAL), 1 pre-render scanline, 240 scanlines of picture, and finally, 1 dummy scanline (51 for Dendy). You just repeat this over and over, modifying the rendering parameters as the registers are written to.
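
To keep those numbers out of the rendering code, the per-region frame layout can live in a small table. This is only a sketch: `FrameLayout` and the three constant names are invented here, and the values are simply the scanline counts listed above.

```cpp
// Scanline counts for one frame, in the order described above.
struct FrameLayout {
    int vblankLines;
    int preRenderLines;
    int pictureLines;
    int dummyLines;

    constexpr int total() const {
        return vblankLines + preRenderLines + pictureLines + dummyLines;
    }
};

constexpr FrameLayout kNtsc  {20, 1, 240, 1};
constexpr FrameLayout kPal   {70, 1, 240, 1};
constexpr FrameLayout kDendy {20, 1, 240, 51};

// Sanity checks: 262 scanlines per NTSC frame, 312 for PAL and Dendy.
static_assert(kNtsc.total() == 262, "NTSC frame is 262 scanlines");
static_assert(kPal.total() == 312, "PAL frame is 312 scanlines");
static_assert(kDendy.total() == 312, "Dendy frame is 312 scanlines");
```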

Quote:

Oh, and one question... For each CPU cycle, the PPU and APU cycle 3 times, right?

The APU, being part of the CPU chip, probably runs at the same rate as the CPU, but someone else will have to confirm that, since my knowledge of the APU is limited.

As for the CPU/PPU ratio, yes, on NTSC each CPU cycle corresponds to 3 PPU cycles. But I wouldn't hardcode it this way, since the ratio for PAL is different (1 CPU cycle = 3.2 PPU cycles).

BTW, you can check NTSC/PAL/DENDY differences here. If you plan on supporting all 3 systems, you shouldn't hardcode any of the variable parameters.
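
One way to avoid hardcoding the ratio is to count master clocks and let each chip divide them. The sketch below assumes the commonly documented dividers (NTSC: CPU 12, PPU 4; PAL: CPU 16, PPU 5), which reproduce the 3 and 3.2 ratios above; double-check them against the wiki before relying on them. `ClockDividers` and `ClockSync` are made-up names.

```cpp
struct ClockDividers {
    int cpu;  // master clocks per CPU cycle
    int ppu;  // master clocks per PPU dot
};

constexpr ClockDividers kNtscClock {12, 4};  // 12 / 4 = 3 dots per CPU cycle
constexpr ClockDividers kPalClock  {16, 5};  // 16 / 5 = 3.2 dots per CPU cycle

class ClockSync {
public:
    explicit ClockSync(ClockDividers d) : div_(d) {}

    // Advance by one CPU cycle and return how many PPU dots to run now.
    // Leftover master clocks carry over, so PAL yields 3, 3, 3, 3, 4, ...
    int dotsForCpuCycle() {
        pending_ += div_.cpu;
        int dots = pending_ / div_.ppu;
        pending_ -= dots * div_.ppu;
        return dots;
    }

private:
    ClockDividers div_;
    int pending_ = 0;  // master clocks not yet consumed by the PPU
};
```

On NTSC every call returns 3; on PAL five consecutive CPU cycles add up to 16 dots.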

Quote:

Oh I didn't know that. I'll start with Donkey Kong then. What are the best resources for writing the PPU?

I'd say our wiki has the most up to date information, but it's all in the format of reference documents, not guides of any sort. This looks like a good overview of what happens every frame.

Re: First steps in writing an emulator
by janzdott on 2013-10-10 (#119188)

tokumaru wrote:

janzdott wrote:

So far, I have very little knowledge of how the PPU works.

That was actually very helpful. I've been looking, and I've been unable to find a simple description of what the PPU actually does each cycle. That gave me a better idea than any of the things I've read so far. I'm guessing it doesn't do a whole scanline each cycle?

Re: First steps in writing an emulator
by tepples on 2013-10-10 (#119192)

The PPU counts time in "dots", and it takes 341 dots (with some exceptions) to render one scanline.
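
As a rough sketch, the PPU's position can be tracked with a dot counter and a scanline counter. This version ignores the exceptions mentioned above and takes the scanlines-per-frame count as a parameter (262 for the NTSC breakdown given earlier in the thread); `PpuCounter` is a hypothetical name.

```cpp
constexpr int kDotsPerScanline = 341;

class PpuCounter {
public:
    explicit PpuCounter(int scanlinesPerFrame)
        : scanlinesPerFrame_(scanlinesPerFrame) {}

    // Advance one dot, wrapping to the next scanline and frame as needed.
    void tick() {
        if (++dot_ == kDotsPerScanline) {
            dot_ = 0;
            if (++scanline_ == scanlinesPerFrame_) {
                scanline_ = 0;
                ++frame_;
            }
        }
    }

    int dot() const { return dot_; }
    int scanline() const { return scanline_; }
    long frame() const { return frame_; }

private:
    int scanlinesPerFrame_;
    int dot_ = 0;
    int scanline_ = 0;
    long frame_ = 0;
};
```

So a whole scanline is 341 calls to `tick()`, not one.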

Re: First steps in writing an emulator
by janzdott on 2013-10-10 (#119198)

I created a ppu class with registers, and a ppuMemory class. I have the character bank loaded into the ppu memory. Now I'm a little lost. I know about pattern tables and tiles and nametables. But I don't understand where the nametables come from. I'm guessing the cpu passes them to the ppu through 2000-2007? If that's right, how does the ppu read them into its memory? All the information I can find discusses patterns, tiles, nametables, sprites, palettes, etc. But I can't find anything that actually says "This is how the ppu works... This is what it does each cycle..."

Edit: I'm starting to understand it. I'm going to try loading nametables and outputting them as text to the console. I'll check back in when I get that done.

Re: First steps in writing an emulator
by tokumaru on 2013-10-11 (#119200)

janzdott wrote:

But I don't understand where the nametables come from.

The name tables are usually stored in a 2KB RAM chip inside the NES (this can be changed though: carts can and do disable those 2KB and instead use their own memory for name tables, which can even be ROM! - so don't hardcode the name table memory to 2KB and don't assume it's always RAM), which the game program fills through $2006/$2007. Name tables are constantly changed as games run.

Quote:

If that's right, how does the ppu read them into its memory?

When the CPU uses $2006/$2007 to send data to the PPU, this data is stored into the PPU memory. During rendering, the PPU will gather information from its own memory to generate the pixels that form the final picture.

Quote:

All the information I can find discusses patterns, tiles, nametables, sprites, palettes, etc. But I can't find anything that actually says "This is how the ppu works... This is what it does each cycle..."

The wiki page I linked to has a very helpful diagram that indicates what happens each cycle. It doesn't get into details though, for that you have to check the rest of the page, and some other pages in the wiki. The info is all there, you just have to put it all together (otherwise writing an emulator would be no challenge at all!).

Re: First steps in writing an emulator
by janzdott on 2013-10-11 (#119224)

tokumaru wrote:

I've been looking at all the wiki pages. I usually have about 20 browser tabs open at a time, where most of them are the wiki! The problem was, there's SO much information that it makes it very difficult to just find the basics to get things started. If I'm looking for basic information, I have to read through tons of advanced stuff that just ends up confusing me. I don't mean to complain about the wiki at all. It's great. It was just a little overwhelming for someone who doesn't know any of this stuff.

I was having a really hard time understanding the PPU basics. But Ulfalizer and jero32 at #nesdev explained a lot of the basics to me. I was able to get my PPU registers and vblank working enough so the games will start writing nametable data to the registers. I should be okay for a while. Now that I have something to build off of, I can hopefully get things to work. I was just clueless about where to start, and I needed some clarification.

I'm glad everyone here and at #nesdev is so helpful. I would have still been stuck on my CPU if you guys weren't here to explain things to me. If I get my emulator up and running, I'd like to write a really in-depth explanation of how to write one and where to start, in hopes that it would help out noobs like me

Re: First steps in writing an emulator
by tepples on 2013-10-11 (#119232)

janzdott wrote:

Ulfalizer and jero32 at #nesdev explained a lot of the basics to me. I was able to get my PPU registers and vblank working enough so the games will start writing nametable data to the registers.

Would it have helped to describe a simplified behavior to mock the PPU in a new emulator to the point where Donkey Kong will start to boot?

Re: First steps in writing an emulator
by janzdott on 2013-10-11 (#119237)

Yeah, for people who don't know much about this stuff, it would be nice if there was a place that gave you a simple outline of what you need to implement to get you started and then you can worry about the details on your own. At least that's the way I learn, but everybody learns differently. I couldn't read through the wiki pages and discern the difference between minute details and important features. That's why I'd like to write a really detailed explanation for other newcomers when my emulator's working.

But they explained how the 2006-2007 registers work, and how the vblank flag works, and that's all I needed to get started. My games started writing to the registers after that, so I could start on my rendering code. Currently my PPU manages to get the correct nametable and pattern addresses, get the current tile number, read the pattern bytes, and come up with a color index for each pixel of the background, not including attribute color because I'm not sure how that works yet. Next, I'm adding the color palette and a GUI. I'm usually pretty good at figuring this stuff out once I get started. It's getting started that's the hard part haha. But hopefully I'll get some pixels on the screen tonight! Do you think I should use OpenGL or SDL? I know how to do this kind of thing easily in OpenGL. I've heard SDL handles input and audio as well, but I've never used it.

Edit: I decided to try out SDL. I gotta say, I'm liking it. It's very easy to use. It's kind of a pain to use OpenGL for this kind of thing. I would have had to create a texture and display it on a quad. With SDL, it's a lot easier and only a few lines.

Re: First steps in writing an emulator
by janzdott on 2013-10-12 (#119248)

Well, my first PPU render was pretty disappointing. I don't think I see Donkey Kong in there... Lol, I'll see if I can find what the problem is.

Edit: I'm on the right track. It looks like I'm not reading the right pixels from the pattern tiles

Edit #2: It took a LOT of debugging and a pretty long time. I was doing almost everything correctly, there were just 2 very small and very stupid mistakes. Here's my first real screenshot! Does anyone know why the Donkey Kong letters are messed up?

Re: First steps in writing an emulator
by tepples on 2013-10-12 (#119253)

Just a guess, but are you emulating the +32 increment mode in PPUCTRL ($2000)? I imagine that it uses +32 to make vertical columns of blocks in the title.

Re: First steps in writing an emulator
by janzdott on 2013-10-12 (#119255)

Yep. Everything else displays correctly, besides sprites because I'm not touching those yet. It even shows the little demo screen and Donkey Kong moves when he throws the barrels. Idk what it is

Oh, and there was a question I wanted to ask about the address register. Does it always keep track of the upper and lower bytes, or can something reset it?

Re: First steps in writing an emulator
by Dwedit on 2013-10-12 (#119265)

Reading ppu status (2002) will reset the high-byte, low-byte latch.

Re: First steps in writing an emulator
by janzdott on 2013-10-12 (#119268)

Okay. How does it actually deal with incrementing? Right now, I have it add the upper and lower bytes together and then store them in a separate variable. Then when the data register is written to, that separate variable gets incremented and the registers are left alone. Is that right, or should I actually increment the register?

Re: First steps in writing an emulator
by Dwedit on 2013-10-12 (#119269)

The registers are write-only and cannot be read back. So writing to 2005 or 2006 will change the state of the PPU, but neither register can be read back.
There are lots more details about scrolling, and they've been posted to death on the boards and wiki already.

Re: First steps in writing an emulator
by janzdott on 2013-10-12 (#119271)

Well what I'm curious about is, say a game writes to the address register with $10 and then $00, the address would then be $1000. If it increments 256 times, it would then be $1100. Then you write $08 to the address register, without resetting the latch... Would it then be $1108? Or does it go off of the original last byte written, so it would be $0008?

By the way, I got attribute tiles and the background color palette working. I still don't know why the Donkey Kong letters are messed up, or why some of the ladders are too, or why only half of Donkey Kong is showing... Any ideas?

Re: First steps in writing an emulator
by WedNESday on 2013-10-12 (#119272)

janzdott have you passed all of the nestest.nes opcode tests? Just a thought to rule out CPU error.

Re: First steps in writing an emulator
by Dwedit on 2013-10-12 (#119274)

Looks like you haven't implemented the 32 or 1 increment mode.

Re: First steps in writing an emulator
by janzdott on 2013-10-12 (#119278)

It passed all the CPU tests up until it hit unofficial opcodes. I'm assuming my CPU is working fine, but I wouldn't be surprised if there were still a couple of bugs. And I've implemented both incrementing modes. I've checked my code several times and it looks fine. But I'll check again by doing some test writes.

Re: First steps in writing an emulator
by 3gengames on 2013-10-12 (#119279)

Things to note:

The $2006 latch is only updated to LoopyV when the 2nd write is done, could that be it? But when it is, it'll update it...I don't know what the default first value is honestly.

Re: First steps in writing an emulator
by tokumaru on 2013-10-12 (#119280)

janzdott wrote:

say a game writes to the address register with $10 and then $00, the address would then be $1000. If it increments 256 times, it would then be $1100. Then you write $08 to the address register, without resetting the latch... Would it then be $1108? Or does it go off of the original last byte written, so it would be $0008?

Loopy's famous document, "the skinny on NES scrolling", documents everything that happens on each write to $2005/$2006 and to the lower 2 bits of $2000 (i.e. everything that deals with scrolling). The basic idea is that writes to $2006 don't affect the PPU's address register directly, the data actually goes to a temporary register first, and only after the second write the contents of the temporary register are copied to the actual address register. So writing $08 to $2006 won't really do anything to the address register, since the address is only modified after the second write.

But if you do manage to trick the PPU into setting an incomplete address (like by writing to $2005 and then to $2006), bits that are not updated will most likely maintain the value from when you last set the address, because unlike the address register, the temporary register doesn't auto increment.

PS: Loopy's document is not the easiest to understand, but there's a wiki page about it, which tries to explain things further (and ends up making things look pretty complicated!). The basic idea is that writes to $2000, $2005 and $2006 don't affect the VRAM address directly, but instead go to a temporary register (the original document uses 1's to indicate which bits get copied where), and only after the second write to $2006 (the latch that selects between first and second writes is shared between $2005 and $2006, so if you write to $2005 and then to $2006, the $2006 write is still detected as a second write) the temporary register is copied to the address register.

The document also says that the X coordinates are copied from the temporary register to the address register every scanline (so that each scanline starts from the selected horizontal position), and that the full address is copied at the beginning of the frame (because games are supposed to use $2000/$2005 to set the scroll during VBlank, not $2006).

In case it's not clear, the PPU's address register is used when the CPU needs to access VRAM, but also when the PPU reads it for rendering the image, which is why talking about the address register eventually brings up talks about scrolling.
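
The write behavior described above can be sketched roughly as follows. The bit masks follow the register layout documented on the wiki's scrolling page as best I can reproduce it, so treat them as something to verify rather than gospel. The struct name `ScrollRegs` and its method names are invented; `v`, `t`, `x` and `w` are the address register, the temporary register, fine X and the shared write latch.

```cpp
#include <cstdint>

struct ScrollRegs {
    std::uint16_t v = 0;  // address register (15 bits used)
    std::uint16_t t = 0;  // temporary register
    std::uint8_t  x = 0;  // fine X scroll (3 bits)
    bool w = false;       // false = next write is the first, true = second

    void writeCtrl(std::uint8_t d) {   // $2000: nametable select bits go to t
        t = static_cast<std::uint16_t>((t & ~0x0C00) | ((d & 0x03) << 10));
    }
    void readStatus() { w = false; }   // $2002 read resets the latch
    void writeScroll(std::uint8_t d) { // $2005
        if (!w) {
            t = static_cast<std::uint16_t>((t & ~0x001F) | (d >> 3));
            x = d & 0x07;
        } else {
            t = static_cast<std::uint16_t>(
                (t & ~0x73E0) | ((d & 0x07) << 12) | ((d & 0xF8) << 2));
        }
        w = !w;
    }
    void writeAddr(std::uint8_t d) {   // $2006
        if (!w) {
            t = static_cast<std::uint16_t>((t & 0x00FF) | ((d & 0x3F) << 8));
        } else {
            t = static_cast<std::uint16_t>((t & 0xFF00) | d);
            v = t;                     // only the second write touches v
        }
        w = !w;
    }
};
```

Running the scenario from the question: after writing $10 then $00, `v` is $1000; after 256 increments it is $1100; a lone write of $08 then lands in the high byte of `t` and leaves `v` at $1100.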

Re: First steps in writing an emulator
by janzdott on 2013-10-12 (#119285)

I came across that page and thought it looked like a good read for when I work on scrolling. I'll give it a look tomorrow after I try fixing the graphics bug

Edit: I fixed it. It was checking for the 2nd bit of the controller register to be set for 32 incrementing, instead of the 3rd bit. *Sigh* it's always the little things that get me haha. There's still something funky going on with my attribute tiles. Part of Donkey Kong is discolored, and the numbers are strange because they should all be white. I've checked my code several times and rewrote it once, and I don't know what it is. I'm just going to move on and see if I can get sprites to show up
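
For anyone chasing the same bug, here is a minimal sketch of a $2007 write with the increment taken from the third bit of $2000 (bit 2, mask $04). It assumes a hypothetical flat 16 KiB PPU address space with no nametable or palette mirroring, and `PpuPort` is a made-up name:

```cpp
#include <array>
#include <cstdint>

struct PpuPort {
    std::array<std::uint8_t, 0x4000> vram{};  // flat PPU address space (simplification)
    std::uint8_t ctrl = 0;                    // last value written to $2000
    std::uint16_t v = 0;                      // current VRAM address

    // Bit 2 (mask 0x04) of $2000 selects +32; otherwise the step is +1.
    int increment() const { return (ctrl & 0x04) ? 32 : 1; }

    void writeData(std::uint8_t value) {      // CPU write to $2007
        vram[v & 0x3FFF] = value;
        v = static_cast<std::uint16_t>((v + increment()) & 0x7FFF);
    }
};
```

With +32 selected, three consecutive writes starting at $2000 land at $2000, $2020 and $2040, which is one column of tiles in the nametable.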

Re: First steps in writing an emulator
by janzdott on 2014-01-08 (#123508)

Hey guys. I stopped working on my emulator for a while, but I recently started again. I was using SDL, but I switched over to WinApi and OpenGL. I may switch to Qt or some other cross-platform GUI library if this ever gets finished. I completely changed the way my PPU renders. Now I'm actually emulating the shift registers and fetching the nametable, attribute, and patterns at the correct times, though there's a glitch with the leftmost 2 tiles. Surprisingly, it's faster than before. I also fixed my color glitch, which was caused by reading attributes incorrectly. No controls yet, I'm going to add that after sprites. Kinda pointless to have controls and no character to move around haha. I also wrote a nice little CPU debugger window. I can see and change the registers in real time, and also view the disassembly.

It runs the demo screens of Donkey Kong and Donkey Kong 3 fine. There's some problem with Super Mario Bros 1 after it draws the title screen background where it tries writing to CHR ROM. Not sure what's causing that yet. Here's a screen of Donkey Kong 3 and my CPU debugger window.

Re: First steps in writing an emulator
by lidnariq on 2014-01-08 (#123510)

Looks like an off-by-one. Are you using the Y value for the preceding scanline for the first two fetches of the next scanline?

Re: First steps in writing an emulator
by janzdott on 2014-01-08 (#123514)

Yep lol. I had to leave right away, so I didn't have time to fix it before I posted the screens. I started working on sprite evaluation, so I should have sprites showing up soon. Probably by tomorrow if I feel like programming later tonight

Re: First steps in writing an emulator
by tokumaru on 2014-01-08 (#123527)

janzdott wrote:

There's some problem with Super Mario Bros 1 after it draws the title screen background where it tries writing to CHR ROM.

Writing to CHR-ROM? It's a known fact that SMB *reads* data from CHR-ROM in order to draw the title screen. For this to work, it's important that you emulate the $2007 read delay: when $2007 is read, a buffered value is sent to the CPU and the actual value from VRAM/ROM is put into the buffer, meaning that games have to throw away the first of a group of $2007 reads in order to read data from VRAM/ROM. The palette is an exception, because it's stored inside the PPU itself so there's no delay.
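
A rough sketch of that read delay, assuming a hypothetical flat 16 KiB PPU address space, a fixed +1 address increment and no palette mirroring. The `PpuReadPort` name is made up, and the detail that a palette read refills the buffer from the nametable data "underneath" the palette is my understanding of the hardware, not something stated in this thread:

```cpp
#include <array>
#include <cstdint>

struct PpuReadPort {
    std::array<std::uint8_t, 0x4000> mem{};  // flat PPU address space (simplification)
    std::uint16_t v = 0;                     // current VRAM address
    std::uint8_t buffer = 0;                 // internal $2007 read buffer

    std::uint8_t readData() {                // CPU read from $2007
        std::uint16_t addr = v & 0x3FFF;
        std::uint8_t result;
        if (addr >= 0x3F00) {
            result = mem[addr];              // palette: returned with no delay
            buffer = mem[addr - 0x1000];     // assumption: buffer gets the nametable byte underneath
        } else {
            result = buffer;                 // stale value from the previous read
            buffer = mem[addr];              // real value arrives on the next read
        }
        v = static_cast<std::uint16_t>((v + 1) & 0x7FFF);  // +1 mode only in this sketch
        return result;
    }
};
```

Reading CHR data at $0000 therefore takes one throwaway read: the first call returns whatever was in the buffer, and the second call returns the byte at $0000.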

Re: First steps in writing an emulator
by tepples on 2014-01-08 (#123533)

tokumaru wrote:

Writing to CHR-ROM?

Some of Shiru's programs do this on accident, and I've been guilty too. I had to patch out these bugs for the multicart. I wonder whether this is something that lot check would normally have caught during the NES's commercial era.

Re: First steps in writing an emulator
by janzdott on 2014-01-08 (#123535)

By writing to CHR-ROM, I don't mean it's SUPPOSED to be doing that haha :wink:

There's a bug in my emulator somewhere causing it to do that. I'm not sure if it's a CPU or PPU problem. I tried for a while to find what's causing it, but it's not easy to track down. My emulator throws exceptions when writing to ROM. If I comment out the exceptions, it continues running but doesn't finish drawing the title screen. It gets here and doesn't set the background color or draw the title.

Re: First steps in writing an emulator
by Joe on 2014-01-08 (#123536)

Super Mario Bros is on the list of tricky-to-emulate games for having issues very similar to what you're describing.

Re: First steps in writing an emulator
by janzdott on 2014-01-08 (#123537)

"Super Mario Bros. is probably the hardest game to emulate among the most popular NROM games, which are generally the first targets against which an emulator author tests his or her work. It relies on JMP indirect, correct palette mirroring (otherwise the sky will be black; see PPU palettes), sprite 0 detection (otherwise the game will freeze on the title screen), the 1-byte delay when reading from CHR ROM through $2007 (see The PPUDATA read buffer), and proper behavior of the nametable selection bits of $2000 and $2006.[1] In addition, there are several bad dumps floating around, some of which were ripped from pirate multicarts whose cheat menus leave several key parameters in RAM."

Thank you, Joe! Describes my situation exactly. I'll forget about SMB until I get Donkey Kong completely working.

Re: First steps in writing an emulator
by Dwedit on 2014-01-09 (#123547)

Devil World is slightly harder to emulate, since it's the first game to use split screen vertical scrolling.

Re: First steps in writing an emulator
by janzdott on 2014-01-09 (#123548)

I've done my PPU timing exactly how the diagram shows, which is important for things like splitscreen right? Does the game just count cycles from vBlank and change the scroll half way through the scanline?

And good news, I got sprites working last night! Was much easier than I thought it would be. I haven't done flipping of sprites yet, but that should only take a couple minutes. Then on to controls

Re: First steps in writing an emulator
by tepples on 2014-01-09 (#123550)

Scroll changes between one scanline and the next are common. Scroll changes halfway through a scanline require much more precise synchronization to the raster, and I'm not aware of any commercial NES game with tight enough timing to do that stably. Some games, such as Super Mario Bros. 3 and Mega Man 3, slightly mistime a scroll change, causing a single glitched scanline with OK scanlines below it.

Re: First steps in writing an emulator
by janzdott on 2014-01-09 (#123551)

So that's not how splitscreen works? I don't know anything about scrolling yet. Once I do controls, I'll do scrolling. Is Excitebike a good one to start with? My rom gets to the demo screen and doesn't show sprites, but I'm guessing that's because I haven't added scrolling yet.

Re: First steps in writing an emulator
by tepples on 2014-01-09 (#123554)

I was confused by your use of "half way through the scanline". Split screen scrolling in NES games works by changing the scroll value mid-screen, usually in the 64-dot (21-cycle) period between one scanline and the next. Doing so mid-screen allows scrolling each horizontal strip separately. A game would have to change the scroll value mid-scanline in order to have vertical strips, and MMC5 is the only mapper I know of on NES that's capable of that.

Re: First steps in writing an emulator
by janzdott on 2014-01-09 (#123555)

Alright, that's what I thought. Getting the timing right for that is gonna suck.

And the big day's finally here guys! I added controls and I can now play games on my emulator

I was too lazy to fix the leftmost 2 tiles. It's just a matter of adding an if statement to check the dot and increment the scanline. What I don't understand is, why does my sprite rendering cause artifacts on the rightmost column of pixels? I'm rendering my sprites at the exact correct coordinates. I checked this with Nintendulator. When both emulators are maximized, they line up perfectly. Here's a screen.

Re: First steps in writing an emulator
by beannaich on 2014-01-09 (#123568)

janzdott wrote:

What I don't understand is, why does my sprite rendering cause artifacts on the rightmost column of pixels? I'm rendering my sprites at the exact correct coordinates. I checked this with Nintendulator. When both emulators are maximized, they line up perfectly.

IIRC sprites with X=$FF have the bit planes masked to $00 on load (dots 262, 264, 270, 272, 278, 280, 286, 288, 294, 296, 302, 304, 310, 312, 318, 320), that would cause those sprites to be rendered transparently. Either do that or refuse to output sprite pixels on that dot, and don't allow sprite zero collisions for that dot either.

As far as the first two tiles go, are you sure that the tile pre-fetches (dots 321-337) are being handled properly? Both fetches increment the 'x' bits of the counter, and both happen after the 'y' bits have been reset to the latched values.

Re: First steps in writing an emulator
by janzdott on 2014-01-09 (#123589)

Ah, that makes sense. I'll add a check to make sprites with an x position of 0xFF clear.

And yeah, it's doing the nametable and attribute fetching correctly, where it wraps around to the next scanline. I just didn't do the same with the pattern fetches. That's why it draws the correct tiles on the left, they're just 1 pixel down from where they should be. I haven't fixed that yet because of laziness on my part. I was too excited to get working on sprites

I just fixed both those problems, so graphics should be 100% correct now. I'm going to add scrolling next. I'm not understanding it very well at the moment. I'm still learning about all this stuff as I go.

Re: First steps in writing an emulator
by janzdott on 2014-01-10 (#123603)

I have a problem that I can't figure out, and it's driving me absolutely crazy.

If anyone could help me, I would very very much appreciate it.

I started emulating the v, t, x and w registers for the PPU. I'm not yet using them during rendering at all. I'm only using them when reading or writing $2000, and writing $2005 and $2006. My writes to the PPU registers set the values of v, t, x and w correctly. I am positive of this.

Since the second write to $2005 (PPU scroll) doesn't set v equal to t, PPU scroll must be written to before rendering, and the PPU must set v equal to t before rendering. This happens at dot 304 of the pre-render scanline. So far, this is the only time I'm changing the value of v during rendering. When entering the first level of a game, new nametable data needs to be written. The problem is, the PPU changes the value of v on the pre-render scanline, then the games AREN'T writing a new address to $2006 after rendering, before they start writing data to $2007. It's like they are expecting v to not change. So parts of the title screen are not being overwritten. If I don't set the v equal to t on the pre-render scanline, the problem goes away. This happens with every game I try.

I have verified 100% that they are NOT writing a new address to $2006 before they start writing to $2007. I have also verified that they aren't writing to $2007 outside of vBlank. They write to PPU scroll, which changes the value of t. Then on the pre-render scanline, the PPU sets v equal to t. After that frame is rendered, and vBlank starts, the games start writing to $2007, without writing to $2006 first. So the address being written to isn't correct.

So my question: At the start of vBlank, does the PPU restore the value of v to what it was before it was changed on the pre-render scanline? That's the only way I can make sense of this... I haven't seen that mentioned anywhere in the wiki.

Re: First steps in writing an emulator
by tepples on 2014-01-10 (#123604)

The PPU doesn't set v equal to t when rendering is turned off in $2001. When loading a level, a game will keep rendering off until an entire screen has been constructed, then set the scroll and turn on rendering before the pre-render line.

Re: First steps in writing an emulator
by janzdott on 2014-01-10 (#123617)

Here's what the problem is. I was unable to solve it correctly, but I fixed it temporarily with a hack.

When leaving the title screen and entering the first level, the games must replace the nametable data. They don't finish writing the nametable data before the end of vBlank. Once rendering starts, the value of the v register changes. At the end of the frame, when vBlank starts again, the games try to continue writing nametable data. Since v changed during rendering, the nametable data is written to incorrect addresses. This is happening with every game I try, now that I'm emulating v, t, x and w.

I tried slowing my PPU down by a factor of 3, and the CPU still wrote nametable data past vBlank. It would be okay if it set the address again when it resumes, but it doesn't.

I have no idea how to fix this. What I did was add a second v register, which is used only by the $2006 and $2007 registers. That way, the address written to by $2007 doesn't change after rendering a frame. That fixed the problem temporarily. If anyone has any ideas why this is happening, I would be glad to hear them.

Re: First steps in writing an emulator
by tepples on 2014-01-10 (#123619)

"Second V register" is the solution adopted by the ColecoVision, MSX, and Master System, not the NES. The correct solution for the NES is just to skip setting v = t on the prerender line if rendering is turned off.

Re: First steps in writing an emulator
by janzdott on 2014-01-10 (#123624)

I do only set v on the pre-render line when rendering is on. I have all of my rendering code (except for the vBlank flag set/reset and the NMI interrupt) inside an if statement that checks if rendering is enabled. Rendering is enabled if bits 3 or 4 are set in PPUMASK. It still increments the dot and scanline, and generates NMIs normally if rendering is off. Is there something about rendering being on/off that I'm missing?
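
For comparison, here is a minimal sketch of the gating being discussed: the copies from t into v only happen while rendering is enabled, so with rendering off v keeps whatever $2006/$2007 left in it. The dot numbers (257 for the horizontal bits, 280-304 of the pre-render line for the vertical bits) and the bit masks are taken from the wiki's timing description as I remember it, so verify them. `PpuScroll` and `updateAddressCopies` are made-up names, and `onPreRenderLine` stands in for whatever scanline numbering the emulator uses.

```cpp
#include <cstdint>

struct PpuScroll {
    std::uint16_t v = 0;    // address register
    std::uint16_t t = 0;    // temporary register
    std::uint8_t mask = 0;  // last value written to $2001 (PPUMASK)

    // Rendering is on if background (bit 3) or sprites (bit 4) are enabled.
    bool renderingEnabled() const { return (mask & 0x18) != 0; }

    // Call once per dot.
    void updateAddressCopies(int dot, bool onPreRenderLine) {
        if (!renderingEnabled()) return;  // v is left alone for $2006/$2007 use
        if (dot == 257) {                 // horizontal bits, every scanline
            v = static_cast<std::uint16_t>((v & ~0x041F) | (t & 0x041F));
        }
        if (onPreRenderLine && dot >= 280 && dot <= 304) {  // vertical bits
            v = static_cast<std::uint16_t>((v & ~0x7BE0) | (t & 0x7BE0));
        }
    }
};
```

If a game turns rendering off in $2001 while it fills the nametables over several frames, this leaves v untouched between frames, which is the behavior tepples describes.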

Re: First steps in writing an emulator
by janzdott on 2014-01-14 (#123855)

I just rewrote my CPU from the ground up, to make sure there were no bugs. It's also 100% cycle-accurate now, including the instructions that add 1 cycle when crossing page boundaries. Every read/write triggers 3 PPU cycles. It's hardcoded to 3 right now, but I'll add the option for switching to PAL later.

I was able to squeeze a little more performance out of it, but it's the PPU that's the bottleneck. It ran at insane framerates before I added sprite rendering. It runs slightly faster than Nintendulator when Nintendulator's sound is off and framerates aren't capped. I still have a lot of room for optimization in the PPU.

Scrolling works in Bomberman and Balloon Fight, but there's a glitch in Excitebike. I think that's due to a bug with my sprite 0 hit, but I'm not sure yet. I haven't tried any other scrolling games yet. SMB1 still crashes due to some error making it try to write to CHR ROM. I'm positive it's a PPU bug. Now that everything is cycle-accurate, I should be able to make some logs and compare it to Nintendulator's. The dot and scanline counters between our logs look spot on now.

And tepples, I don't know what happened on EFnet the other day. The site completely stopped responding for me. Tried refreshing it every now and then for over an hour and it still wouldn't work.