The main loop of nes emulator

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
The main loop of nes emulator
by on (#46061)
Is there a standard pattern for writing the main loop of an NES emulator, or should I just follow my instincts?

At first I thought of running the PPU in a separate thread, but now I suspect that would be harder...

My proto code is this:
Code:
  public void stepDebugger() {
        if (cpu.cycles >= CYCLES_TO_SCANLINE && !inVBlank) {
                cpu.cycles = 0;
                Ppu2C02.getInstance().scanLine();         
        }

    if(Ppu2C02.getInstance().scanlineNumber()>=240)
   {
     inVBlank = true;
      VBLANK_REG(1);
      SPRITE0_REG(0);
       
      if(VBLANK_ON_NMI)
         cpu.nmi();
   }
        cpu.debugStep();
    }

by on (#46099)
Why would you use threads? Doesn't that just create more overhead? Sorry, but the pasted code needs work: the curly braces and tabs do not line up at all. Do you have a working CPU or PPU at all yet?

matt

by on (#46100)
mattmatteh wrote:
Why would you use threads? Doesn't that just create more overhead? Sorry, but the pasted code needs work: the curly braces and tabs do not line up at all. Do you have a working CPU or PPU at all yet?

matt


Sorry for the code that is not the real one! I've worked mainly on the CPU core (I believe I've finished the opcodes, excluding the illegal ones) and am now moving on to the PPU.

by on (#46109)
dreampeppers99 wrote:
Sorry for the code that is not the real one
I didn't think it was. I suggest keeping the CPU and PPU separate. I have a catch-up routine that is called whenever the CPU reads or writes a PPU register. The CPU and PPU have their own clocks. There is also prediction code for the NMI, but I think you need to get your PPU started before you do that. If you want a working NMI, you can still code for catch-up, but only run the CPU in tiny increments and catch the NMI when it really happens. This will be slower but good for getting started.
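
For readers following along, here is a minimal Java sketch of the catch-up idea described above. It is not mattmatteh's code: the class and method names (CatchUpDemo, Ppu, catchUp, readRegister) are hypothetical, the register logic is omitted, and the 3 PPU dots per CPU cycle ratio assumes NTSC.

```java
// Hypothetical sketch of "catch-up" synchronization; all names are illustrative.
final class CatchUpDemo {
    /** Minimal stand-in for a PPU that keeps its own clock, measured in PPU dots. */
    static final class Ppu {
        static final int DOTS_PER_CPU_CYCLE = 3; // assumption: NTSC ratio
        long dots = 0;                           // PPU time emulated so far

        /** Run the PPU forward until it reaches the given CPU timestamp. */
        void catchUp(long cpuCycles) {
            long target = cpuCycles * DOTS_PER_CPU_CYCLE;
            while (dots < target) {
                dots++; // a real PPU would render one dot here
            }
        }

        /** Every register access synchronizes the PPU first. */
        int readRegister(long cpuCycles, int address) {
            catchUp(cpuCycles);
            return 0; // placeholder value; real register logic omitted
        }
    }

    public static void main(String[] args) {
        Ppu ppu = new Ppu();
        long cpuCycles = 30;                 // the CPU has run ahead by 30 cycles
        System.out.println(ppu.dots);        // PPU has not moved yet
        ppu.readRegister(cpuCycles, 0x2002); // the access forces the catch-up
        System.out.println(ppu.dots);        // PPU is now at the CPU's timestamp
    }
}
```

The point of the design is that the PPU only runs when something could observe its state, so the CPU can execute long stretches without interleaving.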

by on (#46137)
Unless you're trying to emulate on a Propeller microcontroller (it has 8 wimpy cores) I don't see much point in going multithreaded.

I'm always arguing against going multithreaded at work for otherwise trivial tasks; it's one of those rookie developer mistakes (along with using other "cool" technologies/buzzwords where they are inappropriate). It just wastes memory (more stacks), can cause context-switch thrashing, and rarely improves performance. Threads can (and do) improve scalability. But a NES emulator is not exactly a problem looking for a scalable solution.

It is rare that I see a multithreaded solution to a problem that is less complex than a single threaded one. Avoiding complexity can keep you from landing in debugger hell.

Now if you just want to do it to do it, that's another story, and I can totally relate to that. I won't stop you from doing something for the sake of it; just don't try to convince me that it is better without some solid data to back up the claims. (Not that you made any claims.)

by on (#46150)
mattmatteh wrote:
dreampeppers99 wrote:
Sorry for the code that is not the real one
I didn't think it was. I suggest keeping the CPU and PPU separate. I have a catch-up routine that is called whenever the CPU reads or writes a PPU register. The CPU and PPU have their own clocks. There is also prediction code for the NMI, but I think you need to get your PPU started before you do that. If you want a working NMI, you can still code for catch-up, but only run the CPU in tiny increments and catch the NMI when it really happens. This will be slower but good for getting started.

I keep separate classes for the PPU and CPU... the connection between them is made by a third class called Emulator.
The PPU has its own clock, OK, but that will influence the emulation...

I base my timing on CPU cycles:

Every 114 CPU cycles I render one of the 240 scanlines...
Then I add 3 (or 2) scanlines and the PPU enters VBlank timing, so I do the VBlank things (check NMI, clear the sprite 0 hit flag) over 20 scanlines of time, then I leave the VBlank period and start again at scanline 0...
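
A small Java sketch may make the frame schedule above easier to check. It assumes the commonly documented NTSC layout of 262 scanlines per frame (240 visible, 1 post-render, 20 VBlank, 1 pre-render), which differs slightly from the "3 (or 2)" extra lines mentioned in the post. The class, enum, and constant names are hypothetical.

```java
// Hypothetical sketch: classify each scanline of an NTSC frame by phase.
final class FrameSchedule {
    static final int VISIBLE_LINES = 240;
    static final int VBLANK_START = 241;    // assumption: VBlank begins on line 241
    static final int VBLANK_LINES = 20;
    static final int LINES_PER_FRAME = 262; // assumption: NTSC frame height

    enum Phase { VISIBLE, POST_RENDER, VBLANK, PRE_RENDER }

    /** Returns the phase that the given scanline (0..261) belongs to. */
    static Phase phaseOf(int scanline) {
        if (scanline < VISIBLE_LINES) return Phase.VISIBLE;
        if (scanline < VBLANK_START) return Phase.POST_RENDER;
        if (scanline < VBLANK_START + VBLANK_LINES) return Phase.VBLANK;
        return Phase.PRE_RENDER;
    }

    public static void main(String[] args) {
        // Count how many lines fall into each phase over one frame.
        int[] counts = new int[Phase.values().length];
        for (int line = 0; line < LINES_PER_FRAME; line++) {
            counts[phaseOf(line).ordinal()]++;
        }
        for (Phase p : Phase.values()) {
            System.out.println(p + ": " + counts[p.ordinal()]);
        }
    }
}
```

With a table like this, the VBlank flag and NMI check can be triggered once, on the transition into line 241, rather than on every step while the scanline counter is past 240.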

by on (#46151)
Jon wrote:
Unless you're trying to emulate on a Propeller microcontroller (it has 8 wimpy cores) I don't see much point in going multithreaded.

I'm always arguing against going multithreaded at work for otherwise trivial tasks; it's one of those rookie developer mistakes (along with using other "cool" technologies/buzzwords where they are inappropriate). It just wastes memory (more stacks), can cause context-switch thrashing, and rarely improves performance. Threads can (and do) improve scalability. But a NES emulator is not exactly a problem looking for a scalable solution.

It is rare that I see a multithreaded solution to a problem that is less complex than a single threaded one. Avoiding complexity can keep you from landing in debugger hell.

Now if you just want to do it to do it, that's another story, and I can totally relate to that. I won't stop you from doing something for the sake of it; just don't try to convince me that it is better without some solid data to back up the claims. (Not that you made any claims.)



I also don't like using buzzwords. I have a couple of articles published in magazines about buzzwords and fancy developers... I was just wondering about the two threads because of what I've read...

I agree with you that multithreading can make an NES emulator slower... but in some cases, such as GameCube or PS2 (or even PSP) emulators, multithreading is needed and improves the speed too.

"It is rare that I see a multithreaded solution to a problem that is less complex than a single threaded one"
Perfect sentence to...

Now I have just one thread (of course Swing has its own), the Emulator one, in my run method.

Code:
public void run() {
  while (running) {
    while (cpu.cycles < 114) { // 114 CPU cycles is roughly one scanline, HBLANK included
      cpu.step();
    }
    cpu.cycles -= 114; // start the next scanline's count, keeping any overshoot
    ppu.scanline();    // do 240 scanlines, then
                       // + 3 scanlines and enter VBLANK timing,
                       // then wait 20 more scanlines of time to exit VBLANK
  }
}
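
As a side note on the 114 figure: an NTSC scanline is 341 PPU dots, which is about 113.67 CPU cycles, so a fixed 114 per line drifts over a frame (262 × 114 = 29,868 cycles versus roughly 29,780.67). The following Java sketch shows one way to avoid the drift by budgeting in PPU dots and carrying the overshoot forward. It is only an illustration: the names are hypothetical, and the CPU is a stub in which every instruction takes 2 cycles.

```java
// Hypothetical sketch: scanline loop budgeted in PPU dots, with carry-over.
final class ScanlineLoop {
    static final int DOTS_PER_SCANLINE = 341; // NTSC PPU dots per scanline
    static final int DOTS_PER_CPU_CYCLE = 3;  // NTSC CPU-to-PPU ratio

    /** Stub CPU: pretend every instruction takes 2 cycles (assumption for the demo). */
    static int stepCpu() {
        return 2;
    }

    /** Runs the given number of scanlines and returns the CPU cycles executed. */
    static long run(int scanlines) {
        long cpuCycles = 0;
        long dotBudget = 0; // dots the CPU may still consume; goes negative on overshoot
        for (int line = 0; line < scanlines; line++) {
            dotBudget += DOTS_PER_SCANLINE;
            while (dotBudget > 0) {
                int cycles = stepCpu();
                cpuCycles += cycles;
                dotBudget -= (long) cycles * DOTS_PER_CPU_CYCLE;
            }
            // A real emulator would render the scanline here, e.g. ppu.scanline().
        }
        return cpuCycles;
    }

    public static void main(String[] args) {
        System.out.println("CPU cycles in one 262-line frame: " + run(262));
    }
}
```

Because the leftover budget is carried into the next line instead of being reset, the per-frame total stays within one instruction of the ideal value.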

by on (#46173)
dreampeppers99 wrote:
I keep separate classes for the PPU and CPU... the connection between them is made by a third class called Emulator.
The PPU has its own clock, OK, but that will influence the emulation...

I base my timing on CPU cycles:

Every 114 CPU cycles I render one of the 240 scanlines...
Then I add 3 (or 2) scanlines and the PPU enters VBlank timing, so I do the VBlank things (check NMI, clear the sprite 0 hit flag) over 20 scanlines of time, then I leave the VBlank period and start again at scanline 0...
This is kinda how I have my emulator set up. The CPU, PPU, APU, and mappers are all separate, and the emulator code is the glue for them. The only code that knows about scanlines or VBlank is the code that needs it: the PPU, sprites, and some mappers. The emulation code knows about VBlank and some other events, but only for prediction, and the PPU will set that.

by on (#46174)
dreampeppers99 wrote:
I agree with you that multithreading can make an NES emulator slower... but in some cases, such as GameCube or PS2 (or even PSP) emulators, multithreading is needed and improves the speed too.
I believe that is to take advantage of multicore systems when a single core cannot run fast enough.

by on (#46176)
mattmatteh wrote:
dreampeppers99 wrote:
I keep separate classes for the PPU and CPU... the connection between them is made by a third class called Emulator.
The PPU has its own clock, OK, but that will influence the emulation...

I base my timing on CPU cycles:

Every 114 CPU cycles I render one of the 240 scanlines...
Then I add 3 (or 2) scanlines and the PPU enters VBlank timing, so I do the VBlank things (check NMI, clear the sprite 0 hit flag) over 20 scanlines of time, then I leave the VBlank period and start again at scanline 0...
This is kinda how I have my emulator set up. The CPU, PPU, APU, and mappers are all separate, and the emulator code is the glue for them. The only code that knows about scanlines or VBlank is the code that needs it: the PPU, sprites, and some mappers. The emulation code knows about VBlank and some other events, but only for prediction, and the PPU will set that.


:D Thanks for all your help.
I'm just wondering... do you handle PPU cycle counting in your PPU class or in the emulator?

by on (#46195)
dreampeppers99 wrote:
I'm just wondering... do you handle PPU cycle counting in your PPU class or in the emulator?
Mine is written in C, so there are no classes. My PPU code has its own timestamp, or clock. So if the CPU runs 30 clock cycles, my PPU runs 30 × 3 (NTSC) or 30 × 3.2 (PAL) clock cycles. Actually, in whole numbers: for every 5 CPU cycles, the NTSC PPU runs 15 cycles and the PAL PPU runs 16. You can find a lot of info about that all over here.
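
The whole-number ratios above (15 or 16 PPU cycles per 5 CPU cycles) allow the conversion to be done without floating point. Here is a minimal Java sketch under that reading of the post. It is not mattmatteh's C code, and the class and field names are hypothetical.

```java
// Hypothetical sketch: integer CPU-to-PPU cycle conversion with a carried remainder.
final class PpuClock {
    private final int dotsPer5CpuCycles; // 15 for NTSC (3.0x), 16 for PAL (3.2x)
    private long remainder = 0;          // leftover fifths of a PPU cycle
    long dots = 0;                       // total whole PPU cycles elapsed

    PpuClock(int dotsPer5CpuCycles) {
        this.dotsPer5CpuCycles = dotsPer5CpuCycles;
    }

    /** Advances the clock by cpuCycles and returns the whole PPU cycles that elapsed. */
    long advance(int cpuCycles) {
        long total = remainder + (long) cpuCycles * dotsPer5CpuCycles;
        long whole = total / 5;
        remainder = total % 5; // keep the fraction so no PPU time is lost
        dots += whole;
        return whole;
    }

    public static void main(String[] args) {
        PpuClock ntsc = new PpuClock(15);
        PpuClock pal = new PpuClock(16);
        System.out.println("NTSC PPU cycles for 30 CPU cycles: " + ntsc.advance(30));
        System.out.println("PAL PPU cycles for 30 CPU cycles: " + pal.advance(30));
    }
}
```

Carrying the remainder matters for PAL: single CPU cycles yield 3, 3, 3, 3, then 4 PPU cycles, which adds up to the expected 16 per 5 CPU cycles.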