Hello,
I am starting to develop my first emulator, so I have a few questions for you.
I have already written the CPU emulator, although I did not make it cycle-exact
with respect to reads and writes.
Tell me if I am on the right track with the CPU main loop:
Code:
void cpu_main_loop (void)
{
    scanline = 0;
    vblank = 0;
    for (;;) {
        remaining_cycles = cpu_cycles_per_scanline;
        /* 113 NTSC, 107 PAL */
        while (remaining_cycles > 0) {
            ..
            (sssttt! cpu at work!)
            ..
            if (vblank & nmi_enabled)
                nmi();
        }
        ++scanline;
        if (scanline == scanlines_per_frame) { /* 262 NTSC, 312 PAL */
            scanline = 0;
            vblank = 0;
            update_screen();
        }
        else if (scanline == 242)
            vblank = 1;
    }
}
Bye,
tano
It's not 113, it's 341/3, or 113.666 cpu cycles per scanline.
Also, the code does not seem to count leftover cycles, since with 2 cycles left, it might run an instruction which takes 3 cycles. Where does the leftover cycle go?
Why is NMI inside the while loop?
Ok, you are right.
I think I can do the following:
Code:
unsigned scanline_cycles[] = { 113, 113, 114 }; /* NTSC, PAL would be
{ 106, 107, 106, 107, 106, 107, 106, 107, 107, 106, 107, 106, 107, 106, 107, 107 } */
remaining_cycles = scanline_cycles[scanline];
to get correct scanline timings.
nmi() could be moved out of the while loop, sure.
But I do not understand what you mean by leftover cycles.
Excuse my ignorance.
Errata corrige: It is [scanline % 3] for NTSC, % 16 for PAL.
tanoatnd wrote:
Errata corrige: It is [scanline % 3] for NTSC, % 16 for PAL.
The "extra cycle" could be taken care of by doing:
Code:
remaining_cycles += scanline_cycles[scanline];
instead of
Code:
remaining_cycles = scanline_cycles[scanline];
Assuming, of course, that remaining_cycles is signed. That way you won't lose a cycle if you started a 3-cycle instruction with only 2 cycles left in remaining_cycles.
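A minimal sketch of the carry-over described above, in C. The costs[] array stands in for the cycle counts a real CPU core would report, and run_scanline and its parameters are hypothetical names, not taken from the code posted earlier.

```c
#include <stddef.h>

/* Runs "instructions" until the scanline budget is spent and returns the
 * new value of remaining_cycles (zero or negative when the budget ran out).
 * costs[] stands in for the cycle count of each instruction; *pc indexes it. */
int run_scanline(int remaining_cycles, int budget,
                 const int *costs, size_t n, size_t *pc)
{
    remaining_cycles += budget;   /* += keeps the overshoot from the last line */
    while (remaining_cycles > 0 && *pc < n) {
        remaining_cycles -= costs[*pc];   /* may go below zero: that is the debt */
        ++*pc;
    }
    return remaining_cycles;
}
```

With 3-cycle instructions and a budget of 113, the first call ends at -1, so the next scanline effectively starts with 112 cycles instead of 113 and nothing is lost.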
An aside: I wonder whether using a float to keep track of cycles is OK. Adding 341/3 to a float remaining_cycles would seem to be more "accurate", but is it necessary? I.e., does the += take care of it?
You know, this kind of structure is exactly what I used to use before I rewrote my emulator to get away from it.
Ahhhh, those are LOST CYCLES! I remember reading about them, taking note
of them, and then forgetting them.
Thanks!
Dwedit wrote:
You know, this kind of structure is exactly what I used to use before I rewrote my emulator to get away from it.
What kind of structure?
I divide my emulator into three units.
Emulator - contains the main loop and the pause, run, and stop functions (it knows about the CPU and PPU).
Cpu - runs only in the CPU code context.
Ppu - runs only the PPU functions.
Each cpu.step() is responsible for executing an instruction and adding the correct number of cycles to the cycle counter. Likewise, ppu.scanLine() is itself responsible for dealing with the scanline counter.
Code:
while (running) {
    while (cpu.cycles < CYCLES_TO_SCANLINE) {
        cpu.step();
    }
    cpu.cycles = cpu.cycles - CYCLES_TO_SCANLINE;
    ppu.scanLine();
}
I still don't know where to handle the joystick (press and release) events.
You might have it update the joystick on line 240 (the post-render scanline), just before the vblank interrupt.
tepples wrote:
You might have it update the joystick on line 240 (the post-render scanline), just before the vblank interrupt.
Should I apply the press on one line 240 and then release the buttons on the following line 240?
(I was thinking of using the OpenGL events... however, one of the two (OpenGL or my emulator) is faster, so I would need to synchronize them.)
By the way, does the standard joystick have a correct timing for pressing and releasing the buttons?
dreampeppers99 wrote:
Should I apply the press on one line 240 and then release the buttons on the following line 240?
You shouldn't be worried about events, because the NES is not event-based, it just wants to know what is pressed when it checks the joypad. So do whatever you have to do in your emulator to read input, use events, whatever, but feed the data to the NES only once per frame, and leave it that way until you update it the next frame.
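A rough sketch of that "feed it once per frame" idea, in C. It assumes the host input has already been collapsed into one byte per frame; Pad, pad_update, pad_write and pad_read are hypothetical names. The bit order (A, B, Select, Start, Up, Down, Left, Right) is the order a standard pad reports its buttons on successive reads of $4016.

```c
#include <stdint.h>

/* Hypothetical controller model: bit 0 = A, then B, Select, Start,
 * Up, Down, Left, Right. */
typedef struct {
    uint8_t frame_state;  /* snapshot fed to the NES once per frame */
    uint8_t shift;        /* shift register the game reads from */
    uint8_t strobe;       /* last value of bit 0 written to $4016 */
} Pad;

/* Call once per frame (e.g. on line 240) with whatever the host reports. */
void pad_update(Pad *p, uint8_t host_buttons)
{
    p->frame_state = host_buttons;
}

/* Write to $4016: while the strobe bit is high the register keeps reloading. */
void pad_write(Pad *p, uint8_t value)
{
    p->strobe = value & 1;
    if (p->strobe)
        p->shift = p->frame_state;
}

/* Read from $4016: returns one button per read, A first. */
uint8_t pad_read(Pad *p)
{
    uint8_t bit;
    if (p->strobe)
        p->shift = p->frame_state;   /* strobe high: every read reports A */
    bit = p->shift & 1;
    p->shift = (uint8_t)((p->shift >> 1) | 0x80);  /* reads past the 8th return 1 */
    return bit;
}
```

How the host gathers input (events, polling, OpenGL callbacks) stays entirely outside this model; only pad_update sees it.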
Quote:
So do whatever you have to do in your emulator to read input, use events, whatever, but feed the data to the NES only once per frame, and leave it that way until you update it the next frame.
Thanks very much!
Currently I just return 1 from the port!
Question: I still don't have anything of the APU implemented. Could this cause something that stops Super Mario 1?
dreampeppers99 wrote:
Question: I still don't have anything of the APU implemented. Could this cause something that stops Super Mario 1?
I think SMB1 will run fine without the APU implemented. I believe most games just write data to the APU, and very few rely on DMC or Frame IRQs, or data returned by APU registers. But I'm not 100% sure, so if anyone thinks I'm missing something please say something.
Super Mario 1 won't appear to do anything (it'll look frozen) until you add Sprite 0 Hit flag emulation.
MottZilla wrote:
Super Mario 1 won't appear to do anything (it'll look frozen) until you add Sprite 0 Hit flag emulation.
In my case it could be that...
Yeah, but the sprite 0 hit is a PPU thing, it has nothing to do with the APU. Games that rely on sprite 0 hits will most likely get stuck in a loop waiting forever for the hit flag to be set if you don't have sprite hits implemented.
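For a scanline-based emulator, a crude version of the hit test could look like the sketch below. It is only an approximation: real hardware sets the flag at the exact dot where the pixels overlap, while this checks once per scanline. The function name and the opacity arrays are hypothetical; the caller is assumed to call it only while both background and sprite rendering are enabled, and to clear the flag on the pre-render scanline.

```c
#include <stdint.h>

#define PPUSTATUS_SPRITE0_HIT 0x40  /* bit 6 of $2002 */

/* bg_opaque[x] and spr0_opaque[x] are nonzero where the background and
 * sprite 0 have a non-transparent pixel on the current scanline.
 * clip_left is nonzero when either layer is hidden in the leftmost 8 pixels.
 * Returns the updated status byte. */
uint8_t sprite0_check(uint8_t status, const uint8_t bg_opaque[256],
                      const uint8_t spr0_opaque[256], int clip_left)
{
    int x;
    for (x = clip_left ? 8 : 0; x < 255; ++x) {   /* x = 255 never hits */
        if (bg_opaque[x] && spr0_opaque[x])
            return status | PPUSTATUS_SPRITE0_HIT;
    }
    return status;
}
```

A game polling $2002 for bit 6 then leaves its wait loop on the scanline where sprite 0 overlaps the background, which is usually enough to get past the frozen title screen.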
I'm using Nintendulator and comparing against its debugger output, and I've observed that my scanline numbers (I mean the current scanline) are quite different from Nintendulator's!
Now I'm confused...
I used to use 114 as a fixed value: when the cycle counter reached 114, I ran the scanline. However, that seems to be wrong.
Quote:
It's not 113, it's 341/3, or 113.666 cpu cycles per scanline.
Right! but we count the cycles in integer form...
Quote:
scanline_cycles[] = { 113, 113, 114 }
Following this pattern will cause some differences too... (too low)
Note that always using 114 will cause differences too!
Quote:
Errata corrige: It is [scanline % 3] for NTSC, % 16 for PAL.
I don't even understand this.
[#1 mod 3] ???
[113,666 mod 3] ????
What am I supposed to do to say:
"Hey, now you should execute the scanline at cycle #"?
ahem
113.66667 is 114,114,113. Not 113,113,114
Thanks Disch
Scanline   Cumulative cycles   Cycles run
#0         113,66667           114,00000
#1         227,33334           114,00000
#2         341,00001           113,00000
#3         454,66668           114,00000
#4         568,33335           114,00000
...
#18        2159,66673          114,00000
#19        2273,33340          114,00000
#20        2387,00007          113,00000
#21        2500,66674          114,00000
#22        2614,33341          114,00000
#23        2728,00008          113,00000
...
#28        3296,33343          114,00000
#29        3410,00010          113,00000
#30        3523,66677          114,00000
#31        3637,33344          114,00000
#32        3751,00011          113,00000
Total                          3751,00000
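The 114, 114, 113 pattern in the table above can also be reproduced with integer arithmetic only, by rounding the running total of 341/3 cycles per scanline up to a whole cycle. This is just a sketch; ntsc_cycles_for_scanline is a hypothetical helper name.

```c
/* CPU cycles to run on NTSC scanline n (0-based), rounding the running
 * total of 341/3 cycles per scanline up to a whole cycle. */
int ntsc_cycles_for_scanline(int n)
{
    int end   = (341 * (n + 1) + 2) / 3;  /* ceil(341 * (n + 1) / 3) */
    int start = (341 * n + 2) / 3;        /* ceil(341 * n / 3) */
    return end - start;
}
```

Because each value is the difference of two running totals, the error never accumulates, which is the same effect the += carry-over gives.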
The pre-render scanline ("scanline -1") is sometimes 340 dots instead of 341, so you might be off a bit when comparing to other emus.
Disch wrote:
The pre-render scanline ("scanline -1") is sometimes 340 dots instead of 341, so you might be off a bit when comparing to other emus.
Since this is the case when rendering is turned on, wouldn't it be better to measure everything with PPU cycles instead? Might not help much with PAL emulation though.
tokumaru wrote:
wouldn't it be better to measure everything with PPU cycles instead? Might not help much with PAL emulation though.
For PAL emulation, clock the CPU at (3, 3, 3, 3, 4, ...) PPU cycles.
which is why I always suggest you scale up.
PAL CPU cycle = 16 cycles
NTSC CPU cycle = 15 cycles
PPU cycle = 5 cycles
This doesn't solve the prerender scanline problem he's facing though, which is why I didn't mention it before.
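A small sketch of that scaling, in C, using the 15 / 16 / 5 units from the post above. The Clock struct and clock_advance are hypothetical names; the idea is only that both chips are timed against one shared counter in scaled units.

```c
#include <stdint.h>

enum { NTSC_CPU_DIV = 15, PAL_CPU_DIV = 16, PPU_DIV = 5 };

typedef struct {
    uint64_t cpu_time;  /* scaled time the CPU has been run up to */
    uint64_t ppu_time;  /* scaled time the PPU has been run up to */
    int cpu_div;        /* NTSC_CPU_DIV or PAL_CPU_DIV */
} Clock;

/* Account for cpu_cycles just executed and return how many whole PPU
 * cycles the PPU must now run to catch up. */
unsigned clock_advance(Clock *c, unsigned cpu_cycles)
{
    unsigned dots;
    c->cpu_time += (uint64_t)cpu_cycles * (uint64_t)c->cpu_div;
    dots = (unsigned)((c->cpu_time - c->ppu_time) / PPU_DIV);
    c->ppu_time += (uint64_t)dots * PPU_DIV;   /* the remainder stays pending */
    return dots;
}
```

Stepping one CPU cycle at a time, the NTSC setting always yields 3 PPU cycles, and the PAL setting yields the 3, 3, 3, 3, 4 sequence without any special casing.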
Disch wrote:
which is why I always suggest you scale up.
Seems like the best way to do it.
Quote:
This doesn't solve the prerender scanline problem he's facing though
Couldn't he check for the pre-render scanline and whether rendering is on and advance a ghost/fake PPU cycle to pretend the scanline was shorter?
Disch wrote:
The pre-render scanline ("scanline -1") is sometimes 340 dots instead of 341, so you might be off a bit when comparing to other emus.
When is "sometimes"? I seem to recall it is every other frame from reading something else...
Every other frame, if rendering is enabled. Disabling rendering for one frame does not affect the hidden even/odd counter.
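In code, that rule could be sketched as below (NTSC only). FrameParity, prerender_dots and frame_end are hypothetical names, and which of the two alternating frames is called "odd" is an assumption of this sketch.

```c
typedef struct {
    int odd_frame;  /* hidden even/odd counter, toggles every frame */
} FrameParity;

/* Length in PPU dots of the pre-render scanline for the current frame. */
int prerender_dots(const FrameParity *f, int rendering_enabled)
{
    return (f->odd_frame && rendering_enabled) ? 340 : 341;
}

/* Call once at the end of every frame, whether or not rendering was on,
 * so that disabling rendering does not disturb the alternation. */
void frame_end(FrameParity *f)
{
    f->odd_frame = !f->odd_frame;
}
```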
Some annoying questions:
As this is my first emulator, I am trying to keep things simple.
So no ppu cycles are involved in my code, what I would like to
do is run cpu for the cycles needed for a scanline, draw the
scanline, and so on.
Now, is it better to truncate or to round up when I compute the CPU
cycles for a scanline?
For example, for the NTSC NES, if I truncate I get 113 cycles for the first,
114 for the 2nd, and 114 for the 3rd scanline, and the pattern repeats.
If I round up, I get 114, 114, and then 113.
I think the order is not important, but if it is, tell me (and please tell me why).
Another issue: while vblank is on, should I continue to run the CPU one
scanline at a time, or could I run it for the whole vblank period (i.e.
no stop at every scanline, but 2273 (NTSC) cycles all at once)
and then restart from the top of the screen?
Last question: to prepare for supporting some mappers,
I would like to organize PRG ROM as an array of blocks. What is the
best size to choose for the blocks? (Suppose I choose 1K;
I think no mapper allows swapping a block smaller than 1K, right?)
Thanks,
tano
In the vast majority of mappers, PRG ROM isn't switched finer than 4 KiB, nor is CHR ROM switched finer than 1 KiB.
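One possible layout, sketched in C with 4 KiB PRG pages as suggested above. PrgMap, prg_map and prg_read are hypothetical names, and the sketch assumes the ROM size is a multiple of the page size.

```c
#include <stdint.h>
#include <stddef.h>

#define PRG_PAGE_SIZE 0x1000u                     /* 4 KiB pages */
#define PRG_PAGES     (0x8000u / PRG_PAGE_SIZE)   /* $8000-$FFFF = 8 pages */

typedef struct {
    const uint8_t *rom;              /* whole PRG ROM image */
    size_t rom_size;                 /* assumed multiple of PRG_PAGE_SIZE */
    const uint8_t *page[PRG_PAGES];  /* page[i] covers $8000 + i * 4 KiB */
} PrgMap;

/* Point `count` consecutive 4 KiB slots, starting at `slot`, at ROM page
 * `rom_page` and the pages after it, wrapping around the ROM size. */
void prg_map(PrgMap *m, unsigned slot, unsigned count, size_t rom_page)
{
    size_t pages_in_rom = m->rom_size / PRG_PAGE_SIZE;
    unsigned i;
    for (i = 0; i < count && slot + i < PRG_PAGES; ++i)
        m->page[slot + i] =
            m->rom + ((rom_page + i) % pages_in_rom) * PRG_PAGE_SIZE;
}

/* Read a byte from CPU address addr, which must be in $8000-$FFFF. */
uint8_t prg_read(const PrgMap *m, uint16_t addr)
{
    return m->page[(addr - 0x8000u) / PRG_PAGE_SIZE][addr & (PRG_PAGE_SIZE - 1)];
}
```

A mapper that switches 16 KiB banks would then call prg_map with count = 4, so larger bank sizes fall out of the same table.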