Sorry for the long post.
Disch wrote:
Even if you optimize it down to the scanline that Sprite 0 is on... checks will still be causing the emu to split those 8 scanlines every 2 instructions.
OK, so a simple version's worst case would be 16 scanlines out of 240 rendered lines, about 7% of the total. That leaves 93% of the scanlines potentially rendered in an optimal way. What if you predict exactly when the sprite 0 hit will occur? Then you don't have to switch to fine rendering mode unless that prediction gets invalidated by PPU writes, which is uncommon.
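The prediction idea can be sketched roughly like this. This is a hypothetical illustration, not real emulator code: the structure, field names, and the assumption that sprite 0's first pixel is opaque are all simplifications.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sketch: cache a prediction of where the sprite 0 hit will
   occur, and only fall back to fine (per-dot) rendering when a PPU write
   invalidates it. All names here are illustrative. */
typedef struct {
    int sprite0_y, sprite0_x;   /* from OAM */
    bool prediction_valid;      /* cleared on any mid-frame PPU write */
    int hit_scanline, hit_dot;  /* cached prediction */
} Ppu;

/* Recompute the predicted hit point. A real version would scan for the
   first overlapping opaque sprite and background pixel; here we assume
   the sprite's top-left pixel hits. */
static void predict_sprite0_hit(Ppu *p) {
    p->hit_scanline = p->sprite0_y + 1;  /* sprites render one line late */
    p->hit_dot      = p->sprite0_x + 1;
    p->prediction_valid = true;
}

static void on_ppu_write(Ppu *p) {
    /* A mid-frame write might move sprite 0 or change tiles: repredict. */
    p->prediction_valid = false;
}

/* Called once per frame: returns the scanline where the fine-grained
   rendering loop must take over; everything before it can use the
   optimized whole-scanline renderer. */
static int render_frame(Ppu *p) {
    if (!p->prediction_valid)
        predict_sprite0_hit(p);
    return p->hit_scanline;
}
```

In the common case (no mid-frame writes), the prediction holds for the whole frame and only the scanlines around the predicted hit ever pay for cycle-level emulation.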
Quote:
[Pirate Mapper 90] has an IRQ counter which can be driven by PPU reads and CPU writes... efficiently tracking all of that inside an emulator can make things extraordinarily complicated extremely fast.
This is a problem; even simple hardware driven by odd sets of signals can be very difficult to optimize. The general divide-and-conquer optimization strategy will probably help as usual: optimize for the common case and the few exceptions will just be slower.
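Even an IRQ counter clocked by PPU reads can sometimes be handled this way: instead of clocking it on every access, predict when it will trip and schedule a single future event, repredicting on the uncommon reconfiguration. A toy sketch (all names and the steady-read-rate assumption are mine, not how any real mapper emulation works):

```c
#include <assert.h>

/* Hypothetical sketch: rather than decrement an exotic IRQ counter on
   every qualifying PPU read, predict the time it will reach zero and
   schedule the IRQ as one future event. Any write that reconfigures the
   counter just pays the cost of a repredict. */
typedef struct {
    int counter;        /* counts down on each qualifying PPU read */
    long next_irq_time; /* absolute cycle of the predicted IRQ */
} IrqCounter;

/* Common case: rendering is enabled and the rate of qualifying PPU reads
   per scanline is steady, so the trip time is a simple computation. */
static void schedule_irq(IrqCounter *c, long now, int reads_per_scanline,
                         long cycles_per_scanline) {
    long scanlines = (c->counter + reads_per_scanline - 1) / reads_per_scanline;
    c->next_irq_time = now + scanlines * cycles_per_scanline;
}
```

The uncommon cases (mid-frame writes, rendering toggled off) fall back to repredicting or to per-access clocking, so correctness is preserved and only the exceptions are slow.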
Quote:
Getting it all "accurate" will lead to a very complicated program. And there's no way such a program will ever be in the same performance ballpark as something that simplified and optimized it all as much as NESticle did.
The question is, are these quirks actually invoked in common NES programs? It would be an interesting study to seek out consequential quirks invoked by real NES programs, as a way to prove that such an emulator is impossible, rather than merely speculating as we are.
byuu wrote:
I won't emulate the same thing two ways and switch between the more accurate and faster version depending on whether or not the game needs it at a certain point. That gets sloppy real quick, and you waste too much time maintaining rather than discovering new hardware quirks and such.
What's being proposed here by tepples and others is not this; the proposal is to use optimized code where it has no side-effects. For example, if a game doesn't touch the PPU registers for the entire frame, you can use optimized tile and sprite rendering, without any effect on accuracy.
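The "no side-effects" check can be as simple as a dirty flag. A minimal sketch, assuming a hypothetical CPU core that notifies the PPU of any $2000-$2007 access; the names are illustrative:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sketch of "optimize where there are no side-effects":
   track whether any PPU register was touched this frame, and take the
   whole-frame fast renderer only when state is guaranteed static. */
typedef enum { RENDER_FAST_FRAME, RENDER_PER_SCANLINE } RenderPath;

static bool ppu_dirty;  /* set by the CPU core on any $2000-$2007 access */

static void cpu_wrote_ppu_register(void) { ppu_dirty = true; }

/* Called at the end of each frame to pick next frame's strategy. Output
   is identical either way; the fast path is valid only because nothing
   could have changed mid-frame. */
static RenderPath choose_render_path(void) {
    RenderPath path = ppu_dirty ? RENDER_PER_SCANLINE : RENDER_FAST_FRAME;
    ppu_dirty = false;  /* re-arm for the next frame */
    return path;
}
```

Since the fast path is only taken when it is provably equivalent, accuracy is unaffected; games that hammer the registers simply never leave the fine-grained path.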
There seems to be significant negativity towards designs that optimize an emulator's performance. In the past people had to focus on efficiency, and they often did so in ways that unnecessarily sacrificed accuracy. I find the activity enjoyable, though it has nothing to do with emulation specifically; it's the more general practice of software engineering and examining possible tradeoffs. In my emulator I've had fun keeping it efficient while still passing some of my most rigorous test ROMs (and not just "passing" them in a hacky way).
Quote:
Processors continue to get more and more psychotic. Nowadays, you execute an xchg opcode and it's sixteen times slower than three mov opcodes on Pentium IVs. That kind of thing didn't happen on the 386es ZSNES was designed for. Optimizing for a generic x86 target is a journey into madness. It's best to just follow a few simple rules (don't use obscure opcodes, try and use the full register sizes whenever possible, etc) and go with that.
Correction: the x86 architecture is psychotic. If you've used other architectures, you know how refreshing they are in their regularity and efficiency (the same way the 6502 and 65816 are). I take it using a compiler for x86 these days is generally a win?
Quote:
I'll still continue to go for simplicity, and my emulator has always been more of a self-documenting reference platform than a true user-friendly emu. And I anticipate it always remaining easier to implement new findings into my design than any emu aiming for speed. I just added on a bunch of UI stuff since I had about 10,000 people using it anyway. Heck, I use it myself since it runs at 2-3x speed on my PC, so why not?
There's nothing wrong with an emulator design that favors ease-of-maintenance over ultimate efficiency. In these discussions there seems to be a notion that only one design is right, and the others are wrong and should be avoided. All designs involve tradeoffs, and each one emphasizes some things over others: the programming skill needed to implement it, target platforms, clarity, language of implementation, efficiency, etc. There's no need to trivialize other designs as a way to justify your decisions; the very existence of a tradeoff means that no single design can meet all the demands equally, and that each design has its merits and is worth being implemented by someone.
Quote:
I really think this will become less and less of a concern in the future. Once even Pocket PCs run at 3ghz, who will care if an emu eats up 1% or 10% when the backlight eats 20x the battery life either way?
What about special features that require a fast emulator, like arbitrary seeking in a movie, real-time reverse playback, or showing a wall of emulators all running at full speed? (all of which are quite cool features to see)
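To make the cost concrete: reverse playback is typically built from periodic savestates plus re-emulation forward, which multiplies the per-frame work. A toy sketch of the arithmetic (the interval and names are illustrative assumptions, not any particular emulator's design):

```c
#include <assert.h>

/* Hypothetical sketch: with one savestate every SNAPSHOT_INTERVAL frames,
   displaying frame F in reverse means restoring the nearest earlier
   snapshot and re-emulating forward the remaining frames. */
#define SNAPSHOT_INTERVAL 60  /* one savestate per second at 60 fps */

/* Worst case: SNAPSHOT_INTERVAL - 1 extra frames of emulation to show a
   single displayed frame, so the core must run far faster than realtime. */
static int frames_to_reemulate(int target_frame) {
    return target_frame % SNAPSHOT_INTERVAL;
}
```

An emulator with only 1.1x realtime headroom can't do this smoothly, no matter how fast the host CPU gets in absolute terms.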