In my NES 6502 CPU core I read instruction bytes (opcodes and their operands) directly from memory rather than through the usual memory read function that handles memory-mapped I/O devices like the PPU and APU. My reasoning is that nothing will intentionally execute from an I/O device, and that doing so wouldn't be useful anyway. This improves performance by avoiding a function call and the need to keep track of the timestamp for these reads.
I divide memory into pages (currently 4K in size), and direct reads simply go through a mapping table. The table covers the entire 64K address space, not just the upper half where the ROM is usually mapped. Unmapped and I/O pages all point to a special page filled with a byte value that isn't a legal opcode, which helps catch any attempt to execute from them. I also use this optimization for zero-page and stack accesses, since I don't handle cartridges with hardware that does anything special when those areas are accessed (and I don't know of any that do).
Code:
typedef unsigned char byte;

const unsigned page_size = 0x1000;

// one pointer per 4K page, covering the whole 64K address space
byte* pages [0x10000 / page_size];

// the usual memory read function (handles memory-mapped I/O)
int emulate_read( unsigned addr );

// direct read through the mapping table; no I/O handling
inline int read_mem( unsigned addr )
{
    return pages [addr / page_size] [addr % page_size];
}

// pc, a, sp, and set_nz() are CPU state and helpers defined elsewhere
void emulate_cpu()
{
    int opcode = read_mem( pc );
    switch ( opcode )
    {
    case 0xA9: // LDA #imm
        a = read_mem( pc + 1 );
        set_nz( a );
        pc += 2;
        break;

    case 0xA5: // LDA zp
        a = read_mem( read_mem( pc + 1 ) );
        set_nz( a );
        pc += 2;
        break;

    case 0xAD: { // LDA abs (operand fetched directly, data read may hit I/O)
        unsigned addr = read_mem( pc + 2 ) * 0x100 + read_mem( pc + 1 );
        a = emulate_read( addr );
        set_nz( a );
        pc += 3;
        break;
    }

    case 0x68: // PLA
        sp = (sp + 1) & 0xff;
        a = read_mem( sp + 0x100 );
        set_nz( a );
        pc += 1;
        break;

    // ...
    }
}
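For completeness, here is a rough sketch of how the mapping table described above might be filled in. Nothing in it is taken from my actual core: the buffer names (unmapped_page, low_ram, prg_rom), the init_pages() function, the 32K ROM size, and the filler value 0x02 are all assumptions made for illustration. 0x02 is one of the undocumented opcodes that jam an NMOS 6502; any other byte the core treats as illegal would serve the same purpose. RAM mirroring is not modeled here.

```cpp
#include <cstring>

typedef unsigned char byte;

const unsigned page_size  = 0x1000;
const unsigned page_count = 0x10000 / page_size;

// Assumed filler value: 0x02 is an undocumented opcode that jams an
// NMOS 6502, so the core can treat it as "executed from a bad page".
const byte bad_opcode = 0x02;

byte* pages [page_count];

// Hypothetical backing buffers, named for this sketch only.
static byte unmapped_page [page_size]; // shared by unmapped and I/O pages
static byte low_ram       [page_size]; // backs $0000-$0FFF (zero page, stack)
static byte prg_rom       [0x8000];    // assumed 32K ROM at $8000-$FFFF

// Same direct read as in the code above.
inline int read_mem( unsigned addr )
{
    return pages [addr / page_size] [addr % page_size];
}

void init_pages()
{
    // Start with every page pointing at the filler page, so anything
    // not explicitly mapped below reads back as bad_opcode.
    std::memset( unmapped_page, bad_opcode, sizeof unmapped_page );
    for ( unsigned i = 0; i < page_count; i++ )
        pages [i] = unmapped_page;

    // Page 0 holds the zero page and the stack.
    pages [0] = low_ram;

    // Upper half of the address space maps straight onto the ROM image.
    for ( unsigned i = 0; i < sizeof prg_rom / page_size; i++ )
        pages [0x8000 / page_size + i] = prg_rom + i * page_size;
}
```

With this setup, read_mem( 0x2002 ) returns the filler byte instead of touching the PPU, so a stray jump into I/O space shows up as an illegal opcode in the main switch. A mapper that switches banks would only need to repoint the affected entries in pages.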
Any questions, problems, or improvements?