psycopathicteen wrote:
I wonder if the Super FX chip really is better than the SA-1 for this game. The SA-1 can do character conversion DMA.
It can also execute at nearly full speed from ROM, and has a real divider (MMIO, but it's much better than the S-CPU's divider, and it beats the Super FX's divide-by-two instruction and cumbersome lookup table access). It also supports 8 MB of ROM, vs. the Super FX's 2 MB (yeah, you can add CPU ROM, but that's only useful for stuff the Super FX doesn't need to know about).
It shouldn't be hard to have the S-CPU run mostly in WRAM and only touch BW-RAM (and possibly ROM) for DMA. This gives you 8 master cycles per double pixel, leaving 36 CPU cycles to prepare the data and load the opcode(s) for the write(s) before you reach the Super FX bottleneck. Mind you, this only applies to walls; the Super FX is probably unbeatable at filling floors and ceilings and even drawing backdrops, and objects/enemies can probably be done line by line too.
If the mosaic trick could somehow be got working, it would double the Super FX's speed ceiling for doubled pixel columns, whereas the SA-1 would only see a small benefit...
ap9 wrote:
Possibly. But SA-1 is expensive, and few emulators support it.
What? Every emulator supports the SA-1. It's not accurate, but it's there. And as for it being expensive, every game that used it appears to have used it primarily as copy protection, or possibly as a laziness budget/schedule buffer. There's nothing about any of them that looks like it needed a coprocessor, never mind one as powerful as the SA-1.
Are you thinking of the ST-018?
ap9 wrote:
I remember a while back stepping through the Mac version of Doom in a debugger to discover a horizontal technique to render more than one pixel at a time by storing two 16 bit portions in a 32 bit register, for the low-res. mode. This method, of course, is not conceivable on a 16 bit system.
That sounds like what the Super FX does for its automatic overhead-free checkerboard dither. I've abused the dither feature to get two pixels out of a single ROM read when drawing unscaled 2D graphics. Of course, since the COLOR register is 8-bit, dither only works for 4bpp or lower...
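To make the dither trick concrete, here is a rough Python model of how one 8-bit COLOR value can yield two different 4bpp pixels. It is only a sketch: the function names are made up for illustration, and the parity rule (high nibble when x XOR y is odd, low nibble otherwise) is my assumption about how the plot hardware selects the nibble, so check it against real hardware or a trusted emulator before relying on it.

```python
def dithered_color(color, x, y):
    """Return the 4-bit value plotted at (x, y) for an 8-bit COLOR byte.

    Assumed rule: odd (x XOR y) selects the high nibble, even selects the low one.
    """
    if (x ^ y) & 1:
        return (color >> 4) & 0x0F
    return color & 0x0F


def plot_pair(color, x, y):
    """Plot two horizontally adjacent pixels from a single COLOR load."""
    return dithered_color(color, x, y), dithered_color(color, x + 1, y)


if __name__ == "__main__":
    # One byte fetched from ROM holds both pixels of the pair.
    print(plot_pair(0xA3, 0, 0))
    print(plot_pair(0xA3, 0, 1))
```

Under that assumed rule the nibble order flips on every other line, so the source data for unscaled graphics would have to be stored with the nibbles swapped on alternate rows.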
Quote:
DOOM uses a lot of 32 bit math (esp. fixed point math); a proper port of Doom to SNES would be pretty slow in software because of the 16 bit limit.
How necessary is that precision, though? If you're using a 32-bit CPU, it would seem natural to use 32-bit operations.
On the SNES, we typically find that position can be expressed as a 24-bit fixed-point number (or a 32-bit number, for data integrity reasons) while velocity is fine with 16 bits. This sort of mixed-headroom math can speed things up considerably versus simply doing everything at 32-bit. It's also basically impossible in C...
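As a sketch of what I mean by mixed-width math, here is the arithmetic modelled in Python. The specific formats are assumptions picked for illustration (a 24-bit unsigned 16.8 position and a 16-bit signed 8.8 velocity), and the names are hypothetical. It only shows the number handling, not the cycle savings you would get from doing it in 65816 assembly.

```python
POS_BITS = 24   # assumed position format: 16.8 unsigned fixed point
VEL_BITS = 16   # assumed velocity format: 8.8 signed fixed point
FRAC_BITS = 8   # both formats share 8 fractional bits


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def step(pos, vel):
    """Add a 16-bit velocity to a 24-bit position, wrapping at 24 bits."""
    return (pos + to_signed(vel, VEL_BITS)) & ((1 << POS_BITS) - 1)


def pixel(pos):
    """Drop the fractional bits to get the whole-pixel coordinate."""
    return pos >> FRAC_BITS


if __name__ == "__main__":
    pos = 100 << FRAC_BITS          # start at pixel 100
    pos = step(pos, 0x0180)         # move +1.5 pixels
    print(pixel(pos))
    pos = step(pos, 0xFF00)         # move -1.0 pixel
    print(pixel(pos))
```

The point is that only the position carries the extra headroom; the velocity is sign-extended once at the add rather than being stored and manipulated at full width everywhere.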
Quote:
The engine Williams Entertainment used was maintained mostly in C, so there was plenty of room for performance improvements.
You sure about that? I was pretty sure no C compiler existed for the Super FX. Leaving aside the fact that it was something of a niche market, the chip is weird enough that it might actually be worse for C than the 6502.
I can see the S-CPU code being C, but the GSU code would probably have to be assembly.