ARM9 wrote:
93143 wrote:
2) Apparently ROM access in high speed mode (21 MHz) is 5 cycles instead of 3. Is the same true of RAM access? For both reading and writing? And does this impact the answer(s) for (1)? Did this change at all between chip/board revisions?
Since the RAM access is documented to be similar to ROM in most cases (other than where executing in RAM would impact RAM access) I'd think fullsnes is correct on this point.
I'd think so too, but it doesn't seem to be too definite on the subject, what with all the question marks and the caveat about poor documentation...
Quote:
Storing to ram (sm,st,sbk) uses a buffer so the cpu can continue executing opcodes without having to wait (except when running code in ram). If you execute other code while ram is being written you can perform 1-2 cycle writes (when running in cache).
Yeah, but that doesn't change the fundamental fact that the throughput to RAM is one byte every X cycles, which would bottleneck a sufficiently lean continuous write loop.
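To put that in concrete terms, here's a rough sketch of the bottleneck I mean. It assumes a one-entry write buffer and a loop that stores exactly one byte per iteration; the function name and the loop lengths are made up for illustration, not taken from any documentation.

```python
def cycles_per_iteration(loop_cycles: int, ram_cycles_per_byte: int) -> int:
    """Steady-state cost of a loop that stores one byte per iteration.

    Assumption: the buffer hides the RAM write only while RAM is idle,
    so the loop can never run faster than RAM can accept bytes.
    """
    return max(loop_cycles, ram_cycles_per_byte)


# Compare 3-cycle and 5-cycle RAM access for a few hypothetical loop lengths.
for ram_cycles in (3, 5):
    for loop_cycles in (2, 4, 6):
        cost = cycles_per_iteration(loop_cycles, ram_cycles)
        print(f"RAM {ram_cycles} cycles, loop {loop_cycles} cycles -> {cost} per byte")
```

So the buffer only helps as long as the loop body is at least as long as one RAM access; anything leaner just stalls on the store.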
According to my calculations, the application I have in mind (a port of a bullet hell shooter) is pretty much right on the edge of the chip's capabilities. The difference between 24 and 40 cycles for a 4bpp cache flush with unset bit-pend flags could be the difference between being able to exactly duplicate the original bullet patterns and having to simplify them.
I do not want to have to simplify the patterns, because that probably means rebalancing the game, which I don't trust myself to do.
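For reference, this is how I'm getting 24 and 40. It's only a sketch, and it rests on my assumption that a flush with unset bit-pend flags has to read each bitplane byte back before writing it, so two RAM accesses per bitplane; the names below are mine.

```python
BITPLANES_4BPP = 4          # one byte per bitplane for an 8-pixel row
ACCESSES_PER_BITPLANE = 2   # assumed: read existing byte, then write merged byte


def flush_cycles(cycles_per_access: int, bitplanes: int = BITPLANES_4BPP) -> int:
    """Estimated cycles to flush one pixel cache row with unset bit-pend flags."""
    return bitplanes * ACCESSES_PER_BITPLANE * cycles_per_access


print(flush_cycles(3))  # 24, if RAM access takes 3 cycles
print(flush_cycles(5))  # 40, if RAM access takes 5 cycles
```

If the flags are all set and the read can be skipped, the same arithmetic gives half those numbers, but I can't count on that for scattered bullets.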
I suppose I could leave the chip in low-speed mode and overclock it, but that's cheating (good luck getting Nintendo to agree to let you do that for a commercial release), and might result in errors with the memory used in the original games...
Quote:
93143 wrote:
3) Is the instruction cache on the latest version(s) of the GSU 256 bytes or 512 bytes? I'd like to be sure.
512 bytes, all revisions, it's in the manual, fullsnes and bsnes. And you can test it yourself with $3100-$32FF. Where'd you read that it's 256 bytes?
Well, byuu has used the number a few times. I figured there had to be a reason...
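For what it's worth, the address range you gave does work out to 512 bytes. A trivial check, taking $3100-$32FF as an inclusive range:

```python
CACHE_START = 0x3100  # first address of the range quoted above
CACHE_END = 0x32FF    # last address of the range, inclusive

cache_bytes = CACHE_END - CACHE_START + 1
print(cache_bytes)  # 512
```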
Quote:
If you want exact timings, consider profiling on hardware.
I guess that would be ideal, but I don't really have the resources (or skills) to do that right now. Ultimately I may well end up running on a real GSU, but I'd rather not have to choose between doing that up front (stalling the whole project until I can get the time and resources together) and potentially getting a nasty surprise after writing a ton of code...
I suppose I could just assume higan is close enough and test it there, but byuu has complained about Super FX timing in the past and I don't know if the current GSU code is as accurate as the core system emulation...