Hello everybody,
we all know that quite a few SNES games run in SlowROM mode, among them games like Super R-Type and Gradius III, which suffer from slowdown problems.
My question is: has anyone so far attempted to patch these games to use FastROM, with the chance of making the code execute faster and thus eliminating any speed problems?
 
You should know it doesn't work like that. FastROM lets ROM fetches from banks $80 and up, I think it is, run at 3.58 MHz. The problem is that Gradius III is an old LoROM game that executes all its code in the lower bank numbers. So to get a speed boost from FastROM you'd have to change all the code to run from its mirrored locations in the higher bank numbers and also set the FastROM bit in the appropriate register. And even then I think you have to do something with NMI or possibly other jumps or bank changes. I cannot recall the exact details.
However, you should also know that, at least from what I've heard, the CPU doesn't actually run at a constant 2.68 MHz or 3.58 MHz (or whatever the exact numbers are). The speed changes constantly based on the memory accessed. Internal register operations run at the fastest speed. Memory accesses like opcode fetches or variable fetches run at the medium (SlowROM) speed of 2.68 MHz, and I think controller registers and maybe others run at the slowest speed, about 1.79 MHz. Because it changes a lot, you never actually execute at one precise speed.
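To make the three speeds concrete, here is a minimal sketch of a per-access timing lookup. It assumes the region boundaries and the 6/8/12 master-clock costs from commonly circulated SNES timing notes, and an NTSC master clock of roughly 21.477 MHz. The function name `access_clocks` and the exact boundaries are assumptions for illustration, not something confirmed in this thread.

```python
MASTER_CLOCK_HZ = 21_477_272  # approximate NTSC master clock (assumption)


def access_clocks(bank: int, addr: int, fastrom: bool) -> int:
    """Master clocks for one CPU access to bank:addr (hypothetical helper)."""
    if bank & 0x40:
        # Banks $40-$7F are always slow; $C0-$FF follow the FastROM bit.
        return 6 if (bank & 0x80 and fastrom) else 8
    # Banks $00-$3F and $80-$BF share this layout.
    if addr < 0x2000:
        return 8    # WRAM mirror
    if addr < 0x4000:
        return 6    # PPU/APU registers
    if addr < 0x4200:
        return 12   # old-style controller ports: the slowest region
    if addr < 0x6000:
        return 6    # CPU I/O registers
    if addr < 0x8000:
        return 8    # expansion area
    # ROM: fast only in the upper mirror and only with $420D bit 0 set.
    return 6 if (bank & 0x80 and fastrom) else 8


def clocks_to_mhz(clocks: int) -> float:
    """Effective CPU rate if every access cost this many master clocks."""
    return round(MASTER_CLOCK_HZ / clocks / 1e6, 2)
```

Under these assumptions, code running from bank $00 stays at 8 clocks per ROM access even with the FastROM bit set, which is why the code has to be moved to the $80+ mirror.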
I could be wrong, as it's been a while. But if you want to remove the slowdown in those games, it would seem to me the thing to do would be to disassemble/reverse engineer them and then try to make them use an enhancement chip like SuperFX, SA-1, or whatever else you could think of, to take the load off code in the game that takes too long to calculate when there is a lot going on. The SA-1 might be the way to go, but of course the cartridges won't be common if you wanted to build a cartridge.
 
Ummm, did you actually read my post?
I did not make ANY suggestion how a game should be patched to FastROM.
I did not make ANY (wrong) claim how FastROM works on the SNES.
I didn't even make any statement IF such a patch would be efficient at all.
The only question I posted is: has anyone attempted to patch a SlowROM game to FastROM so far, which obviously includes relocating the ROM to the upper banks (besides setting $420d). Obviously, you didn't.
Of course you could build any kind of interesting contraptions to make the SNES CPU (or in fact the SNES itself) redundant. My question aims at methods to improve performance in early games which still run on the original hardware, without rewriting the entire game.
 
6502freak wrote:
Hello everybody,
we all know that quite a few SNES games run in SlowROM mode, among them games like Super R-Type and Gradius III, which suffer from slowdown problems.
My question is: has anyone so far attempted to patch these games to use FastROM, with the chance of making the code execute faster and thus eliminating any speed problems?
Yeah, I did attempt just that a couple of years back on Super R-Type, coincidentally.
Three basic steps required:
-set $420d.0
-patch all JSL/JML to access banks $80 onwards.
-try to catch all manual data bank modifications, lookup tables with bank entries and long ROM data fetches.
The jump patches can be nicely automated with the help of a logfile, the latter partially requires rather tedious manual editing.
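The automated part of the steps above could look roughly like this. It is only a sketch: it assumes the logfile has already been reduced to a list of file offsets of executed long jumps (that list, and the name `patch_long_jumps`, are hypothetical), and it relies on the 65816 encodings JSL = $22 and JML = $5C, each followed by a three-byte little-endian address whose last byte is the bank.

```python
JSL = 0x22  # 65816 opcode: JSL long
JML = 0x5C  # 65816 opcode: JML long


def patch_long_jumps(rom: bytearray, offsets) -> int:
    """Set bit 7 of the bank byte of each JSL/JML at the given file offsets.

    `offsets` is assumed to come from an emulator trace log, so only
    locations that were really executed as instructions get touched.
    Returns the number of jumps patched.
    """
    patched = 0
    for off in offsets:
        if rom[off] not in (JSL, JML):
            continue  # log entry does not point at a long jump; leave it alone
        bank = rom[off + 3]  # operand layout: low byte, high byte, bank
        if bank < 0x7E:      # $7E-$7F are WRAM and have no fast mirror
            rom[off + 3] = bank | 0x80
            patched += 1
    return patched
```

Restricting the patch to offsets seen in a trace avoids rewriting data bytes that merely happen to equal $22 or $5C. Data bank changes, bank tables and long data fetches are not covered here and would still need the manual editing described above.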
The difference is slightly noticeable when comparing both versions side-by-side, but not significant. Certainly doesn't eliminate slowdowns completely.
For a quick comparison of what to expect, try the contrary and patch a fastrom game to slowrom (clear $420d.0).
[edit]
Super R-Type Fastrom
Also put an invincibility patch in there, just try pressing L/R in pause mode (ship symbol changes color upon cheat activation).
 
Please post an IPS or XOR, not a self-contained ROM.
 
6502freak wrote:
Ummm, did you actually read my post?
No, I didn't read any of your post. Out of the blue I pulled all that out of my ass and by chance it just happened to be on the same topic. Isn't that amazing?
Quote:
I did not make ANY suggestion how a game should be patched to FastROM.
I did not make ANY (wrong) claim how FastROM works on the SNES.
I didn't even make any statement IF such a patch would be efficient at all.
You posted about the topic, I shared my thoughts and knowledge on the topic. If you didn't want any feedback, you shouldn't have posted. If you felt my comments were in some way offensive or negative towards you, rest assured that they were not intended to be so. 
Quote:
The only question I posted is: has anyone attempted to patch a SlowROM game to FastROM so far, which obviously includes relocating the ROM to the upper banks (besides setting $420d). Obviously, you didn't.
Actually, I did exactly what I described with Gradius 3, hoping to get an improvement. No significant improvement was achieved, though I didn't spend much time on it. I don't believe spending more would change that, and I think the only way to eliminate all slowdown would be a coprocessor or enhancement chip to drastically speed up calculations, which would require significant hacking of the game.
 
To simplify matters, you could find the code hot spots in an emulator and focus your patching efforts on them, reducing the scope of what you have to change. This applies no matter what approach you take, be it FastROM enabling or code improvements. But from my experience, FastROM has a disappointingly small effect on overall execution. It only affects cycles that read from ROM, such as opcode and operand fetches. Many instructions also have non-ROM cycles, which are either internal processing or accesses to the stack, direct page, RAM, or I/O registers. As a quick test to see if this would be sufficient for a particular game, you could modify an emulator to always use FastROM timing for accesses to the cartridge. I too would like to improve these games that suffer from slowdown, but I'm resigned to doing it on an emulator some day, most likely via adding extra vblank scanlines in a compatible way.
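A rough worked example of why the gain is small, as a sketch under stated assumptions: ROM accesses cost 8 master clocks in SlowROM and 6 in FastROM, WRAM accesses always cost 8, and internal cycles always cost 6. The per-instruction cycle breakdowns below (8-bit accumulator) are my reading of 65816 timing and the helper `instruction_clocks` is hypothetical.

```python
SLOW, FAST = 8, 6  # master clocks per access (assumed values)


def instruction_clocks(cycles: str, fastrom: bool) -> int:
    """Sum master clocks for one instruction.

    `cycles` has one letter per CPU cycle:
      'R' = read from ROM (opcode or operand fetch)
      'W' = access to work RAM (direct page, stack, variables)
      'I' = internal cycle, no bus access
    """
    cost = {"R": FAST if fastrom else SLOW, "W": SLOW, "I": FAST}
    return sum(cost[c] for c in cycles)


# Assumed breakdowns for a few common instructions:
EXAMPLES = {
    "LDA #imm": "RR",   # opcode + operand, both from ROM
    "LDA dp":   "RRW",  # opcode + operand from ROM, data from RAM
    "PHA":      "RIW",  # opcode from ROM, internal cycle, stack write
}

for name, pattern in EXAMPLES.items():
    slow = instruction_clocks(pattern, fastrom=False)
    fast = instruction_clocks(pattern, fastrom=True)
    print(f"{name}: {slow} -> {fast} clocks ({100 * (slow - fast) / slow:.1f}% saved)")
```

Even in the best case, where every cycle is a ROM fetch, the saving is capped at 25% of the clocks, and anything that touches RAM or the stack gets noticeably less.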
MottZilla, I'm with 6502freak on your reply making assumptions and being rude: "You should know it doesn't work like that". His angry reply seems entirely appropriate. Take your lumps, acknowledge its appropriateness, and move on.
 
I can see how it's phrased in a way that you could take an implied tone like that. But I meant it in an informative manner. Context would show that I had no reason to assume that he should already know these things, but that since he is interested in the subject, these were important things to know. Either way, I'm sorry if you took offense, 6502freak.
I was told older versions of ZSNES behaved in the manner you describe about FastROM/SlowROM, where the entire instruction would be timed according to whether it was running from FastROM or SlowROM. I think ZSNES also has or had some kind of overclocking option.
 
Does seem like it would be easier to just 'overclock' the emulated CPU.  I used to do stuff like that all the time in NESticle, it was great for playing Elite.
 
Somewhat related, but IIRC (it's been a while) some of the slowdown in Super R-Type and Gradius III is worse than usual. Most games drop to half the frame rate (60 to 30), but those games have spots where it's worse than simply halved (missed vblank update due to whatever). I always thought that was strange. I can't imagine game logic taking up to 3 frames to complete (or lasting into the 3rd frame, not necessarily taking all of the 3rd frame). But, IIRC, these games do just that.
 
@d4s
Thanks for your answer, seems I had the same idea as you. I was wondering how much speed can be achieved at all, because from a developer's perspective, I would try having as much code and data as possible in the 128K ram, while reloading/decrunching between levels.
@Mottzilla
It's alright, nothing to lose any sleep about. 
 
@blargg
Modifying an emulator to execute FastROM cycles in banks $00-$7f might indeed be the best method to evaluate if relocating the code would make any sense. I guess the biggest problem with early SNES games was the lack of time to gain any experience with the hardware. I wouldn't be surprised if some early SNES game engines are based on modified 6502 code taken from NES & PC-Engine games.
In case of Super R-Type and Gradius III, both developers sort of "redeemed" themselves later with R-Type III and Parodius, which both run beautifully.
@tomaitheous
Yeah, it's interesting that both games at certain points strangle the hardware so much that the frame rate drops well below 30fps. It seems they didn't have any time at all to fine-tune the game. Once you reach the "Bubble Level" in Gradius III, at certain points you ask yourself "what were they thinking??".
 
But one helpful side effect to the massive slowdown in Gradius 3 is it makes it easier to dodge hazards. =)
I recall hearing that Gradius 3 was one of the very first 3rd party SNES games released. So it would make sense that it wasn't optimized. I haven't played Parodius Da! much myself, but if it gets as much action on screen as Gradius 3 without slowing down that would be pretty telling. The later Parodius game though uses SA-1 which would explain it not slowing down. 
It reminds me of Mega Man Wily Wars on Megadrive because there are plenty of points in it where you see slowdown that makes no sense because hardly anything is going on really. I think QuickMan just being on screen makes the game slow down which is ironic.
 
If it has less going on visually than Recca, and it slows down more than Recca, that can mean only one thing: bad programming.
 
MottZilla wrote:
But one helpful side effect to the massive slowdown in Gradius 3 is it makes it easier to dodge hazards. =)
It would be quite strange if the slowdowns were actually incorporated into the game design. Both games were actually responsible for the SNES being seen as an architecture unsuitable for fast action games.
Quote:
I recall hearing that Gradius 3 was one of the very first 3rd party SNES games released. So it would make sense that it wasn't optimized. 
I think either Gradius III or Ganbare Goemon were the first SNES games from Konami.
Quote:
I haven't played Parodius Da! much myself, but if it gets as much action on screen as Gradius 3 without slowing down that would be pretty telling.
The game has tons of sprites and doesn't suffer from slowdowns (except if you use the blue bell, where a huge detonation is filling the entire screen).
It's so sad that the game is virtually unknown in the US, since it's easily one of the best shooters. It also has a great soundtrack.
Quote:
 The later Parodius game though uses SA-1 which would explain it not slowing down. 
I'm specifically talking about Parodius Da!, which is a simple 8-megabit cart.
Quote:
It reminds me of Mega Man Wily Wars on Megadrive because there are plenty of points in it where you see slowdown that makes no sense because hardly anything is going on really. I think QuickMan just being on screen makes the game slow down which is ironic.
Hope they didn't code a 6502 emulator to port the games over... O_O
 
The reasoning I heard for Mega Man WW's slowness was they coded the entire game in C rather than Assembly. No clue at all if that is true. 
It's sad that a few bad efforts tend to stain a system's record in that way. There were Gradius 3's epic slowdowns (which didn't bother me that much; as I said, they can be helpful), and I also remember complaints about Final Fight having few enemies on screen at once, no two-player mode, poor music quality, etc., which some people took to mean that something was wrong with the SNES rather than that it was the developer's fault.
 
MottZilla wrote:
It's sad that a few bad efforts tend to stain a system's record in that way.
Which is exactly why every Nintendo console has had a lockout chip. Shovelware on the Atari 2600 stained the record of video games in general.
 
tepples wrote:
MottZilla wrote:
It's sad that a few bad efforts tend to stain a system's record in that way.
Which is exactly why every Nintendo console has had a lockout chip. Shovelware on the Atari 2600 stained the record of video games in general.
I think in terms of shovelware, both SNES and Atari VCS can be easily compared. On both platforms, there are lots of great but also lots of bad games. The lockout chip was introduced so that Nintendo would have the absolute monopoly on cartridge production. Everyone had to buy all cartridge stock in advance from Nintendo, with pricing and thus profit margin set by Nintendo. The CIC chip was the key for Nintendo's iron, draconian grip on the market.
 
MottZilla wrote:
It's sad that a few bad efforts tend to stain a system's record in that way. There were Gradius 3's epic slowdowns (which didn't bother me that much; as I said, they can be helpful), and I also remember complaints about Final Fight having few enemies on screen at once, no two-player mode, poor music quality, etc., which some people took to mean that something was wrong with the SNES rather than that it was the developer's fault.
I think it made even minor slowdown in SNES games more apparent. People were looking for it or expecting it. Granted, a lot of the first-year releases slowed down quite often from what I remember. A large majority of them. Yet nobody seemed to care when Genesis or TG16/PCE games slowed down. It was acceptable. But there were certain expectations about the SNES, being the last console out on the market in the 16-bit generation (true/real systems, not counting the CD-i and other junk). And the slowdown did help out in some games (I was able to beat Super EDF because one of the weapons caused the game to slow down a lot).
 
Some bullet hell games intentionally slow down the frame rate when more than a certain number of objects are present in order to give the player a fighting chance. Even many PC bullet hell games do this, and this number doesn't vary from PC to PC so as not to change the game's difficulty on different hardware. But then this number is often far greater than the number of sprites that the Super NES PPU can show in the first place.
See screenshot 
6502freak wrote:
I think in terms of shovelware, both SNES and Atari VCS can be easily compared. On both platforms, there are lots of great but also lots of bad games. The lockout chip was introduced so that Nintendo would have the absolute monopoly on cartridge production. Everyone had to buy all cartridge stock in advance from Nintendo, with pricing and thus profit margin set by Nintendo. The CIC chip was the key for Nintendo's iron, draconian grip on the market.
I don't see a problem with having an iron grip on the control of software released for your own platform like they did. I'm not sure which SNES games you can think of as shovelware. I'm sure there is a good bit, but I don't think it's on the level of the VCS or other platforms. SNES has some really great games, and while games like Final Fight and Gradius 3 suffered from various technical issues, I don't believe that it ruined the experience completely. I certainly enjoy Gradius 3 despite the slowdown. And I played Final Fight as a kid and honestly I didn't notice the issues that I would notice today. Of course, I had never played the arcade original when I first played the SNES version.
 
I remember having a chat with Neill Corlett about Gradius 3's slowdown, and based on his disassembly analysis, it appears to be intentional rather than an effect of slow/normal vs. FastROM or CPU speed.
 
koitsu wrote:
I remember having a chat with Neill Corlett about Gradius 3's slowdown, and based on his disassembly analysis, it appears to be intentional rather than an effect of slow/normal vs. FastROM or CPU speed.
Something I've always wanted to say but couldn't find a more perfect time to say it.
I always wondered if programmers ever intentionally programmed slowdown because they thought they had to.
I blame the slowdown issue for the long delayed SNES homebrew development scene.  Most people simply avoid development for the Super Nintendo with the assumption that everything they do will cause slowdown.
 
psycopathicteen wrote:
I blame the slowdown issue for the long delayed SNES homebrew development scene.
Since 2002 I've been blaming the GBA, a similarly powerful platform with a far lower barrier to entry.
 
koitsu wrote:
I remember having a chat with Neill Corlett about Gradius 3's slowdown, and based on his disassembly analysis, it appears to be intentional rather than an effect of slow/normal vs. FastROM or CPU speed.
 I've heard a few people say this too, but what specific area of code points to this? Is his disassembly made public?
 
tomaitheous wrote:
koitsu wrote:
I remember having a chat with Neill Corlett about Gradius 3's slowdown, and based on his disassembly analysis, it appears to be intentional rather than an effect of slow/normal vs. FastROM or CPU speed.
 I've heard a few people say this too, but what specific area of code points to this? Is his disassembly made public?
You'd have to ask him.  The conversation was held maybe 8 years ago.
 
tepples wrote:
psycopathicteen wrote:
I blame the slowdown issue for the long delayed SNES homebrew development scene.
Since 2002 I've been blaming the GBA, a similarly powerful platform with a far lower barrier to entry.
Agreed, tepples. I don't think the slowdown has anything to do with it. While the SNES CPU speed does discourage C programming, which might be related to psycopathicteen's point, you can just as well use ASM. On GBA you don't have this issue, but you still have a somewhat similar hardware setup.
Another barrier is the sound system. On GBA I imagine it's easy to get audio going, whereas on SNES you have limited options, very limited.
 
Quote:
I blame the slowdown issue for the long delayed SNES homebrew development scene.
You mean "long gone"? Several demos and cracktros were released on the SNES in the 1990s.
 
MottZilla wrote:
Agreed, tepples. I don't think the slowdown has anything to do with it. While the SNES CPU speed does discourage C programming, which might be related to psycopathicteen's point, you can just as well use ASM.
Let me state the big problem with ASM. There are two parts of a game: the back-end and the front-end. The back-end or "model" implements the rules of the game, such as how high a fat plumber can jump, and the front-end or "view" displays the result to the player. If one language compiles on two platforms, then ports of a game to those platforms can share a back-end, and only the front-end must change between platforms. You can see this in the source code for Lockjaw Tetromino Game, which has both an Allegro front-end and a GBA front-end (which shares a lot of its code with a DS front-end) calling the same back-end. But sharing the back-end across platforms only works among platforms that support a given language. The Super NES alone has two incompatible assembly languages for the two CPUs, one for decoding music sequence data into a series of DSP commands and one for everything else, as you mention next:
Quote:
Another barrier is the sound system. On GBA I imagine it's easy to get audio going, whereas on SNES you have limited options, very limited.
On GBA you have the vaguely NES-reminiscent Game Boy APU to get started. You also have a pair of 8-bit PCM audio buffers, and as long as you know the basics of digital signal processing, you can mix those in C for reasonable performance if you're not rich enough to license someone's ASM mixer.
 
mic_ wrote:
Quote:
I blame the slowdown issue for the long delayed SNES homebrew development scene.
You mean "long gone"? Several demos and cracktros were released on the SNES in the 1990s.
I didn't know there was ever an SNES development scene in the first place.
Anyway, I have a pretty decent SNES game engine whose source code I've been dying to post.  I recommend playing around with it if you want a head start at SNES development without starting from scratch.
 
psycopathicteen wrote:
I didn't know there was ever an SNES development scene in the first place.
There was -- and it was quite a closely-knit and friendly community -- but it is long over for a lot of different reasons.  Consider that I wrote the SNESTECH documents *before* the NESTECH stuff.  :-)
 
Quote:
There was -- and it was quite a closely-knit and friendly community -- but it is long over for a lot of different reasons. Consider that I wrote the SNESTECH documents *before* the NESTECH stuff.
 
 Man, people really aren't as smart as they were a long time ago.
 
Hi! Is the no-slowdown patch no longer available?
 
Heck of a threadbump. Anyway, if not running on real hardware is okay, an emulator can do this without ROM patches, by simply always returning 6 clock timings for ROM accesses both in the $80-ff *and* $00-7f regions.
If nothing else, it's a quick way to assess whether patching a ROM for real hardware use is worth it.
 
How do I do that with the emulator? I'm not a developer, so I don't know.
 
I like that thread.  I'm going to make an account on romhacking.net now.
 
whicker wrote:
Nice to see someone making progress on the idea of improving Gradius III.
 
I applied the 2 patches from ROMhacking.net and then started writing my own optimization patch on top of it.
Then I played Gradius III, got to level 2, kept getting hit by bubbles.  I said to myself, "that's funny, I don't remember the bubbles moving this fast.  Oh wait, they weren't."
 
I know Super Mario World has an SA-1 patch that improves frame rate and performance; FastROM isn't the only solution.
 
Optimization on stock hardware is more impressive.
 
I can think of cases where an SA1 patch would slow down the whole emulator, particularly if you're using a modern emulator (that is, not Snes9x or ZSNES) on hardware that is optimized for power consumption at the expense of raw speed (such as Atom/Pentium N/Pentium Silver or ARM).
 
psycopathicteen wrote:
Optimization on stock hardware is more impressive.
I agree, however given that SD2SNES supports SA-1 now it wouldn't be a bad alternative way to try to eliminate or reduce slowdowns in performance. Of course seeing what is possible with only enabling FastROM and optimizing some code has a special quality since in theory given more development time and slightly faster MaskROMs it could have been that way originally.
Makes me wonder when the first game using FastROM was released.
 
MottZilla wrote:
Makes me wonder when the first game using FastROM was released.
SuperFamicom.org appears to be a rough counterpart to BootGod's NesCartDB. It lists (among other properties) release date, ROM speed, and mapper ($20/LoROM or $21/HiROM) for each game. Unfortunately, its search isn't as thorough as NesCartDB, nor could I immediately find a download.
 
hibber22 wrote:
I know Super Mario World has an SA-1 patch that improves frame rate and performance; FastROM isn't the only solution.
The spectacular thing is that the game gains a huge improvement in speed and in the amount of stuff on screen with only a patch, without using any SA-1-specific features.
 
Oh boy, I completely wrecked up the source code for this stupid Gradius III optimization patch, trying to cram in code in whatever blank space the original game had.  It looks like I would need to start over, this time with a larger ROM size.  I guess this should be a rule of thumb to never start an optimization patch without making sure I have a lot of space to work with.
 
A Thunder Spirits patch that removes all the slowdown would be good too.
 
I got my patch working again.  I expanded the ROM size and moved several routines into a new bank.
 
Well Gradius III did end up getting the SA-1 treatment.
https://github.com/VitorVilela7/SA1-Roo ... radius-III 
The guy who started the project found out that the bubbles in stage 2 use a grid for collision instead of using circular collision.
 
psycopathicteen wrote:
The guy who started the project found out that the bubbles in stage 2 use a grid for collision instead of using circular collision.
Do you mean that the collisions used several boxes for each bubble?
 
Incredible, thanks for the link.
I was sure this game was coded with the ass, now it's confirmed.
 
If Konami made a Sega Genesis port of the game, and they used their grid based algorithm the conversation would be like:
Boss: "fix this slowdown immediately"
Programmer: "you didn't tell me to fix it on the SNES version"
Boss: "but this is the Sega Genesis version, people have bigger expectations for the Sega Genesis"
 
TOUKO wrote:
Incredible, thanks for the link.
I was sure this game was coded with the ass, now it's confirmed. :P
Gradius 3 was a first-generation SNES title, RTM'd December 1990 (JP).  The Super Famicom was RTM'd November 1990 (JP).  We can safely assume third-party developers like Konami were given 1 year to work on/make titles, even despite their established relationship with Nintendo by that point.  I would be very surprised if it was more than that.
Thus in conclusion: let me know when you work for a game company that has strict deadlines, is responsible for a console launch title, with only limited documentation provided by the console manufacturer.  "Coded with ass".
 
And back then there wasn't quite as much chance of an optimized "Game of the Year Edition" rerelease featuring more efficient code, more detailed eye candy, and some levels that had been originally cut for space.
 
koitsu wrote:
TOUKO wrote:
Incredible, thanks for the link.
I was sure this game was coded with the ass, now it's confirmed.
Gradius 3 was a first-generation SNES title, RTM'd December 1990 (JP).  The Super Famicom was RTM'd November 1990 (JP).  We can safely assume third-party developers like Konami were given 1 year to work on/make titles, even despite their established relationship with Nintendo by that point.  I would be very surprised if it was more than that.
Thus in conclusion: let me know when you work for a game company that has strict deadlines, is responsible for a console launch title, with only limited documentation provided by the console manufacturer.  "Coded with ass".
I never knew grid based sprite collision was an actual thing.  I always assumed every game just used rectangles, and in some cases circles.  It's just not something I would think about when I think about object collision.
 
My very first collision code was based on a grid, in a PC game I made a long time ago. I had no clue how one was supposed to do things, nor did I have access to the internet or books on the matter. I was 15 at the time, I think.
And as far as the MD comment goes, that probably wouldn't hold any water. Konami had a rough past with Sega and if anything they'd intentionally make crap stuff there and get away with it lol.
 
I imagine it's more likely that Konami had more experience with the 68000 processor on some of its arcade system boards. The 65816 may have been as foreign to Konami programmers as the Intel 8080 (and Game Boy's Sharp SM83) is to long-time NES and Super NES homebrew programmers.
 
Grid based collision is slower than circular collision on ANY cpu.  They could've used grid based collision in the arcade version of gradius III, because that game even has slowdown during the bubble level, but it looks like Konami took their time with Contra Hard Corps optimizing the hell out of it.
 
Quote:
Thus in conclusion: let me know when you work for a game company that has strict deadlines, is responsible for a console launch title, with only limited documentation provided by the console manufacturer. "Coded with ass".
Of course, I agree, but if you put an inexperienced coder on a processor he does not know, I would not expect good coding practices from him, even less so with the deadlines imposed by companies.
And Gradius 3's code seems to show this.
Quote:
They could've used grid based collision in the arcade version of gradius III, because that game even has slowdown during the bubble level
I agree, good remark. On arcade systems that are overpowered, you can code almost as badly as you want and your game will "always" be smooth.
 
By "circular collision" you mean a bounding box test followed by comparing a²+b² to a threshold, right? What is this "grid based collision" and how is it slower than circular collision? Could, say, Super Mario World have gained a speedup from circular collision?
 
Grid based as in, instead of the bubbles having a single bounding box, they have multiple 8x8 bounding boxes.
 
Pic from thread linked earlier:
Attachment: 8776ekC.png
Yeah, I'm pretty sure circular would be faster than that.
To be fair, a lot of them are smaller, but still.  Also, you'd lose a lot of precision for the smallest ones (which may explain why I thought collision with those darn things was so janky)...
 
And that looks substantially more difficult to implement; it was probably the first idea that came to the programmer's mind, and they just had to go with it.
TOUKO wrote:
Quote:
They could've used grid based collision in the arcade version of gradius III, because that game even has slowdown during the bubble level
I agree, good remark. On arcade systems that are overpowered, you can code almost as badly as you want and your game will "always" be smooth.
Given that the original arcade version of Gradius III uses two 10 MHz 68000s, it's very possible it's programmed no better than the SNES port.
 
Drew Sebastino wrote:
And that looks substantially more difficult to implement; it was probably the first idea that came to the programmer's mind, and they just had to go with it.
TOUKO wrote:
Quote:
They could've used grid based collision in the arcade version of gradius III, because that game even has slowdown during the bubble level
I agree, good remark. On arcade systems that are overpowered, you can code almost as badly as you want and your game will "always" be smooth.
Given that the original arcade version of Gradius III uses two 10 MHz 68000s, it's very possible it's programmed no better than the SNES port.
Really?  Let me guess.  One 68000 to run the game, the other one as a "sound CPU".
 
Nope, it's got a Z80 for that. Pretty damn overkill, especially when you consider Sega used the same CPU configuration for After Burner...
 
I looked around and I can't find any information about it.
 
I found that information here: 
https://www.arcade-history.com/?n=gradi ... ail&id=999
Also, I haven't played the game in MAME before, but MAME also lists which chips are used in each game.
 
psycopathicteen wrote:
Grid based collision is slower than circular collision on ANY cpu.
No, it isn't.
psycopathicteen wrote:
Grid based as in, instead of the bubbles having a single bounding box, they have multiple 8x8 bounding boxes.
No, they don't.
The whole point of a grid is to avoid testing bounding boxes, but more importantly to avoid making more tests than you need to.
You test for collision with an 8x8 grid by dividing the coordinate by 8 (i.e. right shift by 3), and looking up a value stored by that index.
You test for collision with a bounding box by making 4 comparisons.
You test for collision with a circle by subtracting two coordinates, squaring two results, then adding the two squares, then comparing that against a squared radius.
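The three point tests just described can be sketched as follows. This is only an illustration of the per-test work, not code from Gradius III. The function names and the row-major `grid` layout are assumptions.

```python
def hits_grid(px: int, py: int, grid) -> bool:
    """8x8 grid test: shift each coordinate right by 3, then one lookup."""
    return grid[py >> 3][px >> 3] != 0


def hits_box(px: int, py: int, left: int, top: int, right: int, bottom: int) -> bool:
    """Bounding box test: four comparisons."""
    return left <= px and px <= right and top <= py and py <= bottom


def hits_circle(px: int, py: int, cx: int, cy: int, radius: int) -> bool:
    """Circle test: two subtractions, two squarings, one addition,
    then a comparison against the squared radius."""
    dx = px - cx
    dy = py - cy
    return dx * dx + dy * dy <= radius * radius
```

The two multiplications in the circle test are worth noting on a CPU without a multiply instruction, where each one goes through a hardware register or a lookup table.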
There is overhead in filling up the grid when objects move across it. I presume this is the big bottleneck with the bubbles, here. However, once you have built the grid, the actual collision is fast. The trade-off is how many collision tests you have to do vs. how much time you spend rebuilding the grid.
In this case we have a lot of bullets and a player to test for collision against all bubbles. The grid makes individual collision test very fast. A bullet doesn't have to compare against 20 circles it can just look up 1 grid value. Collide 10 bullets against 20 circles and you need to do 200 circle tests, but it would still be only 10 grid lookups. 
That's the point of a grid.
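The 200 vs. 10 figure can be checked by simply counting operations. This is a rough sketch with made-up positions; the brute-force loop deliberately has no early-out, so it shows the worst case, and the grid is assumed to have been filled in already (it is left empty here).

```python
def brute_force_tests(bullets, circles):
    """Point-vs-circle for every pair; returns (hit flags, tests performed)."""
    tests = 0
    hits = []
    for bx, by in bullets:
        hit = False
        for cx, cy, r in circles:
            tests += 1
            dx, dy = bx - cx, by - cy
            if dx * dx + dy * dy <= r * r:
                hit = True
        hits.append(hit)
    return hits, tests

def grid_tests(bullets, grid):
    """One 8x8 cell lookup per bullet; returns (hit flags, lookups performed)."""
    lookups = 0
    hits = []
    for bx, by in bullets:
        lookups += 1
        hits.append(grid[by >> 3][bx >> 3] != 0)
    return hits, lookups

# Hypothetical scene: 10 bullets, 20 circles, 32x28 grid of 8x8 cells.
bullets = [(8 * i, 16) for i in range(10)]
circles = [(100 + 4 * i, 150, 6) for i in range(20)]
grid = [[0] * 32 for _ in range(28)]

_, circle_count = brute_force_tests(bullets, circles)
_, lookup_count = grid_tests(bullets, grid)
print(circle_count, lookup_count)
```

The lookup count only grows with the number of bullets, while the circle count grows with bullets times circles. The cost of filling the grid is not counted here.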
Building the grid is a trade-off for fast collision tests. I don't know what the update pattern for this particular grid is like, but probably the motion of bubbles is slow and each might have to update its grid only every few frames. Potentially this could be a very good solution to the problem, though evidently however they did it wasn't good enough to avoid slowdown.
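One way the "only every few frames" idea could look is a round-robin schedule, where each bubble redraws its grid cells on its own frame slot. This is purely a hypothetical sketch; the period of 4 and the function name are assumptions, and nothing here is known about what the game actually does.

```python
UPDATE_PERIOD = 4  # hypothetical: each bubble refreshes its cells every 4th frame

def bubbles_to_update(frame, num_bubbles, period=UPDATE_PERIOD):
    """Indices of the bubbles whose turn it is to redraw into the grid this frame."""
    return [i for i in range(num_bubbles) if i % period == frame % period]

# With 20 bubbles, each frame touches 5 of them instead of all 20.
for frame in range(4):
    print(frame, bubbles_to_update(frame, 20))
```

This spreads the grid rebuild cost evenly, at the price of the grid being up to a few frames stale for slow-moving objects.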
I'm not going to speculate too much about what they should have done, but a grid like this is neither unusual, nor a bad technique. It's a very well known and practical kind of collision solution. Whether this particular implementation is optimal, probably not, but the grid itself is a sensible approach.
Anything could be done multiple different ways, but your suggestion to just use circle tests instead seems incredibly naive to me. I'm not going to argue too much about a hypothetical, though, I'm just here to explain the purpose of the grid. My baseline assumption is that it's still better than doing a zillion circle tests, unless there's some additional acceleration/culling structure being used here.
 
So doing point / circle or rectangle / circle tests on SNES is not a good idea.
For my part I don't use a grid: it needs a lot of binary shifts, and doing 100 * 4 binary shifts is clearly not advisable.
So in my case:
1) I always display my bullets and always move them (I do not distinguish between bullets on the screen and off the screen), which avoids a lot of tests.
2) Same for the collision tests, I do them all (except if the Y position is $E0, which means that my bullet is off the screen). I do about 182 tests per frame (and over two frames I do all my tests, so a total of 364 tests).
So I have 24 bullets for the ship, 12 enemies and 76 enemy bullets.
3) What also takes a lot of CPU % is checking which bullets are available: I have to test my 100 bullets and put the free ones in a "pile" (so that enemies can grab available bullets quickly).
4) Then you just have to optimize the tests; I think I'm at 40-50 cycles per test.
But the most expensive part is displaying the 100 bullets (it takes 24% of the CPU: display + movement + screen limits).
I'll post some source code if I have a little time.
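Until real source is posted, the "pile" of available bullets in point 3 could be sketched roughly like this. It is an illustration in Python, not the poster's code; the class and method names are hypothetical, and only the 100-slot size and the once-per-frame scan come from the description above.

```python
MAX_BULLETS = 100

class BulletPool:
    def __init__(self, size=MAX_BULLETS):
        self.active = [False] * size  # one flag per bullet slot
        self.free = []                # the "pile" of free slot indices

    def rebuild_free_pile(self):
        """Scan every slot once (the per-frame cost described in point 3)."""
        self.free = [i for i, used in enumerate(self.active) if not used]

    def spawn(self):
        """Take a free slot off the pile, or return None if all are in use."""
        if not self.free:
            return None
        slot = self.free.pop()
        self.active[slot] = True
        return slot

    def release(self, slot):
        """Mark a slot as unused; it rejoins the pile on the next rebuild."""
        self.active[slot] = False

pool = BulletPool()
pool.rebuild_free_pile()
first = pool.spawn()  # an enemy grabs a bullet without searching all 100 slots
```

Pushing a slot back onto the pile at the moment a bullet dies would remove the full scan, at the cost of a little more bookkeeping per bullet.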
 
rainwarrior wrote:
... I don't know what the update pattern for this particular grid is like, but probably the motion of bubbles is slow and ...
That's correct.
 
I've played the game. That was a bit of a run-on sentence, but the "probably" was meant to apply to the latter part, i.e. you probably don't have to update every bubble's collision grid every frame. (Whether or not the game updated them all every frame, I don't know.)
 
A grid lookup is pretty fast, but it looks like they're doing it ~50 times per bubble. That's probably slower than doing one circle test, to say nothing of the grid update as they move on top of that.
All we can do is speculate, but I suspect the reason why Gradius uses the scheme it does is because the engine was only written to support grid-based collisions (because that works for 99% of cases) and when implementing the bubbles, they just used pre-existing functionality that worked instead of wasting time writing extra code.
Most people who play the slowdown-free version say it's unplayably difficult anyway, so it's a little hard to say they made a bad call.
 
adam_smasher wrote:
A grid lookup is pretty fast, but it looks like they're doing it ~50 times per bubble.
No, it's the opposite. The grid lookup is 1 time per bullet and that single lookup encompasses all colliding objects at once.
They add points to the grid 50 times per big bubble, but that's not a lookup. That's a sparse-drawing problem. Drawing 50 pixels on a grid is way lighter than doing 50 collision tests. This is a completely different kind of operation. All the pixels make it a heavy operation in aggregate, but the savings comes from the lookup side where you only have to test the 1 grid cell a single bullet is inside (simultaneously testing against all bubbles at once).
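The difference between writing a bubble into the grid and looking a bullet up can be sketched as follows. Everything here is an assumption for illustration: the 32x28 grid, the rule "mark a cell if its center is inside the circle", and the radius of 32 pixels, which was picked because it marks roughly 50 cells.

```python
CELL = 8
GRID_W, GRID_H = 32, 28  # 256x224 screen in 8x8 cells

def stamp_bubble(grid, cx, cy, radius, bubble_id):
    """Write bubble_id into every cell whose center lies inside the circle.
    Returns the number of grid writes performed."""
    writes = 0
    first_row = max(0, (cy - radius) // CELL)
    last_row = min(GRID_H - 1, (cy + radius) // CELL)
    first_col = max(0, (cx - radius) // CELL)
    last_col = min(GRID_W - 1, (cx + radius) // CELL)
    for row in range(first_row, last_row + 1):
        for col in range(first_col, last_col + 1):
            dx = col * CELL + CELL // 2 - cx
            dy = row * CELL + CELL // 2 - cy
            if dx * dx + dy * dy <= radius * radius:
                grid[row][col] = bubble_id
                writes += 1
    return writes

def bullet_lookup(grid, x, y):
    """One read per bullet: 0 if the cell is empty, else the id stored there."""
    return grid[y // CELL][x // CELL]

grid = [[0] * GRID_W for _ in range(GRID_H)]
writes = stamp_bubble(grid, 128, 112, 32, bubble_id=1)
print(writes, bullet_lookup(grid, 128, 112), bullet_lookup(grid, 0, 0))
```

The writes happen once per bubble, no matter how many bullets are on screen; each bullet afterwards costs a single read. A real implementation would more likely copy a precomputed cell pattern than run a circle test per cell.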
The real question is whether that 50 writes to the grid is better than e.g. the 20 circle tests against every bullet you're proposing to do instead... and I'd say probably yes, but the real answer depends on very specific numbers of objects/pixels/bullets/etc. (and other implementation details) so neither of these is a winner in every situation. Also, that's still assuming the circles are drawn every frame, which they might be, but there's another opportunity there to distribute the load that really makes the grid appealing.
adam_smasher wrote:
...they just used pre-existing functionality that worked instead of wasting time writing extra code.
A grid system like this is a very natural solution to doing many collisions at once. There are other kinds of acceleration structures for this problem (e.g. BSP trees) but a grid like this is a really common and very effective solution for 2D collision. It's not some "compromise" due to coders who don't want to write a circle-to-point collision routine, it's a reasonable solution for a hard problem. The circle collision you're proposing probably isn't a viable solution to the problem. Without an acceleration structure, you're bogged down in too many collision tests.
And yeah, there's certainly a more complicated and very specific-to-this-level additional solution they probably could have written, grid or no grid, but my point is that the grid is not a stupid way to do this by any stretch, and a circle test is not some no-brainer miracle they just didn't think of, it comes with its own hard problems.
 
The bubbles also collide with other bubbles.
 
psycopathicteen wrote:
The bubbles also collide with other bubbles.
Yes, that's just more collision tests that can really take advantage of a collision acceleration structure.
 
rainwarrior wrote:
You test for collision with an 8x8 grid by dividing the coordinate by 8 (i.e. right shift by 3), and looking up a value stored by that index.
Okay, that makes more sense.  I thought it seemed unnecessarily crazy, but I didn't really try to imagine what advantage it could possibly have.  I assumed everyone was making fun of it because it simply used multiple bounding boxes for each bubble, in which case it would indeed have been horrendously inefficient.
It's usually a good rule of thumb that if a programmer has done something definite and relatively complicated that required thought and planning (as opposed to, say, a jump instruction targeting the address immediately following itself, which reportedly does also occur in Gradius III and is probably a methodological artifact of some sort), they probably had some reason to think it was a good idea, and ridiculing them without thinking it through is unwise.
Chesterton's Fence, basically.
...
Now, if it turns out it does use bounding box tests on all of those "occupied" grid cells...  well, then we can probably make fun of the devs.
 
I need to reread the code again.  There must be some reason the bubble stage lags more than the Squidward's house Easter Island head level.