I don't have the knowledge to undertake such a project, but I've always been interested in doing so. If this has already been discussed in another thread, I apologize in advance.
What I would like to know is: let's say someone tried to make a game console with parts that were common in older generations of consoles. By older parts I don't mean salvaging from old hardware, but using the same kinds of parts from 20~30 years ago, since I like those old CPUs and the video/sound hardware of that era.
What would be the challenges of such a project? For example, are the parts still available? Now that most people only use LCDs, would it be possible to adapt the signal for new television sets? For video, do the chips still exist, or do you have to make them with some programmable chip (I forget the name of it, but it was used to make some NES hardware implementations)?
This is all just "in theory", but these things fascinate me, and I want to learn more about what is possible and the limitations of what can be done today.
If you do find new-old-stock or pulled
TMS9918,
SN76489, and Z80 chips, you can pair them with off-the-shelf 7400 parts for address decoding and off-the-shelf 62256 SRAMs to make your own ColecoVision or SG-1000.
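To make the address-decoding part concrete, here's a toy software model of what a single 74LS138-style decoder does on a Z80 bus: the top three address bits pick one of eight 8 KiB chip selects. The memory map below is invented purely for illustration; it is not the real ColecoVision layout.

```python
# Rough software model of 74LS138-style address decoding on a Z80 bus:
# A15..A13 select one of eight 8 KiB regions. The map below is a
# made-up example, NOT the actual ColecoVision memory map.
def decode(addr: int) -> str:
    region = (addr >> 13) & 0b111   # top 3 address bits -> chip select
    chip_selects = {
        0: "BIOS ROM",       # 0x0000-0x1FFF
        1: "expansion",      # 0x2000-0x3FFF
        2: "expansion",      # 0x4000-0x5FFF
        3: "work RAM",       # 0x6000-0x7FFF (a 62256 SRAM would sit here)
        4: "cartridge ROM",  # 0x8000-0xFFFF
        5: "cartridge ROM",
        6: "cartridge ROM",
        7: "cartridge ROM",
    }
    return chip_selects[region]

print(decode(0x0000))  # BIOS ROM
print(decode(0x7123))  # work RAM
print(decode(0xC000))  # cartridge ROM
```

In real hardware this is just one 3-to-8 decoder chip plus a few gates to qualify the selects with the Z80's /MREQ and /RD//WR strobes.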
Most LCDs should still be able to display analog composite or component video. As a last resort, you can modulate analog composite video to an RF signal because TVs still have to decode that (or they can only be legally called monitors, not TVs).
The programmable chip is called an FPGA. The low-end programmable chip is called a CPLD; you might be able to make your own sound chip out of one of those. Video is a lot more complex because although the Apple II video circuit is elegantly simple, gamers demand sprites at 60 fps, and a dumb frame buffer like the Apple II's has trouble delivering that.
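To see why a dumb frame buffer struggles at 60 fps, here's some back-of-envelope arithmetic. All figures below are rough assumptions for illustration, not measurements of any particular machine:

```python
# Back-of-envelope arithmetic (assumed figures) for why redrawing
# moving objects into a dumb frame buffer is hard on an 8-bit CPU.
width, height = 256, 192          # Apple II-ish 1 bpp resolution
fps = 60
cpu_hz = 1_000_000                # ~1 MHz 6502
cycles_per_byte = 8               # optimistic load+store+index cost

bytes_per_frame = width * height // 8        # 1 bpp bitmap size in bytes
cycles_needed = bytes_per_frame * cycles_per_byte * fps
print(bytes_per_frame)            # 6144 bytes per frame
print(cycles_needed / cpu_hz)     # ~2.95x the CPU's ENTIRE cycle budget
```

Even fully redrawing the screen once per frame needs about three times the CPU's whole budget under these assumptions, which is why tile-and-sprite hardware that composites the picture on the fly won out for games.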
So now your question has been reduced to "Are there reliable sources for new-old-stock retro video and audio chips?"
This is certainly possible, but most machines had custom video chips, so you'd most likely have to harvest them. That said, AFAIK it should be possible to make interesting combinations, like controlling an NES PPU with a Z80. FPGAs are the programmable chips you're talking about, and depending on their capacity you can even implement entire machines in them, meaning you can create entirely new designs inspired by old machines, with video, audio and CPU created by you from the ground up. Or you can clone existing parts, depending on what your goal is.
As for connecting to newer TVs, some old video chips output signals friendlier than composite video, such as the SMS VDP, which outputs RGB; you could possibly convert that to HDMI and get good results.
The NES PPU outputs composite video, which doesn't scale well to HDMI, but it also has a digital output that, with some trickery, can be (and has been!) used to generate pixel-perfect pictures for HDTVs.
Did the Playchoice 10 use Z80+PPU?
tepples wrote:
If you do find new-old-stock or pulled
TMS9918,
SN76489, and Z80 chips, you can pair them with off-the-shelf 7400 parts for address decoding and off-the-shelf 62256 SRAMs to make your own ColecoVision or SG-1000.
Or an MSX, but I guess that's a bit more complex than the two you mentioned.
The TMS9918 is a good example of an off-the-shelf video chip, but making good graphics on it can be a challenge, much more so than on the NES, IMO. It's definitely possible, but the lack of hardware scrolling and the poor sprite capabilities (1 color + transparency, maximum of 4 16x16-pixel sprites per scanline, no priority control against the background) greatly limit the possibilities of what you can do.
Dwedit wrote:
Did the Playchoice 10 use Z80+PPU?
No, it used a discrete logic tilemap display instead of an ordinary PPU, with a giant array of EPROMs for both the tile definitions and palette. (If you look at
http://nesdev.com/Playchoice.pdf , it's basically the entirety of sheet 2. The "nametable" is the RAM 8R (on sheet 1), the rasterizer is 6H through 6L plus 5M and 5Q, the tile definitions are in the 8Kbyte EPROMs 8K through 8P, the palette in the 256x4 EPROMs 6D through 6F.)
Anyway, one should still be able to build an all-new-parts ZX81, but it might count as a bit
too retro.
Similar to the ZX81 idea, this was an actual (throwaway) arcade board that uses a surprisingly low number of logic ICs to generate bitmapped graphics:
http://www.chrismcovell.com/dottorikun.html
You might want to look into the chips that are used in plug-and-play consoles. They're not 6502 or Z80 by any means; they're some other 16-bit architecture, but they're cheap, mass-produced, and still being used.
Who was using them besides Jungletac?
@Tepples
So I guess the recent television we bought after the switch to digital may be called a monitor instead. The coaxial input doesn't react to the old Famicom's RF, but the new Famicom with RCA jacks works well. As for pulled chips, new ones would be preferable when possible, but I gather from your comment that graphics chips of that style are no longer available.
@Tokumaru & Tepples
FPGA is the term I was looking for, but I was too tired to remember it. Thanks.
If I could build something, I would use either a 6502 or a Z80, since they seem simpler than "newer" chips like the Motorola 68000. I searched briefly and there seem to be some solutions for the 6502/Z80, but they talk about SoCs, so I'm deducing that the chip is not in its original package.
As for video, I guess you could invent your own tile-based system (I would want to avoid a frame buffer on a low-spec system, since I guess it would hinder scrolling), but that would increase the complexity, and I would prefer off-the-shelf parts when possible.
As for sound, something similar to the nes or sms would be fine.
Let's say you had all those chips: would it be possible to prototype such a system on a breadboard?
Now, if I reverse my stance and decide to develop a system with an FPGA, would I need one FPGA per subsystem? How much more complex would it be compared to off-the-shelf parts?
That probably depends on the details of your particular country's transition to digital broadcast. In the USA at least, TVs need to receive both analog and digital RF signals because low power stations were allowed to continue to transmit analog. In addition, "Digital Transport Adapters" (compact cable TV decoders without video on demand) used on Comcast have only an analog RF out.
As far as I can tell, you could put your ghetto ColecoVision parts on a breadboard.
You could fit a whole NES in an FPGA, including 6502, APU, PPU, and mappers. Kev has done it.
Banshaku wrote:
Now, if I reverse my stance and decide to develop a system with an FPGA, would I need one FPGA per subsystem?
I won't pretend I'm an FPGA expert, but I've seen people build entire systems on a single FPGA. I guess it depends on how complex the system is and how powerful the FPGA is. I'm pretty sure I've heard of people recreating even an SNES using a single FPGA.
This was the idea behind the infamous Retro VGS / Coleco Chameleon: a powerful FPGA would be configured to simulate a different system depending on the game being played.
Quote:
How much more complex would it be compared to off-the-shelf parts?
I'd say ridiculously more complex if you're trying to recreate existing chips accurately. But if you're creating your own custom stuff, without having to adhere to existing specs, I guess it wouldn't be particularly hard if you keep the design simple (more 8-bit than 16-bit), though it would still be more complex than using off-the-shelf parts. Creating your own CPU and video architecture sounds like fun, though.
@Tepples
In Japan we went fully digital in 2011/04, but the regions affected by the earthquake got a delay, to help people who hadn't yet bought a television that supports digital still receive information. It could be that my RF box is bad, but I'm sure it was working before. I'll try it someday on an older TV once I have access to one.
@Tokumaru
I'm sure it must be great to reproduce a complete architecture on an FPGA, but since I don't know Verilog or how to use an FPGA properly, it feels like it would be quite a steep learning curve. If I want to learn about architecture, programming an emulator on a PC would be quite interesting too. If I do start a project like this, I want to go off-the-shelf first for the learning experience before going deeper. Now I know that one FPGA can cover all the subsystems.
If you look on the forums at
http://6502.org/ you'll find quite a few examples of home-built computers and discussion about what parts to use. But if we consider what makes a console distinct from a computer, it's mostly going to be the focus on graphics and sound.
For graphics, pretty much the only options are going to be: use obsolete parts (maybe the Yamaha V9958 or V9990 if you want to go beyond the ColecoVision/MSX level), make your own in an FPGA (or LOTS of TTL logic; you can look at early-80s arcade schematics for examples), or settle for something relatively weak (the Propeller chip by Parallax can output video, for example, but it's definitely not a video processor; it's more manually controlled, like the 2600's TIA).
Pretty much the same deal with sound, make your own, or use obsolete parts. I kinda like the YM2151 myself, but with that you need a special Yamaha DAC, and those chips are power hogs and run really hot. Maybe there's a more modern (but still certainly obsolete) alternative. Pretty much the only standard sound chip you can buy I think is going to be a standard audio DAC with i2s interface, and you can buy pretty good MCUs that can interface with those (this is the route my Squeedo synth is ultimately taking). Or if you can run your 6502 or Z80 fast enough, you can just synthesize the audio with your main CPU and use a simpler kind of DAC. Or I suppose if you wanted to go modular, just include a UART capable of MIDI-OUT and connect it to your favorite MIDI device.
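As a toy illustration of the "synthesize the audio with your main CPU and a simple DAC" route mentioned above, here's a sketch in Python. The sample rate and output levels are arbitrary assumptions; a real 6502/Z80 loop would do the same thing cycle-counted against a timer, writing each value to an output port:

```python
# Toy model of CPU-synthesized audio: generate one second of a 440 Hz
# square wave as 8-bit unsigned samples, the kind of byte stream you
# would shovel at a simple parallel DAC. All constants are illustrative.
SAMPLE_RATE = 15_734      # assumption: roughly one sample per NTSC scanline
FREQ = 440                # A440 test tone

def square_wave(n_samples: int) -> list[int]:
    period = SAMPLE_RATE / FREQ
    samples = []
    for i in range(n_samples):
        # High for the first half of each period, low for the second.
        phase = (i % period) / period
        samples.append(0xC0 if phase < 0.5 else 0x40)
    return samples

buf = square_wave(SAMPLE_RATE)        # one second of audio
print(len(buf), min(buf), max(buf))   # sample count and the two DC levels
```

The catch, as the post implies, is that the CPU is fully occupied while doing this, which is exactly why dedicated PSG chips existed.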
Thing is, with modern parts there are pretty much only two ways it goes. You've got high-end stuff (which is still pretty cheap considering its capabilities; look at the CHIP for example,
https://getchip.com/), but it's going to be an SoC (system on a chip). The manual for one of these things (assuming you could actually get hold of it) would probably be some 5,000-page monstrosity. Then there is the lower-end, simpler stuff, like you'd see in toys and TV-game type things. Those aren't going to be usable chips, just silicon under an epoxy blob, and the supplier is not even going to talk to you unless you want to buy millions of them.
I'm sure there are options other than those; I'm no expert, just an enthusiast.
@Memblers
So it seems the consensus is the cpu will be fine but video/audio will be an issue.
I should first try to define my target before starting anything. As for my case, I'm not an enthusiast, just enthusiastic about this subject
I really want to learn about it someday.
These days the programming I do at work is so dull (IoT-related: MQTT messaging/aggregation of streaming, basically testing a pipeline, making VMs for that, etc.) that I need to find something to bring back the motivation. Either that or go back to working on some NES homebrew like I did 8 years ago, but do something original instead of trying to reproduce something that already exists.
Thank you everyone for your comments, I really appreciate it. If there are other things I should be careful about, I'm all ears.
I'm of the opinion that programming software is more rewarding than creating hardware that needs new software, especially if your time to work on such projects is limited. Creating new hardware will be a lot of work by itself, but you'll hardly see any results until you create the software to run on it, so you might as well code a game for an existing machine and get results faster.
It really depends on your goals, I guess. Creating a new computer/console does sound like fun, I too would like to try that some time, but I imagine I'd be somewhat frustrated for not having any cool games to showcase my newly created machine.
Memblers wrote:
with that you need a special Yamaha DAC, and those chips are power hogs and run really hot.
Because everything uses digital signals now anyway, is it possible to skip the DAC? You'd have to convert the signal anyway to work with any modern format (lump it together with graphics and use HDMI), but I imagine you'd need an FPGA for that. At that point, you might as well use the FPGA to generate the sound in the first place.
Has anyone ever thought of "Frankensteining" parts from various old computers/game consoles together to form something? It couldn't be produced large scale, but I doubt what you're wanting to do would be either. You'd also be destroying whatever you take the chip from (so it's not like you're going to target a Neo Geo or Sharp X68000), but if it's not working in other ways and it can't be fixed without replacing everything else, then go right ahead. I've always wanted to see if you could create your own SuperGrafx-type system out of the SNES or something.
Ever since that discussion about the possibility of upgrading the SNES's PPU to 128 KiB from its default 64 KiB of RAM, I've been playing around with the idea of a more arcade-shaped SNES with bankswitching. Maybe combine VRC2-style bankswitching with 4 MiB of RAM ('cuz 64 Ki address space ÷ 8 regions = 8 Ki per region; 8 Ki regions × 256 banks = 2 Mi of addressable space).
It'd be tricky with Mode 7, though.
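The bank arithmetic above can be sketched as a simple address translation. This is a hypothetical model of VRC2-style banking for illustration, not any real mapper's register layout; note that 8-bit bank registers reach 2 MiB, so a full 4 MiB would need a ninth bank bit.

```python
# Hypothetical VRC2-style bank translation: a 64 KiB address space is
# split into 8 regions of 8 KiB, and each region has an 8-bit bank
# register selecting one of 256 physical banks (256 x 8 KiB = 2 MiB).
REGION_SIZE = 8 * 1024
bank_regs = [0] * 8                        # one bank register per region

def translate(cpu_addr: int) -> int:
    region = cpu_addr // REGION_SIZE       # which of the 8 regions
    offset = cpu_addr % REGION_SIZE        # offset within the region
    return bank_regs[region] * REGION_SIZE + offset

bank_regs[2] = 0x41                        # map physical bank 0x41 into region 2
print(hex(translate(0x4003)))              # 0x41 * 0x2000 + 3 = 0x82003
```

The appeal of this scheme is that switching what a region points at costs a single register write instead of copying 8 KiB of data.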
One obvious problem with using old technology is the voltage of parts. Old stuff is 5V, newer stuff is much less.
If I had to implement the sound/video part with an FPGA, at first I would go with something simple so that I'd have something running. But what I would like to do is have multiple types of sound on it: for example, presets like NES, SMS, VRC6, Namco 163, Game Boy, Genesis FM, etc. that you could select via programming, plus custom presets mixing channels from different implementations. Same thing for video: different modes representing specific consoles. That way you could simulate the style you want based on a specific console's specs.
But... I guess it must be hell to develop all those things. Still, it would be great: basically a console that is not an NES/SMS/Genesis etc., but that lets you select presets to emulate some of the style/limitations of those.
Maybe for now I should just focus on restarting a homebrew project on one of those old consoles until I know enough about electronics. Still, I should try to plan that project someday; I would love to make it. Old-style programming, not an Android-something micro-console: something that lets you make games like the previous generations did, with no OS in the way, just the console as-is.
Dwedit wrote:
One obvious problem with using old technology is the voltage of parts. Old stuff is 5V, newer stuff is much less.
Well, I'm pretty sure you can use resistors to decrease the voltage. I have no clue about increasing it though. What number of volts do most electronics run at now? I had always thought it was 5V still. Is it as low as 1.5V? I'm probably way off.
You cannot use resistors for voltage translation. Or at least, you can only use them in one direction (high to low), and they draw extra power compared to proper translation. So don't.
Most hobbyist level electronics are 3.3V now. Commercially mass-produced things are often in the 1.8 to 2.5V range. Current CPUs often run somewhere around 1-1.3V.
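A quick calculation shows why a resistor divider is a one-way (and lossy) trick, which is the reasoning behind recommending dedicated translation ICs instead. The component values below are just illustrative:

```python
# Why a resistor divider only works one way: it can scale a 5 V output
# down to ~3.3 V logic, but it can never raise a 3.3 V output to 5 V,
# and it burns current through the divider whenever the line is high.
def divider_out(v_in: float, r_top: float, r_bottom: float) -> float:
    # Standard two-resistor voltage divider: Vout = Vin * Rb / (Rt + Rb)
    return v_in * r_bottom / (r_top + r_bottom)

# A 5 V signal into a 1k / 2k divider lands near 3.3 V:
print(divider_out(5.0, 1000, 2000))       # ~3.33 V
# The same divider can only ever LOWER a 3.3 V signal, never raise it:
print(divider_out(3.3, 1000, 2000))       # ~2.2 V, now below 3.3 V logic high
# Current wasted through the divider while the line is high:
print(5.0 / (1000 + 2000) * 1000)         # ~1.67 mA per signal line
```

Multiply that wasted current by a few dozen bus lines and the appeal of a proper level-translator IC becomes obvious.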
lidnariq wrote:
can only use it one direction (high to low)
Yeah, so they're only useful for lowering voltage.
lidnariq wrote:
it draws extra power over proper translation. So don't.
What are you supposed to do then?
Proper translation, like I said. There are ICs specifically designed for this purpose, for both up and down translation.
Banshaku wrote:
These days the programming I do at work is quite dull (IOT related, MQTT messaging/aggregation of streaming, basically the testing of a pipeline, making vm for that etc)
Banshaku:
Kind of strange that what you find boring I find quite fascinating. If only I could somehow come up with an MQTT client for the newer Siemens PLCs (S7-1200, 1500) I'd be a hero... for a bit, anyway. Yes, I know there's a GitHub repo of S7-300 code, but that's written in the godawful obsolete Statement List language (STL). Enough digression.
I guess my question is: what is your ability level? Have you bread-boarded circuits before? Do you know how to create an address decoder and wire memory off of address lines? A microprocessor (Z80, 68000, 6502, 80x86) has just the address and data lines (and its handshaking signals) exiting the chip, with the expectation of wiring up external memory and peripherals like serial, video, sound, and I/O registers (and interrupts and DMA, for the brave). IMHO the Commodore VIC-20 schematic is super awesome for describing what is needed: one single page shows the wiring for the entire system, with off-the-shelf parts (except for the video chip itself, the 6560). Maybe not so many tiny memory chips nowadays, of course.
Do you just want to tinker with software programming? Might I just suggest microcontrollers like the Parallax Propeller or Teensy 3.2 or ESP8266 then? A microcontroller has the CPU and peripherals and memory (RAM, sometimes FLASH) internally wired already inside the chip, leaving mostly just general purpose I/O pins to interface to the outside world.
tepples wrote:
So now your question has been reduced to "Are there reliable sources for new-old-stock retro video and audio chips?"
Actually, Yamaha and other manufacturers made many dedicated audio chips which were mass-produced, so this is pretty much a given.
On the other hand, I do not think there were such video chips; each video chip was specific to a particular game console or computer... unless I'm mistaken. I'm still unsure how arcade games had their graphics rendered, but I think it was mostly ASICs, alas.
Looking around, it seems the TMS9918 and its immediate siblings may have been the only "video game PPU on a single chip" that wasn't explicitly designed for one specific console. ... although there's some wiggle room in that definition given the wide variety of companies that manufactured MSX computers.
@whicker
What I had to do was research whether MQTT clients exist for a specific platform, build a test environment for an API developed by an offshore team with little to no documentation, figure out why it fails because it doesn't follow the specs or something wasn't implemented, then test the API to see if it meets the requirements. Not that pleasant compared to, say, developing code for IoT devices and seeing if they interact well with an already-built environment, or writing test clients for specific embedded devices, or something like that.
As for my skills, aside from soldering and having started to read a book about electronics, I do not possess the skills to make such a project, and I'm totally aware of it. When I'm too busy, I like to research future projects that I would like to work on, to learn how feasible they are. I find it fun and refreshing, and it helps me deal with stress by learning more about something I would love to do, even though it's not possible at the moment. I apologize if it feels like I'm wasting people's time, but I got a lot of information from this thread and I'm very grateful for it.
My lack of knowledge will be the biggest hurdle for now. I need to learn more about electronics but don't really know where to start. My programming background will hopefully help, but I don't know by how much.
Quote:
Looking around, it seems the TMS9918 and its immediate siblings may have been the only "video game PPU on a single chip" that wasn't explicitly designed for one specific console.
Hey, very cool. I had no idea such a chip existed. Its features aren't so exciting or anything, but the concept is cool. I wonder whether it's possible to have multiple copies of such chips and combine their signals somehow in order to get better graphics (more sprites and multiple backgrounds).
TMS chips actually have provisions for master/slave configurations in multi-chip environments. That carried over to the MSX, SMS and MD VDPs (all made by Yamaha), which can also sync to an external video source. All these chips also have a signal (!YS) that indicates when they're outputting the rearmost BG color.
There's also some new stuff from them that might be useful :
http://www.yamaha.co.jp/english/product ... ontroller/
And perhaps sound too:
http://www.yamaha.co.jp/english/product ... generator/
I wonder why/how it's possible/feasible to use a video signal from another chip as an input to a PPU. To me it sounds much simpler to use a digital palette index as input rather than an analog video signal, which should make synchronization particularly cumbersome (though if both chips are identical, as would be the case here, it wouldn't be as problematic, I guess).
So with 2 or 3 daisy-chained TMS9918s, I guess it's possible to get pretty decent graphics at minimal cost; the major problems are that fine scrolling is not implemented, and that sprite priority is hardcoded relative to the BG layers and is not flexible.
I'm pretty certain that combining multiple TMS9918s to produce more sophisticated graphics wasn't really on their radar when they designed it... but video overlay from a camera or recorded source had a clear use.
I'm thinking about possible graphics modes for a 16-bit system.
Mode 0:
- 3 BG layers
- BG1 and BG2 always use 16x16 tiles
- BG3 can use either 8x8 or 16x16 tiles
Mode 1:
- 1 rotation/scaling BG layer
Mode 2:
- 2 BG layers (BG1 and BG3)
- BG1 is a windowed rotation/scaling layer (either 160 pixels of 256, or 192 of 320)
- BG3 can use either 8x8 or 16x16 tiles
Mode 3:
- 3 BG layers
- BG1 is a windowed rotation/scaling layer (either 80 pixels of 256, or 96 of 320)
- BG2 uses 16x16 tiles
- BG3 can use either 8x8 or 16x16 tiles
Why would you deliberately inherit the weirdness of the SNES setup instead of designing something more sane and versatile to use?
tokumaru wrote:
and versatile to use?
More versatile? The reason the SNES is so insane is that it's the most versatile 16-bit system I've seen, in that the bandwidth can be chopped a million different ways; but out of all 8 modes, only about half are worth keeping. 64x64 sprites are also illogical, and there's some other stupid stuff that I think only about one game has ever used. Like, what practical application does direct color have?
I'd really like to use a time machine and tell Nintendo to use the PPU space more reasonably. They had enough room to implement the hi-res modes that I think fewer than 10 SNES games ever used, but they were too tight on space to implement a fifth byte for sprites? Manipulating 4 BG layers couldn't have been too easy either, and while it's more commonly used than hi-res mode (it would actually be particularly useful for modern "retro" games), it's still not that widely used. How about more palette entries?
Sorry about that Koitsu...
If you want a Game Boy Advance, you know where to find it.
- Like the 68000 in the Genesis, the ARM7TDMI in the GBA has about 16* 32-bit registers, some form of hardware multiply, and a 16-bit bus to most of its memory.
- Like the VDP in the Genesis, the GBA allows writing to VRAM during draw time as well as sprite cel memory bigger than 16K. It also allows rectangular sprites (8x16/16x8, 8x32/32x8, 16x32/32x16, 32x64/64x32), and sprite memory layout can be set to 1D (Genesis style) or 2D (Super NES style).
- Like the VDC and VCE in the TurboGrafx-16, the GBA has sixteen 15-color subpalettes for background and another sixteen for sprites.
* Approximation intended. Out-Aspergering me about weighing the ARM's special-purpose PC against the 68K's D0-D7/A0-A7 divide is beside the point.
Espozo wrote:
More versatile? The reason the SNES is so insane is that it's the most versatile 16 bit system I've seen
Versatility isn't necessarily about offering a ridiculously large amount of settings... it can also mean offering a solid set of essential features that can be efficiently exploited in many different ways. I've never coded for the SNES myself, but every time I see you guys talking about it, it sounds absolutely taxing how you have to deal with all the possible settings and how deeply they affect the most basic aspects of the software design, and hearing about it seriously discourages me from ever trying to code anything for it. It sounds more like a chore than like a fun challenge to me. Don't get me wrong, I love the SNES and acknowledge that it was one of the greatest consoles ever made, but I'd hardly take it as a good example of hardware engineering.
Depends on how literal you mean by retro. If you mean "can't have VRAM faster than 7 MHz" then it gets tricky.
tokumaru wrote:
it can also mean offering a solid set of essential features that can be efficiently exploited in many different ways.
That's definitely true. It's kind of like how the Neo Geo is less complicated in terms of screen modes and transparency and resolution and whatever, but the few things that are there are much, much more flexible, although brute force has a lot to do with this.
tokumaru wrote:
I've never coded for the SNES myself, but every time I see you guys talking about it, it sounds absolutely taxing how you have to deal with all the possible settings and how deeply they affect the most basic aspects of the software design, and hearing about it seriously discourages me from ever trying to code anything for it.
Yeah, it's not for everyone... I've wanted to make a game engine that makes it possible to really just forget about handling the video hardware (which is 90% of the problem with the SNES), but this has proven to be too slow for many applications. However, it's still faster than the code in 90% of commercially released SNES games. For very CPU-intensive games, you need to hardcode a lot more.
tokumaru wrote:
I'd hardly take it as a good example of hardware engineering.
I think the problem is that many features were implemented to compensate for how limited the hardware is. Hell, the SNES beats otherwise much more powerful 2D hardware in terms of special effects. I've heard stories that the SNES was originally going to be much more powerful, and then all of the special features would have made much, much more sense. It would then be like the GBA, which is honestly a much better-designed piece of hardware.
I can show what I designed for video anyways. A separate "video processor" executes a program called a "display list" during vblank and hblank (it won't execute during rendering). The display list starts executing from the beginning at vblank, and will continue from where it left off during hblank. The display list program must configure the playfield address at the beginning of this scanline, the fine X scroll, the address of the sprite table, the address of the patterns, the sprite Y scroll position, the palette, and some other mode bits (such as controlling interpretation of the playfield data; the four settings are "Tile", "Tile + Attribute A", "Tile + Attribute B", and "Pixel"). By doing this, you can have the tile height to be whatever you want, or you can display tiles upside-down, or other possibilities. You could also try to make a simplified version of this if you want to, I suppose.
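A minimal sketch of that display-list idea, with the opcodes invented here for illustration (the real design described above has more registers and mode bits than this toy models):

```python
# Toy interpreter for a display-list "video processor": the program runs
# from the start at vblank and resumes where it left off at each hblank,
# setting up per-scanline registers. Opcode set is invented for this sketch.
SET_PLAYFIELD, SET_FINE_X, SET_SPRITE_TABLE, WAIT_HBLANK, END = range(5)

def run_display_list(program, regs, scanlines):
    pc = 0
    for line in range(scanlines):
        # Execute ops until this scanline's setup is done, then render.
        while pc < len(program):
            op, arg = program[pc]
            pc += 1
            if op == SET_PLAYFIELD:
                regs["playfield_addr"] = arg
            elif op == SET_FINE_X:
                regs["fine_x"] = arg
            elif op == SET_SPRITE_TABLE:
                regs["sprite_table"] = arg
            elif op == WAIT_HBLANK:
                break              # resume from here at the next hblank
            elif op == END:
                return regs
    return regs

regs = {}
program = [
    (SET_PLAYFIELD, 0x2000), (SET_FINE_X, 3), (WAIT_HBLANK, None),
    (SET_PLAYFIELD, 0x2400),   # a split-screen effect: new map after line 0
    (END, None),
]
print(run_display_list(program, regs, scanlines=240))
```

Because the register writes happen between scanlines, effects like split scrolling or per-line tile height fall out naturally, which is the point of the design.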
One way to share video memory with main memory is by implementing clock-interleave. (This is how I planned to do it.)
Amiga has "copper" with a much simpler instruction set than what I have, but could execute during rendering instead of having to wait for hblank. Amiga also has various modes, such as "dual playfield", "Hold And Modify", "Extra Half Brite", and so on; I don't quite know everything about them, but I somewhat know its working.
Instead of doing something tricky like all these vintage 2D consoles, just give devs a framebuffer and a 2D hardware blitter, with alpha blending and common 2D transformations. Unlimited sprites/layers/colors, limited only by memory bandwidth. Just like early PC GPUs, such as the Rage or Matrox cards.
@tepples
GBA 1D mode is not Genesis-like. It's the opposite order - row vs column. SCNR.
zzo38 wrote:
I can show what I designed for video anyways. A separate "video processor" executes a program called a "display list" during vblank and hblank (it won't execute during rendering). The display list starts executing from the beginning at vblank, and will continue from where it left off during hblank. The display list program must configure the playfield address at the beginning of this scanline, the fine X scroll, the address of the sprite table, the address of the patterns, the sprite Y scroll position, the palette, and some other mode bits (such as controlling interpretation of the playfield data...)
Atari 400/800/5200 and Atari 7800 called, and they want their display lists back.
calima wrote:
2d hw blitter
That ended up used in pretty much everything since the Atari Lynx. Before then, it needed too much fast memory compared to the TMS9918-style tile planes and sprites paradigm.
calima wrote:
GBA 1D mode is not Genesis-like. It's the opposite order - row vs column.
It's still not 2D, which psycopathicteen has complained about several times on the Super NES because it increases setup time for DMA.
tokumaru wrote:
Espozo wrote:
More versatile? The reason the SNES is so insane is that it's the most versatile 16 bit system I've seen
Versatility isn't necessarily about offering a ridiculously large amount of settings... it can also mean offering a solid set of essential features that can be efficiently exploited in many different ways. I've never coded for the SNES myself, but every time I see you guys talking about it, it sounds absolutely taxing how you have to deal with all the possible settings and how deeply they affect the most basic aspects of the software design, and hearing about it seriously discourages me from ever trying to code anything for it. It sounds more like a chore than like a fun challenge to me. Don't get me wrong, I love the SNES and acknowledge that it was one of the greatest consoles ever made, but I'd hardly take it as a good example of hardware engineering.
Most games just used the two 4bpp layers and the third 2bpp layer, and didn't really do anything special with them.
Quote:
I'm thinking about possible graphics modes for a 16-bit system.
To be honest, if anyone were to create a new system, I'd strongly discourage the use of "graphics modes" altogether. Making a single useful and powerful graphics mode sounds much better than having multiple modes and having to choose between them later when you want to use half the features of each.
As for the SNES, the standard way to do things is to use Mode 1 normally, and switch to another mode whenever you want one of its distinct features.
If I were designing my own retro video chip, I'd implement just a lot of sprites, and have the user simulate their BG(s) using sprites. This sounds more efficient than having several BGs plus sprites in hardware.
You'd need to have Neo Geo style linking (each sprite has a list of one or more stacked 16x16s, and sprites can be set at an offset from the previous sprite in OAM) in order to make scrolling a background made of sprites as fast as updating a tilemap.
If Super NES devoted all its VRAM bandwidth to sprite slivers the way Neo Geo does, it'd be able to retrieve 170 slivers per line, which is equivalent to 85 16x16 sprites. That's almost Neo Geo level by itself, but then you lose color math and mode 7. Adding a single background to replace the fix layer would remove 32 slivers (if 4 color) or 48 slivers (if 16 color).
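Checking the arithmetic from that post (a "sliver" here being an 8-pixel-wide strip of sprite on one scanline, so a 16-pixel-wide sprite needs two per line):

```python
# Verifying the sliver budget quoted above (figures as given in the post).
slivers_per_line = 170
slivers_per_16px_sprite = 2            # 16 px wide = two 8-px slivers per line

sprites_per_line = slivers_per_line // slivers_per_16px_sprite
print(sprites_per_line)                # 85 sixteen-pixel-wide sprites per line

# Coverage of a 256-pixel-wide scanline by those sprites:
print(sprites_per_line * 16 / 256)     # 5.3125, i.e. ~5.3x screen coverage

# A 256-px background needs 32 eight-pixel slivers of pattern data per line,
# which matches the post's figure for a 4-color layer:
print(256 // 8)                        # 32
```

That 5.3x figure is where the "cover the screen a little over 5 times" claim in the follow-up post comes from.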
tepples wrote:
which is equivalent to 85 16x16 sprites.
Wait, what? How would we all of a sudden be able to cover the screen a little over 5 times with 4bpp graphics using sprites, if no BG mode can even cover the screen 4 times with 4bpp graphics? Are you thinking that sprite data would still all be on the PPU, which I assume is much faster to get data from?
Dwedit wrote:
Most games just used the two 4bpp layers and third 2bpp layer, and didn't really do anything special with them.
I never understood why Mode 3 (8bpp and 4bpp layers) wasn't used more often. I'm guessing a lot of it has to do with ROM size.
Espozo wrote:
tepples wrote:
which is equivalent to 85 16x16 sprites.
Wait, what? How would we all of the sudden be able to cover the screen a little over 5 times with 4bpp graphics using sprites, if no BG mode can even get to covering the screen 4 times with 4bpp graphics?
Background maps and tiles are both read from VRAM, and ordinarily, a program can't completely disable backgrounds to gain more sprite sliver reading time.
Espozo wrote:
Are you thinking that sprite data would still all be on the PPU, which I assume is much faster to get data from?
I was assuming placement of a larger OAM either inside the PPU or on a dedicated SRAM.
Espozo wrote:
I never understood the why Mode 3 (8bpp and 4bpp layer) wasn't more often used. I'm guessing a lot of it has to do with rom size.
That and VRAM size; you've got to leave room for 16K of sprites and 6K of maps. But the title screen of
Super Mario All-Stars uses an 8bpp layer. So does
Zoop because its playfield is an 18x14 grid of cells, each 12x14 pixels. The usual workaround for 12-pixel-wide cells, as seen in
Puyo Pop and
Luminesweeper, needs two backgrounds, and I imagine Panelcomp (the company that presumably ported
Zoop to the Super NES) wanted to use more than 2bpp for the "Opti-Challenge" background. So instead, it composites the game graphics onto a frame buffer in software.
Okay, I have yet another VDP VRAM setup idea (and yes Tepples, it is based on the GBA).
The VDP accesses four 8-bit RAM chips at 13.5 MHz. Two are used for BG layers, the other two for sprites. It could do 4 BG layers, or 2 rotate/scaling layers, or 2 BG layers + 1 rotate/scaling layer.
For sprites, it does its accesses like this. There are 858 cycles per line:
128 cycles spent looking through the Y-coordinates of sprites
192 cycles spent fetching 6 attribute words (X-coordinate, attributes and 4 rotate/scale parameters) for 32 sprites that appear on the scanline
512 cycles fetching the sprites, either 1 pixel at a time (affine sprites) or 4 pixels at a time (regular sprites)
26 open cycles
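The budget above can be checked quickly; the only figures used are the ones in the list:

```python
# Check that the per-scanline access budget above adds up to the quoted
# 858 cycles per line.
CYCLES_PER_LINE = 858
y_scan = 128          # scanning sprite Y coordinates
attr_fetch = 6 * 32   # 6 attribute words for each of 32 on-line sprites
pixel_fetch = 512     # sprite pixel fetches
open_cycles = CYCLES_PER_LINE - (y_scan + attr_fetch + pixel_fetch)
print(attr_fetch, open_cycles)  # 192 attribute cycles, 26 open cycles
```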
Because we're really just thinking of random ideas for a video chip instead of actually making one, can anyone think of a custom video chip that someone has made? It would almost certainly be FPGA, because I have no idea how you'd manufacture your own microchip.
Thinking about it though, it's a bit funny to make your own 2D hardware considering how much 2D hardware already exists, mainly from different arcade boards. I don't even want to know how long the list of different arcade hardware is. (Hardware with the game built into it is irrelevant for this.) I find it kind of surprising that no company was manufacturing general-purpose video hardware at this time, like the TMS9918, because many systems almost seem copy-pasted from one another. Just think of how many had 3-4 4bpp BG layers with 4bpp, 16x16 sprites and 4096 palette entries out of a 15- or 16-bit master palette. Few seemed to have color math, unless they were dedicated to one game. (For whatever reason, the example of an arcade machine with color math that comes to mind for me is Raiden II, but I'm not actually sure if this is dedicated hardware or not, because I'm pretty sure Raiden DX uses the same hardware. It's always possible the game is still attached to the board, though.)
It is really weird when arcade manufacturers would have several different BG and sprite chips that act completely differently from each other, yet were all custom made for the arcade game. If they wanted more BG layers, wouldn't it make more sense to use two of the same chip?
I'm guessing you're talking about arcade machines that are only designed to play one game? I'd have thought much of the hardware would have been recycled. If I'm not mistaken, Konami had it so that every game had its own board. I don't know why you'd do this (especially if you're making as many games as Konami was) because it seems really cost-ineffective. If I were to guess why they did this (while virtually every other arcade developer didn't), it would be that the games would be harder to pirate, but I'm not sure it would be worth it. There couldn't have been nearly as many pirated arcade games as home console games.
About what you said again, though, I'd imagine that you'd need to build the video hardware around being expandable so that chips could be linked together. I think I heard that the SuperGrafx actually has 2 PCE video chips, but I have no clue how they're connected, or if they even are the exact same chips.
It still blows my mind how many different video processors there were back then, but how they were not commercially released or used outside of the one device they were put in. The funniest part about it all is how the CPUs were commercially produced, and that there was barely any variety. 90% of the time, it would be a 68000 paired with a Z80. If you're lucky, you'd see a V(XX) (V30, V33, etc.), but that's about it. Oh yeah, and there's the 6809.
I wouldn't underestimate the world-wide market for bootleg arcade games. From what I've heard about Street Fighter 2, there were supposedly more bootleg copies sold than legit ones. If anyone can just buy the chips somewhere, or owners can convert old boards to new games themselves with just a ROM replacement, it would be done if there was demand for the game. Nintendo's VS is another example of this, all using slightly different PPUs so owners would have to buy the games instead of just copying the ROMs.
Quote:
It still blows my mind how many different video processors there were back then, but how they were not commercially released or used outside of the one device they were put in.
I agree. It also makes it difficult to make your own homebrew arcade or pseudo-arcade game, because there's absolutely no standard for graphics, as opposed to sound where a couple of Yamaha FM chips are standard.
Quote:
Konami had it to where every game had its own board. I don't know why you'd do this (especially if you're making as many games as Konami was) because it seems really cost ineffective.
Actually, manufacturing a bunch of PCBs is very cheap and probably accounted for a tiny fraction of the game's price. Designing said PCBs was probably expensive, but by re-using designs and bringing in small modifications, it wouldn't have been expensive either.
They did the same for Famicom games using VRC mappers, just changing the wiring a little in order to confuse reverse engineers, and it worked well.
Espozo wrote:
I think I heard that the Supergrafx actually has 2 PCE video chips, but I have no clue how they're connected, if it even is the exact same chips.
Here's how I understand it: The nine signals that come out of the VDC are background/sprite (1 bit), palette index, and subpalette index. The TG16 VCE just uses this as a 9-bit index into CGRAM. The SuperGrafx VCE combines each VDC's background/sprite bit with its subpalette index (0 for backdrop or nonzero otherwise) to produce a 3-way: background, sprite, or transparent. This is fed into a priority encoder, where VDC0 or VDC1 could be given priority for each combination of VDC0 background or sprite and VDC1 background or sprite.
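That understanding can be sketched as a toy Python model. The classification rule (subpalette 0 means transparent, otherwise the background/sprite bit decides) follows the description above; the priority table contents here are purely illustrative, standing in for whatever the real priority registers program:

```python
# Toy model of the SuperGrafx pixel-priority scheme as described above.
# Table contents are hypothetical, not the real register values.
def classify(is_sprite, subpalette):
    """Reduce one VDC's output to background / sprite / transparent."""
    if subpalette == 0:
        return "transparent"
    return "sprite" if is_sprite else "background"

def mix(pix0, pix1, vdc0_wins):
    """Return which VDC (0 or 1) supplies the final pixel.
    vdc0_wins maps a (class0, class1) pair to True if VDC0 has priority."""
    c0, c1 = classify(*pix0), classify(*pix1)
    if c0 == "transparent":
        return 1
    if c1 == "transparent":
        return 0
    return 0 if vdc0_wins[(c0, c1)] else 1

# Example policy: VDC0's sprites in front of everything, its backgrounds
# behind everything from VDC1.
table = {
    ("sprite", "sprite"): True,
    ("sprite", "background"): True,
    ("background", "sprite"): False,
    ("background", "background"): False,
}
print(mix((True, 3), (False, 5), table))  # 0: VDC0 sprite wins
print(mix((True, 0), (False, 5), table))  # 1: VDC0 transparent, VDC1 shows
```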
Espozo wrote:
It still blows my mind how many different video processors there were back then, but how they were not commercially released or used outside of the one device they were put in. The funniest part about it all is how the CPUs were commercially produced, and that there was barely any variety. 90% of the time, it would be a 68000 paired with a Z80. If you're lucky, you'd see a V(XX) (V30, V33, etc.), but that's about it. Oh yeah, and there's the 6809.
Perhaps that's because CPUs were full custom designs, while application-specific video processors were made out of rows of
standard cells, or prefabricated gates that could be copied and pasted into an integrated circuit design, as if typing in a proportional font. Kevtris has pointed out how on the 2A03, the CPU is a tight block of custom NMOS design, while the APU is standard cells.
Bregalad wrote:
It also makes it difficult to make your own homebrew arcade or pseudo-arcade game, because there's absolutely no standard for graphics, as opposed to sound where a couple of Yamaha FM chips are standard.
For the past roughly a decade and a half, it's been possible to just put a PC's chipset on your arcade PCB. And lately even ARM SoCs are decently powerful. The biggest problem is attracting enough people away from touch-controlled phone games to insert coin.
Quote:
For the past roughly a decade and a half, it's been possible to just put a PC's chipset on your arcade PCB. And lately even ARM SoCs are decently powerful.
I meant a retro arcade. Sure, I could just fake retro graphics and call it a day, but sticking to realistic limitations would help, I think.
Quote:
The biggest problem is attracting enough people away from touch-controlled phone games to insert coin
Sadly, attracting people away from touch-controlled phones at all (games or not) is close to impossible today. It's almost scary.
Could all your design ideas be implemented with an FPGA? How does it work? The FPGA does the logic and you just need to connect a port to it for output, and that's it?
In my own experience, FPGA design is, well, a complete and total nightmare, and extremely unreliable (i.e. a design could work or not work based on complete randomness). You also need huge "project" folders with 500 MB of crap in them every time. So I am really sceptical right now. This could change in the future, though, if FPGA manufacturers would focus on ease of use and great tools instead of focusing on pure hardware performance while providing bloated and shitty tools. I'm not even mentioning the 5000 pages of documentation, so huge that it looks as if it was made on purpose to discourage people.
@Bregalad
Well, from your post I can guess that you really don't like FPGAs, but that didn't answer the question, though. Is my assumption correct that you would need the FPGA for the logic, then a port for the output? Is something else required?
Graphics chips can probably be done on microcontrollers, even though microcontrollers have to waste most of their time outputting pixels.
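Some rough arithmetic behind that "waste most of their time" remark, under assumed figures (a 63.6 µs NTSC scanline with roughly 52.6 µs of active video, and a 256-pixel-wide picture):

```python
# How many CPU cycles a microcontroller gets per pixel when bit-banging
# NTSC video.  Timing figures are approximate assumptions, not from the post.
LINE_US, ACTIVE_US, PIXELS = 63.6, 52.6, 256

def cycles_per_pixel(clock_mhz):
    return clock_mhz * ACTIVE_US / PIXELS

print(round(cycles_per_pixel(20), 1))  # ~4.1 cycles per pixel at 20 MHz
print(round(ACTIVE_US / LINE_US, 2))   # ~0.83 of every line spent drawing
```

With only about four instructions' worth of time per pixel, nearly the whole CPU is consumed by video output, which is why these designs usually do game logic only during blanking.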
Okay, I had an idea today about modding a Super Nintendo as a small upgrade. Disconnect the HALT pin from ground and hook it up to the EXT pin, and maybe it can allow enhancement chips to take over the CPU buses. But then there's the question of how fast the S-PPU can be written to.
The PPU most probably cannot accept stuff faster than its DMA speed.
All accesses go through the internal state machine, it doesn't make much sense for it to work in some other way.
VRAM bus capture could easily tell how many accesses are possible. I imagine jwdonal can say a thing or two about it.
I hope replying is OK if I only read the first page and only scanned the rest. As Memblers said, there are several on the 6502.org forum (
http://forum.6502.org/) who are strong at programmable logic (FPGAs, which are field-programmable gate arrays, suitable even for making your own microprocessor, and CPLDs, complex programmable logic devices, which are a step down but can still do a lot more than, say, a PAL). A few members are very strong at video, the first one coming to mind being
Oneironaut (see
this topic of his with lots of pictures and links to videos of his builds using off-the-shelf 74HCxx logic), and someone (I can't remember who at the moment) is reverse-engineering the SID. There is someone also selling the SwinSID (I hope I got the name right) which mimics the Commodore SID. Others have their especially strong points, like algorithms (member
dclxvi and others), OSs (member
kc5tja,
TMorita, and others), Forth (
Dr. Brad Rodriguez who's a big man in the Forth world), etc.. Unfortunately our resident arcade man,
Nightmare Tony, died of cancer four years ago, in his 40s; and
Lee Davison who wrote the excellent EhBASIC and was also on the forum died a few years ago, also not very old. I don't know much about video myself, but I have a 6502 primer about many other aspects of making your own 6502 computer, at
http://wilsonminesco.com/6502primer/index.html .
Oneironaut (mentioned above) plans to use
my large look-up tables for fast 16-bit scaled-integer math (including mult, div, trig, log, square root, and others) which, in the extreme cases, make looking up math functions nearly a thousand times as fast as having to actually calculate them, and you'll have all 16 bits correct, not needing any interpolation, because all the answers are there, pre-calculated. Do that with a maximum-speed 65816, and you'll have math performance that's
thousands of times as fast as the Apple II.
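The look-up-table idea is easy to sketch in Python: precompute every answer for a 16-bit input once, so each "calculation" is just an indexed read with all 16 bits exact and no interpolation. (The real tables cover mult, div, trig, log, square root and more; this toy only does integer square root.)

```python
# Sketch of the large look-up table idea: trade 64K words of storage for
# a one-read "computation" of the integer square root.
import math

SQRT16 = [math.isqrt(n) for n in range(65536)]  # built once up front

def fast_isqrt(n):
    return SQRT16[n]  # a single table read instead of an iterative root

print(fast_isqrt(62500))  # 250
```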
The 65c02 and 65816 are very much in production today, actually being made in huge volumes (over a hundred million units per year today, yes, >100,000,000/year), although they're rather invisible, being at the heart of custom ICs made for automotive, industrial, appliance, toy, and even life-support applications, the fastest ones running over 200 MHz.
Western Design Center (WDC) is the IP holder, and they make most of their money from licensing the IP to client companies. They do sell some hardware parts, including the 65c02 and 65816 which are all guaranteed to be able to do at least 14MHz. The '816 is a natural upgrade path to the 65c02, with registers that can operate in 16-bit mode, and has a 24-bit address bus, and has a lot more instructions and addressing modes, making it much more suitable for things that the '02 is either clumsy at or incapable of. The 65816 outperforms the 68000 and 8086 in the Sieve of Eratosthenes benchmark. Bill Mensch, WDC's owner, is still going strong, and intends to keep the technology available indefinitely.
@Garth
This is good to know. This is on topic and will keep that information in mind. Thanks.
Garth wrote:
The 65816 outperforms the 68000 and 8086 in the Sieve of Eratosthenes benchmark.
Do you by any chance have sources for this? I've tried google but all I've found is you making that claim (
) and
this article, which (quoting an old BYTE article) gives the 68k quite the edge over the 65816 (although the 65816 indeed still crushes the 8086).
Edit: (that article seems to come straight from
Programming the 65816 - so even the canonical source claims the 68k is faster)
Edit edit: Also - that probably came across as more argumentative than I intended! I'm not a 68k fanboy, I love both it and the 65816 equally, for different reasons. Just curious to See The Code, as it were.
Adam, I originally got that info from WDC, and I don't remember if it was in paper literature or on their website which keeps getting changed and not all of the info is there anymore. Samuel Falvo repeats it at
http://forum.6502.org/viewtopic.php?p=190#p190 and says it was published in the January, 1983 issue of BYTE magazine. I don't know if the source code was there. It shows the 65816 taking .73 seconds (that was for ten iterations of the Sieve, IIRC) to do what the 68000 did in .49 seconds, both at 8MHz; but then the '816 went on to achieve higher clock speeds than the 68K. The '816 would only have to run at
11.9MHz to match the 68K; but the SuperCPU plug-in addition for the C64 ran the '816 at
20MHz.
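The 11.9 MHz break-even figure follows directly from those BYTE times, with both CPUs originally clocked at 8 MHz:

```python
# Clock the 65816 would need to match the 68000's sieve time, given
# 0.49 s (68000) vs 0.73 s (65816), both measured at 8 MHz.
t_68000, t_65816, clock_mhz = 0.49, 0.73, 8.0
break_even_mhz = clock_mhz * t_65816 / t_68000
print(round(break_even_mhz, 1))  # 11.9 MHz
```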
Edit: I do have the Eyes & Liechty programming manual referred to in your link. I guess I glossed over that when I read the book many years ago, so I didn't remember it was there. (I do refer to the reference sections in the back of that book frequently though.)
If you're just looking to learn and tinker with making your own simple console it might be worth checking out the
Uzebox if you haven't already. Off-the-shelf new parts, only two ICs (an ATmega and a video encoder) plus a bunch of discrete components. It's open source and you can make it yourself, and add your own improvements or whatever you'd like. It doesn't fit the frankenstein bill of the thread topic, but I noticed your other topic expressing curiosity about Arduino and wanted to point it out. It's kind of like a minimalist Arduino video game console.
The only way a 68000 would be faster is if you're only using 32-bit instructions, only using 24-bit addressing, and emulating every 68000 instruction word for word.
I should have mentioned that the 65c02 has much better interrupt performance than the 68000. I've run up to about 140,000 interrupts per second on my 5MHz 65c02 workbench computer. (Obviously the ISR had to be pretty simple.) I have a 6502 interrupts primer at
http://wilsonminesco.com/6502interrupts/ . That said, the 68000 does have an instruction set that's better suited for HLL compilers.
Ahh, I see! Sorry, I misunderstood - you didn't mean that clock-for-clock the 65816 runs the sieve faster, you meant that - at this point in time - you can clock it (more than) high enough to make up the difference. That plus the faster interrupt handling plus the current availability would indeed make them pretty compelling for this kind of project.
(*that said*, I'd be remiss if I didn't mention that apparently
28MHz (!!!) 68000s existed, once upon a time)
psycopathicteen wrote:
The only way a 68000 would be faster is if you're only using 32-bit instructions, only using 24-bit addressing, and emulating every 68000 instruction word for word.
The sources we're looking at claim that, at the same clock speed, the 68k
does in fact outperform the 65816 at this particular task. The 65816 code is given, and while I'm not going to claim it's beyond reproach, it's not doing anything like you're describing; it was written to be fast ("the Sieve program...provides an opportunity to examine performance-oriented programming; since the name of the game is performance, any and all techniques are valid in coding an assembly-language version of a benchmark") by authorities on the CPU.
If you can write a 65816 sieve that actually does outperform the 68k, write it up (in another thread?) and we'll have a shootout
In the case of the SNES, there's the limit of the 8-bit bus, which will reduce all the off-chip traffic speed.
Quote:
I should have mentioned that the 65c02 has much better interrupt performance than the 68000. I've run up to about 140,000 interrupts per second on my 5MHz 65c02 workbench computer. (Obviously the ISR had to be pretty simple.) I have a 6502 interrupts primer at
http://wilsonminesco.com/6502interrupts/ . That said, the 68000 does have an instruction set that's better suited for HLL compilers.
From the info I have found, it takes 44 cycles to enter an interrupt and 20 to exit on a 68000, so 64 cycles just for that (more for Address Error and Bus Error exceptions, as they put extra info on the stack), plus whatever time you spend actually doing anything.
If one is going to design some hardware with a 68K, you'll want to fire your line interrupt earlier if possible, so that you don't get too far into the blanking area and can use the space fully. Cycle-hungry instructions like DIV and MUL will delay interrupt response a fair bit too; you don't want to be doing those in timing-critical code.
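Those entry/exit figures put a hard ceiling on interrupt rate, even with an empty handler:

```python
# Upper bound on 68000 interrupt rate implied by the 44-cycle entry and
# 20-cycle exit overhead quoted above (ISR body assumed empty).
ENTRY_CYCLES, EXIT_CYCLES = 44, 20
overhead = ENTRY_CYCLES + EXIT_CYCLES

def max_interrupts_per_second(clock_hz):
    return clock_hz // overhead

print(max_interrupts_per_second(8_000_000))  # 125000 at 8 MHz
```

Compare that with the 140,000 interrupts per second Garth reports on a 5 MHz 65c02 above, which works out to roughly 36 cycles per interrupt including the handler.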
TmEE wrote:
In case of SNES there's the limit of 8bit bus which will reduce all the offchip traffic speed.
Which is largely balanced out by the 68000 accessing memory only once every four cycles.
For the algorithm mentioned earlier, I assume the comparison is between a 65816 with a 16-bit bus and a 68000. I would think the 65802 mentioned in the WDC programming manual is closer to what's in the SNES, except you do get 24-bit addressing on the SNES, which you don't on the 65802. Now a 68008 (a 68000 with an 8-bit bus) against almost anything would be a loss for the 68008; it is very, very slow when it comes to talking to the external world. Big instructions, big data, lots and lots of slow memory cycles.
Baseline 65816 has an 8-bit data bus. Opcodes are 8-bit, and operands and data are read or written at one cycle per byte. As I understand it, the only real difference with the 65802 was that it could only address 64 KB due to 6502 pin compatibility requirements, which is definitely not true of the 5A22.
Going to a 16-bit system bus in the Super Famicom would have required some reasonably sophisticated glue logic, plus either external wait states or greatly complicated on-die timing.
A 6502 successor with a 16-bit data bus, no half-cycle strobe, and no concern for backward compatibility could have been a monster, approaching double the performance of the 65816 at the same clock speed, or quadruple the performance at the same memory speed. (I'm assuming here that processing would remain more or less I/O-bound...)
I just looked at the Sieve benchmark, and it looks rigged to me. If they used word aligned addresses, they wouldn't have needed the sep and rep.
EDIT: Actually, even with the code posted, I don't understand how the 68000 could be any faster, unless they just used a completely different algorithm to begin with.
93143 wrote:
Baseline 65816 has an 8-bit data bus. Opcodes are 8-bit, and operands and data are read or written at one cycle per byte. As I understand it, the only real difference with the 65802 was that it could only address 64 KB due to 6502 pin compatibility requirements, which is definitely not true of the 5A22.
Going to a 16-bit system bus in the Super Famicom would have required some reasonably sophisticated glue logic, plus either external wait states or greatly complicated on-die timing.
A 6502 successor with a 16-bit data bus, no half-cycle strobe, and no concern for backward compatibility could have been a monster, approaching double the performance of the 65816 at the same clock speed, or quadruple the performance at the same memory speed. (I'm assuming here that processing would remain more or less I/O-bound...)
I learned something new, then; I always thought a vanilla 65816 had a 16-bit bus and would have been something like what you described, with the 5A22 severely bottlenecked by the 8-bit bus.
Here's a slightly optimized version of the Sieve benchmark posted in this thread, compared to a 68000 version of the same code. How can the 68000 be faster at this?
Code:
65816:
top:
tax //2
sep #$20 //3 5
stz flags,x //5 10
rep #$21 //3 13
adc prime //4 17
cmp #size+1 //3 20
bcc top //3 23 cycles
68000:
top:
move.b (d0,a0),d1 //14
add.w d0,d2 //4 18
cmp.w d0,d3 //4 22
bcc top //10 32 cycles
93143 wrote:
Baseline 65816 has an 8-bit data bus. Opcodes are 8-bit, and operands and data are read or written at one cycle per byte. As I understand it, the only real difference with the 65802 was that it could only address 64 KB due to 6502 pin compatibility requirements, which is definitely not true of the 5A22.
Going to a 16-bit system bus in the Super Famicom would have required some reasonably sophisticated glue logic, plus either external wait states or greatly complicated on-die timing.
A 6502 successor with a 16-bit data bus, no half-cycle strobe, and no concern for backward compatibility could have been a monster, approaching double the performance of the 65816 at the same clock speed, or quadruple the performance at the same memory speed. (I'm assuming here that processing would remain more or less I/O-bound...)
I don't know if this is getting too off-topic. Let us know if so, OP and moderator. I'm pretty new on this forum.
The '816, in spite of its 8-bit data bus, handles 16-bit quantities far more efficiently than the '02, in ease of program writing, and in code compactness, and in execution speed. (See my post comparing 6502 to 65816 efficiency at
http://forum.6502.org/viewtopic.php?p=9705#p9705 .) But rather than just extending the 6502 to greater widths, a successor processor needs to address the needs that will become more and more glaring as the door is opened to multitasking, relocatable code, and other things that the '02 was poorly suited to. The '816 does some of this, but I would like to see it taken further.
There's the 65Org32 proposed processor, with the discussion at
http://forum.6502.org/viewtopic.php?f=1&t=1419 . Basically it's a 65816 extended to 32-bit in almost every way except maybe the status register. 32-bit data bus, address bus, A, X, Y, "direct page" register (although it becomes just an offset, able to address the entire 4 gigaword address space, and not required to align with any particular boundaries), "data bank" and "address bank" registers (although again they just become offsets, able to address the entire address space, with no actual banks), stack pointer, program counter, ALU, and anything else I might have forgotten. It would not execute legacy 6502 code, but would nevertheless be very much a 65-family processor, not having lots of registers, nor deep pipelining, nor branch prediction, nor merged instructions (ie, operands will not be merged with op codes). It would add a barrel shifter, hardware multiply, and a few other things. The HDL enthusiasts have not started tackling this one yet. I've been tempted to emulate it with a microcontroller. The performance doing it this way would be terrible, but it would let me experiment with the instruction set and write applications.
There is also a 65Org16, but that's basically just a double-wide NMOS 6502, with no extra capabilities. Sam Gaskil (ElEctric_EyE on the 6502.org forum) was doing this one, and had made a lot of progress, but I have not heard any updates in quite a while.
Then there's Michael Barry's 65m32 which might have the best chance of becoming reality at this point. See
http://anycpu.org/forum/viewtopic.php?f=23&t=300 . It has some things in common with the 6502, but a lot of divergence too, having more registers and merging most operands with the op code so you can fetch an op code and an operand up to 23 bits all in one cycle. Code density is excellent.
I remember earlier efforts, particularly the 65832 design which was finished by WDC but never put into production since they didn't get a big order like from Apple, and Gideon Schweitzer (sp?) and friends' 65GZ032 which was actually running but still needed various things worked out when life changed for the lead engineer and progress came to a stop. These had design goals of still being able to run already-assembled 6502 code, a requirement I hope we can drop because of the complexities and limitations that come with it. The 65GZ032 in native mode was hardly a 65-family processor, but rather a RISC.
The attraction for any of these, over other processors available today, is the easier assembly language (being just a grown-up 6502) and being able to take advantage of one's extensive experience in 6502 and one's 6502 way of mentally forming solutions.
psycopathicteen wrote:
Here's a slightly optimized version of the Sieve benchmark posted in this thread, compared to a 68000 version of the same code. How can the 68000 be faster at this?
Code:
65816:
top:
tax //2
sep #$20 //3 5
stz flags,x //5 10
rep #$21 //3 13
adc prime //4 17
cmp #size+1 //3 20
bcc top //3 23 cycles
68000:
top:
move.b (d0,a0),d1 //14
add.w d0,d2 //4 18
cmp.w d0,d3 //4 22
bcc top //10 32 cycles
Not that it's material, but isn't ADC PRIME 5 cycles rather than 4?
But anyway, you're right. I tracked down the original BYTE magazine article (it's on archive.org
here) and unfortunately there's no actual code given for the 68k assembly version (readers submitted dozens of programs in different languages/platforms, so only the runtimes are shown; in fact, the times were reader submitted too and weren't verified).
I tried a bunch of stuff, thinking maybe the outer loop - which is executed almost as many times as the inner one, it turns out - could be implemented fast enough on the 68k to overcome, but as far as I can tell, you just can't; in the end the 65816 comes out a little bit faster every time. So unless someone else can come up with a speedier 68k implementation than me, I'm convinced that the 65816 does indeed beat the 68k at this algorithm, and that the time given for the 68k in BYTE back in '83 and reprinted in
Programming the 65816 is a myth.
FWIW, here's my code. It's untested, so I make no claims about correctness:
Code:
; FLAGS = a0
; PRIME = d0
; REMAINING = d1
; I = d2
; ONE = d3
; COUNT = d4
moveq #1,PRIME
moveq #1,ONE
move.w #SIZE-1,REMAINING
moveq #0,COUNT
main:
addq #2,PRIME ; next candidate (4)
tst.b (FLAGS)+ ; is this a prime? (8)
dbeq REMAINING,main ; (10/12/14)
cmp.w #-1,REMAINING ; are we done? (8)
beq.s .end ; (8/10)
move.w PRIME,I ; (4)
bra.s .test ; (10)
.top:
move.b ONE,-1(FLAGS,I) ; mark the non-prime (14)
add.w PRIME,I ; move forward (4)
.test:
cmp.w REMAINING,I ; are we done? (4)
bcc .top ; (8/10)
addq #1,COUNT ; we found a prime (4)
subq #1,REMAINING ; do what dbeq didn't (4)
bra.s main ; (10)
I found this:
http://www.keil.com/benchmarks/sieve.asp Quote:
Continue until the next remaining number is greater than the square root of the largest number in the original series. In this case, the next number, 7, is greater than the square root of 25, so the process stops. The remaining numbers are all prime.
Yeah, that is NOT what the 65816 code does.
Reviving something from pages ago
bregalad wrote:
For me it sounds much simpler to use a digital palette index as input rather than an analog input video signal.
I was thinking of something like what Bregalad described earlier in this thread, but jacking it into the NES PPU EXT pins. It would require severing the EXT-GND connection, if I'm not mistaken?
psycopathicteen wrote:
I found this:
http://www.keil.com/benchmarks/sieve.asp Quote:
Continue until the next remaining number is greater than the square root of the largest number in the original series. In this case, the next number, 7, is greater than the square root of 25, so the process stops. The remaining numbers are all prime.
Yeah, that is NOT what the 65816 code does.
It's not what any of the sample code in BYTE does, either; it's possible that the contributor implemented an optimization like that on his own (which would've been disingenuous, I think), but unfortunately we'll never know.
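To make the disputed optimization concrete, here are both sieve variants in Python (a plain sieve of Eratosthenes, not a cycle-accurate transcription of the BYTE benchmark). The square-root cutoff only skips marking passes that are redundant, because any composite up to N already has a factor no bigger than sqrt(N); the set of primes found is identical either way.

```python
# Two sieve variants: marking up to N (what the posted 65816 code does)
# vs. stopping the outer loop at sqrt(N) (the Keil description).
import math

def sieve(n, use_sqrt_cutoff):
    flags = [True] * (n + 1)
    limit = math.isqrt(n) if use_sqrt_cutoff else n
    for p in range(2, limit + 1):
        if flags[p]:
            for m in range(p * p, n + 1, p):
                flags[m] = False  # mark multiples of p as composite
    return [p for p in range(2, n + 1) if flags[p]]

assert sieve(1000, True) == sieve(1000, False)  # same primes either way
print(len(sieve(1000, True)))  # 168 primes below 1000
```

So a contributor using the cutoff would finish faster while still printing the right count, which is exactly why unverified reader-submitted times are hard to compare.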
The 8086 cannot possibly be 4 times slower than the 68000 either.
adam_smasher wrote:
isn't ADC PRIME 5 cycles rather than 4?
It's a direct-page instruction in the original (6586h, where PRIME is $86), and I don't see any instructions modifying DP (which starts at zero), so no. It's 4 cycles.
So I just thought about this thread and the idea again. It's funny, because back when it was created I thought the idea wasn't interesting (i.e. there are so many existing retro systems, why create your own?), but then I had more and more thoughts, and now I think it's an interesting idea. The problem is that you have to know exactly what you want to do, and there are several ways to make a system "retro":
- A) By using actual chips that existed in the 1980s, and restricting yourself to those only
- B) By using modern chips (microcontrollers, CPLDs) to simulate chips that existed in the 1980s, or chips that could have been designed in the 80s, while still keeping an old-style system architecture
- C) By using an FPGA development kit and ending up with a system-on-a-chip, simulating an entire system architecture that could have been built in the 80s.
A) means you'll have to desolder chips from existing hardware. Not a huge deal, but you'll need to find old computers, arcade boards or whatever as a source for chips, as I do not think you can buy modern replacements for the video and sound chips of the 80s. So the availability of old, mass-manufactured hardware that is not shameful to destroy will be an important part of design choices. Even if you restrict yourself to the most common chips, you'll probably have to stack up several of them in order not to have too many graphical and sound limitations.
GRAPHICS: The TMS9918 seems to be the only "general purpose" graphics chip ever made, all other chips being made very specifically for one particular system. To get something analogous to the NES in quality of graphics, you'd need at least 2 TMS9918s for backgrounds and 3 for sprites, but then you'll probably want your sprites to always be above the backgrounds. So you'll have to use at least 4 TMS9918 chips: the first one doing only backgrounds, the 2nd one doing background + sprites, and the last 2 doing only sprites. Even with that configuration, you're still limited to 8-pixel scrolling granularity and a 16-colour palette, so graphics will still be slightly inferior to the NES. The only other realistic option is to reuse graphics chips from a specific system that happens to be widely available and not shameful to destroy.
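Stacking chips like this needs some glue logic to combine their outputs. A minimal sketch of the idea, assuming each chip's digital pixel stream is available and color 0 is treated as transparent (hypothetical glue; real TMS9918s would more likely be mixed via their external-video input chain):

```python
# Toy front-to-back overlay of several video chips' pixel streams.
# layers[0] is the frontmost chip; each layer is a row of palette indices,
# with index 0 meaning "transparent, show what's behind".
def overlay(layers):
    out = []
    for pixels in zip(*layers):
        out.append(next((p for p in pixels if p != 0), 0))
    return out

print(overlay([[0, 3, 0], [5, 5, 0], [7, 7, 7]]))  # [5, 3, 7]
```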
SOUND: For sound you have a large array of YM sound chips available, which is very nice. You can get them from old sound cards, arcade boards, video game consoles or keyboards, so they're widely available. You can also team up smaller and widely available sound chips such as the SN76489; alone it is quite shitty, but 2 or 3 of them can create plenty of sound channels, and you can combine them to generate more interesting sounds and do effects such as chorus and echo.
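As a sketch of the chorus idea: the SN76489 derives each tone's frequency from a 10-bit divider (f = clock / (32 × N)), so two chips can play the same note with one slightly detuned, producing a slow beat. The clock value and the 5-cent detune below are my assumptions for illustration, not anything specified in this thread.

```python
SN_CLOCK = 3_579_545  # assumed NTSC colorburst clock feeding the SN76489

def tone_divider(freq_hz):
    """10-bit period value for an SN76489 tone channel: f = clock / (32 * N)."""
    n = round(SN_CLOCK / (32 * freq_hz))
    return max(1, min(n, 1023))  # the tone divider register is 10 bits wide

def chorus_pair(freq_hz, detune_cents=5):
    """Dividers for two chips playing the same note, the second slightly sharp."""
    detuned = freq_hz * 2 ** (detune_cents / 1200)
    return tone_divider(freq_hz), tone_divider(detuned)

# A440 on two chips: dividers 254 and 253, whose slight pitch difference
# beats against each other for a chorus-like effect
print(chorus_pair(440))  # -> (254, 253)
```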
CPU: While you could use a 6502 or Z80, I think the most reasonable choice would be the 68000, because it is supported by GCC, so there'd be no need to code anything in assembly. The only reasons I see to use a 6502 or a Z80 are if you are especially fond of those CPUs, or if you're porting a game you already wrote in assembly for one of them. For example, if you're porting a NES game to your custom system, it makes sense to keep the 6502 CPU and change only the graphics/sound/input code, to avoid a complete rewrite. But a complete rewrite in your favourite high-level language is probably less of a hassle than a complete rewrite in assembly, for example when translating from 6502 to Z80.
RAM/ROM: I don't have a strong opinion, but if you can, it's probably better to do what the NES didn't: keep VRAM and main RAM on the same chip. It worsens system performance as a whole, but you have only one bus and one address space, which makes the overall design simpler and less sensitive to tight timing issues. You need a bus that interleaves accesses between the CPU and the graphics chip.
B) is basically the same as above, but you can replace the RAM and CPU with a modern microcontroller, reducing the part count, or keep a genuine vintage CPU if you prefer. You can use a real vintage sound chip and pair it up with either a CPLD or a second microcontroller to generate graphics. The ability to mix modern and old technology at will makes building such a system less of a hassle, especially when it comes to generating graphics, since dedicated graphics chips aren't widely available. Even if you use a modern chip just for graphics and keep everything else retro, it already makes everything a lot simpler and less limited.
C) is pretty much open-ended, and building a game console around an FPGA devkit wouldn't be very hard. However, I don't think it's terribly interesting by itself; you could get basically the same result with a lot less effort just by coding a game that looks and sounds retro on a Raspberry Pi.
It's something else since the parts aren't vintage, but I'll mention the Uzebox, which is pretty much a video game console made out of a single microcontroller - it seems impressive! However, the graphics are lower resolution than the NES. I'll try to get one someday.
Bregalad wrote:
The TMS9918 chip seems to be the only "general purpose" graphical chip ever made, all other chips being very specifically made for one particular system.
I think there might be some argument to be made for the AY-3-8900 video IC used by the Intellivision. But I might be misled by the widespread subsequent success of the AY-3-8910/2/4 sound ICs.
Quote:
You can also team up smaller and widely available sound chips such as the SN76489; alone it is quite shitty, but 2 or 3 of them can create plenty of sound channels, and you can combine them to generate more interesting sounds and do effects such as chorus and echo.
I thought I already posted this somewhere here, but there's a series of arcade machines (Mr. Do's Castle, Mr. Do's Wild Ride, Do! Run Run) that uses four.
Listening to the in-game music on youtube is underwhelming, unfortunately.
I wonder how fast an ARM with a frame buffer could draw graphics if you added circuitry to ignore byte writes where the pixel is color 0. With an ARM you can write 4 pixels at once.
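The behavior of that proposed circuit can be modeled in software: a 32-bit store covers 4 one-byte pixels, and hypothetical hardware drops any byte lane equal to color 0 so transparent sprite pixels don't overwrite the background. This is just a model of the idea, not real hardware:

```python
def write_word(framebuffer, addr, word):
    """Model a 32-bit store where hardware masks out zero bytes (color 0)."""
    for i in range(4):
        pixel = (word >> (8 * i)) & 0xFF  # little-endian byte lanes
        if pixel:  # the proposed circuit ignores color-0 byte lanes
            framebuffer[addr + i] = pixel

fb = bytearray(8)               # tiny frame buffer
fb[0:4] = b"\x07\x07\x07\x07"   # pretend background, color 7
write_word(fb, 0, 0x00FF0001)   # sprite pixels: 0x01, 0x00, 0xFF, 0x00
print(list(fb[0:4]))            # -> [1, 7, 255, 7]: zero lanes kept the background
```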
lidnariq wrote:
I think there might possibly be some argument to be made about the AY-3-8900 video IC used by the Intellivision.
Its graphical possibilities are extremely shitty, though.
Quote:
I wonder how fast an ARM and a frame buffer can be used to draw graphics, if you make circuitry to ignore byte writes if the pixel is color 0. With an ARM you can write 4 pixels at once.
Sure, but it's not retro at all. If you use an ARM chip to simulate a tilemap and sprites or something along those lines, then it's like solution B).
The 6845 CRTC can be used to generate the timing for video signals; you then add other logic to do the rest. It needs an input clock in characters, since the 6845 does all of its timing in characters (tiles), not pixels. The 6845's registers are used to program the start address, the height of tiles, how many tiles per row, how many rows, the timing offsets, and the address of the cursor. You then add other logic that takes the address, the row number within the tile, whether it is the cursor, and whether it's within the visible part of the picture (these are all outputs from the 6845), and generates the picture.
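As a rough illustration of that register programming, here's how the main 6845 registers might be filled in from a desired tile geometry. The register roles match the datasheet, but the porch and sync positions below are placeholder assumptions, not tuned values:

```python
def crtc_regs(h_disp, h_total, v_disp, v_total, tile_height, start_addr=0):
    """Illustrative Motorola 6845 register values; all horizontal and
    vertical counts are in characters (tiles), not pixels."""
    return {
        0: h_total - 1,                # R0: horizontal total minus one
        1: h_disp,                     # R1: horizontal displayed
        2: h_disp + 2,                 # R2: hsync position (placeholder porch)
        4: v_total - 1,                # R4: vertical total minus one (char rows)
        6: v_disp,                     # R6: vertical displayed
        7: v_disp + 1,                 # R7: vsync position (placeholder)
        9: tile_height - 1,            # R9: max scanline address = height - 1
        12: (start_addr >> 8) & 0x3F,  # R12: display start address, high byte
        13: start_addr & 0xFF,         # R13: display start address, low byte
    }

# e.g. a 32x30-tile screen (256x240 pixels with 8-pixel-tall tiles)
regs = crtc_regs(h_disp=32, h_total=42, v_disp=30, v_total=32, tile_height=8)
```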
That's pretty dang interesting. Yeah, I think that's a good idea. The only limitation is that lines would have to be an even number of pixels, so you can't have 341 pixels/cycles per line.
The 6845 is ... not obviously useful to me? Its big advantage is that it lets you switch between multiple pixel clocks, or multiple monitor scan rates, and cleanly lets you pack RAM when your active screen isn't a power of two wide.
The contemporary version of it supported a maximum character rate of 2.5 MHz and a maximum scanline length of 256 characters. Obviously it could be sped up (as the VGA does...). 341 has two prime factors, so you could use 11-pixel-wide tiles and fit 23 on-screen out of a scanline of 31 tiles. But 11 is a pain...
For a fixed-rate system, it really feels like it just complicates things without obviously helping. Building a simple system to draw out a tilemap display is quite easy (e.g. the Sprint 2 arcade hardware and the PlayChoice-10 help screen are both quite simple).
Motorola 6845 datasheet
psycopathicteen wrote:
That's pretty dang interesting. Yeah I think that's a good idea. The only limitation is that lines would have to be an even number of pixels so you can't have 341 pixels/cycles per line.
The CRTC needs to be driven at character rate by dividing down what the datasheet calls an "external dot counter". This means there already needs to be a divider in front of it, and the hsync signal from the CRTC could reload this divider to shorten or lengthen one character, allowing a wider choice of scanline periods.
The NTSC standard specifies a total scanline period of 227.5 cycles of the color subcarrier. Assuming a 21.47 MHz master clock (six times the color subcarrier), the standard scanline is 1365 master clocks. So for a TMS9918-style 5.37 MHz dot clock and a 256- to 280-pixel* picture width, you could program the CRTC for a horizontal total of 42 characters (336 pixels; 1344 master clocks) and use some circuitry triggered by hsync to "stretch" one character by an additional 21 master clocks. In a 240p system you normally get 262 lines per frame, or 59605 subcarrier cycles, but this would produce an annoying Neo Geo-style stationary pattern of chroma dots. That can be fixed by alternating the subcarrier phase from one frame to the next, which can be done by shortening the three vsync lines from 1365 to 1364 master clocks, giving a dot crawl that's as close to perfect as it'll get.
The other way to do this is to divide the master clock by 3.5, giving 288 to 320 square pixels* out of a 390-pixel line. Program the CRTC for 48 characters (384 pixels) for the same 1344 master clocks, and use the same clock stretching.
* First pixel count is action safe area; second includes overscan area.
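The timing arithmetic in the two schemes above can be checked numerically; all the constants here come straight from the posts:

```python
SUBCARRIER = 315_000_000 / 88   # NTSC color subcarrier, ~3.579545 MHz
MASTER = 6 * SUBCARRIER         # ~21.477 MHz master clock

# One NTSC scanline is 227.5 subcarrier cycles = 1365 master clocks
line_clocks = 227.5 * 6
assert line_clocks == 1365

# 5.37 MHz dot clock = 4 master clocks per pixel:
# 42 characters * 8 pixels = 336 pixels = 1344 master clocks
assert 42 * 8 * 4 == 1344
assert 1344 + 21 == line_clocks  # one character stretched by 21 master clocks

# Square-pixel variant: master clock / 3.5, 48 characters * 8 pixels = 384
# pixels, again 1344 master clocks before the same 21-clock stretch
assert 48 * 8 * 3.5 == 1344

# 262 lines per 240p frame is a whole number of subcarrier cycles (59605),
# which is why the chroma dot pattern stands still unless the phase is
# alternated between frames
assert 262 * 227.5 == 59605
print("all scanline timings consistent")
```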
Are 227.5 cycle scanlines really the best for pixel art? I think a comb filter would blur diagonal details too much.
You can probably do sprites on a microcontroller, but tile maps with discrete logic.
A lot of older arcade machines do sprites in discrete logic too. (e.g. look at the Sprint 2 or Pac-Man hardware: both provide a small number of non-multiplexed sprites).
The biggest difference between the NES and older hardware is the automatic multiplexing of sprites to ... uh, "blitters".
I've been kinda wanting to build an all-new-parts copy of the Pac-Man arcade hardware, but can't quite justify it.