Second topic
"I want to see Chip-8 on NES so that I can extend an emulator stack one higher. Imagine Chip-8 emulator in PocketNES in VisualBoyAdvance GX in Dolphin in Wine in Virtual PC or VMware." - tepples
2 of the main problems I can think of are:
1) Screen resolution- Easy to solve, just have 16 tiles with all possible values of 4 bits to simulate 4x4 tiles
2) Memory- Without expansion RAM, the NES has only 2KB of RAM, while CHIP-8 has 4KB. However, we can get an extra kilobyte from the unused nametable, and almost 512 bytes from the part of the main nametable that is unused. We can hide the garbage data with forced blanking (timed with sprite 0 hits and timed code), which also gives more time to read and write to the PPU mid-frame, and use OAM (with sprites disabled) or the attribute tables (with duplicated palettes) for the rest. Hopefully this will give us 3.5KB, and 512 bytes are reserved for the interpreter on CHIP-8, so that should be enough. I haven't worked out the numbers, so I may be wrong. We can get more memory for the interpreter by duplicating the tiles 16 times, so that we can use the unused 4 bits of each tile number as RAM for the interpreter. We only need to emulate 9 cycles per frame (540Hz), so there should be enough time.
Thoughts?
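A rough sketch of the tile idea from point 1, assuming each background tile stands for a 2x2 block of CHIP-8 pixels (so 16 tile patterns cover every 4-bit combination and the 64x32 display needs 32x16 tiles). The bit order inside the tile number and the function names are assumptions made for illustration only.

```python
CHIP8_W, CHIP8_H = 64, 32                        # CHIP-8 display size in pixels
TILES_W, TILES_H = CHIP8_W // 2, CHIP8_H // 2    # 32 x 16 background tiles


def tile_index(fb, tx, ty):
    """Return the 4-bit tile number for the 2x2 pixel block at tile (tx, ty).

    fb is a list of CHIP8_H rows, each a list of CHIP8_W ints (0 or 1).
    Assumed bit layout: bit 3 = top-left, bit 2 = top-right,
    bit 1 = bottom-left, bit 0 = bottom-right.
    """
    x, y = tx * 2, ty * 2
    return (
        (fb[y][x] << 3)
        | (fb[y][x + 1] << 2)
        | (fb[y + 1][x] << 1)
        | fb[y + 1][x + 1]
    )


def build_nametable_rows(fb):
    """Convert the whole framebuffer into TILES_H rows of TILES_W tile numbers."""
    return [[tile_index(fb, tx, ty) for tx in range(TILES_W)]
            for ty in range(TILES_H)]
```

With this layout the screen occupies 16 of the 30 nametable rows, which matches the row count used later in the thread.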
Another problem:
3) Input- The CHIP-8 (or is it Chip-8?) has a 16-key hex keyboard, whereas the NES controller only has 8 buttons. An obvious solution would be to have a screen for mapping NES controller buttons to CHIP-8 keys, or to use 2 controllers. Another way would be to use button combinations or a modifier button (if NES A is pressed, NES B means CHIP-8 5, but otherwise it means CHIP-8 F).
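A minimal sketch of the modifier-button idea. Only the B = F / A+B = 5 pairing comes from the post above; every other table entry, the bit assignment of the buttons, and the function name are made up for illustration.

```python
# Hypothetical bit masks for the eight NES buttons.
A, B, SELECT, START, UP, DOWN, LEFT, RIGHT = (1 << i for i in range(8))

# Hypothetical key tables: one used normally, one while A is held as a modifier.
PLAIN = {B: 0xF, UP: 0x2, DOWN: 0x8, LEFT: 0x4, RIGHT: 0x6,
         START: 0x0, SELECT: 0x1}
WITH_A = {B: 0x5, UP: 0x3, DOWN: 0x9, LEFT: 0x7, RIGHT: 0xA,
          START: 0xB, SELECT: 0xC}


def chip8_keys(buttons):
    """Return a 16-bit mask of pressed CHIP-8 keys for an 8-bit NES button mask."""
    table = WITH_A if buttons & A else PLAIN
    keys = 0
    for button, key in table.items():
        if buttons & button:
            keys |= 1 << key
    return keys
```

One modifier gives 14 keys with 7 remaining buttons, so keys D and E stay unmapped here; a second modifier or a second controller would be needed to cover all 16.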
I can't seem to find information about it (maybe the original pages are gone?), but a long time ago I remember reading about a CHIP-8 emulator for the Vectrex, which has even less RAM, and it ran into the lack of memory problem too.
The author of that emulator made the realization that most games treat most of the address space as read-only, aside from some games like the Pac-Man clone (Blinky), and you could put the addresses actually treated as RAM in RAM while the rest went in ROM. That requires looking at how each game works, but CHIP-8 games are so simple that it's probably not very hard.
Edit: found some information on that via the Internet Archive, though it's mostly just confirming what I remembered.
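A sketch of that ROM/RAM split, under the assumption that the set of addresses a game stores to has already been collected somehow (for example from a trace on a PC emulator). The function names and the dict-based remap are illustrative only; a real 6502 version would need a much cheaper lookup.

```python
LOAD_ADDR = 0x200  # CHIP-8 programs are conventionally loaded at 0x200


def build_ram_map(written):
    """Assign a compact RAM slot to every address the game stores to."""
    return {addr: slot for slot, addr in enumerate(sorted(written))}


def initial_ram(program, ram_map):
    """Seed the RAM slots with the bytes the program image has at those addresses."""
    ram = bytearray(len(ram_map))
    for addr, slot in ram_map.items():
        index = addr - LOAD_ADDR
        if 0 <= index < len(program):
            ram[slot] = program[index]
    return ram


def read_byte(addr, program, ram_map, ram):
    """Read from RAM if the address is remapped, otherwise from the ROM image."""
    if addr in ram_map:
        return ram[ram_map[addr]]
    index = addr - LOAD_ADDR
    return program[index] if 0 <= index < len(program) else 0


def write_byte(addr, value, ram_map, ram):
    """Writes are only legal to addresses that were classified as RAM."""
    if addr not in ram_map:
        raise ValueError(f"address {addr:#05x} was classified as read-only")
    ram[ram_map[addr]] = value & 0xFF
```

The weak point is visible in `write_byte`: a game that stores to an address the analysis missed cannot be run correctly, which is why this needs per-game inspection.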
This convoluted RAM layout is not something I'd feel comfortable working around... I'd much rather just add extra RAM to the cartridge and have a nice contiguous block of memory that can be used without special tricks.
NovaSquirrel wrote:
treat most of the address space as read-only
But that wouldn't be completely accurate and is less fun.
Tokumaru, again for fun. I suppose it would help with a competition, too?
Is anyone interested in making this? I don't have enough experience yet- I know a lot about how the NES works in theory from browsing the forums but can't do much assembly. I could do it myself one day, probably.
I have considered making a CHIP-8 emulator for the NES many times, but the lack of interesting CHIP-8 games to play ends up discouraging me. Most of the games look too glitchy to be fun, IMO. If there was a cool platformer with smooth physics for the CHIP-8, I'd feel much more compelled to write an emulator. The original CHIP-8 looks fairly easy to implement, but I definitely wouldn't use this absurd memory layout, I'd just use extra RAM.
That's sort of what I was interested in. Is there a well-known repository of CHIP-8 games distributed as free software with which to test an emulator?
Quote:
Doing things like this in a machine you KNOW can handle it is not fun, it's just convenient. The fun is all about trying the unusual. Emulating a GB on the NES is not about the goal (playing Game Boy games), but the means (NES simulating another machine).
- tokumaru
Sorry for all of the short posts but will most CHIP-8 games work well with overscan? Also why not make a Mario game? Would it work? I could try to learn to make it. Would it be possible to make music with the single-pitched beep? How does it work with 9 cycles per frame?
orlaisadog wrote:
Quote:
Doing things like this in a machine you KNOW can handle it is not fun, it's just convenient. The fun is all about trying the unusual. Emulating a GB on the NES is not about the goal (playing Game Boy games), but the means (NES simulating another machine).
- tokumaru
Haha OK, an emulator running on the NES is already novel enough, I don't see why avoid using something that was common back in the day (extra RAM) and make the development significantly harder or even impossible, since even all the sacrifices won't get you all the RAM you need. Remember that you don't need just the RAM for the CHIP-8 program, the emulator itself will need a decent amount of RAM to function.
orlaisadog wrote:
How does it work with 9 cycles per frame?
I don't get what you mean by 9 cycles per frame... 540Hz aren't enough to do anything that vaguely resembles a game... Where did you get that number from?
EDIT: OK, I've seen this 540Hz figure thrown around, but I really don't get how that relates to clock speed.
tokumaru wrote:
orlaisadog wrote:
Quote:
Doing things like this in a machine you KNOW can handle it is not fun, it's just convenient. The fun is all about trying the unusual. Emulating a GB on the NES is not about the goal (playing Game Boy games), but the means (NES simulating another machine).
- tokumaru
Haha OK, an emulator running on the NES is already novel enough, I don't see why avoid using something that was common back in the day (extra RAM) and make the development significantly harder or even impossible, since even all the sacrifices won't get you all the RAM you need. Remember that you don't need just the RAM for the CHIP-8 program, the emulator itself will need a decent amount of RAM to function.
orlaisadog wrote:
How does it work with 9 cycles per frame?
I don't get what you mean by 9 cycles per frame... 540Hz aren't enough to do anything that vaguely resembles a game... Where did you get that number from?
EDIT: OK, I've seen this 540Hz figure thrown around, but I really don't get how that relates to clock speed.
Here. How much should it be? As fast as we can?
P.S. I'm saying "we" because someone else may do it
Yeah, I've seen the number being mentioned, but I'm not sure what it means. It definitely isn't CPU cycles as we're used to measuring on the NES, because 540 instructions per second (assuming each instruction is one cycle) isn't nearly enough to make a game with any sort of real time interaction.
tokumaru wrote:
Haha OK, an emulator running on the NES is already novel enough, I don't see why avoid using something that was common back in the day (extra RAM) and make the development significantly harder or even impossible, since even all the sacrifices won't get you all the RAM you need. Remember that you don't need just the RAM for the CHIP-8 program, the emulator itself will need a decent amount of RAM to function.
As I explained you can get 3.5KB, and at least an extra 256 from 4 bits of the tile data and maybe more from OAM. I just realised that attribute data is not extra data as it is included in the name table RAM. Anyway it's fun, cheaper to make a cartridge out of and more impressive that way.
tokumaru wrote:
Yeah, I've seen the number being mentioned, but I'm not sure what it means. It definitely isn't CPU cycles as we're used to measuring on the NES, because 540 instructions per second (assuming each instruction is one cycle) isn't nearly enough to make a game with any sort of real time interaction.
That's what I was saying. What is it then?
In this page someone mentions using a clock speed between 500Hz and 1MHz, which doesn't sound right, so maybe they actually mean KHz rather than Hz? If that's really the case, then we might be in trouble... 9000 instructions per frame really wouldn't be possible.
tokumaru wrote:
In this page someone mentions using a clock speed between 500Hz and 1MHz, which doesn't sound right, so maybe they actually mean KHz rather than Hz? If that's really the case, then we might be in trouble... 9000 instructions per frame really wouldn't be possible.
CHIP-8 was designed for old computers with 4k of RAM, so...
That page wrote:
a function we can tick at 500Hz
So it is 500Hz?
I don't know, 500Hz sounds like too little, 500KHz sounds like too much... I have no idea.
I think there are more CHIP-8 emulators out there than CHIP-8 games!
I've been reading about the CHIP-8 and it does seem like games can run at fairly slow speeds...
This is an emulator that lets you select the number of cycles per frame, starting at 7 and going all the way up to 1000. That seems very doable on the NES, which could probably even handle an SCHIP implementation.
tokumaru wrote:
I've been reading about the CHIP-8 and it does seem like games can run at fairly slow speeds...
This is an emulator that lets you select the number of cycles per frame, starting at 7 and going all the way up to 1000. That seems very doable on the NES, which could probably even handle an SCHIP implementation.
1000? OK, I found out that there are 29780.5 CPU cycles per frame. That's about 30 cycles for 1 cycle, or about 15 the way I'm thinking because there can't be PPU memory access in rendering time. Is that enough? I don't know much assembly, either CHIP-8 (none at all) or NES (a tiny bit).
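To put numbers on the budget, here is a small sketch using the 29780.5 cycles-per-frame figure quoted above and assuming a rounded 60 frames per second. The function names are made up for illustration.

```python
NES_CPU_CYCLES_PER_FRAME = 29780.5   # figure quoted in the post above
FRAMES_PER_SECOND = 60               # rounded; an assumption for this sketch


def instructions_per_frame(chip8_hz):
    """CHIP-8 instructions to run each frame for a given instruction rate."""
    return chip8_hz / FRAMES_PER_SECOND


def cpu_cycles_per_instruction(chip8_instructions_per_frame):
    """Average 6502 cycles available to emulate one CHIP-8 instruction."""
    return NES_CPU_CYCLES_PER_FRAME / chip8_instructions_per_frame


for per_frame in (9, 100, 1000):
    budget = cpu_cycles_per_instruction(per_frame)
    print(f"{per_frame:5d} instructions/frame -> {budget:8.1f} CPU cycles each")
```

This treats the whole frame as available for emulation; any time spent on PPU updates or waiting for blanking reduces the per-instruction budget further.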
1000 would be too much for the NES to handle, because even though many of the operations are simple, you still have to fetch the instruction, decode it, update the VM state, and so on, for every instruction. Also, drawing sprites is a fairly slow process, since it involves bit shifting, bitwise operations, and even more operations to check if any pixels were erased, all of this between several bytes, but the whole thing still consumes 1 cycle, apparently.
Most examples on that page work fine at 100 cycles per frame though, or even less, and that's fairly realistic for a possible NES version. That should be made configurable on the NES as well, IMO.
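The per-instruction work described above (fetch, decode, update state, and the XOR sprite draw with its erase check) can be sketched roughly like this. It is a toy model in Python, not NES code: only five opcodes are handled, every sprite pixel wraps at the screen edges (a simplification), and the class and method names are assumptions.

```python
class Chip8:
    def __init__(self, program):
        self.mem = bytearray(4096)
        self.mem[0x200:0x200 + len(program)] = program
        self.v = [0] * 16            # registers V0..VF
        self.i = 0                   # index register
        self.pc = 0x200
        self.fb = [[0] * 64 for _ in range(32)]

    def step(self):
        # Fetch: opcodes are two bytes, big-endian.
        op = (self.mem[self.pc] << 8) | self.mem[self.pc + 1]
        self.pc += 2
        # Decode the common fields.
        kind, x, y = op >> 12, (op >> 8) & 0xF, (op >> 4) & 0xF
        n, nn, nnn = op & 0xF, op & 0xFF, op & 0xFFF
        # Execute.
        if kind == 0x1:              # 1NNN: jump
            self.pc = nnn
        elif kind == 0x6:            # 6XNN: VX = NN
            self.v[x] = nn
        elif kind == 0x7:            # 7XNN: VX += NN (no carry flag)
            self.v[x] = (self.v[x] + nn) & 0xFF
        elif kind == 0xA:            # ANNN: I = NNN
            self.i = nnn
        elif kind == 0xD:            # DXYN: draw sprite
            self.draw(self.v[x], self.v[y], n)
        else:
            raise NotImplementedError(hex(op))

    def draw(self, x0, y0, height):
        """XOR an 8-pixel-wide sprite from mem[I..] onto the screen.

        VF is set to 1 if any lit pixel was erased, else 0.
        """
        self.v[0xF] = 0
        for row in range(height):
            bits = self.mem[self.i + row]
            for col in range(8):
                if bits & (0x80 >> col):
                    px, py = (x0 + col) % 64, (y0 + row) % 32
                    if self.fb[py][px]:
                        self.v[0xF] = 1
                    self.fb[py][px] ^= 1

    def run_frame(self, instructions):
        """Run a configurable number of instructions, as suggested above."""
        for _ in range(instructions):
            self.step()
```

The draw routine counts as one CHIP-8 instruction but loops over up to 8 x N pixels, which is why it dominates the cost on a slow host.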
tokumaru wrote:
1000 would be too much for the NES to handle, because even though many of the operations are simple, you still have to fetch the instruction, decode it, update the VM state, and so on, for every instruction. Also, drawing sprites is a fairly slow process, since it involves bit shifting, bitwise operations, and even more operations to check if any pixels were erased, all of this between several bytes, but the whole thing still consumes 1 cycle, apparently.
Most examples on that page work fine at 100 cycles per frame though, or even less, and that's fairly realistic for a possible NES version. That should be made configurable on the NES as well, IMO.
It could end up variable. If we can put the most-used variables in non-PPU RAM it could give a speed boost.
Edit: As NovaSquirrel said, it would be possible to analyse the ROM to see what is never changed, but my RAM layout could act as a fallback?
I've just realised that OAM memory will decay if forced blanking is used. Could sprites be turned on occasionally during emulation to keep it refreshed or would there not be enough time?
Edit: Offtopic but I just read that the decay is affected by temperature. Could there be a NES thermometer?
For it to make any sense (to me), I think you need a method to enter S/CHIP-8 programs into the interpreter. That makes WRAM (battery backed?) all the more important.
You could either transfer the programs to RAM via a USB-to-NES-port cable and an app on your PC, or it could work like Family BASIC, complete with a suitable interface. Most likely something T9-inspired, since keyboards aren't an option. 35 opcodes would be no match for a d-pad hexnumpad + action button.
https://en.wikipedia.org/wiki/T9_(predictive_text)
Unless it's like PocketNES for Game Boy Advance, where you choose a bunch of NES ROMs when you build a GBA ROM.
I thought it would be like PocketNES, maybe with a GUI with options on graphics (affects CHR ROM data so cannot be changed at runtime). However, an IDE would be interesting. It would probably need the Family BASIC keyboard though. Oh, I didn't read your post properly.
FrankenGraphics wrote:
Most likely something T9-inspired since keyboards aren't an option. 35 opcodes would be no match for a d-pad hexnumpad + action button.
What about an assembler with the attached layout (I haven't made anything properly yet)? The categories come from https://en.wikipedia.org/wiki/CHIP-8 and the names from http://devernay.free.fr/hacks/chip8/C8TECH10.HTM. Start and select change the active category. These are listed on the left, and the active one is brighter. A graphics editor would also be possible. There would also be a PocketNES-like version with no save RAM. The labels would be named automatically but the names could be changed.
I wouldn’t hardcode specific opcodes to the controller - you need it to be able to edit code!
Rather, I’d keep 16 soft buttons on the lower portion of the screen. They’d change title depending on context. So, the 8th button would be labeled MATH. If you push it, you then set the x ID (buttons now labeled 0-F), then the y ID (same), then the operation to be carried out between them (OR, AND, XOR, ADD and so on).
D-pad = cursor
A = confirm
B = go back a step and erase
Sel + A or B = step forward or backward without editing
Sel + d-pad = move edit cursor freely.
// remember to update the soft key readout contextually/accordingly.
Start = takes us to a menu from where you load, save and run programs and change file metadata like emulation speed, if possible.
edit: you could perhaps let the player use the 2nd controller for user-defined hotkeys, maybe initialized as the most common opcodes used, as per a frequency count in public domain games.
edit2: you can speed up this input procedure even more by predicting likely followup instructions, and set the cursor to the corresponding soft button after one opcode is completed. For example, a chip-8 branch instruction (which always increases PC by an absolute 2 if taken) will mostly be followed by a jump or jump to subroutine, so in a LUT, it makes sense to let all branch cells point the cursor to jump. Other follow-up predictions can be derived from a frequency analysis of pairs in the freely available chip-8 library, just like how we can sort the frequency of the most used single opcodes.
An analogy is letter frequency analysis. A table of (opcode) pairs where one axis represents the first of the two could then be frequency sorted in order to generate the LUT to be used.
I suppose cursor positions for X and Y register ID:s are well enough predicted to often be the same or neighboring ID:s, which also makes the editor simple in this respect. Just keep the last used x and y ID:s in RAM.
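A sketch of how such a follow-up LUT could be generated offline from a set of program images. It assumes opcodes are grouped by their top nibble (a coarse classification chosen for this example) and that every aligned 16-bit word is code, so sprite data embedded in a program will skew the counts. The function names are hypothetical.

```python
from collections import Counter, defaultdict


def opcode_class(op):
    """Group a 16-bit CHIP-8 opcode by its top nibble (coarse, assumed grouping)."""
    return op >> 12


def build_followup_lut(programs):
    """For each opcode class, return the class that most often follows it.

    programs is an iterable of bytes objects holding CHIP-8 program images.
    """
    pairs = defaultdict(Counter)
    for prog in programs:
        # Split the image into big-endian 16-bit words.
        ops = [(prog[i] << 8) | prog[i + 1] for i in range(0, len(prog) - 1, 2)]
        for first, second in zip(ops, ops[1:]):
            pairs[opcode_class(first)][opcode_class(second)] += 1
    return {first: counts.most_common(1)[0][0] for first, counts in pairs.items()}
```

The editor would then move the cursor to the soft button for `lut[last_class]` after each completed instruction. Classes that never appear as a first element have no entry, so the editor would need a default position for them.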
edit3:
sel + start = toggle insert mode editing
orlaisadog wrote:
I've just realised that OAM memory will decay if forced blanking is used. Could sprites be turned on occasionally during emulation to keep it refreshed or would there not be enough time?
What would you be using OAM for here? The CHIP-8 "sprites" seem to be more accurately called a bitblit operation...
Turning on sprite evaluation for just a scanline or two will corrupt its contents unless you turn it off at the exact right time, or let it turn itself off naturally at the bottom of the screen redraw. It's almost certainly easier to just save the 256b shadow copy of OAM and re-upload it.
Quote:
Edit: Offtopic but I just read that the decay is affected by temperature. Could there be a NES thermometer?
Probably! But it'd require a lot more characterization than we've done yet.
Perhaps instructions' cycle counts in the original interpreter for RCA 1802 might inform roughly how long we have for each instruction in an interpreter on NES or Game Boy, including sprite blits, though we probably don't need to be anywhere near cycle accurate. Is that interpreter archived anywhere accessible?
lidnariq wrote:
orlaisadog wrote:
I've just realised that OAM memory will decay if forced blanking is used. Could sprites be turned on occasionally during emulation to keep it refreshed or would there not be enough time?
What would you be using OAM for here? The CHIP-8 "sprites" seem to be more accurately called a bitblit operation...
Turning on sprite evaluation for just a scanline or two will corrupt its contents unless you turn it off at the exact right time, or let it turn itself off naturally at the bottom of the screen redraw. It's almost certainly easier to just save the 256b shadow copy of OAM and re-upload it.
Quote:
Edit: Offtopic but I just read that the decay is affected by temperature. Could there be a NES thermometer?
Probably! But it'd require a lot more characterization than we've done yet.
It could be used as extra RAM if we do forced blanking most of the time.
Well, that's why you wouldn't use the OAM shadow, but why do you need OAM at all?
Also, you can store the OAM shadow in ROM, although that's wasteful and constrains what you can show.
lidnariq wrote:
Well, that's why you wouldn't use the OAM shadow, but why do you need OAM at all?
Also, you can store the OAM shadow in ROM, although that's wasteful and constrains what you can show.
I mean the OAM itself as extra RAM. We can do forced blanking during game logic.
Oh, gosh. That's certainly clever, but given that OAM isn't readable on older PPUs (2C02A,B,C,D,E, most Famicoms) and OAM can only be accessed sequentially on the 2C02G (writes to OAMADDR cause corruption), I doubt it's practical.
lidnariq wrote:
Oh, gosh. That's certainly clever, but given that OAM isn't readable on older PPUs (2C02A,B,C,D,E, most Famicoms) and OAM can only be accessed sequentially on the 2C02G (writes to OAMADDR cause corruption), I doubt it's practical.
Oh, OK.
Is anyone considering making this?
Why don't you give it a shot?
rainwarrior wrote:
Why don't you give it a shot?
orlaisadog wrote:
I don't have enough experience yet- I know a lot about how the NES works in theory from browsing the forums but can't do much assembly. I could do it myself one day, probably.
That one day could be any day you choose, even today.
rainwarrior wrote:
That one day could be any day you choose, even today.
But I have to learn first, and that could take a while.
This project would take more than one day even if you were already an NES expert. Learning will add more time, but consider it still part of the project.
Anyhow, if that's not the answer you were hoping to hear, I will just say that most people have their own projects they're interested in, and have no reason* to want to fulfil your wish. There are always a lot more interesting ideas around than there are capable developers to implement them. If you're hoping to get a wish, you need to wish for something a lot smaller than this project. You'll find a lot of people here are very willing to help you learn by answering questions.
* I'd do it for a suitable wage.
While this isn't the simplest thing to do on the NES, it isn't the most complex either. There's actually very little NES-specific stuff involved... once you figure out the screen drawing and input, the rest is basically pure 6502 logic.
Ditch the crazy RAM layout though, there's no way that'll work. OAM isn't even a full 256 bytes you can use, since the unused bits of the sprite attribute bytes aren't even stored.
tokumaru wrote:
While this isn't the simplest thing to do on the NES, it isn't the most complex either. There's actually very little NES-specific stuff involved... once you figure out the screen drawing and input, the rest is basically pure 6502 logic.
Ditch the crazy RAM layout though, there's no way that'll work. OAM isn't even a full 256 bytes you can use, since the unused bits of the sprite attribute bytes aren't even stored.
I worked it out. We have 3.5 KB, no OAM needed.
tokumaru wrote:
Ditch the crazy RAM layout though, there's no way that'll work. OAM isn't even a full 256 bytes you can use, since the unused bits of the sprite attribute bytes aren't even stored.
But that's the fun part...
Anyway, 2KB of onboard RAM, plus 1KB for the unused nametable. I need 16 full-width rows of tiles out of 30 for the screen so I have 14 left, so (960/30)*14=448 bytes. Add 64 because we can use the whole attribute table, and... 512 bytes. Add that, and we have 3.5KB. No OAM needed. I'll explain how I have interpreter RAM later, but I already have.
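The arithmetic in that post can be checked with a few lines. The constants are taken from the post; the variable names are just labels for this sketch.

```python
CPU_RAM = 2 * 1024                 # onboard NES work RAM
SPARE_NAMETABLE = 1024             # the second, unused nametable
TILE_BYTES_PER_NAMETABLE = 960     # 32 x 30 tile entries
ROWS_TOTAL, ROWS_FOR_SCREEN = 30, 16
ATTRIBUTE_TABLE = 64

bytes_per_row = TILE_BYTES_PER_NAMETABLE // ROWS_TOTAL            # 32
spare_rows = bytes_per_row * (ROWS_TOTAL - ROWS_FOR_SCREEN)       # 448

total = CPU_RAM + SPARE_NAMETABLE + spare_rows + ATTRIBUTE_TABLE
CHIP8_USABLE = 4096 - 512          # 4KB minus the reserved interpreter area

print(total, CHIP8_USABLE)
```

The two figures come out equal, which also means this count leaves no bytes over for the emulator's own state.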
And software is free. Hardware is not.
orlaisadog wrote:
We can get more memory for the interpreter by duplicating the tiles 16 times, so that we can use the unused 4 bits of each tile number as RAM for the interpreter.
orlaisadog wrote:
Add that, and we have 3.5KB.
That's just not enough... Even if you could somehow identify what's constant and what's variable in any given CHIP-8 program, you'd still have to map this somehow, in order to know what goes in ROM and what goes in RAM. In addition to that, you need RAM for the emulator itself, there's a lot of state to maintain.
You're only doing the math and adding up as much RAM as you can, but like you said, you don't have much programming experience, so you're probably not thinking of practical ways to allocate and access all this scattered RAM. Trust me, this won't be fun, or efficient.
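To illustrate what accessing the scattered RAM involves, here is a sketch of the address translation every emulated memory access would have to go through. The region order and names are assumptions; the sizes are the ones listed earlier in the thread.

```python
# (name, size in bytes) for each piece of memory, in an assumed order.
REGIONS = [
    ("cpu_ram", 2048),
    ("spare_nametable", 1024),
    ("unused_rows", 448),
    ("attribute_table", 64),
]
CHIP8_BASE = 0x200                       # first address above the reserved area
CHIP8_SIZE = sum(size for _, size in REGIONS)


def translate(addr):
    """Map a CHIP-8 address to (region name, offset inside that region)."""
    offset = addr - CHIP8_BASE
    if not 0 <= offset < CHIP8_SIZE:
        raise ValueError(f"address {addr:#05x} is outside emulated RAM")
    for name, size in REGIONS:
        if offset < size:
            return name, offset
        offset -= size
    raise AssertionError("unreachable: offset was range-checked above")
```

On the 6502 this becomes a chain of 16-bit compares and subtractions on every load and store, and three of the four regions then need a PPU port access rather than a plain memory read.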
If you insist on using a "bare bones" cartridge configuration for this, you could go with a CHR-RAM cartridge (e.g. UNROM), and you'd have an entire pattern table (i.e. exactly 4KB) to work with, since only one would be necessary to draw the screen. Being able to access it only during vblank or forced blanking would still be pretty limiting, but if so many CHIP-8 programs work fine with so few instructions per frame, that might just work.
But we only need 3.5K. Anyway, it should only take a few instructions to access, and I can program, just in Scratch (very well) and Python (not so well). Scratch is like any other language. People think it's simple because it's block-based, but if you try hard enough you can even do proper polygon-based, real-time lit 3D.
tokumaru wrote:
crazy at the software level, which makes me much more confident
And CHR-RAM would need PPU access anyway. I need to read posts all the way through...
If you decide to go through with the project, you're obviously free to implement this in any way you want. I'm just warning you, as an experienced programmer, that the kind of memory mapping you're trying to do in real time is extremely impractical, and will add a lot of unnecessary complexity and computation overhead.
I do understand the fun aspect of a programming challenge, but there's a difference between fun and masochism. Simulating another machine inside the NES is fun, but it's a known fact that to simulate a machine, you need a superset of said machine, and trying to do it with less resources than that will only lead to frustration and/or subpar results.
Maybe it's just me, but I'd much rather claim that "I made a proper CHIP-8 interpreter on the NES that runs all programs well", than that "I made a CHIP-8 interpreter that runs some programs and does it at a fraction of the speed the NES CPU allows because my memory layout is so limited". If a "challenge" significantly compromises the workflow and quality of the final product, that's not fun, it's unnecessary masochism.
I'll stop arguing about this now. This is not my project, so if you're really going to pursue this, you're free to do it however you want. I'm just trying to give you GOOD advice, since as an inexperienced NES programmer you might not have considered all the ramifications of the crazy idea that is simulating a contiguous block of memory using all the bits and pieces of memory the NES has, and that's without even considering that you don't need only the memory the CHIP-8 uses, the interpreter itself needs RAM to function. You're just looking at the raw numbers and going "yeah, 3.5KB is almost 4KB, that'll do it" without actually considering HOW you're gonna make that work.
Well, 512 bytes are reserved, so it's fine. The CHIP-8 only has 3.5KB of usable RAM, so the figures are perfect. Anyway, isn't that what NesDev is about? By the way, I'm not saying this is the best way, I'm just discussing how it could be done.
I think it'd be a lot easier to start with NROM with expansion RAM (and solve all the "how do you emulate CHIP-8 at all"),
followed by porting to something with CHR-RAM (and then solve the "how to emulate CHIP-8 with limited bandwidth to the emulated CPU's RAM"),
and only then port to this "no cart memory" model (and solve the "how to deal with fragmented memory").
This process of breaking it into manageable chunks is basically what one learns in first-level classes in project management.
^
Sound advice. It's a logical three stage rocket.
Note that figuring out how to get it to work with wRAM is the most feature-light version. Beginning with identifying the simplest thing that could work is always a good starting point.
orlaisadog wrote:
2KB of onboard RAM, plus 1KB for the unused nametable. I need 16 full-width rows of tiles out of 30 for the screen so I have 14 left, so (960/30)*14=448 bytes. Add 64 because we can use the whole attribute table, and... 512 bytes. Add that, and we have 3.5KB. No OAM needed. I'll explain how I have interpreter RAM later, but I already have.
Tokumaru's response mentioned that it's not practical, but maybe it should be spelled out:
The PPU's RAM is not equivalent to the CPU's RAM. You can't treat them interchangeably. You can only access PPU RAM from the CPU during vertical blank (~10% of the frame time), and that access is much slower than with CPU RAM, so whatever you're planning to do here could easily end up cutting the speed of emulation down 50x.
Even with CPU's onboard RAM, probably 3 pages (768 bytes) of this aren't really usable for your emulation. (The three relevant concepts are: the Zero Page, the Stack, and the OAM buffer.)
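A quick sketch of what reserving those three pages does to the earlier total. The 3-page figure is from the post above; the PPU-side sizes are the ones claimed earlier in the thread, and the variable names are only labels.

```python
PAGE = 256
CPU_RAM = 2048
RESERVED_PAGES = 3                      # zero page, stack, OAM buffer

usable_cpu_ram = CPU_RAM - RESERVED_PAGES * PAGE          # 1280
ppu_side_ram = 1024 + 448 + 64          # spare nametable + unused rows + attributes

available = usable_cpu_ram + ppu_side_ram
needed = 4096 - 512                     # CHIP-8 RAM above the reserved area
shortfall = needed - available

print(available, needed, shortfall)
```

Under these assumptions the layout ends up 768 bytes short before the interpreter's own variables are counted.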
orlaisadog wrote:
No OAM needed.
You might think the internal OAM is another 256 bytes of storage, but in reality this is even less practical as general purpose RAM than PPU RAM is. The constraints for this one are kind of insane, so I won't go into it, but under normal circumstances this is considered "write only" memory. (This is also related to why we usually reserve a 256 byte page of CPU RAM as a buffer to be copied into OAM.)
orlaisadog wrote:
And software is free. Hardware is not.
Software is not free. It costs time.
The hardware for 8k WRAM will add <$1 to a board. The software constraints you're putting on will add many, many hours of extra work to the project, or more likely just make it impossible. That $1 per board would normally only matter if this was a mass market venture where you could sell enough copies to overcome the extra work it necessitated.
According to bootgod's database, 17% of NES games had WRAM (and 25% had CHR-RAM), so it's really not an unusual thing to have in the cart at all.
Another thing to consider is how to get your CHIP-8 program into the cartridge. That 8k WRAM could be treated as battery backed RAM, allowing you to use it to store the individual program, which is probably a lot more convenient than having to recompile each program directly into the ROM.
rainwarrior wrote:
You might think the internal OAM is another 256 bytes of storage, but in reality this is even less practical as general purpose RAM than PPU RAM is. The constraints for this one are kind of insane, so I won't go into it, but under normal circumstances this is considered "write only" memory. (This is also related to why we usually reserve a 256 byte page of CPU RAM as a buffer to be copied into OAM.)
The OAM isn't even a clean 256-byte block, since the unused attribute bytes are not saved anywhere. And there's also the fact that in some revisions of the PPU (only used in early Famicoms, it seems) the OAM isn't readable.
Quote:
The hardware for 8k WRAM will add <$1 to a board. The software constraints you're putting on will add many, many hours of extra work to the project, or more likely just make it impossible.
Another very good point.
Quote:
That 8k WRAM could be treated as battery backed RAM
Being able to save your program so you can work on it between play sessions is definitely a plus, and would also simplify the process of loading other people's programs when using emulators and Flash carts.
As I said (many times) I'll use forced blanking for more PPU communication time. What you said about RAM is a good point, but I enjoy programming so the time thing isn't relevant for me.
orlaisadog wrote:
As I said (many times) I'll use forced blanking for more PPU communication time.
Vblank is only one of several factors though. Even if you could access it during the whole screen I would still say it's impractical. Each random PPU RAM access takes at least ~7x as many instructions as an equivalent random CPU RAM access (and potentially much more). That's without even considering extra instructions needed to implement your complicated memory layout decoding. (Serial access is not that bad, but you need random access for this application.)
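A toy model of why random access through the PPU ports is expensive while serial access is not. It is a simplification of the real $2006/$2007 behaviour (no internal t/v registers, no palette range, increment fixed at 1), and the class and function names are made up; it only counts port accesses, not the extra instructions that load the address bytes.

```python
class PPUPort:
    """Simplified model of reading PPU memory through $2006 and $2007."""

    def __init__(self):
        self.vram = bytearray(0x4000)
        self.addr = 0
        self.expect_high = True      # next $2006 write is the high byte
        self.read_buffer = 0         # $2007 reads are delayed by one access
        self.port_accesses = 0

    def write_2006(self, value):
        self.port_accesses += 1
        if self.expect_high:
            self.addr = (self.addr & 0x00FF) | ((value & 0x3F) << 8)
        else:
            self.addr = (self.addr & 0xFF00) | (value & 0xFF)
        self.expect_high = not self.expect_high

    def read_2007(self):
        self.port_accesses += 1
        value = self.read_buffer                 # returns the previous fetch
        self.read_buffer = self.vram[self.addr]
        self.addr = (self.addr + 1) & 0x3FFF     # auto-increment
        return value


def random_read(ppu, addr):
    """Read one byte at an arbitrary PPU address: two address writes,
    one dummy read to fill the buffer, then the real read."""
    ppu.write_2006(addr >> 8)
    ppu.write_2006(addr & 0xFF)
    ppu.read_2007()
    return ppu.read_2007()
```

In this model a random read needs four port accesses where CPU RAM needs a single load, while each further byte of a sequential run costs only one more `read_2007`. A CHIP-8 interpreter mostly performs the random kind.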
orlaisadog wrote:
I enjoy programming so the time thing isn't relevant for me.
Time is relevant for us all, you're not exempt. ;P It's yours to use, though, so have fun.
But what else would I do in that time? This teaches me skills so I can become a programmer when I'm older.
I have other things to do in computer time (yes I'm a child and I'm only 13) but they are no more useful than this. Anyway I don't know enough. Yet.