I've just had a new idea (while it's not totally new, it's still rather unexperienced) way to handle object's programm, and it's really great, because it's much more comfortable, stable and optimised than the old way.
When coding objects "normally", the programm should call a routine that does the frame of the object. Since that routine is actually called every frame (and supposed to return), the code is palced in a sequential structure, it's just done every frame and have no control over itself. The only way to handle complicated object is to have variables that tells "what the object is doing", and possibly a jump table from this. Since you'll have to repeatly change this variable, and create a jump table, code is very inneficient. Even worse, it's hard and confusing to write code in this fashion.
However, there is a way to highhack it and write object code "as if it were main code". The object have it's own programm, with its own "reset" vector (the code that will be executed when the object is first created), etc...
Then the object just calls a subroutine when its frame is done, and this can be done from many places, so that code is much easier and optimal. This subroutine actually pull stack adress and store it in variables, and returns to main code. On the next frame, the main code will restore those variables to the stack and "return", effectively executing next frame of objet just where it left. This leads to a very effective, and more confortable, pseudo-linear object code.
Since it's also possible for object code to return (rts) instead of calling the subroutine mentionned above, if it does this the return adress won't be modified, and on next frame the same piece of code can be done in the original sequential fashion, so that both sequential and linear code types are possible (I don't know if that's the correct way to call them by the way).
The only limitation of this is that no additional data should be on stack when the frame is done, so the object code can only call it's pseudo "frame wait" routine on the first level of code, and cannot save any data on the stack at this point. A way to fix it would be to assign different stack pointers to each objects, but overall it's not needed, exept if you have VERY complicated objects. Since an interrupt could trigger during object code, it's stack have to be big enough to hold everything for it, so one doing this have to be carefull.
Nifty, a simplified form of
multithreading where you just save the PC of each thread. So instead of having your code keep track of what "step" it's at in a complex enemy movement, your scheme does so by saving the PC. That would definitely simplify coding, with very little complexity. Here's a little demo I made, which moves 64 objects in rectangles:
minithreads.zip
Code:
obj_thread:
; first frame action
jsr yield
; second frame action
jsr yield
; etc.
jmp obj_thread
run_object:
; Push last return address back on stack then return
; to where object left off
lda obj_pc+1,x
pha
lda obj_pc,x
pha
rts
; Save return address for object
yield:
pla ; Object JSRs here when done
sta obj_pc,x
pla
sta obj_pc+1,x
rts
It also means your game crashes when the object table gets corrupted? So instead of getting a harmless glitchy object that goes away when you leave the area, you get a crash?
Yes, you both get the idea exacly.
The main good point of it is that it allows LOOPS to be performed in your object code, instead of an anoying high amount of checks to only perform a tiny action and return.
However, I don't really like the "multithread" word, because you'd want to think a game engine would end up something like what Linux or Windows is, which definitely isn't. Objects are supposed to interact with the player and still have basic function independant on their very own, they're not an entiere programm in themselves. Also, it's still on the main programm to decide whenever each object is perfored, and up to the object to decide when it's done, there is no automatic switching, RAM alocating or anything, this makes it perfect and simple for NES games.
Also, I like the idea to have the object also able to just rts if it doesn't want to do anything complicated, so that the adress is constant.
As long as you put the object initial code adress-1 to it's own variable set when creating it, there is no reason to get any crashes.
This demo really is a perfect exemple of this principle, I knewn I haven't completely invented it, but for some reason I just haven't figured how this would simplify my life when coding objects for my current game until yesterday.
Your demo remembers me Gradius, when the second boss where a huge amount of boxes are moving at once without lagging. And when the game lags, it's status bar doesn't move a single bit even tough the game uses only sprite zero-hit to get its timing.
I don't know if this is such an advantage over the jump table method.
I have a set of pointers for all the actions an object can perform. When the object is called, it's "state" is used as an index into that jump table. So there is the "walking" state, the "attacking" state and so on, and setting a new state from the current one is not trouble at all.
As for maintaining their variables and such, I have a big chunk of RAM dedicated to the currently loaded objects, and they use their index (among the currently loaded object, that is, the index of the slot they occupy) to access it. This means that each object has a few bytes of RAM to use freely while they are loaded (although some of the locations have specific meanings, such as the coordinates used by the sprite drawing routine, that then becomes able to draw an object by knowing only it's slot index).
I think objects are rarely animated as in blargg's example, with different frames every time... It's more common that the same block of code repeats over and over, such as the "walking" code, and I feel that the jump table works better for this. Although you could simply jump back after a call to "yield" in your example, achieving a similar effect.
However, I still feel that the jump table method is more organized, and the tables don't waste much more space either, because most objects have very few different actions.
Dwedit wrote:
It also means your game crashes when the object table gets corrupted? So instead of getting a harmless glitchy object that goes away when you leave the area, you get a crash?
No different than if you corrupt something else critical, like the return stack, other addresses in memory, critical player information like position or stage number. Even with a jump table, the index could be made invalid so that it reads a junk address (unless you check the range on every lookup, slowing things down). Besides, getting a crash would make you more likely to find the bug.
Bregalad wrote:
However, I don't really like the "multithread" word, because you'd want to think a game engine would end up something like what Linux or Windows is, which definitely isn't.
Multithreading doesn't imply bloat. It's used in tiny real-time operating systems that run on virtually everything these days. Or the demo I posted, which is multithreaded. The reason to use the term is that it helps others more quickly understand what you're doing.
tokumaru wrote:
I don't know if this is such an advantage over the jump table method. I have a set of pointers for all the actions an object can perform. When the object is called, it's "state" is used as an index into that jump table. So there is the "walking" state, the "attacking" state and so on, and setting a new state from the current one is not trouble at all.
The advantage is more natural coding, and the lack of a need for setting up a jump table. But I am not sure how this multithreaded method handles external actions on an object. If the player hits an enemy and he gets knocked back and flashes, how is this handled? It seems there would still need to be some kind of jump table to know how to cause this state change. Bregalad, any ideas?
Quote:
I think objects are rarely animated as in blargg's example, with different frames every time...
The code I pasted above was just to show the essentials of the threading implementation. Download the demo for a realistic example, where each object goes into a loop for a while. Each is written as if it's the only thing running on the NES, which is the central advantage of this approach.
Quote:
As for maintaining their variables and such, I have a big chunk of RAM dedicated to the currently loaded objects, and they use their index (among the currently loaded object, that is, the index of the slot they occupy) to access it. This means that each object has a few bytes of RAM to use freely while they are loaded
Exactly how things are done in the demo code.
I must say that I do find this idea very interesting. I'm just trying to find the real advantage here.
I looked at the demo, and what I said about sequential actions still apply. This works fine for the little boxes that always move right, then down, left and up and start all over again. Since there are a chain of actions performed one after the other, remembering where execution stopped is a useful thing.
Now, imagine if in this demo the next movement was random. After you finish moving right, instead of moving down you picked a random direction. With the "yield" way (BTW, I don't even know the exact meaning of this word... I must remember to look it up in the dictionary), you'd have to JMP to the location that performs the selected movement (defeating the purpose of saving the space used by a jump table, because the address will be there, just in the instruction itself instead of in a table) and then yield.
So, you not only didn't save the space that a jump table would use (because you must have different JMP or BRANCH instructions for each possible case), but you also made some spagetti code (by jumping and branching here and there), instead of having clean, separate routines for each action.
Instead of jumping to the next action and calling "yield", which uses quite a few more cycles to save the PC, wouldn't it be much simpler and faster to just set the new value for the jump table pointer, so that the appropriate action is executed the next time? Does that make any sense?
blargg wrote:
tokumaru wrote:
I don't know if this is such an advantage over the jump table method. I have a set of pointers for all the actions an object can perform. When the object is called, it's "state" is used as an index into that jump table. So there is the "walking" state, the "attacking" state and so on, and setting a new state from the current one is not trouble at all.
The advantage is more natural coding, and the lack of a need for setting up a jump table. But I am not sure how this multithreaded method handles external actions on an object. If the player hits an enemy and he gets knocked back and flashes, how is this handled? It seems there would still need to be some kind of jump table to know how to cause this state change. Bregalad, any ideas?
You could solve this by sending messages to the objects. Basically, you'd have two addresses associated with each object. One is the initial entry point of the AI (which ends up getting saved in the object's memory), and the other is a pointer to a routine to handle messages (doesn't need to be saved).
The theory is that somehow, collisions would be detected, and the colliding objects would be sent a message like "Mario, you collided with a goomba", and then a "goomba, you collided with a Mario". The idea here is that they're going to exchange a message with each other, and this handler just wants to know which message to give to them. Mario looks at his local variables and determines whether he should send a "die" or send nothing. He determines that he's falling, so he sends "die". Goomba sends "bump", which theoretically would cause other enemies to turn around, and would hurt Mario. The messages are exchanged, goomba dies, and Mario discards the "bump" message, because he's hurting the enemy. You could do this by having Mario check his local variables again, or have Mario set some kind of "ignore the incoming bump message" flag.
I'm not sure if this is the most optimal way, but the theory seems like it'd work.
Yes, I know my name's not Bregalad.
Yes, this seems to be a problem. An object's response to external events must sometimes depend on its state. If some of the relevant state is encoded into the saved program counter, it won't be easy to decode and act on. If this were solved by having each yield call return any event that occurred during the call, the object would have to check this after each yield call, which would litter code with lots of similar branches. The basic problem seems to be that you really have
two axes of execution: the behavior of the object in absence of external events, and the response to external events. The response to the same external event might be different at different points in the first axis. Execution only goes along one axis, so you need a state machine to handle two.
As a concrete example, consider an object which is shielded on one side and vulnerable on the other. If it were moving left and right, an attack from the left side would have a different effect depending on its direction. If movement were coded something like
Code:
loop:
lda #20
sta counter
left:
jsr yield
dec sprite_x
dec counter
bne left
lda #20
sta counter
right:
jsr yield
inc sprite_x
dec counter
bne right
jmp loop
how would you determine the result of an attack? The encoding of the direction it's facing would be tied up in the program counter value.
In that case, the only way I can think of to do it is to provide some kind of indication as to what behavior is necessary. I mean, this code still is a state machine, it just uses a different method of determining which state it's in (entry points, instead of a single variable).
Code:
loop:
lda #20
sta counter
lda #00
sta state
left:
jsr yield
dec sprite_x
dec counter
bne left
lda #20
sta counter
lda #01
sta state
right:
jsr yield
inc sprite_x
dec counter
bne right
jmp loop
You only have access to what you provide, and even if you're not using a state variable to determine which state to use, you can still use one to indicate which state is active, for the benefit of other code.
Right, so you have to explcitly mark states, losing some of the advantage of the threading. Next, how do you transition between states like this? The normal routine handles these transitions directly, for example the change from left to right. If this object gets hit, say it gets knocked back a bit and flashes for a while. Does this mean the main routine is interrupted while this action occurs? It seems that would require two program counter values: one for the main, and one for this external action. Then you need a way to transition back to the main routine.
It'd be great if we had a game with fairly sophisticated enemy objects to try rewriting in this style, to see what issues really come up.
I really thing there is almost only advantage in this multithread point of view as opposed to the jump table.
1 - This save the space needed from the jump table which is big for complex objects like bosses
2 - This saves the bytes and time to do a million of checks to know in which state the object is before actually acting
3 - This save the bytes needed to write new states and sometimes new values of counters that goes with them
4 - You'll end up with more natural code, easy to write instead of messy headache code
The only true adventage of the jump table approach is that if you want an object to always do something regardless of its state, but want only one object to do it, you can just place it before the variable jump. If you do the linear multithreading approach you'll have to call a routine doing it possibly a couple of time just before or after the "yeild" call, which can waste bytes.
To handle collision I have not this much problem, each object is free to do anything, so it will just call the "standard" collision code, which check for collision with player and that differently if he's attacking or not, etc...
Each object is free to call it's own variant of collision code, which can make it invinsible but still able to hurt the player (not calling it would result in a totally transparent enemy), or to have a shielded side as said above.
When an enemy is hurt, it sets a counter which freezes the enemy for a little while (making it's AI code not run at all for a couple of times), which effecively made it look "stunned". Then the enemy periodically check if it's damage is more than a certain ammount, and deletes its own if so.
As the "no pushed data on stack limitation" was still too limitating for bosses (enormous waste of memory doing the same thing again and again because thread stupping inside a subroutine call were forbidden) I took some measures to improve the multithreading programm :
- A individual stack is simuled from each objects, allowing more flexible subroutine calling and data pushing in their object code
- The stack is restored from object memory to the real stack before running object's thread, and pulled from the real stack to object memory after the object's frame is done
- When an object is created, it's always createed with a stack of 2 bytes containing the object's "reset" entry point adress (minus one)
- By default, objects can stop their thread only if 8 bytes or less are pushed on the stack. Howver, during their thread, they have standard 256-byte stack limit, and so can do heavy subroutine call and be interupted ad nauseum.
Object thread jump subroutine :
Code:
lda ObjectActive.w,X
beq ++
tsx
sta ObjectStackPointer
lda ObjectStackLenght.w,X ;Add stack size (2-8) to pointer
tay
clc
adc ObjectStackIndexTbl.w,X
tax
- lda ObjectStack.w,X
pha ;Push bytes from object's stack
dex
dey
bpl -
ldx DataCtr
++ rts ;Go to object thread OR return to main code
When the object thread is done it can either "rts" to do the same frame again, or call the ObjectWait routine shown below :
Code:
ObjectPrgmWait ;An object should either jsr to here
tsx
txa ;Computes difference between initial stack pointer
eor #$ff ;And actual stack pointer
clc
adc ObjectStackPointer
ldx DataCtr
sta ObjectStackLenght.w,X ;Save lenght of Object's stack (minus one)
tay
lda ObjectStackIndexTbl.w,X ;Get stack index of our object
tax
- pla ;Pull all bytes from the CPU stack
sta ObjectStack.w,X ;To the object's virtual stack
inx
dey
bpl -
rts
ObjectStackIndexTbl
.db $00, $08, $10, $18
.db $20, $28, $30, $38 ;Allow 8 bytes per stack per task (by default)
Another alternative would be to assign different stack pointer to each objects. This would decrease execution time as only SP have to be modified, but it would be limitating to a small amount of stack on each thread, and would dissalow the thread to just "rts" in order to do the frame again.
Next step would be to put the "ObjectStackIndexTbl" to RAM and modify the values in function of how many stack bytes the thread actually needs, so that a small thread could share it's stack with a big thread if needed. Also, a message buffer from thread to thread could be implemented.
However I don't have any idea how to implement those. Any ideas ?
I just wanted to share my experience with this concept of what I call "virtual threads", but is generally called (in higher level languages) "coroutines".
When programming on the GBA, I was bored to use "structs" to store parameters and functions to be called each frame when managing my objects.
So, I wrote a simple context switcher in ARM, and allocated a stack for all my objects. I could then create routines, which were yielding to the main "thread" via "wait_frame()". This main thread had to resume these routines at each frame.
It worked quite well, and I had re-written a lot of parts of my engine with this design (namely fadings, transitions, objects collisions, script system, a lot of complex animations, etc). But some problems arose:
1) Memory usage: allocating multiple stacks IS a real memory issue (at least, on GBA, where a lot of data is pushed to memory). Indeed, you have to allocate an arbitrary amount of memory, even if the routine is not going to use half of it. You can also easilly run into stack overflows. This may or may not be a problem on NES, depending on functions calls, and ... on NMI, which could push a lot of things on the stack by itself (PC and status register), then usually A,X and Y in the implementation (so, expect a minimum stack of 6 bytes when NMI is enabled).
2) Halting these couroutines in a clean way is not that easy. Even worse: knowing WHEN they had been killed/halted or have finished is no easy task, and need a lot of pointer work. Perhaps it can be simplified on NES to work around these problems.
3) Context switching added a little overhead for object processing.
Finally, tired of slowdowns, data aborts, infintes loops, and other funny things, I decided to get rid of this system in many places. I'm still using it only when it makes the code a LOT more simpler.
EDIT: Sorry, I hadn't read the entire thread when I wrote that, so I didn't see the solutions you used to solve some of the problems I reported, and I must say they look very interesting !
Thanks for sharing.
Yes, you have to allocate an arbitrary number of stack for an object, even if it's not going to allocate half of it. There should be a way to dynamicly change stack size, but this sounds complicated.
Stopping a process isn't hard when it's an enemy, it's just stopping whenever it goes off-screen (automatically) or when the enemy is defeated and has finished exploding.
For non-enemy process (script, shading, ?) this would seems harder to know when they start and when they finishes.
Quote:
Finally, tired of slowdowns, data aborts, infintes loops, and other funny things, I decided to get rid of this system in many places. I'm still using it only when it makes the code a LOT more simpler.
Could you please explain how you got rid of it and still get the features/ease of code it allows ?
Bregalad wrote:
Stopping a process isn't hard when it's an enemy, it's just stopping whenever it goes off-screen (automatically) or when the enemy is defeated and has finished exploding.
For non-enemy process (script, shading, ?) this would seems harder to know when they start and when they finishes.
Yeah, when graphical/game engine objects are directly tied to their coroutines, it should be easy.
Problems come when coroutines are only generic functions. More problems come when they need to allocate memory, or make changes that will have to be reversed when they end (may they end in unexpected conditions), etc.
In my case, handling fadings (for example) with this system was a complete pain, because I needed to know the state of fading before starting another one: so I had to keep a pointer of the fading thread, but that way, I couldn't automatically delete the memory allocated for thread, or I will lose its state information. Moreover, I had to take care it was correctly restoring temporary palettes, to free the stack AFTER it has really finished, etc. Perhaps my design was bad after all... mmm.. perhaps I should give this method another try ?
Bregalad wrote:
Could you please explain how you got rid of it and still get the features/ease of code it allows ?
I got rid of it at the cost of the ease of code, indeed...
So in summary, this method is good for enemies or other graphical events, but bad for things that should normally be done by the core game engine, right ?
Quote:
I got rid of it at the cost of the ease of code, indeed... Confused
So eventually you decided that handling objects with a jump table in fuction of their "state" was better than doing some kind of multithreading ?
Personally, the more I think about it the more I find the jump table approach horrible. However that's what Konami and Nintendo seems to do, they LOVE using jump tables and do it all the way arround. Their game engines are all done inside the NMI routine (no main thread) and they use jump table to do different things in frame, something I tried earlier in my coding life and found absolutely horrible to do.
Probably the programmers at Konami did found a programming approach to make jump tables more efficient/easier to manage, because they coded the game with the most complex gameplay at their time, containing enemies which a shielded side, enemies that follow other enemies and act togeter, and often complex interaction with the background (typically seen in games from the
Gradius series).
I feel like I missed something about jump tables.
Bregalad wrote:
So in summary, this method is good for enemies or other graphical events, but bad for things that should normally be done by the core game engine, right ?
Initially I planned to use the object system only for game entities such as player, enemies, items and so on. Things like fading were hardcoded to the main loop, if not only to spare the RAM an extra object would use (I can have 32 active objects).
But recently I read about Another World, because of all that talk about vector FMVs (revolving around Celius' engine). Well, it seems to use objects for everything, as well as a simple made up scripting language (the only programming the objects are allowed to have). Take a look at the
author's page and you'll see (relevant part
here). That seemed like a pretty simple way to code a game, although I don't think I'd have the guts to try something so extreme.
However, that still inspired me to make more use of the object system, having it active at all times (not only during the main game), so that the title screen, title cards and other screens could benefit from it. And it seems like it's a real benefit to give tasks to individual objects instead of hardcoding loops, recycling RAM locations (since each object receives a slice of RAM, they have quite a bit to work with) and so on.
So now I plan on having my title card elements as objects, the level loader as an object, and that will make the main code smaller and simpler. Of course there are many things that remain hardcoded. Anyway, I'm still figuring out if this will work as well as I expect. I'll let you know.
Quote:
So eventually you decided that handling objects with a jump table in fuction of their "state" was better than doing some kind of multithreading ?
The Mega Drive Sonics use jump tables, and the games work quite well. I'm still not sure about the best method. The multithread way is good for objects whose code flows naturally from one part to the other. If it's not like that, You'll need a lot of JMP instructions followed by addresses that would use as much space as a jump table when added up.
Quote:
they coded the game with the most complex gameplay at their time, containing enemies which a shielded side, enemies that follow other enemies and act togeter, and often complex interaction with the background
Doesn't seem terribly complex to me... Following enemies could be done with a shared function that changes the coordinates of an object based on another one's, and each enemy holds the index of the one it follows in order to be able to call this routine. Interaction with the background should also use shared routines, often used by the player (usually the most complex object), but an occasional complex enemy could make use of them too.
Well, using multithread/object system outside of gameplay really sound weird to me. Then you're somewhat just coding an operating system and programms running into them.
Yeah that would be an advantage for RAM usage, but the RAM I allocate to object have predefined uses (like Position, FrameAnimation, HitPoints, etc...) that only make sense with enemies or projectiles. I plan to use some other non-enemies and non-projectiles objects like a door that would only open on some conditions or something like that.
Also the routine that draw sprites is NOT in the objects themselves for me, it's in the main code and looks at object's variable to know what to draw. I like this system, but objects that aren't supposed to be visible are forced to use an "empty" sprite, and to have a dummy value for their HP, reward score, etc... But this isn't really an issue.
Bregalad wrote:
Yeah that would be an advantage for RAM usage, but the RAM I allocate to object have predefined uses (like Position, FrameAnimation, HitPoints, etc...) that only make sense with enemies or projectiles.
I just give each object a number of bytes, how they are used is up to the objects. Of course, there are a few supporting functions that expect certain things to be on certain locations... like, an acceleration routine must have access to the object's speed, but if an object doesn't move it may use the byte normally used for speed for anything else, as long as it doesn't call the acceleration routine (and it won't, because it doesn't move).
Quote:
Also the routine that draw sprites is NOT in the objects themselves for me, it's in the main code and looks at object's variable to know what to draw. I like this system, but objects that aren't supposed to be visible are forced to use an "empty" sprite, and to have a dummy value for their HP, reward score, etc... But this isn't really an issue.
I don't have this problem because although the actual sprite drawing is not part of the object code, each object that wants to be displayed calls the routine that draws sprites sending the index of the object slot as a parameter. If they don't call the routine, they don't get drawn, it's pretty simple.
I have to give objects the power to use their RAM and call supporting routines at will because there are so many kinds of objects. Some are rendered with sprites, some with background tiles, some with both, and some aren't rendered at all. Some are relative to the level, some relative to the camera (HUD elements). Each one has to know what they are and make use of the appropriate supporting routines.
Maybe this is why I don't think it's weird to have the object system active during the whole game, because my objects are already pretty generic, while yours were created very specifically for in-game use.
Today I coded a bit and was already able to see the benefits of incorporating some previously hardcoded behaviors as objects. Before I used some really complex code to load the level and draw the first screen, but now I use objects that take care of the heavy work of loading and drawing levels, and I can do it all inside the main loop, without a bunch of weird, repetitive surrounding code.
Who needs C++ for object-oriented programming? You're doing it in assembly!
Yeah somewhat. If it can be more efficient or easier to code I have nothing against it altough I love assembly. I personally just use "objects" for enemies and simpler handling of them, not anything to do with multi-taking OS, or object oriented or anything. It's true that complex BG bosses should interact with background as well.
I still find that object-oriented programming is SO different to regular assembly programming it is almost impossible to convert from one type to another (I haven't much experence with object-oriented but enough to have a small idea).
tepples wrote:
Who needs C++ for object-oriented programming? You're doing it in assembly!
Well, there are some similarities... for example, the shared routines used by the objects could be methods of a class the objects belong to. But there are still a lot of object oriented concepts missing, such as encapsulation. I'm trying to group all the routines and variables related to each subsystem isolated from everything else, because this is where I screwed up last time. I was messing with variables when I shouldn't, reusing RAM that wasn't yet free... So I'm trying to recode everything as object oriented as possible to avoid such problems.
Like I said, it is helping a lot. But I can't have everything in the game be isolated objects, it's just not practical in assembly. The camera, the level map, and stuff like that (mostly things of which there is only one instance) must remain hardcoded into the main program.