yaros wrote:
I am able to write simple parser, lexer or interpreter. When it comes to bytecode/machine code generation my brain shuts down. Could you advise some resources you used and learned from?
Me too. I kind of made up the bytecode as I went along. Maybe the best way to go about it is with an example? This is an event that gets called when the player successfully hacks the keypad.
Code:
function on_game_event(event, param)
if event = GAME_EVENT_KEYPAD_SUCCESS then
set_timer(60)
end if
end function
You can make your compiler simply generate the code as it parses (a real compiler would do multiple passes and use an intermediate representation to optimize more, but we don't need that). You parse a statement (if, assignment, addition, etc.) and output the bytes right away to an array. For jumps, you sometimes need to go back and patch the target address once you know where the jump lands, but for the most part you can do it all in one go as you parse.
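To make the "emit as you go, patch jumps later" idea concrete, here is a minimal sketch in Python (not what my compiler is actually written in). The Emitter class and its method names are made up for illustration, and the opcode values are simply copied from my bytecode listing in this post. There is no parser here, the calls in compile_example() stand in for what the parser would do as it walks the if statement.

```python
# Opcode values copied from the listing in this post; the names are illustrative.
OP_GET_ARG0 = 0x0E
OP_COMPARE_EQUAL = 0x05
OP_JUMP_IF_FALSE = 0x0A
OP_INVOKE_R0 = 0x13
OP_END = 0x1A


class Emitter:
    """Hypothetical helper: collects bytecode in an array as parsing proceeds."""

    def __init__(self):
        self.code = bytearray()

    def emit(self, *values):
        """Append bytes and return the offset of the first one."""
        offset = len(self.code)
        self.code.extend(values)
        return offset

    def emit_jump_placeholder(self, opcode):
        """Emit a jump with a dummy target; return the offset of that target byte."""
        self.emit(opcode, 0x00)
        return len(self.code) - 1

    def patch_to_here(self, operand_offset):
        """Make an earlier jump land on the next byte to be emitted."""
        self.code[operand_offset] = len(self.code)


def compile_example():
    e = Emitter()
    e.emit(OP_GET_ARG0, 0x01)                         # event -> variable slot 1
    e.emit(OP_COMPARE_EQUAL, 0x01, 0x82)              # "if event = ..." condition
    skip = e.emit_jump_placeholder(OP_JUMP_IF_FALSE)  # target unknown for now
    e.emit(OP_INVOKE_R0, 0x05, 0x8A)                  # body: set_timer(60)
    e.patch_to_here(skip)                             # "end if": fix the jump
    e.emit(OP_END)
    return bytes(e.code)


code = compile_example()
print(code.hex(" "))
```

The only bookkeeping is remembering where the placeholder byte lives until the parser reaches the matching "end if".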
This is the bytecode I use right now, with the disassembly next to it. There are generally two types of bytecode: stack-based and register-based. This one is definitely register-based.
The first byte is always the opcode.
Code:
; on_game_event
.byte $0e, $01 ; 0: get_arg0 event
.byte $05, $01, $82 ; 2: compare_equal_byte event 0
.byte $0a, $0a ; 5: jump_if_false 10
.byte $13, $05, $8a ; 7: invoke_r0 set_timer(60)
.byte $1a ; 10: end
- $0e = reads argument 0 passed from the engine and puts it in variable slot $01 (I have 32 variable slots). Note that the compiler detected we never access the second argument, so it ignores it. Minor optimization.
- $05 = equality comparison. Compares whatever is in variable $01 (what we just read) to constant $02 (when the high "minus" bit, $80, is set, it means that it's not a variable, but a constant stored in a table somewhere else. Whenever I encounter a hardcoded constant at compile-time, I look to see if I've seen it before and add it to that table if it's new. Constants like "0" come back quite often, so this is efficient).
- $0a = jump to the location indicated (10, which is the end) if the last test condition was false. My conditional statements are optimized for "ORs of ANDs" (or "Disjunctive normal form" if you want to sound pretentious).
- $13 = calls an engine function (engine function $05 is "set_timer") that takes a single argument and puts it in "r0" (a zero page variable in my engine). Here the argument is $8a, meaning the constant stored at location $0a in the constant table, which holds 60.
- $1a = end.
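The operand encoding and the "have I seen this constant before?" lookup described above can be sketched like this. It is Python for readability, and the names (ConstantTable, constant_operand, describe_operand) are made up for the example. The 128-entry limit is my assumption, based on only 7 bits being left for the index once $80 is used as the flag.

```python
CONSTANT_FLAG = 0x80  # high bit set: operand is a constant-table index
VARIABLE_SLOTS = 32   # number of variable slots mentioned in the post


class ConstantTable:
    """Hypothetical compile-time table that stores each constant once."""

    def __init__(self):
        self.values = []

    def intern(self, value):
        """Return the index of value, adding it only if it is new."""
        if value in self.values:
            return self.values.index(value)
        if len(self.values) >= 0x80:  # assumed limit: 7 index bits
            raise OverflowError("constant table is full")
        self.values.append(value)
        return len(self.values) - 1


def constant_operand(table, value):
    """Encode a hardcoded constant as an operand byte."""
    return CONSTANT_FLAG | table.intern(value)


def variable_operand(slot):
    """Encode a variable slot as an operand byte (high bit clear)."""
    if not 0 <= slot < VARIABLE_SLOTS:
        raise ValueError("no such variable slot")
    return slot


def describe_operand(byte):
    """Decode an operand byte the way a disassembler would."""
    if byte & CONSTANT_FLAG:
        return ("constant", byte & 0x7F)
    return ("variable", byte)


table = ConstantTable()
first = constant_operand(table, 0)    # new constant, goes in entry 0
second = constant_operand(table, 60)  # new constant, goes in entry 1
again = constant_operand(table, 0)    # seen before, reuses entry 0
```

A linear search is plenty for a table this small, and it only runs at compile time anyway.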
I hope this example makes it clear how easy it is to output this kind of bytecode as you parse. Interpreting it is super easy too: a simple jump table and a 32-byte array for your "RAM".
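For the interpreting side, here is a rough sketch of the jump table plus 32-byte "RAM" idea, again in Python rather than the 6502 code the real engine uses. The VM class, the handler names and the way engine functions are passed in are all invented for the example. The constant table only has the two entries this routine touches filled in, and the value 0 for GAME_EVENT_KEYPAD_SUCCESS is an assumption read off the disassembly.

```python
# The bytecode for on_game_event, copied from the listing in this post.
CODE = bytes([0x0E, 0x01, 0x05, 0x01, 0x82, 0x0A, 0x0A, 0x13, 0x05, 0x8A, 0x1A])

# Only the entries this routine reads; the rest of the table is unknown here.
CONSTANTS = [None] * 11
CONSTANTS[0x02] = 0   # assumed value of GAME_EVENT_KEYPAD_SUCCESS
CONSTANTS[0x0A] = 60  # the set_timer argument


class VM:
    def __init__(self, code, constants, engine_functions):
        self.code = code
        self.constants = constants
        self.engine_functions = engine_functions  # function id -> callable
        self.ram = bytearray(32)                  # the 32 variable slots
        self.pc = 0
        self.flag = False                         # result of the last test
        self.running = False
        self.args = ()
        # The "jump table": opcode -> handler.
        self.dispatch = {
            0x0E: self.op_get_arg0,
            0x05: self.op_compare_equal,
            0x0A: self.op_jump_if_false,
            0x13: self.op_invoke_r0,
            0x1A: self.op_end,
        }

    def fetch(self):
        byte = self.code[self.pc]
        self.pc += 1
        return byte

    def operand_value(self, operand):
        """High bit set means constant table, otherwise a variable slot."""
        if operand & 0x80:
            return self.constants[operand & 0x7F]
        return self.ram[operand]

    def op_get_arg0(self):
        self.ram[self.fetch()] = self.args[0]

    def op_compare_equal(self):
        left = self.operand_value(self.fetch())
        right = self.operand_value(self.fetch())
        self.flag = left == right

    def op_jump_if_false(self):
        target = self.fetch()
        if not self.flag:
            self.pc = target

    def op_invoke_r0(self):
        function_id = self.fetch()
        r0 = self.operand_value(self.fetch())
        self.engine_functions[function_id](r0)

    def op_end(self):
        self.running = False

    def run(self, *args):
        self.args = args
        self.pc = 0
        self.running = True
        while self.running:
            self.dispatch[self.fetch()]()


timers = []                                      # stand-in for the engine timer
vm = VM(CODE, CONSTANTS, {0x05: timers.append})  # $05 is set_timer
vm.run(0)                                        # event 0: keypad success
```

After the run, timers holds the single value 60. Running it with any other event number takes the jump straight to the end opcode and leaves the list alone.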
-Mat