Nessemble - new NES assembler

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
by on (#202847)
Hey all. Long time lurker on these forums, first time poster.

I'd like to let you all know about a new NES assembler that I just completed called Nessemble.

It functions very similarly to nesasm, but with a bunch of extra bells and whistles including:

  • `.incpng` pseudo instruction to include images without first converting to CHR data
  • Built-in disassembler to better analyze existing NES ROMs
  • Support for scripting to extend existing functionality

I'd love to know what everyone thinks. Regardless of its usefulness alongside existing assemblers, it was fun to create and might be useful to someone.

Please feel free to check out the website and contribute/raise issues on GitHub.
Re: Nessemble - new NES assembler
by on (#202860)
Welcome.

Some quick comments on the documentation:
  • Operator precedence and associativity (left/right) are not defined.
  • NOP and SBC are listed under illegal/undocumented mnemonics.
  • The syntax seems to have been influenced by NESASM, which I'm not a fan of (especially the need to explicitly mark zero page operands with "<").
  • .org stands for "origin", not "organize".
  • The following pseudo-instructions would be useful: define 16-bit big-endian word, define 32-bit little-endian dword, define 32-bit big-endian dword.
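
To make the byte orders in the last item concrete, here is a minimal Python sketch of what each requested directive would emit. The helper names (`dw_be`, `dd_le`, `dd_be`) are made up for illustration and are not Nessemble pseudo-instructions.

```python
import struct

# Hypothetical helper names; each returns the bytes the directive would emit.
def dw_be(value: int) -> bytes:
    """16-bit big-endian word: high byte first."""
    return struct.pack(">H", value)

def dd_le(value: int) -> bytes:
    """32-bit little-endian dword: lowest byte first."""
    return struct.pack("<I", value)

def dd_be(value: int) -> bytes:
    """32-bit big-endian dword: highest byte first."""
    return struct.pack(">I", value)

print(dw_be(0x1234).hex())      # 1234
print(dd_le(0x12345678).hex())  # 78563412
print(dd_be(0x12345678).hex())  # 12345678
```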

Sorry if I sound too negative. Those were just details that caught my eye.
Re: Nessemble - new NES assembler
by on (#202867)
Not a big fan of the similarities to NESASM (LOW()/HIGH(), < for ZP, [] for indirection), but there are some cool additions. There are some errors in the documentation (I don't mind you calling .org "organize", it's your assembler after all), such as saying 2KB == 0x2000 and 4KB == 0x4000.
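
For reference, a quick check of the sizes in question. This is plain arithmetic rather than anything taken from the Nessemble documentation, and `kb_to_hex` is a helper name made up for this sketch.

```python
def kb_to_hex(size_kb: int) -> str:
    """Return the byte count of size_kb kilobytes as a 4-digit hex string."""
    return f"0x{size_kb * 1024:04X}"

for size_kb in (2, 4, 8, 16):
    print(f"{size_kb:>2} KB = {kb_to_hex(size_kb)}")
```

So 2 KB is 0x0800 and 4 KB is 0x1000, while 0x2000 and 0x4000 are 8 KB and 16 KB.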

Also, it's not clear to me how the BANK() function works... In NESASM, banks are always 8KB, so if the mapper you're using works with 16 or 32KB banks you have to divide the value provided by NESASM by 2 or 4. Also, in NESASM, it doesn't look like BANK() is useful for referencing CHR banks, since the bank counter doesn't reset after the PRG-ROM and CHR banks can be as small as 1KB. Can your assembler get around these problems?
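
The division described above can be sketched in a few lines of Python. This assumes banks are numbered from 0 in 8 KB units, as the paragraph describes for NESASM; `mapper_bank` is a hypothetical helper, not a function of either assembler.

```python
def mapper_bank(nesasm_bank: int, bank_size_kb: int) -> int:
    """Convert an 8 KB bank number into a bank number for a mapper
    whose PRG banks are bank_size_kb kilobytes (8, 16 or 32)."""
    if bank_size_kb <= 0 or bank_size_kb % 8 != 0:
        raise ValueError("bank size must be a positive multiple of 8 KB")
    return nesasm_bank // (bank_size_kb // 8)

print(mapper_bank(6, 16))  # 3
print(mapper_bank(7, 16))  # 3 (second half of the same 16 KB bank)
print(mapper_bank(6, 32))  # 1
```

This only covers the PRG side; it does nothing about the CHR bank counter issue raised in the same paragraph.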

Can Nessemble generate NES 2.0 headers? Can you disable headers altogether, allowing it to be used for developing games for other 6502 platforms (such as the 2600)?

As I see it, the main drawback of NESASM is not its quirky syntax, but the annoying imposition of having to divide the whole program in 8KB chunks. The main drawback of ASM6, on the other hand, is not having any sort of built-in bank management (i.e. BANK() function), which makes inter-bank references that much harder to manage. If you can offer something that doesn't impose such arbitrary limitations and can make the management of multi-bank ROMs easier, Nessemble has the potential of becoming the go-to assembler for creating ROMs without hassle. That is, as long as it's featured in a newbie-friendly tutorial that we can recommend, otherwise I can only see Nerdy Nights producing more and more NESASM users.
Re: Nessemble - new NES assembler
by on (#202910)
Hello,
You have a bug at: http://www.nessemble.com/documentation/
The "Edit on GitHub" link on that page points to: https://github.com/kevinselwyn/nessembl ... s/index.md
I think the target should be: https://github.com/kevinselwyn/nessemble
Re: Nessemble - new NES assembler
by on (#202936)
Thanks for the feedback so far!

qalle, great tips on the documentation. I'll note that `NOP` and `SBC` both have undocumented opcodes, which would probably make the undocumented versions unreachable because the assembler would use the documented ones. Also, the additional pseudo-instructions you mentioned could easily be added with a little bit of scripting. And tokumaru says I can define `.org` as "organize" ;) Although, I've looked at other assemblers, which ones define it as "origin", out of curiosity?

tokumaru, I'll look into some of the banking functionality. I'll admit that I haven't tried to make anything substantial with the assembler yet, but I'm sure pain points will emerge when I do. Also, I'll definitely add NES 2.0 support. If you use `--format RAW` when invoking the assembler, it will output a headerless ROM making 2600 development (theoretically) possible.

klr128, good catch. Merci.

Thanks, all!
Re: Nessemble - new NES assembler
by on (#202937)
I'm curious why indirection brackets () conflict with mathematical brackets on some assemblers but not others? Or are they separate for clarity reasons? Personally I prefer () for indirection because [] are very tedious to type on many non-English keyboards (as we just discussed in another thread).

Also while I'm not totally against NESASM's zeropage addressing notation, I think CA65 is doing it the best way. It automatically chooses zeropage addressing when a zeropage address is used, which can be overridden by a prefix before the address if one needs to.
Non-automatic zeropage addressing could be annoying if one needs to move lots of variables from the zeropage.
Re: Nessemble - new NES assembler
by on (#202941)
snarf2888 wrote:
And tokumaru says I can define `.org` as "organize" ;) Although, I've looked at other assemblers, which ones define it as "origin", out of curiosity?

Probably just about every single one of them, including assemblers for architectures other than 6502. A Google search for "assembly org origin" returns this page, for example.

I'm curious why indirection brackets () conflict with mathematical brackets on some assemblers but not others? Or are they separate for clarity reasons?

I imagine it makes the parser simpler. Just matching lda (some_arbitrary_string),Y won't work; you have to match lda (some_full_expression),Y, which means it has to be applied after the stage of the parser that recognizes expressions. And the expression part of the parser has to be smart enough not to swallow the outermost parentheses.
Code:
; These should assemble the same way, using aaaa,Y addressing mode
lda 3*4+2*5,Y
lda (3*4)+(2*5),Y

; This should assemble differently, using (dd),Y addressing mode
lda (3*4+2*5),Y

Perhaps the author of the MagicKit assemblers (PCEAS and NESASM) thought brackets would simplify the parser and wasn't considering the francophone minority, as more developers were in Japanese- or English-speaking countries when the TG16 and NES were popular.

The other issue is that the 65816 uses brackets for "indirect long" addressing modes, which use the data bank in the pointer instead of the data bank from the data bank register.

Pokun wrote:
Non-automatic zeropage addressing could be annoying if one needs to move lots of variables from the zeropage.

This feature arises from MagicKit's early focus on the HuC6280, the CPU of the TG16. Its direct page is at $2000-$20FF instead of $0000-$00FF. The direct page of the SPC700 (Super NES audio CPU) is at $0000-$00FF or $0100-$01FF depending on P bit 5. The direct page of the LR35902 (Game Boy CPU) is at $FF00-$FFFF, though that one's based on an 8080 rather than a 6502. And the direct page of the 65816 can be moved to start at any location in bank 0 ($000000-$00FFFF), though if direct page doesn't start on a 256-byte boundary, each direct page instruction incurs an extra cycle of index addition penalty.

On the HuC6280, for example, which address should lda $02 read? Should it read $0002 in absolute mode? Or should it read $2002 in direct page mode?
Re: Nessemble - new NES assembler
by on (#202944)
I had parsing problems with [] vs (). I _wanted_ to use () for indirect addressing et al., but it was simpler for the assembler to parse [].

Regarding the zeropage syntax, I also dislike having to use <, but automagically figuring out if an address is zeropage was difficult for my assembler, especially when it came to using labels (since label addresses are gathered during the first pass of the assembler). In the end, I wanted people to opt in to optimizing and using zeropage addressing instead of trying to think for the programmer.
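
A minimal Python sketch of why labels make this hard: during the first pass, a forward-referenced label has no address yet, so the assembler cannot know whether the instruction needs 2 or 3 bytes. The policy shown here (assume absolute when the label is unknown) is just one conservative option chosen for illustration, not a description of how Nessemble or any other assembler works, and the program and names are made up.

```python
def operand_size(label, known):
    """Bytes needed for an instruction that references `label`,
    given only the labels defined so far in the first pass."""
    if label in known and known[label] < 0x100:
        return 2  # opcode + 1-byte zero-page address
    return 3      # opcode + 2-byte absolute address (also the fallback for unknown labels)

# Hypothetical program: label definitions and uses, in source order.
program = [
    ("define", "counter", 0x0010),   # defined before use
    ("use", "counter"),
    ("use", "late_var"),             # forward reference
    ("define", "late_var", 0x0020),  # also in zero page, but seen too late
]

def first_pass(program):
    known = {}
    sizes = []
    for item in program:
        if item[0] == "define":
            known[item[1]] = item[2]
        else:
            sizes.append((item[1], operand_size(item[1], known)))
    return sizes

print(first_pass(program))  # [('counter', 2), ('late_var', 3)]
```

Both variables live in zero page, but only the one defined before its use gets the short encoding under this policy.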

Turns out, it's difficult to write parsers.
Re: Nessemble - new NES assembler
by on (#202947)
I do have a suggestion about the extending. If JavaScript code returns an ArrayBuffer, Uint8Array, or DataView, the assembler should accept that too rather than only a string. (Similar things can be supported in Scheme and Lua if they have similar features, but I don't know if they do.)

The = syntax for defining constants within the assembly code does not seem to be documented, although that is what seems to be the case by looking at the source codes; apparently also a label name by itself with no colon or anything else defines a constant 1, which should also be documented if that is in fact the case.

Also, I do happen to like the MagicKit-style addressing modes, even if some people do not.

Adding additional functions for use with JavaScript-based extensions can also be helpful, such as (could be added into a "Nessemble" global object; you could use a code such as const N=Nessemble; if you wanted an abbreviation):
  • include(filename) for including another JavaScript code from another file.
  • loadText(filename) for reading another file as text.
  • loadBinary(filename) for reading another file as an ArrayBuffer.
  • save(filename,data) for writing data (either a string or an ArrayBuffer or typed array or DataView) to another file.
  • addSymbol(name,value,type) to add a symbol.
  • pass() to determine which pass of the assembler it is.
  • address() to tell you the current address.
  • romArray() to retrieve a Uint8Array for the ROM.
(and so on; possibly not all of them will be implemented at once, but some versions might implement some.)

Another suggestion is the ability to omit the word .macro when calling a macro in cases where it is unambiguous, as other assemblers do.
Re: Nessemble - new NES assembler
by on (#202958)
snarf2888 wrote:
I had parsing problems with [] vs (). I _wanted_ to use () for indirect addressing et al., but it was simpler for the assembler to parse [].


Ophis uses [] as mathematical brackets and () for indirect addressing modes: lda ([1+1]*3),y (which I like).

snarf2888 wrote:
...automagically figuring out if an address is zeropage was difficult for my assembler, especially when it came to using labels...

The reason why I abandoned my assembler project.
Re: Nessemble - new NES assembler
by on (#202970)
tepples wrote:
Pokun wrote:
Non-automatic zeropage addressing could be annoying if one needs to move lots of variables from the zeropage.

This feature arises from MagicKit's early focus on the HuC6280, the CPU of the TG16. Its direct page is at $2000-$20FF instead of $0000-$00FF. The direct page of the SPC700 (Super NES audio CPU) is at $0000-$00FF or $0100-$01FF depending on P bit 5. The direct page of the LR35902 (Game Boy CPU) is at $FF00-$FFFF, though that one's based on an 8080 rather than a 6502. And the direct page of the 65816 can be moved to start at any location in bank 0 ($000000-$00FFFF), though if direct page doesn't start on a 256-byte boundary, each direct page instruction incurs an extra cycle of index addition penalty.

On the HuC6280, for example, which address should lda $02 read? Should it read $0002 in absolute mode? Or should it read $2002 in direct page mode?

Wouldn't LDA $02 simply read $2002 in zero page mode and $0002 in absolute mode since only in direct page mode the upper byte is implied? But I guess that there might not just be one right answer. The relocatable direct page of 65816 and SPC700 would require an assembler directive where the programmer tells the assembler where the DP is for automatic direct page addressing I guess.

I'm not sure how Gameboy assemblers normally handle the LDH instruction though. But on the Gameboy the "direct page" is used for a relocated OAM-DMA subroutine, stack, hardware registers and as a fast direct page-like RAM, so there's not a lot of direct page variables to move around in the first place. I think RGBASM uses LDH as forced direct page addressing no matter if the upper byte ($FF) is used or not, but I haven't tested it.
Re: Nessemble - new NES assembler
by on (#203046)
What about copying the "auto zero page" feature from NESASM 2.51 so people don't have to type the < for zero page stuff?

http://www.2a03.jp/~minachun/nesasm/nesasm_x86.html

I personally only use this version of NESASM for my Megaman Odyssey, because I hate the whole idea of typing those annoying < things in front of zero page addresses and label names.

This autozp version does not make you do that, so I only use that.

Why was it never added to version 3 anyway?
Re: Nessemble - new NES assembler
by on (#203068)
snarf2888 wrote:
I wanted people to opt in to optimizing and using zeropage addressing instead of trying to think for the programmer.


Programmers should be accountable for writing code optimized to their liking, even at such a low level.

I can't speak to `.autozp`, but nesasm is an entirely different assembler than Nessemble. Feel free to check out the code and submit a pull request for new features.
Re: Nessemble - new NES assembler
by on (#203071)
The problem is that 99.9% of the time you DO want ZP addressing (who doesn't want faster code?), and having to type "<" for every freaking ZP variable is incredibly annoying, considering that, at least for me, the bulk of the variables are there, while arrays and things that are referenced less often occupy the other pages. The only time you intentionally need to slow down the access to ZP variables is in timed code, which doesn't have much use on the NES outside of raster effects (and even then there are often other choices to waste 1 extra cycle). On the 2600 it's more common for me to need to access variables using 4 cycles (all of the RAM is in ZP on the 2600!), but it's still the exception.
Re: Nessemble - new NES assembler
by on (#203083)
tokumaru wrote:
and having to type "<" for every freaking ZP variable is incredibly annoying


Yeah, exactly my point earlier. There are apparently some bugs in this 2.51 that were fixed in version 3.whatever, but I'll stay with this older one, solely because of the autozp.
Re: Nessemble - new NES assembler
by on (#203088)
snarf2888 wrote:
snarf2888 wrote:
I wanted people to opt in to optimizing and using zeropage addressing instead of trying to think for the programmer.
Programmers should be accountable for writing code optimized to their liking, even at such a low level.

Well, if you do implement auto-zeropaging you'd want to offer options to manually override it. CA65 does it with the "a:" prefix:
Code:
  LDA a:$0001  ;force absolute addressing

Likewise it can force zeropage addressing with the "z:" prefix.

ASM6 offers no way of doing this AFAIK, which is one of its flaws, although a macro can get around it.
Re: Nessemble - new NES assembler
by on (#203094)
A more common way of forcing 16 bits is the ~
so LDA ~$0000 <- forces 16 bits and not 8 bits. But I don't recommend it going forward.

using @b, @w tends to be the new way so
LDA@b $0000 <- ZP
LDA@w $0000 <- 16bit

This is basically a 68K like syntax, and extends better when you get to 65816 where you add
LDA@l $0000 <- 24 bit
and then on immediates
LDA@w #0000 <- 16 bit immediate mode

Some assemblers use the ## form.
LDA ##0000 <- 16 bit
LDA #0000 <- 8 bit

With (),y:
If you make your parser collapse everything in brackets but not remove them, say given
lda (3*4)+(2*5),Y
lda (12)+(10),Y
then you can detect whether there is a ",y" and, if there is, do a scope check to count the ( and ) and see if you end up with a pair that encapsulates everything from the opcode to the final ",y".
That is, count up on each ( and count down on each ) until you hit 0; if you hit 0 and the next 2 non-whitespace chars are not ",y" then it's not (ZP),y.
If it does have full encapsulation and ",y", you collapse and resolve until only the outer pair remains.
say
lda ((3*4)+(2*5)),Y
lda ((12)+(10)),Y
lda (12+10),y
lda (22),y
Then you can do a check to see if the contents of the last brackets are <= 255 and then switch between zp post-indexed indirect and abs. But in most cases I would say a value above 255 is an error on the coder's part, and giving an error that it is over 255 is the more common use case. A workaround, if they must do it that way, is
lda ((8*4)+(50*5))+0,y
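
The scope check described above can be sketched in Python roughly as follows. This is only an illustration of the parenthesis-counting idea, not code from any assembler; `is_indirect_y` is a made-up name and the function looks at the operand text only (everything after the mnemonic).

```python
def is_indirect_y(operand: str) -> bool:
    """Return True if the operand has the form (expr),Y with one pair of
    parentheses wrapping the whole expression before the ",Y"."""
    text = operand.replace(" ", "")
    if not text.lower().endswith(",y"):
        return False
    expr = text[:-2]  # drop the trailing ",y"
    if not expr.startswith("("):
        return False
    depth = 0
    for i, ch in enumerate(expr):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                # The first "(" closes here; it wraps the whole expression
                # only if this is the last character.
                return i == len(expr) - 1
    return False  # unbalanced parentheses

for operand in ("3*4+2*5,Y", "(3*4)+(2*5),Y", "(3*4+2*5),Y",
                "((3*4)+(2*5)),Y", "((8*4)+(50*5))+0,y"):
    print(operand, "->", "(zp),Y" if is_indirect_y(operand) else "abs,Y")
```

The range check on the resolved value (<= 255 or an error) would come after this step.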
Re: Nessemble - new NES assembler
by on (#203196)
One of the features I like in NESASM is that it does not do auto zero paging.
Re: Nessemble - new NES assembler
by on (#203197)
zzo38 wrote:
One of the features I like in NESASM is that it does not do auto zero paging.

I personally think you're crazy (I mean, why turn down an automatic 25% speed increase when accessing variables?), but I guess that the best solution would be to make this optimization optional.
Re: Nessemble - new NES assembler
by on (#203201)
Actually, as a happy user of ASM6 myself, I may migrate back to using NESASM because of the non-auto ZP thing. I'm interested in PCE development, and may reuse some of the codes if possible, so not being able to automatically force ZP addressing is a necessity (as the ZP on PCE Hu6280 starts from $2000, not $0000, and also, stack is at $2100).

The best of both worlds, though, is to have a switch (either in command line or an assembler directive) to turn on or off whether ZP address modes are auto. I think there're some unofficial versions of NESASM that actually do this.

Or even better, someone takes the time to add Hu6280 (thus automatically also 65C02) support to ASM6, and add the switch mentioned above also.
Re: Nessemble - new NES assembler
by on (#203207)
If you switch to Tass64 (I know, I know, broken record) you can use the .dpage option to make it automatically convert the right region for you. Or if you want to do it manually, you can do .dpage ? and then use the ,d addressing mode, as in LDA $02,d, to force it into ZP mode.

Code:
; PCE test
.cpu "65c02"
.dpage $2000

lda $2002
sta 2
rts

.dpage ?
lda #$02,d
lda 2,d
lda 2
generates
Code:
; 64tass Turbo Assembler Macro V1.53.1515? listing file
; 64tass.exe -o D:\GitHub\test\pce.prg --no-caret-diag --dump-labels -l pce.tass -D BDD=0 -D CRT=0 -L pce.list --verbose-list --line-numbers -Wimmediate D:\GitHub\test\pce.asm
; Fri Aug 25 18:52:28 2017

;Line   ;Offset   ;Hex      ;Monitor   ;Source

:1   ;******  Processing input file: D:\GitHub\test\pce.asm

1                  ; PCE test
2                  .cpu "65c02"
3                  .dpage $2000

5   .0000   a5 02      lda $2002   lda $2002
6   .0002   8d 02 00   sta $0002   sta 2
7   .0005   60         rts         rts

9                  .dpage ?
10   .0006   a5 02      lda $02      lda #$02,d
11   .0008   a5 02      lda $02      lda 2,d
12   .000a   ad 02 00   lda $0002    lda 2


;******  End of listing
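
The selection shown in lines 5 and 6 of the listing can be mimicked with a short Python sketch: an address inside the 256-byte window starting at the direct page base gets the 2-byte encoding, anything else gets the 3-byte absolute one. This is an illustration of the idea only, not 64tass's implementation; `encode` and the small opcode table are made up for the sketch (the opcode values themselves are the standard 6502 ones).

```python
# (direct page opcode, absolute opcode) for two 6502 instructions
OPCODES = {"lda": (0xA5, 0xAD), "sta": (0x85, 0x8D)}

def encode(mnemonic: str, address: int, dpage: int = 0x0000) -> bytes:
    """Pick direct page or absolute encoding for a 16-bit address,
    given the direct page base (like a .dpage setting)."""
    dp_opcode, abs_opcode = OPCODES[mnemonic]
    offset = address - dpage
    if 0 <= offset <= 0xFF:
        return bytes([dp_opcode, offset])
    return bytes([abs_opcode, address & 0xFF, address >> 8])

print(encode("lda", 0x2002, dpage=0x2000).hex(" "))  # a5 02
print(encode("sta", 0x0002, dpage=0x2000).hex(" "))  # 8d 02 00
```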

It sadly doesn't support the exclusive HuC6280 opcodes. I think the 6280 is a Rockwell 65C02 with extensions, so using R65c02 is probably a better match, and then you have to use macros to get the others. I've asked about getting the extras in, but the author's code assumes a 3-letter opcode in a few places, so it's not a trivial thing to add, sadly.
Re: Nessemble - new NES assembler
by on (#203210)
Yeah the Hu6280 is like 65C02 with the additional Rockwell instructions (BBS, BBR and SMB etc) but missing STP and WAI that WDC added. It's too bad because STP is an easy way to make breakpoints, and WAI is useful when waiting for interrupts.


Gilbert wrote:
Actually, as a happy user of ASM6 myself, I may migrate back to using NESASM because of the non-auto ZP thing. I'm interested in PCE development, and may reuse some of the codes if possible, so not being able to automatically force ZP addressing is a necessity (as the ZP on PCE Hu6280 starts from $2000, not $0000, and also, stack is at $2100).

The best of both worlds, though, is to have a switch (either in command line or an assembler directive) to turn on or off whether ZP address modes are auto. I think there're some unofficial versions of NESASM that actually do this.

Or even better, someone takes the time to add Hu6280 (thus automatically also 65C02) support to ASM6, and add the switch mentioned above also.

Yeah a 65816 fork for ASM6 already exists, so adding Hu6280 as well can't be that hard. Using the set direct page directive to set zero page to $2000 should do the trick. But I think ASM6 could also do with a way to force absolute addressing.

Currently the best assembler for PC Engine is Elmer's fork of PCEas (the PCEas exe is in the bin directory). It has done away with the 8 kB bank nonsense and recently got an option to pad the ROM to make 384 kB HuCards.