Forcing syntax for address access in CA65

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
Forcing syntax for address access in CA65
by on (#165128)
After stumbling over the old problem of passing a constant as an address instead of a value (i.e. I wrote LDA CONSTANT_VALUE instead of LDA #CONSTANT_VALUE), I need to ask:

Is there a way to make the compiler/assembler require a special character for address access?

What I mean is that I want LDA VALUE and LDA 1 to be a syntax error. I want to be forced to write LDA #VALUE and LDA #1 for a value access and something else, for example LDA &VALUE and LDA &1 for an address access.

This way the error cannot happen anymore, because it's unlikely that I'd confuse # and &. But using a constant as an address instead of a value has happened to me so many times now.

Is there a way to force this in CA65?
Re: Forcing syntax for address access in CA65
by on (#165137)
The assembler doesn't know the difference between an "immediate constant" and an "address constant", unfortunately. I don't think there's any way to hack that functionality into it; this is a form of type safety that's just not in the language.

You could create two macros to replace each load instruction, making your own pseudo-instruction:

LIA - load immediate to A
LMA - load memory to A

Then just use these instead of the regular instructions. That's basically just an alternative syntax for LDA # vs LDA, but you might find it less error prone if it's part of the mnemonic?
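
A minimal sketch of what those two macros could look like in ca65. The names LIA and LMA come from the suggestion above; the parameter names and the two example symbols are only illustrative, and the value given for FT_SFX_CH0 is an assumption:

```ca65
; Hypothetical pseudo-instructions wrapping the two addressing modes of LDA.
.macro LIA value
    lda #value          ; immediate: loads the constant itself
.endmacro

.macro LMA location
    lda location        ; memory: loads the byte stored at that address
.endmacro

FT_SFX_CH0 = 0          ; example constant, value assumed for illustration
PPUSTATUS  = $2002      ; example memory-mapped register

    LIA FT_SFX_CH0      ; same as lda #FT_SFX_CH0
    LMA PPUSTATUS       ; same as lda PPUSTATUS
```

The same pair would be needed for LDX, LDY, CMP and every other instruction that has both addressing modes.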

I found forgetting the # was a problem at first when working with 6502 assembly, but after a while it just became unconscious second nature and I'd almost never make that mistake. (Kinda like the old "if (pointer = NULL)" mistake beginners make in C.)
Re: Forcing syntax for address access in CA65
by on (#165169)
Yeah, it's a pity that the language isn't designed in a way that it requires an explicit character before both access versions.

rainwarrior wrote:
You could create two macros to replace each load instruction, making your own pseudo-instruction

Apart from the fact that I would have to create macros for various instructions then, the whole intention was to prevent accidental address access. With a macro, you can still forget to use the macro in the first place.

Also, I prefer to use the actual instructions because source code that's full of macros for well-known instructions just looks wrong.
Just like you would expect a cout in C++, and it would look weird if someone created a macro #define WriteLine(value) cout << (value) << "\n" and used it throughout the program.

rainwarrior wrote:
I found forgetting the # was a problem at first when working with 6502 assembly, but after a while it just became unconscious second nature and I'd almost never make that mistake.

Since the majority of my program code is in C and only the low level stuff (NES-specific things, array copy functions, randomizer and FamiTone wrapper) is in Assembly, I don't work with it enough to become fluent in it.

Interestingly, the problem never happens when I write numbers directly. Only with constants.
I.e. I never write LDA $01.
Probably because addresses are always represented as labels, so this specific version never occurs anyway. Therefore I cannot confuse LDA $01 and LDA #$01 because LDA $01 is just something that I never write anywhere. But LDA Label and LDA #Constant, yeah this is a serious source of error.
Re: Forcing syntax for address access in CA65
by on (#165207)
I've gotten in the habit of always using all capital letters for constants. I find that it helps make it obvious when an immediate label is used.
Re: Forcing syntax for address access in CA65
by on (#165208)
Unfortunately, this doesn't work for me. In my case, the error from yesterday was writing LDX FT_SFX_CH0 instead of LDX #FT_SFX_CH0.

That's why I thought that there might be compiler options to force a separate character. Stuff that doesn't rely on personal style guides.

I don't know why they didn't think about this when designing the language.
I mean, in C, it makes sense that you only have the special character & when you access pointer values, but not when accessing normal values. Because the compiler will warn you if you try to assign a number to a pointer without an explicit cast. So, you can't accidentally make the error.
But in Assembly this can really be a source of frustration, even though defining that LDA value is never a valid statement (and that it always requires a special character, either # or &) should not have been a big deal.
Re: Forcing syntax for address access in CA65
by on (#165209)
My constants are all in capital letters too, but so are the labels I assign to memory mapped registers, so that alone isn't enough to tell them apart. It's been a while since I last used the wrong addressing modes by mistake though... I don't really consider this a problem.
Re: Forcing syntax for address access in CA65
by on (#165210)
I hope that this won't be a problem for me anymore in the near future because, as far as I can see, this should have been the last part of code that I wrote in Assembly. (I had already included the sound, but it started to malfunction when I tried to add a second channel.)
Well, unless I decide to include a digitized voice sample for my main character. Then I have to go into the deep depths of low level code again.
Re: Forcing syntax for address access in CA65
by on (#165220)
DRW wrote:
I don't know why they didn't think about this when designing the language.

Assembly doesn't normally have type safety of any form. That's pretty much all assemblers, though the "missing #" mistake you're complaining about is kind of particular to 6502 mnemonics.

It can't distinguish between a constant pointer (i.e. "label") and a constant value. It can't distinguish between a signed character and an unsigned one. It can't distinguish between a pointer to const data and pointer to non-const data. It can't distinguish between a pointer and a 16 bit integer. Et cetera. These things could be useful, potentially, but type safety isn't the traditional job of an assembler.

Funnily enough, they did make two mild type-safety additions to CA65. One is the ability to annotate imports and reservations with a "zeropage" type property that allows the assembler to know when it can optimize 2-byte addresses into 1-byte ones. The other is the ability to check constants for values above 255, to give a warning about accidental out of range values (though unfortunately this makes signed constants unusable, which is why I usually disable this feature with .feature force_range).
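
Roughly what those two features look like in a source file. This is a sketch, not taken from a real project: temp_ptr and big_buffer are made-up names, and the comments describe the behaviour as I understand it from the ca65 documentation:

```ca65
.zeropage
temp_ptr:   .res 2          ; made-up variable, reserved in the zero page segment

.bss
big_buffer: .res 256        ; made-up variable, reserved outside the zero page

.feature force_range        ; out-of-range values are truncated instead of rejected

.code
    lda temp_ptr            ; assembler can use the 1-byte zero page address
    lda big_buffer          ; assembler uses a 2-byte absolute address
    lda #-1                 ; accepted because of force_range, assembles as #$FF
    .byte -3, -2, -1        ; signed table entries are accepted as well
```

Note that force_range switches the range check off completely, so an accidental value like #300 would also be truncated without complaint.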

If you want the feature, though, CA65 is written in C, and open source. That's your other option, of course.
Re: Forcing syntax for address access in CA65
by on (#165242)
rainwarrior wrote:
Assembly doesn't normally have type safety of any form.

Yes, I know. That's the thing: If the language itself doesn't have type safety and everything is just a number, there's even more need to enforce at least the distinction between value and address in the syntax.
This wouldn't really be a distinct feature, just a way to parse the source code.

rainwarrior wrote:
The other is the ability to check constants for values above 255, to give a warning about accidental out of range values (though unfortunately this makes signed constants unusable, which is why I usually disable this feature with .feature force_range).

If you have a signed constant, can't you just use the low byte of it and then it works again?
Code:
LDA #-1 ; Error
LDA #<-1 ; No error


rainwarrior wrote:
If you want the feature, though, CA65 is written in C, and open source. That's your other option, of course.

Yeah, but if I ever publish my code, it won't be usable with the regular download version.
Re: Forcing syntax for address access in CA65
by on (#165262)
DRW wrote:
If you have a signed constant, can't you just use the low byte of it and then it works again?

Yes, but that defeats the idea of range checking anyway, because it also passes values that are > 127.

The more onerous case is when I have a table of signed values, which gets really stupid looking with the < operator:
Code:
.byte -3, -2, -1, 0, 1, 2, 3
; vs
.byte <-3, <-2, <-1, <0, <1, <2, <3
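
One way to get signed range checking back without force_range might be a small macro around .byte. This is only a sketch: sbyte is a hypothetical name, not something ca65 provides, and it handles one value per call:

```ca65
; Hypothetical helper: emits one signed byte and rejects values outside -128..127.
.macro sbyte value
    .assert ((value) >= -128) && ((value) <= 127), error, "signed byte out of range"
    .byte <(value)          ; low byte only, so negative values pass the normal range check
.endmacro

    sbyte -3                ; emits $FD
    sbyte 127               ; emits $7F
    ; sbyte 200             ; would fail the assertion
```

Unlike a bare < operator, this still catches values above 127, at the cost of one macro call per table entry.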


DRW wrote:
rainwarrior wrote:
If you want the feature, though, CA65 is written in C, and open source. That's your other option, of course.

Yeah, but if I ever publish my code, it won't be usable with the regular download version.

What you're proposing is just some kind of error checking for #. If your code is finished, all the #s will be in the right place. The code should still be valid for the normal assembler. (If you did have to add some special control command to annotate your constants, you could search and delete them, but you could do what you want here simply with a capitalization convention, really.)
Re: Forcing syntax for address access in CA65
by on (#165263)
tokumaru wrote:
My constants are all in capital letters too, but so are the labels I assign to memory mapped registers

I'm curious why you would use a different convention for memory mapped registers?
Re: Forcing syntax for address access in CA65
by on (#165265)
Well, memory mapped registers are traditionally all uppercase (e.g. PPUMASK, PPUSTATUS, PPUDATA). The official Atari 2600 manuals also have register names in uppercase (e.g. RESP1, COLUPF, GRP0). So even though I don't use the exact same register names as the NesDev Wiki does, mine are still uppercase.