Namespacing symbols used by an ASM6 library

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
Namespacing symbols used by an ASM6 library
by on (#237394)
A translation unit means a source code file, the source code files it includes, and all source code files thereby recursively included. For example, if "bg.s" includes "nes.inc" and "global.inc", and "global.inc" further includes "enemytypes.inc", the translation unit consists of those four files. In the C programming language and in the ca65 assembler, each translation unit is compiled or assembled to one relocatable object code (.o) file, and once all translation units are compiled, the .o files are linked together into an executable.

In ca65, as in C, a function or variable can be marked as private to a translation unit. In C, a function or global variable is made private if marked static; in ca65, it is made private by not naming the function or variable in an .export statement. Two different translation units can have private functions or variables with the same name, reducing the likelihood of an accidental name collision.

ASM6, by contrast, allows only one translation unit in a program. Every function or variable can see every other function or variable, so long as it isn't defined within the body of a scoping structure (macro or rept).

A representative snippet from an audio driver:
Code:
    lda (conductorPos),y
    sta musicPattern,x
    iny
    lda (conductorPos),y
    sta patternTranspose,x
    iny
    lda (conductorPos),y
    sta noteInstrument,x
    jsr startPattern


If I were to modify it to avoid name collisions by namespacing all private symbols, it would appear as follows:
Code:
    lda (_pently_internal_conductorPos),y
    sta _pently_internal_musicPattern,x
    iny
    lda (_pently_internal_conductorPos),y
    sta _pently_internal_patternTranspose,x
    iny
    lda (_pently_internal_conductorPos),y
    sta _pently_internal_noteInstrument,x
    jsr _pently_internal_startPattern


Likewise, music pattern data would change from this, where each measure occupies one line of code:
Code:
PPDAT_bf98_melodyA:
.byte REST|D_D8,N_D|D_8,REST,N_F|D_8,REST
.byte N_DS|D_8,N_F,N_DS|D_8,REST,N_C|D_8,REST
.byte N_D|D_8,N_DS,N_F|D_8,REST,N_AS|D_8,REST
.byte GRACE,5,N_GS,N_AS|D_D8,N_GS|D_4,N_TIE,REST
.byte PATEND


to this:
Code:
PPDAT_bf98_melodyA:
.byte PENTLY_PAT_REST|PENTLY_PAT_D_D8,PENTLY_PAT_N_D|PENTLY_PAT_D_8,PENTLY_PAT_REST,PENTLY_PAT_N_F|PENTLY_PAT_D_8,PENTLY_PAT_REST
.byte PENTLY_PAT_N_DS|PENTLY_PAT_D_8,PENTLY_PAT_N_F,PENTLY_PAT_N_DS|PENTLY_PAT_D_8,PENTLY_PAT_REST,PENTLY_PAT_N_C|PENTLY_PAT_D_8,PENTLY_PAT_REST
.byte PENTLY_PAT_N_D|PENTLY_PAT_D_8,PENTLY_PAT_N_DS,PENTLY_PAT_N_F|PENTLY_PAT_D_8,PENTLY_PAT_REST,PENTLY_PAT_N_AS|PENTLY_PAT_D_8,PENTLY_PAT_REST
.byte PENTLY_PAT_GRACE,5,PENTLY_PAT_N_GS,PENTLY_PAT_N_AS|PENTLY_PAT_D_D8,PENTLY_PAT_N_GS|PENTLY_PAT_D_4,PENTLY_PAT_N_TIE,PENTLY_PAT_REST
.byte PENTLY_PAT_PATEND


Is the latter in each pair still readable? Is this sort of insistent namespacing considered a good practice in ASM6 libraries?

I admit that in the specific case of music pattern data, the data is most commonly created with pentlyas.py, which compiles the score in an MML-like language to a bytecode score. But I can envision a situation in which the composer chooses not to run pentlyas.py (because of past "coiler" frustration with Python interpreter version incompatibility). In this case, the composer will be manually editing this bytecode in the file.
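
The renaming shown in the pattern data above is mechanical, so it can be sketched as a small script. This is only an illustration, assuming the prefix `PENTLY_PAT_` and a hand-written symbol list taken from the example measure; it is not how pentlyas.py works, and `prefix_pattern_line` is a hypothetical helper name.

```python
import re

# Hypothetical prefix and symbol list, taken from the example above;
# the real driver's full set of pattern constants is larger.
PREFIX = "PENTLY_PAT_"
PATTERN_SYMBOLS = {
    "REST", "GRACE", "PATEND", "N_TIE",
    "N_C", "N_D", "N_DS", "N_F", "N_GS", "N_AS",
    "D_8", "D_D8", "D_4",
}

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def prefix_pattern_line(line):
    """Prefix known pattern constants on a .byte line; leave all else alone."""
    if not line.lstrip().lower().startswith(".byte"):
        return line  # labels and other directives pass through unchanged

    def rename(match):
        word = match.group(0)
        return PREFIX + word if word in PATTERN_SYMBOLS else word

    # Whole identifiers are matched, so an already prefixed name such as
    # PENTLY_PAT_REST is not in the set and is left as it is.
    return IDENTIFIER.sub(rename, line)
```

Because only whole identifiers are replaced, running it twice gives the same result as running it once. A trailing comment that happens to contain one of the listed words would be renamed too, which a real tool would need to guard against.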
Re: Namespacing symbols used by an ASM6 library
by on (#237396)
Yeah that makes it illegible (mostly for the music data constants)

They all look very unlikely to collide. In case they do, maybe just add a lowercase p:

pConductorPos
pMusicPattern

or, where it would be more needed if a constant is colliding:

pREST
Re: Namespacing symbols used by an ASM6 library
by on (#237399)
As I answered on Discord: the latter code block example is 100% acceptable and readable. Anyone who feels otherwise has obviously not spent time in kernel land where large data structs are incredibly hard to read (data intermixed with comments intermixed with all sorts of symbols for bitwise operators) or with source code/data structures written (some generated using other programs) in olden days. This is purely a "new-age programmer whinging" type of thing, and I say that with respect (I can't be bothered to think of a better way to phrase it right now).

If you want to clean it up to be more "aesthetically pleasing", then my recommendations are the same as what I said on Discord:

1. Stop using , between bytes, and instead put each on its own line (and yes you explained why you like it on one line), and then "paragraph-ise" sections
2. If you absolutely must keep it on one line, try adding spaces after the commas, and around bitwise | operators
3. If possible (I don't think so in this case), try using fixed-size formatting (ex. printf %-30s) on each piece of data to "column-ise" them, with commas between

I would strongly suggest #1, despite you not liking it. More lines is not necessarily bad, especially considering this data is being generated on the fly at assemble-time. Be practical please! Don't make me try to find old source code to real-world programs for this architecture from days of yore and show you exactly how "cryptic" stuff can be in comparison.

If you were to combine some of the above ideas, you end up with this:

Code:
PPDAT_bf98_melodyA:
.byte PENTLY_PAT_REST | PENTLY_PAT_D_D8
.byte PENTLY_PAT_N_D | PENTLY_PAT_D_8
.byte PENTLY_PAT_REST
.byte PENTLY_PAT_N_F | PENTLY_PAT_D_8
.byte PENTLY_PAT_REST

.byte PENTLY_PAT_N_DS | PENTLY_PAT_D_8
.byte PENTLY_PAT_N_F
.byte PENTLY_PAT_N_DS | PENTLY_PAT_D_8
.byte PENTLY_PAT_REST
.byte PENTLY_PAT_N_C | PENTLY_PAT_D_8
.byte PENTLY_PAT_REST

.byte PENTLY_PAT_N_D | PENTLY_PAT_D_8
.byte PENTLY_PAT_N_DS
.byte PENTLY_PAT_N_F | PENTLY_PAT_D_8
.byte PENTLY_PAT_REST
.byte PENTLY_PAT_N_AS | PENTLY_PAT_D_8
.byte PENTLY_PAT_REST

.byte PENTLY_PAT_GRACE
.byte 5                         ; maybe this was a typo in your example?  Not sure
.byte PENTLY_PAT_N_GS
.byte PENTLY_PAT_N_AS | PENTLY_PAT_D_D8
.byte PENTLY_PAT_N_GS | PENTLY_PAT_D_4
.byte PENTLY_PAT_N_TIE
.byte PENTLY_PAT_REST

.byte PENTLY_PAT_PATEND
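
Since this data is normally generated, the one-value-per-line layout above could be produced by the generator rather than typed by hand. A minimal sketch, assuming each input line holds one measure, contains no comments, and uses only `,` and `|` as separators; `split_byte_line` and `reformat_pattern` are hypothetical names, not part of any existing tool.

```python
def split_byte_line(line):
    """Turn one '.byte a|b,c' line into one '.byte' line per value."""
    parts = line.strip().split(None, 1)
    if len(parts) != 2 or parts[0].lower() != ".byte":
        return [line]  # labels and other lines pass through unchanged
    out = []
    for operand in parts[1].split(","):
        # Put spaces around each bitwise OR for readability.
        terms = [term.strip() for term in operand.split("|")]
        out.append(".byte " + " | ".join(terms))
    return out


def reformat_pattern(lines):
    """One value per line, with a blank line after each multi-value measure."""
    out = []
    for line in lines:
        pieces = split_byte_line(line)
        out.extend(pieces)
        if len(pieces) > 1:
            out.append("")  # "paragraph" break between measures
    return out
```

A single-value line such as the pattern terminator gets no blank line after it, so the output ends the same way as the listing above.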

I do not agree with the p prefix. While I strongly agree with prefixing (we talked about this in detail on Discord), I think p is a very bad shorthand proposal: it tells the programmer/debugger/troubleshooter nothing. Also, I will point out that p as a prefix is an *incredibly* common letter indicating "pointer". Do not do this please. Pick something else. PENTLY_PAT_ to me is perfectly fine.
Re: Namespacing symbols used by an ASM6 library
by on (#237403)
Quote:
Anyone who feels otherwise has obviously not spent time in kernel land


I'll just note here that while the asm6 rewrite could be used by anyone, the target audience is mostly NESmaker users. So yes, they most certainly have not. So is it a good idea to assume the use case of people who have been in the game for decades and are likely using ca65 or some commercial assembler?


the most common shorthand for pointer ought to be ptr. i see it everywhere. but if you've seen p being used frequently, i'll take your word for it.

Not saying the p prefix is the best idea or even a good one, just that something like it would be short and sweet.

The more letters in a const or var (or label) the more likely you'll typo it or forget some uppercase. And as we know since recently, unless it qualifies as a more explicit error like a missing argument or the like, asm6 will treat typos as labels and go along happily.
Re: Namespacing symbols used by an ASM6 library
by on (#237405)
I guess it depends on whether programmers and composers are willing to use Python. Someone who uses Python never has to actually see any of the prefixes. Perhaps I overestimated the number of people who are unwilling to download, install, and use Python, or I overestimated the remaining interpreter version incompatibility problems now that Python 2 is nearing its official end of life and Python 3 ships with pip for installing extensions. Ideally, people who cannot or will not use Python should be a tiny minority. Is this the case?
Re: Namespacing symbols used by an ASM6 library
by on (#237406)
A bit of an indication on that.. my impression from the comments on the nesmaker forums is that it looks like ggsounds' own py converter began seeing wide use only after it was packed in a .exe by someone. Before, they used the built-in converter of NESmaker despite it having some serious bugs you needed to read up on and avoid.

btw i do like the more rows over more columns approach koitsu showed. It is easier to read that way if constants must have long names.

When looking at data, i mostly want it to be evenly formatted. if consts can't be kept the same size or they mess up my columns, i resort to line breaks. so for example, an OAM entry might look like this:

.byte ypos, tileID, palette | OTHER_ATTRIBUTE
.byte xpos

while the rows are sawtoothed, all the relevant data lines up in orderly columns and the one place where i might OR things end up last, which also helps it stand out. It makes it easy to know what a magic number represents regardless if attributes were OR:ed, and you can immediately tell where those attributes are when quickly scanning through with the scrollbar.

edit: fixed the example.
Re: Namespacing symbols used by an ASM6 library
by on (#237407)
FrankenGraphics wrote:
Quote:
Anyone who feels otherwise has obviously not spent time in kernel land

I'll just note here that while the asm6 rewrite could be used by anyone, the target audience is mostly NESmaker users. So yes, they most certainly have not. So is it a good idea to assume the use case of people who have been in the game for decades and are likely using ca65 or some commercial assembler?

My comment was specifically with regards to people whinging over the "readability of these lines of code". If the above .byte lines are "hard to read" then the viewer seems to not know what's worse. In contrast, those lines are quite readable. It just goes to show that age and experience play a huge role in interpretation of what's considered "unreadable".

FrankenGraphics wrote:
the most common shorthand for pointer ought to be ptr. i see it everywhere. but if you've seen p being used frequently, i'll take your word for it.

What you want (re: "ought to be") isn't what history has shown. Where do I begin? A good start would be to look at literally anything Microsoft writes especially for Windows. They adhere almost exclusively to what's called Hungarian notation. Let me know if you need further evidence (ex. source examples from other projects, e.g. Linux, FreeBSD, etc) as I can drop tons of it fairly quickly. Likewise, s prefix (sometimes str) is often used to indicate string, and a _t suffix is common for typedefs (here's why).

FrankenGraphics wrote:
Not saying the p prefix is the best idea or even a good one, just that something like it would be short and sweet.

Okay let me be very clear here: "short and sweet" is not always the best choice for naming conventions. What's really needed here is a clear way to convey to the person looking at the code (which could be either a person trying to improve Pently, or -- and I suspect this is much more likely -- someone trying to troubleshoot something) that "this thing relates to Pently". You cannot do that with a single character. Or even two or three.

Furthermore, you are not gaining anything "technically" by using strict shorthand convention like this. This is an assembled program, not an interpreted or scripting language that's spending excessively more CPU time the longer the label/string/thing. This has been talked about extensively over the years.

That said, I am an equally strong opponent of excessivelyStupidFunctionNamesThatAlsoUseCamelCasingAndAreLongAsHell. This is just as bad as the former. People who do this seem to be driven by fear of using comments, and attempt to convey the entire purpose of the world in the function name alone. I don't know how this became a strangely commonplace thing (is it education? Lack of? Or maybe just a lack of someone telling them why this is bad), but it's terrible.

But there's another reason, often overlooked, to use appropriately-descriptive names for labels/variables/things: the likelihood that an author won't understand or remember the code in the future. I cannot stress the importance of this one enough. This is not me projecting personal experience, it's a very real situation. Just because those variables and the "why" behind a function/thing make sense today doesn't mean you'll remember them a year from now. Naming things "sanely" is a benefit not only for others (for collaborative projects), but also for yourself.

Thus, PENTLY_PAT as a prefix seems pretty damn good to me.

Else I direct you to https://thedailywtf.com/ or https://www.reddit.com/r/badcode/ for further programming horror stories.

FrankenGraphics wrote:
The more letters in a const or var (or label) the more likely you'll typo it or forget some uppercase. And as we know since recently, unless it qualifies as a more explicit error like a missing argument or the like, asm6 will treat typos as labels and go along happily.

By that line of logic, you'd love brainfuck. And ironically, Loopy -- author of asm6 -- has a well-established history of using pedantically short names for his variables (often 1 or 2 letters) -- oh, and p for pointers too (example) -- not to mention his "skinny on NES scrolling" which, while it revolutionised accurate PPU emulation, was -- and still is -- one of the most indecipherable documents to date. There's a reason this Wiki page is so long in contrast, and why there's a colourised summary: because it's really damn hard to understand with single-letter things.

I had to deal with this exact same kind of nonsense in the webshit world with the TomatoUSB project -- that's a consumer router firmware, by the way -- whose author decided that exact approach in JavaScript is perfectly reasonable; parts of the C code are just as bad. Go ahead and skim that code and let me know your thoughts on its legibility.

TL;DR -- "Fewer characters means fewer typos" is not inherently true; use sanely-named variables/labels that are neither excessively long nor too short; write legible code on the assumption that others will need to look at it *and so will you in the future*; and programming in general is terrible.
Re: Namespacing symbols used by an ASM6 library
by on (#237410)
Well i believe you, it's just that i don't feel that how for example microsoft does things is relevant. Completely different context, there's no overlap. Most source i've looked at for nes homebrew (loopy's examples aside, apparently - guess i didn't pay attention), which is what matters to me, has used the convention of ptr.

i'm not advocating for using 1 or 2 letter vars either. I don't think the idea has even crossed my mind, and i find it unlikely that people would want to use it in their own assembly code. Handling the bean counting through the AXY registers reduces the need for names like "i", though that one is so common i wouldn't blink if i saw it. My case: often-reused housekeeping stuff is named like ptr, tempY, timer. Where i could have used py and px, i use ypos and xpos. More specific things might be playerHasUpgrades. I call the base of a playthrough-persistent general purpose memory block elephantMemory, which might not make sense to anyone else, but oh well. i'm just proposing that a simple mnemonic prefix for pently might look more tidy and readable. You'd understand it directly from it being found consistently on each constant only in pently, and not in the nesmaker source or your own. It's just like you don't need to know the exact meaning of .jpg to understand that all .jpg:s are the same type of file. We don't need a full readout here, it just adds clutter to the information we actually want to read.

Quote:
It just goes to show that age and experience play a huge role in interpretation of what's considered "unreadable".

Sure, but why raise the entry threshold for those who need it the most and are most likely to be using it in numbers? There are more active nesmaker/asm6 users than active non-nesmaker/asm6 users. Many of them are modifying or writing code for the first time. Others prefer to browse scripts uploaded to the forum or facebook group.

But all this might not matter all that much. Many of those might not even open a converted song project for examination, let alone hand typing raw song data in this format.

Another thing to consider, though, is accommodating common functional variations. Some need visual clarity to concentrate. Others need relevant information right in front of them to be able to visualize. Yet others have dyslexia, in which case redundant verbosity does no good.

Quote:
excessivelyStupidFunctionNamesThatAlsoUseCamelCasingAndAreLongAsHell. This is just as bad as the former. People who do this seem to be driven by fear of using comments

Yes. Whenever i regret a label name or whatever, the regret is usually "i should put that in the comments instead".
Re: Namespacing symbols used by an ASM6 library
by on (#237413)
I think a shorter prefix would be fine, and a lot less onerous than PENTLY_. The only need here is to avoid using an identifier name that someone would normally want to use. That criterion can be met with relatively few characters. It doesn't have to be foolproof, just enough to make it an easy-to-remember convention for anyone who wants to use it.

Just keep the scope of this in mind. This isn't zlib. It doesn't need an industrial strength prefix, it just needs not to be inconvenient.

Even just _ as a prefix might be sufficient for this purpose (though I wouldn't suggest that if you were applying it to the ca65 version as well, for a reason that should be obvious to any cc65 user). A _ suffix convention could also work. As long as your identifiers aren't super generic beyond the prefix/suffix (e.g. i_) the likelihood that you get an inconvenient conflict is probably pretty low?


The other thing you might consider is not dumping binary music data as named constants by default. Instead you could just use hexadecimal unless the user requests it?

That kind of thing is useful if you want to debug the data output, or maybe construct some data by hand, but for normal use it's auto-generated data feeding into the assembler without a human needing to be involved.

...or just allow it to be obtuse. I don't think it really matters if this is human readable or not for the majority case. Debugging music data is a specialized need. Part of the reason someone would want to use your engine is so they don't have to debug the music.

Another option, unsure if ASM6 has this or not, might be to define those constants only temporarily for the duration you need them, then undefine them once the end of the music area of the code is reached?
Re: Namespacing symbols used by an ASM6 library
by on (#237423)
Since my ASM6 projects are rarely that big, I tend not to use any kind of scope prefix if the variables or constants are unique enough. Why would I prefix a musical note constant if I know there really is no reason to name anything else in the project a "D_2" or anything similar. So my custom music drivers are perfectly ok to use in their raw "type the bytecode in" format, at least for me. If I had data where the elements are all word-sized, that would encourage me more to try to come up with a converter, but as things stand, programming a text output parser/converter would take more time than it takes for me to manually "convert" my songs + I have features added that are simply not part of Famitracker or any other similar tool, so many effects are best configured by hand anyway. So overall, a single note in my music data never looks more complicated than this (OR the constant with the upper 3 bits for duration, with the constant specifying the note in the low 5 bits, relative to the currently set "octave"):
Code:
.db row1|C_1
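
The byte layout described above (duration in the upper 3 bits, note in the low 5 bits) can be worked through numerically. This is a sketch of the arithmetic only; the index values passed in are hypothetical, since the thread does not give the actual values of constants such as row1 or C_1.

```python
DURATION_SHIFT = 5   # duration occupies bits 7-5
NOTE_MASK = 0x1F     # note occupies bits 4-0


def pack_note(duration_index, note_index):
    """Combine a 3-bit duration and a 5-bit note into one byte."""
    if not 0 <= duration_index < 8:
        raise ValueError("duration must fit in 3 bits")
    if not 0 <= note_index < 32:
        raise ValueError("note must fit in 5 bits")
    return (duration_index << DURATION_SHIFT) | note_index


def unpack_note(byte):
    """Split a packed byte back into (duration_index, note_index)."""
    return byte >> DURATION_SHIFT, byte & NOTE_MASK
```

In the assembly source the duration constants would already be shifted into place, which is why a plain `|` between two names is enough there.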


There are quite a few effects, and I tried to follow the 6502 mnemonic convention of three letters only. So you have effects intermixed with the notes, but I'm free to start new lines when I feel it's better for the overall readability. It helps the most when there is at least a word-sized effect argument (common with jumps and loops).

Eg.:
LP0 xx yy yy (to LP3) - Loop the channel back xx times to address yyyy (4 levels of nesting possible)
BR0 xx xx (to BR3) - when encountered, and the corresponding loop counter is 0, set the channel pointer to xxxx
EC0 - Echo note. The next note's volume will be shifted right once.
EC1 - Echo note 2. Same as EC0, but shifts the volume right twice.
INO (or DEO) - Increment (or decrement) the octave variable of the channel.
JPC xx yy yy - Channel-specific jump. The jump only happens if a particular channel (xx) is being processed. Good for reusing parts of other data for echo effects, and then returning to the channel's own data at address yyyy.
... There are tons more, like auto duty cycle increment per note, auto duty cycle increment per frame (gives you that Bad Dudes-type trill), trigger sound effect, and other conditionally executed music effects.

ASM6 has enough "scope levels" where I can feel very comfortable with the header of the song data having a normal label, and then all channels, and then all the sections for each channel using local labels with a convention of @PU1_0, @PU1_1, @PU1_2, ... same with PU2, TRI and NOI.
Re: Namespacing symbols used by an ASM6 library
by on (#237425)
Sometimes just to be sure, i put a label at the end of a routine or block to make sure the @ scope is properly terminated. Might matter sometimes when reusing @loop and @skip a lot, but probably never elsewhere.
Re: Namespacing symbols used by an ASM6 library
by on (#237426)
I would use a short prefix. pt, ptl, or even ptly.
Re: Namespacing symbols used by an ASM6 library
by on (#237470)
I always use long prefixes in my code. Programs are written once but read many times, so readability trumps any and all savings in typing.

And longer names giving more chance of typos is a good thing, because they're less likely to accidentally collide with another valid identifier and silently produce broken output.
Re: Namespacing symbols used by an ASM6 library
by on (#237477)
I definitely get that in a general situation. I think my slight* discomfort here is that the namespacing doesn't add any information a human would want to read. We already know it's pently because we just opened the pently source files, or baked the song data file. It's just a seatbelt meant for asm6 and gets a bit in the way for human inspection.

*most users won't need to read it.
Re: Namespacing symbols used by an ASM6 library
by on (#237519)
That still sounds like an argument for long prefixes to me. If asm6 lacks any other means of namespace collision avoidance then the prefix should be as long (and consistent, in case module names collide and one needs to be renamed) as sensibly possible, to make such collisions as unlikely as possible.

EDIT: I will admit that prefixes within a module aren't as important, since you generally have full control over the whole set. So PENTLY_PAT_* could easily be reduced to PENTLY_*, as long as tepples could arrange for there to be no collisions or confusion with other constants used within Pently.
Re: Namespacing symbols used by an ASM6 library
by on (#237522)
I think just PENTLY_ would work nicely for anyone, especially if using koitsu's line breaks and groupings.
Re: Namespacing symbols used by an ASM6 library
by on (#240133)
za909 wrote:
Since my ASM6 projects are rarely that big

I'm pretty sure users of NESmaker expect to use ASM6 for projects as big as Mega Man 4 and Dragon Warrior IV. I assume this based on the fact that ASM6 is the required assembler for NESmaker, and NESmaker targets a 4 Mbit board.

za909 wrote:
I tend not to use any kind of scope prefix if the variables or constants are unique enough. Why would I prefix a musical note constant if I know there really is no reason to name anything else in the project a "D_2" or anything similar.

Different libraries by different authors that are included in an ASM6 project much larger than your own could define D_2 to mean completely unrelated things.

za909 wrote:
as things stand, programming a text output parser/converter would take more time than it takes for me to manually "convert" my songs + I have features added that are simply not part of Famitracker or any other similar tool, so many effects are best configured by hand anyway.

I tried this at first with Pently. But then I ran into a project where the composer that my producer hired insisted on the FamiTracker module being the authoritative version (in data modeling lingo, the single source of truth), which a game's build process converts to the format used by the driver and automatically re-converts whenever the composer changes the module. And then I ran into another project where the producer insisted on the FamiTracker module being the authoritative version in order to be able to switch to a different composer mid-project. Because I was a programmer on the former project and a candidate for composer on the latter, not the producer, I lacked the authority to make the Pently score the authoritative version.

za909 wrote:
So overall, a single note in my music data never looks more complicated than this (OR the constant with the upper 3 bits for duration, with the constant specifying the note in the low 5 bits, relative to the currently set "octave")

A 5:3 bit split between pitch and duration appears common in these libraries.

za909 wrote:
ASM6 has enough "scope levels" where I can feel very comfortable with the header of the song data having a normal label, and then all channels, and then all the sections for each channel using local labels with a convention of @PU1_0, @PU1_1, @PU1_2, ... same with PU2, TRI and NOI.

My converter already namespaces the IDs and definition labels of sound effects, instruments, drums, patterns, and songs. Namespacing internal labels, internal constants, and internal macros, such as your LP0 and JPC, is where I got hung up. It'll take a while to actually build, and I'm not yet being paid to work on Pently, whereas I am being paid to work on other projects.
Re: Namespacing symbols used by an ASM6 library
by on (#240735)
Prefixing song data constants and macros is filed as #42.
Re: Namespacing symbols used by an ASM6 library
by on (#241079)
So I've spent a couple days on prefixing driver-internal function and variable names and prefixing macros and constants in song data. I'd like to verify that all unprefixed symbols are hidden by a rept 1 scope block. I tried having ASM6F write a Mesen label file, but I didn't see a way to exclude rept-scoped labels.

Is there a way to determine what symbols are visible at the top level after having assembled a file in ASM6?
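
Short of an assembler feature for this, one rough approximation is to scan the source text itself. The sketch below is a heuristic under stated assumptions, not something ASM6 or ASM6F provides: it relies on the scoping behaviour described in this thread (labels inside rept and macro bodies are hidden), only recognises `name:`, `name = value`, `name equ value`, and `macro name`, and does not follow included files. `top_level_symbols` is a hypothetical helper name.

```python
import re

OPENERS = {"rept", "macro"}   # directives that start a scoping block
CLOSERS = {"endr", "endm"}    # directives that end one

LABEL = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*):")
ASSIGN = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*(?:=|\bequ\b)",
                    re.IGNORECASE)


def top_level_symbols(source):
    """List symbols defined outside any rept/macro block, in source order."""
    depth = 0
    found = []
    for raw in source.splitlines():
        line = raw.split(";", 1)[0]          # drop trailing comment
        words = line.split()
        first = words[0].lstrip(".").lower() if words else ""
        if first in OPENERS:
            # A macro's own name is visible in the scope that defines it.
            if first == "macro" and depth == 0 and len(words) > 1:
                found.append(words[1])
            depth += 1
            continue
        if first in CLOSERS:
            depth = max(depth - 1, 0)
            continue
        if depth:
            continue                         # inside a scoping block
        match = LABEL.match(line) or ASSIGN.match(line)
        if match:                            # @-local labels never match
            found.append(match.group(1))
    return found
```

Any name in the result that lacks the library prefix is a candidate for a collision. Labels written without a colon, or a label sharing a line with `rept`, are missed by this sketch, so it can only supplement a check done by the assembler itself.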