Programming in C - questions

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
Programming in C - questions
by on (#216876)
1. The structure.member or structure->pointer?
Code:
typedef struct {
 int value1, value2;
} STRU_T;

//this
static STRU_T mystru;

//OR...

static STRU_T tmp;
static STRU_T *mystru = &tmp;

2. memcpy() or memset() VS "manual" set to zero.
Code:
memset(buffer,0,8);

//OR...

unsigned char *buf = buffer;
*buf = 0; buf++; //1
*buf = 0; buf++; //2
*buf = 0; buf++; //3
*buf = 0; buf++; //4
*buf = 0; buf++; //5
*buf = 0; buf++; //6
*buf = 0; buf++; //7
*buf = 0; //8

3. Buffer I/O
Code:
unsigned char *buffer = malloc(0x100);
...
buffer[index] = data;

//OR...

*(buffer+index) = data;
Re: Programming in C - questions
by on (#216877)
For maintainability, never unroll your loops manually. Modern compilers and modern CPUs will know better than you whether it's beneficial. (memset, in particular, is often already a specially optimized function. You're not going to be able to write something faster, especially not in C, and probably not even in asm.)

For numbers 1 and 3, "it depends".
Re: Programming in C - questions
by on (#216878)
Zepper wrote:
Code:
//this
static STRU_T mystru;

//OR...

static STRU_T tmp;
static STRU_T *mystru = &tmp;

Definitely the first one. Why would you want to use indirect access when you can access the variable directly? Direct access is smaller and faster (at least for global variables. Local variables are internally accessed via stack pointer anyway, but even here there's no reason to add yet another pointer.)

Pointers only make sense here if the pointer is the parameter variable of a function:
Code:
void SomeFunction(MyStruct *pStruct);


Zepper wrote:
2. memcpy() or memset() VS "manual" set to zero.

memset if you want to set all bytes to the same value.

Take care if your array is of type int (or any other non-byte type) and you want to initialize with anything non-zero: The value that you set refers to every single byte, not every array item:
Code:
int array[5];
memset(array, 1, 5 * sizeof(int));

This will not set all five values to 1. It will set all separate bytes to 1.
So, if the int consists of two bytes, then each array item will now have the value 257 (0x0101).
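A minimal sketch of the byte-wise behaviour described above. The helper names (repeat_byte, first_after_memset, check_memset_is_bytewise) are made up for illustration, and unsigned int is used instead of int only to keep the bit-pattern comparison simple. The expected value is built from sizeof, so it holds for 2-byte and 4-byte ints alike.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Value of an unsigned int whose every byte equals `byte`
   (0x0101 for 2-byte ints, 0x01010101 for 4-byte ints when byte is 1). */
unsigned int repeat_byte(unsigned char byte)
{
    unsigned int value = 0;
    size_t i;

    for (i = 0; i < sizeof value; ++i)
        value |= (unsigned int)byte << (i * CHAR_BIT);
    return value;
}

/* Fills a small array with memset and returns its first element. */
unsigned int first_after_memset(unsigned char byte)
{
    unsigned int array[5];

    memset(array, byte, sizeof array);
    return array[0];
}

/* The point of the post as checks: the fill is per byte, not per item. */
void check_memset_is_bytewise(void)
{
    assert(first_after_memset(0) == 0u);             /* zero works as expected */
    assert(first_after_memset(1) == repeat_byte(1)); /* every byte is 1 */
    assert(first_after_memset(1) != 1u);             /* the item itself is not 1 */
}
```

Calling check_memset_is_bytewise() from main passes silently on typical platforms where an int is wider than one byte.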

memcpy if you want to copy data from another array.
(Using memcpy to initialize with 0 or any other single byte value makes no sense. Where do you take the source data from? Do you have some constant array with all zeroes in it that's only there to initialize other arrays with 0 by using it with memcpy?)

And for everything else (for example if you really have an int array that you want to initialize with a certain value per entry and not per byte), you should use a for loop and not initialize every entry individually:
Code:
int i; /* Use unsigned char instead of int if you program for the NES
          and your array has at most 255 entries. */
for (i = 0; i < 5; ++i)
    array[i] = 1;

But even if you do initialize each item individually (because each item gets another value that cannot be calculated somehow), the best method is
Code:
array[0] = 25;
array[1] = 7;
array[2] = 130;

etc. and not
Code:
unsigned char *pointer = array;
*pointer = 25;
++pointer;
*pointer = 7;
++pointer;
*pointer = 130;
++pointer;


An additional suggestion (at least for NES programming. It isn't necessary in PC programs): If your arrays are never larger than 255 bytes, write your own CopyData and FillData function in Assembly where the size parameter is of type byte instead of size_t. Makes the function smaller and faster than the official memcpy and memset functions.
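To show the interface this suggestion implies, here is a rough C sketch. The post recommends writing these in assembly, so this is only an illustration of the byte-sized count parameter, not the poster's code; the parameter names and order are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C version of the FillData idea: the count is a single
   byte, so at most 255 bytes can be filled per call. */
void FillData(unsigned char *dst, unsigned char value, unsigned char count)
{
    assert(dst != NULL || count == 0);
    while (count != 0) {
        *dst++ = value;
        --count;
    }
}

/* Hypothetical C version of CopyData with the same byte-sized count.
   The two ranges must not overlap. */
void CopyData(unsigned char *dst, const unsigned char *src, unsigned char count)
{
    assert((dst != NULL && src != NULL) || count == 0);
    while (count != 0) {
        *dst++ = *src++;
        --count;
    }
}
```

On a 6502 the whole count fits in one index register, which is where the size and speed gain over a size_t-based memcpy/memset would come from.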

Zepper wrote:
Code:
buffer[index] = data;

//OR...

*(buffer+index) = data;

The compiler shouldn't make a difference here, but the typical style in C is to access array entries with the brackets, not by adding the index to the pointer address.
At first glance, the latter part looks like you're doing some hacky pointer tricks while the brackets immediately show that you simply want to access an array item in the regular way.
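The two spellings are interchangeable because C defines buffer[index] as *(buffer + index). A small sketch that checks this at run time; the function name same_element is made up for illustration, and the 0x100 size is taken from the original question.

```c
#include <assert.h>
#include <stdlib.h>

/* Stores `data` with the bracket form and reads it back with the pointer
   form. Returns 1 if both forms address the same byte, 0 if not, and
   -1 if the allocation failed. */
int same_element(size_t index, unsigned char data)
{
    unsigned char *buffer;
    int same;

    assert(index < 0x100);
    buffer = malloc(0x100);
    if (buffer == NULL)
        return -1;

    buffer[index] = data;
    same = (*(buffer + index) == data) && (&buffer[index] == buffer + index);
    free(buffer);
    return same;
}
```

Since both forms mean the same thing to the compiler, the choice is purely about readability.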
Re: Programming in C - questions
by on (#216879)
Are you programming an NES game in C, or something else?
Re: Programming in C - questions
by on (#216880)
Devkitarm has shipped with suboptimal memset and memcpy before, so I had to rewrite them in asm.
Re: Programming in C - questions
by on (#216883)
Zepper wrote:
2. memcpy() or memset() VS "manual" set to zero.

In most modern compilers memset() is an intrinsic rather than an actual library function. Where the sizes of data are known, it will optimize to whatever equivalent is small or fast, appropriately, rather than use a function call.

In general you'll find that with optimizing compilers it makes little difference how you initialize when the initializers are relatively simple, so you should just do the one that feels best for you to write, because the only differences are usually cosmetic.

However, if the performance of an initializer is critical for your application there is no general advice anyone can give for how to write it the "best" way. Every compiler has different foibles. You will have to check the assembly output yourself. If it's not critical, don't worry because it's going to do a very optimal thing most of the time.

If you're asking for advice regarding a specific compiler (e.g. cc65), then there could be a more definitive answer to this.


Also, any incomplete initializer for an array or struct will be padded out with zeroes, so this also conveniently initializes buffer to 0:
Code:
unsigned char buffer[5] = {0};
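A short sketch of the zero-padding rule for incomplete initializers. The function names and the sample values are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if all n bytes starting at p are zero. */
int all_zero(const unsigned char *p, size_t n)
{
    size_t i;

    assert(p != NULL);
    for (i = 0; i < n; ++i)
        if (p[i] != 0)
            return 0;
    return 1;
}

/* One explicit zero; the remaining four bytes are padded with zeroes. */
int zero_initializer_works(void)
{
    unsigned char buffer[5] = {0};

    return all_zero(buffer, sizeof buffer);
}

/* Two explicit values; the remaining two items are padded with zeroes. */
int partial_initializer_pads(void)
{
    int values[4] = {25, 7};

    return values[0] == 25 && values[1] == 7 &&
           values[2] == 0 && values[3] == 0;
}
```

Both functions return 1. Note that this only works at the point of declaration; an existing array still needs memset or a loop.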
Re: Programming in C - questions
by on (#216885)
My responses are with regards to general use on x64/x86 architecture today, with gcc or clang. If this is for NES or some other architecture, I have no experience with cc65's compiler.

1. It depends. Neither is better or worse than the other. If you don't need the additional indirect addressing (2nd example), then don't use it. If you do plan on changing mystru to point to a different allocated STRU_T somewhere, then the 2nd example is a better choice. The 2nd example is also necessary if you need to dynamically allocate structures using malloc/calloc and then reference them (example: a linked list of structs).

2. You didn't declare buffer. I'm going to assume you either were using char buffer[8]; or char *buffer; buffer = malloc(8);.

In either case: memset is recommended. For small-ish buffers, however, allocating them on the stack is fine. For large-ish ones, always allocate dynamically, and use calloc() if you know definitively the contents need to be zeroed.

If you're declaring it as an array rather than allocating it on the heap, you could just do char buffer[8] = {0}; to have it zero-initialized (a static or global array is pre-zeroed by the linker/loader; for a local array the compiler emits the zeroing code).

3. It depends, but going entirely off the example you gave, I would prefer the former. This question distantly reminds me of char *argv[] vs. char **argv; I always prefer the former.
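For point 2 above, a minimal sketch of the two ways to get a zeroed heap buffer. The function names are made up for illustration, and callers are expected to free() the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* calloc allocates and zeroes in one call. */
unsigned char *zeroed_with_calloc(size_t n)
{
    return calloc(n, 1);
}

/* malloc leaves the contents indeterminate, so zero them explicitly. */
unsigned char *zeroed_with_memset(size_t n)
{
    unsigned char *p = malloc(n);

    if (p != NULL)
        memset(p, 0, n);
    return p;
}

/* Returns 1 if all n bytes starting at p are zero. */
int is_all_zero(const unsigned char *p, size_t n)
{
    size_t i;

    assert(p != NULL);
    for (i = 0; i < n; ++i)
        if (p[i] != 0)
            return 0;
    return 1;
}
```

Both functions hand back a buffer for which is_all_zero returns 1; calloc just saves the separate memset call.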

If you're using clang/LLVM, and you're really concerned about performance (cycle counts) or size (number of instructions), then I strongly suggest compiling with -S (plus -mllvm --x86-asm-syntax=intel if you prefer Intel syntax; I do) and then looking at the resulting .s files. You might be very surprised by the difference things like -O vs. -O2 vs. -O3 vs. -Os make.

The reason I mention this: last month I saw the FreeBSD project commit this. My reaction: "why are they using that temporary variable t? They can do this without a temporary variable: just use the XOR swap method!" So I replaced their version with mine: p[0] ^= p[1]; p[1] ^= p[0]; p[0] ^= p[1];. The result was slower and longer than using a temporary variable. -O2 greatly helped but it was still longer. My point is that compilers sometimes generate code that a human writing assembly wouldn't. I generally trust compilers to make optimal decisions these days, but there are always cases where humans do better.
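For reference, the two swap variants side by side. The FreeBSD code itself isn't shown in the thread, so the function names and the unsigned char element type here are assumptions; only the three XOR statements come from the post.

```c
#include <assert.h>
#include <stddef.h>

/* Swap through a temporary variable. */
void swap_tmp(unsigned char *a, unsigned char *b)
{
    unsigned char t;

    assert(a != NULL && b != NULL);
    t = *a;
    *a = *b;
    *b = t;
}

/* XOR swap, the same three steps as p[0] ^= p[1]; p[1] ^= p[0]; p[0] ^= p[1];
   Caution: if a and b point at the same object, the result is 0. */
void swap_xor(unsigned char *a, unsigned char *b)
{
    assert(a != NULL && b != NULL);
    *a ^= *b;
    *b ^= *a;
    *a ^= *b;
}
```

Besides the code generation issue, the XOR version zeroes the value when both pointers refer to the same object, which the temporary-variable version handles correctly. Compiling both with -S at different optimization levels is an easy way to repeat the comparison.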
Re: Programming in C - questions
by on (#216889)
Some compilers (GCC in particular) will even optimize some loops into calls to memcpy/move/set functions.
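A sketch of the kind of loop this applies to. The function name clear_buffer is made up; to my knowledge the GCC pass responsible is enabled by -ftree-loop-distribute-patterns at higher optimization levels, but whether the call to memset actually appears depends on the compiler version and flags, so check the -S output.

```c
#include <assert.h>
#include <stddef.h>

/* A plain zeroing loop. An optimizing compiler may recognize the pattern
   and emit a call to memset instead of the loop. */
void clear_buffer(unsigned char *buf, size_t n)
{
    size_t i;

    assert(buf != NULL || n == 0);
    for (i = 0; i < n; ++i)
        buf[i] = 0;
}
```

Either way the observable behaviour is the same; only the generated code differs.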
Re: Programming in C - questions
by on (#216928)
I hope I'm not hijacking this thread too much, but I didn't think it was worth starting a new thread over. Anyway, does anyone know of any good resources for learning C from Java? (As in, something that highlights the differences, because they seem to share a lot.) I want to write my own tools, but I'm not at all enthusiastic about Java, and I assume C would be better. All the resources I've seen for C online assume you have zero programming knowledge, and are just gigantic.
Re: Programming in C - questions
by on (#216930)
Espozo wrote:
I hope I'm not hijacking this thread too much, but I didn't think it was worth starting a new thread over. Anyway, does anyone know of any good resources for learning C from Java? (As in, something that highlights the differences, because they seem to share a lot.) I want to write my own tools, but I'm not at all enthusiastic about Java, and I assume C would be better. All the resources I've seen for C online assume you have zero programming knowledge, and are just gigantic.

I don't know of any resources for people who know Java that want to learn C, but I do know of a top-notch resource for C in general: C Programming: A Modern Approach by K. N. King. Don't let anyone tell you the K&R book (also see Wikipedia) is a great starter book -- it isn't. I'm not going to derail the thread, nor am I going to get into a discussion about PL advocacy, but for general tooling you might also consider Go (it sort of makes me think of a "C-like" language that results in a single self-contained binary -- no libraries/frameworks/runtimes/other crap to install).
Re: Programming in C - questions
by on (#216934)
Going from Java to C: forget everything Java has taught you; it's all bad and will lead you to ruin. Start from scratch and learn the proper way of doing things with C.

C++ is a bit different, and shares some concepts with Java (objects, methods, static, public, etc.), but you need to learn pointers, pointer arithmetic, how to delete things, when to delete things, and when to use a pointer, a reference, or just the object/data type itself. So basically forget everything Java taught you and go back to C from scratch ;) There are a lot of fundamentals that Java just avoids that are critical to a C/C++ program.
Re: Programming in C - questions
by on (#216944)
Yeah, Java's strength is in its strongly typed syntax and object oriented structure. C++ lets you do what you want, requires you to manage your own memory usage, and has OOP awkwardly tacked on (though the last part is somewhat subjective I guess).

I wouldn't say you can't use what you've learned from Java, though. It does force some sensible patterns that a lot of C++ programmers tend to forget/willingly ignore.
Re: Programming in C - questions
by on (#216946)
koitsu wrote:
If this is for NES or some other architecture, I have no experience with cc65's compiler.


Quote:
The structure.member or structure->pointer?

If I'm understanding the question correctly (which I'm not sure I am) -- for 6502 with cc65, structure.member will give a lot better performance than pointer->member. You will likely get

Code:
lda _structure+3 ;3 being the number of bytes offset to "member"

which is just 4 clock cycles.

Doing it with pointers will first copy the address of the pointer into a temporary location in zero page, to do an indirect indexed lookup, like:
Code:
lda _structurePtr
sta $10 ;or some other temporary place in zero page
lda _structurePtr+1
sta $11
ldy #3 ;3 being the offset
lda ($10),y

Which is about 21 clock cycles -- more than 5 times slower. This really falls apart if you do it in a loop where it has to repeatedly copy the pointer(s) to zero page over and over again.

(Edit: realized the first version was wrong, fixed)
Re: Programming in C - questions
by on (#216963)
The C compiler is GCC.
Re: Programming in C - questions
by on (#216964)
You can use gcc -S -masm=intel to review the generated assembly (.s files) in Intel format. The rest of my advice about optimisation flags etc. from an earlier post still applies. Don't be surprised if what you find scares you.
Re: Programming in C - questions
by on (#216981)
koitsu wrote:
You can use gcc -S -masm=intel to review the generated assembly (.s files) in Intel format. The rest of my advice about optimisation flags etc. from an earlier post still applies. Don't be surprised if what you find scares you.

Compiler Explorer is very handy for this kind of work.
Re: Programming in C - questions
by on (#217085)
thefox wrote:
koitsu wrote:
You can use gcc -S -masm=intel to review the generated assembly (.s files) in Intel format. The rest of my advice about optimisation flags etc. from an earlier post still applies. Don't be surprised if what you find scares you.

Compiler Explorer is very handy for this kind of work.

One word - awesome.
I'm playing with it and... my CHR decoding function was 85 ASM lines. Now it's 67.
Re: Programming in C - questions
by on (#217175)
godbolt is such a great tool.

Quote:
However, if the performance of an initializer is critical for your application there is no general advice anyone can give for how to write it the "best" way.

I dunno. Initializer lists seem like a clear winner to me. They require no analysis to optimize; they produce better code than memset on -O0. And their syntax is... well... intended for initialization!

Quote:
Anyway, does anyone know of any good resources for learning C from Java?

If you understand 'if' 'for' 'while' and functions then start writing code. That's all you need to get started.