Let's discuss loops now. I have 4 cases for it. Whether you write it as for() or while() is irrelevant here, OK? ^_^;;
Code:
int i;
char buffer[8]; /* let's assume a small size */
//case 1
for(i=0;i<8;i++)
buffer[i] = 0;
//case 2
buffer[0] = buffer[1] = buffer[2] = buffer[3] =
buffer[4] = buffer[5] = buffer[6] = buffer[7] = 0;
//case 3
memset(buffer, 0, 8); /* needs <string.h> */
//case 4
char buffer[8] = { 0, 0, 0, 0, 0, 0, 0, 0 }; /* replaces the declaration at the top; here, it could be filled up with specific values */
Is this about loops or methods of array initialization (out of which only one is clearly a loop)?
Personally, I like to avoid redundancy at all costs, so 2 is definitely out. I guess I'd pick 4 if I wanted to use different values for each position, and 3 (which I didn't know about until now) if I wanted to fill it all with a single value. Either way, a loop seems unnecessary if I can do it in one shot.
Unrolling loops. The examples cover clearing a buffer (filling up with zeroes).
Case 1 is the loop. The other cases are ways to unroll the loop. *Perhaps* the compiler does that for you.
The answer lies in knowing what the compiler generates assembly-wise. It's possible to profile all of the above, but as with most programming things, it boils down to the programmer needing to decide what's more important: space savings, CPU savings, or code sanity (i.e. what reads easiest/makes the most sense when read).
If you're using MSVC or some other modern compiler, all of those are likely to generate the same code, even the memset. memset is often treated as a compiler intrinsic so that the optimizer can optimize through it. Simple for loops like that can also be replaced by a memset during optimization, especially if you've told the compiler to optimize for size.
I don't really see an advantage to any of those coding styles in particular. Case 2 seems like a waste of text space, but otherwise the choice seems arbitrary to me.
If you want to know what your compiler is doing, turn on the flags that dump the generated assembly (-S in gcc, /FA in MSVC) and take a look. That's really the best way to learn how to write code that is friendly to the optimizer.
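To make that comparison concrete, here is a minimal sketch that wraps each of the four cases in its own function so the generated assembly can be compared side by side. The function names and BUF_SIZE are made up for illustration, and case 4 copies its initialized array out through memcpy only so the compiler cannot throw the initializer away. Whether the four really compile to identical code depends on your compiler, version, and flags.

```c
#include <string.h>

enum { BUF_SIZE = 8 }; /* same small size as in the original post */

/* case 1: plain loop */
void clear_loop(char *buffer)
{
    int i;
    for (i = 0; i < BUF_SIZE; i++)
        buffer[i] = 0;
}

/* case 2: chained assignment, i.e. the loop unrolled by hand */
void clear_chain(char *buffer)
{
    buffer[0] = buffer[1] = buffer[2] = buffer[3] =
    buffer[4] = buffer[5] = buffer[6] = buffer[7] = 0;
}

/* case 3: library call, often expanded inline by the compiler */
void clear_memset(char *buffer)
{
    memset(buffer, 0, BUF_SIZE);
}

/* case 4: initializer at the point of declaration; the result is
   copied out so that the initialized array is actually used */
void clear_init(char *buffer)
{
    char zeros[BUF_SIZE] = { 0, 0, 0, 0, 0, 0, 0, 0 };
    memcpy(buffer, zeros, BUF_SIZE);
}
```

Saved as, say, clear.c, something like gcc -O2 -S clear.c (or cl /O2 /FA clear.c with MSVC) should leave an assembly listing next to the source; try it again with -Os or -O0 to see how much the answer changes.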
What about -funroll-all-loops in gcc?
Here's something good for reading.
Never use -funroll-all-loops. Just don't.
If your code is too slow, re-code it in assembly. -funroll-all-loops won't make it any faster, especially if you target a modern PC, where loading more code into the instruction cache is more expensive than running a short loop over a ton of data.
If you have a large array to initialize, then you'd want to unroll partially (i.e. do a batch of ~4-8 writes in each iteration) for a significant performance gain. Unrolling more than that isn't going to gain you anything.
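For anyone who hasn't seen partial unrolling written out, a rough sketch with 4 writes per iteration plus a cleanup loop for the leftovers could look like this. clear_unrolled4 is a made-up name, and whether it actually beats the plain loop (or memset) depends on the compiler and the CPU, since many optimizers already do this on their own.

```c
#include <stddef.h>

/* Clears n bytes, four writes per loop iteration. */
void clear_unrolled4(char *buf, size_t n)
{
    size_t i = 0;

    /* main loop: runs while at least four bytes remain */
    for (; i + 4 <= n; i += 4) {
        buf[i]     = 0;
        buf[i + 1] = 0;
        buf[i + 2] = 0;
        buf[i + 3] = 0;
    }

    /* cleanup loop: the 0 to 3 bytes left when n is not a multiple of 4 */
    for (; i < n; i++)
        buf[i] = 0;
}
```

The cleanup loop is the part people forget; without it, a size like 10 would leave the last two bytes untouched.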
As for your code, version 4) cannot be used to clear an array that already exists: in both C and C++ the brace syntax is only valid as an initializer at the point of declaration, and arrays cannot be assigned to afterwards. Method 3) is, in my opinion, the only correct way to do that. memset() will normally end up calling something that is efficient enough.
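A small sketch of the difference between initializing and clearing, as I understand it: the brace syntax is accepted where the array is declared, but an array that already exists cannot be assigned to, so memset() (or a loop) is what is left. clear_existing_buffer is a made-up function for illustration only.

```c
#include <string.h>

/* Dirties a buffer, clears it again, and returns the sum of its bytes. */
int clear_existing_buffer(void)
{
    /* brace initializer: fine here, at the point of declaration */
    char buffer[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
    int i, sum = 0;

    /* buffer = { 0 };  would not compile: arrays are not assignable */

    /* clearing the existing array */
    memset(buffer, 0, sizeof buffer);

    for (i = 0; i < 8; i++)
        sum += buffer[i];
    return sum; /* 0 if every byte was cleared */
}
```

Using sizeof buffer instead of a literal 8 keeps the memset correct if the array size changes later, but it only works where buffer is a real array and not a pointer.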
Just this afternoon I had to code my own memset() in assembly to get a performance boost in some program at work.
I also agree with avoiding -funroll-loops in 99% of scenarios. It should only be used in very specific circumstances. Today's compilers can decide whether to roll or unroll a lot better than we can (given today's CPU architecture designs).
This site puts the "fun" in -funroll-loops: Gentoo is Rice
Usually I do (now "did") -funroll-all-loops -ffast-math -msoft-float.