Bregalad wrote:
Quote:
Is it only concern about speed that prevented people from using higher-level languages? or the lack of tools, perhaps? I remember seeing some game for the NES here was made with cc65's C compiler, and it didn't (seem) to have speed issues.
Are you kidding? I tried to make a program that basically writes text on the screen, and even that was too much to handle; it lagged.
The 6502's architecture does not seem to lend itself well to C. Fortunately, the same traits that cause this also make it easy to envision solutions in assembly language on the '02. Here's an example of compiled C code for an empty count-to-100 loop, the line
Code:
for (i = 0; i < 100; i++);
with i defined as a char (1 byte). The equivalent BASIC line would be
Code:
FOR I = 0 TO 99 : NEXT I
The popular cc65 C compiler, which admittedly is not optimizing, produces the following 6502 code from it. The comments were added by hand, not by the compiler.
Code:
000000r 1 20 rr rr jsr decsp1 ; make 1 byte space on the stack
000003r 1 A2 00 ldx #$00
000005r 1 A9 00 lda #$00
000007r 1 A0 00 ldy #$00
000009r 1 91 rr sta (sp),y ; initialize i to 0
00000Br 1 A0 00 L0003: ldy #$00
00000Dr 1 A2 00 ldx #$00
00000Fr 1 B1 rr lda (sp),y
000011r 1 C9 64 cmp #$64 ; cmp i to 100
000013r 1 20 rr rr jsr boolult ; do a less than comparison
000016r 1 F0 03 4C rr jne L0005 ; if less than
00001Ar 1 rr
00001Br 1 4C rr rr jmp L0004 ; if equal to 100
00001Er 1 A0 00 L0005: ldy #$00
000020r 1 A2 00 ldx #$00
000022r 1 B1 rr lda (sp),y ; get i again
000024r 1 48 pha
000025r 1 18 clc
000026r 1 69 01 adc #$01 ; increment i
000028r 1 A0 00 ldy #$00
00002Ar 1 91 rr sta (sp),y ; store i
00002Cr 1 68 pla
00002Dr 1 4C rr rr jmp L0003
000030r 1 20 rr rr L0004: jsr incsp1 ; restore stack
all for what can be done in assembly language with only:
Code:
LDX #0
L0003: INX
CPX #$64
BNE L0003
which, besides being less than one-seventh as many bytes, has no subroutine calls to eat up even more time like the compiled C version did.
Since the index value is not used inside the loop, we could further shorten the code by starting at 100 and counting down to zero, which gives the same number of loop iterations. We'll use DEX, which, like many other instructions, has an automatic, implied compare-to-zero built in (it sets the Z flag from its result), so we can omit the CPX #0:
Code:
LDX #$64
L0003: DEX
BNE L0003
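The same count-down restructuring can be written at the C level too. The following is only a sketch: the function names count_up and count_down are made up for illustration, and whether cc65 (or any other compiler) actually turns the second form into a bare DEX/BNE pair is an assumption you would have to check against its listing output.

```c
/* Hypothetical helpers, not from the original post. Each one returns the
   number of passes through the loop so the two forms can be compared. */

/* Count up: every pass needs an explicit compare against the limit,
   like the INX / CPX #$64 / BNE version. */
unsigned count_up(unsigned char limit)
{
    unsigned passes = 0;
    unsigned char i;

    for (i = 0; i < limit; i++) {
        passes++;
    }
    return passes;
}

/* Count down: the loop ends when the counter reaches zero, so the only
   test is "is it zero yet?", like the DEX / BNE version.
   As with LDX #0 / DEX / BNE, a start value of 0 wraps around and gives
   256 passes, because an 8-bit counter goes from 0 to 255 on decrement. */
unsigned count_down(unsigned char start)
{
    unsigned passes = 0;
    unsigned char i = start;

    do {
        passes++;
    } while (--i != 0);
    return passes;
}
```

Calling count_up(100) and count_down(100) both give 100 passes, which is the point: when the index is not used inside the loop, the direction of counting does not change how many times the body runs.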
Clearly the assembly-language version will absolutely fly compared to the compiled C-language version.
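To put rough numbers on the two hand-written loops, here is a small cycle-count sketch. It assumes the standard documented 6502 timings (2 cycles each for LDX #, INX, DEX and CPX #; 3 cycles for a taken BNE and 2 for one that falls through) and that the loop does not straddle a page boundary, which would add a cycle to each taken branch. The function names are made up for illustration. The compiled C version is left out because its total depends on the run-time library routines (decsp1, boolult, incsp1), whose cycle counts are not shown in the listing.

```c
/* Assumed 6502 cycle counts, no page crossing on the branch. */
enum {
    CYC_LDX_IMM       = 2,
    CYC_INX           = 2,
    CYC_DEX           = 2,
    CYC_CPX_IMM       = 2,
    CYC_BNE_TAKEN     = 3,
    CYC_BNE_NOT_TAKEN = 2
};

/* Hypothetical helper: total cycles for "LDX #n" followed by a loop that
   runs `passes` times (passes >= 1). body_cycles is everything in the
   loop except the BNE. The branch is taken on every pass but the last. */
unsigned long loop_cycles(unsigned passes, unsigned body_cycles)
{
    return CYC_LDX_IMM
         + (unsigned long)passes * (body_cycles + CYC_BNE_TAKEN)
         - (CYC_BNE_TAKEN - CYC_BNE_NOT_TAKEN);
}

/* LDX #0 / INX / CPX #$64 / BNE */
unsigned long count_up_cycles(unsigned passes)
{
    return loop_cycles(passes, CYC_INX + CYC_CPX_IMM);
}

/* LDX #$64 / DEX / BNE */
unsigned long count_down_cycles(unsigned passes)
{
    return loop_cycles(passes, CYC_DEX);
}
```

Under those assumptions, 100 passes cost 701 cycles counting up and 501 cycles counting down, so dropping the CPX saves a little under 30 percent on top of what was already saved over the compiled code.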
I'm sure a much better C compiler could be written for the '02. Perhaps there is one and I'm just not aware of it. The above is from the middle of my web page, "Assembly Language: Still Relevant Today." An experienced programmer came across it last week and emailed me, saying that around 1990 he was working on a 68020 and they were looking for the C compiler that put out the fastest-performing code for it. One thing they tried was timing an empty count-to-10,000 loop. Zero. What? So they tried 10,000,000 instead. Still zero. It turned out that the compiler was smart enough to see that the loop did nothing, so it compiled no code for it. Unless the desired delay is super short, a loop is usually a very poor way to get it; but a programmer who knew what the compiler should put out might have a very good reason for the empty "for" loop, and wouldn't want the compiler to delete it.
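For what it's worth, standard C does offer a way to tell the compiler to leave such a loop alone: declare the counter volatile. Accesses to a volatile object count as side effects that the compiler has to preserve, so the loop cannot simply be deleted. This is only a sketch; the name spin_delay is made up, I don't know whether the 68020 compiler in the anecdote supported volatile, and the actual delay still depends entirely on the compiler's output and the clock speed.

```c
/* Hypothetical helper, not from the original post.
   Because i is volatile, every read and write of it must really happen,
   so an optimizing compiler may not remove the otherwise empty loop.
   Returns the final counter value, which equals passes. */
unsigned long spin_delay(unsigned long passes)
{
    volatile unsigned long i;

    for (i = 0; i < passes; i++) {
        /* intentionally empty */
    }
    return i;
}
```

A call such as spin_delay(10000) would then actually execute its 10,000 passes instead of timing out at zero, although a hardware timer remains the better tool for anything but the shortest delays.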
I remember reading, back then, about compiler writers tuning their optimizers to make their compiler look better than the competition's. The compilers watched for various things you might want to do, particularly well-known benchmarks and tests, and replaced them with hand-optimized versions strenuously written by the few programmers who knew the processor a lot better than the customer would. My reader commented that that was "a bit naughty."