JMP, fine...JSR, nope...

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
JMP, fine...JSR, nope...
by on (#139196)
So this is the first time I've experienced this. It's sorta wacky.

I have a routine that I'd like to use in a few instances, so I set up a JSR->Routine->RTS scenario. It does not work. I mean, there are no errors returned, however its effect does not work...

Now, if I take the meat of the routine (all except for the RTS) and put it in line rather than JSRing to it, it works exactly as expected.

Also, I tried JMPing to it and then JMPing back to the next line (which would, for all intents and purposes, do the same as a JSR->RTS I would think), and that actually worked as expected.

I could do it in line, except that I'd like to use the routine in a few places and I'd have to change the label names...it just seems gratuitous to have to put the exact same code in multiple places like that; I had intended to just JSR to it from the few places that I needed it.

Can anyone think of a reason why it would work in line, JMPing to it and JMPing back would work, but cutting the exact code out, pasting it somewhere as a routine, and JSR-RTSing from it would not?

In context, I feel like this might be a fundamental thing I don't understand rather than something wrong with the code.

Thanks!
Re: JMP, fine...JSR, nope...
by on (#139197)
I had a problem like this not too long ago. I think it was because I did not push and pull my registers during NMI, since I don't put everything in NMI. Maybe it's that?

Just my noobish 2 cents.
Re: JMP, fine...JSR, nope...
by on (#139199)
In addition to possible interference by interrupt handlers, it's worth asking whether the meat of your routine touches the stack pointer or data on the stack at all.
Re: JMP, fine...JSR, nope...
by on (#139202)
Hey guys - thanks.

I push all my stack values at the top of the NMI and pull them at the end. So that's not it...

But more importantly, nothing is being pushed or pulled during this routine.

I'm mostly perplexed that it works perfectly in line, but not as a routine.

Thanks for the suggestions though, definitely appreciate them. Any other thoughts?



EDIT: Actually, I am using a data pointer to pull some info during the routine. Could that have something to do with it? In which case, what exactly is the problem and how could it be remedied?
Re: JMP, fine...JSR, nope...
by on (#139203)
My only guesses are that there's something wrong with the stack or with the place you're putting this subroutine in the ROM.

Is the stack being properly initialized? Is the subroutine outside of the program flow? Is it in a bank that's mapped in when you call it?

Maybe if you shared some code or a ROM it would be easier for us to debug. This is definitely not common and shouldn't be happening.
Re: JMP, fine...JSR, nope...
by on (#139204)
JoeGtake2 wrote:
EDIT: Actually, I am using a data pointer to pull some info during the routine. Could that have something to do with it? In which case, what exactly is the problem and how could it be remedied?

Pushes and pulls from the stack could mess with a subroutine, but reading from an indirect location shouldn't, no.

Code:
lda #$FF
pha;one byte pushed to the stack
dostuff:
pla;the same byte (#$FF) is pulled off the stack


Versus:

Code:
lda #$FF
pha;one byte pushed to the stack
jsr dostuff;this pushes the return address to the stack
returnlabel:

;Later in the code...

dostuff:
pla;This pulls half the address of returnlabel-1 off the stack rather than
;#$FF, because the address was pushed onto the stack on top of
;#$FF
rts;And, because half the address has been pulled
;the two bytes the RTS uses as the return address
;will no longer be the two pushed there by JSR

;Your program will probably return to a garbage location

If you're not touching the stack at all in the routine, it's not that. As tokumaru says, there's not much to talk about without seeing the code in question.
Re: JMP, fine...JSR, nope...
by on (#139205)
No problem - here is some code. Hopefully it's enough context?

I have added an acceleration and deceleration method to my movement (works fine). Now I have to update my collision detection. This is where the problem is coming in.

Here is the code that works in line. For example, this reads the point directly to the right of the player's right bounding box, first at the top bbx, then at the bottom bbx, and evaluates the tile there for a collision. This works absolutely fine:
Code:

TileCheckRight:
     LDA playerXright
     CLC
     ADC maxSpeedHi
     STA tempx

     LDA playerYtop
     STA tempy

     JSR GetTilePosition  ;; this reads from a collision table based on tempx and tempy
     JSR CheckCollision ;; for right now, this just checks solid or not solid (1 or 0)
     BEQ nextA
     JMP skipDec
nextA:
     LDA playerYbottom
     STA tempy
     JSR GetTilePosition
     JSR CheckCollision
     BEQ nextAA
     JMP skipDec
nextAA:



However, if I take that bit of code and turn it into a routine (I've tried putting it in plenty of places by trial and error, and I know most of them have no conflicts, as I can put other things there and JSR to them just fine) and JSR->RTS to it (with the RTS coming right after nextAA), the values are not loaded.

Again, if I put it elsewhere and do something like this instead:

In the inline:
Code:
     ;;;; the code leading up to it...
     JMP TileCheckRight
whatever:



and then in the routine:

Code:

TileCheckRight:
     LDA playerXright
     CLC
     ADC maxSpeedHi
     STA tempx

     LDA playerYtop
     STA tempy

     JSR GetTilePosition  ;; this reads from a collision table based on tempx and tempy
     JSR CheckCollision ;; for right now, this just checks solid or not solid (1 or 0)
     BEQ nextA
     JMP skipDec
nextA:
     LDA playerYbottom
     STA tempy
     JSR GetTilePosition
     JSR CheckCollision
     BEQ nextAA
     JMP skipDec
nextAA:
     JMP whatever



...this ALSO works just fine, and the values are loaded and everything works as expected...

Does this help at all?


****EDIT: I've disabled all potential bank swapping, so it's definitely not that. Everything in this code is loaded from the start and never unloaded.
Re: JMP, fine...JSR, nope...
by on (#139207)
So the RTS should be after nextAA?

Is there another RTS after whatever skipDec leads to? If not, you're jumping someplace and never actually returning from the subroutine.
Re: JMP, fine...JSR, nope...
by on (#139209)
Honestly what I would do is just run the rom in an emulator with a debugger like fceux or nintendulatordx, set a break-point on the routine, and step through it. You'll probably spot the problem immediately that way.
Re: JMP, fine...JSR, nope...
by on (#139211)
Random things that come to mind:

1. Stack isn't initialised (i.e. ldx #$ff / txs and should only be done once during RESET) and/or cleared correctly (i.e. random crap is left in $0100-01FF that may be affecting certain behaviours depending on how your code actually works).

2. Some of your variables are referencing areas of RAM that are intended to be used by the stack, e.g. somevar eq $01ff with some code that modifies somevar (thus modifying the value at $01ff), but if the modification is done within the subroutine and then you call rts the PC that you get returned to is going to be different than what you expected (because you essentially modified the stack contents).

3. Bug in your code that is overwriting the stack when it shouldn't be. Common reasons I can think of for this would be using indexing operations that wrap pages (e.g. $00ff -> $0100) and you end up encountering Issue #2, or NMI routines periodically doing something to the stack area which they shouldn't. Pre-assembled values being calculated within the stack space would be determined through an assembly listing generation, but ones in real-time would require a debugger (see emulator). I would suggest looking through an assembly listing anyway, especially to make sure all of your variable accesses are zero page when they should be and absolute (16-bit) addresses when they should be (this matters if using modifiers that return the high byte or low byte of an address of something (a lot of code has to do this) -- and while that's done assemble-time, it can still have unexpected behaviours depending on what your goal is with your code).

4. Bug in your code where you're already using the stack but aren't pulling ("popping") values off entirely, resulting in a stack imbalance (more specifically: eventual stack underflow, I believe), e.g. pha / phx / phy somewhere, but then only doing ply / plx at the end of your routine (3 pushes vs. 2 pulls). Every time that routine would get called the stack pointer would decrement by 1 indefinitely and eventually bottom out (to $0100) and wrap back to $01FF.

5. A few other things that are in the back of my mind but would rather not discuss without ruling out all of the above.
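
To make item 1 concrete, here is a minimal sketch of the start of a RESET handler. The label names reset and clearstack and the optional clearing loop are assumptions for illustration, not taken from the code under discussion.

Code:
reset:
  sei          ; ignore IRQs during setup
  cld          ; decimal mode off (customary, even though the NES CPU lacks decimal mode)
  ldx #$ff
  txs          ; stack pointer = $FF, so the first push lands at $01FF
  lda #$00
  tax          ; X = 0
clearstack:
  sta $0100,x  ; optional: zero $0100-$01FF so stale bytes don't confuse debugging
  inx
  bne clearstack   ; loops 256 times, until X wraps back to 0

This has to run before the first JSR, since the clearing loop would otherwise wipe out a live return address.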

It would really be more helpful if you could just post actual code snippets like "This works" {insert code block} "This doesn't" {insert code block}. Your explanation given so far is pretty good, but the attempt to explain the code (at this stage) is more complex than just showing what does and doesn't work with verbatim 6502.

Most importantly: I second thenendo's recommendation of using an emulator. FCEUX will show you stack contents in real-time, so if you've got a stack exhaustion situation it's quite easy to determine. Oh, and about FCEUX: remember, it shows the stack in "reverse order" -- that is to say, a stack that shows C5,80,A5,57,80 means that $C5 was the most-recently-pushed-value onto the stack. I'd have expected it to show the most-recently-pushed values at the end of the list, not the start.

I think you basically have a "lingering bug" somewhere in your code that manifests itself only because you're now using the stack more -- meaning this would probably bite you further down the road if you did more stack operations anyway, so fixing it now is wise.
Re: JMP, fine...JSR, nope...
by on (#139220)
thenendo wrote:
Honestly what I would do is just run the rom in an emulator with a debugger like fceux or nintendulatordx, set a break-point on the routine, and step through it. You'll probably spot the problem immediately that way.

You're absolutely right. We can continue to play the guessing game here, but the best way to approach this would be to trace through the code as it runs. Set up a breakpoint at the JSR instruction, then step through the code and see if the CPU can successfully reach the subroutine and return from it, and what happens afterwards. That should give you the answer.
Re: JMP, fine...JSR, nope...
by on (#139221)
koitsu wrote:
a stack that shows C5,80,A5,57,80 means that $C5 was the most-recently-pushed-value onto the stack. I'd have expected it to show the most-recently-pushed values at the end of the list, not the start.

Well, since the stack grows downwards (from $01FF down to $0100), when entries are sorted by their addresses the stack does look the way FCEUX shows it. Maybe they thought it would be confusing to display the same area of memory sorted differently (increasing addresses in the hex editor, decreasing addresses in the debug window). I guess I prefer to have the ordering consistent across windows, but I agree that there could be some sort of indication of where the top of the stack is.
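
As a rough illustration of the downward growth, here is what a single JSR does to the stack page. The address $8000, the starting stack pointer $FF and the label names are made up for the example.

Code:
; Assume S = $FF and this JSR is assembled at $8000-$8002
  jsr someplace   ; pushes $80 to $01FF, then $02 to $01FE; S is now $FD
                  ; ($8002 is the address of the JSR's last byte)
done:
  jmp done        ; keeps execution from falling into the routine below

someplace:
  ; while we are here, memory reads $01FE = $02, $01FF = $80,
  ; so sorted by address the most recently pushed byte comes first
  rts             ; pulls $02 then $80, adds 1, resumes at $8003; S is back to $FF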

Quote:
I think you basically have a "lingering bug" somewhere in your code that manifests itself only because you're now using the stack more -- meaning this would probably bite you further down the road if you did more stack operations anyway, so fixing it now is wise.

That's my opinion too. The bug is probably somewhere else, and even if you can get around it by not using JSR/RTS in this particular case, it will probably manifest itself some other way later in development, so you'd better find out what's wrong now.
Re: JMP, fine...JSR, nope...
by on (#139222)
Quote:
The bug is probably somewhere else, and even if you can get around it by not using JSR/RTS in this particular case, it will probably manifest itself some other way later in development, so you'd better find out what's wrong now.


Yeah - that's my inclination too. I don't want to bury the gremlin and have it show up again later.

Quote:
the best way to approach this would be to trace through the code as it runs. Set up a breakpoint at the JSR instruction, then step through the code and see if the CPU can successfully reach the subroutine and return from it, and what happens afterwards. That should give you the answer.


Thanks for the tip. Thus far, I have not played with the debugger in FCEUX at all. How would I determine the address of the JSR instruction in question in order to set the breakpoint? I'm looking now, but I'm not entirely sure how to find that info.

You guys rock.

Joe
Re: JMP, fine...JSR, nope...
by on (#139223)
JoeGtake2 wrote:
How would I determine the address of the JSR instruction in question in order to set the breakpoint? I'm looking now, but I'm not entirely sure how to find that info.

If your assembler can generate a listing file, that will contain all the addresses. I often do something more "hacky" though: I put "sta $ff" (or a store to any other memory location that's known to be unused) wherever I want to start debugging (in your case, that'd be right before the JSR) and set up a breakpoint for writes to $00ff. It's easier than looking addresses up.

When the breakpoint is triggered, just keep clicking "step into" and watch what happens as each instruction is executed. After you solve the bug, don't forget to remove the "sta $ff".
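
A sketch of what that looks like in practice, using the TileCheckRight routine from earlier in the thread as the call being debugged. It assumes $FF is unused zero-page RAM in your program.

Code:
  sta $ff              ; debug marker: STA leaves A, X, Y and the flags untouched
  jsr TileCheckRight   ; with a write breakpoint on $00FF, stepping starts right around here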
Re: JMP, fine...JSR, nope...
by on (#139226)
Great advice. I followed it. I loaded $FF to A at the start of the routine so I could see if it was getting accessed, and then stepped through to see what the accumulator values were and to make sure things were going where expected. I was able to follow the code line by line, and everything is exactly as expected. The program counter goes where it should (from everything I can tell), the values in the accumulator are correct at the correct times, the branching seems to work the way it is supposed to...but when playing, it does not work as it is supposed to (read: as it does when this code is in line rather than JSRed to).

...I still don't have any idea what the cause is, but in just doing some trial and error stuff... I tried storing the accumulator into Y before returning from the routine (TAY), then restoring it to A after the jump back (TYA), and then did my evaluation on the value. To my surprise, this worked.

So despite the accumulator reading the right value in debugging, somehow during the return from the subroutine, the value is being corrupted. Shifting it into Y and then pulling it out of Y after the return seems to fix the problem. I find it incredibly peculiar, especially since there is no evidence of this value being corrupted in the debugger, and everything *seems* to work just as it should when stepping through it in the debugger.

Anyhow, thought I'd give an update. Weird.
Re: JMP, fine...JSR, nope...
by on (#139230)
Just to be clear, you're not expecting registers to be preserved during JSR and restored at RTS, right?

If you're using ca65, you can output your labels to a text file with the -Ln linker flag. You can then search for the label in that text file to find its address, or better yet, turn that label file into an FCEUX debug symbols file which will show the labels directly in the debugger. There is a python script that does this in an example I made a while ago: minimal ca65 example
Re: JMP, fine...JSR, nope...
by on (#139231)
Other helpful things when debugging:

1. Debug > Trace logger.

This lets you dump a text file containing every instruction executed and the status of every register/flag at each step.

2. Conditional breakpoints.

This lets you set a breakpoint that also has a condition.
See: FCEUX debugger guide


For example, if you know that something is wrong if A is 5 at address $8075, then what you can do is start a trace log at a time before things go wrong, create an execution breakpoint on 8075 with the condition A==#5, then run until the breakpoint is hit. Stop the trace at this point, and you will have a log of everything that happened up until that breakpoint. From here you can work backwards from the end of the file until you see exactly what caused the problem.
Re: JMP, fine...JSR, nope...
by on (#139233)
Quote:
Just to be clear, you're not expecting registers to be preserved during JSR and restored at RTS, right?


They're not? Hm...that is definitely news to me...

So if I were to do something like this:

Code:

     LDX #$01
     JSR whatever


.......

whatever:
    TXA
    RTS


...could I not expect 1 to now be in the accumulator (not to mention, X register still as well...)? If that's the case...eek...


I'm guessing what you meant is that THIS will not work:

Code:

      LDX #$01
      JSR whatever

.....

whatever:
      INX  ; no longer preserved, because something was done to affect it
      TXA ; accumulator is no longer preserved as it was before the jump, because something was done to affect it
      RTS



...that I wouldn't expect X to be preserved as '1', because it was affected in the subroutine...so now, both the accumulator and x register should be '2', right? If you're asking if I was expecting the accum and x reg to still be preserved as they were prior to the jump in THIS case, no. I understand they have been affected so their values will be different continuing on. However, if you're saying I can not count on the value of the variables at all if I'm JSR->RTSing...that I couldn't even count on this subroutine returning '2'...I'm afraid I have a fundamental flaw in my understanding...
Re: JMP, fine...JSR, nope...
by on (#139236)
I guess you missed Kasumi's comment.

This is what I think your code is doing (' marks the branch path).
Code:
loop:                             ;(6)
 jmp TileCheckRight               ;(1)
somewhere:                        ;   ('4)
 ;...
skipDec:                          ;(4)
 jmp loop                         ;(5)('5)

TileCheckRight:                   ;(2)
 ;...
 jmp skipDec                      ;(3)     no problem here: everything is a jump
 ;...
 jmp somewhere                    ;   ('3)


Versus the JSR way:

Code:
loop:                             ;(6)
 jsr TileCheckRight               ;(1)     pushes the return address to the stack
somewhere:                        ;    ('4)
 ;...
skipDec:                          ;(4)
 jmp loop                         ;(5) ('5)

TileCheckRight:                   ;(2)
 ;...
 jmp skipDec                      ;(3)      skips the RTS: the return address stays on the stack
 ;...
 rts                              ;    ('3)
Re: JMP, fine...JSR, nope...
by on (#139237)
Oooooh! I gotcha now. I think some synapse just fired.

So I'd have to RTS back from the JSR, or otherwise get rid of the last value pushed to the stack, as it's hooked to the RTS. It's just still sitting on the stack lingering, and the next time something pulls from the stack, it's pulling that value...something like that?

Huh. Man...I better go through my code with a highlighter and make sure I haven't made this mistake anywhere else!
Re: JMP, fine...JSR, nope...
by on (#139239)
JoeGtake2 wrote:
Quote:
Just to be clear, you're not expecting registers to be preserved during JSR and restored at RTS, right?
They're not? Hm...that is definitely news to me...

To expand on rainwarrior's comment/question: the contents of A, X, and Y are not "backed up" and "restored" when doing a JSR/RTS. Take this code for example:

Code:
  lda #$01
  ldx #$aa
  ldy #$bb
;
; At this point in the code, A = $01, X = $AA, Y = $BB
;
  jsr someplace
;
; At this point in the code, A = $00, X = $22, Y = $BB
;
loop:
  jmp loop   ; Just to keep the code below this from being run

someplace:
  ldx #$22
  lda #0
  rts

That code should be clear/concise and easy to understand.

If that comes as a surprise to you, i.e. "I thought A/X/Y were backed up when JSR was used, and restored when RTS was used!", then that would be a bug. Is it *the* bug? I don't know, that's for you to figure out.

So if you need to back up and restore registers, you need to make sure your subroutines push those values onto the stack when the routine starts, and pull them off right before the end (and remember to pull them off the stack in the opposite order as you pushed them on). Again the same code, but fixed to back up and restore A/X/Y:

Code:
  lda #$01
  ldx #$aa
  ldy #$bb
;
; At this point in the code, A = $01, X = $AA, Y = $BB
;
  jsr someplace
;
; At this point in the code, A = $01, X = $AA, Y = $BB
;
loop:
  jmp loop   ; Just to keep the code below this from being run

someplace:
  phx
  pha
  ldx #$22
  lda #0
  pla
  plx
  rts

You could also accomplish the same thing by putting the pushes and pulls around (before and after) the JSR, but that's for you to decide. Usually when making subroutines people do the backup/restore within the routine itself (at the start of the routine, and immediately before the RTS).

I don't use PHY/PLY anywhere in the "someplace" routine because Y isn't modified; but you could add it there just as a safety net if you wanted, no harm done (other than some CPU cycles wasted).
Re: JMP, fine...JSR, nope...
by on (#139246)
I'm sorry, I might be on drugs here... But PLX/PHX and PLY/PHY are not real 6502 commands... Right?
Re: JMP, fine...JSR, nope...
by on (#139247)
No I got it.

Celius, I think he was just using that to abbreviate TXA -> PHA. I knew what he meant. Unless I'm mistaken.

But Koitsu, yeah, I get it. I was misreading rainwarrior's meaning (specified in the edit). I get it. I had a moment of revelation when Nostromo posted that visual example. It was the exact same thing EVERYONE was saying, but seeing it like that for some reason just kicked over whatever blockage was happening in my brain.

JSR puts a value onto the stack. Without the RTS, the value never gets pulled. So when I was JMPing out of the subroutine, there was leftover junk on the stack, and I'm pretty sure that's what was causing the problem. I may have done this in other places too, so I'll have to go back through and check. I wasn't making the correlation between JSRing and the stack...I understood using PHA/PLA to push and pull the accumulator to and from the stack, but just didn't know (or at least, didn't factor in) that JSR/RTS also push to and pull from the stack.
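
For anyone finding this later, here is a sketch of one way to restructure it so that every path leaves through the RTS: the subroutine only reports the result, and the caller does the JMP. It assumes CheckCollision returns with the zero flag set when the tile is not solid (which is what the BEQ in my original code relied on). The labels rightIsClear and tileCheckDone are made up for this example.

Code:
     JSR TileCheckRight   ;; returns with Z set if both points are clear
     BEQ rightIsClear
     JMP skipDec          ;; the jump now happens in the caller, after the RTS
rightIsClear:
     ;; ... carry on as before ...

TileCheckRight:
     LDA playerXright
     CLC
     ADC maxSpeedHi
     STA tempx

     LDA playerYtop
     STA tempy
     JSR GetTilePosition
     JSR CheckCollision
     BNE tileCheckDone    ;; solid at the top corner: return right away, Z clear

     LDA playerYbottom
     STA tempy
     JSR GetTilePosition
     JSR CheckCollision   ;; Z now reflects the bottom corner
tileCheckDone:
     RTS                  ;; RTS does not change the flags, so the caller can branch on Z

Since every JSR is now matched by an RTS, nothing is left behind on the stack no matter which way the check goes.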

Thanks for all the input - sorry for being so dense! haha
Re: JMP, fine...JSR, nope...
by on (#139248)
Celius wrote:
PLX/PHX and PLY/PHY are not real 6502 commands... Right?

Nope! {PH|PL}{X|Y} were added on the 65C02.

You might also want a PHP/PLP sometimes.
(Interrupts and RTI include status push-pulls and don't need them added.)

Also, if you had imbalanced push-pulls, those use the same stack as the return address generated/used by JSR/RTS.
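
A sketch of what that means for an interrupt handler on the plain 6502 (the label nmi is an assumption): the CPU pushes the return address and status flags itself and RTI pulls them, so only A, X and Y need saving by hand.

Code:
nmi:
  pha        ; save A
  txa
  pha        ; save X
  tya
  pha        ; save Y
  ; ... handler body ...
  pla
  tay        ; restore Y
  pla
  tax        ; restore X
  pla        ; restore A
  rti        ; pulls the status flags and return address pushed when the interrupt was taken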
Re: JMP, fine...JSR, nope...
by on (#139257)
I figured they weren't, but it seems like the kind of thing I would glance over and then learn about 10 years later :). It's also good to clarify here for anyone browsing the Newbie Help Center so they don't get confused.
Re: JMP, fine...JSR, nope...
by on (#139259)
Celius wrote:
I'm sorry, I might be on drugs here... But PLX/PHX and PLY/PHY are not real 6502 commands... Right?

Sigh. :( As Myask pointed out they're 65c02. I often forget which opcodes the 65c02 introduced in comparison to the original 6502 (and to me that's funny since I started with 6502, went to 65c02 within about 6 months, then later to 65816) -- the only one I always remember is how inc (a.k.a. inc a) doesn't exist on 6502 (instead forced to clc / adc #1); the others I often forget. Thanks guys for keeping me on my toes + pointing out this mistake of mine. Apologies if anyone reading this thread misses that mistake / gets bitten by it.

Revamped routine which does the same thing but works on 6502:

Code:
someplace:
  pha
  txa
  pha
  ldx #$22
  lda #0
  pla
  tax
  pla
  rts
Re: JMP, fine...JSR, nope...
by on (#139265)
JoeGtake2: You had the part I was asking about correct, I was just asking because you mentioned that using TAY/TYA to manually restore A helped. It sounds like you have cleared up your confusion though, otherwise.
Re: JMP, fine...JSR, nope...
by on (#139496)
koitsu wrote:
Celius wrote:
I'm sorry, I might be on drugs here... But PLX/PHX and PLY/PHY are not real 6502 commands... Right?

Sigh. :( As Myask pointed out they're 65c02. I often forget which opcodes the 65c02 introduced in comparison to the original 6502 (and to me that's funny since I started with 6502, went to 65c02 within about 6 months, then later to 65816)

This is of modest use, though a bunch of the 816 opcodes get overridden by the new ones he put in. (Might also be c02, not sure; only barely familiar with either).

It looks nice, anyway.
Re: JMP, fine...JSR, nope...
by on (#139509)
Myask wrote:
This is of modest use, though a bunch of the 816 opcodes get overridden by the new ones he put in. (Might also be c02, not sure; only barely familiar with either).

It looks nice, anyway.

Charts like that tend to be hard for me to follow because they seem "thrown together" (and in my experience a lot of people who have "charts of opcodes" end up making mistakes -- there are all sorts of 6502 web pages that have these which are riddled with mistakes. They get mentioned here once in a while). I prefer stuff like what's in the WDC manual:

...but then I just realised on the per-opcode-breakdown, the formatting (column sizes and layout) in the PDF is completely botched compared to the actual Ron Lichty and David Eyes book, making the one in PDF format hard to read. Hahaha wow... way to go WDC. *sigh*