BSNES debugger - NESdev BBS

BSNES debugger
by mic_ on 2009-12-02 (#53360)

I really appreciate being able to debug with an emulator that's much more accurate than snes9x, and the VRAM viewer is nice to have. But there are a few things I wondered about..

If I press "step", the console isn't updated automatically. I have to click in the main emulator window for the console to be updated. That makes stepping extremely tedious. I hope that's not the intended behaviour (this is on Vista 32-bit).

It would be nice to see not just the latest executed instruction, but also a disassembly of the code around it. Something like this (you don't have to make your disassembler look as ugly as mine).

A palette viewer is a really simple addition, but can also be helpful sometimes.

Re: BSNES debugger
by byuu on 2009-12-02 (#53362)

mic_ wrote:

If I press "step", the console isn't updated automatically. I have to click in the main emulator window for the console to be updated. That makes stepping extremely tedious. I hope that's not the intended behaviour (this is on Vista 32-bit).

Go to Settings -> Configuration -> Advanced and under "When main window does not have focus:", set it to "Ignore Input".

Quote:

It would be nice to see not just the latest executed instruction, but also a disassembly of the code around it. Something like this (you don't have to make your disassembler look as ugly as mine).

SNES code cannot be disassembled ahead of or behind the current position without some sort of emulation happening; this is because the processor has variable-length opcodes.
I am aware there are plenty of workarounds, and there are pitfalls and shortcomings to all of them.

Quote:

A palette viewer is a really simple addition, but can also be helpful sometimes.

May make it in eventually.

Re: BSNES debugger
by tepples on 2009-12-02 (#53374)

byuu wrote:

Whatever workaround Nintendulator and FCEU series use to disassemble ROM while emulating the mapper is probably worth it. Or are you referring specifically to sep and rep?

by byuu on 2009-12-02 (#53378)

To the opcodes rep and sep can affect the size of, at least.

Okay, example:

Code:
routine1:
lda $2180
jsl routine2
nop
...

routine2:
pha
plp
db $a9,$00,$1a
rtl

Does routine 2's db code represent "lda #$1a00" or "lda #$00; inc"?
It depends on the value of $2180, how would a pure disassembler know about that? Further fun is that just by having a fake CPU read that value, it would alter the state of the machine (the WRAM address index) via auto-increment-on-read.
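To make the ambiguity concrete, here is a minimal Python sketch (not from the thread) that decodes the same three bytes under both settings of the M flag. The function name `decode_lda_immediate` is made up for illustration, and it only knows about `lda #imm` ($A9) and `inc` ($1A).

```python
def decode_lda_immediate(code: bytes, m_flag: bool) -> list[str]:
    """Disassemble bytes that start with LDA #imm ($A9) for a given M flag.

    Toy decoder: anything after the LDA is shown as INC A ($1A) or raw db.
    """
    if code[0] != 0xA9:
        raise ValueError("expected LDA #imm ($A9)")
    out = []
    if m_flag:
        # M=1: 8-bit accumulator, so the operand is one byte.
        out.append(f"lda #${code[1]:02x}")
        pos = 2
    else:
        # M=0: 16-bit accumulator, so the operand is two bytes, little-endian.
        out.append(f"lda #${code[2]:02x}{code[1]:02x}")
        pos = 3
    for b in code[pos:]:
        out.append("inc" if b == 0x1A else f"db ${b:02x}")
    return out


data = bytes([0xA9, 0x00, 0x1A])
print(decode_lda_immediate(data, m_flag=True))   # ['lda #$00', 'inc']
print(decode_lda_immediate(data, m_flag=False))  # ['lda #$1a00']
```

Nothing in the bytes themselves says which listing is right; only the run-time value of P does.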

So to do this, we'd need some sort of opcode history for the last several hundred opcodes to disassemble backwards (which may not even be enough if routine1 was too far back), a fake CPU emulator clone of the entire system (or a save state + reload, which is not a 100% non-volatile action in bsnes) to run several opcodes ahead to generate future opcode disassemblies, and a re-implementation of every memory-mapped register (and another for all special chips). Not happening.

So we need a totally different approach. I would say to keep a large block of memory to keep history information about opcodes that will allow your debugger to show an intelligent forward and backward disassembly as something separate to the main console. Eg log MX states at each specific opcode address, as well as an "is this byte an opcode?" flag so you can backwards seek for opcode disassembly. Should be about 8MB packed.

It would fall apart on dynamic code executing out of RAM, as well as on polymorphic code that changes its meaning based on M/X states, but for the most part the code would be valid after it was hit by the emulator a single time. Make the history file non-volatile so it works across runs so long as the SHA-1 hash hasn't changed.
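A rough Python sketch of that history idea, under assumptions: one flag byte per address, an "is opcode" bit plus the M/X values seen at that fetch, and a backward seek that only trusts addresses already executed. The class name, bit layout, and `.history` file naming are all hypothetical and are not bsnes's actual format.

```python
import hashlib

IS_OPCODE = 0x01  # byte was fetched as the first byte of an instruction
FLAG_M = 0x02     # M (accumulator size) flag at that fetch
FLAG_X = 0x04     # X (index size) flag at that fetch


class ExecHistory:
    """One flag byte per address, filled in as the emulator executes."""

    def __init__(self, size: int) -> None:
        self.flags = bytearray(size)

    def record(self, addr: int, m: bool, x: bool) -> None:
        """Call once per executed instruction with its first-byte address."""
        self.flags[addr] = IS_OPCODE | (FLAG_M if m else 0) | (FLAG_X if x else 0)

    def mx_at(self, addr: int):
        """Return (M, X) as last seen at addr, or None if never executed."""
        f = self.flags[addr]
        if not f & IS_OPCODE:
            return None
        return bool(f & FLAG_M), bool(f & FLAG_X)

    def seek_back(self, addr: int, count: int) -> list[int]:
        """Up to `count` known opcode addresses before addr, ascending."""
        found = []
        a = addr - 1
        while a >= 0 and len(found) < count:
            if self.flags[a] & IS_OPCODE:
                found.append(a)
            a -= 1
        return found[::-1]


def history_filename(rom: bytes) -> str:
    """Key the saved history on the ROM's SHA-1 so a changed ROM starts fresh."""
    return hashlib.sha1(rom).hexdigest() + ".history"


hist = ExecHistory(0x10000)
hist.record(0x8000, m=True, x=True)   # lda $2180  (3 bytes)
hist.record(0x8003, m=True, x=True)   # jsl ...    (4 bytes)
hist.record(0x8007, m=True, x=True)   # nop
print([hex(a) for a in hist.seek_back(0x8007, 2)])  # ['0x8000', '0x8003']
print(hist.mx_at(0x8001))                           # None
```

As the post says, this breaks down when the same bytes are executed with different M/X states or at different alignments, since stale flags from an earlier pass would still be set.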

by tepples on 2009-12-03 (#53384)

byuu wrote:
we'd need some sort of opcode history for the last several hundred opcodes to disassemble backwards (which may not even be enough if routine1 was too far back)

I seem to remember reading WDC's data sheet for the 65C816 (from Apple IIGS Hardware Reference) and finding a pin that outputs whether a fetch is the first byte of an instruction. I don't know if the Super NES version of the CPU preserves this signal, but the logic for computing it could be reused to build your "has an opcode been fetched from this physical address?" flag. (Several FCE Ultra derivatives already dunnit.) A second list in RAM could even work for internal RAM ($7E0000/7) as long as the flag is cleared when the corresponding byte is written.
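The clear-on-write rule for RAM could look something like the following Python sketch. It assumes the emulator core can call a hook on every opcode fetch and every write; the class and method names are invented for illustration, and WRAM mirrors in other banks are ignored.

```python
class WramOpcodeFlags:
    """Tracks which WRAM bytes have been fetched as opcodes since last written."""

    WRAM_BASE = 0x7E0000
    WRAM_SIZE = 0x20000  # 128 KiB, banks $7E-$7F

    def __init__(self) -> None:
        self.is_opcode = bytearray(self.WRAM_SIZE)

    def on_opcode_fetch(self, addr: int) -> None:
        # The CPU fetched the first byte of an instruction from this address.
        self.is_opcode[addr - self.WRAM_BASE] = 1

    def on_write(self, addr: int) -> None:
        # The byte changed, so whatever was decoded here before is stale.
        self.is_opcode[addr - self.WRAM_BASE] = 0

    def known_opcode(self, addr: int) -> bool:
        return bool(self.is_opcode[addr - self.WRAM_BASE])


flags = WramOpcodeFlags()
flags.on_opcode_fetch(0x7E1000)
print(flags.known_opcode(0x7E1000))  # True
flags.on_write(0x7E1000)             # e.g. the game copies a new routine in
print(flags.known_opcode(0x7E1000))  # False
```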

Quote:
(or a save state + reload, which is not a 100% non-volatile action in bsnes)

"Non-volatile"? That's a long disambiguation page:
volatile at Wiktionary
volatile at Wikipedia
What exactly do you mean by this? That saving and loading a state is lossy?

Quote:
and a re-implementation of every memory-mapped register (and another for all special chips).

When tracing into an area that isn't mapped to cart ROM or internal RAM, put the blinders back on.

by byuu on 2009-12-03 (#53390)

This is why I hate discussing technical things. I don't always get the jargon right and people don't make an attempt to understand what I'm saying. I don't mean that in a bad way directed at you, it's much more my fault for getting the terms wrong.

> "Non-volatile"?

When you close the emulator / game, it writes the file to disk. When you load the game, it reads it back from disk and continues. Like save RAM.

> What exactly do you mean by this? That saving and loading a state is lossy?

If you save a state and reload it, you won't have the exact same state that you started with. This is because in order to save a state in my multi-threaded (as opposed to state machine based) application, each thread has to be aligned to its entry point. That requires running the system ahead by up to one opcode.

by koitsu on 2009-12-03 (#53394)

byuu wrote:
To the opcodes rep and sep can affect the size of, at least.

Okay, example:

Code:
routine1:
lda $2180
jsl routine2
nop
...

routine2:
pha
plp
db $a9,$00,$1a
rtl

Does routine 2's db code represent "lda #$1a00" or "lda #$00; inc"?

There is no correct answer to your example, because you're explicitly not telling us whether or not bit #5 of P is 1 (8-bit accum) or 0 (16-bit accum) -- it's all based on whatever $2180 returned at the time. Come on, you know this. :-)

Furthermore, tracing changes to bit #5 of P (accum size) and bit #4 of P (index size) is not that hard. TRaCER did it by following the bits of SEP/REP operations. Was it flawless? Heck no -- because TRaCER is not a SNES/SFC emulator. I didn't bother to deal with PLP, XCE, etc.. TRaCER didn't follow branches/jumps, or emulate any aspect of the 65816; it simply read bytes linearly from a file.
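For illustration only, a linear SEP/REP follower in the spirit described above can be sketched in Python as below. This is not TRaCER's code; the opcode table covers just a handful of instructions, and P is assumed to start at $30 (8-bit A and index).

```python
M_BIT = 0x20  # bit 5 of P: 1 = 8-bit accumulator
X_BIT = 0x10  # bit 4 of P: 1 = 8-bit index registers

# Tiny opcode subset: inc, nop, plp, pha, rts, rtl, rep #imm, sep #imm
FIXED = {0x1A: 1, 0xEA: 1, 0x28: 1, 0x48: 1, 0x60: 1, 0x6B: 1, 0xC2: 2, 0xE2: 2}
M_DEP = {0xA9}        # lda #imm
X_DEP = {0xA0, 0xA2}  # ldy #imm, ldx #imm


def linear_lengths(code: bytes, p: int = 0x30) -> list[tuple[int, int, int]]:
    """Walk code linearly; return (offset, length, P before execution) tuples."""
    out = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op in FIXED:
            n = FIXED[op]
        elif op in M_DEP:
            n = 2 if p & M_BIT else 3
        elif op in X_DEP:
            n = 2 if p & X_BIT else 3
        else:
            raise ValueError(f"opcode ${op:02x} is not in this toy table")
        out.append((pc, n, p))
        if pc + 1 < len(code):
            if op == 0xC2:    # rep: clear the bits given in the operand
                p &= ~code[pc + 1] & 0xFF
            elif op == 0xE2:  # sep: set the bits given in the operand
                p |= code[pc + 1]
        # plp ($28) is deliberately NOT handled: its result is unknowable here.
        pc += n
    return out


# rep #$20 / lda #$1234 / sep #$20 / lda #$00 / inc
code = bytes([0xC2, 0x20, 0xA9, 0x34, 0x12, 0xE2, 0x20, 0xA9, 0x00, 0x1A])
print([n for _, n, _ in linear_lengths(code)])  # [2, 3, 2, 2, 1]
```

Feeding it the `pha / plp / db $a9,$00,$1a` routine from the quoted example still yields a listing, but that listing is only a guess, because the tracker keeps whatever P it had before the `plp`.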

An emulator can most definitely accomplish this task since it always knows what the bits of P are, and change disassembled output on the fly based on that. Please don't think of it as "I want to use your emulator to disassemble a game", think of it as "I want to see a disassembly listing of the code around the area of PC". Or, think of it as a 65816-specific version of IDA Pro.

Could this take up a lot of memory? Sure -- depends on how much detail you track as the emulator runs. Given that most x86 PCs today have 2GB RAM or more, I really can't imagine this being a problem.

by tepples on 2009-12-03 (#53396)

koitsu wrote:
There is no correct answer to your example

This lack of a correct answer is byuu's entire point: the length of an instruction depends on the MX bits (WDC used "ENVMXDIZC" to label the nine bits of P), which in turn can depend on things that are harder to determine in advance than on an NES. The compromise solution would have only parts of the debugger, not the emulation core, glitch in such a case.

Quote:
An emulator can most definitely accomplish this task since it always knows what the bits of P are, and change disassembled output on the fly based on that.

In theory, you don't even know the opcode bytes around the PC until you fetch them, and fetching has side effects. Imagine trying to execute from PPU registers, for instance. That's why I recommended limiting the scope of disassembly-around-PC to the common cases of WRAM and ROM, which more easily admit a side path to fetching without side effects.

Quote:
Given that most x86 PCs today have 2GB RAM or more, I really can't imagine [caching instruction lengths] being a problem.

I agree, even on older or portable machines with only 256 MB of RAM, because all such a list will do is double the size of the loaded ROM, and even then only when the debugger is turned on.

Re: BSNES debugger
by magno on 2009-12-03 (#53402)

mic_ wrote:
It would be nice to see not just the latest executed instruction, but also a disassembly of the code around it. Something like this.

What debugger is that one? I've never seen it...

by mic_ on 2009-12-03 (#53404)

Quote:
What debugger is that one? I've never seen it...

That's from Dualis - my Nintendo DS emulator (which I haven't worked on for the past 2 years).

Anyway, back on topic. I made this suggestion to byuu because it's nice to be able to scroll around in the code (or Ctrl-G to a specific address) and see the instruction in the context of its neighboring instructions, instead of just each one individually as you step forward. Plus it makes it a hell of a lot easier to determine where to place breakpoints.

And no, I don't care at all if it would fail for the I/O register areas. ROM is prio 1, 65816 WorkRAM is prio 2, anything else is just a bonus.

by byuu on 2009-12-03 (#53419)

tepples wrote:
This lack of a correct answer is byuu's entire point

Exactly right :D

I really don't want to seem rude, but if you double-check my post, I even explained why it's not simple to track P and a potential workaround that eats lots of memory, as you were suggesting.

tepples wrote:
That's why I recommended limiting the scope of disassembly-around-PC to the common cases of WRAM and ROM, which more easily admit a side path to fetching without side effects.

Indeed, that's great advice. I do exactly that, block any potentially state-changing reads or writes, including from my indirect -> direct address translation. Worst case is your debugger gives you a weird result. If the user applies a tiny bit of thought, they'll understand why.
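A side-effect-free debugger read along those lines might be sketched in Python as follows. This is not bsnes's implementation: the function name is made up, the mapping assumes a plain LoROM cartridge, and ROM visible in banks $40-$7D is left out for brevity. Anything that is not WRAM or ROM returns `None` instead of touching a register.

```python
def debugger_peek(addr: int, wram: bytes, rom: bytes):
    """Read a byte for the disassembler without touching I/O registers.

    Returns None for any address that is not plain WRAM or (LoROM) ROM.
    """
    bank, offset = addr >> 16, addr & 0xFFFF
    if bank in (0x7E, 0x7F):
        # Full 128 KiB of work RAM.
        return wram[((bank - 0x7E) << 16) | offset]
    if (bank & 0x7F) < 0x40:
        if offset < 0x2000:
            # Low 8 KiB of WRAM is mirrored into these banks.
            return wram[offset]
        if offset >= 0x8000:
            # Assumed LoROM layout: 32 KiB of ROM per bank.
            rom_addr = ((bank & 0x7F) << 15) | (offset & 0x7FFF)
            return rom[rom_addr] if rom_addr < len(rom) else None
    # PPU/CPU/DMA registers, expansion chips, open bus: refuse to read.
    return None


wram = bytearray(0x20000)
rom = bytes(range(256)) * 256  # 64 KiB of dummy ROM
wram[0x10] = 0xAB
print(debugger_peek(0x7E0010, wram, rom))  # 171
print(debugger_peek(0x002180, wram, rom))  # None
```

The disassembler would then show a placeholder for bytes it cannot read safely, which matches the "worst case is a weird result" tradeoff described in the post.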

I do need to extend it a little better though, it doesn't currently block on DSP-n memory-mapped commands. I should specialize the function based on the active special chips.

by koitsu on 2009-12-03 (#53465)

Quote:
This lack of a correct answer is byuu's entire point: the length of an instruction depends on the MX bits (WDC used "ENVMXDIZC" to label the nine bits of P), which in turn can depend on things that are harder to determine in advance than on an NES. The compromise solution would have only parts of the debugger, not the emulation core, glitch in such a case.

You'll have to show me proof that tracking M/X of P is simply not sufficient enough. I'm having a very hard time believing otherwise.

With regards to the last sentence: you'll have to explain what you mean by "glitch". There's really nothing glitchy about it -- you'd have to modify the output (disassembly) every time M/X of P is changed, which is why I recommended only disassembling certain areas around PC.

This is exactly how GSBug back in the Apple IIGS days did it, and it worked wonderfully. None of us had any complaints. There were always conditions where you'd scroll around to examine areas of RAM which hadn't executed yet and get inaccurate results -- but that's just the nature of the beast. Anyone working on this processor should know that ahead of time, else send their complaints to /dev/null.

There's no easy way to describe what I'm trying to say with text, I suppose. Basically the section of code which was disassembled (visually on the screen) was based on what M/X of P was at the time. If the executing code changed either bit, the disassembly shown on-screen would change.

I'd have to set up my IIGS and get GSBug on there to show you exactly what I'm talking about. Once you show GSBug to someone who's familiar with the 65816 (but hasn't ever used said debugger), they go totally berserk with joy. It's a wonderful tool.

Quote:
That's why I recommended limiting the scope of disassembly-around-PC to the common cases of WRAM and ROM, which more easily admit a side path to fetching without side effects.

Yep, I'd agree with this as well. Chances are the majority of debugging will be done on code executing in those areas.

by byuu on 2009-12-04 (#53466)

Quote:
You'll have to show me proof that tracking M/X of P is simply not sufficient enough. I'm having a very hard time believing otherwise.

If you mean TRaCER-style prediction based purely on disassembly, I'm definitely not going to go there. It is much too error-prone, and I say that as a ROM hacker who used to use TRaCER (and Louis Bontes' ISDA) quite a bit back in the day.

But you're right that if we track MX during execution, we can do a great job of forward disassembly. Add in "is this offset a known opcode" tracking and we can include backward disassembly.

I'm probably just going to merge two of my goals in one with this. I'll keep track of read/write/execute/coprocessor execute/M/X in a 16MB table to represent the full 24-bit address bus.

This will be a useful tool for analysis of existing game code, eg to find unused sections of RAM for your inserted routines. It can also be used directly by intelligent disassemblers to aid those who make things like those SMB1 complete disassembly listings.
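As a sketch of how such a table could be used to find free RAM, the Python below keeps one flag byte per address across the 24-bit bus and scans a range for runs that were never touched. The bit assignments and the `unused_ranges` helper are assumptions made for illustration; the thread does not specify the real layout.

```python
READ, WRITE, EXEC = 0x01, 0x02, 0x04  # hypothetical bit layout

usage = bytearray(1 << 24)  # one byte per address on the 24-bit bus (16 MiB)


def unused_ranges(table, start: int, end: int, min_len: int = 1):
    """Return (first, last) runs in [start, end) where no flag was ever set."""
    runs = []
    run_start = None
    for addr in range(start, end):
        if table[addr] == 0:
            if run_start is None:
                run_start = addr
        else:
            if run_start is not None and addr - run_start >= min_len:
                runs.append((run_start, addr - 1))
            run_start = None
    if run_start is not None and end - run_start >= min_len:
        runs.append((run_start, end - 1))
    return runs


# Pretend the game wrote to $7E0000-$7E000F and read $7E0020.
for a in range(0x7E0000, 0x7E0010):
    usage[a] |= WRITE
usage[0x7E0020] |= READ

for first, last in unused_ranges(usage, 0x7E0000, 0x7E0040, min_len=8):
    print(f"${first:06x}-${last:06x}")
# $7e0010-$7e001f
# $7e0021-$7e003f
```

A range only counts as unused for the code paths that actually ran, so the longer and more varied the play session, the more trustworthy the result.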

by tepples on 2009-12-04 (#53489)

koitsu wrote:
You'll have to show me proof that tracking M/X of P is simply not sufficient enough.

Load I/O register, pha, plp, as in byuu's example. That makes MX sufficiently unpredictable. But I agree that most non-pathological code shouldn't need it; tracking rep and sep should cover the common case of reading ahead to code that hasn't executed yet. It's again the tradeoff between compatibility (handling the common cases well) and accuracy (handling unknown cases well).

Quote:
There's no easy way to describe what I'm trying to say with text, I suppose. Basically the section of code which was disassembled (visually on the screen) was based on what M/X of P was at the time.

And by "glitch", I was just referring to the kind of full-screen change one sees in GSBug when MX changes. Even a disassembler that tracks rep and sep might "glitch" after executing a plp. But how did GSBug know where to start disassembling to provide a list of instructions before PC?

by byuu on 2009-12-06 (#53607)

Added read/write/execute/M/X tracking to the latest WIP, which allows disassembly in both directions, with the caveat that code has to be executed at least one time. Screenshot:
http://byuu.org/images/bsnes_20091206.png

I'm going to add this same support for the S-SMP, SuperFX and SA-1 eventually. I don't really intend to add a scrollbar as that's obviously not going to work well since we always have to start from the current offset. I may add a manual offset seek function though. For now it updates when you step the CPU.

I'll link to a WIP that has it in the 21fx thread.

by MottZilla on 2009-12-06 (#53631)

Would this allow what I was asking you about before: FCEUX-style generation of a table of ROM accesses (execute vs. data) to help with disassembly? Just curious if that may be down the road; I know you're busy with 21fx right now.

by byuu on 2009-12-06 (#53633)

Yes, usage.bin can be directly used to:

- assist a full-fledged disassembler (though it can't help with non-executed areas; that would have to fall back on heuristics)
- identify code vs data areas
- find unused portions of WRAM and SRAM for use with hacks
etc.

by mic_ on 2010-05-17 (#61522)

I compiled the BSNES sources on Ubuntu today, but I can't find the debugger anywhere. Is there a flag I need to set when building BSNES in order to get the debugger included?

by MatthewCallis on 2010-05-17 (#61525)

mic_ wrote:
I compiled the BSNES sources on Ubuntu today, but I can't find the debugger anywhere. Is there a flag I need to set when building BSNES in order to get the debugger included?

Yeah, it's in one of the main header files, can't remember the name.

by mic_ on 2010-05-17 (#61532)

Adding -DDEBUGGER to the flags in the makefile did the trick.

by mic_ on 2011-01-03 (#72252)

I've got another idea for an improvement to the debugger: it'd be nice if bsnes could read the SYM files generated by wlalink, so that I could set breakpoints on symbols and get label names when I step/trace my code. You can generate a SYM file by adding a capital S to your link switches (IIRC).. they're just text files with address/name pairs in plain text.
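A loader for such a file could be as small as the Python sketch below. It assumes each label line has the form `bank:addr name` in hex under a `[labels]` section header, with `;` starting a comment. That matches the "address/name pairs in plain text" description, but the exact layout should be checked against a real wlalink output before relying on it. The function name `parse_sym` is made up.

```python
def parse_sym(text: str) -> dict[str, int]:
    """Parse 'bank:addr name' label lines into {name: 24-bit address}."""
    labels = {}
    section = "labels"  # files with no section header are treated as labels
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip().lower()
            continue
        if section != "labels":
            continue
        parts = line.split(None, 1)
        if len(parts) != 2 or ":" not in parts[0]:
            continue
        bank, addr = parts[0].split(":", 1)
        try:
            labels[parts[1].strip()] = (int(bank, 16) << 16) | int(addr, 16)
        except ValueError:
            continue  # not a hex address pair; skip the line
    return labels


sample = """; sample symbol file
[labels]
00:8000 Start
01:9abc VBlank
[definitions]
00000010 SOME_DEF
"""
syms = parse_sym(sample)
print(hex(syms["Start"]), hex(syms["VBlank"]))  # 0x8000 0x19abc
```

With a name-to-address map like this, a debugger could accept a label wherever it accepts a breakpoint address, and an inverted copy of the map would give label names while stepping.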

by magno on 2011-01-11 (#72585)

Is the bsnes debugger version still in progress? Or was it dropped?

I can't see the binaries anymore on byuu's website...

by byuu on 2011-01-13 (#72710)

It needs GTK+ for the hex editor widget, but it is otherwise there and working.

by magno on 2011-01-14 (#72727)

byuu wrote:
It needs GTK+ for the hex editor widget, but it is otherwise there and working.

That's great news for me! I thought debugging was no longer in the Windows version, as you said on your webpage:

Quote:
Lastly, the debugger is still Linux-only[...]

by tepples on 2011-01-14 (#72733)

Unless "works on Windows" means "feel free to install Ubuntu in VirtualBox on Windows".

by magno on 2011-01-14 (#72736)

So that means there is no debugging in the Windows binaries anymore, does it?

If so, could I compile the source code with -debugger option in Visual C or GCC for Windows?