arrangement of pattern tables in use by sprites

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.

arrangement of pattern tables in use by sprites
by GradualGames on 2009-01-01 (#41213)

disclaimer: major newbie alert (well...newbie to NES but definitely not to programming or assembly language) so forgive me if my post is hard to understand or misuses any terminology..I'll try to work with you on subsequent posts...

How do NES games generally organize their game character's sprites and animations? Go ahead and answer me now if you like, but what follows is my own guess (and how I would probably first attempt to do it once I get around to writing a game):

At this point in my research, having not yet studied much other than docs and an example program that manipulates a single sprite, my guess would be that game characters are organized as follows in the pattern table:

As a pseudo code example, say our character is 4 tiles wide and 4 tiles high, and has two frames of animation:

-each number is a row of 8x8 pixel tiles, where each repetition of that number is a single 8x8 pixel tile along that row. (and thus represents all 16 bytes of that tile)

;this is the pattern table
org:
;animation frame one (left foot in front of right for example)
1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4
;animation frame two (right foot in front of left for example)
1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4

And then, the programmer creates sixteen sprite entries and creates code for animation timing, and displays all of the patterns from the first frame, and then all the patterns from the second frame, and calculates offsets (for x y for all 16 sprites) on the fly from the top left of the game character himself.
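A minimal sketch of the "compute the offsets on the fly" idea described above, written as a Python model rather than 6502 code. The function name `grid_sprites`, its parameters, and the assumption that each frame's 16 tiles sit consecutively in the pattern table are illustrative, not taken from any particular game.

```python
# Hypothetical model: build OAM entries for a grid-shaped character on the fly.
# Each entry is (y, tile, attributes, x), the byte order of an NES OAM entry.

def grid_sprites(origin_x, origin_y, first_tile, cols=4, rows=4, attr=0):
    entries = []
    for row in range(rows):
        for col in range(cols):
            tile = first_tile + row * cols + col   # tiles stored row by row
            x = (origin_x + col * 8) & 0xFF        # 8-bit screen coordinates
            y = (origin_y + row * 8) & 0xFF
            entries.append((y, tile, attr, x))
    return entries

# Frame two is assumed to start right after frame one's 16 tiles.
frame_one = grid_sprites(100, 80, first_tile=0x00)
frame_two = grid_sprites(100, 80, first_tile=0x10)
```

Switching animation frames then only changes `first_tile`; the 16 positions stay the same.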

That's my guess at this point---and from what I've read and understand, this is definitely one way of doing it...

Regards and Thanks,
-Zom
nes noob

by AWJ on 2009-01-01 (#41216)

There's no "standard" convention to pattern table layout. It depends on what size and shape (square, rectangular, or irregular) of meta-sprite you're using, how complex your animations are, and of course on what hardware sprite size (8x8 or 8x16) you're using. If you look at five different commercial games you'll probably find five completely different implementations, and quite likely multiple methods in the same game (e.g. in a platformer like Ninja Gaiden, powerup items and small enemies could be simple 2x2 tile squares while the player and larger enemies could be irregularly-shaped)

by Celius on 2009-01-01 (#41217)

Here's how I do mine.

Most objects (entities in the game world, e.g. enemies, the character, bullets, swinging platforms, anything that has some sort of intelligence) are made of sprites, like you were saying. Some can be made with the BG, like really really big bosses, but most are made with sprites.

All sprite objects in my game are placed on the screen relative to the coordinates of the top-left corner of their "box". Their box is just a rectangle that encompasses the whole object. So in ROM, I have tables which define placements of sprites relative to this coordinate. They may look something like this:

Code:
ObjectXAnimation1:
.db 4    ;Object is made of 4 sprites
.db 0,$40,$02,0    ;Relative Y, Tile ID, Attribute, Relative X for sprite 1
.db 0,$41,$02,8    ;Relative Y, Tile ID, Attribute, Relative X for sprite 2
.db 8,$42,$02,0    ;... Sprite 3
.db 8,$43,$02,8    ;Sprite 4

So that says the object is to be drawn with 4 sprites, where all the relative Xs and Relative Ys are just added to the object's top left corner coordinates to determine their placement. So if the object's top left corner is located on screen at 155, 32, the first sprite will be at 155, 32, the next will be at 163, 32, the next will be at 155,40, then the last one will be at 163,40.

The beauty of doing it this way is you can have sprites made out of any tiles anywhere (so long as it's not more than 255 pixels away from the object's coords, because the relative offsets are measured using 8 bits). You can also have a large object and only use like 2 sprites if you need to; you don't need to put a bunch of empty space into the pattern tables.
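As a rough illustration of the table format above, here is a Python model (not 6502 code) that expands one such table into sprite entries. The function name `draw_metasprite` is made up for this sketch, and coordinates are simply wrapped to 8 bits.

```python
# Model of the table format above: a sprite count, then one
# (relative Y, tile ID, attribute, relative X) group per hardware sprite.
OBJECT_X_ANIMATION_1 = bytes([
    4,
    0, 0x40, 0x02, 0,
    0, 0x41, 0x02, 8,
    8, 0x42, 0x02, 0,
    8, 0x43, 0x02, 8,
])

def draw_metasprite(table, object_x, object_y):
    """Return OAM-style entries (y, tile, attr, x) for one metasprite."""
    count = table[0]
    oam = []
    for i in range(count):
        rel_y, tile, attr, rel_x = table[1 + i * 4 : 5 + i * 4]
        oam.append(((object_y + rel_y) & 0xFF, tile, attr,
                    (object_x + rel_x) & 0xFF))
    return oam

sprites = draw_metasprite(OBJECT_X_ANIMATION_1, 155, 32)
```

With the object at 155, 32 this yields the four positions worked out in the post: (155, 32), (163, 32), (155, 40) and (163, 40).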

great responses
by GradualGames on 2009-01-01 (#41218)

Thanks for your responses. Yes, I assumed there were probably many ways of implementing this. Celius, my guess was to calculate the relative x and y on the fly---but that's a big waste of precious cpu cycles isn't it? I like your table technique. I'll try something like that when I make a game.

Regards,
-Zom

Re: great responses
by tokumaru on 2009-01-02 (#41222)

ZomCoder wrote:
Celius, my guess was to calculate the relative x and y on the fly---but that's a big waste of precious cpu cycles isn't it?

I use the same method Celius mentioned, because this is the most versatile way I can think of. If your game only uses perfectly aligned grids of sprites, you could save ROM and CPU cycles by defining a single pair of relative coordinates, the dimensions of the grid and then only tile IDs and attributes for each of the sprites. It really depends on how complex your graphics are. You might even decide to use a single palette for the whole metasprite, something that will save even more ROM.

A good alternative would be to have 2 sprite rendering routines, one that deals with the more versatile complex sprites, and another to handle simpler grid sprites. Since the calls to the drawing routines usually are made from the code of each object, depending on their complexity you call one or the other, and then you won't be wasting cycles.

Now, about animations, they can also be implemented with simple tables. Each entry in the table should consist of a pointer to the sprite definition to use and a frame count indicating how long it should be displayed. A flag could be used to indicate the end of the animation. Each object will need to have a pointer to the animation frame it is currently using, and a counter (copied from the frame duration table) that is decremented each frame and causes the animation to advance when it reaches 0.
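A small Python model of the animation table just described. The table contents, the use of `None` as the end flag, and the choice to loop back to the first entry are assumptions made for this sketch; a real engine would store pointers and a flag byte instead.

```python
# Hypothetical animation table: (metasprite name, duration in frames).
# None marks the end of the animation; this sketch loops back to the start.
WALK = [("walk_frame_1", 8), ("walk_frame_2", 8), None]

class Animator:
    def __init__(self, animation):
        self.animation = animation
        self.index = 0                      # "pointer" to the current entry
        self.counter = animation[0][1]      # copied from the duration field

    def current_metasprite(self):
        return self.animation[self.index][0]

    def tick(self):
        """Call once per video frame."""
        self.counter -= 1
        if self.counter == 0:
            self.index += 1
            if self.animation[self.index] is None:   # end flag reached
                self.index = 0
            self.counter = self.animation[self.index][1]
```

Each object would own one `Animator`, call `tick()` once per frame, and hand `current_metasprite()` to its drawing routine.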

by Celius on 2009-01-02 (#41224)

I was thinking of making a special drawing routine for 2x2 objects, because there are lots of them and it might be quicker as they don't have a variable size. Another thing to do is make an unflippable-object drawing routine where every object is drawn as is. They cannot be flipped, saving you tons of cycles. Basically you just add the coordinates and store.

One thing you'll definitely want to make is a one-sprite-object drawing routine. If an object is made of one sprite only, you should be able to get away without wasting cycles like you would in the really complex object drawing method.

by UncleSporky on 2009-01-06 (#41432)

I was just thinking about this and Celius's way is the most obvious and best way of doing it, as far as I can see. I would've done that without seeing it, I like organizing things that way.

Something else just hit me as I was playing around: are all 64 sprites constantly in use? I store 00 in the whole block of sprite data, and when I accidentally put some graphics in ID 0, I noticed that it showed up in the corner of the screen. Of course it would, with 0 X, 0 Y...

So logically they're all constantly in use? It's no problem keeping graphics out of ID 0, I was just wondering. It's not very convenient to fill SPR-RAM with anything other than 0s because it would mess up the attributes.

by Celius on 2009-01-06 (#41433)

UncleSporky wrote:
It's no problem keeping graphics out of ID 0, I was just wondering. It's not very convenient to fill SPR-RAM with anything other than 0s because it would mess up the attributes.

Falsitude! Yeah, I know that's not a word.

If you fill unused sprites with $FF, they will not be displayed, because they will be at Y = 255, and there are only 240 pixels vertically on the screen (224 for NTSC). They will always be hidden. If they're hidden, it doesn't matter what data they contain, as they aren't visible.

And also, all 64 sprites are renderable. There's no way to say disable sprite #5 from being rendered at a variable location. But you can do tricks to disable sprite rendering mid frame to not render sprites with coordinates past that scanline.
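A tiny Python model of the $FF fill, as a sketch only. It assumes a 256-byte shadow copy of sprite RAM (64 entries of 4 bytes, Y first) and uses the simplified rule that a Y byte of 240 or more can never land on one of the 240 rendered scanlines.

```python
SCREEN_HEIGHT = 240  # scanlines rendered by the PPU

def clear_shadow_oam():
    """Fill all 64 four-byte sprite entries with $FF."""
    return bytearray([0xFF] * 256)

def is_hidden(oam, sprite_index):
    # Assumption of this sketch: a sprite whose Y byte is at or past the
    # bottom of the picture never intersects a visible scanline.
    return oam[sprite_index * 4] >= SCREEN_HEIGHT

oam = clear_shadow_oam()
```

Filling with 0 instead leaves every entry at Y = 0, X = 0, tile 0, which is why graphics placed in tile 0 showed up in the corner of the screen.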

by UncleSporky on 2009-01-06 (#41436)

Celius wrote:
UncleSporky wrote:
So logically they're all constantly in use? It's no problem keeping graphics out of ID 0, I was just wondering. It's not very convenient to fill SPR-RAM with anything other than 0s because it would mess up the attributes.

Falsitude! Yeah, I know that's not a word.

If you fill unused sprites with $FF, they will not be displayed, because they will be at Y = 255, and there are only 240 pixels vertically on the screen (224 for NTSC). They will always be hidden. If they're hidden, it doesn't matter what data they contain, as they aren't visible.

Oh, so the NES doesn't care if the attributes contain junk data if it's off the screen...good to know.

I just adopted your method for the first time and now I have a headache trying to figure out how to implement it. I sort of know the assembly, it's just putting it together...

How do I add the metadata value to the sprite's Y while maintaining a nice loop...should I use all the registers, or even a temp variable? That seems like the easy way out.

Code:
ldx #0
ldy #4 ;object of 4 sprites
Repeat:
lda MetaData, x
<increment data at (SpriteTable, x) by a>
inx
lda MetaData, x
sta SpriteTable, x
inx
lda MetaData, x
sta SpriteTable, x
inx
lda MetaData, x
<increment data at (SpriteTable, x) by a>
inx
dey
bne Repeat

Hmm...

by Disch on 2009-01-06 (#41438)

Celius wrote:
(224 for NTSC)

Somewhat misleading.

NTSC NES's render a full 240 scanlines, not just 224. The 'extra' scanlines are visible on some TVs, and even if they're not, the game creator should not assume they're invisible.

by Celius on 2009-01-06 (#41439)

Oh, you should know that these routines eat up many many scanlines lots of the time. My routine is pretty long. It's not a small little loop like that, because I have to take into consideration flipped sprites, attributes, and 16-bit coordinates.

There's no one answer "how to do it". What you want to do is develop a routine that can be fed these values:

Address of Animation Table (16 bits)
Y coordinate (probably 16-bit)
X coordinate (probably 16-bit too)
Flip Status (Usually like an attribute value for the whole object, 8 bits)

I have mine set up a little differently, so I feed 8 bytes into the routine to draw an object. With that system above, you'd feed 7.

Since there are 4 values (most likely 7 bytes), you'll definitely want to be using temp values and all registers to make the most time efficient loop possible. Again, these will probably take a very long time to execute.

EDIT: Oh yes, and what Disch said is true. All 240 scanlines are rendered, just not displayed on lots of NTSC TVs. So any Y value for sprites more than 240 is safe for hiding them.

by Bregalad on 2009-01-07 (#41448)

Since the address of the table is always in the $8000-$FFFF range, I use bit 15 of the address for horizontal flip status.

Oh and yeah, Celius' way of doing it really seems like the standard. I used another format at first, but I eventually modified it so that it is exactly what Celius is suggesting. It's the best you can think of (it may waste a little ROM space though, but it's so flexible you don't care).

Also, I don't draw sprites from their AI routine, but from a dedicated routine. The AI only tells the number of the animation I should draw. I do this for obvious sorting purposes: the sprites I need to draw must be sorted in some order, and I run the objects' AI in a constant order.

The main limitation of my way of doing it is that 1 object = 1 metasprite. I'd like to have a ghost with flames rotating around him; if I put all the rotations in the same metasprites it will eat up a lot of ROM space defining all of these.
I still have an engine to draw explosions that are always on topmost priority and that aren't objects (not directly at least). I could tweak it to draw other stuff as additional explosions.

I could re-arrange my engine so that the sort applies to the AI, which would automatically make the sprites come out in the same order. That wouldn't be a bad idea; in fact I just didn't have it.

by UncleSporky on 2009-01-07 (#41449)

Yeah, the dumb routine I wrote is a terrible hack way to do it that doesn't take anything important into account, I just wanted to see if I could get it to do something. I'm actually glad to hear that it ends up being a complex and slow operation, since I'm most worried that I'll code that way when a short little routine would suffice.

by Celius on 2009-01-07 (#41463)

My engine for drawing metasprites doesn't know anything about objects really. At max, there can be 32 metasprites on screen at once, since there is a 256 byte buffer that holds 8 byte chunks for each metasprite. This buffer kind of acts like a stack, where you put the information for one metasprite on, then in the next "slot" you put the info for the next metasprite. In an enemy's AI code, it could place whatever it wants into the stack. An object could display itself as 2 or 3 metasprites if it wanted to, or even as none. Though this might be stupid, unless it's some invisible switch that triggers an event.

The great thing about this is that an object could actually put whatever it wants to as its coordinates or its graphics. There aren't any decisions that aren't made by the objects themselves. If an object wanted to, it could place its graphics at 23 pixels south of its own coordinates, which might come in handy for just a few special scenarios.

EDIT:
UncleSporky wrote:
I'm actually glad to hear that it ends up being a complex and slow operation, since I'm most worried that I'll code that way when a short little routine would suffice.

Yeah, but eventually you'll be able to nicely optimize most of your code. Actually, lots of the time, I find myself optimizing code conceptually instead of just focusing on the routine I've written, scraping by for cycles. I think optimization could be categorized to Micro-optimization and Macro-optimization (kind of like microeconomics and macroeconomics). In Micro-optimization, you optimize the routine by rearranging some instructions, nickel and diming for cycles here and there. Like here:

lda #$23
sta $334
lda #$23

An example of "micro-optimization" would be to eliminate the last 'lda #$23'. Just little things like that.

But with Macro-optimization, you focus on the big picture of things. You focus on the concepts behind your code, and optimize those, usually causing you to completely rewrite routines. For example, I might decide to split an RLE compressed map into 4 quadrants so I wouldn't have to work through so much RLE to find a specific value (DW did this, I believe). This is where you REALLY save cycles.

By the way, I totally just made up the whole "micro" and "macro" thing, so those aren't real terms. Though I think they're good to describe what I'm talking about.

by Bregalad on 2009-01-07 (#41464)

Celius wrote:
My engine for drawing metasprites doesn't know anything about objects really. At max, there can be 32 metasprites on screen at once, since there is a 256 byte buffer that holds 8 byte chunks for each metasprite. This buffer kind of acts like a stack, where you put the information for one metasprite on, then in the next "slot" you put the info for the next metasprite.

My, this is a really great idea. Can I plagiarize it? Oh, seriously, I was thinking of doing something like that, but I already changed my game engine extensively once. I don't know yet what I'll do. My engine is perfect as long as each object is either 0 or 1 metasprite, but it can't be more (although objects can decide to display explosions above them).

It's not stupid to have invisible enemies. I have a level which is not terminated by a boss. In order to terminate the level, I have an invisible enemy next to the exit, and if the player steps on him, instead of the player getting damaged, it just sets the stage end flag, like a normal boss AI does when the boss is dead. However, by doing that I waste some ROM space, as I have to specify an additional enemy and specify dummy values for its strength and score reward values. But the size of the enemy and his sprites make sense (its sprite is blank).

So yeah, for me each enemy = 1 metasprite. If I were to do a stack system (or a FIFO really, it doesn't really matter which one it is), it would be more complicated and use more RAM, just to optimize things for maybe 3 enemies who would use more than 1 metasprite in the entire game, and some invisible "enemies" that wouldn't need a dummy pointer to a metasprite which consists of zero sprites. On the other hand I have a lot of free RAM and this method sounds flexible. I'd still need to find a way to sort sprites in a certain manner, because unlike most 2D games, the order of sprites in my game matters because it is top-down view and not side-scrolling.

Also, filling and emptying the buffer each frame seems like a waste of CPU time. Another method would be to have objects know which metasprite slot(s) they take and only change what is necessary. Although that sounds like it would cause a lot of bugs doing it that way.

My current game engine allows 8 enemies, 1 player and 8 explosions, which makes, I guess, 17 metasprites in total. There is no way you'd need 32 of them, I guess.

by Celius on 2009-01-07 (#41466)

Oh, I forgot to mention for my buffer I have a variable outside of the page that contains how many object slots are being used. So I don't need to clear anything, because if my routine knows I have only 3 metasprites on screen, it will know to only read the first 24 bytes (8 bytes a piece, 3 metasprites). Every frame, this "counter" for the number of objects on screen is reset after all metasprites are drawn. Then whenever a metasprite is put into a new slot, that counter is incremented.
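A Python sketch of the buffer-plus-counter arrangement described above. The 8-byte slot contents are left opaque because the post does not give the exact field layout; the class and method names are invented for illustration.

```python
SLOT_SIZE = 8
BUFFER_SIZE = 256
MAX_SLOTS = BUFFER_SIZE // SLOT_SIZE   # 32 metasprites

class MetaspriteBuffer:
    def __init__(self):
        self.data = bytearray(BUFFER_SIZE)
        self.count = 0                 # kept outside the 256-byte page

    def push(self, slot_bytes):
        """Queue one metasprite; returns False when all 32 slots are taken."""
        if self.count == MAX_SLOTS or len(slot_bytes) != SLOT_SIZE:
            return False
        start = self.count * SLOT_SIZE
        self.data[start:start + SLOT_SIZE] = slot_bytes
        self.count += 1
        return True

    def draw_all(self, draw):
        """Draw only the used slots, then reset the counter for next frame."""
        for i in range(self.count):
            draw(bytes(self.data[i * SLOT_SIZE:(i + 1) * SLOT_SIZE]))
        self.count = 0
```

Because only `count` slots are ever read, stale bytes left in the buffer from earlier frames are harmless and nothing needs to be cleared.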

If you want, you can use the idea; I'd be happy knowing that I had a legitimately good concept that other people would like to use. But I believe you if you say you also came up with a similar system.

My game might need more objects because there are lots of spells cast, fireballs thrown, axes thrown, basically lots of independently moving metasprites. But I decided on this system because it was really flexible, and that is my number 1 goal in programming a game, flexibility.

EDIT: About the invisible object, you're right that it's not stupid. It could be stupid though, if executed stupidly, but of course, anything could be stupid if executed stupidly. I actually think I'll be having invisible objects that act as event switches, like your enemy at the end of the level. Though since I have unique code for every room in my game, I could just hard code event switches into that.

by tepples on 2009-01-07 (#41467)

Bregalad wrote:
Celius wrote:
My engine for drawing metasprites doesn't know anything about objects really. At max, there can be 32 metasprites on screen at once, since there is a 256 byte buffer that holds 8 byte chunks for each metasprite. This buffer kind of acts like a stack, where you put the information for one metasprite on, then in the next "slot" you put the info for the next metasprite.

My, this is a really great idea. Can I plagiarize it ?

Balloon Fight does something similar, except it can have only 9 "normal" active objects.

Quote:
It's not stupid to have invisible enemies. I have a level which is not terminated by a boss. In order to terminate the level, I have an invisible enemy next to the exit, and if the player steps on him, instead of the player getting damaged, it just sets the stage end flag

Then it shouldn't be called an "enemy"; it's a trigger. Invisible triggers are useful; invisible enemies that can damage the player, not as much.

by Bregalad on 2009-01-07 (#41469)

OK maybe not an enemy, but technically in my game engine there is no difference between that and a true enemy, other than what is inside the AI.

And I'll think about whether I continue to use my system or switch to the metasprite buffer idea. If I make a sidescrolling game where priorities don't matter at all, I'll definitely use that. But for my current game I think I'll work with the 1 object = 1 metasprite limitation, and find other tricks around that. Limitations aren't THAT bad, they are guidelines. Although being limited when you want to do something very cool that breaks the limit can be annoying.

by Celius on 2009-01-07 (#41474)

The one thing that is annoying about the way I physically draw the sprites in my game is that I do a quick-and-dirty priority shuffling method, so currently any visible sprite pixels that are layered will constantly flicker. But if I somehow got around that, the stack system wouldn't necessarily mean that all sprites drawn couldn't have priority. I understand though that you'd have to make some fixes to make sure that it is displayed correctly when the player is in front of an object or behind it, etc.

On a side note, being limited is the one thing I try to eliminate when programming a game. I can do almost anything I want with my game engine once I program it. It's a side scrolling platformer with rooms, and each room points to event code. If it's a really simple room, it might just point to "rts". But it's nice because I can have unique 6502 code for every room, and with 6502 code, well, a -lot- is possible. I could tell it to change the palette midway through the room, I could tell it to display the player upside down, I could tell it to display the sum of variables $0-$2E in the center of the screen, etc. Though the last two I mentioned were completely ridiculous, it's just an example of flexibility. It gets really really fun as a game designer when you know nearly anything (within the realm of reasonability) is possible.

by UncleSporky on 2009-01-11 (#41742)

I got Celius's method working, for the most part. Here is my current routine, for your consideration. A waste of cycles? Unclean and trashy? Let me know!

Code:
;an example of one metasprite this routine interprets
ExampleSprite:
.db 6
.db 0,$01,$00,0
.db 0,$02,$00,8
.db 8,$11,$00,0
.db 8,$12,$00,8
.db 16,$21,$00,0
.db 16,$22,$00,8

Code:
loadsprite:

;tmpada = address of sprite data
;tmp16x/tmp16y = 16-bit x and y of sprite's position, this will be culled to fit onscreen
;tmp8x = sprite flip values
;tmp8y = where to store this data in the sprite table
;sprY sprID sprAtt sprX = final data to be inserted into table

ldy #0
lda (tmpada),y ;load number of tiles in this sprite
beq +    ;if 0, jump to simple sprite routine

tax
- dex
ldy Table4,x ;y = x * 4
iny
lda (tmpada),y ;a = tile x's y offset
clc
adc tmp16y ;add y offset to low bits of y position
sta sprY
lda #0
adc tmp16y+1 ;finish the add
beq ++    ;move on if there's nothing in the high bits
lda #$FF    ;otherwise, this part of the sprite is offscreen
sta sprY

++ iny
lda (tmpada),y ;a = tile x's index
sta sprID

iny
lda (tmpada),y ;a = tile x's palette
ora tmp8x    ;merge palette with flip values
sta sprAtt

iny
lda (tmpada),y ;a = tile x's x offset
clc
adc tmp16x ;add x offset to low bits of x position
sta sprX
lda #0
adc tmp16x+1 ;finish the add
beq +++    ;move on if there's nothing in the high bits
lda #$FF    ;otherwise, this part of the sprite is offscreen
sta sprX

+++ txa
clc
adc tmp8y
tay
lda Table4,y ;4 * (x + sprite number) = OAM offset
tay
lda sprY    ;store all final values in shadow OAM
sta OAM,y
lda sprID
sta OAM+1,y
lda sprAtt
sta OAM+2,y
lda sprX
sta OAM+3,y

cpx #0    ;are we done with this sprite?
bne -
rts

+ nop    ;simple sprite loading not implemented
rts

Table4 is a short multiplication table of 4s, since I can't be bothered to write a multiplication routine now and it's faster anyway, as long as I can spare the bytes.

You'll notice I preserve x the whole time as a counter, and it actually counts backwards from the end of the metadata. Seemed like it might save a few instructions that way.

Now obviously, the biggest problem with this is that it does not take into account flipping the whole sprite, rather than just individual tiles! If you try to flip it you get a mess. How would I start fixing this? It's a pretty big problem. The main issue is deciding which tiles switch places; in 2x2, you just swap them side to side, but in 3x3 you preserve the center one and swap the far sides, and it gets more complex as you go.

I could duplicate my metadata with things positioned and offset correctly but that seems very wasteful to me...however it would save on a lot of cycles.

The easiest solution might be to store an extra byte with the sprite that somehow indicates a flipping routine.

by tepples on 2009-01-11 (#41744)

Multiplication by 4 is dead easy on a 6502. Replace this:
Code:
tay
lda times_four,y

with this:
Code:
asl a
asl a

to save two bytes and two cycles.

As for flipping, that's more a matter of turning x+=8 into x+=248 when you see bit 6 of the attribute byte (byte 2 of the sprite entry) turned on; most of the rest of the code stays the same.

by tokumaru on 2009-01-11 (#41747)

UncleSporky wrote:
Now obviously, the biggest problem with this is that it does not take into account flipping the whole sprite, rather than just individual tiles! If you try to flip it you get a mess. How would I start fixing this? It's a pretty big problem. The main issue is deciding which tiles switch places; in 2x2, you just swap them side to side, but in 3x3 you preserve the center one and swap the far sides, and it gets more complex as you go.

This is indeed a big problem, and, IMO, the main reason why drawing metasprites is slow. In my routine, that uses relative coordinates for each hardware sprite, I just invert the coordinate of each sprite if the metasprite is flipped (4 becomes -4). Of course I have to account for the fact that sprite coordinates represent the top left corner of the picture, but ideally I'd like them to represent the top right corner when they are flipped horizontally, and since that's just not possible, I fix the origin coordinate before drawing the individual sprites in case of flipping to compensate for that.

You did present a good idea though, of using different routines for the different flipping states. If you had 4 different sprite rendering routines, you'd get the maximum speed possible out of them. Many conditional branches within a repeated operation (such as outputting sprites) are big performance killers.
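One way to picture the "negate each offset, then fix the origin" approach is the Python model below. It is a sketch under stated assumptions: offsets are measured from the left edge of the metasprite's box, the box width in pixels is known, and hardware sprites are 8 pixels wide. The helper name `flip_horizontally` is made up.

```python
H_FLIP = 0x40  # bit 6 of the OAM attribute byte

def flip_horizontally(sprites, width):
    """sprites: list of (rel_y, tile, attr, rel_x); width in pixels (assumed known)."""
    fixed_origin = width - 8      # compensates for sprites being anchored top-left
    return [(rel_y, tile, attr ^ H_FLIP, fixed_origin - rel_x)
            for rel_y, tile, attr, rel_x in sprites]

row_of_three = [(0, 0x10, 0x00, 0), (0, 0x11, 0x00, 8), (0, 0x12, 0x00, 16)]
flipped = flip_horizontally(row_of_three, width=24)
```

For a row three tiles wide, the outer tiles trade places and the center tile keeps its offset, so no per-shape "which tiles swap" logic is needed.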

by Memblers on 2009-01-11 (#41750)

I didn't like the idea of flipping sprites, when I had that situation I just made walking the other way to be a separate animation.

by Celius on 2009-01-11 (#41752)

It can prove to be a big life saver if you have tons of different animations to handle flipping.

Oh, but one thing I will say is that it might indeed be a good idea to handle flipping with different routines. I think in my routine, I use a ZP variable for a flip byte, so I will draw something like this:

lda SpriteRelativeX
eor FlipX
clc
adc MetaSpriteXCoordLow
...

Where if FlipX is 0, the sprite will be drawn as is, but if FlipX is $FF, it will, like tokumaru does, invert the value so that values are (almost) multiplied by -1: EOR with $FF gives -x-1 rather than -x.

If you really wanted to flip a sprite with separate routines, you could do away with signed numbers all together. So in the unflipped drawing you'd do:

clc
lda SpriteRelativeX
adc MetaSpriteXLow

and in the other you'd do

sec
lda MetaSpriteXLow
sbc SpriteRelativeX

So you do additions in one, and subtractions in the other. Though using signed numbers makes for more universal code, it might end up saving you time to do straight up addition and subtraction of positive values.

by Bregalad on 2009-01-12 (#41762)

Quote:
It can prove to be a big life saver if you have tons of different animations to handle flipping.

Maybe a life saver, but a ROM waster for sure. In my game I don't want to waste any ROM, so I do automatic flipping horizontally (but not vertically, as you'd hardly flip whole sprites vertically anyway). I just use an EOR instruction to do this, so that if the sprite is already flipped horizontally in the definitions, it ends up not flipped. I had a little trick to compute the horizontal position, but it works 100% fine. I believe I had put much thought into it back then.

I will share my routine too and you guys can say what you think about it. It only uses 8-bit coordinates though. Oh yeah, I use another system which allows sprites to be automatically disabled when in a certain area; this is useful when drawing textboxes and you don't want sprites to be visible in them (I call this sprite clipping).

In my game this routine is copied into RAM, and the instructions in comments are regularly exchanged with their uncommented counterparts above them to get sprite cycling (the metasprite is copied either forwards or backwards). Having 2 versions of the routine in ROM would have been possible but more wasteful. I don't know if I could do anything better; if anyone has a great idea please share it.

Code:
SetupSpriteTable
sty PointerH
sta PointerL
ldy #$00
lda [Pointer],Y
beq _done    ;Make sure to skip if the sprite is transparent

sta SpriteCtr    ;This is the # of sprites in this frame
sbc #$03
nop
; asl A
; asl A
; sbc #$03
; tay
_loop
lda SpriteDMAFlag    ;Make sure that the buffer isn't already full
bne _done

iny
lda [Pointer],Y    ;$$
clc
adc SpritePosV
sta SpriteBuffer.PosV.w,X ;Set the vertical position

iny
lda [Pointer],Y    ;$$ ;Tile number
sta SpriteBuffer.TileNmr.w,X

iny
lda SpriteHFlipFlag
lsr A
and #$40
eor [Pointer],Y    ;$$ ;Read attributes
eor SpriteGlobalColor    ;Main palette adder
sta SpriteBuffer.Palette.w,X

iny
lda [Pointer],Y    ;$$ ;Add relative horizontal offset
bit SpriteHFlipFlag
bpl +
eor #$ff
sec
sbc #$07
+ cmp #$80
bcc +
clc
adc SpritePosH
bcc _skip       ;Skip if underflow if offset is negative
bcs _noskip
+ adc SpritePosH
bcs _skip       ;If positive offset skip if overflow
_noskip
sta SpriteBuffer.PosH.w,X ;Store definite H position

bit SpriteClippingFlag
bpl _noClipping       ;Check if there is clipping

cmp SpriteClippingHMin
bcc _noClipping
cmp SpriteClippingHMax
bcs _noClipping       ;Check if the sprite is horizontally in the clipped window

lda SpriteBuffer.PosV.w,X
cmp SpriteClippingVMin
bcc _noClipping       ;Check if the sprite is vertically in the clipped window
cmp SpriteClippingVMax
bcc _skip

_noClipping
inx
inx
inx
inx          ;Process to next sprite in memory index
bne _skip       ;Avoid overflow
inc SpriteDMAFlag    ;This will be set if the buffer is full
_skip
tya
sec
dec SpriteCtr
nop
bne _loop
; tya
; sec
; sbc #$08
; tay
; bcs _loop
_done
rts

by Celius on 2009-01-12 (#41764)

Oh wow, that was really misleading sorry. I meant to say that it's a life saver to handle flipping for animations if you have tons of different animations. I didn't mean it's good to use tons of different animations to handle flipping. Sorry about that.

It's a little late and I can't quite follow the routine without looking at it some more, but it looks like if you separated flipped and unflipped sprite drawing, along with having 8 bit coords, you would be able to have really time saving routines. Though it might take a little more space.

by Bregalad on 2009-01-12 (#41765)

Quote:
It's a little late and I can't quite follow the routine without looking at it some more, but it looks like if you separated flipped and unflipped sprite drawing, along with having 8 bit coords, you would be able to have really time saving routines. Though it might take a little more space.

It's fun you mention it's a bit late, because here it's 12 AM.

And I didn't really separate flipped and unflipped sprites. If the sprite is flipped I just use an eor #$ff, and for some reason I subtract 7 afterwards. I guess it's what worked best.
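The subtraction of 7 has a simple arithmetic reading, sketched here as a Python model of the 8-bit accumulator (an interpretation, not a statement about the author's intent). EOR with $FF turns x into -x-1, and subtracting 7 more gives -(x+8). A tile covering x to x+7 mirrors to -(x+7) to -x when reflected about the origin (point p goes to -p), so -(x+8) is one pixel left of the exact reflection.

```python
def flip_offset(rel_x):
    """Model of: eor #$ff / sec / sbc #$07 on an 8-bit accumulator."""
    a = rel_x ^ 0xFF          # one's complement: -rel_x - 1 (mod 256)
    a = (a - 7) & 0xFF        # SBC with carry set subtracts exactly 7
    return a

def to_signed(byte):
    """Interpret an 8-bit value as a signed offset."""
    return byte - 256 if byte >= 0x80 else byte
```

For example, an offset of 0 becomes -8 and an offset of 8 becomes -16.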

by Celius on 2009-01-12 (#41766)

I guess it would save time though to separate them, because you wouldn't have to do any EORing for drawing metasprites. You could just have one routine that does all adding for relative coordinates, and another that does subtracting for when it's flipped. This would take up a little more ROM space, but maybe a lot less time.

Oh, and by the way, it's 5:20 AM here. Usually my "late" is someone else's "early" (Actually, my dad just woke up, and I'm just going to bed, haha).

by tokumaru on 2009-02-05 (#42930)

Celius, I'm curious: How do you handle the high byte of the relative coordinates? I mean, I've seen your EOR trick to invert the low byte, but since the relative coordinate is added to a 16-bit number, it has to be extended to 16 bits as well (the high byte is either $00 or $FF).

Obviously, the high bytes of the relative coordinates are not stored in tables, because it'd be a waste to store all those $00's and $FF's, so I generate them based on bit 7 of the coordinates. The fastest way I found to generate it is the following:
Code:
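lda #$7f
cmp coord   ;Carry set if coord is $00-$7f (positive), clear if $80-$ff (negative)
adc #$80    ;$7f+$80+1 = $00 if positive, $7f+$80+0 = $ff if negative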

What makes this process slow is that I have to do this before the addition (because the carry is used), so the result must be stored in a temp location, and other small details. So I was wondering if you had found a better way to deal with this?
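
As a rough sketch of the sequence described above, the high byte is generated first and parked in a temporary, because both CMP and the low-byte ADC use the carry. The names ObjXLow, ObjXHigh, RelX, Temp, SpriteXLow and SpriteXHigh are hypothetical, not taken from anyone's engine. The sketch compares against $7f so that the carry ends up as the inverse of bit 7 of the coordinate.

```asm
; Sign-extend the signed 8-bit RelX, then add it to a 16-bit position.
lda #$7f
cmp RelX        ; carry set if RelX is $00-$7f, clear if $80-$ff
adc #$80        ; A = $00 for a positive offset, $ff for a negative one
sta Temp        ; must be saved: the addition below needs the carry

clc
lda ObjXLow
adc RelX
sta SpriteXLow
lda ObjXHigh
adc Temp        ; extended high byte plus carry from the low byte
sta SpriteXHigh
```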

by Celius on 2009-02-05 (#42936)

I'm sorry, I don't think I quite understand what you're asking. All of my relative coordinates are 8 bits in size, and unsigned. My current sprite drawing engine uses a temporary variable as the "high" byte of relativity in case the sprites are flipped, and I set all of this up before drawing. This is either $FF or $00 ($FF if flipped, $00 if not).

So I'll take that value, EOR the relative X coordinate with it and add it as the "high" coordinate. If the value is $00, it'll add $00 as the relative high, eoring the unsigned relative X coordinate by 0, thus doing nothing to the low part. If the value is $FF, it'll add that as the relative high, and invert the low value.

I set it up like this so I wouldn't have to do really any checks in the actual drawing code, but the initialization took like 200 cycles and I ended up adding all sorts of values for the sake of the code being universal.

I'm going to rewrite my code to have 4 different routines that will each be smaller, simpler, and faster. Each one corresponds with the flip status of an object:

00 - No flipping
01 - Horizontal flipping
10 - Vertical flipping
11 - Both vertical and horizontal flipping

So in the one without flipping, I can just add the coordinates as is. In the horizontal one, I will do an SBC instead of ADC for the X coordinates. For the vertical one, I'll do SBC instead of ADC for Y coords, and for the last one I'll do SBC instead of ADC for both.

In that case, I really won't need a high byte for the relative X. Well, it may be an immediate value like:

clc
lda XCoordLow
adc RelativeX
sta XCoordLow
lda XCoordHigh
adc #0
sta XCoordHigh
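
For comparison, here is a minimal sketch of what the horizontally flipped routine might look like under the same assumptions (unsigned RelativeX, same variable names as above). It is only a guess at the shape of such a routine, and it ignores any extra adjustment for the tile's own width.

```asm
; Flipped case: subtract the unsigned offset instead of adding it.
sec
lda XCoordLow
sbc RelativeX
sta XCoordLow
lda XCoordHigh
sbc #0          ; only propagates the borrow from the low byte
sta XCoordHigh
```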

Sorry if this doesn't at all answer your question. Please elaborate if it doesn't.

by tokumaru on 2009-02-05 (#42938)

Celius wrote:
Sorry if this doesn't at all answer your question.

This answers it. I was just unaware that your relative coordinates were unsigned. Mine are signed, so I have to calculate the high byte of each one, meaning I can't do it just once at the start.

My coordinates are signed because I want to be able to put sprites all around the reference coordinate. You probably have the reference coordinate at the top left corner of the sprite, as that is the only way I can think it would be possible to have unsigned relative coordinates, right?

by Celius on 2009-02-05 (#42939)

Yeah, all of my metasprites are defined based off of a top-left coordinate. I honestly think this works better for the sake of speed. But you say you want to be able to put sprites all around the reference point. Why not just move the reference point so all sprites are beyond it?

by tokumaru on 2009-02-05 (#42940)

Celius wrote:
Why not just move the reference point so all sprites are beyond it?

This is an interesting idea! I'm not exactly dissatisfied with my current solution, but I'll keep yours in mind, just in case I decide I need more CPU time. Thanks for the idea.

I just realized that moving the reference point can be just a matter of interpreting the relative coordinates differently. Let me explain. With signed coordinates, I could give the sprites coordinates ranging from -128 to +127, relative to the reference point. Now say that in order to keep my coordinates always positive, I move the reference point to (-128, -128) relative to the original reference point. So now the relative coordinates (0, 0) will be at the same location as old coordinates (-128, -128). 128 is the new 0, and 255 is the new 127.

This means that this scheme is just as versatile as the old one, and once the relative coordinates are all unsigned, the additions will be much faster, because the high byte is the same for all coordinates and can be calculated at the start of the routine. And since the reference point is always moved by the same amount (-128, -128) there is no need to store any extra coordinates in ROM. This is pretty simple, and might actually improve the speed a lot.
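
A minimal sketch of that idea, assuming the offsets are stored in ROM with 128 already added (so a signed -128..+127 becomes an unsigned 0..255). All names here (ObjXLow, ObjXHigh, BaseXLow, BaseXHigh, BiasedRelX, SpriteXLow, SpriteXHigh) are hypothetical.

```asm
; Once per metasprite: move the reference point by -128.
sec
lda ObjXLow
sbc #$80
sta BaseXLow
lda ObjXHigh
sbc #$00
sta BaseXHigh

; Once per sprite: plain unsigned add, no sign extension needed.
clc
lda BaseXLow
adc BiasedRelX  ; stored as (signed offset + 128)
sta SpriteXLow
lda BaseXHigh
adc #$00
sta SpriteXHigh
```

The result is ObjX - 128 + (offset + 128), which is ObjX + offset, so the sprite ends up where the signed version would have put it.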

by Celius on 2009-02-05 (#42944)

I'm glad I could help.

Yeah, I actually found that signed numbers are great because they're so universal, but dealing with them is a pain. For example, comparing two signed numbers is still a mystery to me. I don't really even care to know how to at the moment, because I've got a quick fix:

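lda VarA
eor #$80     ;Flip the sign bit so signed order becomes unsigned order
sta Temp
lda VarB
eor #$80
cmp Temp     ;Carry set if VarB >= VarA (signed), clear if VarB < VarA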

Though that's pretty slow, I seem to remember a legitimate signed comparison not being such an easy deal either.

by Bregalad on 2009-02-06 (#42952)

Quote:

Yeah, I actually found that signed numbers are great because they're so universal, but dealing with them is a pain. For example, comparing two signed numbers is still a mystery to me.

I use signed numbers for relative coordinates, but I don't need to compare them so it's safe. If you need to compare them, use SBC instead of CMP, then look at the sign of the difference and draw your conclusions.
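
A minimal sketch of an SBC-based signed comparison, using the VarA and VarB names from the earlier post. The sign of the difference alone is wrong when the subtraction overflows (for example 100 - (-100)), so the sketch also checks the V flag, which SBC sets and CMP leaves alone. AIsLess is a hypothetical branch target defined elsewhere.

```asm
; Signed 8-bit comparison of VarA against VarB.
sec
lda VarA
sbc VarB        ; A = VarA - VarB, V set on signed overflow
bvc NoOverflow
eor #$80        ; overflow: the sign bit is inverted, so flip it back
NoOverflow:
bmi AIsLess     ; N set: VarA < VarB (signed)
; falls through here when VarA >= VarB (signed)
```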

I use the center of a sprite for its reference point and I am satisfied with that, although it really depends on the genre of the game.

Tokumaru, for the high byte you might come up with something like this:
Code:
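lda RelativePos
pha                ;Save the signed offset for the sign test below
clc
adc LowPos
sta LowSpritePos   ;Carry from the low byte is kept for the high byte
pla                ;PLA sets N from the offset and leaves carry alone
bmi +
lda #$00           ;Positive offset: high byte is $00
.db $2c ;BIT opcode, swallows the next lda as its operand (carry untouched)
+ lda #$ff         ;Negative offset: high byte is $ff
adc HighPos
sta HighSpritePos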