A year ago, I posted code for software sprite rotation and promised a demo for it, but I never actually finished it.
The reasons were:
1) School got in the way.
2) It wasn't fast enough to run in realtime.
My old algorithm addresses pixels in a 256x256 grid. First it calculates the y-coordinates for a line of 8 pixels using 16-bit values, with the low 8 bits being the fractional part. Then it does the same with the x-coordinates and stores them one byte before the y-coordinates, so the top x byte overwrites the bottom y byte, creating the 16-bit address of the pixel.
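To make the byte overlap concrete, here is a rough Python model of how the old method forms an address. The function names and sample coordinates are made up for illustration, and it assumes the 16-bit values are little-endian, so the integer byte of y ends up as the high address byte and the integer byte of x as the low one.

```python
def to_fixed_8_8(value):
    """Convert a coordinate to 16-bit fixed point: high byte integer, low byte fraction."""
    return int(round(value * 256)) & 0xFFFF


def pixel_address(x_fixed, y_fixed):
    """Drop both fraction bytes; y's integer byte becomes the high address byte
    and x's integer byte the low one, giving an offset into a 256x256 grid."""
    return ((y_fixed >> 8) << 8) | (x_fixed >> 8)


x = to_fixed_8_8(37.5)     # 0x2580: integer part 0x25, fraction 0x80
y = to_fixed_8_8(200.25)   # 0xC840: integer part 0xC8, fraction 0x40
address = pixel_address(x, y)
print(hex(address))
```

The fraction bytes only matter while stepping along the line; they are thrown away when the address is formed.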
Last night in the shower, I came up with an idea that could make it about an order of magnitude faster. Instead of calculating the coordinates for every pixel, I can calculate the coordinates for just the middle row and the middle column, and use "LDA (dp),y" to add the row coordinate to the column coordinate and access the pixel in one instruction.
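Why adding one row entry to one column entry works: rotation is linear, so the source position of any pixel is the sum of a part that depends only on its column and a part that depends only on its row. The Python sketch below checks that against a direct per-pixel calculation. It uses floats instead of fixed point, and the rotation direction, the 32x32 size and all the names are my assumptions for illustration only.

```python
import math

SIZE = 32          # sprite width and height in pixels (assumed)
HALF = SIZE // 2


def column_and_row_tables(angle):
    """One coordinate pair per column and one per row, instead of one per pixel."""
    c, s = math.cos(angle), math.sin(angle)
    columns = [(HALF + (i - HALF) * c, HALF + (i - HALF) * s) for i in range(SIZE)]
    rows = [(HALF - (j - HALF) * s, HALF + (j - HALF) * c) for j in range(SIZE)]
    return columns, rows


def source_by_sum(columns, rows, i, j):
    """What the indexed addressing does in effect: column entry plus row entry."""
    return (columns[i][0] + rows[j][0], columns[i][1] + rows[j][1])


def source_direct(angle, i, j):
    """Per-pixel rotation about the centre, for comparison."""
    c, s = math.cos(angle), math.sin(angle)
    dx, dy = i - HALF, j - HALF
    return (2 * HALF + dx * c - dy * s, 2 * HALF + dx * s + dy * c)


columns, rows = column_and_row_tables(0.7)
worst = max(
    abs(a - b)
    for i in range(SIZE)
    for j in range(SIZE)
    for a, b in zip(source_by_sum(columns, rows, i, j), source_direct(0.7, i, j))
)
print(worst)  # tiny floating-point error only
```

That cuts the coordinate math from 32 x 32 pixels down to 32 + 32 table entries per frame.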
Converting from packed pixels to planar takes a lot of work too, so I decided to use 2bpp instead of 4bpp. I then realized that, since I only need to calculate one row and one column of coordinates, I can arrange it to use a 128x128 pixel grid with 2 bytes per pixel. Each byte holds just 1 bit. All the code needs to do is "ASL" and then "ORA (dp),y" the next pixel.
Code:
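; Y = row coordinate; each !collumn_N is a direct-page pointer holding a column coordinate
ldy !row_0
; first pass: shift in one bit per pixel, leftmost pixel ends up in bit 7
lda (!collumn_0),y
asl
ora (!collumn_1),y
asl
ora (!collumn_2),y
asl
ora (!collumn_3),y
asl
ora (!collumn_4),y
asl
ora (!collumn_5),y
asl
ora (!collumn_6),y
asl
ora (!collumn_7),y
sta !bitmap
; second pass, as posted: same 8 pixels again, presumably meant for the other bitplane
lda (!collumn_0),y
asl
ora (!collumn_1),y
asl
ora (!collumn_2),y
asl
ora (!collumn_3),y
asl
ora (!collumn_4),y
asl
ora (!collumn_5),y
asl
ora (!collumn_6),y
asl
ora (!collumn_7),y
sta !bitmap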
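For anyone who doesn't read 65816 assembly, here is what one pass of that code does, modelled in Python. This is only a sketch: it assumes an 8-bit accumulator and that every source byte is either 0 or 1, and the function name is made up.

```python
def pack_eight_pixels(pixel_bytes):
    """Combine eight one-bit bytes into one bitplane byte, leftmost pixel in bit 7."""
    acc = pixel_bytes[0]            # lda (column 0),y
    for value in pixel_bytes[1:]:
        acc = (acc << 1) & 0xFF     # asl
        acc |= value                # ora (next column),y
    return acc


packed = pack_eight_pixels([1, 0, 1, 1, 0, 0, 1, 0])
print(format(packed, "08b"))
```

Because each byte already holds a single bit in bit 0, no masking or table lookup is needed to build the planar byte.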
To avoid carrying a bit from the x-coordinate into the y-coordinate, the sprite needs to sit inside a box twice as large in each dimension.
For an upright 32x32 sprite, the column and row coordinates are:
Columns:
(0,16),(1,16)...(30,16),(31,16)
Rows:
(16,0),(16,1)...(16,30),(16,31)
Notice how the middle row entry (16,16) and the middle column entry (16,16) add up to (32,32), which is right in the middle of the 64x64 box!
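Here is a small Python check of the carry problem, using the upright tables above. The packing (y in the upper field, x in the lower field, with a field as wide as the box) is a simplification I picked for the example, not the actual memory layout, and the names are made up.

```python
SIZE = 32
HALF = SIZE // 2

columns = [(i, HALF) for i in range(SIZE)]   # (0,16) ... (31,16)
rows = [(HALF, j) for j in range(SIZE)]      # (16,0) ... (16,31)


def pack(x, y, box):
    """Hypothetical packing: y in the upper field, x in the lower field."""
    return y * box + x


def unpack(offset, box):
    return (offset % box, offset // box)


def sums_survive_packing(box):
    """True if adding packed column and row entries never carries x into y."""
    for cx, cy in columns:
        for rx, ry in rows:
            offset = pack(cx, cy, box) + pack(rx, ry, box)
            if unpack(offset, box) != (cx + rx, cy + ry):
                return False
    return True


print(sums_survive_packing(2 * SIZE))  # 64-wide box
print(sums_survive_packing(SIZE))      # 32-wide box
```

With the 64-wide box the x sums only reach 47, so they stay inside their field. With a 32-wide box the same sums overflow into y.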