thought of a new way of converting packed to planar graphics

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
thought of a new way of converting packed to planar graphics
by on (#240816)
Code:
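lda {word_a}    // A = word_a (pixels c d a b, 4 bits each)
asl #2          // line up the low 2 bits of each pixel with the high 2 bits in word_b
eor {word_b}
and #$cccc      // keep the difference only in the top 2 bits of each nibble
tay             // save the difference bits for the second half
eor {word_b}    // A = low 2 bits of all 8 pixels
xba             // swap high and low bytes
asl             // turn the word into a LUT byte offset; bit 15 goes to carry
tax
lda lut,x
ror             // shift the saved bit 15 back in from carry
sta {word_b}

tya             // reuse the saved difference bits
lsr #2          // move them to the low 2 bits of each nibble
eor {word_a}    // A = high 2 bits of all 8 pixels
xba
asl
tax
lda lut,x
ror
sta {word_a}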



It works by taking two 16-bit words that hold eight packed 4-bit pixels (a to h, one letter per bit), like this:
Code:
ccccddddaaaabbbb
gggghhhheeeeffff

Swap bits between the two words so that planes 0 and 1 are separated from planes 2 and 3 (each word now holds two bits of all eight pixels):
Code:
ccggddhhaaeebbff
ccggddhhaaeebbff
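
A rough Python model of that swap, for checking it off-console. The function name `delta_swap` is made up for illustration; only the shifts and the `$cccc` mask come from the routine above. After the swap, the first word keeps the high 2 bits of every pixel and the second word keeps the low 2 bits.

```python
MASK = 0xCCCC  # top 2 bits of every nibble, as in "and #$cccc"


def delta_swap(word_a, word_b):
    """Masked-XOR exchange of 2-bit groups between two 16-bit words.

    Hypothetical helper that mirrors the instruction sequence, not the LUT step.
    """
    t = ((word_a << 2) ^ word_b) & MASK   # lda a / asl #2 / eor b / and #$cccc
    new_b = word_b ^ t                    # eor b: low 2 bits of each pixel
    new_a = word_a ^ (t >> 2)             # lsr #2 / eor a: high 2 bits of each pixel
    return new_a & 0xFFFF, new_b & 0xFFFF


if __name__ == "__main__":
    a, b = delta_swap(0xFFFF, 0x0000)
    print(hex(a), hex(b))
```

Written out directly, the result is `new_a = (a & 0xCCCC) | ((b >> 2) & 0x3333)` and `new_b = ((a << 2) & 0xCCCC) | (b & 0x3333)`, which is what the XOR trick computes with one shared temporary.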


Then swap high and low bytes:

Code:
aaeebbffccggddhh
aaeebbffccggddhh


Then shift left to turn the word into a byte offset into a 64kB conversion LUT (the bit that falls out goes into carry), do the lookup, rotate right to shift the high bit back in, and we're done.
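
The whole routine can be modelled in Python and compared against a plain per-pixel conversion. This is only a sketch under assumptions: the post does not give the LUT contents, so the table below (entries stored shifted left by one, leftmost pixel in bit 7, upper bit of each pair in the high byte) is a guess that happens to fit the asl/ror trick. All function names are hypothetical.

```python
# Pixel order a e b f c g d h after the byte swap, as indices 0..7.
ORDER = (0, 4, 1, 5, 2, 6, 3, 7)


def permute(v):
    """Upper bit of each 2-bit pair -> high byte, lower bit -> low byte."""
    out = 0
    for k, p in enumerate(ORDER):
        hi = (v >> (15 - 2 * k)) & 1
        lo = (v >> (14 - 2 * k)) & 1
        out |= (hi << (15 - p)) | (lo << (7 - p))
    return out


# 32768 word entries = 64 kB. ASSUMPTION: entries are pre-shifted left by one
# so that the ror after the lookup puts them back in place.
LUT = [permute(i) << 1 for i in range(0x8000)]


def xba(v):
    return ((v << 8) | (v >> 8)) & 0xFFFF


def lookup(v):
    carry = v & 0x8000                      # bit pushed into carry by asl
    return carry | (LUT[v & 0x7FFF] >> 1)   # ror shifts it back in as bit 15


def convert(word_a, word_b):
    """Returns (planes 0/1 word, planes 2/3 word) for 8 packed pixels."""
    t = ((word_a << 2) ^ word_b) & 0xCCCC
    new_b = (word_b ^ t) & 0xFFFF
    new_a = (word_a ^ (t >> 2)) & 0xFFFF
    return lookup(xba(new_b)), lookup(xba(new_a))


def naive(word_a, word_b):
    """Reference: pull out each pixel and build the four planes bit by bit."""
    pix = [(word_a >> 4) & 15, word_a & 15, (word_a >> 12) & 15, (word_a >> 8) & 15,
           (word_b >> 4) & 15, word_b & 15, (word_b >> 12) & 15, (word_b >> 8) & 15]
    planes = [sum(((p >> n) & 1) << (7 - i) for i, p in enumerate(pix))
              for n in range(4)]
    return planes[0] | (planes[1] << 8), planes[2] | (planes[3] << 8)
```

The shift-then-rotate trick only works because bit 15 of the swapped word maps to bit 15 of the output in this layout, so it never needs to go through the table.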

I didn't count how many cycles this takes but I believe this would be pretty fast, as long as you already have 2 pixels per byte in the first place.

Edit: Found out this takes 69 cycles total, so 8.625 cycles per pixel.
Re: thought of a new way of converting packed to planar grap
by on (#240869)
I have no idea how long other methods take, but that's pretty fast if it's only just over 8 cycles per pixel. Using some really loose math (for an exact figure you would need to know how many cycles go to (Fast)ROM versus RAM, how you got the bytes in the first place, etc.), I got that it would take about 1/10 of a second to convert an entire 256x224 screen. That's not bad for small software-rendering jobs like a variable-width font, as mentioned in the other thread, or layering sprite graphics for customization or whatnot.
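
For reference, the loose math can be written out as below. This is a back-of-the-envelope sketch: it uses the 69-cycle figure from the first post, ignores loop overhead and everything else the CPU does, and brackets the answer by pretending every cycle is either a fast one (6 master clocks) or a slow one (8 master clocks) on an NTSC console. The variable names are made up.

```python
MASTER_HZ = 21_477_272        # NTSC SNES master clock, approximately
PIXELS = 256 * 224            # one full screen
CYCLES_PER_8_PIXELS = 69      # figure quoted for the fixed-address routine

cycles = PIXELS * CYCLES_PER_8_PIXELS // 8

# Bounds: all cycles fast (6 master clocks) vs. all cycles slow (8 master clocks).
fastest = cycles * 6 / MASTER_HZ
slowest = cycles * 8 / MASTER_HZ

if __name__ == "__main__":
    print(cycles, round(fastest, 3), round(slowest, 3))
```

Under those assumptions the full screen lands somewhere between roughly 0.14 and 0.18 seconds, so the same ballpark as the 1/10 of a second estimate.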
Re: thought of a new way of converting packed to planar grap
by on (#240871)
Code:
// cycles for each instruction, then running total
lda $0000,y   // 6   6
asl #2        // 4  10
eor $0002,y   // 6  16
and #$cccc    // 3  19
tcd           // 2  21   use dp register as a temporary register
eor $0002,y   // 6  27
xba           // 3  30
asl           // 2  32
tax           // 2  34
lda lut,x     // 6  40
ror           // 2  42
sta $0002,y   // 6  48

tdc           // 2  50
lsr #2        // 4  54
eor $0000,y   // 6  60
xba           // 3  63
asl           // 2  65
tax           // 2  67
lda lut,x     // 6  73
ror           // 2  75
sta $0000,y   // 6  81


Unfortunately, doing it to an arbitrary RAM location is not quite as fast as doing it with fixed memory locations. It takes 81 cycles (about 10.1 per pixel), assuming that the "indexed addressing takes an extra cycle if index registers are 16-bit" thing is true. Also, if I'm not using the "increment by 8, 32 times" mode in the $2115 register, and I'm just doing sprite patterns, I would also need to switch the order of the words within the tiles.
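
To illustrate the word-order point, here is a small Python sketch of the reordering. It assumes the in-place routine leaves each row as two words, planes 2/3 first and planes 0/1 second (which follows from the code if bit n of a pixel is plane n), and that the target is the usual SNES 4bpp tile: eight rows of planes 0/1 words followed by eight rows of planes 2/3 words. The function name is hypothetical.

```python
import struct


def interleaved_to_tile(words):
    """Reorder 16 converted words of one 8x8 tile into 32 bytes of 4bpp tile data.

    ASSUMED input layout:  [p23_row0, p01_row0, p23_row1, p01_row1, ...]
    Output layout:         p01 rows 0..7, then p23 rows 0..7, little-endian.
    """
    if len(words) != 16:
        raise ValueError("expected 16 words (8 rows x 2 words)")
    p23 = words[0::2]
    p01 = words[1::2]
    return struct.pack("<16H", *p01, *p23)


if __name__ == "__main__":
    tile = interleaved_to_tile(list(range(16)))
    print(tile.hex())
```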
Re: thought of a new way of converting packed to planar grap
by on (#240880)
Why do you have Packed data and not planar data?
Re: thought of a new way of converting packed to planar grap
by on (#240882)
Oziphantom wrote:
Why do you have Packed data and not planar data?


In case I feel like making either a ray casting engine, or a super tight compression algorithm.
Re: thought of a new way of converting packed to planar grap
by on (#241002)
I just came up with a compression algorithm earlier today. You can do RLE compression on 8x8 blocks, but instead of only scanning horizontally, it can choose between horizontal, vertical, or diagonal scanning, with a zigzag pattern.
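
One way to read that idea, as a minimal Python sketch. Everything here is an assumption for illustration: the post gives no details, so the zigzag is the familiar JPEG-style diagonal walk, a run is a (value, length) pair, and the "best" order is simply the one with the fewest runs.

```python
def make_scan_orders():
    """Three ways to visit the 64 cells of an 8x8 block, as (row, col) lists."""
    horizontal = [(r, c) for r in range(8) for c in range(8)]
    vertical = [(r, c) for c in range(8) for r in range(8)]
    zigzag = []
    for s in range(15):  # anti-diagonals where row + col == s
        diag = [(r, s - r) for r in range(8) if 0 <= s - r < 8]
        # Alternate direction on every diagonal to get the zigzag walk.
        zigzag.extend(diag if s % 2 else diag[::-1])
    return {"horizontal": horizontal, "vertical": vertical, "zigzag": zigzag}


SCAN_ORDERS = make_scan_orders()


def rle(values):
    """Plain run-length encoding: list of (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [tuple(run) for run in runs]


def best_order(block):
    """Encode the block in every scan order and keep the one with fewest runs."""
    results = {name: rle([block[r][c] for r, c in order])
               for name, order in SCAN_ORDERS.items()}
    name = min(results, key=lambda n: len(results[n]))
    return name, results[name]


if __name__ == "__main__":
    stripes = [[c for c in range(8)] for _ in range(8)]  # vertical stripes
    print(best_order(stripes)[0])
```

A real format would also need a couple of bits per block to record which scan order was chosen.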