Well then, I missed an obvious change to the sprite code that speeds it up, so I updated it and threw in a second speedup I knew about while I was at it. A few extra instructions are executed at the end, but that is more than offset by eliminating one branch per loop.
The case where no shift is needed is handled separately to eliminate a lot of unneeded instructions. I originally used the same code for both cases to keep the sprite drawing rate consistent, and it's easy enough to go back to the old way if that turns out to be a problem.
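To illustrate why the no-shift case can skip work, here is a small Python model of what one sprite line does to the screen. It is only a sketch of the arithmetic, not the 6809 code: the names `shift_row` and `xor_row` are made up for this example, and the screen is assumed to be a plain list of bytes.

```python
def shift_row(row16, shift):
    """Model of the LSRA / RORB / ROR <spr chain.

    row16 is one 16-pixel sprite line (A = high byte, B = low byte).
    Returns the three bytes (A, B, spill) after shifting right by
    `shift` pixels, where spill plays the role of the spr byte.
    """
    if not 0 <= shift <= 7:
        raise ValueError("shift must be 0..7")
    wide = (row16 << 8) >> shift          # 24-bit value: A, B, spill
    return (wide >> 16) & 0xFF, (wide >> 8) & 0xFF, wide & 0xFF


def xor_row(screen, offset, row16, shift):
    """XOR one shifted sprite line into a list of screen bytes (hypothetical helper)."""
    a, b, spill = shift_row(row16, shift)
    screen[offset] ^= a
    screen[offset + 1] ^= b
    screen[offset + 2] ^= spill           # always 0 when shift == 0


# With no shift the spill byte is zero, so the third write changes nothing.
print(shift_row(0xFFFF, 0))
# With a shift of 3 the low three pixels move into the third byte.
print(shift_row(0xFFFF, 3))
```

Since the spill byte is always zero when the shift is zero, the third XOR and store can be dropped for that case, which is what the separate no-shift loop does.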
;----------------------------------------------------
; Draw sprite
;
; AGD sprites are normally 16 pixels wide by 16 pixels high, but the height can be changed with the loop counter
; spr must be on the direct page
; spr+1 must be loaded with the bytes per line (32) on startup
; For 8 bit wide sprites, the ROR <spr instructions could be dropped along with the last screen write.
; LEAX 32,X would then be the fastest way to advance to the next screen line instead of using LDB ABX
;----------------------------------------------------
Sprite:
leau ,x ; move graphic address to regU
jsr scadd ; get screen address in X
ldy #SPR_HGT+1 ; load loop counter with number of lines per sprite + 1 for first loop.
ldb <dispx ; x position.
andb #7 ; position straddling cells.
beq sprit3 ; branch if no shifts are needed
lda #EndShift-ShiftLoc1 ; negative offset from the byte after the BNE back to EndShift (skips all the shifts)
aslb ; double the shift count so the two SBAs below subtract 4 bytes (the size of one LSRA RORB ROR <spr group) per shift. 2 clocks each
sba ; subtracting makes the negative offset larger
sba
sta ShiftJump+1 ; set the BNE offset to perform the right number of shifts 4 clocks
bra sprit1 ; jump to the branch changed by the code above
StartShift:
lsra
rorb
ror <spr
lsra
rorb
ror <spr
lsra
rorb
ror <spr
lsra
rorb
ror <spr
lsra
rorb
ror <spr
lsra
rorb
ror <spr
lsra
rorb
ror <spr
EndShift:
eora 0,x ; merge spr with screen image
eorb 1,x ; merge spr+1 with screen image
std 0,x ; write to screen.
ldd spr ; get sprite data in A, and screen line width in B
eora 2,x ; merge with screen image
sta 2,x ; write to screen
; b contains number of bytes per line (32) after the LDD, spr+1 is initialized at the very start of the game
; for different resolutions, this could be changed elsewhere on startup
abx ; move to next screen line
sprit1:
ldd ,u++ ; load data from sprite into D register.
clr <spr ; clear byte for shifting (CLR sets the Z bit and clears the C bit; the DEY below sets Z again for the BNE)
dey ; decrement the # of lines counter
ShiftJump:
bne StartShift ; go again if not done, address offset is modified above
ShiftLoc1:
rts
; no shifts
sprit3:
dey
sprit2:
ldd ,u++ ; load data from sprite into D register.
eora 0,x ; merge left sprite byte with screen image
eorb 1,x ; merge right sprite byte with screen image
std 0,x ; write to screen.
leax 32,x ; move to next screen line
dey ; decrement the # of lines counter
bne sprit2 ; go again if not done
rts
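The offset that the listing stores at ShiftJump+1 can be checked with a small Python model. This is a sketch under stated assumptions rather than a trace of the real binary: the start address and `TAIL_BYTES` (the number of bytes between EndShift and ShiftLoc1) are made-up placeholders, and the function names are hypothetical. Only the 4 bytes per shift group and the ASLB / SBA / SBA arithmetic come from the listing.

```python
GROUP_BYTES = 4     # LSRA (1) + RORB (1) + ROR <spr (2), as in the listing
MAX_SHIFTS = 7      # seven shift groups between StartShift and EndShift
TAIL_BYTES = 21     # ASSUMPTION: placeholder size of the code from EndShift to ShiftLoc1


def branch_offset(end_shift, shift_loc1, shifts):
    """Mirror LDA #EndShift-ShiftLoc1 / ASLB / SBA / SBA in 8-bit arithmetic."""
    a = (end_shift - shift_loc1) & 0xFF   # negative offset as an unsigned byte
    b = (shifts << 1) & 0xFF              # ASLB: 2 * shifts
    a = (a - b) & 0xFF                    # first SBA
    a = (a - b) & 0xFF                    # second SBA: total of 4 * shifts subtracted
    return a


def branch_target(shift_loc1, offset_byte):
    """A short branch adds the signed offset to the address after the branch."""
    signed = offset_byte - 256 if offset_byte >= 128 else offset_byte
    return shift_loc1 + signed


def groups_executed(start_shift, shifts):
    """How many shift groups run when the patched BNE is taken."""
    end_shift = start_shift + MAX_SHIFTS * GROUP_BYTES
    shift_loc1 = end_shift + TAIL_BYTES
    offset = branch_offset(end_shift, shift_loc1, shifts)
    return (end_shift - branch_target(shift_loc1, offset)) // GROUP_BYTES


for shifts in range(1, MAX_SHIFTS + 1):
    print(shifts, groups_executed(0x4000, shifts))
```

In this model each shift count from 1 to 7 lands the branch exactly that many groups before EndShift, and a count of 7 lands on StartShift. The same check holds for any tail size that keeps the whole offset within the signed 8-bit range of a short branch.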
My on again off again blog about whatever computer related hobby projects I happen to be working on at the moment.
Sunday, August 14, 2022
You ever wonder how you missed something obvious?