Here is the original MC-10 ROM routine that normalizes a floating point number by shifting the mantissa 8 bits at a time. Once the high-order byte is nonzero, it shifts 1 bit at a time.
;* Normalize FPA0
LEFD6 clrb ; exponent modifier = 0
LEFD7 ldaa FPA0 ; get hi-order byte of mantissa
bne LF00F ; branch if <> 0 (shift one bit at a time)
;* Shift FPA0 mantissa left by 8 bits (whole byte at a time)
ldaa FPA0+1 ; byte 2 into..
staa FPA0 ; ..byte 3
ldaa FPA0+2 ; byte 1 into..
staa FPA0+1 ; ..byte 2
ldaa FPA0+3 ; byte 0 into..
staa FPA0+2 ; ..byte 1
ldaa FPSBYT ; sub-precision byte..
staa FPA0+3 ; ..into byte 0
clr FPSBYT ; 0 into sub-precision byte
addb #8 ; add 8 to the exponent modifier
cmpb #5*8 ; has the mantissa been cleared (shifted 40 bits)?
blt LEFD7 ; loop if less than 40 bits shifted
Here is the current test code.
FPSBYT only needs to be cleared on the first pass, but I don't see a way of getting rid of the clr without unrolling the loop once. Positioning FPSBYT right after FPA0 would let us speed this up, but at the cost of a few clock cycles in the multiply. Normalization should take place more often than multiplication, though, so I may test that at some point to see if it's an improvement. The ldab at the end is commented out because it isn't needed: if we get that far, the mantissa is zero.
;* Normalize FPA0
LEFD6
ldx #5 ; loop a maximum of 5 times (cleared mantissa)
LEFD7
ldaa FPA0 ; get hi-order byte of mantissa
bne LEFFFa ; branch if <> 0 (shift one bit at a time)
;* Shift FPA0 mantissa left by 8 bits (whole byte at a time)
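ldd FPA0+1 ; bytes 2 and 1 into..
std FPA0 ; ..bytes 3 and 2
ldaa FPA0+3 ; byte 0 and..
ldab FPSBYT ; ..sub-precision byte into..
std FPA0+2 ; ..bytes 1 and 0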
clr FPSBYT ; 0 into sub-precision byte
dex ; has the mantissa been cleared (shifted 40 bits)?
bne LEFD7 ; loop if less than 40 bits shifted
; ldab #8*5 ; set the exponent modifier where x = 0
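The idea of positioning FPSBYT right after FPA0 could look roughly like the sketch below. It assumes a hypothetical memory layout in which the sub-precision byte lives at FPA0+4, which is not where the stock ROM keeps it, so the multiply and anything else that touches FPSBYT would need to change to match. With that layout the separate ldaa/ldab pair collapses into a single ldd.

```
;* Byte shift, ASSUMING the sub-precision byte has been moved to FPA0+4
;* (hypothetical layout, not the stock ROM's)
LEFD7
 ldaa FPA0 ; get hi-order byte of mantissa
 bne LEFFFa ; branch if <> 0 (shift one bit at a time)
 ldd FPA0+1 ; bytes 2 and 1 into..
 std FPA0 ; ..bytes 3 and 2
 ldd FPA0+3 ; byte 0 and sub-precision byte into..
 std FPA0+2 ; ..bytes 1 and 0
 clr FPA0+4 ; 0 into the relocated sub-precision byte
 dex ; has the mantissa been cleared (shifted 40 bits)?
 bne LEFD7 ; loop if less than 40 bits shifted
```

The clr is still executed on every pass here; removing it would still require unrolling the first pass.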
Then we calculate the exponent modifier right before the single-bit shift code. If A is negative (bit 7 of the high-order mantissa byte is set, so the mantissa is already normalized), we skip the bit-shift code; if not, we fall directly into shifting bits.
LEFFFa ;calculate exponent modifier
stx TEMPM
ldab #5 ; max # of times we shifted the mantissa
subb TEMPM+1 ; actual number of times we shifted the mantissa
; no need to test result, always >= 0 (no borrow, so carry is clear and rolb shifts in 0)
rolb ;* 2 ; multiply by 8 for 8 bits per byte
rolb ;* 4
rolb ;* 8
tsta ; Is A positive or negative?
bmi LF00Fa
If we use the 6303, we can do this instead. The xgdx instruction exchanges the contents of D and X, so a single xgdx puts the table address minus 1 into X and the table offset into B.
The table address is minus 1 because this code is never reached with X = 0. The table itself is only 5 bytes of pre-calculated exponent modifier values, and it can reside anywhere in ROM. Technically, 8*5 is never loaded because this code isn't reached when X goes to zero, so we just leave it out of the table and avoid subtracting from X or B by loading the xpmtable pointer minus 1.
LEFFFa ;calculate exponent modifier, 6303 version
ldd #xpmtable-1 ; get the address of the exponent modifier table
xgdx ; exchange D and X
abx ; point X to modifier in table
ldab ,x ; load it
ldaa FPA0 ; reload hi-order byte (ldd/xgdx overwrote A); is it positive or negative?
bmi LF00Fa
xpmtable
fcb 8*4, 8*3, 8*2, 8, 0
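As a quick check of the minus-1 offset, the lookup can be traced by hand for each value X can hold when the byte-shift loop exits early. This is only a worked trace of the table above, written as assembler comments:

```
; X on exit   bytes shifted   address read   value loaded into B
;     5            0          xpmtable+4          0
;     4            1          xpmtable+3          8
;     3            2          xpmtable+2         16
;     2            3          xpmtable+1         24
;     1            4          xpmtable+0         32
```

These are the same values the 6803 version computes with (5 - X) * 8.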