Monday, August 14, 2017

Some normalization code


Here is the original MC-10 ROM routine for normalizing a floating point number 8 bits at a time.  Once the hi-order byte of the mantissa is nonzero, it branches off to shift 1 bit at a time.


;* Normalize FPA0
LEFD6     clrb                          ; exponent modifier = 0
LEFD7     ldaa      FPA0                ; get hi-order byte of mantissa
          bne       LF00F               ; branch if <> 0  (shift one bit at a time)

;* Shift FPA0 mantissa left by 8 bits (whole byte at a time)
          ldaa      FPA0+1              ; byte 2 into..
          staa      FPA0                ; ..byte 3
          ldaa      FPA0+2              ; byte 1 into..
          staa      FPA0+1              ; ..byte 2
          ldaa      FPA0+3              ; byte 0 into..
          staa      FPA0+2              ; ..byte 1
          ldaa      FPSBYT              ; sub-precision byte..
          staa      FPA0+3              ; ..into byte 0
          clr       FPSBYT              ; 0 into sub-precision byte
          addb      #8                  ; add 8 to the exponent modifier
          cmpb      #5*8                ; has the mantissa been cleared (shifted 40 bits)?
          blt       LEFD7               ; loop if less than 40 bits shifted
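
To make the loop easier to follow, here is a rough Python model of the byte-at-a-time shift. It is only a sketch of the logic, not the ROM code: the function name and the list representation of FPA0 are assumptions for illustration, and the one-bit-at-a-time step that follows in the ROM is left out.

```python
def normalize_bytes(mantissa, sub_byte):
    """Model of the byte-wise part of the normalize routine.

    mantissa : list of 4 ints, most significant first (stands in for FPA0..FPA0+3)
    sub_byte : int (stands in for FPSBYT)
    Returns (mantissa, sub_byte, exponent_modifier).
    """
    m = list(mantissa)
    shifted = 0                        # stands in for accumulator B
    # Stop as soon as the hi-order byte is nonzero, or after 40 bits.
    while m[0] == 0 and shifted < 5 * 8:
        m = m[1:] + [sub_byte]         # every byte moves up one place
        sub_byte = 0                   # clr FPSBYT
        shifted += 8                   # addb #8
    return m, sub_byte, shifted


if __name__ == "__main__":
    print(normalize_bytes([0x00, 0x00, 0x12, 0x34], 0x56))
```

A mantissa with two leading zero bytes comes back shifted by 16 bits, and an all-zero mantissa runs the full five passes and returns 40.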

Here is the current test code.
FPSBYT only needs to be cleared on the first pass, but I don't see a way of getting rid of the clr without unrolling the loop once.  Positioning FPSBYT directly after FPA0 would let us speed this up, but at the cost of a few clock cycles in the multiply.  Normalization should take place more often than multiplication, though, so I may test that at some point to see if it's an improvement.  The ldab at the end is commented out because it isn't needed: if we get that far, the mantissa is zero.

;* Normalize FPA0
LEFD6     ldx       #5                  ; loop a maximum of 5 times (cleared mantissa)
LEFD7     ldaa      FPA0                ; get hi-order byte of mantissa
          bne       LEFFFa              ; branch if <> 0  (shift one bit at a time)

;* Shift FPA0 mantissa left by 8 bits (whole byte at a time)
          ldd       FPA0+1
          std       FPA0
          ldaa      FPA0+3
          ldab      FPSBYT
          std       FPA0+2
          clr       FPSBYT              ; 0 into sub-precision byte
          dex                           ; has the mantissa been cleared (shifted 40 bits)?
          bne       LEFD7               ; loop if less than 40 bits shifted
;         ldab      #8*5                ; set the exponent modifier where x = 0

Then we calculate the exponent modifier right before the single-bit shift code.  If A is negative (the high bit of the mantissa is already set), we skip the code that follows; if not, we fall directly into shifting bits.

LEFFFa                                  ; calculate exponent modifier
          stx       TEMPM               ; save the remaining loop count
          ldab      #5                  ; max # of times we shifted the mantissa
          subb      TEMPM+1             ; actual number of times we shifted the mantissa
                                        ; no need to test result, never negative
          rolb                          ;* 2 ; multiply by 8 for 8 bits per byte
          rolb                          ;* 4   (carry is clear after the subb, so rolb shifts in 0)
          rolb                          ;* 8
          tsta                          ; is A positive or negative?
          bmi       LF00Fa
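
As a sanity check on the arithmetic, here is a small Python sketch of the same calculation. The function names are made up for illustration, and `rol8` is only a simplified model of a rotate-left-through-carry. It shows why three rolb instructions behave like a multiply by 8 here: the subtract never borrows, so the carry going in is 0, and the values are small enough that no 1 bit is ever rotated out.

```python
def rol8(value, carry):
    """Rotate an 8-bit value left through carry (simplified model of ROLB)."""
    result = ((value << 1) | carry) & 0xFF
    carry_out = (value >> 7) & 1
    return result, carry_out


def exponent_modifier(x_remaining):
    """x_remaining is the loop counter X (1..5) on arrival at LEFFFa."""
    b = 5 - x_remaining        # ldab #5 / subb TEMPM+1, no borrow for 1..5
    carry = 0                  # so the carry flag is clear after the subtract
    for _ in range(3):         # three rolb instructions: *2, *4, *8
        b, carry = rol8(b, carry)
    return b


if __name__ == "__main__":
    print([exponent_modifier(x) for x in range(1, 6)])
```

For X = 1 through 5 this gives 32, 24, 16, 8 and 0, which is the number of bits already shifted.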


If we use the 6303 we can do this instead.  The xgdx instruction exchanges the contents of D and X, so with one xgdx we get the table address minus 1 into X and the table offset into B.
The table address is minus 1 because this is never called when X = 0.  The table itself is only 5 bytes of pre-calculated exponent modifier values, and it can reside anywhere in ROM.  Technically, 8*5 is never loaded because this isn't called when X reaches zero, so we just leave it off and avoid subtracting from X or B by loading the xpmtable pointer minus 1.


LEFFFa                                  ; calculate exponent modifier, 6303 version
          ldd       #xpmtable-1         ; get the address of the exponent modifier table (minus 1)
          xgdx                          ; exchange D and X: X = table-1, B = loop count
          abx                           ; point X to modifier in table
          ldab      ,x                  ; load it
          ldaa      FPA0                ; reload hi-order byte (ldd/xgdx overwrote A); is it negative?
          bmi       LF00Fa

xpmtable                                ; can reside anywhere in ROM, out of the code path
          fcb       8*4, 8*3, 8*2, 8, 0
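
The table lookup can be checked the same way. This Python sketch (names are assumptions for illustration, not anything from the ROM) models loading the pointer as table minus 1 and adding the loop count, then compares the result against the subtract-and-shift calculation.

```python
# Pre-calculated exponent modifiers, same order as the fcb line.
XPMTABLE = [8 * 4, 8 * 3, 8 * 2, 8, 0]


def lookup_modifier(x_remaining):
    """x_remaining is the loop counter X (1..5); never called with 0."""
    # Base is "table - 1", so adding the count lands on entry count - 1.
    return XPMTABLE[x_remaining - 1]


if __name__ == "__main__":
    for x in range(1, 6):
        # Both methods should agree: (5 - X) * 8 bits already shifted.
        print(x, lookup_modifier(x), (5 - x) * 8)
```

Since X = 0 never reaches this code, the 8*5 entry can be left off and the minus 1 on the pointer takes the place of a subtract.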
