Tuesday, March 27, 2018

To Err is Human, to Invert and Multiply is Divide... or how to take advantage of the optimized MC-10 ROM Part 1.

For programs that do a lot of multiplication, supporting the 6803's hardware multiply in the floating point math library provided the single greatest speed improvement of any optimization in the new MC-10 BASIC ROM. That speeds up a lot of calculations, but math isn't just multiplication. You also need to divide numbers a lot, and the 6803 has no hardware divide instruction.

There is a partial solution, and it comes from the shortcut normally used when dividing fractions: invert and multiply. But we aren't dealing with fractions, so how do you invert a number? Microcolor BASIC stores every number as floating point, so a fraction is just another floating point value, and to invert a number you divide 1 by it.

Example:

Suppose you want to divide 94 by 144. 
Normally, the BASIC code would look something like this:
10 PRINT 94 / 144

But to invert and multiply, 94 can be represented as 94/1 and 144 can be represented as 144/1. When you invert 144/1, you get 1/144. When you multiply 94/1 by 1/144, you get 94/144. Since 94/1 is still 94, that part of the code is the same as the original.
The code now looks like this:
10 C = 1 / 144
40 PRINT 94 * C

This lets us take advantage of the fast 6803 hardware multiply used in the optimized floating point library, but as I said, this is a partial solution. To create the floating point equivalent of 1/144, you still have to perform a divide. So now you are performing a divide and a multiply to get the result instead of just a divide. That is slower than the single divide, so the trick does not pay off in every case, including the one above, but the example shows how it works.

Repeatedly dividing by the same number is where we can take advantage of the faster multiply, because the reciprocal only has to be calculated once. How much faster the code is depends on how many divides it replaces. Given how slow the floating point divide is compared to the multiply, this approach may be faster with as few as 2 divides, but with that few you need to benchmark the code to be sure. If the divide is inside a loop, the savings could be significant.
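A rough cost model shows where the break-even point sits. The symbols are labels introduced here for illustration, not measured figures: D is the time for one floating point divide, M is the time for one multiply, and N is the number of divisions by the same value. The model assumes D > M and ignores the small cost of storing and looking up the extra variable. Plain division costs N divides, while invert and multiply costs one divide plus N multiplies, so the trick wins when:

$$
N \cdot D > D + N \cdot M \quad\Longleftrightarrow\quad N > \frac{D}{D - M}
$$

With N = 2 this reduces to D > 2M, so two divides are enough only if a divide takes more than twice as long as a multiply. As N grows, the one-time divide matters less and the saving per operation approaches D - M.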
Example:

Original code:
10 FOR I = 1 TO 1000
20 PRINT I / 7
30 NEXT I

Becomes:
0  C = 1 / 7
10 FOR I = 1 TO 1000
20 PRINT I * C
30 NEXT I
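
PRINT takes time on every pass through the loop, so to compare just the arithmetic, a timing sketch might store the result in a variable instead of printing it. The listing below is only a sketch: it assumes you time the run by hand with a stopwatch, and the variable X and the "GO" and "DONE" messages are placeholders I have added for illustration.

```basic
10 REM TIMING SKETCH: START A STOPWATCH AT GO, STOP IT AT DONE
20 C = 1 / 7
30 PRINT "GO"
40 FOR I = 1 TO 1000
50 X = I / 7
60 NEXT I
70 PRINT "DONE"
```

Time it once as written, then change line 50 to X = I * C and time it again. Line 20 runs in both versions, so the only difference between the two timings is the divide versus the multiply inside the loop.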
