A Bitbanger's Blog: July 2019

Tuesday, July 30, 2019

Sort code update

Here is an update to the DATA/string/sort test code I've been using. It fixes a bug in the comb sort, and slightly optimizes it. I wasn't worried about that when I was just comparing the speed of the same code head to head on different versions of the interpreter. Now I'm starting to compare vs other machines, and "let's compare the two using the same code" always turns into "let's optimize the code on my machine for better results", so this was inevitable.

I tried using a code formatter on this, but every language I chose mangled it, so raw text it is.

0 I=0:N=0:C=0:D=0:E=0:NW=0:H=0:T=0:GOTO20

'comb sort.
'p = what we divide the array size by to get the gap
'G = the gap
'using nested for loops avoids having to perform a line search that would
'happen using GOTO. The address is simply pulled off of the stack.
'using STEP 0 keeps FOR NEXT from changing the variable and the endless loop
'exits once the exit condition is achieved
'This revision prevents the 0 gap pass, and duplicate gaps if no swap took place the first time
'It checks the T temp variable used to swap indexes to see if a swap took place
'We start the array index at 1, so T should never = 0 if a swap takes place.
'Starting the array at 0 could break this code
1 Q=1:F=1.3:P=F:FORG=INT(N/P)TO0STEP0:PRINT"PASS =";Q;"GAP =";G:T=0
2 FORI=1TON-G:IFA$(B(I))>A$(B(I+G))THENT=B(I):B(I)=B(I+G):B(I+G)=T
3 NEXT:H=P:Q=Q+1
4 P=P*F:G=INT(N/P):IFNOT(T)ANDP=HTHEN4
5 NEXT:RETURN

'check to see if names are in order
6 R=0:FOR I=1 TO N-1:IFA$(B(I))>A$(B(I+1))THENPRINT"OUT OF ORDER "A$(B(I)):R-1
7 NEXT:RETURN

'print out the names based on the array index
8 FOR I=1 TO N:PRINT A$(B(I))" ";:NEXT:RETURN

'initialize the index array, and read in the names. Skip the other data.
20 CLS:PRINT"READ, LIST, AND SORT MOST POPULAR SIR NAMES"
21 PRINT"SCANNING FOR END OF DATA"
22 READ K$,C,D,E:IF K$<>"" THEN N=N+1:GOTO22
23 RESTORE
24 PRINT N" NAMES IN DATA, INITIALIZING ARRAYS"
25 DIM A$(N),B(N)
26 FORI=1TON:READ A$(I),C,D,E:B(I)=I:NEXT
'27 PRINT"UNSORTED DATA":PRINT
'28 GOSUB 8
29 PRINT:PRINT "SORTING"N"NAMES ALPHABETICALLY"
30 GOSUB 1
31 PRINT:PRINT "VERIFYING THE ORDER IS SORTED"
32 GOSUB 6:IFR=1THENSTOP
33 GOSUB 8
34 PRINT:PRINT"DONE"
35 END

Friday, July 26, 2019

Testing 1... 2... 3...

Just testing different software for recording video from emulators. I was using CamStudio, but it doesn't handle large files, and long recordings end up wrapping around to the start of the file. An overflow issue apparently. This is my 2nd attempt using OBS. It still needs work but I got this far without reading the documentation.

This is the test I wrote to verify the new string compare function, but it gives a lot of the things in the interpreter a workout. Note that the gaps during the comb sort are now the same, where they aren't in the earlier video due to a bug. There are still a couple more bugs to track down, but it's to the point where I can start worrying about adding new commands, or replacing some slow math functions with faster modern algorithms.

A lot of effort was made to keep memory variables the same in the past, but you have to break some eggs to make an omelet... meaning that's over with, it's a waste of time. The direct page use wasn't really thought out with speed in mind, so... good riddance. The change is minor, but significant to optimize the math for 16 bit opcodes, and storing the last byte read was good for a 1+% speedup for about an hour's work... but it moved the character parsing code. One thing I know will break is the patch I posted that could be included in a program. A simple PEEK will tell you if you should install the patch or not. Details will follow once the code is finalized.

The BASIC variables and interrupt hooks after video RAM will be there for one release version, but after that, those will be moved below video RAM to make hi-res support possible. This had to happen. I will also add an interrupt hook for the illegal instruction interrupt in the HD6303. That requires replacing the pointer to the floating point register at the end of RAM. A table of pointers to different functions will be added to support future programs. If you want a compiler to support floating point, the best way is using ROM calls, and those functions are going to move, so this is another thing that must happen.

When will there be a release? I have to track down a couple bugs first, then I will make one attempt to generate an 8K ROM. If it fits in 8K again, I'll release that. If not, I'll add several new functions first. No point in building an 16K ROM that only uses 110 bytes of the additional 8K. INSTR is a likely candidate. I want to add PRINT USING, but that is pretty large.

Tuesday, July 9, 2019

I had to remove a couple optimizations, so the speed is just over 9%. Some of that can be gained back, but I'm hunting down another bug that has been around for a while. The expression evaluator hasn't been changed, but it appears to be malfunctioning at times. This is likely to be something in the math library, or s related to switching from a line number to a line pointer. It could take some time to track down.

Sunday, July 7, 2019

Invert, and multiply optimization

On several occasions I'm mentioned using "invert, and multiply" to speed up the MC-10 interpreter.
Instead of dividing by a constant, the updated ROM now multiplies by 1/constant.
When combined with the fast hardware multiply function, it results in significantly faster functions, including SIN, and COS.

For example, if a formula calls for dividing a number by the constant PI (3.14...), you recalculate the constant as 1/PI, and then multiply by it. This also works in your BASIC code, if you have to do a lot of division by a constant, you can speed up the program by multiplying with 1/constant, and that can be calculated at the top of your program.

Anyway, the reason for bringing it up again, is because I ran across this article that explains the concept, and why it works.
https://www.quickanddirtytips.com/education/math/how-to-divide-fractions-using-invert-and-multiply

Saturday, July 6, 2019

10%!

And we have 10%!!!!

Okay, it's not always 10%, but this program is bottlenecked due to the design.

...and another enhancement...

Here is something I've shown before that doesn't require a lot of math.
This indicates over a 10% speedup, but it will take a longer comparison to be sure.

let the programming resume

After taking some time away from personal projects for "life", I've started working on some projects again including the updated BASIC interpreter for the MC-10.

After tracking down a significant bug, this is where things are at. Around an 8% speedup for most code, and much more for math intensive stuff... with more on the way.
This is about 44% faster: