Sunday, May 28, 2017

Why so many patches were required to get ELSE working and why I was not expecting it.  I've looked at the internals of other versions of Microsoft BASIC before, so why the surprise?  It has to do with the size of the Microcolor BASIC ROM.

Valid keyword tokens are all negative numbers.  That means there are up to 128 tokens without using multiple bytes to represent keywords.  A jump table filled with the address of every possible function those tokens might represent would require 2 bytes per address, times 128 tokens, for a total of 256 bytes.  In 8K, 256 bytes is a lot, and reducing the size of the jump table is an easy way to save space.
Simply group all the valid tokens together starting with the lowest token value, and any token greater than the highest one in the group is a syntax error.  

The other version of Microsoft BASIC I worked with used a 256 byte jump table and did not require the range check.  Any tokens that would be a syntax error simply have the address of the syntax error routine at their location in the table.  This is faster because no range check is required before making the table call, but it does require around 60 more bytes than the code using the range check on the token.  

Microsoft made a trade off, they sacrificing a few clock cycles in exchange for a few bytes.  Sadly, it creates another problem.  In addition to a few clock cycles, it also makes it difficult to add additional commands while maintaining the original tokens.  There were two possible solutions to the problem once I encountered it.  Either patch the functions performing a range check so they perform an additional check for the ELSE token, or switch to larger table.  Given the time and ROM space I had, I chose the first approach.  Now that I have time, I may see if I can switch to the table.  A full 256 byte table isn't mandatory, but it means the interpreter will crash if an unknown token is encountered.

Saturday, May 27, 2017

Here are a few programs I've been using to benchmark the math changes.
I didn't write most of it, so if it does something stupid it's not my fault.  :)


10 W=500:DIM F(W):P=1:A=3:F(P)=A
20 FOR P=2 TO W
30 A=A+2:X=1
40 S=A/F(X):IF S=INT(S) THEN 30
50 X=X+1:IF X<P AND F(X)*F(X)<=A THEN 40
60 F(P)=A
70 NEXT P


10 REM PRIME NUMBER GENERATOR
20 FOR X=1 TO 1000
30 FOR Y=2 TO X-1
40 IF X/Y=INT(X/Y) THEN 70
50 NEXT Y
60 PRINT X
70 NEXT X



This simple BASIC mandelbrot generator required some porting, but it mostly works now.
The original used a MOD function in line 110 which has been replaced, but it has some issues with negative numbers.  The IF THEN was enough to get it running but it's not a proper fix.
The step rate has been cut in half to adjust for 40 columns, but for a benchmark it might be better to use 80 columns, and build a string array with the results instead of just printing them.  That would test some string functions as well as math, and the results could be compared for differences.
It does the job for now.

You could use this as the basis for a graphic version if you adjust the step rate to match the graphics resolution, replace the print with SET() or equivalent function, replace the string with an integer array containing the color numbers you want to use, and replace 5 with the number of colors in your graphics mode.


10 REM SIMPLE MANDELBROT GENERATOR USING TEXT
20 X0=-2: X1=0.5: Y0=-1: Y1=1: I1=20
30 X2=0.06: Y2=0.2: D$=" .-=#"
40 FOR Y=Y0 TO Y1 STEP Y2
50 FOR X=X0 TO X1 STEP X2
60 Z0=0: Z1=0
70 FOR I=1 TO 11
80 Z2=Z0*Z0-Z1*Z1: Z3=2*Z0*Z1
90 Z0=Z2+X: Z1=Z3+Y: IF Z0*Z0+Z1&gt;4 THEN GOTO 110
100 NEXT I
110 IF Z0 AND Z1 &gt; 0 THEN PA=SQR(Z0*Z0+Z1):A=(PA-(5*INT(PA/5)))+1:C$=MID$(D$,A,1):PRINT C$;
120 NEXT X: PRINT
130 NEXT Y

Friday, May 26, 2017

The results of the Retro Challenge are up.  I did not win anything but it was fun.
First of all, thanks to John Linvillle for running the retro challenge this year.
The pay isn't exactly great, and I'm sure everyone involved was already busy.

As for the results... someone who built a Win98 gaming machine finished ahead of me.
Well, that's depressing.


John's comments on my entry:

"James cleverly picked a project platform close to my heart -- the MC-10, a cousin of my retro favorite, the Tandy Color Computer. James sets-out to improve the comptibility between the CoCo and MC-10 BASIC implementations. There is a fair amount of "2 steps forward, 1 step back", as many changes caused somewhat unpredictable breakage in other areas. Having done similar projects before, I suspect that the MC-10 ROM disassembly is a bit imprefect. Nevertheless, he completed the project. Jim Gerrie and other MC-10 BASIC aficionados of the world may now rejoice!"

No clever about it... just trying to back up a claim I made on a forum.
John's "2 steps forward..." comment pretty much sums up the project though.
The disassembly isn't perfect, but it was more than sufficient.  It is far more detailed than other projects I've worked with.

Monday, May 22, 2017

After spending several hours implementing a faster SQR function, there isn't enough space left in the ROM for it. :(

Friday, May 19, 2017

There were two unused bytes on the direct page between Floating Point Accumulator 1 and the current execution line number.  MATHTEMP (added to speed up the math library) has been moved there so that all direct page variables are now at the same location as the original ROM.  As long as a program doesn't try to use the two MATHTEMP addresses, there shouldn't be any incompatibilities on the direct page.
That was going to hold the next line pointer, but that's going to be a new revision so this one might as well be as compatible as possible.
Here's an updated list of BASIC tokens for the MC-10 including the new ELSE statement.


Microcolor BASIC Tokens
Hex Dec Token meaning
80128FOR
81129GOTO
82130GOSUB
83131REM
84132IF
85133DATA
86134PRINT
87135ON
88136INPUT
89137END
8A138NEXT
8B139DIM
8C140READ
8D141LET
8E142RUN
8F143RESTORE
90144RETURN
91145STOP
92146POKE
93147CONT
94148LIST
95149CLEAR
96150NEW
97151CLOAD
98152CSAVE
99153LLIST
9A154LPRINT
9B155SET
9C156RESET
9D157CLS
9E158SOUND
9F159EXEC
A0160SKIPF
A1161TAB(
A2162TO
A3163THEN
A4164NOT
A5165STEP
A6166OFF
A7167+
A8168-
A9169*
AA170/
AB171^
AC172AND
AD173OR
AE174>
AF175=
B0176<
B1177SGN
B2178INT
B3179ABS
B4180USR
B5181RND
B6182SQR
B7183LOG
B8184EXP
B9185SIN
BA186COS
BB187TAN
BC188PEEK
BD189LEN
BE190STR$
BF191VAL
C0192ASC
C1193CHR$
C2194LEFT$
C3195RIGHT$
C4196MID$
C5197POINT
C6198VARPTR
C7199INKEY$
C8200MEM
C9201ELSE

Thursday, May 18, 2017

The memory move optimization has been removed due to a bug.  Once it's fixed I'll put it back in.

Wednesday, May 17, 2017

The faster code to move a block of memory higher in RAM is now in place, and seems to be working.  It looks like it makes a difference in how fast variables are initialized, but I haven't spent a lot of time testing it yet.  It's up and in the same beta version as before.

Tuesday, May 16, 2017

The problem i encountered with the beta version has been fixed.
It was due to the test I tried removing the control and graphics key entry.
An extra opcode was left in the code.
Good thing that's what it was, because that's the first thing I checked.
The beta version has been replaced with the fixed one.

This version has some additional changes to the math library that reduce it's size and improve performance.  This leaves 44 bytes for enhancements that increase the size of the code.  This also means over 100 bytes have been cut from the interpreter since I started working on this.

I'll make a couple more passes through the math lib looking for code that can be optimized.  Once there are no more obvious math optimizations, and it passes testing, I'll remove the beta designation, and finalize it as a release version.

The next version will either focus on memory moves string comparison, and using the pointer to the next line to skip to the end of the line.

*edit*
A quick "LIFE" speed comparison vs the original ROM shows the new version is about 1 second ahead of the original after 1 1/2 generations.

Saturday, May 13, 2017

Even though I mention "optimizations" in the latest beta, they are all fairly small.
The multiply probably offers the biggest improvement and that's mostly due to the fact that it loops... a lot.  Some of the optimizations only save a handful of clock cycles, but they also save a few bytes which are just as important.

The more time I spend in the math library, the more apparent it is that floating point needs to be avoided as much as possible, which would require a redesign of BASIC.  Any significant improvement may not be an option in 8K.

And there some things just can't be fixed in software.  The 6803 can do a lot of things fast, but it has some significant limitations.  Much of the math library depends on operators that directly modify memory for manipulation of the floating point registers.  Even though the floating point registers are located on the direct page, many 6803 instructions don't have support of direct addressing.  The most significant of these are ROL, ROR, INC, and DEC.  This means that every time one of these opcodes is used, it requires an extra byte to store the address, and an extra clock cycle to read that byte.  I would not be too surprised to find 50-100 bytes of additional ROM have been used just to store the zeros of these addresses.  It also means there are a lot of wasted clock cycles.  The 6803 also lacks many important opcodes for dealing with the 16 bit D register.  The missing add carry D, subtract carry D, rotate left D, rotate right D, and compare D, would make several more optimizations possible.

Math library aside, there are still many other optimizations related to strings, memory moves, and basic interpreter functionality that haven't been put in the 2.1B version.  Once everything is in, a program that performs a lot of string manipulation, and comparisons is more likely to see a noticeable speed increase than something that depends on math.  I just don't know if there is enough room for everything without dumping something like the CONTROL-key token entry system.  Dumping that and the graphics entry saves about 100 bytes, but I don't want to dump it unless I have to.

I still have to debug the code that saves the next line pointer, and uses it to advance to the next line.  It will make the biggest difference on code that uses ELSE since it is more likely to skip the remainder of the line.

Here is a beta version with additional changes.  This has had minimal testing but it seems stable.
MC10_ROM_21b.bin


Speed and/or size enhancements:
Multiply
Divide
Misc other math functions
CLS


Friday, May 12, 2017

The last problem was a cut and paste error.  I didn't copy the size of the coefficient table when I added the fixed coefficients.  I updated the source with the fix and the 2.0 ROM.
Thankfully I found it while cleaning up the code instead of having to step through it.
My tests seem to all pass now.
Let me know if you have any problems.

Thursday, May 11, 2017

I found a piece of code that isn't functioning correctly, so I've got some tracing to do.
Before someone asks (not that I expect anyone to ask), the special cases I mentioned before are due to adding the ELSE at the end of the keyword, and command dispatch tables.  If ELSE had been built in from the beginning it could have been placed in a different location in the tables and no special cases would have been required.
 
The only thing important that still seems to be missing from Microcolor BASIC is OPEN, and CLOSE, which would also have other related file I/O handling.  You can duplicate some of that with CLOAD* and CSAVE* which read and write numeric arrays.  Sadly, they don't work on string arrays and that would probably require more room than would be available.
That would have been awesome for adventure games, RPGs, etc... where you have a bunch of arrays that have to be initialized from DATA statements.

What I didn't say at 2:00 a.m.

This fix always inserts a colon before ELSE, even if there is already one there.  It's not perfect, but the goal is to prevent a crash which is worse, and do it without slowing things down.

If there was enough space, I'd scan each line to remove redundant colons before copying them to their final location.  Extra spaces could be removed as well.  We could even create a new command to enable or disable the feature.  Sounds great right?
Well call it CRUNCH and parameters are 1 and 0, for ON and OFF.
"CRUNCH" is 6 bytes that have to be added to the command table, and we have to store a pointer to where it is in ROM which is 2 bytes.  That leaves 2 bytes of ROM which isn't even enough for the JMP instruction to return to the main interpreter.
If it's a permanent feature we save 8 bytes, but a single test and branch take 4 bytes, and I don't think it's going to fit in 10 bytes.

Give me more ROM and I can do a lot.

Back on April 1st, I said the Color BASIC version of IF THEN ELSE was about 60 bytes. I expected to possibly have 30 unused bytes left when I started, but all the special cases ate that up, and this has none of the 16 bit optimizations I wanted to add to the code.  So as far as running faster goes, yes it does but not like I'd hoped.

Now I'm going to clean up the code and put padding in front of the math library so the math functions sit at their original addresses.  That way anything that directly calls the math library will still work.
I think I'll call that good for the final V2.0 ROM unless I find another bug.  Maybe I can fix the bug in VARPTR in 10 bytes if there aren't any bugs related to ELSE.

*edit*
The math library is already at the correct address since the text string with the programmer's name followed it, and it happens to be... 10 bytes long.  How weird of a coincidence is that.

If I create a run only version that eliminates CSAVE, I could put in quite a few math optimizations.  It would mean editing and testing with one version, and using another version just to RUN the final code.  Both versions could be burned into a 16K ROM and an external switch could be used to switch between versions with CSAVE or fast math.
Come to think of it, the version might even be switchable at the command prompt without even rebooting the computer.
Solving the :ELSE issue in the subline/endofling scan routine required too much code and just slowed the interpreter down.  It turned out to be a complete waste of time.  I finally just inserted a colon before ELSE in the crunch routine.  It's only 5 instructions and it doesn't impact the speed of the interpreter which is why Microsoft did it that way.  It also took about 5 minutes, which was considerably less time than the other approach.  Some things just aren't worth fighting with.

This
          inx                                    ; pre-increment the output pointer
          stx       OUTPTR              ; store output pointer
...

Became this:
          inx                                         ; pre-increment the output pointer
          cmpb #$C9 ; Token for ELSE?
          bne crnext
         ldaa #':' ; output a colon
          staa ,X
          inx                                         ; increment the output pointer
crnext
          stx       OUTPTR                   ; store output pointer
...


The code has also been cleaned up a bit where it tests for the ELSE special case in the dispatch routine.

I posted the ROM but it still needs more testing, which won't be tonight.
There are still 10 unused bytes I haven't decided what to do with yet.

Wednesday, May 10, 2017

Just a quick note.  I looked at a hex dump of MCX BASIC, and ELSE appears to be the first new keyword.  If that is indeed the case, then programs saved from this upgrade should load on MCX BASIC.

Tuesday, May 9, 2017

I finally took a look at the colon issue the other day, and what I thought might be the best approach is to just issue an SN ERROR if there is no colon.  But this was met with unintended consequences, so I have to look at more code before I have a final solution.

The problem,is how IF THEN ELSE is parsed and executed.  Once the interpreter evaluates the test that follows the IF, it has to either execute the THEN condition which can contain multiple statements, and skip what follows the ELSE, or it has to skip to the ELSE and execute what lies after it.  Having to execute multiple statements means jumping back to the main interpreter which expects colons between commands.   But somewhere in the mix the interpreter is running off to never never land which can be difficult to trace.  I think it's in an endless loop and keeps pushing to the stack until it overwrites the hardware registers.  I'm guessing it just needs to advance the pointer like a previous fix, but I haven't located where yet.

Correction... it needs to issue an SN ERROR instead of going into an endless loop.


Edit...
Or I could be issuing an RTS when there isn't a return value on the stack.
ELSE reuses to the REM code... but it shouldn't.
At least that appears to be the problem.  Working on the fix...
One question... why does REM use RTS when it should JMP to the code to process the next line?
!!!!!!!!!!!!!!!!!

Tuesday, May 2, 2017

The code to insert a colon in front of ELSE during the line crunch should be easy enough.  It's probably only 3 or 4 instructions.  The code shouldn't crash if there isn't a colon, it should issue s syntax error, so I'll have to fix that as well.  I based my code on Color BASIC, so there is a chance that would crash just as spectacularly.
I don't have time to look at this at the moment, but when I do I'll post fixes and maybe start working through the other optimizations.  All these fixes are eating up ROM space so I'm not sure what will still fit.

Monday, May 1, 2017

The fix for the LIST command bug in the original source is pretty simple.
You have to increment the pointer so the loop doesn't get stuck on the same character..

Change this code:

LE49C         stab      UNCRFL              ; store flags byte
                      jsr       RVEC12              ; call UNCRUNCH extension hook
                      cmpa  #$C8                ; compare with highest token value (MEM)
                      bhi      LE462               ; output exclamation point (!) for invalid tokens
   

To this:

LE49C         stab      UNCRFL              ; store flags byte
                      jsr       RVEC12              ; call UNCRUNCH extension hook
                      cmpa   #$C8                ; compare with highest token value (MEM)
                      ble      skip
                      inc      TMPTR1
                      bra      LE462               ; output exclamation point (!) for invalid tokens
skip



This raises a question.  With numerous stories of people trying to load CoCo BASIC programs into the MC-10 when they were kids, why didn't anyone mentions this bug?
Well, it turns out my problem with the ELSE requiring a colon in front of it is not unusual.  In fact it is a requirement of every Microsoft BASIC that uses ELSE.  The tokenizer (cruncher) automatically inserts it into the code and the LIST command just hides it.  I was just unaware of this behavior.

Thanks for the info Darren!


Well, lack of testing clearly reared it's ugly head.

The LIST command barfs on ELSE because it does a range check on valid tokens and it goes into an endless loop.  I have already added a special case to deal with this and LIST now works, but it does expose a problem with the list command.  It should not go into an endless loop when an invalid token is encountered.  This is a bug in the original code that will eventually need to be fixed.

The other problem is that ELSE needs a : in front of it in order to prevent the interpreter from crashing hard.  This one might take more than a couple minutes to figure out, but it's likely to be in the routine that scans for characters.

The ROM image with a fixed LIST command has replaced the original version.
I don't have time to track down the other problem at the moment, but if I have time, it will be fixed today. ELSE will need a colon in front of it until then.


Retro Challenge Postmortem

The contest deadline has passed, my entry is in, I delivered everything I said I would, and I even threw in corrected math coefficients thanks to phyber0pt1k on the MC10 yahoo group.  Thanks for the contribution.

However, I had hoped for a greater speed improvement.  I implemented a LOT of optimizations on a test version of of the interpreter, but I ran into some problems with that many changes implemented all at once, and I think there was an issue with the initial ROM I started with.  I had to backtrack and create a clean ROM image and go from there.  Given the time remaining on the project it was better to play it safe than include minimally tested optimizations, especially when most of them only offered small performance gains.  

To make matters worse, I was chasing down a syntax error my ELSE test code was throwing up to 11:50 PM.  I had commented out the ELSE keyword for some unknown reason and it wasn't being tokenized.  I spend over an hour stepping through code trying to figure out why the ELSE code wasn't being called.    

After adding the ELSE statement, there are still 26 BYTEs of unused ROM, and there are some other space/speed optimizations that should improve on that.  Hopefully this will leave enough room for some optimizations that require more space  than the original code.

And finally, this might not have been the best choice of projects for a contest.  There are no fancy graphics or animation, no hardware to take photos of, and... well... in spite of the difficulty level there just isn't a lot to see.  The confines of an 8K ROM don't really allow any dramatic changes in speed that are going to really be obvious on video.