Wednesday, February 7, 2018

Finalizing a new ROM release

It's time for a new MC-10 ROM release.  I've squeezed about as much into 8K as possible within a reasonable amount of time.  As of now there are around 10 bytes free in the ROM, but it's not enough to implement storing the pointer to the next line... which was the only other optimization I thought might fit in 8K.


What's going to be in this release?

1. A faster divide.  Some cycles were removed from the inner loop.

2. Faster screen scroll.  The INX INX pair was replaced with LDAB # and ABX, and the loop was unrolled once.  This saves over 1000 clock cycles per scroll.

3. Faster end of line handling.

4. Faster array handling thanks to a 16x16 bit multiply using the hardware multiply.

5. Some minor optimizations here and there to save space and/or clock cycles.

6. A faster CHRGET.  The original parsing routine, CHRGET (common to all Microsoft BASICs), was split between direct page RAM built into the 6803 and ROM.  The original code would increment the memory pointer, load a byte, and then JMP to the second half of the code in ROM.  The code has been updated so the full routine is copied to RAM.  Since it extends past the end of the direct page ($00FF), the ROM tests address $0100 to see whether the code copied there is actually present.  If not, it patches the code to JMP to the ROM.  Even if there isn't any expansion RAM at $0100, there was still room on the direct page to handle the most common case before jumping to ROM.  It's always faster than the factory ROM, but if you have RAM in that address range, it's even faster.
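
To illustrate item 4: the 6803's MUL instruction multiplies two unsigned 8-bit values (A and B) into a 16-bit result in D, so a 16x16 multiply has to be assembled from 8x8 partial products.  The sketch below models that idea in Python rather than 6803 assembly.  It is not the code from the ROM; the function names mul8 and mul16 are made up for the example, and it assumes only the low 16 bits of the product are needed, which is enough for an array offset in a 64K address space.

```python
def mul8(a, b):
    """Stand-in for the 6803 MUL instruction: unsigned 8x8 -> 16-bit product."""
    assert 0 <= a <= 0xFF and 0 <= b <= 0xFF
    return a * b


def mul16(x, y):
    """Low 16 bits of an unsigned 16x16 product, built from three 8x8 multiplies.

    x*y = (xh*yh << 16) + ((xl*yh + xh*yl) << 8) + xl*yl
    The xh*yh term lands entirely above bit 15, so it is skipped
    (assumption: only a 16-bit result is wanted).
    """
    xh, xl = (x >> 8) & 0xFF, x & 0xFF
    yh, yl = (y >> 8) & 0xFF, y & 0xFF

    low = mul8(xl, yl)                     # contributes to bits 0..15
    cross = mul8(xl, yh) + mul8(xh, yl)    # contributes starting at bit 8

    # Only the low byte of the cross terms survives in a 16-bit result.
    return (low + ((cross & 0xFF) << 8)) & 0xFFFF


if __name__ == "__main__":
    # Example: offset of element 300 in an array of 5-byte floats.
    print(mul16(300, 5))
```

Dropping the high-by-high partial product is what keeps this to three multiplies instead of four.  A routine that needed the full 32-bit result would have to keep that term and the carries out of the cross terms as well.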


How does the performance compare to the original ROM?

My original goal was a minimum of a 5% speedup across the board, and beating 1 MHz 6502 machines like the Apple II and C64 at Ahl's Benchmark.  Testing has shown the latest version to be about 8% faster on the slowest code and 10% faster on code using arrays but little math, while math-intensive things like the 3D "Fedora" plot, which uses a lot of multiplication, take about 33% less time.  Ahl's Benchmark still sits at about 67 seconds, which is where it was after the first use of the hardware multiply, though it's tough to tell with hand timing.  The MC-10 consistently benchmarks faster than 1 MHz 6502 machines running Microsoft BASIC, not just at Ahl's Benchmark, but at everything.  The speed difference vs the original ROM is now obvious when drawing, printing, etc.

The only thing left to do on this release is to track down a bug in the error message handling.  The error printing isn't receiving the right error codes.  This is probably due to an optimization that handles the status bits incorrectly.
