There are several more patches I'm going to release.
The CHRGET/CHRGOT patch and the replacement for the divide-by-10 code used in ASCII conversion are about ready to go. They should speed up all programs by somewhere between 2% and 4%, if the results on the MC-10 are any indicator. These could cause a few programs to act differently or even fail, depending on what they do. That's unlikely but possible, so the patch will be designed so it can be assembled with or without that code.
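The post doesn't say how the divide by 10 gets replaced, so the following is only a sketch of one common trick: multiplying by a scaled reciprocal and shifting instead of running a general divide routine. It works on plain 16-bit integers for illustration; the function names are made up here, and the real ROM routine may well work on a different number format.

```python
def div10_u16(n: int) -> int:
    """Divide an unsigned 16-bit value by 10 using only a multiply and a shift."""
    assert 0 <= n <= 0xFFFF
    # 0xCCCD / 2**19 is slightly above 1/10. The excess is small enough
    # that the truncated result is exact for every 16-bit input.
    return (n * 0xCCCD) >> 19


def to_ascii_digits(n: int) -> str:
    """Convert an unsigned 16-bit value to decimal text (hypothetical helper)."""
    digits = []
    while True:
        q = div10_u16(n)
        # Remainder comes from a multiply and subtract, not a second divide.
        digits.append(chr(ord("0") + (n - q * 10)))
        n = q
        if n == 0:
            break
    # Digits were produced least significant first, so reverse them.
    return "".join(reversed(digits))


print(to_ascii_digits(65535))
```

The point is that every digit costs one multiply, one shift, and one subtract, which is usually much cheaper than a loop-based divide.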
There is a 16x16 multiply that can be replaced. This speeds up array indexing if I remember right. The code should be easily ported from the MC-10.
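As a rough illustration of what a 16x16 multiply looks like on a CPU whose MUL instruction only does 8x8 -> 16, here is a sketch that builds the product from four partial products. This is not the MC-10 code itself, just the general shape of the technique; mul8 and mul16 are names invented for this example.

```python
def mul8(a: int, b: int) -> int:
    """Stand-in for an 8x8 -> 16-bit unsigned hardware multiply."""
    assert 0 <= a <= 0xFF and 0 <= b <= 0xFF
    return a * b


def mul16(a: int, b: int) -> int:
    """16x16 -> 32-bit unsigned multiply built from four 8x8 partial products."""
    assert 0 <= a <= 0xFFFF and 0 <= b <= 0xFFFF
    a_hi, a_lo = a >> 8, a & 0xFF
    b_hi, b_lo = b >> 8, b & 0xFF
    return (
        mul8(a_lo, b_lo)                                  # low x low
        + ((mul8(a_lo, b_hi) + mul8(a_hi, b_lo)) << 8)    # the two cross terms
        + (mul8(a_hi, b_hi) << 16)                        # high x high
    )


print(hex(mul16(0x1234, 0x00FF)))
```

If only the low 16 bits of the result are needed, as is plausibly the case when computing an array offset, the high x high term never reaches them and can be dropped.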
There is a series of 6309 patches that can be easily implemented. Screen scrolling and memory moves are the easiest, and the new code should fit over the old code in memory.
A fast divide and a faster multiply can be implemented. The 6309 divide and larger multiply instructions are signed, so that will require a workaround, but it's doable. Those will probably offer the greatest improvement for the least work. There are also little patches that can go here and there, but those will require a lot of testing to be sure they don't break anything. Many of these will even work on a CoCo 1 & 2, since they will fit in the original ROM space. The fast multiply might even fit.
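One possible workaround for the signed multiply, sketched below, is to let the signed instruction run on the raw bit patterns and then correct the result. This is an assumption about how it could be done, not necessarily what the final patch will use. signed_mul16 stands in for a signed 16x16 -> 32 hardware multiply, and both function names are made up for this example.

```python
def signed_mul16(a: int, b: int) -> int:
    """Stand-in for a signed 16x16 -> 32-bit hardware multiply.

    Takes two raw 16-bit patterns and returns the raw 32-bit result pattern.
    """
    def as_signed(v: int) -> int:
        return v - 0x10000 if v & 0x8000 else v

    return (as_signed(a) * as_signed(b)) & 0xFFFFFFFF


def unsigned_mul16(a: int, b: int) -> int:
    """Unsigned 16x16 -> 32-bit product recovered from the signed multiply."""
    p = signed_mul16(a, b)
    # If an operand has its top bit set, the signed multiply treated it as
    # (value - 65536), so the product is short by the other operand << 16.
    if a & 0x8000:
        p += b << 16
    if b & 0x8000:
        p += a << 16
    return p & 0xFFFFFFFF  # keep the 32-bit register width


print(hex(unsigned_mul16(0xFFFF, 0xFFFF)))
```

The correction costs at most two 16-bit adds into the high word, so most of the speed of the hardware multiply is kept.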
New square root (SQR) code. There wasn't enough space for this in the MC-10 ROM, but on the CoCo 3 it will sit in RAM above ROM, so space won't be an issue. Most of the code has already been worked out on the MC-10, but there was one lingering issue: the new code depends on a "magic" constant. The existing one only works for double precision numbers, and the appropriate number for the CoCo/MC-10 floating point format will have to be generated. It may also impact precision. *IF* I get this working, it should run about 20% faster. This should drop Ahl's benchmark numbers down close to those of an IBM PC, and SQR is also used in the fractal generator code I've been tinkering with.
This mod may take some time to work out.
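To show why a "magic" constant is tied to one floating point format, here is a sketch of the usual bit-pattern trick for seeding a square root, written for IEEE double precision. It assumes the new SQR code uses something along these lines, which the post doesn't confirm. The constant below is the plain, untuned one derived from the exponent bias and mantissa width; a format with a different bias, mantissa width, or bit layout needs a different constant.

```python
import struct

MANTISSA_BITS = 52    # IEEE double precision
EXPONENT_BIAS = 1023  # IEEE double precision
# Halving the bit pattern also halves the bias, so half the bias is added back.
MAGIC = EXPONENT_BIAS << (MANTISSA_BITS - 1)


def sqrt_seed(x: float) -> float:
    """Rough square root guess from integer math on the bit pattern.

    Only meant for positive, normal, finite x.
    """
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    bits = (bits >> 1) + MAGIC
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def fast_sqrt(x: float, steps: int = 3) -> float:
    """Refine the seed with a few Newton-Raphson steps."""
    y = sqrt_seed(x)
    for _ in range(steps):
        y = 0.5 * (y + x / y)
    return y


print(sqrt_seed(2.0), fast_sqrt(2.0))
```

The seed is only good to a few percent, so the number of refinement steps decides both the speed and the final precision, which fits the concern above that the change may impact precision.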