Another speed optimization I'm working on involves the CHRGET subroutine, which is called to fetch every character that gets parsed. CHRGET is a piece of self-modifying code that is copied from ROM to the direct page at startup. The function looks like this:
;*
;* Byte parser subroutine utilizing self-modifying code.
;* This routine is copied to RAM at CHRGET ($00EB) during cold start.
;*
        fcb  INIDAT-PARSER  ; number of bytes to copy
PARSER  inc  CHRPTR+1       ; increment LSB of parse location
        bne  LF7D8          ; branch if no carry
        inc  CHRPTR         ; increment MSB of parse location
LF7D8
;       ldaa $0000          ; load byte from parse location into ACCA
        fcb  $B6,$00,$00
        jmp  BPARSE         ; call back-end of parser routine in ROM

If we can place CHRPTR in X, we can move this to ROM right in front of BPARSE and reduce it from 22(?) clock cycles to 7(?) clock cycles. If we don't have to update CHRPTR until we exit BPARSE, we save even more clock cycles, since BPARSE loops back to CHRGET.
CHRGET2
        inx
        stx  CHRPTR
        ldaa ,X             ; get the next character

This also resulted in only a 1-byte increase in code size, thanks to the removal of a couple of STX CHRPTR instructions elsewhere. Sadly, only 2 of the 20+ calls can use this optimization without major changes, and neither will speed up program execution.
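As a rough sanity check on the cycle estimates above, here is a small Python tally. The per-instruction counts are recalled from the MC6801/6803 timing tables and should be treated as assumptions until checked against the data sheet. The extended-mode JSR charged to the caller and the choice of the no-carry path through the RAM routine are assumptions as well.

```python
# Cycle counts per instruction and addressing mode.
# ASSUMPTION: values recalled from the MC6801/6803 timing tables; verify them.
CYCLES = {
    "jsr ext": 6,   # caller's JSR to CHRGET (assumed extended addressing)
    "inc ext": 6,   # inc CHRPTR+1
    "bne": 3,       # bne LF7D8
    "ldaa ext": 4,  # the self-modified ldaa $0000
    "jmp ext": 3,   # jmp BPARSE
    "inx": 3,
    "stx dir": 4,   # stx CHRPTR (direct page)
    "ldaa idx": 4,  # ldaa ,X
}


def total(sequence):
    """Sum the cycle counts for a straight-line instruction sequence."""
    return sum(CYCLES[op] for op in sequence)


# RAM-resident CHRGET on the common path (no carry into the MSB).
ram_chrget = ["jsr ext", "inc ext", "bne", "ldaa ext", "jmp ext"]

# CHRPTR held in X, code placed directly in front of BPARSE.
inline_x = ["inx", "ldaa idx"]

# CHRGET2 as listed, which still stores X back to CHRPTR.
chrget2 = ["inx", "stx dir", "ldaa idx"]

print(total(ram_chrget))  # 22
print(total(inline_x))    # 7
print(total(chrget2))     # 11
```

Under these assumptions the totals agree with the 22 and 7 quoted above, and the STX CHRPTR accounts for the 4 cycles that separate CHRGET2 from the two-instruction version.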
If someone builds an MC-10 clone using a 68HC11-based microcontroller (as has been suggested in the MC-10 Yahoo group), the code could easily be optimized by placing CHRPTR in the Y register. Then every update to CHRPTR just involves loading Y or incrementing Y, and CHRGET becomes INY followed by LDAA ,Y.
This should speed up the interpreter significantly.