Tuesday, August 29, 2017

ADCD, ADCD, wherefore art thou ADCD? (or... What is missing on the 6803?)

The 6803 isn't bad, but here is a list of what I think are the biggest oversights or shortcomings in the design.
  1. No prefetch
    A prefetch makes a huge difference in performance without having to increase the clock speed. It typically reduces the number of clock cycles per instruction by 1. One of the key reasons why the 6502 is so fast is its prefetch. Perhaps the biggest speed improvement the Hitachi 6309 has over the 6809 is a prefetch. The Hitachi HD6303 (compatible with the 6803) has a prefetch. That makes the 6303 at least 20% faster with ZERO changes to the code. An MC-10 with a 6303 should benchmark similarly to 2MHz 6502s or 4MHz Z80s.
  2. No direct addressing form of INC or DEC.
    When you have so few registers, directly incrementing or decrementing your loop counters in memory saves a lot of register shuffling. A direct addressing mode would save a clock cycle for every INC or DEC executed in a loop. Execute a loop 100 times and you save over 100 clock cycles, counting the initialization of the counter. And that's if you aren't using 16 bits, where you might need to DEC or INC twice per loop. Where you don't use indexing, you can use X as a counter, but that's usually for counting down to zero, where you don't have to perform a 16 bit test, just BNE until the loop is complete. Direct addressing also requires 1 less byte everywhere it's used.
  3. No ADCD (Add Carry D) or SBCD (subtract carry D). The missing ADCD really impacts the math library. You have to use two instructions, ADCB and ADCA, which slows down the code and makes it larger. The BASIC math library could have used 20 or more ADCDs, and many of them are in loops. Really, 16 bit support on the 6803 and 6809 should have been better. One of the key advancements the 6309 has over the 6809 is finishing out the 16 bit opcodes.
  4. No XGDX (exchange D & X) instruction.
    Without XGDX, moving data between D & X requires multiple instructions and an intermediate RAM location. You have to perform math on pointers, but other than ABX, you can only do that with the accumulators. This could have sped up my math library optimizations quite a bit. It would speed up my line drawing code as well. The HD6303 and 68hc11 support XGDX. The 68hc12 and 6809 have LEA for performing some math on index registers, and you have the ability to transfer between registers. The same pretty much goes for XGDS. It would make a significant difference when adjusting the stack from compiled code. It's not as efficient as LEAS, but turning 8 or more instructions into 3 and eliminating multiple RAM accesses is a pretty significant improvement.
  5. No Y index register like the 6809, 68hc11, etc... The single index register presents a few problems. You can use the stack pointer for some operations, but that may involve disabling interrupts and you can't use offsets from it like X.
  6. No stack relative addressing. Stack relative addressing is important for compilers. Most high level language compilers pass parameters on the stack, and dynamically allocate variables on the stack. When you only have one index register, accessing variables passed on the stack can become a register swapping mess. Just adding stack relative addressing suddenly makes the 6803 a lot more efficient for compiled code. The 8080, Z80, and 6502 lack stack relative addressing as well, but the 6803 suffers from a bit more index register swapping. It also makes using the stack pointer as an index register much more like using X.
  7. No divide instruction. Hardware math is faster than software math. A hardware divide is going to speed up a lot of math intensive applications. When combined with the MUL instruction, you have a machine that's pretty good for Mandelbrots, calculating primes, fractals, 3D plots, etc... All fun stuff that 8 bit machines take hours to do. The instruction would only be used a few times in an 8K ROM, but the difference in speed where it would be used is significant.
  8. The built-in hardware is addressed on the direct page and cannot be relocated. It interferes with existing software applications from 6800 systems like FLEX. It's not a big deal if you have the source code and don't need to use all of the direct page, but patching binaries to use different index registers isn't simple. The 68hc11 uses $1000, so it doesn't interfere with the direct page, but then it interferes with the code. The HD64180 (Z180) allows you to select where in the Z80 I/O region the hardware is located. Being able to set the high nibble or byte would let the hardware be addressed in a region FLEX normally uses for hardware.
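
To make item 3 a bit more concrete, here is a rough Python model of what the ADCB/ADCA pair does compared with a single ADCD. This is only a sketch of the arithmetic, not 6803 code: the function names are made up for illustration, condition flags other than carry are ignored, and nothing here says anything about cycle counts.

```python
def adc8(a, b, carry):
    """Add two 8 bit values plus a carry-in; return (8 bit result, carry-out)."""
    total = a + b + carry
    return total & 0xFF, total >> 8


def adcd_two_step(d, operand, carry):
    """16 bit add with carry done as two 8 bit adds, low byte first.

    Models the ADCB (low byte) then ADCA (high byte) sequence.
    """
    lo, carry = adc8(d & 0xFF, operand & 0xFF, carry)  # ADCB
    hi, carry = adc8(d >> 8, operand >> 8, carry)      # ADCA
    return (hi << 8) | lo, carry


def adcd_single(d, operand, carry):
    """What a hypothetical one-instruction ADCD would compute."""
    total = d + operand + carry
    return total & 0xFFFF, total >> 16


if __name__ == "__main__":
    result, carry_out = adcd_two_step(0x12FF, 0x0001, 0)
    print(hex(result), carry_out)
```

Both functions give the same answer for every input, which is the point: ADCD would add nothing new, it would just do the job in one instruction instead of two.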
