A Bitbanger's Blog: Article 'A Great Old-Timey Game-Programming Hack', and a response

Here's a nice little story related to the 6809. It shows off one of the more interesting optimizations you can use with the 6809. It's also neat to see that people came up with similar solutions in complete isolation from each other.

Link

The screen scroll in my MC-10 (6803) 64-column graphics text code also uses the stack pointer as the destination pointer for similar reasons, but there are differences vs. the 6809.

On the 6803, each register PUSHed or PULLed requires a separate instruction, whereas the 6809 can PUSH or PULL multiple registers with a single instruction. As a result, the 6803 code looks more like their earlier code.

With only one stack pointer, you have to use the index register for the other source or destination pointer. The offset is only 1 byte, and LDD reads two bytes, so you can only go up to an offset of 254 with LDD offset,X before you have to change X. The code looks like this, and it's unrolled for a 256-byte section of the screen:

LDD 254,X ; 2 bytes, 5 clock cycles
PSHB ; 1 byte, 3 clock cycles
PSHA ; 1 byte, 3 clock cycles
LDD 252,X ; 2 bytes, 5 clock cycles
PSHB ; 1 byte, 3 clock cycles
PSHA ; 1 byte, 3 clock cycles

etc...
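Writing out all 128 pairs by hand is tedious, so here is a minimal Python sketch of how the unrolled block could be produced mechanically. It is only an illustration, not my actual generator (which runs on the MC-10 itself), and the function name `unrolled_block` is made up for this example.

```python
def unrolled_block(block_bytes=256):
    """Return assembly lines that copy one block via the stack, highest pair first."""
    lines = []
    # LDD reads two bytes, so the highest usable 8-bit offset is block_bytes - 2.
    for offset in range(block_bytes - 2, -1, -2):
        lines.append(f"LDD {offset},X")
        lines.append("PSHB")  # low byte first: the stack grows downward
        lines.append("PSHA")  # high byte ends up at the lower address
    return lines


listing = unrolled_block()
print("\n".join(listing[:6]))
print("etc...")
```

For a 256-byte block this gives 128 LDD/PSHB/PSHA groups, running from offset 254 down to offset 0.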

You could PUSH/PULL two bytes at a time if you are storing/loading the index register. You would lose the index register as a source pointer, so you have to hard-code the address for each pair of bytes.

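LDX ROWADDRESS+254 ; 3 bytes, 5 clock cycles
PSHX ; 1 byte, 4 clock cycles
LDX ROWADDRESS+252 ; 3 bytes, 5 clock cycles
PSHX ; 1 byte, 4 clock cycles
LDX ROWADDRESS+250 ; 3 bytes, 5 clock cycles
PSHX ; 1 byte, 4 clock cycles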

etc...

Using PSHX takes 5 + 4 = 9 clock cycles per pair of bytes moved instead of 5 + 3 + 3 = 11, so it saves 2 clock cycles per pair, or 2 * ((32/2)*(192-8)) = 5,888 clock cycles per scroll. The code size is the same for both versions: 4 bytes per pair of bytes moved.
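To double-check the comparison, here is a small Python tally that uses only the per-instruction byte and cycle counts quoted in the two listings. The helper name `cost` and the constant names are just for this sketch.

```python
# (bytes, cycles) per instruction, copied from the comments in the listings.
LDD_INDEXED = (2, 5)
PSHA = PSHB = (1, 3)
LDX_EXTENDED = (3, 5)
PSHX = (1, 4)


def cost(*instructions):
    """Sum the (bytes, cycles) pairs of a sequence of instructions."""
    return tuple(sum(column) for column in zip(*instructions))


indexed_pair = cost(LDD_INDEXED, PSHB, PSHA)  # LDD/PSHB/PSHA moves one pair
pshx_pair = cost(LDX_EXTENDED, PSHX)          # LDX/PSHX moves one pair

# 32 bytes per pixel row, 192 rows, minus the 8 rows that scroll off.
pairs_per_scroll = (32 // 2) * (192 - 8)
cycles_saved = (indexed_pair[1] - pshx_pair[1]) * pairs_per_scroll

print(indexed_pair, pshx_pair, pairs_per_scroll, cycles_saved)
```

Per pair of bytes moved, this works out to 4 bytes and 11 cycles for LDD/PSHB/PSHA against 4 bytes and 9 cycles for LDX/PSHX.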

So why didn't I do that?

While this would be faster, you can't just change the index register for each 256-byte block; you have to hard-code the addresses for the entire screen. At 4 bytes of code for each of the (32/2)*(192-8) = 2,944 pairs, that is 11,776 bytes of scroll code.
That may not be a big deal if you have a large RAM expansion, but it's not practical for most MC-10s. However, if you wanted to implement 4 rows of text at the bottom of the screen similar to the Apple II and several other 8-bit machines, then it's not so bad.

The latest code generates the scroll code on the fly at startup, so I could generate either version of the code depending on the hardware you have. We'll see.
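As a rough sketch of how that choice might be made, the Python below sizes the hard-coded LDX/PSHX code for a given region and picks it only when it fits. The function names, the variant labels, and the free-RAM figures are all hypothetical, and it assumes 8 pixel rows per text row, as in the (192-8) above. The real decision would be made by the 6803 startup code, not in Python.

```python
BYTES_PER_PAIR = 4       # LDX extended (3 bytes) + PSHX (1 byte)
PAIRS_PER_ROW = 32 // 2  # 32 screen bytes per pixel row
CHAR_HEIGHT = 8          # assumed pixel rows per text row


def pshx_code_size(pixel_rows):
    """Bytes of LDX/PSHX code needed to scroll a region of the given height."""
    return PAIRS_PER_ROW * (pixel_rows - CHAR_HEIGHT) * BYTES_PER_PAIR


def choose_variant(free_ram, pixel_rows=192):
    """Pick the hard-coded variant only when its code fits in free RAM."""
    return "pshx" if pshx_code_size(pixel_rows) <= free_ram else "indexed"


full_screen = pshx_code_size(192)                   # whole 192-row screen
text_window = pshx_code_size(4 * CHAR_HEIGHT)       # 4 text rows at the bottom
print(full_screen, text_window)
print(choose_variant(4096), choose_variant(16384))  # hypothetical free RAM
```

The 4-row text window needs far less code than the full screen, which is why that case is not so bad.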


Saturday, October 13, 2018
