Date : Wed, 20 Jul 1994 16:22:08 EST
From : Stephen Quan <quan@...>
Subject: Re: My 6502 now dead fast!
Chris Rae writes :
> My 6502 emulator now runs the 256^3 loop in 37.4 seconds on my dx/40.
Hey, now that is a good time! I don't think my current emulator under
C can ever match this.
I have been examining alternate ways of getting performance. Who
else is also using C with one big switch statement to select the
instruction to be executed? I have recently found that this switch
statement costs nearly the equivalent of 8 if statements, and have
thought about using if statements instead of a switch statement.
Something like a binary search.
It seems to be only feasible if it can find your opcode when it
has traversed at most 3-4 deep in the binary search tree. If it
traverses deeper you'll get performance less than using a plain
switch statement. So if we can move the more frequently used
instructions to the top of the tree, we can get big savings.
Anyone seen a Huffman compression algorithm? I have seen it once,
and they used something similar to this. In otherwords, this technique,
adopts the Huffman compression algorithm to compress execution time!!
If the only instruction you'll execute is the instruction at the top
of the binary search tree, then on Chris Rae's platform 386DX-40 and
running 33,000,000 instructions will take about 15 seconds. If you
run the top instruction and instructions 2-levels deep, you'll execute
33,000,000 instructions in about 25-30 seconds. Since the 256^3
loop involves mainly DEY, DEX and BNE. These 3 instructions could be
moved to the top 3 entries of this tree to give the best results.
However, in order to build a general binary search tree algorithm
to give it performance comparable to a beeb, you'll have to move the
7-15 most frequently used instructions to the top of the tree, this
could be tricky to construct in C.
if (opcode == 0x80)
goto opcode_0x80;
else if (opcode > 0x80)
goto try_above_0x80;
else
goto try_below_0x80;
Let's say opcode 0xa9 (I think is the LDA #immediate) instruction is
the most frequent instruction how do we remap it so that it is at the
root of a binary search tree? Well you do exactly that, you remap it
there.
new_opcode = remap[opcode];
if (new_opcode == 0x80)
goto LDA_immediate;
else
goto try_above_0x80;
else
goto try_below_0x80;
Anyhow, implementing this thing is a real pain, especially trying
to obtain every case in the binary tree. There is just too many
goto cases to consider. I am looking at writing a C program that
writes the binary tree for me, as well as writing a remapping and
profile utility that lets me interactively select which instructions
I want to be moved to the top of the tree. I think this technique
is an interesting technique to implement, but I fear that it lacks
the smoothness and elegance of a plain switch statement.
--
Stephen Quan (quan@... ), SysAdmin, Analyst/Programmer.
Centre for Spatial Information Studies, University of Tasmania, Hobart.
GPO BOX 252C, Australia, 7001. Local Tel: (002) 202898 Fax: (002) 240282
International Callers use +6102 instead of (002).