Date   : Fri, 05 Aug 1994 13:28:27 +0100 (BST)
From   : clr1@...
Subject: Re: Assembly register usage.

I sent this to SQ individually by mistake; thought I'd post another copy 
to the list. Sorry you got two, SQ!

+-------------------+-------------------------------------------------+
| /-- |_| /-- | (~  | "And the driving is like the driving of Jehu,   |
| \-- | | |   | _)  | the son of Nimshi, for he drives furiously."    |
+-------------------+-------------------- Second Book of Kings 9 v20 -+

> Hi Chris!  That increment does not cost an extra clock cycle at all.
> This increment is free!  This is the reason why I suggested it.

Oops. Well, I did say I wasn't sure. I'm afraid you seem to know a great 
deal more than me about the hardware side of things...

> To fully partially understand why, you need to remember that this CPU
> is implemented with electronics, and it is possible to have more than one
> part of the electronics in the CPU active at once.  I have done some
> VLSI design and know that autoincrement is just a piece of wire added
> to your circuit so that whenever you access (say) the PC register, you
> can have it autoincremented in the same clock cycle.

Fully partially?   ;-)

Okeydokey - I believe you about the chip things! I don't understand the 
electronics side of computing very well. It was probably a bad idea to 
start writing an emulator then, I hear everyone saying...

>    Besides, the cycle costs (according to my reference) are:
> 
>        LODSB                         = 5 cycles
>        MOV   AL,[DS:SI]              = 5 cycles
>        MOV   AL,[DS:SI+displacement] = 9 cycles

Hmm. According to my TASM reference manual, MOV AL,[DS:SI] takes 4 cycles 
on a 386. I haven't got it here so I can't check, but I realise you may 
be quoting cycles for a different processor.

> I suggested LODSB as an alternative
> 
>        LODSB            (jump based on this)
>        LODSB            (load #12)
>        MOV   CL,AL      (put it into the accumulator)
> 
> This would cost me 5+5+2 = 12 cycles.

Err... umm... it would mess up my system of storing the PC flags in AH 
though. Bearing that in mind, I reckon it is six of one and half a dozen 
of the other; your system is faster for all instructions and mine is 
faster for some (and arguably the ones you do pretty often). There's not 
much in it. Actually, looking at the code, yours is probably better, but 
right now I'm 
not *that* bothered about speeding up my 6502; I've got a few other ideas 
up my sleeve which should probably give 10% or so more speed but they're 
a bit complicated to implement before my 6502 is fully working!

Once I get a whole beeb working (dream on!) I'll get back to work on my 6502.
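
For anyone following the arithmetic, here is a rough C sketch of what the quoted LODSB / LODSB / MOV CL,AL sequence does and how the 5+5+2 total arises. It is only a model under assumptions: the `Regs` struct, the `len` bounds field and the function names are made up for illustration, the cycle figures are simply the ones quoted above, and the direction flag is assumed clear so that SI counts upwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the registers the fetch sequence touches. */
typedef struct {
    const uint8_t *mem;  /* stands in for the DS segment */
    size_t len;          /* size of mem, only used for the bounds check */
    uint16_t si;         /* emulated 6502 program counter, kept in SI */
    uint8_t al, cl;      /* CL holds the emulated accumulator */
    unsigned cycles;     /* running total, using the quoted figures */
} Regs;

/* LODSB: AL = [DS:SI], then SI = SI + 1 (direction flag assumed clear). */
static void lodsb(Regs *r) {
    assert(r->si < r->len);
    r->al = r->mem[r->si];
    r->si++;
    r->cycles += 5;      /* quoted cost of LODSB */
}

/* MOV CL,AL */
static void mov_cl_al(Regs *r) {
    r->cl = r->al;
    r->cycles += 2;      /* quoted cost of a register-to-register MOV */
}

/* Runs the quoted three-instruction sequence and returns the cycle total. */
static unsigned fetch_load_immediate(Regs *r, uint8_t *opcode) {
    lodsb(r);            /* fetch the opcode (the value that is jumped on) */
    *opcode = r->al;
    lodsb(r);            /* fetch the immediate operand */
    mov_cl_al(r);        /* put it into the emulated accumulator */
    return r->cycles;
}
```

Run over the two bytes A9 0C (6502 LDA #12), this leaves 12 in CL, SI advanced by two, and a total of 12 cycles, with no separate instruction spent on bumping SI.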

> I haven't added the overheads of jumping.  This would make the
> difference in the implementations almost nothing.  4 IBM cycles
> per (say) 50 IBM cycles?  It really depends on how the rest of
> the emulator looks, after you have graphics et al support built

Well, I think you're underestimating the jumps. I'm probably out of my 
depth here but I definitely remember it taking >4 cycles for two jumps. I 
initially used a CALL and RET system but I realised that it would be much 
faster to do, eg, instead of:

  MOV  BX,OFFSET routine   ; BX = routine offset
  CALL BX                  ; pushes the return address
  JMP  next_instruction
routine:
  RET                      ; pops the return address

(you get the gist)

I could instead do:

  MOV  BX,OFFSET routine   ; BX = routine offset
  JMP  BX                  ; nothing goes on the stack
routine:
  JMP  next_instruction

which saved a large amount of time (as far as I recall, 20%!) because the 
addresses didn't have to go onto the stack.
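
The same two dispatch shapes can be sketched in C, purely as an illustration and not as the code from my emulator. The first loop reaches each handler through a function pointer, so every instruction costs a call and a return; the second keeps the handlers inside one switch, which a compiler is free (but not obliged) to turn into an indirect jump plus a jump back to the top of the loop. The `Cpu` struct and function names are invented for the example, only two real 6502 opcodes are handled (A9 = LDA #imm, E8 = INX), flags are ignored, and every other opcode, including 00, is treated as "halt" for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified 6502 state (no flags, no stack). */
typedef struct {
    const uint8_t *mem;  /* program; assumed to end with an unhandled byte */
    uint16_t pc;
    uint8_t a, x;
    int halted;
} Cpu;

/* Style 1: CALL/RET analogue. Handlers are separate functions. */
typedef void (*Handler)(Cpu *);

static void op_lda_imm(Cpu *c) { c->a = c->mem[c->pc++]; }
static void op_inx(Cpu *c)     { c->x++; }

static void run_call_ret(Cpu *c) {
    /* Unlisted entries are NULL and are treated as "halt". */
    static const Handler table[256] = {
        [0xA9] = op_lda_imm,
        [0xE8] = op_inx,
    };
    assert(c && c->mem);
    while (!c->halted) {
        Handler h = table[c->mem[c->pc++]];
        if (h)
            h(c);            /* call, then return to the loop */
        else
            c->halted = 1;
    }
}

/* Style 2: JMP analogue. Handlers live inside the dispatch loop. */
static void run_jump(Cpu *c) {
    assert(c && c->mem);
    for (;;) {
        switch (c->mem[c->pc++]) {
        case 0xA9: c->a = c->mem[c->pc++]; break;  /* LDA #imm */
        case 0xE8: c->x++;                 break;  /* INX */
        default:   c->halted = 1;          return; /* anything else halts */
        }
    }
}
```

Both loops leave the emulated CPU in the same state for the same program; the only difference is how control gets to a handler and back. How much that matters in practice depends on the compiler and the processor, so it is worth timing rather than assuming.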

With a program like this, the scope for code optimisation is almost infinite!
