Date   : Tue, 17 Oct 1989 13:24:58 GMT
From   : mcsun!unido!gmdzi!wittig@uunet.uu.net (Georg Wittig)
Subject: Z80 algorithms

Someone mailed to me the following memory fill algorithm:

>
>   ld      hl,buffer               ; point at buffer
>   ld      de,buffer + 1           ; point at next byte
>   ld      bc,count - 1            ; number of bytes minus one
>   ld      (hl),xxx                ; save the first byte
>   ldir                            ; replicate through rest of buffer
>
>is the fastest buffer fill I know on the Z80.

There is a much faster algorithm for that. Let's see:

The central instruction in your solution is `ldir'. It takes 21 T states per
byte copied (16 for the very last one). For 16 bytes this is about 336 T
states.

Using the PUSH instruction is much faster. The stack grows downward, so each
"push DE" stores two bytes just below SP:
       Set D = E = the fill byte;
       let SP point 1 byte past the end of the area to be filled;
       let B contain the number of 16-byte blocks to be filled.
Then execute "push DE"s, and you're finished very quickly.

       DI                      ; no interrupts: an interrupt would push its
                               ; return address onto the misused SP
       LD      (sp_save),SP    ; save current SP value
       LD      SP,HL           ; assuming HL points to <end+1>
L:     PUSH    DE              ; 8 times, so 16 bytes are filled
       PUSH    DE
       PUSH    DE
       PUSH    DE
       PUSH    DE
       PUSH    DE
       PUSH    DE
       PUSH    DE
       DJNZ    L               ; a 16 byte portion has been processed.
       LD      SP,(sp_save)    ; restore the SP
       EI                      ; done
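
To see what the loop leaves in memory, here is a rough Python model of it. It
is only a sketch: `push_fill`, `mem` and the addresses are names I made up for
illustration, and it models just the PUSH DE / DJNZ part, not DI/EI or the
saving and restoring of SP.

```python
def push_fill(mem, end_plus_1, blocks, value):
    """Model the PUSH DE loop: SP starts at end_plus_1, B = blocks (1..256)."""
    sp = end_plus_1
    d = e = value & 0xFF       # D = E = fill byte
    b = blocks & 0xFF          # B is 8 bits wide, so 256 is stored as 0
    while True:
        for _ in range(8):     # eight PUSH DE per pass = 16 bytes
            sp = (sp - 1) & 0xFFFF
            mem[sp] = d        # high byte is written at SP-1
            sp = (sp - 1) & 0xFFFF
            mem[sp] = e        # low byte is written at SP-2
        b = (b - 1) & 0xFF     # DJNZ: decrement B, loop while non-zero
        if b == 0:
            break
    return sp                  # final SP = first byte of the filled area

mem = bytearray(0x10000)       # hypothetical 64K address space, all zero
start = push_fill(mem, 0x4040, 4, 0xE5)
```

With these assumed values the 64 bytes from 4000h to 403Fh are filled, SP ends
up at 4000h, and the bytes on either side of the area stay untouched.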

This way you can fill up to 4096 (256*16) bytes in one go (B = 0 counts as
256 passes). If more bytes are to be filled, build a loop around it. If the
number of bytes to be filled isn't a multiple of 16, the remaining 1 to 15
bytes can be filled in the straightforward way with a conventional algorithm.
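
The splitting of an arbitrary byte count might be planned as in the following
Python sketch. `plan_fill` is a hypothetical helper of mine, not part of the
original routine; it only does the arithmetic.

```python
def plan_fill(count):
    """Split count into (full 4096-byte passes, extra 16-byte blocks, leftover bytes)."""
    blocks, leftover = divmod(count, 16)           # leftover is 0..15
    full_passes, rest_blocks = divmod(blocks, 256) # each full pass runs with B = 0
    return full_passes, rest_blocks, leftover

plan_small = plan_fill(1000)     # 62 blocks of 16 bytes plus 8 single bytes
plan_large = plan_fill(10000)    # 2 full passes plus 113 more blocks
```

One thing to watch if you code this up: when the extra block count comes out
as 0, the push loop has to be skipped, because entering it with B = 0 would
fill another 4096 bytes.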

The timing of that algorithm: The central loop starts at "L:" and ends with
"DJNZ". "push DE" needs 11 T states; DJNZ needs 13 when it branches back. So
for 16 bytes to be filled you get:

       8 * 11 + 13 = 101 T states

       101 / 336 = 30 % of the LDIR time
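
The same arithmetic for other buffer sizes, as a small Python sketch. The
function names are my own, and the model is the simplified one used above: it
assumes a byte count that is a multiple of 16 and ignores the few T states
saved on the final LDIR and DJNZ iterations as well as the DI/EI and SP
save/restore overhead.

```python
LDIR_T = 21   # T states per byte copied by LDIR (while it repeats)
PUSH_T = 11   # T states per PUSH DE
DJNZ_T = 13   # T states per DJNZ when the branch is taken

def ldir_cost(nbytes):
    """Approximate T states for filling nbytes with LDIR."""
    return LDIR_T * nbytes

def push_cost(nbytes):
    """Approximate T states for filling nbytes with the 8-PUSH loop."""
    blocks = nbytes // 16
    return blocks * (8 * PUSH_T + DJNZ_T)

ratio = push_cost(16) / ldir_cost(16)      # 101 / 336
full_push = push_cost(4096)                # one maximal pass, B = 0
full_ldir = ldir_cost(4096)
```

The ratio stays the same for any number of blocks in this model, since both
costs grow linearly with the byte count.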

Relatively fast, isn't it? And -- it works!

PS: The idea isn't mine, I found it some time ago in a journal. I'm sorry I
don't remember which one it was.

-- 
Georg Wittig   GMD-Z1.BI   P.O. Box 1240   D-5205 St. Augustin 1 (West Germany)
email: wittig@gmdzi.uucp   phone: (+49 2241) 14-2294
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"Freedom's just another word for nothing left to lose" (Kris Kristofferson)


End of INFO-CPM Digest V89 Issue #188
*************************************