<< Previous Message Main Index Next Message >>
<< Previous Message in Thread This Month Next Message in Thread >>
Date   : Thu, 12 May 2005 11:37:14 +0100
From   : Richard_Talbot-Watkins@...
Subject: Re: 6502 Division routine

Peter Craven wrote:

> Does anyone have simple machine code routines for 16 bit
> division.  Were faster methods ever developed? If so, does
> anyone have these?

Try this, it's the fastest one I know:

dividend=&70:REM 16-bit
divisor=&72:REM 16-bit
result=dividend:REM 16-bit
remainder=&74:REM 16-bit

.divide
\ Divides 16-bit dividend by 16-bit divisor
\ Set remainder/remainder+1 to zero before use.
\ Returns result and remainder

LDY #16:.loop
ASL dividend:ROL dividend+1
ROL remainder:ROL remainder+1
SEC:LDA remainder:SBC divisor:TAX
LDA remainder+1:SBC divisor+1:BCC skip
STX remainder:STA remainder+1
INC dividend
.skip DEY:BNE loop:RTS

Here's an article by a chap called Bob Sander-Cederlof who describes how it
all works (from the Apple Assembly Line Archives)... apologies if my mail
client screws up all the formatting again...

[article begin]

Remembering long division in decimal can be hard enough, but visualizing it
in binary and implementing it in 6502 assembly language is awesome!  Study
the following example, in which I divide an 8-bit value by a 4-bit value:

                 00110            6
             ----------         ---
       1101 ) 01010101      13 ) 85
   step A:   -0000              -78
              ----               --
               1010               7
   step B:    -0000
               ----
               10101
   step C:    - 1101
               -----
                10000
   step D:     - 1101
                -----
                  0111
   step E:       -0000
                  ----
                  0111   Remainder

In the binary version, I have not made any leaps ahead like we do in
decimal.  That is, I wrote out the steps even when the quotient digit = 0.
Now let's see a program which divides an 8-bit value by a 4-bit value, just
like the example above.  If you think this is a clumsy program, you may be
right.  Note that the loop runs five times, not four.  This is because
there are five steps, as you can see in the sample division above.

 1010  *--------------------------------
 1020  *      DIVIDE 8-BIT VALUE
 1030  *          BY 4-BIT VALUE
 1040  *--------------------------------
 1050  DIVIDEND   .EQ 0
 1060  DIVISOR    .EQ 1
 1070  QUOTIENT   .EQ 2
 1080  *--------------------------------
 1090  S.DIV.8.BY.4
 1100         LDY #5       COUNT OFF 5 STEPS
 1110         LDA #0
 1120         STA QUOTIENT
 1130         LDA DIVISOR       SEE IF DIVISOR IN RANGE
 1140         BEQ .3            DIVIDE BY ZERO IS ILLEGAL
 1150         ASL          SHIFT DIVISOR TO LEFT NYBBLE
 1160         ASL
 1170         ASL
 1180         ASL
 1190         STA DIVISOR
 1200  .1     LDA DIVIDEND      COMPARE DIVIDEND TO DIVISOR
 1210         SEC
 1220         SBC DIVISOR
 1230         BCC .2            DIVIDEND IS SMALLER
 1240         CMP DIVISOR       SEE IF STILL LARGER
 1250         BCS .3            YES, OVERFLOW
 1260         SEC               SET QUOTIENT BIT = 1
 1270         STA DIVIDEND
 1280  .2     ROL QUOTIENT      SHIFT QUOTIENT BIT IN
 1290         LSR DIVISOR       SHIFT DIVISOR OVER
 1300         DEY
 1310         BNE .1            DO NEXT STEP
 1320         ROL DIVISOR  RESTORE DIVISOR
 1330         RTS
 1340  .3     BRK          DIVIDE FAULT

The first thing the program does is to clear the quotient value.  In a
4-bit machine performing 8-bit by 4-bit division would yield a 4-bit
quotient, so the top bits must be cleared.  The rest of the bits will be
shifted in as the division progresses.

Next the divisor is shifted up to the high nybble position, to align with
the left nybble of the dividend.  This is equivalent to step A in the
example above.  The loop running from line 1200 through line 1310 performs
the five partial divisions.

If the divisor is zero, or if the first partial division proves that the
quotient will not fit in four bits, the program branches to ".3".  I put a
BRK opcode there, but you would put an error message printer, or whatever.

To run the program above, I typed:

     :$0:55 0D N 800G 0.2

and Apple responded with:   0000- 07 0D 06

which means the remainder is 7, and the quotient is 6.

Dividing Bigger Values:

The following program will divide one two-byte value by another.  The
program assumes that both the dividend and the divisor are positive values
between 0 and 65535.  This program was in the original Apple II monitor ROM
at $FB84, but is not present in the Apple II Plus and Apple //e ROMs.

 1010  *--------------------------------
 1020  *      DIVIDE 16 BY 16
 1030  *--------------------------------
 1040  ACL    .EQ $50
 1050  ACH    .EQ $51
 1060  XTNDL  .EQ $52
 1070  XTNDH  .EQ $53
 1080  AUXL   .EQ $54
 1090  AUXH   .EQ $55
 1100  *--------------------------------
 1110  DIVMON LDY #16      INDEX FOR 16 BITS
 1120  .1     ASL ACL      DIVIDEND/2, CLEAR QUOTIENT BIT
 1130         ROL ACH
 1140         ROL XTNDL
 1150         ROL XTNDH
 1160         SEC
 1170         LDA XTNDL    TRY SUBTRACTING DIVISOR
 1180         SBC AUXL
 1190         TAX
 1200         LDA XTNDH
 1210         SBC AUXH
 1220         BCC .2       TOO SMALL, QBIT=0
 1230         STX XTNDL    OKAY, STORE REMAINDER
 1240         STA XTNDH
 1250         INC ACL      SET QUOTIENT BIT = 1
 1260  .2     DEY          NEXT STEP
 1270         BNE .1
 1280         RTS

As written, this program expects the XTNDL and XTNDH bytes to be zero
initially.  If they are not, a 32-bit by 16-bit division is performed;
however, there is no error checking for overflow or divide fault
conditions.

This program builds the quotient in the same memory locations used for the
dividend.  As the dividend is shifted left to align with the divisor
(opposite but equivalent to the shifting done in the previous program),
empty bits appear on the right end of the dividend register.  These bit
positions can be filled with the quotient as it develops.


Double Precision, Almost:

What if I want to divide a full 32-bit value by a full 16-bit value?  Both
values are unsigned.  The 32-bit dividend may have a value from 0 to
4294967295, and the divisor from 0 to 65535.  All of the published programs
I could find assume the leading bit of the dividend is zero, limiting the
range to half of the above.

If the leading bit of the dividend is significant, a one bit extension is
needed in the division loop.  The following program implements a full 32/16
division.

 1010  *--------------------------------
 1020  DIVIDE LDX #17           16-BIT DIVISOR
 1040         CLC               START WITH NO OVERFLOW
 1050  .1     ROR OVERFLOW
 1060         SEC
 1070         LDA DIVIDEND+1    NEXT-TO-HIGHEST BYTE
 1080         SBC DIVISOR+1     LEAST SIGNIFICANT BYTE
 1090         TAY               SAVE RESULT
 1100         LDA DIVIDEND      HIGHEST BYTE
 1110         SBC DIVISOR
 1120         BCS .2            QUOTIENT BIT = 1
 1130         ASL OVERFLOW      TRUE QUOTIENT BIT
 1140         BCC .3
 1150  .2     STY DIVIDEND+1    QUOTIENT BIT = 1
 1160         STA DIVIDEND
 1170  .3     ROL DIVIDEND+3    SHIFT QUOTIENT BIT INTO END
 1180         ROL DIVIDEND+2    AND MOVE TO NEXT POSITION
 1190         ROL DIVIDEND+1
 1200         ROL DIVIDEND
 1210         DEX
 1220         BNE .1
 1230         ROR DIVIDEND      SHIFT REMAINDER BACK IN PLACE
 1240         ROR DIVIDEND+1
 1250         ROR OVERFLOW      SET SIGN BIT IF OVERFLOW
 1260         RTS
 1270  *--------------------------------
 1280  DIVIDEND   .BS 4
 1290  REMAINDER  .EQ DIVIDEND
 1300  QUOTIENT   .EQ DIVIDEND+2
 1310  DIVISOR    .BS 2
 1320  OVERFLOW   .BS 1
 1330  *--------------------------------
 1340         .LIF

Line 1020 sets up a 17-step loop, because the 16-bit divisor can be shifted
to 17 different positions under the 32-bit dividend.  To make it easier to
understand the layout of bytes in memory, I departed from the usual
low-byte-first-format in this program.  I assume this time that the most
significant bytes are first:

     Dividend:   $83A $83B $83C $83D
                 msb . . . . . . lsb

     Divisor:    $83E $83F
                 msb...lsb

I also have written this program to feed the quotient bits into the least
significant end of the dividend register, as the dividend shifts left.  The
remainder will be found in the left two bytes of the dividend register, and
the quotient in the right two bytes.

[article ends]



...and for completeness, here's a super-fast 8 bit * 8 bit multiply.  This
is the sort of thing that lets you write Elite on a BBC :)

.multiply
\ Multiplies A*X, and stores result in product/product+1

CPX #0:BEQ zero
DEX:STX product+1
LSR A:STA product
LDA #0
BCC s1:ADC product+1:.s1 ROR A:ROR product
BCC s2:ADC product+1:.s2 ROR A:ROR product
BCC s3:ADC product+1:.s3 ROR A:ROR product
BCC s4:ADC product+1:.s4 ROR A:ROR product
BCC s5:ADC product+1:.s5 ROR A:ROR product
BCC s6:ADC product+1:.s6 ROR A:ROR product
BCC s7:ADC product+1:.s7 ROR A:ROR product
BCC s8:ADC product+1:.s8 ROR A:ROR product
STA product+1
RTS
.zero
STX product:STX product+1
RTS


**********************************************************************
This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they
are addressed. If you have received this email in error please notify
postmaster@...

This footnote also confirms that this email message has been checked
for all known viruses.

**********************************************************************
Sony Computer Entertainment Europe
<< Previous Message Main Index Next Message >>
<< Previous Message in Thread This Month Next Message in Thread >>