Date: Mon, 19 Oct 2009 11:33:55 +0100
Subject: pdp-11 32-bit mult/div
Doing 32-bit multiplication and division on a 16-bit machine like the
PDP-11 needs cleverness. My PDP-11 C compiler used subroutines to do
long * and /. The multiplication routine was
/
/ 32-bit multiplication routine for fixed pt hardware.
/  Implements * operator
/ Credit to an unknown author who slipped it under the door.
.globl  lmul
.globl  csv, cret

lmul:
        jsr     r5,csv
        mov     6(r5),r2
        sxt     r1
        sub     4(r5),r1
        mov     10.(r5),r0
        sxt     r3
        sub     8.(r5),r3
        mul     r0,r1
        mul     r2,r3
        add     r1,r3
        mul     r2,r0
        sub     r3,r0
        jmp     cret
which is neat, and I wasn't smart enough to figure it out. I don't feel
guilty, though, because I didn't then know who suggested it and I did
acknowledge the fact.
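
For readers who want to follow the arithmetic, here is a rough C model of what the routine appears to do. It is a sketch under assumptions, not the compiler's actual code: it assumes the two longs arrive high word first (4(r5) and 6(r5) for one operand, 8.(r5) and 10.(r5) for the other), that mul into an odd register keeps only the low 16 bits of the product, and that mul into an even register leaves the full signed 32-bit product in the register pair. The names lmul_sketch and as_signed16 are made up for illustration.

```c
#include <stdint.h>

/* Read a 16-bit pattern as a signed value, the way MUL sees its operands. */
static int32_t as_signed16(uint16_t w)
{
    return (w & 0x8000u) ? (int32_t)w - 65536 : (int32_t)w;
}

/* Low 32 bits of a*b, built from 16x16 multiplies only (hypothetical model). */
uint32_t lmul_sketch(uint32_t a, uint32_t b)
{
    uint16_t a_hi = (uint16_t)(a >> 16), a_lo = (uint16_t)a;
    uint16_t b_hi = (uint16_t)(b >> 16), b_lo = (uint16_t)b;

    /* sxt/sub pairs: r1 = -(a_hi + (a_lo < 0)), r3 = -(b_hi + (b_lo < 0)).
       The extra 1 compensates for treating the low word as signed. */
    uint16_t r1 = (uint16_t)(((a_lo & 0x8000u) ? 0xFFFFu : 0u) - a_hi);
    uint16_t r3 = (uint16_t)(((b_lo & 0x8000u) ? 0xFFFFu : 0u) - b_hi);

    /* mul r0,r1 and mul r2,r3: only the low 16 bits of each cross product
       survive, and only those bits can reach the 32-bit result anyway. */
    r1 = (uint16_t)((uint32_t)b_lo * r1);
    r3 = (uint16_t)((uint32_t)a_lo * r3);
    r3 = (uint16_t)(r3 + r1);               /* add r1,r3 */

    /* mul r2,r0: full signed 32-bit product of the two low words. */
    uint32_t prod = (uint32_t)(as_signed16(a_lo) * as_signed16(b_lo));

    /* sub r3,r0: r3 holds the negated cross terms, so subtracting it
       adds them into the high word. */
    uint16_t hi = (uint16_t)((prod >> 16) - r3);
    return ((uint32_t)hi << 16) | (prod & 0xFFFFu);
}
```

The result should agree with ordinary 32-bit multiplication modulo 2^32 for any pair of inputs, which is what the long * operator needs.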
But I'll carry this one on my conscience for a while. The division
routine included
1:
        mov     r4,-(sp)
        clr     r0
        div     r3,r0
        mov     r0,r4           /high quotient
        mov     r1,r0
        mov     r2,r1
        div     r3,r0
        bvc     1f
        sub     r3,r0           / this is the clever part
        div     r3,r0
        tst     r1
        sxt     r1
        add     r1,r0           / cannot overflow!
1:
I almost (or maybe even completely) figured out why it worked.
The spot on the soul is the "this is the clever part" comment.
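
A rough C model of the idea in this fragment may help. It is only a sketch of the arithmetic, assuming an unsigned 32-bit dividend and a divisor d with 0 < d < 0x8000; it says nothing about the parts of the routine not shown above, and the name ldiv_sketch is invented for illustration. The fragment divides the high word first, then divides the remainder and low word together. When that second quotient is too big for a signed 16-bit result, subtracting the divisor from the high word lowers the quotient by 65536, which brings it into signed range, and the final adjustment undoes the round-toward-zero of a negative dividend.

```c
#include <stdint.h>

/* Quotient of n / d from two 16-bit-wide steps (hypothetical model).
   Assumes 0 < d < 0x8000. */
uint32_t ldiv_sketch(uint32_t n, uint16_t d)
{
    uint16_t hi = (uint16_t)(n >> 16), lo = (uint16_t)n;

    /* First div: high word alone gives the high quotient and a remainder. */
    uint16_t q_hi = (uint16_t)(hi / d);
    uint16_t rem  = (uint16_t)(hi % d);

    /* Second div: remainder:low word. Since rem < d, the quotient is
       below 65536, but it may not fit in 15 bits. */
    int64_t part = ((int64_t)rem << 16) | lo;
    int64_t q = part / d;

    if (q > 32767) {                    /* the signed div would set V here */
        part -= (int64_t)d << 16;       /* sub r3,r0: quotient drops by 65536 */
        q = part / d;                   /* C also truncates toward zero */
        if (part % d < 0)               /* tst r1 / sxt r1 / add r1,r0 */
            q -= 1;
    }

    /* Keep the low 16 bits of q; the 65536 taken off above vanishes here. */
    return ((uint32_t)q_hi << 16) | (uint16_t)q;
}
```

Note that the model recomputes from the saved value of part after the overflow test, which is exactly what the assembly relies on the hardware to allow.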
Addendum 18 Oct 1998
Amos Shapir of nSOF (and of long memory!) just blackened (or widened)
the spot a bit more in a mail message, to wit:
    I gather the "almost" here is because this trick almost worked... It has
    a nasty bug which I had to find the hard way!
    The "clever part" relies on the fact that if the "bvc 1f" is not taken,
    it means that the result could not fit in 16 bits; in that case the long
    value in r0,r1 is left unchanged. The bug is that this behavior is not
    documented; in later models (I found this on an 11/34) when the result
    does fit in 16 bits but not in 15 bits (that is, overflow for signed,
    but not unsigned types), the overflow bit is set, but the unsigned
    result does overwrite the original values -- which makes this routine
    provide very strange results!