Date   : Mon, 20 Aug 1990 17:35:11 GMT
From   : hpfcso!hpldola!hp-lsd!was@hplabs.hpl.hp.com (Bill Stubblebine)
Subject: How to speed up Ampro LB+ SCSI?

Several weeks ago, I asked for advice on how to improve throughput for bulk
data transfers from my SCSI hard disk to my SCSI QIC tape drive.  For those
who missed the original article, my configuration is:

       Ampro LB Z80+ (w/built-in SCSI interface)
       Adaptec ACB4000 (not 4000A) SCSI hard disk controller
       Seagate ST-125 20 MB 40 ms hard disk drive
       3M MCD-403 40 MB QIC SCSI tape drive 
       NZ-COM/Z-System

The 3M MCD-403 SCSI tape drive was added recently to support backups.  As I
started transferring data between the hard disk and the tape drive, I
discovered that although the SCSI disk performance was adequate for
interactive and disk-to-disk operations, the hard disk could not source or
sink data fast enough to keep the tape drive streaming during transfers.

Before I posted my original request, I had experimented with several disk
transfer strategies to try to increase throughput.  All of my tests
employed standard BIOS calls that transfer 128 bytes per BIOS call, based
on Ampro's BIOS deblocking algorithm that reads or writes 512-byte SCSI
logical blocks to the hard disk.  My experiments indicated that BIOS calls
could never achieve sufficient throughput to keep the cartridge tape drive
streaming, no matter what the interleave factor is on the tape drive or on
the disk drive.  With all the stopping, repositioning and restarting of the
cartridge drive, the overall throughput from disk to tape was under 3K
bytes per second, plus the agony of hearing the drive stop and start for
each 8K SCSI tape block transferred.
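
To put the BIOS overhead in perspective, here is a small arithmetic sketch
using only the sizes quoted above (128-byte BIOS records, 512-byte SCSI
logical blocks, 8K tape blocks).  The function name is made up for
illustration.

```python
# Sizes taken from the text above; the helper name is hypothetical.
BIOS_RECORD = 128      # bytes moved per standard BIOS call
SCSI_BLOCK = 512       # bytes per SCSI logical block on the hard disk
TAPE_BLOCK = 8 * 1024  # bytes per SCSI tape block


def transfers_per_tape_block(unit_size, tape_block=TAPE_BLOCK):
    """Number of fixed-size transfers needed to fill one tape block."""
    return tape_block // unit_size


# BIOS calls needed to gather one 8K tape block
print(transfers_per_tape_block(BIOS_RECORD))
# 512-byte disk blocks contained in one 8K tape block
print(transfers_per_tape_block(SCSI_BLOCK))
```

Every one of those BIOS calls carries its own call and deblocking overhead,
which is why fewer, larger transfers looked worth trying.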

Having run out of ideas, I asked the net for advice, and was gratified by
the quantity and quality of the responses I received.  To make a long story
short, I have increased the overall throughput of disk to tape transfers
from under 3K bytes per second to 12.7K bytes per second, allowing 10
megabytes to be backed up in about 13 minutes unattended.  This is bliss
compared to the endless attended floppy disk backups I am accustomed to.
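
As a quick check on the backup time, the sketch below redoes the arithmetic.
It assumes "10 megabytes" means 10,000,000 bytes and "K" means 1000; the
helper name is illustrative only.

```python
def backup_minutes(total_bytes, bytes_per_sec):
    """Minutes needed to move total_bytes at a steady bytes_per_sec."""
    return total_bytes / bytes_per_sec / 60.0


TEN_MB = 10 * 1000 * 1000  # assumption: decimal megabytes

after = backup_minutes(TEN_MB, 12700)   # at 12.7K bytes/sec
before = backup_minutes(TEN_MB, 3000)   # at the original (upper bound) 3K bytes/sec

print(round(after, 1), "minutes after tuning")
print(round(before, 1), "minutes or more before tuning")
```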

To assist anyone who may be facing similar system integration problems, I
decided to keep a log of my experiments, which is summarized below.  The
quadrupling of throughput from 3K bytes/sec to 12.7K bytes/sec resulted
from three categories of improvements to my configuration:

1. Read or write as many bytes as possible in each SCSI command, both from
   the SCSI hard disk and the SCSI tape drive.

2. Use the Z80 high-speed INIR/OTIR I/O instructions instead of software
   controlled byte-by-byte handshaking to talk to the 5380 SCSI interface
   chip on the Ampro LB+.

3. Once #1 and #2 are implemented, select optimal interleave factors on
   both the hard disk and the tape drive to maximize overall throughput.


The biggest improvement came from #1.  Reading 8K from the disk in one SCSI
command more than doubled the overall throughput compared to normal BIOS
calls, providing streaming operation in the tape drive for tape interleave
factors of 6:1 or greater.

               HD interleave:          9:1
               HD transfer mode:       byte-by-byte
               HD transfer size:       8K x 1
               Tape interleave:        6:1
               Tape transfer mode:     byte-by-byte
               Tape transfer size:     8K x 1
               Net throughput:         6631 bytes/sec

Next, I modified the disk read routine to read 8K bytes in two 4K SCSI
commands, thereby simulating processing two distinct 4K CP/M disk
allocation groups.  The results were the same as for a single 8K SCSI
operation, i.e., the tape keeps streaming.  This experimental result
suggests that the disk-to-tape backup program should bypass the BIOS
altogether, and process CP/M allocation groups directly from the CP/M disk
directory entries, converting the (4K-byte) CP/M allocation group number
into a SCSI logical block number, then read all 4K of the allocation block
from the disk in one SCSI command.  This should be a robust strategy,
because (in the Ampro system) HD space cannot be allocated in chunks of
less than 4K bytes = 1 CP/M allocation group.
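
A minimal sketch of the group-to-block conversion described above.  It
assumes 512-byte logical blocks and 4K allocation groups as stated in the
text; `data_start_lba` is a hypothetical parameter standing in for wherever
allocation group 0 begins on a given partition.  The command block follows
the standard 6-byte SCSI READ layout (opcode 08h, 21-bit block address,
one-byte block count); whether a backup program would build it exactly this
way is an assumption.

```python
SCSI_BLOCK = 512                              # bytes per SCSI logical block
GROUP_SIZE = 4 * 1024                         # bytes per CP/M allocation group
BLOCKS_PER_GROUP = GROUP_SIZE // SCSI_BLOCK   # 8 logical blocks per group


def group_to_lba(group, data_start_lba=0):
    """Map a CP/M allocation group number to its first SCSI logical block.

    data_start_lba is hypothetical: the logical block where allocation
    group 0 starts (it depends on reserved tracks and partition offset).
    """
    return data_start_lba + group * BLOCKS_PER_GROUP


def read6_cdb(lba, nblocks, lun=0):
    """Build a 6-byte SCSI READ command descriptor block."""
    if not 0 <= lba < (1 << 21):
        raise ValueError("READ(6) carries only a 21-bit block address")
    if not 1 <= nblocks <= 256:
        raise ValueError("READ(6) moves 1..256 blocks per command")
    return bytes([
        0x08,                                    # READ(6) opcode
        ((lun & 0x07) << 5) | ((lba >> 16) & 0x1F),
        (lba >> 8) & 0xFF,
        lba & 0xFF,
        nblocks & 0xFF,                          # 256 is encoded as 0
        0x00,                                    # control byte
    ])


# One command reads a whole 4K allocation group:
cdb = read6_cdb(group_to_lba(3, data_start_lba=100), BLOCKS_PER_GROUP)
print(cdb.hex())
```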

               HD interleave:          9:1
               HD transfer mode:       byte-by-byte
               HD transfer size:       4K x 2
               Tape interleave:        6:1
               Tape transfer mode:     byte-by-byte
               Tape transfer size:     8K x 1
               Net throughput:         6631 bytes/sec

Next, I changed the SCSI handshaking from byte-by-byte to INIR/OTIR burst
mode for both the hard disk and the MCD tape drive.  This increased the
burst transfer rate from 15us per byte to 5.25us per byte for both devices.
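
For a feel of what those per-byte times mean, this sketch converts them to
bytes per second and to the time spent moving one 8K buffer across the bus.
Only the 15us and 5.25us figures come from the measurements; the helper
names are illustrative.

```python
def bytes_per_sec(us_per_byte):
    """Burst rate implied by a per-byte transfer time in microseconds."""
    return 1_000_000 / us_per_byte


def transfer_ms(nbytes, us_per_byte):
    """Milliseconds of pure bus time needed to move nbytes."""
    return nbytes * us_per_byte / 1000.0


for label, us in (("byte-by-byte", 15.0), ("INIR/OTIR burst", 5.25)):
    print(label, round(bytes_per_sec(us)), "bytes/sec,",
          round(transfer_ms(8192, us), 1), "ms per 8K")
```

Byte-by-byte handshaking spends roughly 123ms of bus time per 8K buffer,
against roughly 43ms in burst mode.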

Using a scope to monitor the SCSI bus, I then experimented with bulk SCSI
transfers from hard disk at various disk interleave factors, obtaining the
following surprising results:

               Hard Disk       Time to transfer 
               Interleave      8192 bytes HD->memory
               ----------      ----------------
                  2:1              165ms
                  3:1               80ms
                  4:1               95ms
                  5:1              110ms
                  6:1              120ms
                  7:1              140ms
                  8:1              120ms
                  9:1              140ms

At an interleave of 3:1, the fastest for bulk SCSI transfers, the hard disk
supports a burst transfer rate of 5.25us per byte = 190.5K bytes/sec to the
Ampro host, and a sustained data transfer rate of 102.4K bytes/sec, not bad
for a lowly Z-80.
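
The sustained rates implied by the table can be recomputed directly from the
measured times.  The sketch below uses only the table values, with K taken
as 1000; the names are illustrative.

```python
# Interleave factor -> measured milliseconds to move 8192 bytes HD->memory
MEASURED_MS = {2: 165, 3: 80, 4: 95, 5: 110, 6: 120, 7: 140, 8: 120, 9: 140}


def sustained_rate(nbytes, ms):
    """Sustained bytes/sec for nbytes moved in ms milliseconds."""
    return nbytes * 1000 / ms


rates = {il: sustained_rate(8192, ms) for il, ms in MEASURED_MS.items()}
best = max(rates, key=rates.get)

for il in sorted(rates):
    print(f"{il}:1  {rates[il] / 1000:6.1f}K bytes/sec")
print("fastest interleave:", f"{best}:1")
```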

Note: The previous and new interleave factors of 2:1 and 3:1, respectively,
      have virtually identical throughput for 512-byte BIOS transfers to
      and from disk.  However, for multi-block transfers like the ones I
      intend to use for tape backups, an interleave of 3:1 produces a huge
      (i.e., >double) increase in disk throughput compared to an interleave
      factor of 2:1.

With the hard disk formatted with interleave factor 3:1 and with burst mode
data transfers in effect to both the hard disk and the tape drive, I then
experimented with various tape drive interleave factors.  The result is
that I now can keep the tape drive streaming at a tape interleave factor of
4:1, which is much better than I had originally hoped.  The overall disk to
tape throughput increased to 9716 bytes/sec in this configuration.

               HD interleave:          3:1
               HD transfer mode:       burst
               HD transfer size:       4K x 2
               Tape interleave:        4:1
               Tape transfer mode:     burst
               Tape transfer size:     8K x 1
               Net throughput:         9716 bytes/sec

Reading data from the hard disk in two 4K byte chunks takes about 80ms.  A
scope trace of SCSI bus activity indicated that a disk rotation was being
lost between reading sequential 4K chunks, even when the two chunks were
(logically) adjacent to one another on the same disk track, as is usually
the case in large sequential files.  When I repeated the experiments
reading 8K from the disk in one SCSI request, the time required to fill the
memory buffer from the disk dropped to around 60ms.  In this configuration,
the tape remained streaming at a tape interleave of 3:1, with overall
throughput from the disk to the tape increasing to 12787 bytes/sec.

               HD interleave:          3:1
               HD transfer mode:       burst
               HD transfer size:       8K x 1
               Tape interleave:        3:1
               Tape transfer mode:     burst
               Tape transfer size:     8K x 1
               Net throughput:         12787 bytes/sec
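
One way to read the three streaming results reported above (6631, 9716 and
12787 bytes/sec) is that, once the tape streams, net throughput scales
roughly inversely with the tape interleave factor.  That is an inference
from the numbers, not something measured directly; the sketch below just
multiplies each figure by its interleave.

```python
# (tape interleave, measured net throughput in bytes/sec) from the text
RESULTS = [(6, 6631), (4, 9716), (3, 12787)]

# If throughput ~ raw_rate / interleave, this product should be about constant.
products = [interleave * rate for interleave, rate in RESULTS]

for (interleave, rate), product in zip(RESULTS, products):
    print(f"{interleave}:1  {rate} bytes/sec  x{interleave} = {product}")

spread = (max(products) - min(products)) / min(products)
print(f"spread between products: {spread:.1%}")
```

The products all land between 38K and 40K bytes/sec, which would be
consistent with the tape, not the disk, setting the pace in all three
configurations.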

Getting writes to work to the tape was quite an adventure.  The same trick
that worked effectively for reads from the tape, namely setting the burst
mode for 256-byte transfers, caused writes to the tape to hang in mid SCSI
phase.  The curious thing was that the multi-block writes worked fine when
I stepped through them under manual control in the ZSID debugger, but hung
when running normally.  Figuring there was some race condition between the
disk reads and the tape writes, I fiddled around with delays everywhere to
no avail.  Because the multi-block transfers worked OK with byte-by-byte
handshaking, I finally concluded that 256 must be the wrong number of data
bytes to transfer to the tape controller in a burst during the SCSI
data-out phase.  But what was the right number?  I set the burst mode to 16
bytes per burst, which cut the byte-by-byte overhead by a factor of 16.
This worked fine, allowing writes to the tape to stream at a tape
interleave factor of 3:1, the same as for reads.

Note:  I still cannot explain why write transfers to the tape drive hang
       with 256-byte bursts and not with 16-byte bursts.  Reads and writes
       both transfer 8192 bytes from or to the tape controller.  This
       should loop the OTIR instruction exactly 32 times for 256-byte
       bursts and exactly 512 times for 16-byte bursts.  Moreover, the
       transfer rate in either case is only about one third of the tape
       drive controller's 500Kb/sec rated SCSI burst throughput.  Maybe
       the discrepancy in the number of bytes transferred is on a 16-byte
       boundary, but I find this hard to believe.  My 16-byte burst
       solution works, but maybe I'll just RTFM one more time...
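
The burst counts in the note can be checked with a small model of the
transfer loop.  This is only a sketch in Python, not the Z80 routine:
`wait_ready` and `write_burst` are hypothetical callbacks standing in for
one handshake poll of the 5380 and one OTIR burst, respectively.

```python
def burst_count(total_bytes, burst_size):
    """Number of bursts needed; the total must divide evenly."""
    if total_bytes % burst_size:
        raise ValueError("total is not a whole number of bursts")
    return total_bytes // burst_size


def send_in_bursts(data, burst_size, wait_ready, write_burst):
    """Send data as fixed-size bursts, handshaking once per burst."""
    bursts = 0
    for offset in range(0, len(data), burst_size):
        wait_ready()                                   # one poll per burst
        write_burst(data[offset:offset + burst_size])  # stands in for OTIR
        bursts += 1
    return bursts


received = bytearray()
polls = []
n = send_in_bursts(bytes(8192), 16, lambda: polls.append(1), received.extend)
print(burst_count(8192, 256), burst_count(8192, 16), n, len(received))
```

With 16-byte bursts there are 512 handshakes per 8K block instead of 8192
for byte-by-byte transfers, which is the factor of 16 mentioned above.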

None of my experiments thus far involved frequent head seeks on the hard
disk, which are bound to add some overhead to the tape transfers, and could
cause loss of streaming.  To allow some overhead for head seeks, and still
keep the tape streaming, I relaxed the tape interleave factor from 3:1 to
4:1.

All in all, I'm quite happy with the results.  I know that I can do 12.7K
bytes/sec at 3:1 tape interleave, and nearly 10K bytes/sec at 4:1 tape
interleave.  Depending on the tape interleave I finally settle on, I have
either tripled or quadrupled the overall disk-to-tape throughput compared
to where I started, and learned a little about my disk drive, my tape drive
and the SCSI protocol in the process.

Now it's on to building a primitive file system to manage my backups on the
cartridge tape.  Since I envision the tape as just an archive of large
backups (.LBR or tar files), without a lot of random access going on, I'm
inclined toward using a simple directory structure similar to the one for
Novosielski .LBR files, but based on SCSI addressing instead of CP/M tracks
and sectors.  I'm flexible though, and I'd welcome any suggestions anyone
might have regarding a file system for the cartridge tape.
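
To make the idea concrete, here is one possible shape for such a directory
entry.  Everything in it is hypothetical (the 32-byte size, the field
layout, the names); it only illustrates addressing an archive by starting
SCSI block and block count instead of track and sector.

```python
import struct

# Hypothetical 32-byte entry: status, 8.3 name, start block, block count, pad
ENTRY = struct.Struct("<B8s3sII12x")


def pack_entry(name, ext, start_block, block_count, status=0):
    """Pack one directory entry; names are space padded like CP/M names."""
    return ENTRY.pack(status,
                      name.upper().ljust(8).encode("ascii"),
                      ext.upper().ljust(3).encode("ascii"),
                      start_block, block_count)


def unpack_entry(raw):
    """Return (status, name, ext, start_block, block_count)."""
    status, name, ext, start, count = ENTRY.unpack(raw)
    return (status, name.decode("ascii").rstrip(),
            ext.decode("ascii").rstrip(), start, count)


raw = pack_entry("backup01", "lbr", start_block=16, block_count=2048)
print(len(raw), unpack_entry(raw))
```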

Lastly, a small personal note:  Over the years I've had to put up with no
end of criticism from associates regarding my ongoing interest in Z80
computers.  Still, I'm continually amazed at how far I can push the
envelope of this friendly little OS and CPU.

One of my other hobbies is sailing.  I get endless pleasure from trimming
the sails, reading the wind, pushing the last 1% out of the system.  I get
the same feeling when talking to one of those so-called DOS "power users"
as I do when some muscle boat goes tearing past me on the water.  I remark
to myself "very impressive - but what do you do after the first 10 minutes
when the novelty's worn off?"

Thanks again for all the help.  It's nice to know there is still a group
that shares some of my opinions.  Perhaps I can return the favor one day.

                                Bill Stubblebine
                                Hewlett-Packard Logic Systems Div.
                                Colorado Springs, CO
                                was@hp-lsd.hp.com  (Internet)
                                (719) 590-5568
