Date : Sat, 28 Apr 1984 20:54:00 MST (Sat)
From : "Frank J. Wancho" <WANCHO@simtel20.ARPA>
Subject: Space crunch and tar tapes
The current collections of public domain files are bursting the seams
of the RP06 that's holding them. Here are the current disk usage
stats taken Friday afternoon:
MICRO:<CPM.*>        14,818 disk pages
MICRO:<CPMUG.*>       7,692               15.4 MBytes
MICRO:<SIGM.*>       20,192               40.4
MICRO:<PC-BLUE.*>     3,260                6.5
                     ------               ----
Total:               46,019 disk pages    62.3 MBytes
A TOPS-20 disk page is 512 36-bit words, and the above figures include
the values for the superior directories in each case.
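
As a rough check on the figures above, here is a minimal Python sketch of the page-to-MByte arithmetic. It assumes, as an inference from the numbers rather than anything stated in the message, that the MBytes column counts four 8-bit bytes per 36-bit word (2 KBytes per page) and treats 1 MByte as 1000 KBytes. The names WORDS_PER_PAGE, BYTES_PER_WORD, and pages_to_mbytes are illustrative only.

```python
WORDS_PER_PAGE = 512   # from the message: a TOPS-20 disk page is 512 36-bit words
BYTES_PER_WORD = 4     # assumption: four 8-bit bytes per word, as in ITS-Binary storage

def pages_to_mbytes(pages):
    """Convert disk pages to MBytes, assuming 1 MByte = 1000 KBytes."""
    kbytes = pages * WORDS_PER_PAGE * BYTES_PER_WORD / 1024
    return round(kbytes / 1000, 1)

# Page counts for the three directories that carry an MBytes figure.
usage = {"CPMUG": 7692, "SIGM": 20192, "PC-BLUE": 3260}
for name, pages in usage.items():
    print(name, pages_to_mbytes(pages))
print("Total", pages_to_mbytes(sum(usage.values())))
```

Under those assumptions the sketch reproduces the three listed MBytes figures and their 62.3 total, which suggests the MBytes total covers only the three directories that show an MBytes value.
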
On top of that, it also looks like I'll be getting new SIG/M releases
monthly (five more volumes just showed up). So, to relieve the
potential trauma of lack of elbow room on MICRO:, I have prevailed
upon Gail Zacharias to apply her file type determining algorithms used
in DE-LBR to build another program. This program analyses each file
and converts only those files which it determines are truly ASCII text
files, but stored in ITS-Binary format, into ASCII text files. I'll
be running her program on the files in the above directories, except
MICRO:<CPM.*>, over the next few days.
This, of course, means that my oft-repeated statement that ALL the
files in those directories are stored in ITS-Binary format will no
longer be true. To help avoid confusion, Gail's program writes each
converted file out as the next generation, which will be .2 in most
cases. This means
that your clue for FTP is that if the filename shows up as file.typ.1,
it is ITS-Binary, and if it shows up as file.typ.2, it is ASCII. You
will also have an updated .CRCLST to reference, which shows the
storage method for each file.
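
The generation-number clue above can be sketched as a small Python helper. This is only an illustration: the function name storage_format and the sample filename are hypothetical, and because the .2 rule holds only "in most cases", anything else is reported as unknown so that the .CRCLST can be consulted.

```python
def storage_format(filename):
    """Guess the storage format from the trailing TOPS-20 generation number."""
    generation = filename.rsplit(".", 1)[-1]
    if generation == "1":
        return "ITS-Binary"
    if generation == "2":
        return "ASCII"
    # The message says ".2 in most cases", so other generations need a manual check.
    return "unknown (check .CRCLST)"

print(storage_format("file.typ.1"))
print(storage_format("file.typ.2"))
```

A transfer script could use such a check to pick between binary and ASCII mode before fetching a file, falling back to the .CRCLST entry when the generation number is not 1 or 2.
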
Simply because ASCII text files are stored five characters per word,
instead of four bytes per word as in ITS-Binary format, I expect to
recover a considerable amount of file space by using this utility.
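
To make the expected saving concrete, here is a minimal Python sketch comparing the pages a text file needs at four characters per word versus five. The 100,000-character file size and the name pages_needed are hypothetical, and the sketch ignores any per-file overhead.

```python
import math

WORDS_PER_PAGE = 512  # from the message: a TOPS-20 disk page is 512 36-bit words

def pages_needed(n_chars, chars_per_word):
    """Disk pages needed to hold n_chars at the given packing density."""
    words = math.ceil(n_chars / chars_per_word)
    return math.ceil(words / WORDS_PER_PAGE)

n_chars = 100_000                          # hypothetical text file size
binary_pages = pages_needed(n_chars, 4)    # ITS-Binary: four bytes per word
ascii_pages = pages_needed(n_chars, 5)     # ASCII: five characters per word
print(binary_pages, ascii_pages, binary_pages - ascii_pages)
```

In general a converted text file needs about one fifth fewer words. The overall saving depends on how much of each directory turns out to be text, since only those files are converted.
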
The catch, beyond having to look for the .2 generation number, is
that most of the CRC values published with each volume and elsewhere
will no longer match the files stored in ASCII format.
Don't expect the conversions to happen all at once. Just be forewarned
that things are changing if you see something strange.
Finally, a note for those of you on Unix machines, especially those
not connected to DDN (and thus without FTP access to SIMTEL20). I
have received an updated version of the TOPS-20 tar program which can
now write binary as well as ASCII files to tar tapes. After I finish
the conversions, I'll see if I can find time to experiment with the
program to see how many tapes will be needed to store all these files.
What I expect we can do is make a set of tapes available to a
volunteer beta site to see if they can be read. If so, we can then
start them along a distribution path to interested sites that are
willing to provide the disk space to keep the files online, or to make
copies for further distribution. Just bear in mind that these files
are as-is.
--Frank