Date : Tue, 05 Nov 2002 12:30:58 +0000 (GMT)
From : Frank Lee <frank@...>
Subject: Re: FW: Which format do you want BBC manuals in? RTF/HTML
Hi All,
Doesn't this boil down to two questions:
1) How should the archived data (in this case BBC manuals) be stored?
2) How should the archived data be presented?
I think that the formats which are suitable for question 2 (PDF, RTF, MS
Word and the like) are not necessarily those which are suitable for
question 1.
Material such as BBC manuals can, in some cases, be 20 years old or more.
Prospective solutions to question 1 should bear this sort of timescale in
mind: if we are still interested enough in the data twenty years after it
was published to consider archiving and republishing it to make it more
widely available, we must expect it to remain of interest for a similar
period to come.
Consider which formats were available twenty years ago. Compare those to
the formats which are currently available. Generally speaking, binary
formats come and go: they are relatively short-lived and are often
restricted as to which platforms they can be read on. Text-based formats
such as HTML, LaTeX and `plain text' enjoy longer lives but, in the case
of HTML and LaTeX, require some sort of processing to display the data as
it was intended, with layout, fonts and the like. Plain text, whether in
ASCII encoding or
its predecessors, has been around since (close to) the very beginning and
might reasonably be expected to continue: it is readable by (close to)
all machines from palmtops to supercomputers, no special programs are
required to display it in human-readable form and it is `soft' (i.e. can
be compressed if required). The downside of plain text, of course, is
that it has no formatting and is therefore not a helpful answer to
question 2 above.
HTML and LaTeX may, with a little thought, be engineered to display a
manual in a format very similar to the original printed page. This
represents a good solution to question 2. Moreover, since both are
text-based formats, simple tools are available to convert them back into
plain text, as insurance against the day when the last HTML/LaTeX expert
is laid to rest, or for those who want the information rather than the
presentation.
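
To illustrate how little is needed for the HTML-to-plain-text direction,
here is a minimal sketch in Python using only the standard library's
html.parser module. The names TextExtractor, BLOCK_TAGS and html_to_text
are made up for this example, the list of block-level tags is an assumed
and incomplete one, and no attempt is made to lay out tables or nested
lists.

```python
from html.parser import HTMLParser

# Closing tags after which a line break is emitted (assumed, incomplete list).
BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "li", "tr"}


class TextExtractor(HTMLParser):
    """Collects character data and discards all markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # <br> has no closing tag, so break the line when it opens.
        if tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Return the text content of an HTML string, one block per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered character data
    raw_lines = "".join(parser.parts).splitlines()
    # Collapse runs of whitespace and drop empty lines.
    lines = (" ".join(line.split()) for line in raw_lines)
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    print(html_to_text("<h1>MODE</h1><p>Sets the <b>screen</b> mode.</p>"))
```

A comparable filter for LaTeX would need to know which commands carry
text and which carry only layout, so it is likely to be somewhat longer.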
I would suggest, therefore, that HTML or LaTeX is used to store data for
archival purposes. In order to make the `presentation' correct, some work
will be required to configure how the markup is presented: which fonts
should be used, and what character and page sizes. However, I see that as
a separate and separable task: perhaps someone could come up with a style
sheet or LaTeX package which resembles the standard (if there is such a
thing) Acorn manual in terms of layout. This could be the same person who
prepares the textual content or someone else.
Eventually, I would like to see a BBC documentation server providing the
manuals in a user-selectable format: plain text, HTML styled
to resemble the original, HTML styled according to user preference, PDF
styled according to the original and PDF styled according to user
preference. Such a server should store an HTML / LaTeX version of the
text and a variety of style files (including plain text) and return the
manual in the chosen format.
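
The idea of one stored source plus a set of style files can be sketched
in a few lines of Python. Everything named here (MANUALS, STYLES,
to_plain_text, render, the "user-guide" entry and the CSS) is hypothetical
and for illustration only; the tag stripper is deliberately crude, and PDF
output is left out because it would need an external tool.

```python
import html
import re

# Hypothetical store: one marked-up source per manual (HTML fragments here).
MANUALS = {
    "user-guide": "<h1>User Guide</h1>"
                  "<p>Type <code>MODE 7</code> &amp; press RETURN.</p>",
}

# Hypothetical style files; the CSS is a placeholder, not a real Acorn layout.
STYLES = {
    "html-original": "body { font-family: serif; max-width: 40em; }",
    "html-user": "body { font-family: sans-serif; }",
}


def to_plain_text(fragment):
    """Crude tag stripper, adequate only for simple, well-formed fragments."""
    text = re.sub(r"</(h[1-6]|p|li|div)>", "\n", fragment)  # keep block breaks
    text = re.sub(r"<[^>]+>", "", text)                     # drop other tags
    return html.unescape(text).strip()


def render(name, fmt):
    """Return the named manual in the requested format."""
    source = MANUALS[name]
    if fmt == "text":
        return to_plain_text(source)
    if fmt in STYLES:
        css = STYLES[fmt]
        return (f"<html><head><style>{css}</style></head>"
                f"<body>{source}</body></html>")
    raise ValueError("unsupported format: " + fmt)


if __name__ == "__main__":
    print(render("user-guide", "text"))
```

The point of the design is that the content is held once and the choice
of format touches only the style table and the render function, so new
presentations can be added without re-keying any manual.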
Which is better, HTML or LaTeX? I don't know. Both are markup languages
rather than formatting languages. Perhaps LaTeX is a little more flexible but
then perhaps HTML is simpler and available to more potential contributors.
Whatever the answer, this should not delay the archival of documentation
in either format: conversion from one to the other is possible. Likewise,
conversion from HTML/LaTeX to plain text, RTF or even PDF is possible.
I propose that HTML or LaTeX is the format which should be used for the
archival, storage and presentation of documentation.
Yours,
Frank
--
Frank Lee