<< Previous Message Main Index Next Message >>
<< Previous Message in Thread This Month Next Message in Thread >>
Date   : Wed, 23 Aug 2006 21:07:27 +0100
From   : Sprow <info@...>
Subject: Re: Hard disk backup - another challenge

In article <44ECA2AA.2090909@...>,
   Philip Pemberton <philpem@...> wrote:
> Sprow wrote:
> > Just a crude run length encoding is required, can't be more than a 
> > couple of lines of code.
> > Reserve a byte as a token eg. FF. Output the token plus the repeated 
> > byte plus the number of repeats (maybe a 2 byte field?). Escape the
> > token with itself to send just the token.
>
> Why not use Packbits?

Indeed, that's equally applicable, and I'm sure there are a good number of
other algorithms which are lightweight (for handy BASIC implementation) and
taylored to the typical contents of a harddisc.

[snip]

> It's a lot more efficient than normal RLE - in RLE, if you have to encode 
> a group of (say) ten different bytes, you prefix all ten of them with a
> 0x01 byte (which means "a one byte long run follows"). That ends up 
> doubling the size of the run. With Packbits, you encode that as a single 
> -9 byte, followed 
> by the ten bytes of data, which means you're only adding another byte to 
> the data. Only a ten percent increase, versus 100% for RLE.

That would be an especially lame RLE implementation. Reread my random
proposal and note the only pathalogical case is a run of FF's, or more
generally a run of your chosen tokens. As you have (some) a priori knowledge
of the contents of your disc a sensible token can be chosen for a good
overall gain: choosing 00 and EE are probably both just as bad.

So a run of 10 random bytes (not including the token) encodes as 10 bytes in
the output stream,
Sprow.
<< Previous Message Main Index Next Message >>
<< Previous Message in Thread This Month Next Message in Thread >>