Date : Wed, 21 Sep 2011 22:35:21 +0100
From : darren.grant@... (Darren Grant)
Subject: Text conversion?
On 21 Sep 2011, at 22:17, Rob wrote:
> For the most part, the Beeb /IS/ ASCII ... The main exception being in MODE 7,
>
> If it helps, this is the code I use as a quick translate of those to HTML......
>
> $longtext = str_replace(array("#","_","[","]","{","\\","}","~","`"),
> array("£","#","«","»","¼","½","¾","÷","-")
> , $longtext);
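For comparison, here is a rough Python sketch of the same mapping. It is not
Rob's code: the table is copied from the quote exactly as it appears here
(not checked against a MODE 7 character chart), and the names MODE7_MAP and
translate_mode7 are made up for illustration. It translates every character
in a single pass, so a character produced by one replacement cannot be
replaced again by a later one.

```python
# Mapping copied from the quoted PHP snippet; the right-hand characters are
# taken as shown in the quote and are an assumption, not a verified chart.
MODE7_MAP = str.maketrans({
    "#": "£",
    "_": "#",
    "[": "«",
    "]": "»",
    "{": "¼",
    "\\": "½",
    "}": "¾",
    "~": "÷",
    "`": "-",
})


def translate_mode7(text):
    """Translate each character once, using the table above."""
    return text.translate(MODE7_MAP)
```

Characters not in the table pass through unchanged.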
What I am trying to do is convert all of the Domesday articles into XML files,
so that I can then use Lucene to search them.
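As a minimal sketch of the XML step, assuming each article is just a title
plus body text: the element names (article, title, body) and the function
name article_to_xml are hypothetical, not a format anything here requires.
The point is only that the XML library does the escaping of &, < and >.

```python
import xml.etree.ElementTree as ET


def article_to_xml(title, body):
    """Build one <article> element and return it as an XML string."""
    article = ET.Element("article")           # hypothetical element names
    ET.SubElement(article, "title").text = title
    ET.SubElement(article, "body").text = body
    # encoding="unicode" returns a str rather than bytes
    return ET.tostring(article, encoding="unicode")
```

The returned string can be written to a file per article; special characters
in the text come out as entities such as &amp; and &lt;.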
It seems there are only two differences in modes 0-6, according to the wiki article:
http://beebwiki.mdfs.net/ASCII
The other odd thing about these files is every line starts with a space.
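Dropping that leading space is simple enough. A small sketch, assuming the
text uses "\n" line endings and that only the first space on each line is
unwanted (the function name strip_leading_space is made up):

```python
def strip_leading_space(text):
    """Remove exactly one leading space from each line, if present."""
    lines = text.split("\n")
    return "\n".join(
        line[1:] if line.startswith(" ") else line
        for line in lines
    )
```

Any further indentation after the first space is kept, in case it is part of
the article layout.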
Darren