BBC BASIC PHP Detokeniser and Syntax Highlighter
================================================
Original by Ben Ryves
Updates by J.G.Harston

bbc.php and bbctok.php are a pair of PHP programs that parse a BBC BASIC
program file on a web server, detokenise it into text, and apply syntax
colouring.

The original code was written by Ben Ryves, and provides the majority of
the functionality. The J.G.Harston updates are some slight changes to
transparently parse both Acorn and Russell format BASIC files, and a few
tweaks to catch some edge cases.

Quick & Simple Installation
---------------------------
The PHP files are supplied with .phps "source file" extensions so they
don't get executed.

Create a directory:
  /bin

Upload the files:
  bbc.phps    as /bin/bbc.php
  bbctok.phps as /bin/bbctok.php

Edit the /.htaccess file (or create one if it doesn't exist).
If there is no RewriteEngine section, add to the end:
  RewriteEngine On
  RewriteBase /

Add to the end of the RewriteEngine section:
  RewriteRule ^(.*?)\.(bas|BAS)$ /bin/bbc.php?file=$1.$2

Once you have done this, any access to a file "filename.bas" in any
directory will look for a BBC BASIC file named "filename", "filename.bbc"
or "filename.src" and generate a webpage to display it. If a file called
"filename.bas" actually exists it will be ignored, as the rewrite engine
acts before the file is seen.

Detailed information
--------------------
The rewrite rule causes any access to a file ending in ".bas" to be
passed to the detokeniser. I typically put links on my website so that
they display in the following form:
  program (L)

Selecting the 'program' link fetches the actual binary file, selecting
the '(L)' link generates a listing.

The detokeniser examines the passed file to determine what type of BBC
BASIC it is. Acorn BASIC and Russell BASIC programs differ slightly, and
Acorn BASIC uses two-byte extended tokens whereas Russell BASIC uniformly
uses one-byte tokens.
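
As an illustration only, the two file layouts can be told apart by
following the chain of line lengths. The sketch below is written in
Python, is not taken from bbctok.php, and may not match how bbctok.php
makes its decision. It assumes the layouts described on the BeebWiki
program format page listed under "See also": Acorn stores each line as
&0D, line number high, line number low, length, text, and ends the
program with &0D &FF; Russell stores each line as length, line number
low, line number high, text, &0D, and ends the program with &00 &FF &FF.
The function names are hypothetical.

```python
def _walk_russell(data: bytes) -> bool:
    # Russell line: length, line-lo, line-hi, text..., 0x0D.
    # The length byte counts the whole line; 0x00 0xFF 0xFF ends the program.
    pos = 0
    while pos < len(data):
        length = data[pos]
        if length == 0:
            return data[pos + 1:pos + 3] == b"\xff\xff"
        if length < 4 or pos + length > len(data):
            return False                      # impossible or truncated line
        if data[pos + length - 1] != 0x0D:
            return False                      # line does not end with CR
        pos += length
    return False


def _walk_acorn(data: bytes) -> bool:
    # Acorn line: 0x0D, line-hi, line-lo, length, text...
    # The length byte counts from the 0x0D; 0x0D 0xFF ends the program.
    pos = 0
    while pos + 1 < len(data) and data[pos] == 0x0D:
        if data[pos + 1] == 0xFF:
            return True                       # end marker reached
        if pos + 3 >= len(data) or data[pos + 3] < 4:
            return False                      # truncated or impossible length
        pos += data[pos + 3]
    return False


def detect_format(data: bytes) -> str:
    """Guess the container format: 'russell', 'acorn' or 'unknown'."""
    # Heuristic: the Russell check is tried first because its three-byte
    # end marker is less likely to match by accident.
    if _walk_russell(data):
        return "russell"
    if _walk_acorn(data):
        return "acorn"
    return "unknown"
```

For example, detect_format(open("program", "rb").read()) would return
one of the three strings for a file named "program". A file that is
neither format, such as a plain-text listing, falls through to
'unknown'.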

The listing is formatted as though LISTO 3 has been used. There is a
single space after the line number, and FOR/NEXT, REPEAT/UNTIL,
WHILE/ENDWHILE, CASE/ENDCASE and IF/ENDIF structures are indented.

The generated page includes a 'download' link so the user can fetch the
source file from the listing page.

Customisation
-------------
bbctok.php does all the heavy lifting of generating the detokenised
listing. You should not need to change any of this.

bbc.php parses the filename passed to it and calls bbctok.php with all
the settings it needs. If you need to customise anything you should only
need to change bbc.php.

bbc.php starts by parsing the filename passed to it and converting it to
a form to pass to bbctok.php. It is passed a filename ending ".bas" and
needs to work out which source file to generate a listing from. On my
site I have various BBC BASIC files. Programs are almost all named
"filename" or "filename.bbc"; some are named "filename.src" or
"filename.s", which are machine code source programs. bbc.php therefore
starts by looking for a file with a name in one of these formats. If you
wish to search for other filenames you would add additional tests here.

The next section sets up the syntax colouring and layout. If you want to
change from the colouring used - which is the default used in the BBC
BASIC for Windows editor - you would change this section. The style names
should make it obvious what they control.

Bugs
----
If a structure has multiple closing commands, the indentation will get
messed up, for example:
  FOR A=1 TO 5
  FOR B=1 TO 5
  IF B=2 THEN NEXT B
  NEXT B
  NEXT A

NEXT with multiple variables is not recognised, for example:
  FOR A=1 TO 5
  FOR B=1 TO 5
  NEXT B,A
  REM next line

See also
--------
* BBC BASIC program file format: http://beebwiki.mdfs.net/Program_format
* Ben Ryves' projects: https://benryves.com/projects

History
-------
v1.00 05-Jul-2014 Received Ben Ryves' version
v1.10 ??-???-???? JGH: Added Acorn BASIC, inline spaces become &nbsp;
v1.11 ??-???-???? JGH: Multi-line IFs indented, expands tokens<&20
v1.12 ??-???-2020 JGH: Examines file to discover format, expands two-byte
                  tokens and deals with slight differences between
                  one-byte tokens.