Tokenising BBC BASIC code
=========================

BBC BASIC programs are tokenised: BASIC keywords are stored as one- or two-byte values. This results in programs that execute faster and are more compact.

A tokenised line can easily be detokenised, or expanded, as there is a one-to-one mapping between token values and the expanded strings. For example, code similar to the following would expand a tokenised line:

    quote%=FALSE
    REPEAT
      IF ?addr%<128 OR quote% THEN VDU ?addr% ELSE PRINT token$(?addr%);
      IF ?addr%=34 quote%=NOT quote%
      addr%=addr%+1
    UNTIL ?addr%=13

Tokenising, however, is more fiddly. Keywords can be abbreviated on entry, and characters are tokenised only at certain positions in the line. For instance, in the following line:

    ON NOON GOTO 1,2

the first 'ON' is the token ON, but the second 'ON' is part of the variable 'NOON' and must be left untokenised.

EVAL tokenises the supplied string and evaluates it as an expression. Usefully, the tokenised string can be retrieved from where BASIC has stored it.

In RISC OS BASIC:

    SYS "XOS_GenerateError",0,STRING$(255,"*") TO ,A%
    B%=EVAL("0:"+A$)
    token$=$(A%-14)

In Windows BASIC:

    B%=EVAL("0:"+A$)
    token$=$(!332+2)

In 6502 BASIC:

    A%=EVAL("0:"+A$)
    token$=$((!4 AND &FFFF)-LENA$-1)

By preceding the code you want to tokenise with "0:" you can safely pass it to EVAL without provoking a Syntax error. You can then extract the tokenised code from memory, so long as you do it immediately after calling EVAL. This can be written as functions as follows:

    DEFFNTokenise_ARM(A$):LOCAL A%,B%
    SYS "XOS_GenerateError",0,STRING$(255,"*") TO ,A%
    B%=EVAL("0:"+A$):=$(A%-13)
    :
    DEFFNTokenise_Win(A$):LOCAL A%,B%
    WHILELEFT$(A$,1)=" ":A$=MID$(A$,2):ENDWHILE
    B%=EVAL("0:"+A$):=$(!332+2)
    :
    DEFFNTokenise_65(A$):LOCAL A%
    A%=EVAL("0:"+A$):=$((!4 AND &FFFF)-LENA$-1)

These functions are used in full in the 'Tokenise' BASIC library at http://mdfs.net/System/Library/BLib.
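
As a usage sketch, the following program dumps in hex the bytes returned by one of the functions above for the earlier ON NOON example. It assumes BBC BASIC for Windows and that the FNTokenise_Win definition given above is placed after the END; on other platforms substitute FNTokenise_ARM or FNTokenise_65. The names T$ and I% are illustrative only and are not taken from the library.

    REM Sketch only: hex dump of a tokenised statement
    REM Assumes FNTokenise_Win (defined above) follows the END
    T$=FNTokenise_Win("ON NOON GOTO 1,2")
    FOR I%=1 TO LEN(T$)
      REM the leading ; suppresses field padding, ~ prints in hex
      PRINT ;~ASC(MID$(T$,I%,1));" ";
    NEXT
    PRINT
    END

If tokenising behaves as described, the four letters of NOON should appear as the plain ASCII codes 4E 4F 4F 4E, while the keywords ON and GOTO should appear as byte values of 80 or above.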

References
----------

Richard Russell, "Using the tokeniser", yahoogroups.com/group/bb4w message 86