Writing machine code utilities that interact with BASIC
=======================================================
J.G.Harston - 12-Dec-2012
Based on original unpublished Micro User article 15-Feb-1990

When writing a machine code utility to be run as a *command, you need to decide where in memory it will run from. The standard location is the serial/tape buffers in pages 9 and 10, and most short disk-based *commands run there. So that they work correctly when a second processor is active, the load and execution addresses are set to &FFFFxxxx, which ensures the code is loaded into the I/O processor.

This is fine for utilities that perform I/O actions, such as examining files or memory, or manipulating data in I/O memory - such as the screen.

However, utilities that interact with BASIC have to run in the same memory as BASIC. On initial examination, the simplest way to do this would be to set the load and execution addresses to &0000xxxx. However, PAGE on the 6502 second processor defaults to &0800. A machine code utility that loads to &0900 will work satisfactorily when the second processor is switched off, as it will load to the tape buffer at &0900 and the current BASIC program will typically be at &0E00 or higher - for example &1900 with a DFS-only BBC B. However, with a 6502 second processor switched on, that utility will load on top of the BASIC program, which will typically be in memory from &0800 upwards.

The usual solution to this problem is to have two versions of the utility, one written to load at &FFFF0900 and one written to load at a suitable location in the second processor, often &C000 or &F600 just above BASIC in memory. This has two disadvantages. The first is that you have two copies of the same code, relocated to different locations. The second is the difficulty in deciding exactly where to load the second processor version. &C000-&F7FF is fine when using standard BASIC, but with HiBASIC the BASIC interpreter is at &B800-&F7FF.
Also, even when using LoBASIC, program relocation techniques can put the BASIC program or variables above BASIC at &C000-&F7FF.

A better method is to find a memory location that is in the same place regardless of which side of the Tube BASIC is running on.

BASIC memory workspace
----------------------
The only memory that is in the same location regardless of where BASIC is executing is BASIC's workspace[1].

    &0000-&008F : Zero page workspace
    &0400-&047F : Integer variables
    &0480-&04FF : Pointers to dynamic variables
    &0500-&05FF : FOR/GOSUB/REPEAT stacks
    &0600-&06FF : String buffer
    &0700-&07FF : INPUT and command buffer

Examining this memory in detail shows some areas that are unused, and some specifically reserved for the user to use when in BASIC.

    &0050-&008F : 64 bytes, with &0070-&008F officially reserved for the user
                  when calling from BASIC, though &50-&5F should be avoided.
    &046C-&047F : 20 bytes, calculator workspace, free when not evaluating an
                  expression
    &0480-&0481 :  2 bytes, theoretical pointer to variables starting with @
    &04B6-&04BD :  8 bytes, theoretical pointer to variables starting with [\]^
    &04FA-&04FF :  6 bytes spare
    &0596-&05A3 : 14 bytes spare

So, a very short utility could load to &0050 if it was no more than 64 bytes long, or &046C if it was no more than 20 bytes long. For instance, the following short piece of code displays the contents of any comment in the first line of the program.

    OSASCI=&FFE3
    FOR I%=0 TO 3 STEP 3
    P%=&60
    [OPT I%
    LDA #0:STA &8E
    LDA &18:STA &8F            :\ &8E/F=>PAGE
    LDY #4
    .loop1
    LDA (&8E),Y:INY            :\ Get byte from first line
    CMP #&F4:BEQ loop2         :\ Check for REM token
    CMP #13:BNE loop1          :\ Loop until end of line
    RTS                        :\ No REM found, just exit
    .loop2
    LDA (&8E),Y:INY            :\ Get byte from line
    JSR OSASCI                 :\ Print it
    CMP #13:BNE loop2          :\ Loop until end of line
    RTS
    ]NEXT

However, for something of comparable size to the 512 bytes in the tape buffer we need to look in more detail.
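
With the small workspace gaps listed above it is easy to overrun the space available. As a minimal sketch, the following lines could be appended after the ]NEXT of the short example above to report whether the assembled code fits. The names start% and limit% are illustrative and not part of the original listing. The limit of &8E is an assumption based on that example keeping its pointer at &8E/&8F.

    REM Run after the assembly loop, when P% is the first free byte
    start%=&60  :REM address the code was assembled at
    limit%=&8E  :REM first address that must not be overwritten
    PRINT "Code occupies &";~start%;" to &";~(P%-1)
    IF P%>limit% PRINT "Too long by ";P%-limit%;" bytes" ELSE PRINT limit%-P%;" bytes spare"

The same check applies to any other gap by changing start% and limit%, for example &046C and &0480 for the calculator workspace.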

String buffer
-------------
The string buffer at &0600 holds the results of string evaluations, and is used for the parameters for CALL. At other times it is free. Consequently, it can be used for a machine code utility that is no more than 256 bytes long with no impact on the rest of the BASIC environment. The following example creates a command *AS that saves the current program with a filename in a REM comment on the first line of the program.

    REM > AS/SRC
    DIM mcode% &200               :REM memory to assemble to
    load%=&600                    :REM Address to load to
    osbyte=&FFF4
    ptr=&70:cr=&0D:rem=&F4
    FOR L%=4 TO 7 STEP 3
    P%=load%:O%=mcode%
    [OPT L%
    .exec%
    LDA #21:LDX #0:JSR osbyte     :\ Clear keyboard buffer
    LDA &18:STA ptr+1
    LDY #0:STY ptr+0              :\ ptr=>PAGE
    LDA (ptr),Y
    CMP #13:BNE exit              :\ No program in memory
    INY:LDA (ptr),Y
    CMP #&FF:BEQ exit             :\ Empty program
    LDY #3
    .remloop
    INY:LDA (ptr),Y               :\ Get byte from line
    CMP #cr:BEQ exit              :\ Found <cr>, no REM
    CMP #rem:BNE remloop          :\ Loop until REM found
    .spaceloop
    INY:LDA (ptr),Y
    CMP #32:BEQ spaceloop         :\ Step past any spaces
    CMP #ASC">":BEQ foundname     :\ REM > filename
    CMP #34:BEQ foundname         :\ REM "filename
    DEY                           :\ No prefix, step back
    .foundname
    TYA:PHA:LDX #0                :\ Remember offset to filename
    .loop
    LDY save,X:JSR osbyte138      :\ Insert characters from
    INX:CPX #4:BNE loop           :\ the string at .save
    PLA:TAY                       :\ Get back offset to filename
    .getname
    INY:TYA:PHA:LDA (ptr),Y       :\ Get filename character
    CMP #cr:BEQ addcr             :\ End of line
    CMP #34:BEQ addcr             :\ Terminating quote
    TAY:JSR osbyte138             :\ Insert the character
    PLA:TAY:BNE getname
    .addcr
    PLA
    LDY #34:JSR osbyte138         :\ Insert terminating quote
    LDY #cr                       :\ Insert <cr>
    .osbyte138
    TXA:PHA                       :\ Save X
    LDA #138:LDX #0:JSR osbyte    :\ Insert Y into kbd buffer
    PLA:TAX                       :\ Restore X
    .exit
    RTS
    .save
    EQUS "SA."""
    ]:NEXT
    OSCLI "SAVE AS "+STR$~mcode%+" "+STR$~O%+" "+STR$~exec%+" "+STR$~load%

Checking for 6502 BASIC environment
-----------------------------------
Before we go any further, some important things must be covered.
You can issue *commands from any language, not just from BASIC, so code should check that BASIC is the current language before trying to access BASIC's memory. For example, it would make no sense if you were using View and entered a *command to examine a BASIC program in memory, as there wouldn't be a BASIC program there to look at.

Usefully, BASIC is considered a special-case ROM, so there is an OSBYTE variable that holds the ROM number of the onboard BASIC to use. Comparing this with the current language ROM number will check if BASIC is the current language.

    \ Check that BASIC is the current language
    LDA #187                      :\ Read BASIC ROM number
    .chkbasic
    STX tmp                       :\ Store ROM number from first pass
    LDX #0:LDY #255:JSR OSBYTE    :\ Read OSBYTE variable
    EOR #71:CMP #187:BNE chkbasic :\ Loop back to read current language ROM
    TXA:EOR tmp:AND #63           :\ Compare ROM numbers
    BEQ basicok                   :\ They match ...
    BRK:EQUB 249:EQUS "Not in BASIC":BRK

The top two bits of the ROM number are masked out as they are used as flags. b7 indicates that the *BASIC command is passed on as a *command instead of entering the ROM directly. b6 indicates that any automatic ROM relocation is suppressed.[2] Error number 249 is the "Bad language ROM" error number and is the appropriate error number here.

When machine code is run on the second processor it is temporarily made the current application and the top of memory is temporarily moved to below the code. This is so that if the code is a language and claims user memory, it is re-entered on Break. If the called code returns with RTS, the caller becomes the current application again and the top of memory is restored. However, if the code ends with an error, the caller's error handler is called but the code remains the current application and the top of memory is not restored. So, if code ran at &0600 and generated an error, if you did MODE 7 you would find HIMEM had dropped to &0600. If you pressed Break the code would be re-run.

If code wants to abort with an error it must restore the second processor caller's environment. When the code is called, if the return address is &F800-&FEFF it has been called on the second processor and the next stacked item is the old program environment. The following code should be used to restore the caller's environment before generating an error:

    .error
    PLA:PLA:TAY:INY:CPY #&F9:BCC error2     :\ Return address &F8xx-&FExx?
    PLA:STA &EE:STA &F2:PLA:STA &EF:STA &F3 :\ Restore caller's environment
    .error2
    BRK:EQUB errnum:EQUS "errstring":BRK

If multiple errors will be generated, or the code calls an operating system call that may generate an error (eg, saving a file), the environment should be restored when the code starts, leaving the stack unchanged for when the code returns normally.

    .chkstack
    TSX:LDY &102,X:INY:CPY #&F9
    BCC chkstack2
    LDA &103,X:STA &EE:STA &F2
    LDA &104,X:STA &EF:STA &F3
    .chkstack2

In addition to checking that BASIC is the current language, the code needs to be prevented from being executed on a non-6502 second processor. This is done by starting the code with a sideways ROM header indicating that the code contains 6502 code, which is checked by the Tube Client code. The simplest such header is the following twelve bytes:

    .exec%
    JMP start                     :\ Entry point
    BRK:BRK:BRK                   :\ No service entry
    EQUB &42                      :\ &40=Executable + &02=6502
    EQUB 8                        :\ Offset to copyright string
    EQUB 0:EQUS "(C)"             :\ Copyright string
    .start

If space allows it is best to have a full code header, as this allows you to include a title and version string identifying the code.

    .exec%
    JMP start                     :\ Entry point
    BRK:BRK:BRK                   :\ No service entry
    EQUB &42                      :\ &40=Executable + &02=6502
    EQUB copy-exec%               :\ Offset to copyright string
    EQUB &00                      :\ Binary version number
    EQUS "Title"                  :\ Title string
    EQUB &00
    EQUS "0.00 (01 Jan 1990)"     :\ Version string
    .copy
    EQUB 0:EQUS "(C)J.G.Harston"  :\ Copyright string
    EQUB 0
    .start

Using all of BASIC's workspace
------------------------------
If more than 256 bytes of memory are needed then either the variable pointers at &0400, the loop stacks at &0500 or the command buffer at &0700 will be overwritten. In all these cases the simplest solution is to decide that the command must return to the BASIC command prompt after executing, and so cannot be executed multiple times within a program. Consider the following example code:

    FOR A%=1 TO 4
    B=2^A%
    *test
    B=B+1
    NEXT A%

If *test overwrites the variables at &0400-&04FF then on return the interpreter won't be able to find any variables to be able to continue the program. If the loop stacks at &0500-&05FF are overwritten then the interpreter won't be able to complete any loops that *test is within.

If *test overwrites the command buffer at &0700 then it can't be called from the command prompt and return correctly. If you use:

    >*test

or

    >OSCLI "test":PRINT "done"

the rest of the line will be overwritten and the interpreter won't be able to get to the end of the line.

In all these cases the simplest thing to do is to tell the interpreter to execute an END to return to the command prompt. The easiest way to do this is to point BASIC's program pointer in PTRA to a <cr><ff> sequence, which marks the end of a program in memory.

    .exit
    LDA #end AND 255:STA &0B
    LDA #end DIV 256:STA &0C
    LDA #0:STA &0A
    RTS
    .end
    EQUB 13:EQUB 255              :\ <cr><ff>

This returns to the command prompt and also clears all the loop stacks so, for instance, typing NEXT A or RETURN at the command prompt won't inadvertently try to resume a non-existent loop.
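
To show the END technique in context, here is a minimal sketch of a complete source file for a utility that loads over the command buffer at &0700, prints a message and then returns to the command prompt. The name EndDemo, the labels msg and msglp and the message text are all illustrative. To keep the sketch short it assumes the 6502 code header and the check for BASIC described earlier are left out, so it is not suitable for general use as it stands.

    REM > EndDemo/SRC
    DIM mcode% &100               :REM memory to assemble to
    load%=&700                    :REM loads over the command buffer
    osasci=&FFE3
    FOR L%=4 TO 7 STEP 3
    P%=load%:O%=mcode%
    [OPT L%
    .exec%
    LDX #0
    .msglp
    LDA msg,X:JSR osasci          :\ Print message character
    INX:CMP #13:BNE msglp         :\ Loop until <cr> has been printed
    LDA #end AND 255:STA &0B      :\ Point PTRA to the <cr><ff> marker
    LDA #end DIV 256:STA &0C
    LDA #0:STA &0A
    RTS                           :\ BASIC now sees the end of a program
    .msg
    EQUS "Done":EQUB 13
    .end
    EQUB 13:EQUB 255              :\ <cr><ff>
    ]:NEXT
    OSCLI "SAVE EndDemo "+STR$~mcode%+" "+STR$~O%+" "+STR$~exec%+" "+STR$~load%

Because the code finishes by pointing PTRA at the end marker, anything following *EndDemo in a program will not be executed.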

If the variable pointers at &0480-&04FF are overwritten then the equivalent of CLEAR also needs to be executed. The easiest way to do this is to point PTRA to the following BASIC code instead:

    EQUD 13:EQUB &D8:EQUB 13:EQUB 255 :\ <cr><00><00><00><CLEAR><cr><ff>

However, if you need to abort with an error the corrupted variable pointers will still be in place and need to be cleared manually with the following code:

    .clear
    LDA #0:LDX #&7F
    .clear_lp
    STA &480,X:DEX:BPL clear_lp   :\ Clear variables
    LDA &00:STA &02               :\ Set VARTOP=LOMEM
    LDA &01:STA &03
    LDA #end AND 255:STA &0B      :\ Point PTRA to end marker
    LDA #end DIV 256:STA &0C
    LDA #0:STA &0A:RTS
    .end
    EQUB 13:EQUB 255              :\ <cr><ff>
    .err_escape
    JSR clear
    BRK:EQUB 17:EQUS "Escape":BRK

The static integer variable @% that controls print formatting is stored at &0400-&0403. As it is never reset to its default except by a *BASIC command it is best not to overwrite it, so the lowest address at which any code should start is &0404. You must be careful that the code that clears the variables isn't itself in the memory about to be cleared at &0480-&04FF.

Command line parameters
-----------------------
When a *command is called on the Master the *command string is copied into a buffer in private MOS workspace at &DF00. When the Tube is running it is copied into a buffer in the I/O processor at &0700. However, on the BBC B/B+ with no Tube active the *command string is parsed in memory where it is. A *command in a BASIC program will be somewhere above &0E00. A *command entered at the command prompt will be in the command buffer at &0700. A *command issued via OSCLI will be in the string buffer at &0600.

In the latter two cases, if the *command loads to memory at &0600 or &0700 then it will overwrite any command parameters before it can see them. Consequently, a *command that loads to &0600 or &0700 must not look for any command parameters and, vice versa, a *command that looks for parameters must not load to &0600 or &0700.

A more complicated method is to test whether the Tube is inactive and the returned parameter address is &06xx or &07xx, and to look for parameters only if that is not the case.

The command line parameters will always be in I/O memory, so they have to be collected by calling OSWORD 5 to read them.

    \ addr = 5-byte buffer in zero page
    LDA #1:LDY #0:LDX #addr:JSR OSARGS  :\ Read address of command line
    .rdcmdlp
    LDX #addr:LDY #0:LDA #5:JSR OSWORD  :\ Read byte from I/O memory
    INC addr+0:BNE P%+4:INC addr+1      :\ Increment command line address
    LDA addr+4                          :\ Get byte read from command line
    \ ... do something with it (preserving A)
    CMP #13:BNE rdcmdlp                 :\ Loop until <cr>

Sample programs
---------------
    ShowREM - Generates code to display the contents of the first REM line.
    AS/SRC  - Generates *AS command that saves the current BASIC program from
              an embedded filename in a REM statement. Loads to string buffer
              at &0600, so is able to do a simple return.
    VList/s - Generates *VList command that lists BASIC variables. Loads to
              &0500 and &0600 so has to return via an END, but does not
              disturb the variables.
    XS/SRC  - Generates an update of the Micro User *xs checksum program.
              Loads to the whole of &400-&7FF, so demonstrates clearing the
              overwritten variable pointers before exiting or generating an
              error.

Summary
-------
The following is a generic header that can start any code to execute in BASIC's workspace from &0404 onwards. It clears variables on exit and returns to the command prompt. This adds about 140 bytes of code to the program, but you have 1020 bytes of space to load to at &0404-&07FF, in contrast to the 512 bytes at &900-&AFF in the I/O processor, so this is a reasonable trade-off.

    FOR opt=4 TO 7 STEP 3
    P%=&404:O%=mcode%
    [OPT opt
    .exec%
    JMP start:BRK:BRK:BRK         :\ Header identifies
    EQUB &42:EQUB copy-exec%      :\ this as 6502 code
    EQUB &00:EQUS "Program Name"
    EQUB &00:EQUS "0.00 (01 Jan 2000)"
    .copy
    EQUB &00:EQUS "(C)My Name":EQUB 0
    :
    .start
    TSX:LDY &102,X:INY:CPY #&F9   :\ Check if on second processor
    BCC start2
    LDA &103,X:STA &EE:STA &F2    :\ Restore caller's environment
    LDA &104,X:STA &EF:STA &F3
    .start2
    LDA #187                      :\ Read BASIC ROM number
    .chkbasic
    STX tmp                       :\ Store ROM number from first pass
    LDX #0:LDY #255:JSR OSBYTE    :\ Read OSBYTE variable
    EOR #71:CMP #187:BNE chkbasic :\ Loop back to read current language ROM
    TXA:EOR tmp:AND #63           :\ Compare ROM numbers
    BNE errBasic:JSR main         :\ Call main code
    \ NB: .clear, .end and .errBasic must all assemble below &0480, otherwise
    \ the loop below erases them. With the placeholder strings above this
    \ listing appears to run past &047F, so keep the header strings short.
    .clear
    LDX #&7F:LDA #0:STA &0A       :\ Clear PTRA offset
    .clear_lp
    STA &480,X:DEX:BPL clear_lp   :\ Clear variables
    LDA &00:STA &02               :\ Set VARTOP=LOMEM
    LDA &01:STA &03
    LDA #end AND 255:STA &0B      :\ Point PTRA to end marker
    LDA #end DIV 256:STA &0C:RTS
    .end
    EQUB 13:EQUB 255              :\ <cr><ff>
    .errBasic
    JSR clear                     :\ Clear variables
    BRK:EQUB 249:EQUS "Not in BASIC":BRK
    :
    .main
    \ Main program code goes here
    RTS

References
----------
[1] http://mdfs.net/Docs/Comp/BBC/BASIC/Memory
[2] http://mdfs.net/Docs/Comp/BBC/BASIC/Osbyte187
[3] http://beebwiki.mdfs.net/Reading_command_line