\chapter{An Overview of {\rm oggin}}
{\rm oggin}, written by D. Budgen as a teaching aid, is a simulated 16-bit processor which can be used on the University's VAX computer. It consists of an assembler together with an interpreter which executes an assembled program, informing the user of the state of the processor after each instruction cycle.
\section{Introduction}
An important topic in Computing Science is how computers are structured, which means taking a `low-level' view of the machine itself. Some aspects of the instruction forms, addressing modes etc. are not always easy to visualise, so the {\rm oggin} machine has been simulated on the VAX in order to allow its operations to be examined step by step. \\ \\
\noindent {\rm oggin} is not based on any one real machine. It is simplified (real assemblers and assembly languages are more complicated), but the simplicity aids understanding, and it has features that are found in various machines that students are likely to encounter. The simulation is composed of two parts : \\
\begin{itemize}
\item {\bf An Assembler.} This assembles the instructions given to {\rm oggin} from the `source file' and loads them into {\rm oggin}'s `memory'. If there are no errors during assembly then control is passed to the second part, which is :
\item {\bf An Interpreter.} This `executes' the instructions stored in {\rm oggin}'s memory and prints out the state of the main components of the machine after each instruction has been executed. In this way the effects of different instructions can be seen and clarified by experimenting with the machine.
\end{itemize}
\section{Specifications of {\rm oggin}}
{\rm oggin} is a 16-bit processor and has five registers in the central processor unit. Its address bus is 10 bits wide, allowing it to address up to 1 Kwords of main memory. The full 1 Kwords of memory are available to the user.
\\ \\
\hspace*{10mm} {\bf Registers} \\
\hspace*{10mm} These are as follows :
\begin{enumerate}
\item {\em Program Counter} (pc). This is ten bits wide and contains the address in main memory of the next instruction to be executed.
\item {\em Accumulator} (acc). This is a 16-bit register used for arithmetic and logical manipulation of data. As will be seen from the instruction set summary, almost all instructions involve some use of the accumulator.
\item {\em Stack Pointer} (sp). This is used to provide `stack' facilities in {\rm oggin}. It is used by the instructions which transfer control to subroutines, and by a pair of instructions which store the contents of the accumulator on the stack and load the accumulator from the stack. The use of this register is described in more detail later.
\item {\em Index Register} (ir). This is used to support the `indexed' addressing mode used by some instructions.
\item {\em Processor Status}. This is an 8-bit register, though only four bits are significant and the other four can be ignored. These four bits are :
\begin{itemize}
\item \begin{tabbing} lllllllll \= l \kill N-bit \> : Set whenever the result of the preceding operation was negative. \end{tabbing}
\item \begin{tabbing} lllllllll \= l \kill Z-bit \> : Set whenever the result of the preceding operation was zero. \end{tabbing}
\item \begin{tabbing} lllllllll \= ll \= ll \kill V-bit \> : Set whenever the result of the preceding operation led to an overflow\\ \> \> condition. \end{tabbing}
\item \begin{tabbing} lllllllll \= l \kill C-bit \> : Set whenever a `carry' from the accumulator is generated by an operation. \end{tabbing}
\end{itemize}
\end{enumerate}
\section{Stack Use in {\rm oggin}}
The Stack is used in the following way.
Whenever a `push' onto the stack occurs, then the sequence of internal operations is : \\ \\ \begin{it} decrement sp, \\ load data from acc into memory location addressed by contents of sp.\\ \end{it} \\ When a `pop' from the stack occurs then these actions are reversed by the sequence : \\ \begin{it} load acc from memory location addressed by contents of sp, \\ increment sp.\\ \end{it} \section{Addressing Modes} Those instructions that need to access memory (those in `group 1'), can mostly do so in four different ways, using the four addressing modes provided by {\rm oggin}. Briefly these are : \subsection{Immediate mode.} The operand is contained in the least significant ten bits of the instruction itself, and so no accesses to main memory are actually needed for this mode. \subsection{Direct mode.} The least significant 10 bits of the instruction define the address in main memory which is to be used. \subsection{Indirect mode.} The least significant 10 bits of the instruction define an address in main memory which itself contains the address of the memory location to be used. \subsection{Indexed mode.} The least significant 10 bits of the instruction are added to the contents of the Index Register to form the address which is to be used. The Index Register contents are unchanged. \section{Initial State of Machine} At the beginning of the interpretation pass the registers of {\rm oggin} are preset as follows :\\ \begin{tabbing} xxxxxx \= hhhhh \kill \> Program Counter (pc) is set to zero.\\ \> Stack Pointer (sp) is set to point to [ top of memory + 1 ].\\ \> Index Register is set to zero.\\ \> Accumulator is set to zero.\\ \> Status bits in the Processor Status Register are all cleared.\\ \end{tabbing} This has the effect that program execution always begins from location zero. 
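
The initial register values, the push and pop sequences and the address calculations described above can be modelled in a few lines of Python. This is only an illustrative sketch and not the simulator's own code: the class {\tt Machine} and its method names are hypothetical, and the wrap-around of addresses to ten bits is an assumption, since the behaviour for out-of-range addresses is not specified here.

```python
MEMORY_WORDS = 0o2000      # 1 Kwords: addresses 0 to 1777 octal
ADDRESS_MASK = 0o1777      # keeps an address within 10 bits (wrap-around is an assumption)


class Machine:
    """Illustrative model of the oggin registers; not the simulator's own code."""

    def __init__(self):
        self.memory = [0] * MEMORY_WORDS   # unused locations hold zero
        self.pc = 0                        # execution starts at location zero
        self.sp = MEMORY_WORDS             # top of memory + 1
        self.ir = 0
        self.acc = 0

    def push(self):
        """Decrement sp, then store acc at the location sp addresses."""
        self.sp -= 1
        self.memory[self.sp] = self.acc

    def pop(self):
        """Load acc from the location sp addresses, then increment sp."""
        self.acc = self.memory[self.sp]
        self.sp += 1

    def effective_address(self, mode, field):
        """Address used by direct ('d'), indirect ('i') or indexed ('x') mode."""
        if mode == "d":
            return field
        if mode == "i":
            return self.memory[field] & ADDRESS_MASK
        if mode == "x":
            return (field + self.ir) & ADDRESS_MASK
        raise ValueError("immediate mode uses the operand field itself, not an address")


m = Machine()
m.acc = 0o577
m.push()                                   # sp goes from 2000 to 1777 (octal)
print(format(m.sp, "06o"), format(m.memory[m.sp], "06o"))   # prints 001777 000577
```

Because the stack pointer starts one word above the top of memory and is decremented before each store, the first value pushed lands in location 1777. The sketch does not guard against popping from an empty stack.
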
Unused locations in main memory are loaded with the value zero, so that any error in transfer of control will generally lead to the machine stopping, as the op-code for the `stp' instruction is also zero.
\section{Instruction Set Overview}
The instruction set for {\rm oggin} falls into three groups. A summary of the groups is as follows :\\ \\
{\bf Group 1} \\
The sixteen-bit word containing the instruction is divided into the following three fields :\\ \\
\hspace*{10mm}Bits $<$0 : 9$>$ are used as the operand field, to be interpreted according to the addressing mode used.\\
\hspace*{10mm}Bits $<$10 : 11$>$ are used to define the addressing mode to be used. \\
\hspace*{10mm}Bits $<$12 : 15$>$ define the actual function code itself.\\ \\
When using a group 1 instruction, the two bits allocated for the mode field are used as follows :\\
\begin{tabbing}
lllll \= llllll \= lll \kill
\> 00\> for immediate mode; descriptor `\#' \\
\> 01\> for direct mode; descriptor `d' \\
\> 10\> for indirect mode; descriptor `i' \\
\> 11\> for indexed mode; descriptor `x' \\
\end{tabbing}
(Unfortunately, when this field is read in the octal representation of the complete op-code, the values are shifted one place left within the appropriate octal digit, so becoming 0, 2, 4 and 6 rather than 0, 1, 2 and 3.)
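
The field layout above can be checked with a short Python sketch that packs the three fields into a 16-bit word and prints it in octal. The helper names {\tt encode\_group1} and {\tt octal\_word} are hypothetical and are not part of {\rm oggin}; function code 01 with direct mode and operand 50 is used purely as an example.

```python
# Hypothetical helper (not part of oggin): packs the three Group 1 fields into a word.
MODE_BITS = {"#": 0b00, "d": 0b01, "i": 0b10, "x": 0b11}


def encode_group1(function_code, mode, operand):
    """Bits 12-15 hold the function code, bits 10-11 the mode, bits 0-9 the operand."""
    if not 0 <= function_code <= 0o17:
        raise ValueError("function code must fit in 4 bits")
    if not 0 <= operand <= 0o1777:
        raise ValueError("operand must fit in 10 bits")
    return (function_code << 12) | (MODE_BITS[mode] << 10) | operand


def octal_word(word):
    """Six-digit octal form of a 16-bit word."""
    return format(word, "06o")


word = encode_group1(0o01, "d", 0o50)      # function code 01, direct mode, operand 50
print(octal_word(word))                    # prints 012050
print((word >> 9) & 0o7)                   # prints 2: mode 01 shifted one place left
```

The third octal digit covers bits 9 to 11, so it holds the two mode bits above the top bit of the operand, which is why mode 01 appears as 2 in the octal op-code.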
\\ \\ The Group 1 function codes and their functions are as follows : \\ \\ \begin{tabular}{ccl} {\bf Function} & {\bf Assembler} & {\bf Function} \\ {\bf code} & {\bf Mnemonic} & \\ \hline 01 & lda & Load operand to accumulator \\ 02 & sta & Store contents of accumulator \\ 03 & ldi & Load operand to Index Register \\ 04 & add & Add operand to Accumulator \\ 05 & sub & Subtract operand from acc \\ 06 & cmp & Compare operand with accumulator \\ 07 & and & Logical AND operand with acc \\ 11 & ora & Inclusive OR operand with acc \\ 12 & jsr & Jump to subroutine, storing pc on stack \\ 13 & jmp & Unconditional jump \\ \end{tabular} \newpage {\bf Group 2}\\ \\ \\ These are distinguished by function bits $<$12 : 15$>$ = 00.\\ \\ Bits $<$0 : 11$>$ then define the op-code as there are no arguments. The acc is the implied argument as any operation is carried out on the contents of acc. \\ \\ \begin{tabular}{ccl} {\bf Function} & {\bf Assembler} & {\bf Function} \\ {\bf code} & {\bf Mnemonic} & \\ \hline 000000 & stp & Stop execution (end {\rm oggin} run) \\ 000001 & rts & Return from subroutine \\ 000002 & asr & Arithmetic right shift of acc by one place \\ 000003 & asl & Arithmetic left shift of acc by one place \\ 000004 & neg & 2's complement contents of acc \\ 000005 & psh & Push acc contents on to stack \\ 000006 & pul & `Pop' stack into acc \\ 000007 & clr & Clear (zero) accumulator \\ 000010 & inc & Increment Index Register by 1 \\ 000011 & trp & Perform a synchronous trap \\ 000012 & rtt & Return from trap \\ 000013 & sti & Store Index Register \\ 000014 & rir & Restore Index Register \\ 000015 & spi & Copy stack pointer to Index Register \\ 000016 & lsr & Logical shift right one place \\ 000017 & lsl & Logical shift left one place \\ \end{tabular} \newpage {\bf Group 3} \\ \\ \\ The format for these is similar to those of Group 1, but they are distinguished by having the Function code field set to either 10 or 17 and the address mode bits are used to extend the 
Function code field, so giving a format : \\ \\
Bits $<$0 : 9$>$ operand (in this case, an address in main memory)\\
Bits $<$10 : 15$>$ Extended function code \\ \\
The instructions in this group are all conditional branching instructions. In each case, if the condition is satisfied then the pc is reloaded with the contents of the operand field and control is thereby transferred to that address. \\ \\
\begin{tabular}{ccl}
{\bf Function} & {\bf Assembler} & {\bf Function} \\
{\bf code} & {\bf Mnemonic} & \\ \hline
100 & beq & Branch if result was zero \\
102 & bne & Branch if result was not zero \\
104 & bgt & Branch if result $>$ zero \\
106 & bge & Branch if result $\geq$ zero \\
170 & blt & Branch if result $<$ zero \\
172 & ble & Branch if result $\leq$ zero \\
174 & bcs & Branch if carry bit set (=1) \\
176 & bcc & Branch if carry bit clear (=0) \\
\end{tabular} \\
\section{Input Data of {\rm oggin}}
The input to {\rm oggin} consists of three basic forms of input line: comment lines, instructions and constants. \\
\begin{enumerate}
\item {\em comment lines} : start with `;'
\item {\em instructions} : obey the format, \\ \hspace*{20mm}{\it \{label\} op-code \{address mode\} \{operand\}\{comment\}}
\item {\em constants} : obey the format, \\ \hspace*{20mm}{\it \{label\} constant}
\end{enumerate}
All numbers used in an {\rm oggin} source file are integers and are treated as octal. No `real' constants (only initialised variables) or label names are accepted by {\rm oggin}; all labels are octal numbers. Unlike other assemblers, such as that for the PDP-11, {\rm oggin} does not have pseudo-instructions. \\ \\
\noindent A simple example of a very short {\rm oggin} program follows and may make these points clearer.
\begin{verbatim}
;
; A simple example of oggin program
;
0       lda,#,50        ; load octal 50 into acc
        add,d,60        ; add contents of location 60
        neg             ; and negate
        jmp,#,20        ; now transfer to location 20
;
; this is a comment line and will be ignored
;
; the next code will be loaded from location 20
20      asl             ; shift left one place
        stp             ; and stop
;
; now some constants
;
60      +177
64      -400
;
; end of program
\end{verbatim}
\section{Running the {\rm oggin} machine}
{\rm oggin} is run by typing : \\
\hspace*{20mm} {\it {\rm oggin} file1 file2 file3 } \\ \\
file1 is the file containing the source of the {\rm oggin} program; {\rm oggin} assumes that it has an extension of `.ogg'. This filename must be supplied and is the only one which is essential when using {\rm oggin}. \\
file2 is an optional file to hold the assembler listing. \\
file3 is an optional file to hold the run-time output generated by the {\rm oggin} interpreter. If file3 is required but file2 is not, then `tt:' must be given in place of file2.
\section{Output of {\rm oggin}}
A listing file is produced by the assembler and a file of execution results is generated by the interpreter. Both files contain information that helps in understanding how the instructions operate.
\subsection{Assembler Output}
We have already seen that there are three basic forms of input line to the assembler: comment lines, instructions and constants. On the listing, these are all handled in a fairly consistent form as follows : \\ \\
Comment lines : Lines that are purely comments are simply echoed in the listing, after stripping any leading spaces or tab characters.
\\ Instructions : These are printed as follows :
\begin{it} \begin{tabbing} llllllll \= llllllllllllll \= lllllll \= lllllllllllllll \= lllllll \= l \kill \> address\> \> generated code\> \> source line \\ \end{tabbing} \end{it}
and emerge looking something like: \\
\begin{verbatim}
000103  012050  103 lda,d,50    ; load direct
\end{verbatim}
This shows that at location 000103 the instruction {\it lda,d,50} has been assembled into the machine code form {\it 012050}. Looking at the {\it 012050}, you can see that it is made up of the {\it lda} instruction (op-code 01), the mode bits for direct mode (01, shown as 2) and the operand field {\it 50}. The first two columns are generated by the assembler; the rest of the line is the code taken from the source file and simply echoed. So the very short example program that we have already seen will be assembled as : \\
\begin{verbatim}
; A simple example of oggin program
;
000000  010050  0       lda,#,50        ; load octal 50 into acc
000001  042060          add,d,60        ; add contents of 60
000002  000004          neg             ; and negate
000003  130020          jmp,#,20        ; now transfer to location 20
;
; this is a comment line
; the next code will be loaded from location 20
000020  000003  20      asl             ; shift left one place
000021  000000          stp             ; and stop
;
; now some constants
;
000060  000177  60      +177
000064  177400  64      -400
;
; end of program
\end{verbatim}
\subsection{Interpreter Output}
The interpreter will execute each instruction in turn, always beginning execution with the instruction at location 0 of main memory; if nothing has been loaded at this location, the interpreter will stop immediately. For each instruction that is executed, the interpreter will print out a one-line message containing the following information :
\begin{enumerate}
\item The address of the instruction being executed
\item The octal contents of that address (i.e.
the instruction)
\item The contents of the accumulator after execution
\item The contents of the stack pointer after execution
\item The contents of the Index Register after execution
\item The values of the four status bits after execution
\item The new value loaded into the program counter
\end{enumerate}
A typical line will therefore look like : \\ \\
pc=000002 inst=010577 acc=000577 sp=002000 index=000000 CNZV=0000 pc$<$-000003 \\ \\
(Note that, as the memory range runs from 0 to 1777 octal, the stack pointer initially points to the maximum address plus one (2000), because the stack pointer is always decremented before it is used.) \\
When the stack actually contains data, and the stack pointer is not pointing above the top of available memory, then the contents of the stack will be printed out on further lines below these first two standard lines. This printout will be from top of stack down, so that the top (and oldest) value is printed first and the most recently added value is printed last.
\subsection{Limiting Output}
Before beginning execution, the {\rm oggin} interpreter will prompt the user with the message : \\
\begin{it} \hspace*{15mm} do you want to limit run-time output ? \\ \end{it} \\
to which the answer should be either : \\
\begin{it} \hspace*{15mm} y$<$cr$>$ or n$<$cr$>$ \\ \end{it} \\
If the response is `n' then the interpreter will print out the details of every instruction that it executes. If the response is `y' then the user is prompted to input lower and upper bounds, and printout is generated only when instructions stored at addresses within those bounds are executed. \\
For example, after the dialogue \\ \\
\begin{it} \hspace*{15mm} do you want to limit run-time output ? y \\ \hspace*{15mm} enter lowest pc value to be printed (octal) : 10 \\ \hspace*{15mm} enter max pc value to be printed (octal) : 12 \\ \end{it} \\
register contents will be printed only when an instruction stored at location 10, 11 or 12 is executed.
\\ \\ Similarly, at the end of program execution, the interpreter gives the user the chance to `dump' the contents of a block of memory. If the answer to the prompt \\ \\
\begin{it} \hspace*{15mm} do you want a memory dump ? \end{it} \\ \\
is `y$<$cr$>$' then the user will again be prompted for boundary values, and the contents of the selected block of memory will be printed out (`dumped') as a sequence of octal values. If the response is `n$<$cr$>$' then no dumping will occur.