Toi is an imperative, type-sensitive language that provides the basic functionality of a programming language. The language was designed and developed from the ground up by Paul Longtine.[1] Written in C, Toi was created as an educational experience and serves as a learning tool (or toy, hence the name) for those looking to familiarize themselves with the inner workings of a programming language.[2]
0 VOID - Null, no data
1 ADDR - Address type (bytecode)
2 TYPE - A `type` type
3 PLIST - Parameter list
4 FUNC - Function
5 OBJBLDR - Object builder
6 OBJECT - Object/Class
7 G_PTR - Generic pointer
8 G_INT - Generic integer
9 G_FLOAT - Generic double
10 G_CHAR - Generic character
11 G_STR - Generic string
12 S_ARRAY - Static array
13 D_ARRAY - Dynamic array
14 H_TABLE - Hashtable
15 G_FIFO - Stack
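Because Toi is written in C, the type table above maps naturally onto an enumeration. The following is a minimal sketch and is not taken from the Toi source: the `TOI_` prefix, the `toi_type` enum name and the `toi_type_name` helper are hypothetical. Only the numeric IDs and mnemonics come from the table.

```c
#include <stddef.h>

/* Type IDs as listed in the table; the TOI_ prefix is an illustrative choice. */
typedef enum {
    TOI_VOID = 0,
    TOI_ADDR,
    TOI_TYPE,
    TOI_PLIST,
    TOI_FUNC,
    TOI_OBJBLDR,
    TOI_OBJECT,
    TOI_G_PTR,
    TOI_G_INT,
    TOI_G_FLOAT,
    TOI_G_CHAR,
    TOI_G_STR,
    TOI_S_ARRAY,
    TOI_D_ARRAY,
    TOI_H_TABLE,
    TOI_G_FIFO,
    TOI_TYPE_COUNT /* number of entries, not a real type */
} toi_type;

static const char *const toi_type_names[TOI_TYPE_COUNT] = {
    "VOID", "ADDR", "TYPE", "PLIST", "FUNC", "OBJBLDR", "OBJECT", "G_PTR",
    "G_INT", "G_FLOAT", "G_CHAR", "G_STR", "S_ARRAY", "D_ARRAY", "H_TABLE",
    "G_FIFO"
};

/* Returns the mnemonic for a type ID, or NULL if the ID is out of range. */
const char *toi_type_name(int id)
{
    if (id < 0 || id >= TOI_TYPE_COUNT)
        return NULL;
    return toi_type_names[id];
}
```

Calling `toi_type_name(8)` in this sketch yields the string `G_INT`, and any ID outside 0 to 15 yields `NULL`.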
The runtime context keeps track of an individual thread's metadata, such as:
This context gives definition to an 'environment' where code is executed.
A key part of any operational computer language is the notion of a 'Namespace'. This notion refers to the ability to declare a name, along with needed metadata, and call upon the same name to retrieve the values associated with that name.
In this definition, the namespace will provide the following key mechanisms:
The scope argument is a single byte, where the format is as follows:
Namespace|Scope
0000000  |0
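The byte layout above can be unpacked with a shift and a mask. The sketch below assumes the diagram reads most significant bit first, so the namespace occupies the upper seven bits and the scope flag is the least significant bit. The function and constant names are hypothetical, and treating '1' as the global scope is inferred from the text stating that '0' denotes the local scope.

```c
#include <stdint.h>

/* Scope flag values; SCOPE_GLOBAL = 1 is an inference, not stated outright. */
enum { SCOPE_LOCAL = 0, SCOPE_GLOBAL = 1 };

/* Upper seven bits: which namespace the name lives in. */
uint8_t scope_namespace(uint8_t arg)
{
    return (uint8_t)(arg >> 1);
}

/* Least significant bit: the scope flag. */
uint8_t scope_bit(uint8_t arg)
{
    return (uint8_t)(arg & 0x01);
}

/* Packs a namespace index (0..127) and a scope flag into one byte. */
uint8_t scope_pack(uint8_t ns, uint8_t scope)
{
    return (uint8_t)(((ns & 0x7F) << 1) | (scope & 0x01));
}
```

Under these assumptions, `scope_pack(1, SCOPE_LOCAL)` produces the byte `0x02`.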
Scopes are handled by referencing either the Global Scope or the Local Scope. The Local Scope is denoted by '0' in the scope argument when referring to names, and this scope is initialized when evaluating any new block of code. When a different block of code is called, a new scope is added as a new Namespace level. Namespace levels act as context switches within function contexts. For example, the local namespace must be 'returned to' if that local namespace context needs to be preserved on return. Pushing 'Namespace levels' ensures that for every n function calls, you can traverse n instances of previous namespaces. For example, take this namespace level graphic, where each Level is a namespace instance:
Level 0: Global namespace, LSB
'0'.
When a function is called, another namespace level is created and the local level increases, like so:
Level 0: Global namespace, LSB
'0'.
Global scope names (LSB
The Namespace argument refers to which Namespace the variable exists in. When the namespace argument equals 0, the current namespace is referenced. The global namespace is 1 by default, and any other namespaces must be declared by using the
Variables in this definition provide the following mechanisms:
For a given variable V, V defines the following attributes:
V -> Ownership
V -> Type
V -> Pointer to typed space in memory
Each variable then can be handled as a generic container.
In the previous section, the notion of Namespace levels was introduced. Much like how names are scoped, generic variable containers must communicate their scope in terms of location within a given set of scopes. This is what is called 'Ownership'. In a given runtime, variable containers can exist in the following structures: a stack instance, bytecode arguments, and namespaces.
The concept of ownership differentiates variables existing on one or more of these structures. This is set in place to prevent accidental deallocation of variable containers that are not copied, but instead passed as references to these structures.
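One way to picture the container and its ownership check is the sketch below. It is not Toi's actual implementation: the `var_cont` struct, the helper names, and the choice to model ownership as an integer namespace level are all assumptions made for illustration. The idea shown is only that a structure holding a borrowed reference must not free the container.

```c
#include <stdlib.h>

/* Hypothetical container mirroring the three attributes listed for V. */
typedef struct {
    int   ownership; /* level of the structure that owns this container (assumption) */
    int   type;      /* type ID, e.g. 8 for G_INT */
    void *data;      /* pointer to typed space in memory */
} var_cont;

/* Allocates a G_INT container owned by owner_level; returns NULL on failure. */
var_cont *var_new_int(int owner_level, int value)
{
    var_cont *v = malloc(sizeof *v);
    if (v == NULL)
        return NULL;

    int *slot = malloc(sizeof *slot);
    if (slot == NULL) {
        free(v);
        return NULL;
    }

    *slot = value;
    v->ownership = owner_level;
    v->type = 8; /* G_INT */
    v->data = slot;
    return v;
}

/* Frees the container only when the releasing structure is its owner.
   Returns 1 if freed, 0 if the container was only a borrowed reference. */
int var_release(var_cont *v, int releasing_level)
{
    if (v == NULL || v->ownership != releasing_level)
        return 0;
    free(v->data);
    free(v);
    return 1;
}
```

In this sketch a container created with `var_new_int(1, 42)` survives `var_release(v, 2)` and is freed only by `var_release(v, 1)`.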
Functions in this virtual machine are a pointer to a set of instructions in a program with metadata about parameters defined.
In this paradigm, objects are units that encapsulate a separate namespace and collection of methods.
Bytecode is arranged in the following order:
<opcode>, <arg 0>, <arg 1>, <arg 2>
Where the
A bytecode instruction is a single-byte opcode, followed by at maximum 3 arguments, which can be in the following forms:
Below is the specification of all the instructions with a short description for each instruction, and instruction category:
Keywords:
TOS - 'Top Of Stack' The top element
TBI - 'To be Implemented'
S<[variable]> - Static Argument.
N<[variable]> - Name.
A<[variable]> - Address Argument.
D<[variable]> - Dynamic bytecode argument.
----
Hex | Mnemonic | arguments - description
These subroutines operate on the current-working stack(1).
----
10 POP S<n> - pops the stack n times.
11 ROT - rotates top of stack
12 DUP - duplicates the top of the stack
13 ROT_THREE - rotates top three elements of stack
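The four stack instructions can be sketched over a fixed-size array. This is an illustration under stated assumptions rather than Toi's own code: the `vm_stack` type and the `op_*` names are hypothetical, the stack holds plain integers instead of variable containers, ROT is assumed to swap the top two elements, and ROT_THREE is assumed to move the top element down two places.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_MAX 64

/* Hypothetical integer stack; items[size - 1] is TOS. */
typedef struct {
    int    items[STACK_MAX];
    size_t size;
} vm_stack;

void stk_push(vm_stack *s, int v)
{
    assert(s->size < STACK_MAX);
    s->items[s->size++] = v;
}

/* 10 POP S<n>: pops the stack n times. */
void op_pop(vm_stack *s, size_t n)
{
    assert(n <= s->size);
    s->size -= n;
}

/* 11 ROT: swaps the top two elements (assumed meaning of "rotates"). */
void op_rot(vm_stack *s)
{
    assert(s->size >= 2);
    int tos = s->items[s->size - 1];
    s->items[s->size - 1] = s->items[s->size - 2];
    s->items[s->size - 2] = tos;
}

/* 12 DUP: duplicates the top of the stack. */
void op_dup(vm_stack *s)
{
    assert(s->size >= 1);
    stk_push(s, s->items[s->size - 1]);
}

/* 13 ROT_THREE: moves TOS down two places and lifts the other two
   elements up by one (assumed direction). */
void op_rot_three(vm_stack *s)
{
    assert(s->size >= 3);
    int tos = s->items[s->size - 1];
    s->items[s->size - 1] = s->items[s->size - 2];
    s->items[s->size - 2] = s->items[s->size - 3];
    s->items[s->size - 3] = tos;
}
```

With 1, 2, 3 pushed in that order, `op_rot_three` leaves the stack as 3, 1, 2 from bottom to top in this sketch.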
20 DEC S<scope> S<type> N - declare variable of type
21 LOV S<scope> N - loads reference variable on to stack
22 STV S<scope> N - stores TOS to reference variable
23 CTV S<scope> N D<data> - loads constant into variable
24 CTS D<data> - loads constant into stack
Types are in the air at this moment. I'll detail what types there are when the time comes
----
30 TYPEOF - pushes type of TOS on to the stack TBI
31 CAST S<type> - Tries to cast TOS to <type> TBI
OPS take the two top elements of the stack, perform an operation and push the result on the stack.
----
40 ADD - adds
41 SUB - subtracts
42 MULT - multiplies
43 DIV - divides
44 POW - power, TOS^TOS1 TBI
45 BRT - base root, TOS root TOS1 TBI
46 SIN - sine TBI
47 COS - cosine TBI
48 TAN - tangent TBI
49 ISIN - inverse sine TBI
4A ICOS - inverse cosine TBI
4B ITAN - inverse tangent TBI
4C MOD - modulus TBI
4D OR - or's TBI
4E XOR - xor's TBI
4F NAND - and's TBI
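The pop-two, push-one pattern shared by these instructions can be sketched for the four implemented opcodes. The `exec_arith` function and `OP_*` names are hypothetical, and the stack is simplified to an array of `long`. The operand order is an assumption: the POW entry reads "TOS^TOS1", so this sketch treats TOS as the left operand for SUB and DIV as well.

```c
#include <stddef.h>

/* Opcode values from the table; the OP_ names are illustrative. */
enum { OP_ADD = 0x40, OP_SUB = 0x41, OP_MULT = 0x42, OP_DIV = 0x43 };

/* Applies one arithmetic opcode to a stack of *size elements, where
   stack[*size - 1] is TOS. Returns 0 on success, or -1 on underflow,
   an unsupported opcode, or division by zero (stack left untouched). */
int exec_arith(unsigned char opcode, long *stack, size_t *size)
{
    if (*size < 2)
        return -1;

    long tos  = stack[*size - 1];
    long tos1 = stack[*size - 2];
    long result;

    switch (opcode) {
    case OP_ADD:  result = tos + tos1; break;
    case OP_SUB:  result = tos - tos1; break; /* assumed order: TOS - TOS1 */
    case OP_MULT: result = tos * tos1; break;
    case OP_DIV:
        if (tos1 == 0)
            return -1;
        result = tos / tos1;                  /* assumed order: TOS / TOS1 */
        break;
    default:
        return -1;
    }

    *size -= 1;                  /* two operands consumed, one result pushed */
    stack[*size - 1] = result;
    return 0;
}
```

Returning an error code instead of aborting keeps the sketch easy to call from a larger dispatch loop. How Toi itself reports such errors is not described in this section.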
Things for comparison, < > = ! and so on and so forth. Behaves like Arithmetic instructions, besides NOT instruction. Pushes boolean to TOS
----
50 GTHAN - Greater than
51 LTHAN - Less than
52 GTHAN_EQ - Greater than or equal to
53 LTHAN_EQ - Less than or equal to
54 EQ - Equal to
55 NEQ - Not equal to
56 NOT - Inverts TOS if TOS is boolean
57 OR - Boolean OR
58 AND - Boolean AND
60 STARTL - Start of loop
61 CLOOP - Conditional loop. If TOS is true, continue looping, else break
6E BREAK - Breaks out of loop
6F ENDL - End of loop
These instructions dictate code flow.
----
70 GOTO A<addr> - Goes to address
71 JUMPF A<n> - Goes forward
80 GETN N<name> - Returns variable associated with name in object
81 SETN N<name> - Sets the variable associated with name in object. Object on TOS, Variable on TOS1
82 CALLM N<name> - Calls method in object
83 INDEXO - Index an object, uses argument stack
84 MODO S<OP> - Modify an object based on op. [+, -, *, /, %, ^ .. etc.]
FF DEFUN N S<type> D<args> - Un-funs everything. no, no- it defines a function. N is its name, S<type> is the return value, D<args> is the args.
FE DECLASS ND
00 NULL - No-op
01 LC N<name> - Calls OS function library, i.e. I/O, opening files, etc. TBI
02 PRINT - Prints whatever is on the TOS.
03 DEBUG - Toggle debug mode
0E ARGB - Builds argument stack
0F PC S - Primitive call, calls a subroutine A. A list of TBI primitive subroutines providing methods to tweak objects this bytecode set cannot touch. Uses argstack.
Going from code to bytecode is what this section is all about. First off, an abstract notation for the code will be broken down into a binary tree like so:
        <node>
          /\
         /  \
        /    \
     <arg>  <next>
<node> can be an argument of its parent node, or the next instruction. Instruction nodes are nodes that will produce one or more instructions, based on the bytecode interpretation of the instruction. For example, this line of code:
int x = 3
would translate into:
           def
            /\
           /  \
          /    \
         /      \
        /        \
      int        set
      /\          /\
     /  \        /  \
  null  'x'    'x'  null
               /\
              /  \
           null   3
Functions are expressed as individual binary trees. The root of any file is treated as an individual binary tree, as this is also a function.
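The tree for `int x = 3` can be built by hand as a small sketch. The `node` struct and its helpers are hypothetical and not taken from the Toi compiler. The left child is stored in `arg` and the right child in `next`, following the `<arg>` and `<next>` positions in the notation above, and `null` leaves are represented as NULL pointers.

```c
#include <stdlib.h>

/* Hypothetical tree node: left child is `arg`, right child is `next`. */
typedef struct node {
    const char  *label;
    struct node *arg;
    struct node *next;
} node;

/* Allocates a node; returns NULL if allocation fails. */
node *node_new(const char *label, node *arg, node *next)
{
    node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->label = label;
    n->arg = arg;
    n->next = next;
    return n;
}

/* Counts the non-null nodes in a tree. */
size_t node_count(const node *n)
{
    if (n == NULL)
        return 0;
    return 1 + node_count(n->arg) + node_count(n->next);
}

/* Frees a whole tree; labels are string literals and are not freed. */
void node_free(node *n)
{
    if (n == NULL)
        return;
    node_free(n->arg);
    node_free(n->next);
    free(n);
}

/* Builds the tree shown for `int x = 3`, bottom up. Allocation failures
   of inner nodes are not handled here, to keep the sketch short. */
node *build_int_x_3(void)
{
    node *three = node_new("3", NULL, NULL);
    node *x_set = node_new("'x'", NULL, three);
    node *set   = node_new("set", x_set, NULL);
    node *x_def = node_new("'x'", NULL, NULL);
    node *type  = node_new("int", NULL, x_def);
    return node_new("def", type, set);
}
```

A bytecode generator would then walk such a tree from the root. That traversal is left out of this sketch.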
The various instruction nodes are as follows:
The various instruction nodes within the tree will call specific functions that will take arguments specified and lookahead and lookbehind to formulate the correct bytecode equivalent.
The developer of the language, Paul Longtine, operates a publicly available website and blog called banna.tech, named after his online alias 'banna'.