AVM documentation

Stack

Since AVM is a stack-based VM, the stack is its central component. All operations (arithmetic, logic and so on) are performed on the stack.

The stack holds words, which are 64-bit integers (since AVM is 64-bit), but they can also be interpreted as floats when needed.

It is 65536 bytes in size (that is 8192 words, which should be enough for everyone). The stack is indexed by the stack pointer register, which points to the top element of the stack. This register (like all the other registers) cannot be modified directly, but there are instructions that modify it internally.
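As an illustration of the paragraphs above, here is a minimal Python sketch of a fixed-size word stack whose words can be reinterpreted as floats. It is not AVM's actual implementation: the names Stack, float_to_word and word_to_float are hypothetical, and the stack pointer is modelled as a count of used words rather than the index of the top element.

```python
import struct

STACK_WORDS = 65536 // 8      # 65536 bytes / 8 bytes per word = 8192 words
WORD_MASK = (1 << 64) - 1     # keeps values within 64 bits


class Stack:
    """Hypothetical model of the AVM stack (not the real implementation)."""

    def __init__(self):
        self.words = [0] * STACK_WORDS
        self.sp = 0  # assumption: number of words currently on the stack

    def push(self, word):
        if self.sp >= STACK_WORDS:
            raise OverflowError("stack overflow")
        self.words[self.sp] = word & WORD_MASK
        self.sp += 1

    def pop(self):
        if self.sp == 0:
            raise IndexError("stack underflow")
        self.sp -= 1
        return self.words[self.sp]


def float_to_word(value):
    """Reinterpret the bits of a 64-bit float as an unsigned 64-bit word."""
    return struct.unpack(">Q", struct.pack(">d", value))[0]


def word_to_float(word):
    """Reinterpret an unsigned 64-bit word as a 64-bit float."""
    return struct.unpack(">d", struct.pack(">Q", word))[0]


stack = Stack()
stack.push(float_to_word(1.5))  # a float stored as raw bits
stack.push(42)                  # a plain integer word
```

The same 64 bits mean different things depending on which instruction reads them, which is why the integer and floating point instructions are kept separate.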

Memory

The stack is used for operations and temporary data, but we also need somewhere to store data for a longer time. That is what the memory is for.

It holds bytes (8-bit integers), but 16-, 32- and 64-bit integers can be stored in it too using the appropriate instructions.

The first byte of the memory should be ignored, because its address is 0, which is used for null pointers.

Call stack

The call stack is similar to the normal stack, but it is only used for pushing instruction addresses. Specifically, the CAL and RET instructions use the call stack.

Executable format

AVM can, of course, read programs from a file. These files have a special structure.

On Unix, the top of the file can contain a shebang so that the file can be executed directly like a normal program.

The first part of the actual structure is the header. It begins with a "magic" sequence of 3 bytes that form the string AVM, which identifies the file as an AVM executable.

The next 3 bytes are the major, minor and patch version, in order. These specify for which AVM version this executable is. AVM will usually warn if the major version is different, or if the minor version of the executable is greater than the minor version of AVM. Patch is usually ignored.
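The version rule above can be sketched as a small Python predicate. The function name should_warn and the tuple layout are assumptions made for illustration; the source only says AVM "usually" warns in these cases, so treat this as one possible reading.

```python
def should_warn(exe_version, vm_version):
    """Return True if the executable's version deserves a warning.

    Both arguments are (major, minor, patch) tuples. The patch version
    is ignored, as described in the text.
    """
    exe_major, exe_minor, _exe_patch = exe_version
    vm_major, vm_minor, _vm_patch = vm_version
    return exe_major != vm_major or exe_minor > vm_minor
```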

After this follow 3 words (64-bit integers) stored in big-endian format. The first one is the size of the program (the number of instructions), the second one is the size of the initial memory segment (in bytes) and the third one is the entry point of the program, that is, the instruction at which the program starts.

The header ends here, and the memory segment begins. The size of the memory segment was specified in the header. This segment is for initializing the memory with some default values, for example strings.

After the memory segment comes the program itself, which is the list of instructions. The instruction format is described in the Instructions section below.

All data after the end of the instruction list is ignored.
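To make the layout concrete, the following Python sketch parses a file laid out as described above. It is an illustration, not AVM's loader: parse_executable and the returned dictionary keys are hypothetical, the shebang is assumed to occupy only the first line, and the interpreter path in the example image is made up.

```python
import struct

MAGIC = b"AVM"
HEADER_SIZE = 3 + 3 + 3 * 8   # magic, version bytes, three 64-bit words
INSTRUCTION_SIZE = 9          # 1 opcode byte + 8 parameter bytes


def parse_executable(data):
    # Assumption: an optional shebang occupies the first line only.
    if data.startswith(b"#!"):
        data = data[data.index(b"\n") + 1:]
    if data[:3] != MAGIC:
        raise ValueError("missing AVM magic bytes")
    major, minor, patch = data[3:6]
    # Three big-endian unsigned 64-bit words follow the version bytes.
    program_size, memory_size, entry_point = struct.unpack_from(">QQQ", data, 6)
    memory_end = HEADER_SIZE + memory_size
    program_end = memory_end + program_size * INSTRUCTION_SIZE
    if len(data) < program_end:
        raise ValueError("file is shorter than its header claims")
    return {
        "version": (major, minor, patch),
        "entry_point": entry_point,
        "memory": data[HEADER_SIZE:memory_end],
        "program": data[memory_end:program_end],  # trailing data is ignored
    }


# A tiny hand-built file: PSH 0 followed by HLT, with 3 bytes of initial memory.
memory = b"\x00Hi"
program = struct.pack(">BQ", 0x10, 0) + struct.pack(">BQ", 0xFF, 0)
image = (b"#!/usr/bin/env avm\n" + MAGIC + bytes([1, 0, 0])
         + struct.pack(">QQQ", 2, len(memory), 0) + memory + program + b"ignored")
parsed = parse_executable(image)
```

The program size counts instructions, not bytes, so the byte length of the instruction list is the count multiplied by 9.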

Instructions

Instructions are 9 bytes in size. The first byte is the opcode of the instruction, and the following 8 bytes are its parameter, again stored in big-endian format.

Each instruction also has a name assigned, which has to be 3 characters in length. The available instructions are sorted by type and described below.
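The 9-byte encoding described above can be sketched in Python with the standard struct module. The helper names encode and decode are hypothetical; only the layout (1 opcode byte followed by an 8-byte big-endian parameter) comes from this document.

```python
import struct

INSTRUCTION_FORMAT = ">BQ"  # big-endian: 1 unsigned byte + 1 unsigned 64-bit word


def encode(opcode, param=0):
    """Pack an opcode and its parameter into 9 bytes."""
    return struct.pack(INSTRUCTION_FORMAT, opcode, param)


def decode(raw):
    """Unpack 9 bytes into an (opcode, parameter) pair."""
    if len(raw) != 9:
        raise ValueError("an instruction is exactly 9 bytes")
    return struct.unpack(INSTRUCTION_FORMAT, raw)


psh_258 = encode(0x10, 258)  # PSH with the constant 258 as its parameter
```

Instructions that do not use their parameter, such as NOP, still occupy 9 bytes.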

Instruction documentation format

Parameters are the values required on the stack for the instruction to work. The right-most parameter is the one at the top of the stack.

Return values are the values pushed onto the stack by the instruction. The right-most return value is the one that was pushed last.

The parameters and return values are written on a single line under the instruction title, with parameters on the left and return values on the right, separated by an arrow (->). Each parameter and return value has the format NAME: TYPE and is separated from the others by a comma. The types are listed in the table below.

| Type | Description |
|------|-------------|
| u64 | An unsigned 64-bit integer |
| i64 | A signed 64-bit integer |
| f64 | A 64-bit floating point number |
| bool | A true/false (1/0) value |
| any | Any 64-bit data |

If a bool value is anything other than 1 or 0, it is interpreted as true.

If the instruction takes no parameters or returns nothing, - is written on the corresponding side.

Stack operations

PSH (0x10)

- -> pushed: any

Push some constant data on the stack. The data to push is the parameter of the instruction.

POP (0x11)

data: any -> -

Pops the top element off the stack and discards it.

DUP (0x50)

- -> duped: any

Duplicate an element on the stack. The parameter is an offset from the top of the stack. If the offset is 0, this instruction duplicates the top element. If it is 1, it duplicates the element below the top element, and so on.

SWP (0x51)

top_value: any -> swped: any

Swap an element with the top of the stack. The parameter is an offset that works like the DUP instruction offset, except that it is shifted by one: offset 0 swaps the top element with the element directly below it, offset 1 swaps the top element with the element two positions below it, and so on.
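The offset rules for DUP and SWP can be illustrated with a Python list whose last item is the top of the stack. The function names dup and swp are only for this sketch.

```python
def dup(stack, offset):
    """Push a copy of the element `offset` positions below the top."""
    stack.append(stack[-1 - offset])


def swp(stack, offset):
    """Swap the top element with the one `offset + 1` positions below it."""
    other = -2 - offset
    stack[-1], stack[other] = stack[other], stack[-1]


s = [1, 2, 3]   # 3 is the top of the stack
dup(s, 1)       # copies the element below the top: [1, 2, 3, 2]

t = [1, 2, 3]
swp(t, 0)       # swaps the top with the element below it: [1, 3, 2]
```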

EMP (0x52)

- -> empty: bool

Checks if the stack is empty. empty is true if the stack is empty, otherwise false.

Memory operations

R08 (0x60)

addr: u64 -> read: any

Reads a byte from the memory. addr is the address of the byte, read is the data read.

R16 (0x61)

addr: u64 -> read: any

Similar to the R08 instruction, except this reads 2 bytes.

R32 (0x62)

addr: u64 -> read: any

Similar to the R08 instruction, except this reads 4 bytes.

R64 (0x63)

addr: u64 -> read: any

Similar to the R08 instruction, except this reads 8 bytes.

W08 (0x64)

addr: u64, data: any -> -

Writes a byte in the memory. data is the data to write and addr is the address of it in memory.

W16 (0x65)

addr: u64, data: any -> -

Similar to the W08 instruction, except this writes 2 bytes.

W32 (0x66)

addr: u64, data: any -> -

Similar to the W08 instruction, except this writes 4 bytes.

W64 (0x67)

addr: u64, data: any -> -

Similar to the W08 instruction, except this writes 8 bytes.
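The read and write instructions above differ only in width, which the following Python sketch models with a single pair of helpers. Two things here are assumptions rather than documented behaviour: multi-byte values are stored big-endian (matching the executable format), and data wider than the target is truncated to its low bytes. The names read and write and the 64-byte memory size are hypothetical.

```python
memory = bytearray(64)  # a small stand-in for the VM memory


def write(addr, data, size):
    """Store the low `size` bytes of `data` at `addr` (big-endian assumed)."""
    if addr + size > len(memory):
        raise IndexError("invalid memory access")
    truncated = data & ((1 << (8 * size)) - 1)
    memory[addr:addr + size] = truncated.to_bytes(size, "big")


def read(addr, size):
    """Load `size` bytes starting at `addr` as an unsigned integer."""
    if addr + size > len(memory):
        raise IndexError("invalid memory access")
    return int.from_bytes(memory[addr:addr + size], "big")


write(8, 0x1122334455667788, 4)  # like W32: only 4 bytes are stored
low_word = read(8, 4)            # like R32
first_byte = read(8, 1)          # like R08
```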

SET (0x53)

addr: u64, val: any, size: u64 -> -

Sets size bytes of memory starting at addr to val.

CPY (0x54)

to: u64, from: any, size: u64 -> -

Copies size bytes of memory from the address from to the address to.

File operations

At most 256 files can be open at once.

OPE (0x70)

name_addr: u64, name_len: u64, mode: any -> fd: u64

Opens a file. mode is the mode to open the file in, name_len is the length of the file name string and name_addr is the address in memory of the file name string. fd is -1 if an error occurred while opening the file. Otherwise, it is the file descriptor.

The modes are listed in the table below.

| Mode (in binary) | Meaning |
|------------------|---------|
| 0b0001 | Reading |
| 0b0010 | Writing |
| 0b0100 | Appending |
| 0b1000 | Binary |

These modes can be combined.
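Combining modes means OR-ing their bits together. The Python sketch below models this with enum.IntFlag; the class name Mode and the helper is_valid are hypothetical. The validity check follows the "Invalid file mode" runtime error described later in this document, which requires at least one of reading, writing or appending.

```python
from enum import IntFlag


class Mode(IntFlag):
    READ = 0b0001
    WRITE = 0b0010
    APPEND = 0b0100
    BINARY = 0b1000


def is_valid(mode):
    """A mode must include at least one of reading, writing or appending."""
    return bool(mode & (Mode.READ | Mode.WRITE | Mode.APPEND))


read_binary = Mode.READ | Mode.BINARY  # the value pushed as `mode` for OPE
```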

CLO (0x71)

fd: u64 -> -

Closes a file using the fd file descriptor.

WRF (0x72)

addr: u64, size: u64, fd: u64 -> ok: bool

Writes a buffer to a file. fd is the file descriptor of the file to write to, size is the number of bytes to write and addr is the address of the bytes in memory. If an error occurred, ok is false; otherwise it is true. Writing is buffered.

RDF (0x73)

addr: u64, size: u64, fd: u64 -> ok: bool

Reads bytes from a file. fd is the file descriptor of the file to read from, size is the number of bytes to read and addr is the address of the byte buffer to read into. If an error occurred or the end of the file was reached, ok is false; otherwise it is true. Reading is buffered.

SZF (0x74)

fd: u64 -> size: u64

Get the file size. fd is the file descriptor of the file, size is the size of the file.

FLU (0x75)

fd: u64 -> -

Flush the file buffer. fd is the file descriptor of the file.

Arithmetic

ADD (0x20)

a: u64, b: u64 -> result: u64

Adds together a and b. result is the result.

SUB (0x21)

a: u64, b: u64 -> result: u64

Similar to the ADD instruction, except this subtracts.

MUL (0x22)

a: u64, b: u64 -> result: u64

Similar to the ADD instruction, except this multiplies.

DIV (0x23)

a: u64, b: u64 -> result: u64

Similar to the ADD instruction, except this divides.

MOD (0x24)

a: u64, b: u64 -> result: u64

Similar to the ADD instruction, except this is the modulus operation.

INC (0x25)

num: u64 -> incremented: u64

Increments num by 1. incremented is the incremented num.

DEC (0x26)

num: u64 -> decremented: u64

Decrements num by 1. decremented is the decremented num.

FAD (0x27)

a: f64, b: f64 -> result: f64

Same as the ADD instruction, but for floating point numbers.

FSB (0x28)

a: f64, b: f64 -> result: f64

Same as the SUB instruction, but for floating point numbers.

FMU (0x29)

a: f64, b: f64 -> result: f64

Same as the MUL instruction, but for floating point numbers.

FDI (0x2a)

a: f64, b: f64 -> result: f64

Same as the DIV instruction, but for floating point numbers.

FIN (0x2b)

num: f64 -> incremented: f64

Same as the INC instruction, but for floating point numbers.

FDE (0x2c)

num: f64 -> decremented: f64

Same as the DEC instruction, but for floating point numbers.

NEG (0x2d)

num: i64 -> result: i64

result is num negated.

Bit operations

BAN (0x80)

a: any, b: any -> result: any

Performs the bitwise AND operation on a and b. result is the result.

BOR (0x81)

a: any, b: any -> result: any

Performs the bitwise OR operation on a and b. result is the result.

BSR (0x82)

data: any, by: any -> result: any

Bit shifts data to the right by by bits. result is the result.

BSL (0x83)

data: any, by: any -> result: any

Bit shifts data to the left by by bits. result is the result.

Logic

JMP (0x30)

- -> -

Jumps to an instruction. The address of the instruction to jump to is the parameter of the instruction.

JNZ (0x31)

condition: bool -> -

Similar to the JMP instruction, but jumps only if condition is not false.

CAL (0x38)

- -> -

Similar to the JMP instruction, but this also pushes the address of where to return (the next instruction) onto the call stack.

RET (0x39)

- -> -

Pops off the top address of the call stack and jumps to it.
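The interplay of CAL, RET and the call stack can be traced with a toy interpreter in Python. This is a simplified sketch, not AVM's execution loop: instructions are (name, parameter) pairs instead of 9-byte encodings, addresses are assumed to be instruction indices, and HLT simply stops without reading an exit code from the stack.

```python
def run(program):
    """Execute a toy program and return the order in which addresses ran."""
    call_stack = []
    trace = []
    ip = 0  # address of the current instruction (assumed to be an index)
    while ip < len(program):
        name, param = program[ip]
        trace.append(ip)
        if name == "JMP":
            ip = param
        elif name == "CAL":
            call_stack.append(ip + 1)  # return address: the next instruction
            ip = param
        elif name == "RET":
            ip = call_stack.pop()      # jump back to the saved address
        elif name == "HLT":
            break
        else:                          # NOP and anything else: fall through
            ip += 1
    return trace


program = [
    ("CAL", 3),  # 0: call the routine at address 3
    ("NOP", 0),  # 1: execution resumes here after RET
    ("HLT", 0),  # 2
    ("NOP", 0),  # 3: start of the routine
    ("RET", 0),  # 4
]
trace = run(program)
```

Here the trace is [0, 3, 4, 1, 2]: the call jumps to the routine, and RET resumes at the instruction after the CAL.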

EQU (0x32)

a: i64, b: i64 -> result: bool

Checks if a is equal to b. If it is, result is true.

NEQ (0x33)

a: i64, b: i64 -> result: bool

Similar to the EQU instruction, except this checks if the integers are not equal.

GRT (0x34)

a: i64, b: i64 -> result: bool

Similar to the EQU instruction, except this checks if a is greater than b.

GEQ (0x35)

a: i64, b: i64 -> result: bool

Similar to the EQU instruction, except this checks if a is greater than or equal to b.

LES (0x36)

a: i64, b: i64 -> result: bool

Similar to the EQU instruction, except this checks if a is less than b.

LEQ (0x37)

a: i64, b: i64 -> result: bool

Similar to the EQU instruction, except this checks if a is less than or equal to b.

UEQ (0x3a)

a: u64, b: u64 -> result: bool

Same as the EQU instruction, except this compares two unsigned integers.

UNE (0x3b)

a: u64, b: u64 -> result: bool

Same as the NEQ instruction, except this compares two unsigned integers.

UGR (0x3c)

a: u64, b: u64 -> result: bool

Same as the GRT instruction, except this compares two unsigned integers.

UGQ (0x3d)

a: u64, b: u64 -> result: bool

Same as the GEQ instruction, except this compares two unsigned integers.

ULE (0x3e)

a: u64, b: u64 -> result: bool

Same as the LES instruction, except this compares two unsigned integers.

ULQ (0x3f)

a: u64, b: u64 -> result: bool

Same as the LEQ instruction, except this compares two unsigned integers.
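The signed and unsigned comparison families exist because the same 64-bit word orders differently under each interpretation. The Python sketch below shows this, assuming i64 values use two's complement (the document does not state the representation). The function names are hypothetical stand-ins for GRT and UGR.

```python
def as_signed(word):
    """Interpret a 64-bit word as a two's complement signed integer."""
    word &= (1 << 64) - 1
    return word - (1 << 64) if word >> 63 else word


def grt(a, b):
    """Signed greater-than, like GRT."""
    return as_signed(a) > as_signed(b)


def ugr(a, b):
    """Unsigned greater-than, like UGR."""
    return a > b


all_ones = 0xFFFFFFFFFFFFFFFF   # -1 as i64, the largest value as u64
signed_result = grt(all_ones, 1)
unsigned_result = ugr(all_ones, 1)
```

With these inputs the signed comparison is false (-1 is not greater than 1) while the unsigned one is true.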

FEQ (0x40)

a: f64, b: f64 -> result: bool

Same as the EQU instruction, except this compares two floating point numbers.

FNE (0x41)

a: f64, b: f64 -> result: bool

Same as the NEQ instruction, except this compares two floating point numbers.

FGR (0x42)

a: f64, b: f64 -> result: bool

Same as the GRT instruction, except this compares two floating point numbers.

FGQ (0x43)

a: f64, b: f64 -> result: bool

Same as the GEQ instruction, except this compares two floating point numbers.

FLE (0x44)

a: f64, b: f64 -> result: bool

Same as the LES instruction, except this compares two floating point numbers.

FLQ (0x45)

a: f64, b: f64 -> result: bool

Same as the LEQ instruction, except this compares two floating point numbers.

NOT (0x2e)

data: bool -> result: bool

If data is false, result is true; otherwise it is false.

AND (0x46)

a: bool, b: bool -> result: bool

Performs the boolean and operation on a and b. result is the result.

ORR (0x47)

a: bool, b: bool -> result: bool

Performs the boolean or operation on a and b. result is the result.

Debug

These instructions exist only for debugging and might be removed in the future, so they should not be relied upon.

DMP (0xF0)

- -> -

Dumps the registers, stack, call stack and the current instruction address.

PRT (0xF1)

data: i64 -> -

Prints out data into stdout.

FPR (0xF2)

data: f64 -> -

Same as the PRT instruction, but for floating point numbers.

Shared libraries

LOL (0x90)

name: u64, name_len: u64 -> ld: u64

Loads a shared library. name is the library path, name_len is the library path length, ld is the library ID.

CLL (0x90)

ld: u64 -> -

Closes a shared library. ld is the library ID.

LLF (0x90)

name: u64, name_len: u64, ld: u64 -> fnd: u64

Loads a shared library function. name is the function name, name_len is the function name length, ld is the library ID and fnd is the function ID.

ULF (0x90)

fnd: u64, ld: u64 -> -

Unloads a shared library function. fnd is the function ID, ld is the library ID.

CLF (0x90)

fnd: u64, ld: u64 -> -

Calls a shared library function. fnd is the function ID, ld is the library ID.

Other

HLT (0xFF)

ex: u64 -> -

Halts the virtual machine and returns with exitcode ex.

NOP (0x00)

- -> -

Does nothing.

Runtime errors

Stack overflow (0x01)

This error happens when the stack pointer is greater than or equal to the maximum stack size.

Stack underflow (0x02)

This error happens when the stack is empty and an instruction attempts to pop off the top of the stack.

Call stack overflow (0x03)

This error happens when the call stack pointer is greater than or equal to the maximum call stack size.

Call stack underflow (0x04)

This error happens when the call stack is empty and an instruction attempts to pop off the top of the call stack.

Invalid instruction (0x05)

This error happens when an unknown instruction (undefined opcode) is found.

Invalid instruction access (0x06)

This error happens when an instruction attempts to jump outside of the program.

Invalid memory access (0x07)

This error happens when an instruction attempts to read from or write to memory beyond the last byte of the memory.

Division by zero (0x08)

This error happens when an integer is divided by zero (dividing a floating point number by zero does not cause this error).
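The difference between integer and floating point division by zero can be sketched in Python. The integer case follows this document. The floating point results are an assumption: the document only says no error is raised, and this sketch supposes IEEE 754 behaviour (infinity or NaN). The function names are hypothetical.

```python
import math


def int_div(a, b):
    """Integer division as DIV might do it: zero divisor is runtime error 0x08."""
    if b == 0:
        raise ZeroDivisionError("runtime error 0x08: division by zero")
    return a // b


def float_div(a, b):
    """Float division that never raises (IEEE 754 results assumed)."""
    if b == 0.0:
        if a == 0.0 or math.isnan(a):
            return math.nan
        # The sign of the infinity depends on the signs of both operands.
        return math.copysign(math.inf, a) * math.copysign(1.0, b)
    return a / b


positive = float_div(1.0, 0.0)
negative = float_div(-1.0, 0.0)
undefined = float_div(0.0, 0.0)
```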

Reached max limit of files open (0x09)

This error happens when more files are open than is allowed.

Invalid file mode (0x0a)

This error happens when the file mode includes none of reading, writing or appending.

Invalid file descriptor (0x0b)

This error happens when no open file has the given file descriptor.