CPUSim64 Architecture

Overview

CPUSim64 is an emulation of a simple 64-bit microprocessor. All instructions are 64 bits wide, as are all CPU registers, and memory is accessed one 64-bit word at a time. It is a load/store architecture, which means that only the load, store, push and pop instructions interact with memory. All other instructions operate on registers.

CPU Model

The CPUSim64 CPU has 32 64-bit integer registers that can hold signed integers or memory addresses. It also has 32 64-bit floating point registers that can only hold IEEE 754 floating point values. Finally, it has a special status register that indicates attributes of the last value loaded or computed.

All of the floating point registers can be used in user programs. None are reserved. Three of the integer registers are reserved for CPU operation, leaving 29 integer registers for use in user programs. The three reserved integer registers are:

CPU Model Diagram

Program Counter (PC)

The program counter register is used by the CPU to keep track of the next instruction to execute. Each CPU cycle, the instruction at the memory address stored in the PC is loaded, decoded and executed. Then the PC is incremented to point to the next instruction. Control instructions such as JMP, CALL, RETURN and INTERRUPT can set the PC to an arbitrary value.
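The cycle described above can be sketched in a few lines of Python. This is only an illustration: the tuple-based instructions and the `NOP`, `JMP` and `HALT` handling are hypothetical stand-ins, not the real CPUSim64 instruction encoding, and starting at address 1 is an assumption.

```python
# Toy fetch/execute loop. The instruction format is hypothetical and
# does NOT reflect the real CPUSim64 encoding.
def run(memory, pc=1, max_cycles=100):
    """Execute until HALT and return the list of PC values visited."""
    trace = []
    for _ in range(max_cycles):
        op, arg = memory[pc]   # fetch the instruction the PC points at
        trace.append(pc)
        if op == "HALT":
            break
        if op == "JMP":
            pc = arg           # control instruction: overwrite the PC
        else:
            pc += 1            # default: advance to the next instruction
    return trace

program = {
    1: ("NOP", None),
    2: ("JMP", 4),
    3: ("NOP", None),   # skipped by the jump
    4: ("HALT", None),
}
print(run(program))  # [1, 2, 4]
```

The instruction at address 3 never executes because the jump replaces the PC instead of letting it increment.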

Stack Pointer (SP)

The stack is a region of memory used by the CPU for temporary values. The stack pointer register is used to keep track of where the top of the stack is located in memory. The stack pointer will be modified by the PUSH, POP, CALL and RETURN instructions.

Stack Frame (SF)

When making function calls, local variables are often created on the stack. The stack frame register keeps track of where the stack pointer was when the function started, so that all local variables can be easily removed from the stack before the function returns.
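As a rough sketch of that idea, the Python below models function entry and exit with plain integers for SP and SF. The helper names (`enter`, `alloc_locals`, `leave`) and the side list used to remember the caller's SF are assumptions made for illustration; the real CPU may save the previous frame differently.

```python
# Hypothetical model of stack frame handling; names are illustrative only.
def enter(cpu):
    """Function entry: remember the caller's SF, then mark the new frame."""
    cpu["saved_sf"].append(cpu["sf"])
    cpu["sf"] = cpu["sp"]

def alloc_locals(cpu, count):
    """Reserve `count` words for locals (the stack grows downward)."""
    cpu["sp"] -= count

def leave(cpu):
    """Function exit: drop every local at once, then restore the caller's SF."""
    cpu["sp"] = cpu["sf"]
    cpu["sf"] = cpu["saved_sf"].pop()

cpu = {"sp": 0xFFFF, "sf": 0xFFFF, "saved_sf": []}
enter(cpu)
alloc_locals(cpu, 3)
print(hex(cpu["sp"]), hex(cpu["sf"]))  # 0xfffc 0xffff
leave(cpu)
print(hex(cpu["sp"]))                  # 0xffff
```

Resetting SP to SF removes all locals in one step, regardless of how many were allocated.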

Memory Model

There are three regions of memory used by programs run by the CPU. They are Code, Heap and Stack.

Code Region

The assembled machine code for your user program is loaded into the code region of memory for the CPU to execute. The code region starts at address 0x1 and ends at the beginning of the heap. It holds not only machine instructions but also any floating point or string literals defined in your program. It is considered an error to modify code or data in the code region.

Heap

The heap is the region of memory where blocks of memory can be dynamically allocated for use by your program. Typically arrays or other complex data structures are allocated in the heap. The heap begins at the end of the code region and ends at the stack limit, the lowest address reserved for the stack at its maximum size.
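To make the boundaries concrete, here is a small Python sketch that classifies an address by region. The specific boundary values (`heap_start`, `stack_limit`, `top`) are made-up defaults for a 0x10000-word memory; the real values depend on the program size and simulator configuration.

```python
# Illustrative only: the default boundary values are assumptions,
# not values defined by CPUSim64.
def region(addr, heap_start=0x4000, stack_limit=0xF000, top=0xFFFF):
    """Return the name of the memory region that contains `addr`."""
    if 0x1 <= addr < heap_start:
        return "code"       # instructions and literals, read-only
    if heap_start <= addr < stack_limit:
        return "heap"       # dynamically allocated blocks
    if stack_limit <= addr <= top:
        return "stack"      # grows down from `top` toward the heap
    return "invalid"

for a in (0x1, 0x4000, 0xF000, 0xFFFF):
    print(hex(a), region(a))
```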

Stack

The stack is a region of memory used by the CPU and user programs for temporary storage. It is a last-in/first-out (LIFO) data structure: it resembles a stack of plates, where the last plate put on the stack is the first one taken off. The stack starts at the top of available memory, 0xFFFFFFFFFFFF, and grows down toward the heap. It is considered an error to let the stack grow so much that it overwrites part of the heap. This is known as a stack/heap collision.

The SP register always points to the next free location on the stack. The SP should always point to a memory location between the stack base at the top of memory and the stack limit. The SF points to the location of the SP when the latest function call was invoked. It points to the beginning of function local variables and should always be between the stack base and the SP.
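The behavior of SP can be sketched with a small Python model. It assumes SP points at the next free word (as stated above), that the stack grows toward lower addresses, and that a push which would move SP below the stack limit is reported as a collision. The class name, the dictionary used as memory and the default `base` and `limit` values are all hypothetical.

```python
# Hypothetical stack model; not the simulator's actual implementation.
class Stack:
    def __init__(self, base=0xFFFF, limit=0xFF00):
        self.memory = {}      # sparse word-addressed memory
        self.base = base      # top of memory, where the stack starts
        self.limit = limit    # lowest address SP is allowed to reach
        self.sp = base        # SP points at the next free word

    def push(self, value):
        if self.sp <= self.limit:
            # Pushing would move SP below the limit, into the heap.
            raise MemoryError("stack/heap collision")
        self.memory[self.sp] = value
        self.sp -= 1          # the stack grows downward

    def pop(self):
        if self.sp >= self.base:
            raise IndexError("stack is empty")
        self.sp += 1
        return self.memory[self.sp]

s = Stack()
s.push(10)
s.push(20)
print(s.pop(), s.pop())  # 20 10
```

With this check SP always stays between `limit` and `base`, at the cost of never writing to the word at `limit` itself.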

The diagram below illustrates an example memory layout, assuming that 0x10000 (or 65,536) 64-bit words are available to your application. Unlike many CPUs that address memory byte by byte, CPUSim64 can only access memory as 64-bit words. This simplifies memory addressing. On a byte-addressed processor, a 64-bit word typically has to be loaded from a word boundary, which occurs only every eight bytes, i.e. every eighth address. In CPUSim64 consecutive words have consecutive addresses, so there is no need to multiply address offsets by eight to get a valid address.

Memory Model Diagram
Each 64-bit value in memory has a unique integer address. Addresses are usually written in hexadecimal form to distinguish them from ordinary integers.
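The difference between word addressing and byte addressing can be shown with a short Python comparison. The function names and the base address 0x100 are arbitrary choices for illustration.

```python
WORD_BYTES = 8  # a 64-bit word is eight bytes

def word_address(base, index):
    """Address of element `index` on a word-addressed machine like CPUSim64."""
    return base + index

def byte_address(base, index):
    """Address of the same element on a byte-addressed machine."""
    return base + index * WORD_BYTES

for i in range(3):
    print(i, hex(word_address(0x100, i)), hex(byte_address(0x100, i)))
# 0 0x100 0x100
# 1 0x101 0x108
# 2 0x102 0x110
```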
> assembler.sh test
> run.sh test
test.asm
mov R0, R2
tst R1