CPUSim64 Cycle Timing Model

Instructions on a modern RISC processor each take multiple cycles to pass through the CPU's execution stages (this is the instruction's latency). However, because these stages are pipelined—allowing several instructions to be processed simultaneously at different stages—the processor can often complete roughly one instruction per cycle in terms of throughput.
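
As a rough illustration of the latency/throughput distinction, the sketch below counts cycles for an idealized pipeline with no stalls. The five-stage depth and the function names are assumptions made for illustration only; this document does not state CPUSim64's pipeline depth.

```python
def ideal_pipeline_cycles(n_instructions: int, depth: int) -> int:
    """Cycles for n instructions to pass through an ideal, stall-free pipeline.

    The first instruction needs `depth` cycles (its latency); each later
    instruction completes one cycle after the one before it.
    """
    if n_instructions <= 0:
        return 0
    return depth + n_instructions - 1


def average_cycles_per_instruction(n_instructions: int, depth: int) -> float:
    """Average cycles per instruction over the whole run."""
    return ideal_pipeline_cycles(n_instructions, depth) / n_instructions


if __name__ == "__main__":
    DEPTH = 5  # hypothetical pipeline depth, not taken from this document
    for n in (1, 10, 100, 1000):
        total = ideal_pipeline_cycles(n, DEPTH)
        cpi = average_cycles_per_instruction(n, DEPTH)
        print(f"{n:5d} instructions: {total:5d} cycles, {cpi:.3f} cycles/instruction")
```

Under these assumptions a single instruction still costs the full pipeline depth, while the average cost per instruction approaches one cycle as the run gets longer.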

Some instructions break this ideal. Operations such as division are typically not fully pipelined and take many cycles, while events like interrupts incur additional overhead from flushing the pipeline and saving processor state. The table below lists the effective number of cycles attributed to each instruction.

Instruction                     Form                    Cycles

Simple ALU
NOP                                                     1
CLEAR                                                   1
MOVE                            reg-reg                 1
COMPL                                                   1
AND, OR, XOR                                            1
TEST, CMP                                               1
LSHIFT, RSHIFT, ARSHIFT                                 1
LROTATE, RROTATE                                        1
PACK, PACK64, UNPACK, UNPACK64                          1
ENDIAN                                                  1
READONLY                                                1

Arithmetic
NEGATE                          integer                 1
NEGATE                          FP                      3
ADD, SUBTRACT                   integer                 1
ADD, SUBTRACT                   FP                      3
MULTIPLY                        integer                 3
MULTIPLY                        FP                      3
DIVIDE                          integer                 12
DIVIDE, RECIP                   FP                      10

Memory
LOAD                                                    2
STORE                                                   2
PUSH                                                    2
POP                                                     2
SAVE                            N registers             1 + N
RESTORE                         N registers             1 + N
CAS                                                     3

Control Flow
JUMP                            unconditional           1
JUMP                            conditional, taken      2
JUMP                            conditional, not taken  1
CALL                            unconditional           3
CALL                            conditional, taken      4
CALL                            conditional, not taken  1
RETURN                                                  3
STOP                                                    1

I/O & System
IN                                                      1
OUT                                                     1
INTERRUPT                                               11
INTERRUPT                       conditional, not taken  1
DEBUG                                                   1
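
One way to apply the table is to sum the per-instruction costs over an executed instruction trace. The sketch below does this for a subset of the rows above. The dictionary layout, the `(instruction, form)` key format, and the function names are hypothetical and are not part of CPUSim64 itself.

```python
# Effective cycle counts for a subset of the timing table,
# keyed by (instruction, form). An empty form means the row has no form.
CYCLES = {
    ("MOVE", "reg-reg"): 1,
    ("ADD", "integer"): 1,
    ("ADD", "FP"): 3,
    ("MULTIPLY", "integer"): 3,
    ("DIVIDE", "integer"): 12,
    ("DIVIDE", "FP"): 10,
    ("LOAD", ""): 2,
    ("STORE", ""): 2,
    ("JUMP", "conditional, taken"): 2,
    ("JUMP", "conditional, not taken"): 1,
    ("CALL", "unconditional"): 3,
    ("RETURN", ""): 3,
}


def save_restore_cycles(n_registers: int) -> int:
    """SAVE and RESTORE cost 1 + N cycles for N registers."""
    return 1 + n_registers


def total_cycles(trace) -> int:
    """Sum the effective cycles for a sequence of (instruction, form) pairs."""
    return sum(CYCLES[entry] for entry in trace)


if __name__ == "__main__":
    # Hypothetical trace: two loads, an add, an integer divide,
    # a store, and a taken conditional jump.
    trace = [
        ("LOAD", ""),
        ("LOAD", ""),
        ("ADD", "integer"),
        ("DIVIDE", "integer"),
        ("STORE", ""),
        ("JUMP", "conditional, taken"),
    ]
    print(total_cycles(trace))  # 2 + 2 + 1 + 12 + 2 + 2 = 21
```

In this trace the single integer DIVIDE accounts for 12 of the 21 cycles, which shows how one unpipelined operation can dominate a short sequence.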

Notes