An overview of the instruction set of
the MIPS32 architecture
as implemented by the mipsy and SPIM emulators.
Adapted from reference documents from
the University of Stuttgart and Drexel University,
from material in the appendix of
Patterson and Hennessy's
Computer Organization and Design,
and from the MIPS32 (r5.04) Instruction Set reference.
Registers
As implemented by mipsy,
MIPS has 32× 32-bit general purpose registers
as well as two special registers
Hi and Lo
for manipulating 64-bit integer quantities.
The 32 general purpose registers
can be referenced $0 through $31,
or by symbolic names, and are used as follows:
Regs        Names       Description
$0          $zero       the value 0; writes are discarded
$1          $at         assembler temporary; reserved for assembler use
$2, $3      $v0, $v1    value from expression evaluation or function return
$4-$7       $a0-$a3     first four arguments to a function/subroutine
$8-$15      $t0-$t7     temporary; callers relying on their values must save them
                        before calling subroutines, as they may be overwritten
$16-$23     $s0-$s7     saved; subroutines must guarantee their values are unchanged
                        (by, for example, restoring them)
$24, $25    $t8, $t9    temporary; callers relying on their values must save them
                        before calling subroutines, as they may be overwritten
$26, $27    $k0, $k1    for kernel use; may change unexpectedly, so avoid using
                        in user programs
$28         $gp         global pointer (address of global area)
$29         $sp         stack pointer (top of stack)
$30         $fp         frame pointer (bottom of current stack frame);
                        if not using a frame pointer, becomes a saved register
$31         $ra         return address of most recent caller
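The conventions in the table can be illustrated with a small program. This is a minimal sketch, not taken from the mipsy distribution: the function names add_to_twice and twice are hypothetical, and it assumes that main may return with jr $ra. Under those assumptions it should print 63.

```mips
# Sketch of the register conventions; add_to_twice and twice are made-up names.
        .text
        .globl  main
main:
        addiu   $sp, $sp, -4        # make room on the stack
        sw      $ra, 0($sp)         # main calls a function, so save $ra

        li      $a0, 21             # first argument goes in $a0
        jal     add_to_twice        # result comes back in $v0

        move    $a0, $v0
        li      $v0, 1              # syscall 1: print_int
        syscall

        lw      $ra, 0($sp)         # restore $ra before returning
        addiu   $sp, $sp, 4
        li      $v0, 0
        jr      $ra

add_to_twice:                       # returns twice(x) + x
        addiu   $sp, $sp, -8
        sw      $ra, 4($sp)
        sw      $s0, 0($sp)         # $s0 is a saved register: preserve it
        move    $s0, $a0            # keep x where it survives the call
        jal     twice               # may overwrite $t*, $a* and $v*
        addu    $v0, $v0, $s0       # twice(x) + x
        lw      $s0, 0($sp)         # restore $s0 for our caller
        lw      $ra, 4($sp)
        addiu   $sp, $sp, 8
        jr      $ra

twice:                              # leaf function: no stack frame needed
        addu    $v0, $a0, $a0
        jr      $ra
```

The value x is kept in $s0 rather than $t0 because a temporary register is not guaranteed to survive the call to twice.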
Memory
mipsy's memory is partitioned as follows:
Segment   Base         Description
text      0x00400000   where user program code resides; its initial size is 256 kiB.
                       This is the only area of memory where instructions are executable.
                       In mipsy, this area of memory is also writeable;
                       on a real system, it would generally be read-only.
data      0x10000000   where user data resides; its initial size is 256 kiB,
                       but its size is not fixed, and can be changed with
                       the sbrk syscall up to a maximum of 1 MiB.
                       This area of memory is not executable.
stack     0x7ffffeff   the function call stack; grows towards lower addresses.
                       Its initial size is 64 kiB, but it will grow as needed
                       up to a maximum of 256 kiB.
                       This area of memory is not executable.
k_text    0x80000000   protected executable code, not accessible in user mode;
                       in a real system, the operating system kernel's code
                       would be mapped here.
                       In mipsy, the entry point is loaded here;
                       its initial size is 64 kiB.
k_data    0x90000000   protected data, not accessible in user mode;
                       in a real system, the operating system's data
                       would be mapped here.
                       In mipsy, the entry point's data is loaded here;
                       its initial size is 64 kiB, but it will grow as needed
                       up to a maximum of 1 MiB.
Syntax
Each instruction is written on a single line,
and has the general format of an opcode
followed by its operands, separated by commas.
The number of operands varies by instruction,
from zero to three.
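As a brief illustration of this format, the sketch below uses instructions with zero, one, two, and three operands. The label main and the # comments are ordinary assembler conventions assumed here, not something this section specifies.

```mips
        .text
main:
        li      $t0, 5              # two operands: a register and an immediate
        li      $t1, 7
        add     $t2, $t0, $t1       # three operands: destination, then two sources
        move    $a0, $t2            # two operands: destination and source
        li      $v0, 1              # service code for print_int
        syscall                     # zero operands
        jr      $ra                 # one operand
```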
In the descriptions below,
the following notation is used
to describe instruction operands.
Operand   Description
Rn        a register; commonly, Rs and Rt are sources, and Rd is a destination.
          Registers may be specified either by a numeric name ($0 to $31)
          or by a symbolic name ($sN, $tN, etc.)
Imm       a literal constant value, or immediate; may be specified as
          an octal, decimal, hexadecimal, or character literal.
          If followed by a number (e.g., Imm16), that number specifies
          the width in bits and implies the range of the value.
Label     a symbolic name which is associated with a memory address
Addr      a memory address, in one of the formats described below
Many instructions have an address operand;
these may be written in a number of formats:
Format            Address
Label             a symbolic name which is associated with a memory address
(Rn)              the value stored in register Rn (indirect address)
Imm(Rn)           the sum of Imm and the value stored in register Rn.
                  Useful for accessing the stack.
Label(Rn)         the sum of Label's address and the value stored in register Rn.
                  Useful for accessing arrays.
Label ± Imm       the sum of Label's address and Imm.
                  Useful for accessing structs.
Label ± Imm(Rn)   the sum of Label's address, Imm, and the value stored in register Rn.
                  Useful for accessing arrays of structs.
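Each of the address formats above can be exercised with a load instruction. The following is a sketch only: the label numbers and its contents are hypothetical, and LW and LA are assumed to behave as in standard MIPS assemblers.

```mips
        .data
numbers: .word  10, 20, 30, 40      # hypothetical array of four words

        .text
main:
        lw      $t0, numbers            # Label:            numbers[0]
        la      $t1, numbers            # $t1 = address of numbers
        lw      $t2, ($t1)              # (Rn):             numbers[0]
        lw      $t3, 4($t1)             # Imm(Rn):          numbers[1]
        li      $t4, 8                  # byte offset of element 2
        lw      $t5, numbers($t4)       # Label(Rn):        numbers[2]
        lw      $t6, numbers + 12       # Label + Imm:      numbers[3]
        lw      $t7, numbers + 4($t4)   # Label + Imm(Rn):  numbers[3]
        jr      $ra
```

Offsets are in bytes, so consecutive words are 4 apart.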
Instructions
The mipsy emulator implements instructions from
the MIPS32 instruction set,
as well as pseudo-instructions
(which look like MIPS instructions,
but which aren't provided on real hardware).
Real MIPS instructions are marked with a ✓.
All other instructions are pseudo-instructions.
Operators in expressions have the same meaning
as their C counterparts.
   BGEU Rs, Rt, Offset16   UNSIGNED COMPARISON: IF Rs >= Rt THEN PC += Offset16 << 2   SLTU $at, Rs, Rt; BEQ $0, $at, Offset16
   BGEU Rs, Imm, Offset16  UNSIGNED COMPARISON: IF Rs >= Imm THEN PC += Offset16 << 2  pseudo-instruction
✓  BGEZ Rs, Offset16       IF Rs >= 0 THEN PC += Offset16 << 2                         000001sssss00001OOOOOOOOOOOOOOOO
   BGT Rs, Rt, Offset16    IF Rs > Rt THEN PC += Offset16 << 2                         SLT $at, Rt, Rs; BNE $0, $at, Offset16
   BGT Rs, Imm, Offset16   IF Rs > Imm THEN PC += Offset16 << 2                        pseudo-instruction
   BGTU Rs, Rt, Offset16   UNSIGNED COMPARISON: IF Rs > Rt THEN PC += Offset16 << 2    SLTU $at, Rt, Rs; BNE $0, $at, Offset16
   BGTU Rs, Imm, Offset16  UNSIGNED COMPARISON: IF Rs > Imm THEN PC += Offset16 << 2   pseudo-instruction
✓  BGTZ Rs, Offset16       IF Rs > 0 THEN PC += Offset16 << 2                          000111sssss00000OOOOOOOOOOOOOOOO
   BLT Rs, Rt, Offset16    IF Rs < Rt THEN PC += Offset16 << 2                         SLT $at, Rs, Rt; BNE $0, $at, Offset16
   BLT Rs, Imm, Offset16   IF Rs < Imm THEN PC += Offset16 << 2                        pseudo-instruction
   BLTU Rs, Rt, Offset16   UNSIGNED COMPARISON: IF Rs < Rt THEN PC += Offset16 << 2    SLTU $at, Rs, Rt; BNE $0, $at, Offset16
   BLTU Rs, Imm, Offset16  UNSIGNED COMPARISON: IF Rs < Imm THEN PC += Offset16 << 2   pseudo-instruction
✓  BLTZ Rs, Offset16       IF Rs < 0 THEN PC += Offset16 << 2                          000001sssss00000OOOOOOOOOOOOOOOO
   BLE Rs, Rt, Offset16    IF Rs <= Rt THEN PC += Offset16 << 2                        SLT $at, Rt, Rs; BEQ $0, $at, Offset16
   BLE Rs, Imm, Offset16   IF Rs <= Imm THEN PC += Offset16 << 2                       pseudo-instruction
   BLEU Rs, Rt, Offset16   UNSIGNED COMPARISON: IF Rs <= Rt THEN PC += Offset16 << 2   SLTU $at, Rt, Rs; BEQ $0, $at, Offset16
   BLEU Rs, Imm, Offset16  UNSIGNED COMPARISON: IF Rs <= Imm THEN PC += Offset16 << 2  pseudo-instruction
✓  BLEZ Rs, Offset16       IF Rs <= 0 THEN PC += Offset16 << 2                         000110sssss00000OOOOOOOOOOOOOOOO
✓  J Address26             PC = (PC & 0xF0000000) | (Address26 << 2)                   000010AAAAAAAAAAAAAAAAAAAAAAAAAA
✓  JAL Address26           $ra = PC + 4; PC = (PC & 0xF0000000) | (Address26 << 2)     000011AAAAAAAAAAAAAAAAAAAAAAAAAA
✓  JR Rs                   PC = Rs                                                     000000sssss0000000000hhhhh001000
✓  JALR Rs                 $ra = PC + 4; PC = Rs                                       000000sssss0000011111hhhhh001001
✓  JALR Rd, Rs             Rd = PC + 4; PC = Rs                                        000000sssss00000dddddhhhhh001001
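A short loop shows how the branch and jump instructions above are typically combined. This is a sketch under two assumptions: that the assembler accepts a label wherever Offset16 or Address26 is expected and computes the encoded value itself, and that main may return with jr $ra. It sums the integers 1 to 10 and prints the result, 55.

```mips
        .text
main:
        li      $t0, 1              # i = 1
        li      $t1, 0              # sum = 0
loop:
        bgt     $t0, 10, done       # pseudo-instruction: leave once i > 10
        add     $t1, $t1, $t0       # sum += i
        addi    $t0, $t0, 1         # i++
        j       loop                # real instruction: unconditional jump
done:
        move    $a0, $t1
        li      $v0, 1              # service code for print_int
        syscall
        jr      $ra
```

The BGT with an immediate operand is a pseudo-instruction, so the assembler is free to use $at while expanding it; user code should not keep values in $at.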
CPU Trap Instructions
✓  SYSCALL           perform a system call                                           000000cccccccccccccccccccc001100
✓  BREAK             trigger a breakpoint                                            000000cccccccccccccccccccc001101
✓  TEQ Rs, Rt        IF Rs == Rt THEN trigger a breakpoint                           000000ssssstttttcccccccccc110100
✓  TEQI Rs, Imm16    IF Rs == Imm16 THEN trigger a breakpoint                        000001sssss01100IIIIIIIIIIIIIIII
✓  TNE Rs, Rt        IF Rs != Rt THEN trigger a breakpoint                           000000ssssstttttcccccccccc110110
✓  TNEI Rs, Imm16    IF Rs != Imm16 THEN trigger a breakpoint                        000001sssss01110IIIIIIIIIIIIIIII
✓  TGE Rs, Rt        IF Rs >= Rt THEN trigger a breakpoint                           000000ssssstttttcccccccccc110000
✓  TGEU Rs, Rt       UNSIGNED COMPARISON: IF Rs >= Rt THEN trigger a breakpoint      000000ssssstttttcccccccccc110001
✓  TGEI Rs, Imm16    IF Rs >= Imm16 THEN trigger a breakpoint                        000001sssss01000IIIIIIIIIIIIIIII
✓  TGEIU Rs, Imm16   UNSIGNED COMPARISON: IF Rs >= Imm16 THEN trigger a breakpoint   000001sssss01001IIIIIIIIIIIIIIII
   TGT Rs, Rt        IF Rs > Rt THEN trigger a breakpoint                            pseudo-instruction
   TGTU Rs, Rt       UNSIGNED COMPARISON: IF Rs > Rt THEN trigger a breakpoint       pseudo-instruction
   TGTI Rs, Imm16    IF Rs > Imm16 THEN trigger a breakpoint                         pseudo-instruction
   TGTIU Rs, Imm16   UNSIGNED COMPARISON: IF Rs > Imm16 THEN trigger a breakpoint    pseudo-instruction
✓  TLT Rs, Rt        IF Rs < Rt THEN trigger a breakpoint                            000000ssssstttttcccccccccc110010
✓  TLTU Rs, Rt       UNSIGNED COMPARISON: IF Rs < Rt THEN trigger a breakpoint       000000ssssstttttcccccccccc110011
✓  TLTI Rs, Imm16    IF Rs < Imm16 THEN trigger a breakpoint                         000001sssss01010IIIIIIIIIIIIIIII
✓  TLTIU Rs, Imm16   UNSIGNED COMPARISON: IF Rs < Imm16 THEN trigger a breakpoint    000001sssss01011IIIIIIIIIIIIIIII
   TLE Rs, Rt        IF Rs <= Rt THEN trigger a breakpoint                           pseudo-instruction
   TLEU Rs, Rt       UNSIGNED COMPARISON: IF Rs <= Rt THEN trigger a breakpoint      pseudo-instruction
   TLEI Rs, Imm16    IF Rs <= Imm16 THEN trigger a breakpoint                        pseudo-instruction
   TLEIU Rs, Imm16   UNSIGNED COMPARISON: IF Rs <= Imm16 THEN trigger a breakpoint   pseudo-instruction
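One common use of a conditional trap is to guard an operation whose operands must satisfy some condition. The sketch below checks a divisor before dividing; DIV and MFLO are standard MIPS32 instructions that are not listed in the table above, and what the emulator does once the trap fires is whatever it does for a breakpoint, which this example does not try to predict.

```mips
        .text
main:
        li      $t0, 42             # dividend
        li      $t1, 0              # divisor (deliberately zero)
        teq     $t1, $zero          # trap here if the divisor is zero
        div     $t0, $t1            # only meaningful when $t1 != 0
        mflo    $t2                 # quotient
        jr      $ra
```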
CPU Control Instructions
   NOP               do nothing                                                      00000000000000000000000000000000 (SLL $0, $0, 0)
System Services
The mipsy emulator provides
a number of mechanisms for
interacting with the host system,
to provide input and output, file operations,
and other miscellaneous services,
which we refer to as system calls or syscalls.
These are invoked via the syscall instruction
after storing the service code in the register $v0.
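The calling sequence can be sketched as follows: load the service code into $v0, place any arguments in the registers the service expects, then execute syscall. The service codes used here (4, 5, 1, 11 and 10) are those listed in the table of services; the label prompt and its text are hypothetical.

```mips
        .data
prompt: .asciiz "Enter a number: "  # hypothetical prompt text

        .text
main:
        li      $v0, 4              # print_string
        la      $a0, prompt
        syscall

        li      $v0, 5              # read_int: the value arrives in $v0
        syscall
        move    $t0, $v0            # copy it before $v0 is reused

        li      $v0, 1              # print_int
        move    $a0, $t0
        syscall

        li      $v0, 11             # print_character
        li      $a0, '\n'
        syscall

        li      $v0, 10             # exit
        syscall
```

Because $v0 holds both the service code and, for some services, the result, a returned value has to be copied elsewhere before the next syscall is set up.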
$v0  Arguments                           Result           Description

Printing Values
1    $a0: int                                             print_int: Print the integer in $a0
                                                          to the console as a signed decimal.
2    $f12: float                                          print_float: Print the float in $f12
                                                          to the console as a %.8f.
3    $f12/$f13: double                                    print_double: Print the double in $f12/$f13
                                                          to the console as a %.18g.
4    $a0: char *                                          print_string: Print the nul-terminated array of bytes
                                                          referenced by $a0 to the console as an ASCII string.
11   $a0: char                                            print_character: Print the character in $a0,
                                                          analogous to putchar(3).

Reading Values
5                                        $v0: int         read_int: Read an integral value from the console,
                                                          with atol(3)'s semantics, into register $v0.
6                                        $f0: float       read_float: Read a floating-point value from the console,
                                                          with atof(3)'s semantics, into register $f0.
7                                        $f0/$f1: double  read_double: Read a double-precision floating-point value
                                                          from the console, with atof(3)'s semantics,
                                                          into registers $f0/$f1.
8    $a0: char *; $a1: int                                read_string: Read a string into the provided buffer
                                                          (referenced by $a0); up to size (given in $a1) bytes
                                                          are read, and the result is nul-terminated.
12                                       $v0: char        read_character: Read the next character from the console
                                                          into register $v0; analogous to getchar(3).

File Manipulation
13   $a0: char *; $a1: int; $a2: mode_t  $v0: fd          open: Open the file specified by name (referenced by $a0)
                                                          in a particular access mode as specified by flags
                                                          (given by $a1), and, if it is to be created,
                                                          with the permissions mode (given by $a2).
                                                          Returns a file descriptor, a small non-negative int.
                                                          Effectively, open(2).
14   $a0: fd; $a1: void *; $a2: int      $v0: int         read: On the file given by the file descriptor fd
                                                          (given in $a0), read len bytes (given by $a2)
                                                          into buffer (given by $a1).
                                                          Returns the number of bytes read,
                                                          or -1 if an error occurred.
                                                          Effectively, read(2).
15   $a0: fd; $a1: void *; $a2: int      $v0: int         write: On the file given by the file descriptor fd
                                                          (given in $a0), write len bytes (given by $a2)
                                                          from buffer (given by $a1).
                                                          Returns the number of bytes written,
                                                          or -1 if an error occurred.
                                                          Effectively, write(2).
16   $a0: fd                             $v0: int         close: Close the file given by the file descriptor fd
                                                          (given in $a0).
                                                          Returns 0 if successful, or -1 if an error occurred.
                                                          Effectively, close(2).

Process Services
9    $a0: int                                             sbrk: Extend the .data segment by adding $a0 bytes;
                                                          a primitive useful for, e.g., implementing malloc(3).
10                                                        exit: The program exits with code 0.
17   $a0: int                                             exit2: The program exits with the code given in $a0.
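As a further sketch of these services, the program below reads a line with read_string, echoes it with print_string, and then exits with a chosen status using exit2. The label buffer, its 64-byte size, and the exit code 3 are arbitrary choices made for illustration.

```mips
        .data
buffer: .space  64                  # hypothetical 64-byte input buffer

        .text
main:
        li      $v0, 8              # read_string
        la      $a0, buffer         # where to store the text
        li      $a1, 64             # size of the buffer in bytes
        syscall

        li      $v0, 4              # print_string
        la      $a0, buffer         # the buffer is nul-terminated by read_string
        syscall

        li      $v0, 17             # exit2
        li      $a0, 3              # exit status chosen for this example
        syscall
```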
Directives
The mipsy assembler supports a number of directives,
which allow things to be specified at assembly time.
Directive             Description
.text                 the instructions following this directive
                      are placed in the text segment of memory
.data                 the data defined following this directive
                      is placed in the data segment of memory
.ktext                the instructions following this directive
                      are placed in the kernel text segment of memory
.kdata                the data defined following this directive
                      is placed in the kernel data segment of memory
.align N              arrange that the next datum is stored with appropriate alignment
                      (that the lower N bits of its address are set to zero)
                      by inserting enough padding; nearly always automatically done.
                      A half word requires .align 1 (for two bytes),
                      a word requires .align 2 (for four bytes), and
                      a double requires .align 3 (for eight bytes).
.ascii "string"       store an ASCII string without a '\0'-terminator
                      at the next location(s) in the current data segment;
                      nearly always not what you want: use .asciiz instead!
.asciiz "string"      store a '\0'-terminated ASCII string
                      at the next location(s) in the current data segment
.space n              allocate n uninitialised bytes of space
                      at the next location in the current segment
.byte val [, ...]     store values in successive byte(s)
                      at the next location(s) in the current segment
.half val [, ...]     store values in successive half word(s)
                      at the next location(s) in the current segment
.word val [, ...]     store values in successive word(s)
                      at the next location(s) in the current segment
.float val [, ...]    store values in successive float(s)
                      at the next location(s) in the current segment
.double val [, ...]   store values in successive double(s)
                      at the next location(s) in the current segment
.globl label [, ...]  declare the listed label(s) as global
                      to enable referencing from other files
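Several of these directives usually appear together. The sketch below lays out a small data segment and a main that reads from it; every label and value is hypothetical, and the explicit .align 2 is included only to show its use, since the table notes that alignment is nearly always done automatically.

```mips
        .data
greeting:   .asciiz "hello\n"       # nul-terminated string
flags:      .byte   1, 0, 1         # three single bytes
            .align  2               # next datum starts on a 4-byte boundary
counts:     .word   3, 1, 4         # three 32-bit words
scratch:    .space  16              # 16 uninitialised bytes
ratio:      .float  0.5             # one single-precision float

        .text
        .globl  main                # make main visible to other files
main:
        li      $v0, 4              # print_string
        la      $a0, greeting
        syscall

        lw      $a0, counts + 8     # third word of counts
        li      $v0, 1              # print_int
        syscall

        li      $v0, 10             # exit
        syscall
```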