Architecture/assembler summary
(This is not intended to be either a comprehensive reference or a tutorial. More information is available from www.mips.com.)
Registers
There are 32 general-purpose registers and 3 special registers on the MIPS r2k itself. There are also up to 32 registers each on up to four coprocessors. For CS161 purposes, there is only one coprocessor, coprocessor 0, which is the "system coprocessor"; it takes care of exceptions and virtual memory issues.
Any of the 32 general-purpose registers can be used in any instruction that takes register operands. The special registers are accessed using special instructions; the coprocessor registers can be accessed by using special coprocessor instructions to move their values to general registers and back.
Register   Name        Saved by  Description

General registers
$0         z0, ZERO    N/A       Always contains 0, no matter what's written to it.
$1         AT          caller    Assembler temporary. See below.
$2         v0          caller    Value 0. Used for computations; function return value is placed here. Also holds the system call number on syscall entry.
$3         v1          caller    Value 1. Used for computations; upper word of 64-bit return value is placed here.
$4         a0          caller    Argument 0. First function argument goes here.
$5         a1          caller    Argument 1. Second function argument goes here.
$6         a2          caller    Argument 2. Third function argument goes here.
$7         a3          caller    Argument 3. Fourth function argument goes here. Also used as a flag value on system call return.
$8         t0          caller    General-purpose temporary register.
$9         t1          caller    General-purpose temporary register.
$10        t2          caller    General-purpose temporary register.
$11        t3          caller    General-purpose temporary register.
$12        t4          caller    General-purpose temporary register.
$13        t5          caller    General-purpose temporary register.
$14        t6          caller    General-purpose temporary register.
$15        t7          caller    General-purpose temporary register.
$16        s0          callee    General-purpose saved register.
$17        s1          callee    General-purpose saved register.
$18        s2          callee    General-purpose saved register.
$19        s3          callee    General-purpose saved register.
$20        s4          callee    General-purpose saved register.
$21        s5          callee    General-purpose saved register.
$22        s6          callee    General-purpose saved register.
$23        s7          callee    General-purpose saved register.
$24        t8          caller    General-purpose temporary register.
$25        t9          caller    General-purpose temporary register.
$26        k0          nobody    Kernel scratch register.
$27        k1          nobody    Kernel scratch register.
$28        gp          global    Global pointer. Constant for any given process.
$29        sp          N/A       Stack pointer.
$30        s8          callee    Saved register #8 - conventionally, but not always, a frame pointer.
$31        ra          caller    Return address of function.

Special registers
HI         -           caller    High-order word of 64-bit multiply result, or remainder of divide result.
LO         -           caller    Low-order word of 64-bit multiply result, or quotient of divide result.
PC         -           N/A       Program counter.

Coprocessor 0
cop0 $0    c0_index    N/A       TLB entry index register.
cop0 $1    c0_random   N/A       TLB randomized access register.
cop0 $2    c0_entrylo  N/A       Low-order word of "current" TLB entry.
cop0 $4    c0_context  N/A       Page-table lookup address.
cop0 $8    c0_vaddr    N/A       Virtual address associated with certain exceptions.
cop0 $10   c0_entryhi  N/A       High-order word of "current" TLB entry.
cop0 $12   c0_status   N/A       Processor status register.
cop0 $13   c0_cause    N/A       Exception cause register.
cop0 $14   c0_epc      N/A       PC at which exception occurred.
Register $31 is the "link register". Most of the instructions for calling subroutines are hardwired to store the return address into this register. (The jalr instruction is, for some reason, an exception.)
The coprocessor 0 registers have various bit fields in them. These are:
c0_index:
Bits   Name    Description
31     P       Set by the tlbp instruction if the probe fails.
14-30  unused
8-13   Index   TLB entry number for tlbwi, tlbr, and tlbp.
0-7    unused

c0_random:
Bits   Name    Description
14-31  unused
8-13   Random  Semi-random TLB entry number used by tlbwr. Updated by processor. Never has a value between 0-7.
0-7    unused

c0_entrylo:
Bits   Name    Description
12-31  PFN     Physical page number (bits 12-31 of address) for VM mapping.
11     N       Non-cacheable; if set, RAM cache is disabled accessing this page.
10     D       Dirty; if set, page may be written to.
9      V       Valid; if set, page may be accessed.
8      G       Global; if set, valid in every address space.
0-7    unused
c0_context:
Bits   Name     Description
21-31  PTEBase  Base address of page table. Untouched by hardware; maintained by software.
2-20   BadVPN   Offset into page table for a kuseg fault (bits 12-30 of c0_vaddr), set by hardware.
0-1    unused
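To make the c0_context layout concrete, here is a minimal C sketch of the value the hardware would present after a kuseg fault. It assumes the 19-bit BadVPN field sits in bits 2-20 with the low two bits zero, so that the register reads as the address of a 4-byte page-table entry; the function and macro names are illustrative and do not come from the OS/161 sources.

```c
#include <stdint.h>

/* Assumed layout: PTEBase in bits 21-31, BadVPN in bits 2-20, bits 0-1 zero. */
#define CTX_PTEBASE_MASK 0xffe00000u
#define CTX_BADVPN_SHIFT 2

/* Hypothetical helper: compose the c0_context value for a faulting kuseg
 * address, given the PTEBase that software stored earlier. */
static uint32_t context_value(uint32_t ptebase, uint32_t vaddr)
{
    uint32_t badvpn = (vaddr >> 12) & 0x7ffffu;   /* bits 12-30 of vaddr */
    return (ptebase & CTX_PTEBASE_MASK) | (badvpn << CTX_BADVPN_SHIFT);
}
```

Under these assumptions a refill handler can use the register contents directly as a pointer into a linear page table, which is what makes the fast path short.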
c0_vaddr:
Bits   Name    Description
0-31   vaddr   Failing virtual address; set by certain exceptions.

c0_entryhi:
Bits   Name    Description
12-31  VPN     Virtual page number (bits 12-31 of address) for VM mapping.
6-11   ASID    ID of address space in which virtual address exists.
0-5    unused
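As an illustration of the c0_entryhi and c0_entrylo layouts, the following C sketch packs the two words of a TLB entry from an address, an ASID, and a set of flag bits. The macro and function names are made up for this example; the real definitions live in the kernel's tlb.h and may differ.

```c
#include <stdint.h>

/* c0_entryhi fields */
#define TLBHI_VPN_MASK    0xfffff000u   /* bits 12-31 */
#define TLBHI_ASID_SHIFT  6             /* bits 6-11 */
#define TLBHI_ASID_MASK   0x00000fc0u

/* c0_entrylo fields */
#define TLBLO_PFN_MASK    0xfffff000u   /* bits 12-31 */
#define TLBLO_NOCACHE     0x00000800u   /* bit 11, N */
#define TLBLO_DIRTY       0x00000400u   /* bit 10, D */
#define TLBLO_VALID       0x00000200u   /* bit 9,  V */
#define TLBLO_GLOBAL      0x00000100u   /* bit 8,  G */
#define TLBLO_FLAG_MASK   0x00000f00u

/* Hypothetical helper: high word from a virtual address and an ASID. */
static uint32_t make_entryhi(uint32_t vaddr, uint32_t asid)
{
    return (vaddr & TLBHI_VPN_MASK) |
           ((asid << TLBHI_ASID_SHIFT) & TLBHI_ASID_MASK);
}

/* Hypothetical helper: low word from a physical address and N/D/V/G flags. */
static uint32_t make_entrylo(uint32_t paddr, uint32_t flags)
{
    return (paddr & TLBLO_PFN_MASK) | (flags & TLBLO_FLAG_MASK);
}
```

For example, a writable, valid mapping would pass TLBLO_DIRTY | TLBLO_VALID as the flags; the page offset bits of both addresses are simply discarded.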
c0_status:
Bits   Name     Description
28-31  CU       If these bits are set, the corresponding coprocessors are usable. If clear, use of said coprocessors will generate a coprocessor unusable exception.
23-27  unused
22     BEV      If set, the "bootstrap" exception handler addresses are used.
21     TS       If set to 1, the processor is dead in the water and needs to be reset.
20     PE       Set to 1 if a cache parity error occurs. Clear by writing 1.
19     CM       Set to 1 if the most recent data cache load missed, but only if IsC is set.
18     PZ       If set to 1, uses space parity for outgoing data.
17     SwC      If set, the cache control lines affect the instruction cache rather than the data cache.
16     IsC      If set, the data cache is detached from main memory. (For flushing.)
8-15   IntMask  Interrupt mask. While a bit here is set, the corresponding interrupt is enabled; while it is clear, that interrupt is masked and does not cause an interrupt exception.
6-7    unused
5      KUo      Old kernel/user mode bit (1 = user mode)
4      IEo      Old interrupt enable bit (0 = mask all interrupts)
3      KUp      Previous kernel/user mode bit (1 = user mode)
2      IEp      Previous interrupt enable bit (0 = mask all interrupts)
1      KUc      Current kernel/user mode bit (1 = user mode)
0      IEc      Current interrupt enable bit (0 = mask all interrupts)
c0_cause:
Bits   Name     Description
31     BD       Set if last exception occurred in a branch delay slot.
30     unused
28-29  CE       Coprocessor number resulting from a coprocessor unusable exception.
16-27  unused
10-15  IP       Bits reflecting the state of the external hardware interrupt lines. Bit 10 is irq 0.
8-9    Sw       Software interrupts. Like IP, but controlled by software.
6-7    unused
2-5    ExcCode  An exception code, from the list below.
0-1    unused
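A short C sketch of how an exception handler might pick apart a saved c0_cause value, following the bit positions listed for that register. The accessor names are hypothetical and chosen only for this example.

```c
#include <stdint.h>

/* Each accessor shifts the field down to bit 0 and masks off its width. */
static unsigned cause_bd(uint32_t cause)      { return (cause >> 31) & 0x1u; }  /* bit 31 */
static unsigned cause_ce(uint32_t cause)      { return (cause >> 28) & 0x3u; }  /* bits 28-29 */
static unsigned cause_ip(uint32_t cause)      { return (cause >> 10) & 0x3fu; } /* bits 10-15; bit 0 of the result is irq 0 */
static unsigned cause_sw(uint32_t cause)      { return (cause >> 8)  & 0x3u; }  /* bits 8-9 */
static unsigned cause_exccode(uint32_t cause) { return (cause >> 2)  & 0xfu; }  /* bits 2-5 */
```

A handler would typically switch on cause_exccode() first and consult the other fields only for the codes where they are meaningful (CE for coprocessor unusable, IP and Sw for interrupts).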
c0_epc:
Bits   Name    Description
0-31   epc     Program counter for restarting after exception.
Instructions
This table uses the following symbols:
RD, RS, RT     Up to three general registers ($0-$31)
HI, LO         The special "hi" and "lo" registers
HI:LO          "hi" and "lo" as a single 64-bit value
C0_REG         A coprocessor 0 register
signed-IMM     Immediate value IMM, sign-extended to 32 bits
unsigned-IMM   Immediate value IMM, zero-extended to 32 bits
offset         Branch or memory-access offset (always signed)
signed-        Value is interpreted as signed
unsigned-      Value is interpreted as unsigned
address        Immediate address for jump
In the opcode names, "u" means "unsigned"; "i" means "immediate"; the "al" in some jump instructions means "and link", meaning "function call".
These are the instructions (a few are not listed, including all the floating-point operations, but this should include anything we'll see in CS161):
Instruction                  Operation (notes in parentheses)
add RD, RS, RT               RD = RS + RT; exception on overflow
addi RT, RS, IMM             RT = RS + signed-IMM; exception on overflow
addiu RT, RS, IMM            RT = RS + signed-IMM
addu RD, RS, RT              RD = RS + RT
and RD, RS, RT               RD = RS & RT
andi RT, RS, IMM             RT = RS & unsigned-IMM
beq RS, RT, branch-offset    if (RS == RT) NEXTPC += (branch-offset << 2)
bgez RS, branch-offset       if (signed-RS >= 0) NEXTPC += (branch-offset << 2)
bgezal RS, branch-offset     $31 = NEXTPC; if (signed-RS >= 0) NEXTPC += (branch-offset << 2)
bgtz RS, branch-offset       if (signed-RS > 0) NEXTPC += (branch-offset << 2)
blez RS, branch-offset       if (signed-RS <= 0) NEXTPC += (branch-offset << 2)
bltz RS, branch-offset       if (signed-RS < 0) NEXTPC += (branch-offset << 2)
bltzal RS, branch-offset     $31 = NEXTPC; if (signed-RS < 0) NEXTPC += (branch-offset << 2)
bne RS, RT, branch-offset    if (RS != RT) NEXTPC += (branch-offset << 2)
break                        breakpoint (immediate breakpoint exception) with no delay slot
div RS, RT                   LO = signed-RS / signed-RT; HI = signed-RS % signed-RT
divu RS, RT                  LO = unsigned-RS / unsigned-RT; HI = unsigned-RS % unsigned-RT
j address                    NEXTPC = (NEXTPC & 0xf0000000) | (address << 2)
jal address                  $31 = NEXTPC; NEXTPC = (NEXTPC & 0xf0000000) | (address << 2)
jalr RD, RS                  RD = NEXTPC; NEXTPC = RS. RD is normally $31.
jr RS                        NEXTPC = RS
lb RT, offset(RS)            RT = signed-8-memory[RS + offset]
lbu RT, offset(RS)           RT = unsigned-8-memory[RS + offset]
lh RT, offset(RS)            RT = signed-16-memory[RS + offset]
lhu RT, offset(RS)           RT = unsigned-16-memory[RS + offset]
lui RT, IMM                  RT = unsigned-IMM << 16
lw RT, offset(RS)            RT = 32-memory[RS + offset]
lwl RT, offset(RS)           RT = unaligned-32-memory[RS + offset] (note 1)
lwr RT, offset(RS)           RT = unaligned-32-memory[RS + offset] (note 1)
mfc0 RT, C0_REG              RT = C0_REG
mfhi RD                      RD = HI
mflo RD                      RD = LO
mtc0 RT, C0_REG              C0_REG = RT
mthi RS                      HI = RS
mtlo RS                      LO = RS
mult RS, RT                  HI:LO = signed-RS * signed-RT
multu RS, RT                 HI:LO = unsigned-RS * unsigned-RT
nor RD, RS, RT               RD = ~(RS | RT)
or RD, RS, RT                RD = RS | RT
ori RT, RS, IMM              RT = RS | unsigned-IMM
rfe                          return from exception (note 2)
sb RT, offset(RS)            8-memory[RS + offset] = RT
sh RT, offset(RS)            16-memory[RS + offset] = RT
sll RD, RT, IMM              RD = RT << unsigned-IMM
sllv RD, RT, RS              RD = RT << RS
slt RD, RS, RT               RD = signed-RS < signed-RT
slti RT, RS, IMM             RT = signed-RS < signed-IMM
sltiu RT, RS, IMM            RT = unsigned-RS < unsigned-signed-IMM (note 4)
sltu RD, RS, RT              RD = unsigned-RS < unsigned-RT
sra RD, RT, IMM              RD = signed-RT >> unsigned-IMM
srav RD, RT, RS              RD = signed-RT >> RS
srl RD, RT, IMM              RD = unsigned-RT >> unsigned-IMM
srlv RD, RT, RS              RD = unsigned-RT >> RS
sub RD, RS, RT               RD = RS - RT; exception on overflow
subu RD, RS, RT              RD = RS - RT
sw RT, offset(RS)            32-memory[RS + offset] = RT
swl RT, offset(RS)           unaligned-32-memory[RS + offset] = RT (note 1)
swr RT, offset(RS)           unaligned-32-memory[RS + offset] = RT (note 1)
syscall                      make system call; immediate syscall exception with no delay slot
tlbp                         probe tlb: search TLB for entry matching c0_entryhi; set probe-failed bit and index field in c0_index. (note 3)
tlbr                         read tlb entry: load the TLB entry named by the index field of c0_index into c0_entryhi and c0_entrylo. (note 3)
tlbwi                        write tlb entry indexed: store c0_entryhi and c0_entrylo into the TLB entry named by the index field of c0_index. (note 3)
tlbwr                        write tlb entry "random": store c0_entryhi and c0_entrylo into the TLB entry named by the index field of c0_random. (note 3)
xor RD, RS, RT               RD = RS ^ RT
xori RT, RS, IMM             RT = RS ^ unsigned-IMM

Notes:
1. lwl/lwr and swl/swr are for accessing unaligned words in memory. The actual specification is complicated, but what it boils down to is that the pair
       lwl RT, offset(RS)
       lwr RT, (offset+3)(RS)
   loads the 32-bit value starting at RS+offset, no matter what the alignment of that address is. swl/swr behave analogously.
2. RFE shifts the lower six bits of the status register by two to the right, so the "previous" interrupt/usermode state becomes the current state and the "old" state is copied into the "previous" state. This inverts what happens on an exception. RFE is normally found in the delay slot of a jump instruction of some kind.
3. For an explanation of these, see the comments in src/kern/arch/mips/include/tlb.h.
4. Yes, according to my reference it actually takes the 16-bit immediate, sign-extends it, and then reinterprets it as an unsigned value. Don't ask me.
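The immediate and branch notation used in the instruction table can be spelled out in C. The sketch below is only a model of the arithmetic, with hypothetical function names; nextpc stands for whatever the table calls NEXTPC at the point the instruction executes.

```c
#include <stdint.h>

/* signed-IMM: 16-bit immediate sign-extended to 32 bits (portable form). */
static uint32_t sign_extend16(uint16_t imm)
{
    return ((uint32_t)imm ^ 0x8000u) - 0x8000u;
}

/* unsigned-IMM: 16-bit immediate zero-extended to 32 bits. */
static uint32_t zero_extend16(uint16_t imm)
{
    return (uint32_t)imm;
}

/* sltiu: the immediate is sign-extended, then compared as unsigned. */
static uint32_t sltiu(uint32_t rs, uint16_t imm)
{
    return rs < sign_extend16(imm);
}

/* Taken branch: NEXTPC += (branch-offset << 2), offset always signed. */
static uint32_t branch_target(uint32_t nextpc, uint16_t offset)
{
    return nextpc + (sign_extend16(offset) << 2);
}

/* j/jal: top four bits kept from NEXTPC, 26-bit address field shifted by 2. */
static uint32_t jump_target(uint32_t nextpc, uint32_t address)
{
    return (nextpc & 0xf0000000u) | ((address & 0x03ffffffu) << 2);
}
```

One consequence worth noticing: sltiu with an immediate of 0xffff compares against 0xffffffff, so it yields 1 for every register value except 0xffffffff itself.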
Synthetic instructions
Because all instructions are exactly 32 bits wide, it's not possible to perform certain logical operations in a single instruction. The assembler will cover for these by emitting multiple actual instructions as needed.
For instance, the "li" (load immediate) and "la" (load address) instructions, both of which load 32-bit constants, will be expanded by the assembler into a "lui" instruction to load the upper half of the word, and then usually an "ori" or "addiu" to set the lower half of the word.
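To show what the assembler has to compute for such an expansion, here is a small C sketch that splits a 32-bit constant into the two 16-bit immediates. It is a model of the arithmetic only, not the assembler's actual code, and the type and function names are invented for the example. The two variants differ because ori zero-extends its immediate while addiu sign-extends it.

```c
#include <stdint.h>

struct imm_pair {
    uint16_t upper;   /* immediate for lui */
    uint16_t lower;   /* immediate for the following ori or addiu */
};

/* lui + ori: ori zero-extends, so the constant is split straight down the middle. */
static struct imm_pair split_for_ori(uint32_t value)
{
    struct imm_pair p;
    p.upper = (uint16_t)(value >> 16);
    p.lower = (uint16_t)(value & 0xffffu);
    return p;
}

/* lui + addiu: addiu sign-extends, so when bit 15 of the lower half is set
 * the upper half must be one larger to compensate for the borrow. */
static struct imm_pair split_for_addiu(uint32_t value)
{
    struct imm_pair p;
    p.lower = (uint16_t)(value & 0xffffu);
    p.upper = (uint16_t)((value + 0x8000u) >> 16);
    return p;
}
```

For the constant 0x1234abcd the ori form uses 0x1234 and 0xabcd, while the addiu form uses 0x1235 and 0xabcd, since 0x12350000 plus the sign-extended 0xffffabcd wraps back to 0x1234abcd.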
Some of these combinations require an extra register to hold intermediate values. Register $1 is reserved for this purpose. You can prevent the assembler from using $1 by putting ".set noat" in the assembler source.
Delay slots
The MIPS is a pipelined architecture, and certain aspects of the pipeline are exposed to the programmer. In general, "slow" instructions are not finished until the instruction *two* spaces after them is being fetched. The instruction in between is referred to as a "delay slot".
There is no pipeline stall logic; the delay slots must be filled out appropriately in the machine code. If they aren't, the behavior is undefined.
The assembler will attempt to fill delay slots for you; however, it isn't very bright about it and usually inserts nops. Also, in some cases it cannot tell what you mean and can silently mangle code that you thought was using delay slots efficiently. For this reason, when coding OS/161, I turned off this behavior with ".set noreorder".
Delay slots apply chiefly to two classes of instructions:
- Loads and stores involving memory.
      lw $9, 0($8)        # load value into $9
      nop                 # $9 won't be ready here
      addiu $10, $9, 1    # now we can use $9
- Branches and jumps.
      jal myfunc          # call function
      move a0, s0         # executes BEFORE jump happens
      addu s0, s0, v0     # executes AFTER function returns
Exceptions
When an exception occurs, information about the exception is recorded in some of the coprocessor 0 registers and execution continues from a known hardwired address.
The following registers are updated on exception:
- c0_cause: the BD, CE, and ExcCode fields are updated.
- c0_context: the BadVPN field is updated in the same cases c0_vaddr is updated.
- c0_vaddr: updated on some exceptions (see list).
- c0_status: the lower six bits are shifted left by two bits, shifting in zeros for the bottom two bits. This disables interrupts and puts the processor in kernel mode.
- c0_epc: set to suitable PC for restarting the instruction that failed.
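The c0_status update on exception entry, and its reversal by rfe, can be modelled in a few lines of C. This is a sketch of the bit manipulation only; the names are hypothetical, and it assumes that rfe leaves the "old" pair in place while copying it into "previous".

```c
#include <stdint.h>

#define STATUS_KUIE_MASK 0x3fu   /* KUo IEo KUp IEp KUc IEc, bits 5-0 */

/* Exception entry: shift the six bits left by two, shifting in zeros.
 * The new current pair is 00: kernel mode, interrupts disabled. */
static uint32_t status_on_exception(uint32_t status)
{
    uint32_t stack = (status << 2) & STATUS_KUIE_MASK;
    return (status & ~STATUS_KUIE_MASK) | stack;
}

/* rfe: previous becomes current, old is copied into previous. */
static uint32_t status_on_rfe(uint32_t status)
{
    uint32_t stack = status & STATUS_KUIE_MASK;
    uint32_t old = stack & 0x30u;            /* assumed to stay in place */
    return (status & ~STATUS_KUIE_MASK) | (stack >> 2) | old;
}
```

Because the six bits hold three pairs, the processor can take one nested exception (for instance a TLB miss inside the general handler) before software has to save c0_status somewhere else.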
The exception handler addresses are:
Address      Description
0x80000000   UTLB miss exception
0x80000080   Other exceptions
0xbfc00000   Processor reset
0xbfc00100   UTLB miss exception, if BEV is set in c0_status
0xbfc00180   Other exceptions, if BEV is set in c0_status
The exceptions are:
Code  Sets c0_vaddr  Description
0     no             Interrupt (hardware or software)
1     yes            TLB protection fault ("modification request")
2     yes            TLB miss or UTLB miss on load or instruction fetch.
3     yes            TLB miss or UTLB miss on store.
4     yes            Address error on load or instruction fetch.
5     yes            Address error on store.
6     no             External bus error on instruction fetch
7     no             External bus error on data load or store
8     no             SYSCALL instruction
9     no             BREAK instruction
10    no             Reserved (illegal) instruction
11    no             Coprocessor unusable
12    no             Arithmetic overflow
An address error results from either use of an inadequately aligned pointer (an N-bit quantity must be aligned on an N-bit address boundary, unless the lwl/lwr/swl/swr instructions are used) or an attempt to access kernel memory from user mode.
A TLB entry is "matching" if its VPN field is the same as the page number portion of the virtual address being looked up, and either the G (global) bit is set or the ASID field matches the ASID field in c0_entryhi.
If no matching TLB entry is found, a TLB miss exception occurs, unless the address is in the user address range (below 0x80000000), in which case a UTLB miss exception occurs. If a matching entry is found, but it is not marked valid (the V bit is clear), a TLB miss exception (never a UTLB miss exception) occurs. Finally, if the entry is valid but the dirty (D) bit is not set on a write access, a TLB protection fault occurs.
A UTLB miss exception uses (potentially) different exception handling code from a TLB miss exception, but is otherwise the same. The purpose, in conjunction with the c0_context register, is to enable fast-path TLB refill handling. Note that the UTLB exception applies to user addresses, not user mode - if the miss address is below 0x80000000, a UTLB exception occurs whether or not the miss was generated in kernel or user mode.
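The matching and fault rules above can be collected into one C function. This is a software model for illustration, not how the hardware or sys161 is implemented; the struct, enum, and macro names are invented here, and the caller is assumed to pass only addresses that go through the TLB (kuseg or kseg2).

```c
#include <stddef.h>
#include <stdint.h>

#define VPN_MASK   0xfffff000u   /* bits 12-31 of entryhi, entrylo, and vaddr */
#define ASID_MASK  0x00000fc0u   /* bits 6-11 of entryhi */
#define LO_DIRTY   0x00000400u
#define LO_VALID   0x00000200u
#define LO_GLOBAL  0x00000100u

enum tlb_result { TLB_HIT, TLB_MISS, UTLB_MISS, TLB_MOD };

struct tlb_entry { uint32_t hi; uint32_t lo; };

/* Look up vaddr under the ASID currently held in entryhi. */
static enum tlb_result
tlb_lookup(const struct tlb_entry *tlb, size_t n, uint32_t vaddr,
           uint32_t entryhi, int is_write, uint32_t *paddr)
{
    size_t i;

    for (i = 0; i < n; i++) {
        uint32_t hi = tlb[i].hi, lo = tlb[i].lo;

        if ((hi & VPN_MASK) != (vaddr & VPN_MASK))
            continue;                       /* page number differs */
        if (!(lo & LO_GLOBAL) && (hi & ASID_MASK) != (entryhi & ASID_MASK))
            continue;                       /* other address space */
        if (!(lo & LO_VALID))
            return TLB_MISS;                /* never a UTLB miss */
        if (is_write && !(lo & LO_DIRTY))
            return TLB_MOD;                 /* TLB protection fault */
        *paddr = (lo & VPN_MASK) | (vaddr & 0x00000fffu);
        return TLB_HIT;
    }
    /* No matching entry: which miss depends on the address, not the mode. */
    return (vaddr < 0x80000000u) ? UTLB_MISS : TLB_MISS;
}
```

Note how an invalid matching entry and a missing entry are reported differently for user addresses: only the latter takes the UTLB vector.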
Segments
The MIPS divides its address space into several regions that have hardwired properties. These are:
- kseg2, TLB-mapped cacheable kernel space
- kseg1, direct-mapped uncached kernel space
- kseg0, direct-mapped cached kernel space
- kuseg, TLB-mapped cacheable user space
The top of kuseg is 0x80000000. The top of kseg0 is 0xa0000000, and the top of kseg1 is 0xc0000000.
The memory map thus looks like this:
Address      Segment  Special properties
0xffffffff   kseg2
0xc0000000
0xbfffffff   kseg1
0xbfc00180            Exception address if BEV set.
0xbfc00100            UTLB exception address if BEV set.
0xbfc00000            Execution begins here after processor reset.
0xa0000000
0x9fffffff   kseg0
0x80000080            Exception address if BEV not set.
0x80000000            UTLB exception address if BEV not set.
0x7fffffff   kuseg
0x00000000
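The segment boundaries reduce to three comparisons. The following C sketch classifies a virtual address and reports whether it is translated through the TLB; the enum and function names are illustrative only.

```c
#include <stdint.h>

enum mips_segment { KUSEG, KSEG0, KSEG1, KSEG2 };

/* Classify a virtual address using the segment tops given above. */
static enum mips_segment segment_of(uint32_t vaddr)
{
    if (vaddr < 0x80000000u)
        return KUSEG;
    if (vaddr < 0xa0000000u)
        return KSEG0;
    if (vaddr < 0xc0000000u)
        return KSEG1;
    return KSEG2;
}

/* kuseg and kseg2 are TLB-mapped; kseg0 and kseg1 are direct-mapped. */
static int uses_tlb(uint32_t vaddr)
{
    enum mips_segment seg = segment_of(vaddr);
    return seg == KUSEG || seg == KSEG2;
}
```

For example, the reset address 0xbfc00000 falls in kseg1, so the first instructions after reset run uncached and without consulting the TLB.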