The applications-level programmer who doesn't write device drivers, interrupt service routines, or assembly-language routines is usually insulated from the processor's inner workings by the compiler. However, since a great deal of our work with chip companies and RTOS/tool vendors deals directly with the processor, we often tackle the very issues that the compiler handles for you. This page collects some information we thought would be helpful to anyone who is new to MIPS or just curious about some of the nuances of the architecture. (Note: there are many implementations of MIPS processors, each with its own unique features and capabilities. The best source of information for the processor you're interested in is the documentation from the semiconductor vendor or MIPS, Inc. This page is intended only to help introduce you to MIPS programming and architecture.)
Instruction Set
Co-processor model
The MIPS architecture defines support for up to four "co-processors". Every MIPS processor implements co-processor 0 (CP0, see below). Other co-processors can be added to support an FPU, DSP-like processing, etc. A co-processor's registers are split into two categories: general registers and control registers. Co-processor general registers can be loaded from one of the processor's registers or directly from memory. A co-processor's control registers, however, must be loaded from a processor register. The same rules apply in the other direction, i.e. on store operations.

MIPS co-processor 0 (CP0) is known as the "system control coprocessor" and handles the virtual memory subsystem and exception processing for the CPU. The CP0 register set and programming model vary across the CPU families, depending on the specific implementation. Because of CP0's fundamental role, all of its registers are treated like control registers, i.e. they cannot be loaded from or stored to system memory directly.
Virtual Address Spaces
MIPS processors support multiple virtual address spaces, each divided into segments. The processor operating mode (user, kernel, and, starting with the R4000, supervisor) determines the accessibility and mapping of the segments in the virtual address space. The uppermost bits of the virtual address determine which segment is accessed.

The processor's MMU translates virtual addresses in mapped segments through its translation lookaside buffer (TLB), which is essentially a fully-associative cache of recently translated virtual page numbers. The virtual page number is the upper portion of the virtual address. Each TLB entry holds the virtual page number, an address space identifier (see below), and the page frame number, as well as some control bits. The number of TLB entries varies with the processor family. The operating system typically maintains and services the TLB entries.

To minimize TLB reloads, the virtual address is "extended" with an address space identifier (ASID). The operating system loads a CP0 register with the ASID of the current process. The current ASID in CP0 and the virtual page number generated by the processor are compared against the ASID and virtual page number in every TLB entry. If an entry matches both, its page frame number is used as the upper part of the physical address. The ASID minimizes TLB reloads because several TLB entries can hold the same virtual page number with different ASIDs, so entries belonging to different processes can coexist in the TLB. In a multitasking system, most processes use the same range of virtual addresses, but each can have its own ASID.
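The ASID-qualified lookup described above can be sketched in a few lines of Python. This is a simplified model, not any particular processor's TLB: the 4 KB page size, the `TlbEntry` type, and the `tlb_translate` function are assumptions made for illustration, and the control bits held in a real TLB entry are left out.

```python
from dataclasses import dataclass

PAGE_SHIFT = 12  # assumption: 4 KB pages, so the low 12 bits are the page offset


@dataclass
class TlbEntry:
    vpn: int   # virtual page number
    asid: int  # address space identifier
    pfn: int   # page frame number


def tlb_translate(tlb, current_asid, vaddr):
    """Return the physical address for vaddr, or None on a TLB miss."""
    vpn = vaddr >> PAGE_SHIFT
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    for entry in tlb:  # hardware compares all entries at once; we just loop
        if entry.vpn == vpn and entry.asid == current_asid:
            return (entry.pfn << PAGE_SHIFT) | offset
    return None  # a real CPU would take a TLB exception here


# Two processes map the same virtual page to different page frames.
tlb = [
    TlbEntry(vpn=0x00400, asid=1, pfn=0x01234),
    TlbEntry(vpn=0x00400, asid=2, pfn=0x0ABCD),
]
```

Here the same virtual address 0x00400010 resolves to a different physical address depending on the current ASID, and neither entry has to be evicted when the operating system switches between the two processes.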
Addressing Mode(s)
MIPS processors support a single addressing mode: indexed addressing. A signed 16-bit immediate offset is encoded in the instruction word along with a base register. The offset is sign-extended and then added to the base register's contents to form the effective address for the instruction. "Synthetic" instructions (see below) allow the programmer to "use" other addressing modes by expanding a single assembly instruction into multiple machine instructions, one of which will finally use the indexed addressing mode.
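The effective-address calculation is small enough to model directly. The following Python sketch assumes 32-bit addresses that wrap around on overflow; `sign_extend16` and `effective_address` are illustrative names, not part of any MIPS toolchain.

```python
def sign_extend16(imm16):
    """Interpret the low 16 bits of imm16 as a signed two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def effective_address(base, imm16):
    """Base register contents plus sign-extended offset, truncated to 32 bits."""
    return (base + sign_extend16(imm16)) & 0xFFFFFFFF


# An offset field of 0xFFFC encodes -4, so the access lands 4 bytes below the base.
print(hex(effective_address(0x1000, 0xFFFC)))  # prints 0xffc
```

Because the offset is signed, a single load or store can reach roughly 32 KB on either side of the base register. Anything farther away needs extra instructions to build the address, which is where the synthetic instructions come in.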
Synthetic Instructions
In many assembly languages there is a one-to-one correspondence between assembly mnemonics and machine instructions. This is particularly true of CISC architectures. Just as a compiler usually translates a single statement in a high-level language such as C into multiple assembly statements, MIPS assemblers provide a broad range of synthetic instructions that the assembler translates into one or more machine-level instructions. This allows assembly code to be written with "intuitive" mnemonics at a slightly higher, functional level, and helps to de-couple the code from the underlying machine instruction set (ideal when the instruction set is constantly being extended and enhanced). It also makes the assembly code more readable and, perhaps most importantly, helps ensure the correctness of the code by freeing the programmer from the burden of always using the correct sequence of machine instructions to perform a specific function. For example, to load a 32-bit value into a register, the synthetic instruction
    li t0,0x12345678    # load t0 with 0x12345678
will expand to
    lui t0,0x1234       # load t0 with 0x12340000
    ori t0,t0,0x5678    # OR in the lower 16 bits, giving 0x12345678

Delay Slots
MIPS CPUs implement a delay slot for load and branch instructions. Branches and loads require extra cycles to complete before they exit the pipeline. For this reason, the instruction after the branch/load is executed while awaiting completion of the branch/load instruction. (The instruction after the load/branch instruction is said to reside in the delay slot.) The instruction in a load delay slot must not "use" the result of the preceding load instruction. Starting with the R4000 family, hardware interlocks were implemented, allowing the instruction in a load delay slot to use the result of the preceding load without unpredictable results. However, performance suffers when this is done, because the pipeline stalls until the loaded data is available.

"Proper" use of the delay slot is still the best way to go. While the compiler will hide the reality of delay slots from the programmer, anyone debugging or programming at the assembly level needs to be aware of them. (Don't delete those NOPs inserted by the compiler!) Also be careful of assembler instructions that expand into multiple machine instructions in the delay slot after a branch. Only the first machine instruction of the expansion occupies the delay slot, so if the branch is taken, only that first instruction will be executed.
Conditions
MIPS processors don't implement a dedicated status/condition register reflecting ALU conditions such as overflow, carry, negative result, zero result, etc. Specifically, there are no "cmp" instructions whose results are stored in a special condition register. Comparisons are instead performed by instructions that operate on general-purpose registers. For example, the Set on Less Than (slt) instruction compares the contents of two general-purpose registers and sets a third general-purpose register to 0 or 1, based on the result.

Example: assume t0 = 33, t1 = 44, and t2 = 55. The code
    slt t2,t0,t1
will set t2 to 1, since the condition t0 < t1 is true, and the code
    slt t2,t1,t0
will set t2 to 0. Note that the contents of t2 are overwritten regardless of the result. (C programmers: the instruction slt rd,rs,rt effectively performs the logic rd = ( rs < rt ? 1 : 0 ).)

Several synthetic instructions can be built from these simple, streamlined instructions. For example, an sgt (Set on Greater Than) instruction can be synthesized by using an slt instruction with the two compared registers swapped: the synthetic instruction sgt t2,t0,t1 is equivalent to the machine instruction slt t2,t1,t0. An sge (Set on Greater Than or Equal) instruction takes one more step: slt t2,t0,t1 followed by xori t2,t2,1, since t0 >= t1 is the inverse of t0 < t1.
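How the other comparisons reduce to slt can be checked with a small Python model. The functions below are illustrative stand-ins for the instructions, taking register values as plain signed integers and returning the value written to the destination register. `sgt` is slt with its operands swapped, and `sge` is slt with the result inverted.

```python
def slt(rs, rt):
    """Set on Less Than: 1 if rs < rt, else 0."""
    return 1 if rs < rt else 0


def sgt(rs, rt):
    """Set on Greater Than: slt with the operands swapped."""
    return slt(rt, rs)


def sge(rs, rt):
    """Set on Greater Than or Equal: slt, then invert the result (xori with 1)."""
    return slt(rs, rt) ^ 1


t0, t1 = 33, 44
print(slt(t0, t1), slt(t1, t0))  # prints 1 0
print(sgt(t0, t0), sge(t0, t0))  # prints 0 1
```

The equal-operands case on the last line is where sgt and sge differ, and it is the reason a bare operand swap is not enough to implement sge.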
Semaphore support
The MIPS R4000 series and later provide the Load Linked (ll) and Store Conditional (sc) instructions for implementing mutual exclusion primitives. A paired ll/sc sequence can be used to attempt an atomic update of a memory location. The sc instruction should always be preceded by an ll instruction to the same address. The store can fail if the ll and sc instructions are separated by an exception return (eret/rfe) or if the processor determines that another device has altered the memory contents between the ll and sc instructions. The sc instruction reports success or failure in its source register, so software checks the result and retries the sequence on failure.
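The retry pattern built on ll/sc can be sketched in Python. This is a behavioral model only: the `LinkedMemory` class, its `store_from_other_device` method, and the `interfere` hook are hypothetical names introduced to simulate a competing write, and exceptions are not modeled.

```python
class LinkedMemory:
    """Toy word-addressed memory with a single ll/sc link."""

    def __init__(self):
        self.words = {}
        self.link_addr = None  # address armed by the most recent ll, if any

    def ll(self, addr):
        """Load Linked: read the word and arm the link for addr."""
        self.link_addr = addr
        return self.words.get(addr, 0)

    def sc(self, addr, value):
        """Store Conditional: store and return 1 only if the link is intact."""
        if self.link_addr != addr:
            return 0
        self.words[addr] = value
        self.link_addr = None
        return 1

    def store_from_other_device(self, addr, value):
        """A write by another agent breaks a link to the same address."""
        self.words[addr] = value
        if self.link_addr == addr:
            self.link_addr = None


def atomic_increment(mem, addr, interfere=None):
    """Increment mem[addr] with an ll/sc retry loop; return the attempt count."""
    attempts = 0
    while True:
        attempts += 1
        value = mem.ll(addr)
        if interfere is not None:
            interfere(attempts)  # hypothetical hook: activity between ll and sc
        if mem.sc(addr, value + 1):
            return attempts
```

If another agent writes the location between the ll and the sc, the sc returns 0 and the loop reloads the fresh value before trying again, so the competing update is never overwritten.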
Want more MIPS information?
Integrated Device Technology, Inc. provides application notes, sample code, etc. for MIPS processors. Galileo Technology provides silicon solutions for high-end embedded systems, with a focus on data communications.