MIPS Microprocessor Family Software Reference Manual
Version October 1996
2975 Stender Way, Santa Clara, California 95054 Telephone: (800) 345-7015 TWX: 910-338-2070 FAX: (408) 492-8674 Printed U.S.A. ©1996
Integrated Device Technology, Inc. reserves the right to make changes to its products or specifications at any time, without notice, in order to improve design or performance and to supply the best possible product. IDT does not assume any responsibility for use of any circuitry described other than the circuitry embodied in an IDT product. The Company makes no representations that circuitry described herein is free from patent infringement or other rights of third parties which may result from its use. No license is granted by implication or otherwise under any patent, patent rights or other rights,
LIFE SUPPORT POLICY
Integrated Device Technology's products are not authorized for use as critical components in life support devices or systems unless a specific written agreement pertaining to such intended use is executed between the manufacturer and an officer of IDT. Life support devices or systems are devices or systems which are intended for surgical implant into the body, or which support or sustain life, and whose failure to perform, when properly used in accordance with instructions for use provided in the labeling, can be reasonably expected to result in a significant injury to the user. A critical component is any component of a life support device or system whose failure to perform can be reasonably expected to cause the failure of the life support device or system, or to affect its safety or effectiveness.
The IDT logo is a registered trademark, and BiCameral, BurstRAM, BUSMUX, CacheRAM, DECnet, Double-Density, FASTX, Four-Port, FLEXI-CACHE, Flexi-PAK, Flow-thruEDC, IDT/c, IDTenvY, IDT/sae, IDT/sim, IDT/ux, MacStation, MICROSLICE, PalatteDAC, REAL8, R3041, R3051, R3052, R3071, R3081, R36100, R3721, R4600, R4640, R4650, R4700, R4761, R4762, R5000, RISController, RISCore, RISC Subsystem, RISC Windows, SARAM, SmartLogic, SyncFIFO, SyncBiFIFO, SPC, TargetSystem, and WideBus are trademarks of Integrated Device Technology, Inc. MIPS is a registered trademark, and RISCompiler, RISComponent, RISComputer, RISCware, RISC/os, R3000, and R3010 are trademarks of MIPS Computer Systems, Inc. Postscript is a registered trademark of Adobe Systems, Inc. AppleTalk, LocalTalk, and Macintosh are registered trademarks of Apple Computer, Inc. Centronics is a registered trademark of Genicom, Inc. Ethernet is a registered trademark of Digital Equipment Corp. registered trademark Corp.
About This Manual
This manual provides an introduction and design overview, as well as more detailed descriptions of the instructions, for the following product families:
• IDT79R30xx family of 32-bit RISC controllers
• IDT79R4xxx ORION family of high-performance 64-bit CPUs
• IDT79R5000 family of MIPS-4 compatible devices
Summary of Contents
Chapter 1, "Introduction," presents an overview of IDT's microprocessor families, including a discussion of the pipeline and a comparison of MIPS with CISC architectures.
Chapter 2, "MIPS Architecture," discusses the high-level architecture from the programmer's point of view, including comparisons of the basic address space of the R30xx, R4600/4700, and R4650.
Chapter 3, "System Control Co-Processor Architecture," discusses the aspects of the MIPS architecture that must be managed by the operating system, including details about the Control Co-Processor.
Chapter 4, "Exception Management," examines the software techniques used to manage exceptions, and includes several code examples.
Chapter 5, "Cache Management," discusses IDT's implementation of on-chip caches for instructions (I-cache) and data (D-cache).
Chapter 6, "Memory Management," discusses memory management and the Translation Lookaside Buffer (TLB). Also included is a discussion of the R4650's simple base-bounds mechanism, which it uses instead of a TLB.
Chapter 7, "Reset Initialization," reviews reset, compares it with an exception, and includes information on bootstrap sequences and starting an application.
Chapter 8, "Floating Point Co-Processor," describes the operation of floating point, and compares the implementations in the various MIPS microprocessors.
Chapter 9, "Assembler Language Programming," discusses techniques and conventions for reading and writing MIPS assembler code, including a complete table of assembler instructions.
Chapter 10, "C Programming," provides an overview of the principles of designing an efficient run-time environment, including a discussion of optimization.
Chapter 11, "Portability Considerations," discusses the main facets of designing for portability.
Chapter 12, "Writing Power-On Diagnostics," provides a pragmatic, hands-on look at producing usable diagnostics in a MIPS environment.
Chapter 13, "Instruction Timing and Optimization," discusses the scheduling implications of using MIPS instructions, and includes information about additional hazards.
Chapter 14, "Software Tools for Board Bring-Up," describes the software tools typically used when debugging a board.
Chapter 15, "Software Design Examples," contains examples of programs and applications for embedded systems.
Chapter 16, "Assembly Language Programming Tips," contains tips for optimizing your programming in the MIPS environment.
The appendix "CPU Instruction Set" provides an overview of the instruction set. Following the overview, in alphabetical order, are command pages describing the individual instructions.
The appendix "FPU Instruction Set" provides an overview of the floating point instruction set. Following the overview, in alphabetical order, are command pages describing the individual floating point instructions.
Table of Contents

CHAPTER 1: INTRODUCTION
    OVERVIEW
    IDT's MICROPROCESSOR FAMILIES
    PIPELINE
        32-bit and 64-bit CPUs
    MIPS ARCHITECTURE LEVELS
    MIPS COMPARED WITH CISC ARCHITECTURES
        Instruction encoding features
        Addressing and memory accesses
        Operations not directly supported
        Multiply and divide operations
        Programmer-visible pipeline effects
    A NOTE ON MACHINE AND ASSEMBLER LANGUAGE

CHAPTER 2: MIPS ARCHITECTURE
    PROGRAMMER'S VIEW OF THE PROCESSOR ARCHITECTURE
        Registers
        Conventional names and uses of general-purpose registers
        Integer multiply unit and registers
        Instruction types
        Loading and storing: addressing modes
        Data types in memory and registers
    BASIC ADDRESS SPACE OF THE R30xx
    SUMMARY OF R30xx SYSTEM ADDRESSING
        Kernel and user mode
        Memory map for CPUs without MMU hardware
        R36100 Address Translation
    BASIC ADDRESS SPACE OF THE R4600/R4700
    BASIC ADDRESS SPACE OF THE R4650

CHAPTER 3: SYSTEM CONTROL CO-PROCESSOR ARCHITECTURE
    CONTROL SUMMARY
    CONTROL AND "CO-PROCESSOR 0"
        Control instructions
        Standard control registers
        Control Register Formats
        Processor-specific registers
        Registers for System Operation Support

CHAPTER 4: EXCEPTION MANAGEMENT
    EXCEPTIONS
        Precise Exceptions
        Exception Timing
        Exception Vectors
        Exception Handling Basics
        Nesting Exceptions
        Exception Routines
    INTERRUPTS
        Software Interrupts

CHAPTER 5: CACHE MANAGEMENT
    CACHES AND CACHE MANAGEMENT
        R30xx cache characteristics
        Cache locking
        Cache isolation and swapping on the R30xx
        R4600/R4700/R4650/R5000 Cache Characteristics
        Initializing and Sizing the Caches
        Initializing the R30xx cache
        Initializing the R4xxx cache
        Invalidation
        Locking the R4650 caches
        Testing and probing
        Configuration (R3041/71/81 only)
    WRITE BUFFER
        Implementing wbflush()

CHAPTER 6: MEMORY MANAGEMENT
    MEMORY MANAGEMENT AND THE TRANSLATION LOOKASIDE BUFFER (TLB)
    MEMORY MANAGEMENT WITH BASE-BOUNDS REGISTERS
        Description
        Registers
    CONTROL INSTRUCTIONS
    PROGRAMMING
        How refills occur
        Memory translation setup
        Exception sample code
        Simulating dirty bits
    DEBUGGING
    MANAGEMENT UTILITIES

CHAPTER 7: RESET INITIALIZATION
    STARTING
        Probing and recognizing
        Bootstrap sequences
        Starting an application

CHAPTER 8: FLOATING POINT CO-PROCESSOR
    WHAT IS FLOATING POINT?
    THE IEEE STANDARD AND ITS BACKGROUND
        IEEE exponent field and bias
        IEEE mantissa and normalization
        Reserved Exponent Values
        MIPS Data formats
    MIPS IMPLEMENTATION OF IEEE
    FLOATING POINT REGISTERS (R30xx)
    FLOATING POINT REGISTERS (R4xxx/R5000)
    FLOATING POINT EXCEPTIONS/INTERRUPTS
    FLOATING POINT CONTROL/STATUS REGISTER
    FLOATING-POINT IMPLEMENTATION/REVISION REGISTER
    GUIDE TO INSTRUCTIONS
        Load/store
        Move between registers
        3-operand arithmetic operations
        4-operand arithmetic operations
        Unary (sign-changing) operations
        Conversion operations
        Conditional branch and test instructions
        Other floating point instructions
    INSTRUCTION TIMING REQUIREMENTS
    INSTRUCTION TIMING FOR SPEED
    INITIALIZATION AND ENABLE ON DEMAND
    FLOATING POINT EMULATION

CHAPTER 9: ASSEMBLER LANGUAGE PROGRAMMING
    SYNTAX OVERVIEW
        Points to note
    REGISTER-TO-REGISTER INSTRUCTIONS
    IMMEDIATE (CONSTANT) OPERANDS
    MULTIPLY/DIVIDE
    LOAD/STORE INSTRUCTIONS
        Unaligned load and store
    ADDRESSING MODES
        GP-relative addressing
    JUMPS, SUBROUTINE CALLS AND BRANCHES
    CONDITIONAL BRANCHES
        Coprocessor conditional branches
    COMPARE
    COPROCESSOR TRANSFERS
        Coprocessor Hazards
    ASSEMBLER DIRECTIVES
        Sections
        Data definition and alignment
        Symbol binding attributes
        Function directives
        Assembler control (.set)
        Listing controls
    COMPLETE GUIDE TO ASSEMBLER INSTRUCTIONS
    ALPHABETIC LIST OF ASSEMBLER INSTRUCTIONS
    ALPHABETIC LIST OF R4xxx ASSEMBLER INSTRUCTIONS
    ALPHABETIC LIST OF R5000 ASSEMBLER INSTRUCTIONS

CHAPTER 10: C PROGRAMMING
    THE STACK, SUBROUTINE LINKAGE, PARAMETER PASSING
        Stack argument structure
        Which arguments go in which registers?
        Examples from the library
        Passing Structures
        How printf() and varargs work
        Returning a value from a function
        Stack-frame allocation
    SHARED AND NON-SHARED LIBRARIES
        Sharing code in single-address space systems
        Sharing code across address spaces
    AN INTRODUCTION TO OPTIMIZATION
        Common optimizations
        How to prevent unwanted effects from optimization
        Optimizer-unfriendly code and how to avoid it

CHAPTER 11: PORTABILITY CONSIDERATIONS
    WRITING PORTABLE C
    DATA REPRESENTATIONS AND ALIGNMENT
        Notes on structure layout and padding
    ISOLATING SYSTEM DEPENDENCIES
        Locating system dependencies
        Fixing dependencies
        Using assembler
    ENDIANNESS
        What it means to the programmer
        Changing the endianness of a MIPS CPU
        Designing and specifying for configurable endianness
        Portability and endianness-independent code
    COMPATIBILITY WITHIN THE MIPS FAMILY
    PORTING TO MIPS: FREQUENTLY ENCOUNTERED ISSUES
    CONSIDERATIONS FOR PORTABILITY TO FUTURE DEVICES

CHAPTER 12: WRITING POWER-ON DIAGNOSTICS
    GOLDEN RULES FOR DIAGNOSTICS PROGRAMMING
    WHAT SHOULD TESTS DO?
    HOW TO TEST THE DIAGNOSTIC TESTS?
    OVERVIEW OF ALGORITHMICS' POWER-ON SELFTEST
        Starting points
        Control and Environment variables
        Reporting
        Unexpected exceptions during the test sequence
        Driving test output devices
        Restarting the system
        Standard test sequence
        Notes on the test sequence
        Annotated examples from the test code
        Notes on the examples

CHAPTER 13: INSTRUCTION TIMING AND OPTIMIZATION
    ADDITIONAL HAZARDS
        Early modification
        Bitfields in control registers
        Hazards specific to the R4xxx and R5000
        Hazards specific to the R4650
        Non-obvious Hazards

CHAPTER 14: SOFTWARE TOOLS FOR BOARD BRING-UP
    TOOLS USED IN DEBUG
    INITIAL DEBUGGING
    PORTING MICROMONITOR
    RUNNING MICROMONITOR
    INITIAL IDT/SIM ACTIVITY
    A FINAL NOTE ON IDT/KIT

CHAPTER 15: SOFTWARE DESIGN EXAMPLES
    APPLICATION SOFTWARE
        Memory
        Starting
        Library functions
        Running the program
        Debugging the program
    EMBEDDED SYSTEM SOFTWARE
        Memory
        Starting
        Embedded system library functions
        Debugging
    UNIX-LIKE SYSTEM
        Terminology
        Components of a process
        System calls and protection
        What the kernel does
        Virtual memory implementation for MIPS
        Interrupt handling for MIPS

CHAPTER 16: ASSEMBLY LANGUAGE PROGRAMMING TIPS
    32-bit Address or Constant Values
    Use of "Set" Instructions
List of Figures

MIPS 5-stage pipeline
MIPS ISA relationships
The pipeline and branch delays
The pipeline and load delays
Virtual-to-physical address translation (R36100)
Kernel Mode Address Space
Kernel Mode Address Space
PRId Register fields
Status Register Fields (SR) (R3xxx)
Status Register (4600/4700)
Status Register (4650)
Fields in the Cause Register (R3xxx and R4600/R4700)
Cause Register Format (R4650)
Fields in the R3071/81 Config Register
Fields in the R3041 Config (Cache Configuration) Register
Config Register Format (R4600/R4700)
Config Register Format (R4650)
Fields in the R3041 Control (BusCtrl) Register
Context Register Format
XContext Register Format
Register Format
CacheErr Register Format
ErrorEPC Register Format
IWatch Register Format
DWatch Register Format
Direct mapped cache
Cache partitioning example (R36100)
Two-way set-associative cache
EntryHi and EntryLo register fields
EntryHi and EntryLo register fields
R4xxx 64-bit EntryHi register fields
R4xxx 64-bit EntryLo0 and EntryLo1 register fields
Index Register
Fields in the Index register
Random Register
Fields in the Random register
Wired Register Boundary
Wired Register
Fields in the Context Register (R30xx)
Fields in the Context Register (R4600/R4700)
XContext Register Format
IBase Register
IBound Register
DBase Register
DBound Register
CAlg Register
R30xx control/status register fields
Implementation/revision register
Program segments in memory
Stackframe for a non-leaf function
Structure layout and padding in memory
Data representation with #pragma pack(1)
Data representation with #pragma pack(2)
Typical big-endian's picture
Bitfields and big-endian
Bitfields and little-endian
Garbled string storage when mixing modes
Byte-lane swapper
Memory layout of a process
List of Tables

Embedded Microprocessor Family
Conventional Names of Registers with Usage Mnemonics
Multiply and Divide Instruction Cycle Timing
Naming Conventions
Virtual and Physical Address Relationships in Base Versions
Cacheability and Coherency Attributes
Control Register Summary (not MMU)
"Imp" and "Rev" values
Status Register Fields (4600/4700)
Bits in the 4650 Status Register
ExcCode Values: R3xxx/R4600/R4700 Exception Differences
Cause Register fields (R4650)
Config Register Fields (R4600/R4700)
Config Register Fields (R4650)
Context Register Fields
XContext Register Fields
Register Fields
CacheErr Register Fields
IWatch Register Fields
DWatch Register Fields
Reset and Exception Entry Points (Vectors) for MIPS CPUs
Interrupt Bitfields and Interrupt Pins
Control Registers for Memory Management
Page Coherency Values
Index Register Field Descriptions
Random Register Field Descriptions
Mask Field Values for Page Sizes
Wired Register Field Descriptions
XContext Register Fields
IBASE Register Fields
IBound Register Fields
DBase Register Field
DBound Register Fields
CAlg Register Fields
Floating point data formats
Rounding modes encoded in the control/status register
Load/store instructions
Move instructions
3-operand arithmetic
4-operand arithmetic
Sign-changing operators
Data conversion operations
Test instructions
Special characters
Multiply/divide instruction descriptions
Load/store instruction descriptions
%hi_addr/%lo_addr
Coprocessor instruction descriptions
Special symbols
Assembler register and identifier conventions
Assembler instructions
Test Sequence in brief
Instructions with scheduling implications
R5000 Floating Point Unit Execution Rate
R4xxx/R5000 Coprocessor Hazards
32-bit immediate values
Add-with-carry
Subtract-with-borrow operation
CHAPTER 1
INTRODUCTION
OVERVIEW
IDT offers a variety of MIPS ISA-compatible CPUs targeted to embedded applications. The variety of price and performance points enables system developers to design products around various family members quickly, reducing time-to-market and development cost. In the application segments these products typically serve, software development is an increasingly large part of system development.
This manual is intended to augment the various device and interface manuals, and is targeted to the firmware developer using IDT CPUs. The manual covers the MIPS architecture as seen by the programmer, and attempts to address the most common issues facing developers.
This manual draws upon concepts embodied in various IDT software development products: most notably IDT/c, a multi-host, multi-target C compiler for the IDT microprocessor family, and IDT/sim, the target-resident monitor/debugger for IDT-based systems. Many of the IDT/MIPS architecture concepts discussed here are supported in a similar fashion by toolchains from other vendors. The ultimate choice of toolchain is beyond the scope of this manual; it is not the purpose of this manual to guide developers toward one tool over another. For more information, ask your local IDT sales representative about the "AdvantageIDT" program.
IDT's MICROPROCESSOR FAMILIES
IDT currently offers a wide variety of microprocessors. As these devices are all based on the MIPS architecture, software developed for one processor should be easily portable to other family members. However, the MIPS architecture does allow kernel-specific features to be varied by implementation; thus, minor changes to reset code, cache management code, or even exception code may need to occur when changing between certain family members. In addition, the instruction set architecture undergoes "constant improvement," whereby later cores offer architectural features not found in earlier generations. Management of these features also affects portability.
IDT currently offers the following families of the MIPS architecture:
• The R30xx family of 32-bit RISC microcontrollers includes the R3051, R3052, R3071, R3081, and R3041 processors. The different members of the family offer different price/performance trade-offs by varying the presence of an FPA and/or TLB, and by varying the cache sizes. All of these are based around the original MIPS-I R3000A core.
• The R36100 integrated RISC microprocessor/microcontroller. This device features the MIPS-I R3000A core integrated with cache and with system functions such as communications channels, memory controllers, and DMA controllers/channels. In general, descriptions of R30xx operations also apply to this device.
• The R4xxx Orion family of high-performance 64-bit CPUs. These devices are realized around a proprietary implementation of an R4400-compatible core. They implement the MIPS-3 ISA. Some devices feature extensions for DSP applications.
• The R5xxx family of MIPS-4 compatible devices. In this family, the current device features multiple instruction issue, large caches, and high-frequency operation. Descriptions of R4xxx operations (particularly kernel operations) also apply to these devices.
Although most programming occurs using a high-level language (usually "C"), with little awareness of the underlying system or processor architecture, certain operations require the programmer to use assembly programming, and/or to be aware of the underlying system or processor structure. This manual is designed to be consulted when addressing these types of issues.
Part     Core               ISA      I-Cache    D-Cache   FPA               MMU          Comments
Number                      Level    Size       Size
R3041    R3000A             MIPS-1   2KB        512B      No                No           Variable Port Width Interface
R3051    R3000A             MIPS-1   4KB        2KB       No                Optional
R3052    R3000A             MIPS-1   8KB        2KB       No                Optional
R3071    R3000A             MIPS-1   8KB/16KB   8KB/4KB   No                Optional     Half-frequency option
R3081    R3000A             MIPS-1   8KB/16KB   8KB/4KB   Yes               Optional     Half-frequency option
R36100   R3000A             MIPS-1   4KB        1KB       No                No           Integrated system controller and peripherals
R4600    Proprietary Orion  MIPS-3   16KB       16KB      Yes               Yes
R4700    Enhanced Orion     MIPS-3   16KB       16KB      Yes               Yes          Enhanced multiply performance
R4650    Orion              MIPS-3   8KB        8KB       Single precision  Base-bounds  Cost reduced Orion
R4640    Orion              MIPS-3   8KB        8KB       Single precision  Base-bounds  32-bit bus width
R5000    R5000              MIPS-4   32KB       32KB      Yes               Yes          Multi-issue execution core

Table 1.1 Embedded Microprocessor Family
PIPELINE
Pipelined processors operate by breaking instruction execution into multiple small independent "stages"; since the stages are independent, multiple instructions can be in varying states of completion in any one cycle. Also, this organization tends to facilitate higher frequencies of operation, since very complex activities can be broken down into "bite-sized" chunks. The result is that multiple instructions are executing at any one time, and that instructions are initiated (and completed) at a very high frequency.
Pipelining success depends on caches, which reduce the amount of time spent waiting for memory. Current IDT offerings use separate instruction and data caches, so the CPU can fetch an instruction and read or write a memory variable in the same clock phase. By combining high-frequency operation with high memory bandwidth, very high performance is achieved. The CPU normally runs from cache, and a cache miss (where data or instructions have to be fetched from memory) is seen as an infrequent event.
Figure 1.1 shows a typical pipeline for a MIPS CPU. This model assumes that instruction fetches and data accesses can be satisfied from the processor caches at the processor operation frequency. All instructions are rigidly defined so that they follow the same sequence of pipestages, even where the instruction does nothing at some stage. The result is that, so long as it keeps hitting the cache, the CPU starts an instruction every clock.
The pipeline stages are:
• Instruction fetch (IF): gets the next instruction from the instruction cache (I-cache).
• Read registers (RD): decodes the instruction and fetches the contents of any registers it uses.
• Arithmetic/logic unit (ALU): performs an arithmetic or logical operation in one clock (floating point math and integer multiply/divide can't be done in one clock and are handled differently; this is described later).
• Memory (MEM): the stage where the instruction can read/write memory variables in the data cache (D-cache). In typical programs, three out of four instructions do nothing in this stage, but allocating the stage for each instruction ensures that the processor never has two instructions wanting the data cache at the same time.
• Write back (WB): stores the value obtained from an operation back to the register file.

Figure 1.1 MIPS 5-stage pipeline (diagram: successive instructions of the instruction sequence plotted against time, each passing through I-cache, register file, ALU, D-cache, and register file)
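The claim that the CPU starts an instruction every clock can be made concrete with a little arithmetic. The following is a minimal sketch, assuming an idealized pipeline in which every access hits the cache and nothing stalls; the function names are illustrative and do not come from this manual.

```c
/* Number of stages in the pipeline described above: IF, RD, ALU, MEM, WB. */
#define PIPE_STAGES 5

/* Cycles to complete n instructions on an idealized pipeline (assumption:
 * every access hits the cache and nothing stalls). The first instruction
 * needs PIPE_STAGES cycles; each later one completes one cycle after the
 * instruction before it. */
unsigned long pipelined_cycles(unsigned long n)
{
    return n == 0 ? 0 : PIPE_STAGES + (n - 1);
}

/* Cycles for the same work if each instruction had to pass through all
 * stages before the next one could start. */
unsigned long unpipelined_cycles(unsigned long n)
{
    return n * PIPE_STAGES;
}
```

Under these assumptions, 100 instructions complete in 104 cycles when pipelined, against 500 cycles when each instruction runs to completion on its own.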
The pipeline limits the kinds of things instructions can do. For example:
• Instruction length: all instructions are 32 bits (exactly one machine "word") long, so that they can be fetched in a constant time. This itself discourages complexity; there are not enough bits in the instruction to encode really complicated addressing modes, for example.
• No arithmetic on memory variables: data from cache or memory is obtained only in the MEM stage, which is much too late to be available to the ALU. Memory accesses occur only as simple load or store instructions which move the data to or from registers (this is described as a "load/store architecture").
MIPS CPUs have 32 general-purpose registers and 3-operand arithmetical/logical instructions, and avoid complex special-purpose instructions that compilers usually cannot generate. This makes the CPU an easy target for efficient optimizing compilers.
32-bit and 64-bit CPUs
IDT offers both 32-bit and 64-bit CPUs; the MIPS architecture defines 64-bit CPUs such that they can cleanly run 32-bit applications. 32-bit and 64-bit processors operate the same with respect to 8-bit and 16-bit data, as described later in this manual.
In the MIPS architecture, 64-bit CPUs implicitly sign-extend most 32-bit values, so that the value is interpreted the same when used as either a 32-bit value or a 64-bit value. Additional instructions are provided for when the size of the data is important: for example, when performing load/store operations, or when testing for arithmetic carry on 32-bit values. The resulting architecture allows either 32-bit applications or 64-bit applications to run on 64-bit processors.
In the reprogrammable computing world, the need for a 64-bit architecture is largely driven by the need to support large programs and large address spaces. In the embedded applications typically served by the IDT families, 64-bit addressing is rarely necessary. However, the ability to directly load, store, and manipulate 64-bit datums improves the performance of applications such as internetworking equipment and image decompression, which operate on large, volatile data streams.
Since 64-bit addressing is rarely needed, but 64-bit datums sometimes are, most compiler tool chains allow the programmer to implement either an "A32D32" or an "A32D64" model: that is, 32-bit addresses and 32-bit datums, or 32-bit addresses with 64-bit datums. Control over these widths is typically achieved by a combination of variable declarations ("long long" or "double") and/or compiler switches.
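The implicit sign extension described above can be modelled in a few lines of C. This is a minimal sketch rather than a description of any particular device; the function names are hypothetical.

```c
#include <stdint.h>

/* Model of how a 64-bit register holds a 32-bit value: bit 31 of the
 * word is copied into bits 32..63 (sign extension). Written with
 * XOR/subtract so that it does not rely on implementation-defined
 * integer conversions. */
int64_t sign_extend32(uint32_t word)
{
    return (int64_t)(word ^ 0x80000000u) - (int64_t)0x80000000;
}

/* The low 32 bits are unchanged by the extension, so code that only
 * looks at 32 bits reads back the value it stored. */
uint32_t low_word(int64_t reg)
{
    return (uint32_t)reg;
}
```

For example, the 32-bit pattern 0xFFFFFFFF becomes the 64-bit value -1, and taking the low word of that register gives 0xFFFFFFFF again, which is why the same value reads consistently at either width.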
MIPS ARCHITECTURE LEVELS
There are multiple generations of the MIPS architecture. The most commonly discussed are the MIPS-1, MIPS-2, MIPS-3, and MIPS-4 architectures. Successive generations implement all the features of the previous generation, along with new instructions designed to solve problems or enhance performance. Note that these ISA levels do not necessarily imply a particular structure for the MMU, caches, exception model, or other kernel-specific resources. Thus, different implementations of ISA-compatible chips may require different kernels.
Figure 1.2 illustrates the relationship of the MIPS ISA levels.

Figure 1.2 MIPS ISA relationships (diagram: the four MIPS ISA levels and the architecture extensions)

• MIPS-1 is the ISA found in the R2000 and R3000 generation of CPUs. It is a 32-bit ISA, and defines the basic instruction set. Any user application written with the MIPS-1 instruction set will operate correctly on all generations of the architecture.
• MIPS-2 is also 32-bit. It adds some instructions to speed up floating point data movement, eliminate software interlocks, and allow compiler-driven branch prediction, along with other minor enhancements. This ISA was first implemented in the MIPS R6000 microprocessor.
• MIPS-3 is a 64-bit ISA. In addition to supporting all MIPS-1 and MIPS-2 instructions, MIPS-3 contains 64-bit equivalents of certain earlier instructions that are sensitive to operand size (e.g. load double and load word are both supported), including doubleword (64-bit) data movement and arithmetic. This ISA was first implemented in the R4000 as a clean transition from the existing 32-bit architecture.
• MIPS-4 adds instructions to improve floating point performance, such as multiply-add, and conditional move instructions. This ISA was first found in the MIPS R8000, and is also present in the R10000 and R5000. It is a 64-bit ISA.
In addition, IDT has implemented small extensions to the ISA, notably in the R4650 and R4640. Although they are not strictly "MIPS extensions," they were added in cooperation with MIPS for the allocation of opcodes.
MIPS COMPARED WITH CISC ARCHITECTURES
Although the MIPS architecture is fairly straightforward, there are a few features, visible only to assembly programmers, that may appear surprising at first. In addition, operations familiar from CISC architectures are irrelevant to the MIPS architecture. For example, the MIPS architecture does not mandate a stack pointer or stack usage; thus, programmers may be surprised to find that push/pop instructions do not exist directly.
Instruction encoding features
All instructions are 32 bits long, as mentioned above. This means, for example, that it is impossible to incorporate a 32-bit constant into a single instruction. A "load immediate" instruction is limited to a 16-bit value; a special "load upper immediate" must be followed by an "or immediate" to put a 32-bit constant value into a register. Note that this is true even for 64-bit instructions. That is, opcodes remain encoded in 32 bits, even though the data operated upon is 64-bit.
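The two-instruction sequence for a 32-bit constant amounts to splitting the value into two 16-bit immediates. The sketch below illustrates that split and the register contents after each step; the struct and function names are hypothetical, and the comments refer to the usual MIPS mnemonics lui and ori for "load upper immediate" and "or immediate".

```c
#include <stdint.h>

/* The two 16-bit immediates needed to build one 32-bit constant. */
struct imm_pair {
    uint16_t upper;  /* immediate for "load upper immediate" (lui) */
    uint16_t lower;  /* immediate for "or immediate" (ori) */
};

/* Split a 32-bit constant into its two instruction immediates. */
struct imm_pair split_constant(uint32_t value)
{
    struct imm_pair p;
    p.upper = (uint16_t)(value >> 16);
    p.lower = (uint16_t)(value & 0xFFFFu);
    return p;
}

/* Register contents after the two instructions execute: lui places its
 * immediate in bits 31..16 and clears bits 15..0; ori zero-extends its
 * immediate and ORs it into the register. */
uint32_t rebuild_constant(struct imm_pair p)
{
    uint32_t reg = (uint32_t)p.upper << 16;  /* lui */
    reg |= p.lower;                          /* ori */
    return reg;
}
```

Because ori zero-extends its immediate, the low half can be used as-is; a sequence that finishes with an add of a sign-extended immediate would need the upper half adjusted instead.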
• Instruction actions must fit the pipeline: actions can only be carried out in the designated pipeline phase, and must be complete in one clock. For example, the register writeback phase provides for just one value to be stored in the register file, so instructions can only change one register.
• 3-operand instructions: arithmetic/logical operations don't have to specify memory locations, so there are plenty of instruction bits to define two independent source registers and one destination register. Compilers love 3-operand instructions, which give optimizers more scope to improve code which handles complex expressions.
• 32 registers: compilers like a large (but not necessarily too large) number of registers, but there is a cost in context-saving and in encoding the registers used by an instruction. Register $0 always returns zero, to give a compact encoding of that useful constant.
• No condition codes: the MIPS architecture does not provide condition code flags implicitly set by arithmetical operations. The motivation is to make sure that execution state is stored in one place, the register file. Conditional branches (in MIPS) test a single register for sign/zero, or a pair of registers for equality/inequality.
Addressing and memory accesses
• Memory references are always register loads and stores: arithmetic on memory variables complicates, and therefore slows down, the pipeline. Memory references only occur in explicit load or store instructions. The large register file allows useful working data to be kept in registers.
• Only one data addressing mode (1): all loads and stores define the memory location with a single base register value modified by a 16-bit signed displacement. Note that the assembler or compiler tools can use the $0 register, along with the immediate value, to synthesize additional addressing modes from this one directly supported mode.
• Byte-addressed: the instruction set includes load/store operations for 8-bit and 16-bit variables (referred to as byte and halfword). Partial-word load instructions come in two flavors: sign-extend and zero-extend.
• Loads/stores must be address-aligned: memory word operations can only load or store data from a single 4-byte aligned word; halfword operations must be aligned on halfword addresses. Techniques to handle unaligned data efficiently will be explained later.
• Jump instructions: the op-code field in a MIPS instruction is 6 bits, leaving 26 bits to define the target of a jump. Since all instructions are 4-byte aligned in memory, the two least-significant address bits need not be stored, allowing an address range of 2^28 bytes = 256 Mbytes. Rather than make this branch PC-relative, this is interpreted as an absolute address within a 256-Mbyte "segment". In theory, this could impose a limit on the size of a single program; in reality, it hasn't been a problem. Branches out of the segment can be achieved by using a jump-register instruction, which uses the contents of a register as the target.
• Conditional branches have only a 16-bit displacement field (a 2^18 byte range, since instructions are 4-byte aligned) which is interpreted as a signed PC-relative displacement. Compilers can only code a simple conditional branch instruction if they know that the target will be within 128 Kbytes of the instruction following the branch.
(1) MIPS-4 does allow register+register addressing for floating-point operands.
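The address arithmetic behind the jump and conditional branch rules above can be sketched as follows. This is an illustrative model only: the function names are hypothetical, and the detail that the segment is selected by the upper four bits of the address of the instruction after the jump is taken from the general MIPS ISA definition rather than from the text above.

```c
#include <stdint.h>

/* Target of a jump: the 26-bit field is shifted left by 2 and combined
 * with the top 4 bits of the address of the instruction after the jump,
 * giving an absolute address inside the current 256-Mbyte segment. */
uint32_t jump_target(uint32_t jump_pc, uint32_t field26)
{
    uint32_t next_pc = jump_pc + 4;
    return (next_pc & 0xF0000000u) | ((field26 & 0x03FFFFFFu) << 2);
}

/* Target of a conditional branch: the 16-bit field is sign-extended,
 * scaled by 4, and added to the address of the instruction following
 * the branch. The reach is therefore -128 Kbytes to just under
 * +128 Kbytes from that instruction. */
uint32_t branch_target(uint32_t branch_pc, uint16_t field16)
{
    /* Portable sign extension of a 16-bit field. */
    int32_t disp = (int32_t)(field16 ^ 0x8000u) - 0x8000;
    int32_t offset = disp * 4;
    return branch_pc + 4 + (uint32_t)offset;
}
```

A field of 0xFFFF (displacement -1) branches back to the branch instruction itself, while 0x7FFF and 0x8000 give the two extremes of the 2^18 byte range.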
Operations not directly supported
• No byte or halfword arithmetic: all arithmetical and logical operations are performed on 32-bit (or 64-bit) quantities. Byte and/or halfword arithmetic would require significant extra resources and many more opcodes. Where a program explicitly does arithmetic on a short or char, the compiler must insert extra code to ensure that wraparound and overflows have the appropriate effect.
• No special stack support: conventional MIPS assembler usage does define a stack pointer register, but the hardware treats it just like any other register. There is a recommended format for the stack frame layout of subroutines, so that programs can mix modules from different languages and compilers. It is recommended that programmers stick to these software conventions, but there are no hardware requirements.
• Minimal subroutine overhead: there is one special feature; jump instructions have a "jump and link" option which stores the return address into a register. By default, for convenience and by convention, $31 becomes the "return address" register.
• Minimal interrupt overhead: the MIPS architecture makes very few presumptions about system exception handling, allowing fast response to a wide variety of software models. In the R30xx family, the CPU stashes away the restart location in the special register EPC, modifies the machine state just enough to signal why the trap happened and to disallow further interrupts, and then jumps to a single predefined location. Everything else is determined by software.
Note: on an interrupt or trap, a MIPS CPU does not store anything on a stack, write to memory, or preserve any registers by itself. By convention, two registers ($k0, $k1) are reserved so that interrupt/trap routines can "bootstrap" themselves; it is impossible to do anything on a MIPS CPU without using some registers. For a program running in any system which takes interrupts or traps, the values of these registers may change at any time, and thus they should not be used.
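The extra code a compiler inserts for byte arithmetic can be pictured with a small model. This is a sketch of the idea only, with hypothetical function names; actual compilers choose their own instruction sequences.

```c
#include <stdint.h>

/* Add two unsigned 8-bit values the way a 32-bit ALU must: the add is
 * done at full register width, then an extra masking step restores
 * 8-bit wraparound. */
uint32_t add_u8_in_register(uint32_t a, uint32_t b)
{
    uint32_t wide = a + b;   /* full-width add */
    return wide & 0xFFu;     /* extra step: wrap to 8 bits */
}

/* Signed 8-bit variant: after masking, bit 7 is copied upward so the
 * register again holds a properly sign-extended char. */
int32_t add_s8_in_register(int32_t a, int32_t b)
{
    uint32_t low = ((uint32_t)a + (uint32_t)b) & 0xFFu;
    return (int32_t)(low ^ 0x80u) - 0x80;
}
```

For example, 200 + 100 in an unsigned char yields 44 after the mask, and 127 + 1 in a signed char yields -128 after re-extension; without the extra step the register would hold 300 or 128.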
Multiply and divide operations
The MIPS CPU does have an asynchronous integer multiply/divide unit. With its own special output registers, the multiply unit is relatively independent of the rest of the CPU.
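As an illustration of why the multiply unit has its own output registers, the sketch below models a 32x32-bit signed multiply whose 64-bit product is delivered in two 32-bit halves. The register names HI and LO follow general MIPS usage and are an assumption here, since the text above does not name them; the struct and function names are hypothetical.

```c
#include <stdint.h>

/* The multiply unit's two result registers, conventionally HI and LO. */
struct hilo {
    uint32_t hi;  /* upper 32 bits of the 64-bit product */
    uint32_t lo;  /* lower 32 bits of the 64-bit product */
};

/* Signed 32x32 multiply delivering a 64-bit product in two halves. */
struct hilo mult32(int32_t a, int32_t b)
{
    int64_t product = (int64_t)a * (int64_t)b;
    struct hilo r;
    r.hi = (uint32_t)((uint64_t)product >> 32);
    r.lo = (uint32_t)product;
    return r;
}
```

Since an ordinary instruction can write only one general-purpose register, a result this wide has to be held somewhere else and moved out one half at a time.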
Programmer-visible pipeline effects
Programmers of MIPS CPUs must also be aware of certain MIPS pipeline effects. Specifically, the results of certain operations may not be available to the next instruction; the programmer needs to be explicitly aware of such cases.

Figure 1.3 The pipeline and branch delays (diagram: branch, branch address, branch delay, and branch target instructions)
INTRODUCTION
CHAPTER
Delayed branches pipeline structure MIPS (see pipeline branch delays) means that when jump instruction reaches ``execute'' phase program counter generated, instruction after jump will already have been decoded. Rather than discard this potentially useful work, architecture rules state that instruction after branch always executed before instruction target branch. "branch likely" instructions introduced MIPS-2 ISA, delay slot "nullified" conditional branch taken. pipeline branch delays show that special path provided through make branch address available half-clock early, ensuring that there only cycle delay before outcome branch determined appropriate instruction flow (branch taken taken) initiated. responsibility compiler system assemblerprogrammer allow for, frequently, instruction which would otherwise have been placed before branch moved into delay slot. Where nothing useful done, delay slot filled with ``nop'' (no-op, no-operation) instruction. Many MIPS assemblers will hide this feature from programmer unless explicitly told described later. Load data available next instruction another consequence pipeline that load instruction's data arrives from cache/ memory system AFTER next instruction's phase starts possible data from load following instruction. Figure pipeline load delays sequence. MIPS-1 architecture, programmer must insure that this rule violated.
Figure: The pipeline and load delays
Again, most assemblers will hide this if they can. Frequently, the assembler can move an instruction which is independent of the load into the load delay slot; in the worst case, it can insert a nop to ensure proper program execution. The MIPS-2 ISA does not require nops to be placed in unfilled load delay slots.
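The ordering rule for delayed branches can be sketched with a toy interpreter. Everything in it is an assumption made for illustration: the three-operation instruction set, the `insn` structure and the function name are hypothetical and do not model a real MIPS pipeline. It only shows that the instruction after a jump runs before the jump target.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy ISA, used only to illustrate execution order. */
enum op { OP_ADDI, OP_JUMP, OP_HALT };
struct insn { enum op op; int arg; };   /* ADDI: amount, JUMP: target index */

/* Run a program with delayed branches: the instruction that follows a
   jump (the delay slot) always executes before the jump target. */
static int run_delayed(const struct insn *prog, int len)
{
    assert(prog != NULL && len > 0);
    int acc = 0;
    int pc = 0;       /* instruction being executed */
    int next = 1;     /* instruction already fetched behind it */
    while (pc >= 0 && pc < len && prog[pc].op != OP_HALT) {
        int after = next + 1;            /* default: sequential flow */
        if (prog[pc].op == OP_ADDI)
            acc += prog[pc].arg;
        else if (prog[pc].op == OP_JUMP)
            after = prog[pc].arg;        /* takes effect after the delay slot */
        pc = next;
        next = after;
    }
    return acc;
}
```

For a program consisting of a jump to index 3, an add of 1 in the delay slot, an add of 100 that is jumped over, an add of 10 at the target and a halt, the accumulator ends at 11: the delay slot instruction ran, the skipped one did not.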
A NOTE ON MACHINE AND ASSEMBLER LANGUAGE
To simplify assembly level programming, the MIPS Corp's assembler (and many other MIPS assemblers) provides a set of "synthetic" instructions. A synthetic instruction is a common assembly level operation that the assembler will map into one or more true machine instructions. This mapping can be more intelligent than a mere macro expansion: for example, an immediate load may map into one instruction if the datum is small enough, or into multiple instructions if the datum is larger. These instructions can dramatically simplify assembly level programming and improve assembly code readability. This is obviously useful, but can be confusing. This manual will use synthetic instructions sparingly, and will indicate when it happens. Moreover, the instruction tables below will consistently distinguish between synthetic and machine instructions.

These features are there to help human programmers; most compilers generate instructions which correspond one-for-one with machine code. However, some compilers will generate synthetic instructions. These are some of the helpful operations that the assembler can perform:

32-bit load immediates: the programmer can code a load with any value (including a memory location which will be computed at link time), and the assembler will break it down into two instructions to load the high and low half of the value.

Load from memory location: the programmer can code a load from a memory-resident variable. The assembler will normally replace this by loading a temporary register with the high-order half of the variable's address, followed by a load whose displacement is the low-order half of the address. Of course, this does not apply to variables defined inside C functions, which are implemented either in registers or on the stack.

Efficient access to memory variables: some C programs contain many references to static or extern variables, and a two-instruction sequence to load/store any of them is expensive. Some compilation systems, with run-time support, get around this. Certain variables are selected at compile/assemble time (by default the MIPS Corp's assembler selects variables which occupy only a few bytes of storage) and kept together in a single section of memory which must be smaller than 64Kbytes. The run-time system then initializes one register ($28 or gp (global pointer) by convention) to point to the middle of this section. Loads and stores to these variables can now be coded as a single gp relative load or store.

More types of branch condition: the assembler synthesizes a full set of branches conditional on an arithmetic test between two registers.

Simple or different forms of instructions: unary operations such as not and neg are produced as a nor or sub with the zero-valued register $0. Two-operand forms of 3-operand instructions can be written; the assembler will put the result back into the first-specified register.

Hiding the branch delay slot: in normal coding most assemblers will not allow access to the branch delay slot, and may re-organize the instruction sequence substantially in search of something useful to do in the delay slot. An assembler directive, .set noreorder, is available where this must not happen.

Hiding the load delay: many assemblers will detect an attempt to use the result of a load in the next instruction, and will either move code around or insert a nop (for MIPS-1).
Unaligned transfers: the ``unaligned'' load/store instructions will fetch halfword and word quantities correctly, even if the target address turns out to be unaligned.

Other pipeline corrections: some instructions (such as those which use the integer multiply unit) have additional constraints that are implementation specific (see the Appendix on hazards). Many assemblers will just "handle" these cases automatically, or at least warn the programmer about possible hazard violations.

Other optimizations: some MIPS instructions (particularly floating point) take multiple clocks to produce results. However, the hardware is ``interlocked'', so the programmer does not need to be aware of these delays to write correct programs. The MIPS Corporation's assembler is particularly aggressive in these circumstances, and will perform substantial code movement to try to make the code run faster. This may need to be considered when debugging.

In general, it is best to use a dis-assembler utility to disassemble the resulting binary during debug. This will show the system designers the true code sequence being executed, and "uncover" the modifications made by the assembler.
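The 32-bit load-immediate and load-from-memory expansions in the list above both rely on splitting a 32-bit value into a high half and a low half. The following is a minimal sketch of that split, assuming the low half is used as a signed 16-bit displacement; the `+ 0x8000` adjustment is the usual way to compensate for a negative displacement and is an assumption here, not something the manual states. The structure and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair: value for the high-half load plus signed displacement. */
struct addr_halves { uint16_t hi; int16_t lo; };

/* Recombine the halves the way the two-instruction sequence would. */
static uint32_t join_address(struct addr_halves h)
{
    return ((uint32_t)h.hi << 16) + (uint32_t)(int32_t)h.lo;
}

/* Split addr so that (hi << 16) + lo == addr, with lo a signed 16-bit value.
   Adding 0x8000 before the shift compensates for a negative lo. */
static struct addr_halves split_address(uint32_t addr)
{
    struct addr_halves h;
    uint32_t low = addr & 0xFFFFu;
    h.hi = (uint16_t)((addr + 0x8000u) >> 16);
    h.lo = (int16_t)(low >= 0x8000u ? (int32_t)low - 0x10000 : (int32_t)low);
    assert(join_address(h) == addr);   /* the split must round-trip */
    return h;
}
```

For the address 0x12348000 the sketch yields a high half of 0x1235 and a displacement of -32768, which recombine to the original address.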
MIPS ARCHITECTURE
CHAPTER
PROGRAMMER'S VIEW OF THE PROCESSOR ARCHITECTURE
This chapter describes the assembly programmer's view of the CPU architecture, in terms of registers, instructions, and computational resources. This viewpoint corresponds to that of an assembly programmer writing user applications. Information about kernel software development (such as handling interrupts, traps, and cache and memory management) is described in later chapters.
Registers
There general purpose registers: $31. These bits wide R30xx, bits wide R4xxx R5000. Two, only two, special hardware: always returns zero, writes ignored. used normal subroutine-calling instructions (jal, bgezal, bttzal) return address. Note that call-by-register version (jalr) register return address, though commonly also uses $31. other respects, registers identical used instruction. There programmer visible program counter. subroutine transfer instructions store link register, which used return from subroutine. Also, there condition codes status bits needed user-level programmer. There registers associated with integer multiplier. These registers-referred "HI" "LO"-contain product result multiply operation quotient remainder divide. result multiplication 128-bits case R4xxx 64-bits case R30xx. HI/LO also function accumulators "multiply-accumulate" instructions mad/madu R4650. R4650 also true operand multiply instruction which does HI/LO registers all. floating point math co-processor (called floating point accelerator, also some times referred this manual), available, adds floating point registers; simple assembler language they just called again fact that these floating point registers implicitly defined instruction. Actually, case R30xx, only even-numbered registers usable math; they used either single-precision bit) double-precision (64-bit) numbers. When performing double-precision arithmetic, higher numbered register holds low-order bits even numbered register specified instruction. Only moves between integer FPA, load/store instructions, will refer odd-numbered registers.
The FPA also has a different set of registers, called ``co-processor registers'', for control purposes. These are typically used to manage the actions/state of the FPA, and should not be confused with the FPA data registers.
The R4600/4700/R5000 offers full 64-bit operations, and its floating point unit can be configured in the following ways:

When the FR bit in the Status register equals 0, the floating point unit is configured as sixteen 64-bit registers for double-precision values or thirty-two 32-bit registers for single-precision values.

When the FR bit in the Status register equals 1, the floating point unit is configured as thirty-two 64-bit registers. Each register can hold single- or double-precision values.

The R4650 supports single precision floating point math only. Its floating point unit can be configured in the following ways:

When the FR bit in the Status register equals 0, the floating point unit is configured as sixteen 32-bit single-precision registers.

When the FR bit in the Status register equals 1, the floating point unit is configured as thirty-two 32-bit single-precision registers.

Some processors also support an MMU (R30xx "E" versions, R4600, R4700, R5000). The R4650 only supports base-bounds translation. There are dedicated registers to handle memory address translation.
Conventional names and uses of general-purpose registers
Although the hardware makes few rules about the use of registers, their practical use is governed by a number of conventions. These conventions allow inter-changeability of tools and operating systems as well as library modules, and include compiler calling conventions that must be strictly followed. With the conventional uses of the registers go a set of conventional names. Given the need to fit in with the conventions, use of the conventional names is pretty much mandatory. The common names are described in Table 2.1.
Reg No    Name     Used for
0         zero     Always returns 0; writes are ignored.
1         at       (assembler temporary) Used by the assembler (for synthetic instruction expansion).
2-3       v0-v1    Values (except FP) returned by a subroutine.
4-7       a0-a3    (arguments) First four parameters for a subroutine.
8-15      t0-t7    (temporaries) Subroutines may use these without saving.
24-25     t8-t9    (temporaries) Subroutines may use these without saving.
16-23     s0-s7    Subroutine ``register variables''; a subroutine which will change one of these must save the old value and restore it before it exits, so the calling routine sees their values preserved.
26-27     k0-k1    Reserved for use by the interrupt/trap handler.
28        gp       Global pointer; some runtime systems maintain this to give easy access to ``static'' or ``extern'' variables.
29        sp       Stack pointer.
30        s8/fp    9th register variable. Subroutines which need one can use this as a ``frame pointer''.
31        ra       Return address for a subroutine.

Table 2.1. Conventional names of registers with usage mnemonics
Notes conventional register names this register often used inside synthetic instructions generated assembler. programmer must explicitly, directive .set noat stops assembler from using (there some synthetic instructions that cause assembler issue warnings). v0-v1 used when returning non-floating-point values from subroutine. return anything bigger than registers, memory must used (described later chapter). a0-a3 used pass first four integer parameters subroutine, different mixture integer floating point parameters. actual convention fully described later chapter. t0-t9 convention, subroutines these values without preserving them. This makes them easy ``temporaries'' when evaluating expressions caller must assume that they will destroyed subroutine call. s0-s8 convention, subroutines must guarantee that values these registers exit same they were entry either using them, saving them stack restoring before exit. k0-k1 reserved trap/interrupt routines, which will restore their original value; they little anyone else. (global pointer). compilation systems loaders support supported, will point load-time-determined location midst your static data. This means that loads stores data lying within 32Kbytes either side value performed single instruction using base register. Without global pointer, loading data from static memory area takes instructions: load most significant bits 32bit constant address computed compiler loader, data load. compiler must know compile time that datum will linked within 64Kbyte range memory locations. practice only guess. usual practice ``small'' global data items area pointed linker fail gets big. definition what "small" typically specified with compiler switch (most compilers "-G"). most common default size bytes less. (stack pointer). Since takes explicit instructions raise lower stack pointer, generally done only subroutine entry exit; responsibility subroutine being called this. 
normally adjusted, entry, lowest point that stack will need reach point subroutine. compiler access stack variables constant offset from Stack usage conventions explained later chapter. (also known s8). subroutine will ``frame pointer'' keep track stack extends stack run-time. Some languages this explicitly (for many toolchains); programs, which ``alloca'' library routine, will this case, possible access stack variables from initialized function prologue constant position relative function's stack frame. Note that ``frame pointer'' subroutine call called subroutines that frame pointer; subroutine must preserve value (return address). entry subroutine, holds address which control should returned subroutine typically ends with instruction ``jr ra''. Subroutines, which themselves call subroutines, must first save usually stack.
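The reach of the global pointer described in the notes above (data within 32Kbytes either side of the gp value) follows from the signed 16-bit displacement of a load or store. A minimal sketch of that range check, with hypothetical function names and example addresses chosen only for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Returns nonzero if addr can be reached from gp with a signed 16-bit
   displacement, i.e. with a single gp-relative load or store. */
static int gp_reachable(uint32_t gp, uint32_t addr)
{
    int64_t delta = (int64_t)addr - (int64_t)gp;
    return delta >= -32768 && delta <= 32767;
}

/* Displacement to encode in the instruction; only valid when reachable. */
static int16_t gp_offset(uint32_t gp, uint32_t addr)
{
    assert(gp_reachable(gp, addr));
    return (int16_t)((int64_t)addr - (int64_t)gp);
}
```

With gp pointing at the middle of a 64Kbyte section, every byte of that section passes the check, and the first byte beyond it does not.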
Integer multiply unit registers
multiply unit consumes small amount area dramatically improves performance (and cache performance) over "multiply step" operations. It's basic operation multiply 32-bit values together produce 64-bit result, which stored 32-bit registers (called ``hi'' ``lo'') which private multiply unit. Instructions mfhi, mflo defined copy result into general registers. R4xxx, 64-bit values multiplied produce 128-bit result. However, case R4xxx, operands 32-bits long only, they must valid sign-extended values. high level language programming this issue, compiler will take care sign extension requirements; should checked when porting assembler-level code from R30xx R4xxx. Unlike results integer operations, multiply result registers interlocked. attempt read results before multiplication complete results being stopped until operation completes. integer multiply unit will also perform integer division between values general-purpose registers; this case ``lo'' register stores quotient, ``hi'' register remainder. R30xx family, multiply operations take clocks division takes Instruction cycle timing multiply double multiply (64-bit) well divide double divide members R4xxx family listed Table 2.2. 3-operand multiply (MUL) multiply-add (MAD) available R4650 only.
Instruction R4600 R4650 R4700 R3000 R5000 MULT/U DIV/U DMULT/U DDIV/U MAD/U
Table 2.2. Multiply divide instruction cycle timing
assembler synthetic multiply operation which starts multiply then retrieves result into ordinary register. Note that assembler even substitute series shifts adds multiplication constant, improve execution speed. Multiply/divide results written into ``hi'' ``lo'' soon they available; effect deferred until writeback pipeline stage, with writes general purpose (GP) registers. mfhi mflo instruction interrupted some kind exception before reaches writeback stage pipeline, will aborted with intention restarting However, subsequent multiply instruction which passed stage will continue parallel with exception processing) would overwrite ``hi'' ``lo'' register values, that re-execution mfhi would wrong (i.e. new) data. this reason recommended that multiply should started within instructions mfhi/ mflo. assembler will avoid doing this when possible. Compilers will often generate code trap errors, particularly divide zero. Frequently, this instruction sequence placed after divide initiated, allow execute concurrently with divide (and avoid performance loss).
The instructions mthi and mtlo are defined to set up the internal registers from general-purpose registers. They are essential to restore the values of ``hi'' and ``lo'' when returning from an exception, but probably not for anything else. The R4650 provides a couple of multiplication instructions that set it apart from the other members of the family. The mad (multiply accumulate) instruction and its unsigned counterpart madu use the "hi" and "lo" registers as accumulators. In addition to these, another instruction, mul, offers true 3-operand multiplication and eliminates the extra step of moving the result from the "lo" register to a general purpose register.
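The role of the ``hi'' and ``lo'' registers for 32-bit operands can be modelled in a few lines of C. This is a sketch under stated assumptions: the structure and function names are hypothetical, only the 32-bit signed case is shown, and the assertions in the divide model exist because C leaves those cases undefined, not because the hardware traps on them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the multiply unit's private result registers. */
struct hi_lo_regs { uint32_t hi; uint32_t lo; };

/* 32 x 32 -> 64-bit signed multiply: high word in hi, low word in lo. */
static struct hi_lo_regs model_mult(int32_t a, int32_t b)
{
    int64_t product = (int64_t)a * (int64_t)b;
    struct hi_lo_regs r;
    r.hi = (uint32_t)((uint64_t)product >> 32);
    r.lo = (uint32_t)((uint64_t)product & 0xFFFFFFFFu);
    return r;
}

/* Signed divide: quotient in lo, remainder in hi. */
static struct hi_lo_regs model_div(int32_t a, int32_t b)
{
    assert(b != 0);                          /* undefined in C */
    assert(!(a == INT32_MIN && b == -1));    /* overflows in C */
    struct hi_lo_regs r;
    r.lo = (uint32_t)(a / b);
    r.hi = (uint32_t)(a % b);
    return r;
}
```

Reading `lo` after `model_mult` corresponds to the mflo step; a 3-operand multiply as described above would deliver the low word directly to a general register.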
Instruction types
full list R30xx family integer instructions presented Appendix Floating point instructions listed Appendix this manual. integer floating point instructions listed appendixes this manual. MIPS uses three instruction encoding formats. most part, instructions numerical order. Occassionally, simplify reading, list re-ordered clarity. Instruction terminology instruction encodings have been chosen facilitate design high-frequency CPU. Specifically: instruction encodings reveal portions internal design. Although there variable encodings, those fields which required very early pipeline encoded very regular way: Source registers always same place that fetch instructions from integer register file without conditional decoding. Some instructions need both registers since register file designed provide source values every clock nothing been lost. 16-bit constant always same place permitting appropriate instruction bits directly into ALU's input multiplexer, without conditional shifts. Throughout this manual, description various instructions will also refer various subfields instruction, follows: basic op-code, bits long. Instructions with large subfields (for example, large immediate values, such required ``long'' j/jal instructions, arithmetic with 16-bit constant) have unique ``op'' field. Other instructions classified groups sharing ``op'' value, distinguished other fields (``op2'' etc.). rs1, fields identifying source registers. register written this instruction. Shift-amount: shift, used shift-by-constant instructions. Sub-code field used 3-register arithmetic/logical group instructions value zero). offset 16-bit signed word offset defining destination ``PCrelative'' branch. branch target will instruction offset words away from delay slot instruction; branch-to-self offset target 26-bit word address jumped corresponds 28bit byte address, which always word-aligned).
constant
high-order bits target address can't specified this instruction, taken from address jump instruction. This means that these instructions reach anywhere 256Mbyte region around instructions' location. jump further (jump register) instruction. 16-bit integer constant ``immediate'' arithmetic logic operations. Arithmetic logical sign extended (such sign-xtnd zero-xtnd). another extended opcode field, this time used ``coprocessor'' type instructions. Field which hold source destination register. Field hold number control register (different from integer register file). Called ``crs''/``crd'' contexts where must source/destination respectively.
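The subfields described above can be extracted with shifts and masks. The sketch below assumes the standard MIPS bit positions (6-bit op in bits 31-26, 5-bit register fields, 16-bit offset, 26-bit target); the function names are hypothetical, and the two source register fields are called rs and rt here, which may differ from the names used in this manual. The jump model takes the high-order bits from the delay slot address, which matches the jump's own address except at the very end of a 256Mbyte region.

```c
#include <assert.h>
#include <stdint.h>

/* Field extraction, assuming the standard MIPS bit positions. */
static uint32_t field_op(uint32_t insn)    { return insn >> 26; }
static uint32_t field_rs(uint32_t insn)    { return (insn >> 21) & 0x1Fu; }
static uint32_t field_rt(uint32_t insn)    { return (insn >> 16) & 0x1Fu; }
static uint32_t field_rd(uint32_t insn)    { return (insn >> 11) & 0x1Fu; }
static uint32_t field_shift(uint32_t insn) { return (insn >> 6) & 0x1Fu; }
static uint32_t field_op2(uint32_t insn)   { return insn & 0x3Fu; }

/* 16-bit signed word offset of a PC-relative branch. */
static int32_t field_offset(uint32_t insn)
{
    return (int32_t)((insn & 0xFFFFu) ^ 0x8000u) - 0x8000;
}

/* Branch target: offset words away from the delay slot instruction. */
static uint32_t branch_target(uint32_t branch_addr, uint32_t insn)
{
    assert((branch_addr & 3u) == 0);
    uint32_t delay_slot = branch_addr + 4u;
    return delay_slot + ((uint32_t)field_offset(insn) << 2);
}

/* Jump target: 26-bit word address within the current 256Mbyte region. */
static uint32_t jump_target(uint32_t jump_addr, uint32_t insn)
{
    assert((jump_addr & 3u) == 0);
    uint32_t region = (jump_addr + 4u) & 0xF0000000u;
    return region | ((insn & 0x03FFFFFFu) << 2);
}
```

A branch whose offset field is -1 resolves to its own address, since the offset is counted from the delay slot.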
Loading and storing: addressing modes
As mentioned above, there is only one basic addressing mode. Any load or store machine instruction can be written

operation dest-reg, offset(src-reg)    e.g.: lw $1, offset($2); sw $3, offset($4)

Any of the integer registers can be used for the destination and source. The offset is a sign extended integer (a 16-bit number, so anywhere between -32768 and 32767); the program address used for the load is the sum of src-reg and the offset. This address mode is normally enough to select a particular member of a C structure (``offset'' being the distance between the start of the structure and the member required); it implements an array indexed by a constant; it is also enough to reference function variables from the stack or frame pointer; and it provides a reasonable sized global area around the gp value for static and extern variables. The assembler synthesizes a simple direct addressing mode, to load the values of memory variables whose address can be computed at link time. More complex modes such as double-register or scaled index must be implemented with two or more instructions.
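The address calculation of the single addressing mode can be written out as a short C sketch. The function names are hypothetical; the only facts used are that the offset field is 16 bits wide and is sign extended before being added to the base register.

```c
#include <assert.h>
#include <stdint.h>

/* Encode a displacement into the 16-bit offset field of a load/store. */
static uint16_t encode_offset(int32_t offset)
{
    assert(offset >= -32768 && offset <= 32767);
    return (uint16_t)((uint32_t)offset & 0xFFFFu);
}

/* Program address used by the load/store: base register plus the
   sign-extended 16-bit offset field. */
static uint32_t effective_address(uint32_t base, uint16_t offset_field)
{
    int32_t offset = (int32_t)((uint32_t)offset_field ^ 0x8000u) - 0x8000;
    return base + (uint32_t)offset;
}
```

For example, with a base register holding 0x80001000, a displacement of -4 addresses 0x80000FFC and a displacement of 16 addresses 0x80001010.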
Data types in memory and registers
R30xx family CPUs can load or store between 1 and 4 bytes in a single operation. The naming conventions used in the documentation and to build instruction mnemonics are:

``C'' name    MIPS name     Size (bytes)    Assembler mnemonic
long long     doubleword    8               ``d'' (see note)
int           word          4               ``w''
long          word          4               ``w''
short         halfword      2               ``h''
char          byte          1               ``b''

Note: doubleword operations are MIPS-III instructions, available on the R4xxx and R5000 only. Some compilers for the R4xxx will allow efficient 64-bit integer math with a special compile-time switch (e.g. the -mint64 switch in IDT/C), where the integer size is 8 bytes and the assembler instructions "ld/sd" are used to load/store 8 bytes at a time.

Table 2.3. Naming conventions
Integer data types Byte halfword loads come flavors: Sign-extend load value into least significant bits 32/64-bit register, fill high order bits copying ``sign bit'' (bit byte, half-word). This correctly converts signed value 32/64-bit signed integer. Zero-extend instructions load value into least significant bits 32/64-bit register, with high order bits filled with zero. This correctly converts unsigned value memory corresponding 32/64-bit unsigned integer value; byte value becomes 32/64-bit value 254. example, value 0xFE (-2, interpreted unsigned), then:
0(t1) 0(t1)
will leave holding value 0xFFFF FFFE signed 32-bit) holding value 0x0000 00FE (254 signed unsigned 32-bit). Subtle differences shorter integers extended longer ones historical cause portability problems, modern standard elaborate rules. machines like MIPS, which does support 16-bit precision arithmetic directly, expressions involving short char variables less efficient than word operations. Unaligned loads stores using assembler Loads stores MIPS architecture must aligned. Half-words must loaded from 2-byte boundaries, words from 4-byte boundaries; R4xxx family, double words must loaded from 8-byte boundaries. load instruction with unaligned address will cause trap. needed, software provide trap handler which will emulate desired load operation hide this feature from application, substantial performance cost. MIPS architecture provides hardware mechanism access unaligned data. machine instructions (load word left), (load word right), (store word left) (store word right). R4600/4700/5000, equivalent 64-bit instructions (load double left), (load double right), (store double left) (store double right) which deal with bytes opposed described this section. loads four bytes from least significant portion word starting from specified address high (left) portion destination register; loads from four bytes from most significant portion word starting from specified address (right) portion register. load word into register from arbitrary address register sequence
0(a0) 3(a0) 0(a0) 3(a0)
endian machine sequence little endian machine (see diagram below). This sequence generated macro-instruction (unaligned load word). macroinstruction (unaligned load half) also provided, synthesized loads shift. Note that allows instruction pairs same destination register without intervening instruction; however, least instruction must executed between instruction pair using value destination register. stores four bytes from high (left) portion source register least significant portion word starting from specified address; stores from four bytes from (right) portion
register most significant portion word starting from specified address. store word from register arbitrary address register sequence
0(a0) 3(a0) 0(a0) 3(a0)
endian machine sequence little endian machine (see diagram below). Note that uses Memory Register
hardware control effect partial word writes; will work destination device does honor byte enables, whereas will work with word-wide device. Unaligned loads stores using data items declared code will correctly aligned default. certain embedded applications such intelligent networking datacom, data structures forced have unaligned data data structures packed bytes between data structures between fields within structure force alignment minimize memory usage. such cases, programmer required descend assembler coding deal with unaligned data accesses. Some compilers, such IDT/C compiler, provide mechanism achieve unaligned data accesses through itself. keyword _attribute_ allows programmer specify special attributes variables structure fields. This keyword followed attribute specification inside double parentheses.
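As a minimal sketch of the attribute syntax just described, assuming the GCC-compatible spelling `__attribute__((packed))`: the structure layout and field names below are hypothetical and are not the manual's own example. The packed variant places the 32-bit field at byte offset 1, so the compiler has to generate unaligned accesses for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Default layout: the compiler pads so that b is naturally aligned. */
struct plain_foo { char a; int32_t b; char x[2]; };

/* Packed layout (GCC-style attribute, an assumption): no padding at all. */
struct packed_foo { char a; int32_t b; char x[2]; } __attribute__((packed));

/* Store into the byte array of a packed structure. */
static void set_x(struct packed_foo *foo, size_t index, char value)
{
    assert(foo != NULL && index < sizeof foo->x);
    foo->x[index] = value;
}

/* Read the unaligned 32-bit field; the compiler emits whatever unaligned
   access sequence the target needs. */
static int32_t read_b(const struct packed_foo *foo)
{
    return foo->b;
}
```

The packed structure occupies 7 bytes, while the default layout is larger because of the padding inserted before and after the 32-bit field.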
attribute interest achieving unaligned data accesses "packed". "packed" attribute forces byte alignment fields data structure. compiler uses lwl/lwr loading swl/swr storing unaligned data. following code does "packed" attribute. Study assembler code generated after compiling: Begin code struct char x[2] foo; Here generated assembler code when "packed" used: Begin code struct char x[2] _attribute_ ((packed)) main() foo.a foo.x[0] foo.x[1] code; begin partial listing assembler code generated from above code*/ 800201c8 <main+18> $v1,65 800201cc <main+1c> $v1,0($v0) 800201d0 <main+20> $v1,18 800201d4 <main+24> $v1,1($v0) note offset byte 800201d8 <main+28> $v1,4($v0) 800201dc <main+2c> $v1,37 800201e0 <main+30> $v1,5($v0) 800201e4 <main+34> $v1,8($v0) assembler code IDT/C compiler efficient enough recognize that field larger than certain number bytes, better lwl/lwr swl/swr pairs entire data transfer, that smarter pairs only point reaching word alignment beyond which regular instructions prove more efficient until point where less than bytes remain transferred using lwl/lwr swl/swr again. Note that "packed" attribute works only structures simple variables such char. achieve packing simple variable, inside structure with that variable only element. Floating point data memory This allows programmer load single-precision values load into even-numbered floating point register; programmer also load double-precision value macro instruction, that:
ldc1 $f2, 24(t1)
is expanded into two loads to consecutive registers:

lwc1 $f2, 24(t1)
lwc1 $f3, 28(t1)

The C compiler aligns 8-byte long double-precision floating point variables to 8-byte boundaries. R30xx family hardware does not require this alignment; it is done to avoid compatibility problems with implementations of MIPS-2 or MIPS-3 CPUs such as the R4600 (Orion), where the ldc1 instruction is a machine instruction and the alignment is necessary.
BASIC ADDRESS SPACE OF THE R30xx
The way in which MIPS processors use and handle addresses is subtly different from that of traditional CISC CPUs, and may appear confusing. Read the first part of this section carefully. Here are some guidelines:

The addresses put into programs are rarely the same as the physical addresses which come out of the chip (sometimes they're close, but not the same). This manual will refer to them as program addresses and physical addresses respectively. A more common name for program addresses is "virtual addresses"; note that the term "virtual address" does not necessarily imply that an operating system must perform virtual memory management (e.g. demand paging from disks), but rather that the address undergoes some transformation before being presented to physical memory. Although virtual address is the proper term, this manual will typically use the term "program address" to avoid confusing virtual addresses with virtual memory management requirements. However, it should be remembered that the CPU always uses virtual (program) addresses, which are translated to physical addresses.

There are two typical operating modes: user and kernel. In user mode, any address above 2Gbytes (most-significant address bit set) is illegal and causes a trap. Also, some instructions cause a trap in user mode.

The 32-bit program address space is divided into four big areas with traditional names; different things happen according to the area an address lies in:

kuseg, 0x0000 0000 - 0x7FFF FFFF (low 2Gbytes): these are the addresses permitted in user mode. In machines with an MMU ("E" versions), they will always be translated (more about the MMU in a later chapter). Software should not attempt to use these addresses unless the MMU is set up. For R30xx CPUs without an MMU, the kuseg "program address" is transformed to a physical address by adding an offset; the address transformations for "base versions" of the R30xx family are described later in this chapter. Note, however, that many embedded applications do not use this address segment (those applications which do not require that the kernel and its resources be protected from user tasks).

kseg0, 0x8000 0000 - 0x9FFF FFFF (512 Mbytes): these addresses are ``translated'' into physical addresses by merely stripping off the top bit, mapping them contiguously into the low 512 Mbytes of physical memory. This transformation operates the same for both "base" and "E" family members. This segment is referred to as "unmapped" because the "E" version devices cannot redirect this translation to a different area of physical memory. Addresses in this region are always accessed through the cache, so they may not be used until the caches are properly initialized. They will be used for most programs and data in systems using "base" family members, and will be used for the OS kernel in systems which do use the MMU ("E" version devices).

kseg1, 0xA000 0000 - 0xBFFF FFFF (512 Mbytes): these addresses are mapped into physical addresses by stripping off the leading three bits, giving a duplicate mapping of the low 512 Mbytes of physical memory. However, kseg1 program address accesses will not use the cache. The kseg1 region is the only chunk of the memory map which is guaranteed to behave properly from system reset; that's why the after-reset starting point (0xBFC0 0000, commonly called the "reset exception vector") lies within it. The physical address of the starting point is 0x1FC0 0000, which means that the hardware should place the boot ROM at this physical address. Software will therefore use this region for the initial program ROM, and most systems also use it for I/O registers. In general, I/O devices should always be mapped to addresses that are accessible from kseg1, and system ROM is always mapped to contain the reset exception vector. Note that code in the ROM can then be accessed uncacheably (during boot up) using kseg1 program addresses, and can also be accessed cacheably (for normal operation) using kseg0 program addresses.

kseg2, 0xC000 0000 - 0xFFFF FFFF (1 Gbyte): this area is only accessible in kernel mode. As for kuseg, in "E" devices program addresses are translated by the MMU into physical addresses; thus, these addresses must not be referenced prior to MMU initialization. For "base versions", physical addresses are generated to be the same as program addresses for kseg2. Note that many systems will not need this region. In "E" versions, it frequently contains OS structures such as page tables; simpler OS'es probably will have little need for kseg2.
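The four address areas and the fixed kseg0/kseg1 translation can be summarised in a short C sketch. The enum and function names are hypothetical; the address boundaries and the stripping of the top three bits are taken from the description above.

```c
#include <assert.h>
#include <stdint.h>

enum segment { KUSEG, KSEG0, KSEG1, KSEG2 };

/* Classify a 32-bit program address by the area it lies in. */
static enum segment classify(uint32_t program_addr)
{
    if (program_addr < 0x80000000u) return KUSEG;
    if (program_addr < 0xA0000000u) return KSEG0;
    if (program_addr < 0xC0000000u) return KSEG1;
    return KSEG2;
}

/* kseg0 and kseg1 both map onto the low 512 Mbytes of physical memory. */
static uint32_t unmapped_to_physical(uint32_t program_addr)
{
    enum segment s = classify(program_addr);
    assert(s == KSEG0 || s == KSEG1);
    return program_addr & 0x1FFFFFFFu;
}

/* Only kseg0 accesses go through the cache; kseg1 accesses never do. */
static int is_cached_unmapped(uint32_t program_addr)
{
    return classify(program_addr) == KSEG0;
}
```

The reset exception vector 0xBFC0 0000 classifies as kseg1 and translates to physical 0x1FC0 0000; the kseg0 address 0x9FC0 0000 reaches the same physical location through the cache.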
SUMMARY OF R30xx SYSTEM ADDRESSING
MIPS program addresses are rarely simply the same as physical addresses, but simple embedded software will probably use addresses in kseg0 and kseg1, where the program address is related in an obvious and unchangeable way to physical addresses. Physical memory locations from 0x2000 0000 (512Mbyte) upward are difficult to access. In "E" versions of the R30xx family, the only way to reach these addresses is through the MMU. In "base" family members, certain of these physical addresses can be reached using kseg2 or kuseg addresses: the address transformations for base R30xx family members are described later in this chapter.
Kernel and user mode
In kernel mode (the CPU resets into this state), all program addresses are accessible. In user mode:

Program addresses above 2Gbytes (top bit set) are illegal and will cause a trap. Note that in "E" devices with an MMU, this means all valid user mode addresses must be translated by the MMU; thus, User mode for "E" devices typically requires the use of a memory-mapped OS. For "base" CPUs, kuseg addresses are mapped to a distinct area of physical memory. Thus, kernel memory resources (including I/O devices) can be made inaccessible to User mode software, without requiring a memory-mapping function from the OS. Alternately, the hardware can choose to "ignore" high-order address bits when performing address decoding, thus "condensing" kuseg, kseg2, kseg1, and kseg0 into the same physical memory.

Instructions beyond the standard user set become illegal. Specifically, the kernel can prevent User mode software from accessing the on-chip CP0 (system control coprocessor, which controls exception and machine state and performs the memory management functions of the CPU).

Thus, the primary differences between User and Kernel modes are:

User mode tasks can be inhibited from accessing kernel memory resources, including OS data structures and I/O devices. This also means that various user tasks can be protected from each other.

User mode tasks can be inhibited from modifying the basic machine state, by prohibiting accesses to CP0.

Note that kernel/user mode does not change the interpretation of anything; just some things cease to be allowed in user mode. In kernel mode the CPU can access low addresses just as if it was in user mode, and they will be translated in the same way.
Memory map for CPUs without MMU hardware
The treatment of kseg0 and kseg1 addresses is the same for all R30xx CPUs. If the system can be implemented using only physical addresses in the low 512Mbytes, and system software can be written to use only kseg0 and kseg1, then the choice of "base" or "E" versions of the R30xx family is not relevant. For versions without the MMU ("base versions"), addresses in kuseg and kseg2 will undergo a fixed address translation, and provide the system designer the option to provide additional memory. The base members of the R30xx family provide the following address translations for kuseg and kseg2 program addresses:

kuseg: this region (the low 2Gbytes of program addresses) is translated to a contiguous 2Gbyte physical region between 1-3Gbytes. In effect, a 1Gbyte offset is added to each kuseg program address. In hex:

Program address               Physical address
0x0000 0000 - 0x7FFF FFFF     0x4000 0000 - 0xBFFF FFFF

kseg2: these program addresses are genuinely untranslated. So program addresses from 0xC000 0000 - 0xFFFF FFFF emerge as identical physical addresses.

This means that "base" versions can generate most physical addresses (without the use of an MMU), except for those between 512Mbyte and 1Gbyte (0x2000 0000 through 0x3FFF FFFF). As noted above, many systems may ignore high-order address bits when performing address decoding, thus condensing all physical memory into the lowest 512MB of addresses.
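The fixed translation performed by the "base" versions can be sketched as a single C function. The function names are hypothetical; the 0x4000 0000 offset for kuseg, the identity mapping for kseg2 and the unreachable 512Mbyte-1Gbyte physical range are taken from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Program-to-physical translation for a "base" (no MMU) R30xx CPU. */
static uint32_t base_translate(uint32_t program_addr)
{
    if (program_addr < 0x80000000u)
        return program_addr + 0x40000000u;   /* kuseg: add the fixed offset */
    if (program_addr < 0xC0000000u)
        return program_addr & 0x1FFFFFFFu;   /* kseg0/kseg1: low 512 Mbytes */
    return program_addr;                     /* kseg2: untranslated */
}

/* Physical addresses a base CPU cannot generate at all. */
static int base_can_reach(uint32_t physical_addr)
{
    return !(physical_addr >= 0x20000000u && physical_addr <= 0x3FFFFFFFu);
}

/* Inverse of the kuseg mapping: the program address for a physical
   address in the 1-3 Gbyte range. */
static uint32_t kuseg_for_physical(uint32_t physical_addr)
{
    assert(physical_addr >= 0x40000000u && physical_addr <= 0xBFFFFFFFu);
    return physical_addr - 0x40000000u;
}
```

A system designer placing extra memory at physical 0x4000 0000 would therefore see it at program address 0 in kuseg.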
Subsegments and R3041 memory width configuration
The R3041 can be configured to access different regions of memory as either 32-, 16-, or 8-bits wide. Where the program requests a 32-bit operation to a narrow memory (either with an uncached access, a cache miss, or a store), the CPU may break the transaction into multiple data phases, to match the datum size to the memory port width. The width configuration is applied independently to subsegments of the normal kseg regions, as follows:
kseg0 and kseg1: as usual, these are both mapped onto the low 512Mbytes. This common region is split into eight subsegments (64Mbytes each), each of which can be programmed as 8-, 16-, or 32-bits wide. The width assignment affects both kseg0 and kseg1 accesses (that is, one can view these as subsegments of the corresponding "physical" addresses).
kuseg: is divided into four 512Mbyte subsegments, each independently programmable for width. Thus, kuseg can be broken into multiple portions, which may have varying widths. An example of this may be a 32-bit main memory with some 16-bit PCMCIA font cards and an 8-bit NVRAM.
kseg2: is divided into two 512Mbyte subsegments, independently programmable for width. Again, this means that kseg2 can support multiple memory subsystems, of varying port width.
Note that once the various memory port widths have been configured (typically at boot time), software does not have to be aware of the actual width of any memory system. It can choose to treat all memory as 32-bit wide, and the CPU will automatically adjust when an access is made to a narrower memory region. This simplifies software development, and also facilitates porting to various system implementations (which may or may not choose the same memory port widths).

Kernel Mode Virtual Addressing in the 36100
When the 36100 processor is operating in Kernel mode, four distinct virtual address segments are simultaneously available. The segments are:
kuseg. The kernel may assert the same virtual address as a user process, and have the same virtual-to-physical address translation performed for it as the translation for the user task. This facilitates the kernel having direct access to user memory regions. The virtual-to-physical address translation, including the Port Size attributes, is identical with User mode addressing to this segment.
kseg0. Kseg0 is a 512MB segment, beginning at virtual address 0x8000_0000. This segment is always translated to a linear 512MB region of the physical address space starting at physical address 0. All references through this segment are cacheable. When the most significant three bits of the virtual address are "100", the virtual address resides in kseg0. The physical address is constructed by replacing these three bits of the virtual address with the value "000". As these references are cacheable, kseg0 is typically used for kernel executable code and some kernel data.
kseg1. Kseg1 is also a 512MB segment, beginning at virtual address 0xa000_0000. This segment is also translated directly to the 512MB physical address space starting at address 0. All references through this segment are uncacheable. When the most significant three bits of the virtual address are "101", the virtual address resides in kseg1. The physical address is constructed by replacing these three bits of the virtual address with the value "000". Unlike kseg0, references through kseg1 are not cacheable. This segment is typically used for I/O registers, boot ROM code, and operating system data areas such as disk buffers.
kseg2. This segment is analogous to kuseg, but is accessible only from kernel mode. This segment contains 1024MB of linear addresses, beginning at virtual address 0xc000_0000. As with kuseg, the virtual-to-physical address translation depends on whether the processor is a base or extended architecture version. When the two most significant bits of the virtual address are "11," the virtual address resides in the 1024MB segment kseg2. The virtual-to-physical translation is done either through the TLB (extended versions of the processor) or through a direct segment mapping (base versions). An operating system would typically use this segment for stacks, per-process data that must be re-mapped at context switch, user page tables, and for some dynamically allocated data areas.
Base versions of the R30xx family (including the R36100) are distinguishable from extended versions in software by examining the TS (TLB Shutdown) bit of the Status Register after reset, before the TLB is used. If the TS bit is set immediately after reset, indicating that the TLB is non-functional, then the current processor is a base version of the architecture. If the TS bit is cleared after reset, then the software is executing on an extended architecture version of the processor. The Processor Revision Identifier (PRId) register can be used to distinguish the R36100 from other members of the R30xx family.
R36100 Address Translation
Processors that only implement the base versions of memory management perform direct segment mapping of virtual-to-physical addresses, as illustrated in Figure 2.1. The mapping of kuseg and kseg2 is performed as follows:
Kuseg is always translated to a contiguous 2GB region of the physical address space, beginning at location 0x4000_0000. That is, the value "00" in the two highest order bits of the virtual address space is translated to the value "01", and "01" is translated to "10", with the remaining 30 bits of the virtual address unchanged.
Virtual addresses in kseg2 are directly output as physical addresses; that is, references to kseg2 occur with the physical address unchanged from the virtual address.
Virtual addresses in kseg0 and kseg1 are both translated identically to the same physical address region.
The base versions of the architecture allow kernel software to be protected from user mode accesses, without requiring virtual page management software. User references to a kernel virtual address will result in an address error exception. Note that the special areas of the virtual address space shown in Figure 2.1 are translated to physical addresses identically with the remainder of their virtual address segment. In the R30xx family, these address areas were indicated as "reserved" for compatibility with future devices.
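The direct segment mapping described above can be sketched in a few lines of C. This is only an illustration of the rules as stated in the text, not vendor code; the function names base_virt_to_phys and base_map_selfcheck are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Direct segment mapping for base (no-TLB) versions, as described in the text:
 *   kuseg       : top two bits 00 -> 01 and 01 -> 10 (adds 0x4000_0000)
 *   kseg0/kseg1 : top three bits replaced with 000
 *   kseg2       : physical address equals the virtual address
 * The function name is hypothetical. */
uint32_t base_virt_to_phys(uint32_t va)
{
    if (va < 0x80000000u)          /* kuseg */
        return va + 0x40000000u;
    if (va < 0xC0000000u)          /* kseg0 or kseg1 */
        return va & 0x1FFFFFFFu;
    return va;                     /* kseg2 */
}

/* Small self-check that doubles as a usage example. */
void base_map_selfcheck(void)
{
    assert(base_virt_to_phys(0x00000000u) == 0x40000000u);
    assert(base_virt_to_phys(0x80001000u) == 0x00001000u);
    assert(base_virt_to_phys(0xA0001000u) == 0x00001000u);
    assert(base_virt_to_phys(0xC0000000u) == 0xC0000000u);
}
```

Note that kseg0 and kseg1 addresses with the same low 29 bits land on the same physical location; only the cacheability of the access differs.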
[Figure 2.1. Virtual-to-physical address translation in the R36100. Virtual side: on-chip registers (uncached) at the top of the address space, Kernel Cached (kseg2) from 0xc0000000, Kernel Uncached (kseg1) from 0xa0000000, Kernel Cached (kseg0) from 0x80000000, Kernel/User Cached (kuseg) from 0x00000000. Physical side: on-chip registers (uncached), Kernel Cached, Kernel/User Cached from 0x40000000, an inaccessible region from 0x20000000, and Kernel Boot and I/O from 0x00000000.]
Some systems may elect to protect external physical memory as well. That is, the system may include distinct memory devices which can only be accessed from kernel mode. The physical address output determines whether the reference occurred from kernel or user mode, according to Table 2.4. Some systems may wish to limit accesses to some memory or I/O devices to those physical address bits which correspond to kernel mode virtual addresses. Alternately, some systems may wish to have the kernel and user tasks share common areas of memory. Those systems could choose to have their address decoder ignore the high-order physical address bits, and compress all of memory into the lower region of physical memory. The high-order physical address bits may be useful as privilege mode status outputs in these systems.
Physical Address (31:29)    Virtual Address Segment
'000'                       Kseg0 or Kseg1
'001'                       Inaccessible
'01x'                       Kuseg
'10x'                       Kuseg
'11x'                       Kseg2
Table 2.4. Virtual and Physical Address Relationships in Base Versions
BASIC ADDRESS SPACE R4600/R4700
Readers interested in the R4x00 who have skipped the preceding sections because those sections pertain to the R30xx are advised to review those sections before proceeding. Some general comments regarding the MIPS architecture in those sections are relevant even to the R4xxx processors.
Unlike the R30xx family, the R4xxx family does not have "base versions." The R4600/R4700 processors have memory management units (MMU). The R4600/R4700 uses an on-chip Translation Lookaside Buffer (TLB) to translate program addresses to physical addresses. The R4600/R4700 has three modes of operation: User, Supervisor and Kernel. In the R4600/R4700, the program address space is either 32-bits or 64-bits wide, depending on the mode of operation and the setting of the corresponding extended address bit in the Status Register (UX, SX, KX); if the bit is 0, addresses are 32-bits wide, and if it is 1, they are 64-bits wide. With a 36-bit Physical Address, a total of 64 Gigabytes of physical address space is available. Depending on the mode of operation of the processor, different program address spaces become available as follows:
User
In User mode, a single, contiguous program address space is available. Its size is 2 Gbytes (2^31) in 32-bit mode, where it is called useg. In 64-bit mode its size is 1 Tbyte (2^40) and the space is labelled xuseg. Legal 32-bit addresses are 0x0000 0000 to 0x7FFF FFFF, and legal 64-bit addresses are 0x0000 0000 0000 0000 to 0x0000 00FF FFFF FFFF. Presenting addresses outside these ranges while the processor is in User mode results in an Address Error exception. Cache accessibility is controlled by settings in the TLB entries.
Supervisor
Supervisor mode is designed for layered operating systems in which a true kernel runs in Kernel mode, described later, and the rest of the operating system runs in Supervisor mode. In 32-bit Supervisor mode, two spaces named User Space and Supervisor Space can be addressed. Their labels are suseg and sseg respectively. The 2 Gbytes of suseg lie between 0x0000 0000 and 0x7FFF FFFF. sseg is 512 Mbytes, from 0xC000 0000 to 0xDFFF FFFF. In 64-bit Supervisor mode, three spaces named User Space (xsuseg), Current Supervisor Space (xsseg) and Separate Supervisor Space (csseg) are available. The 1 Tbyte xsuseg runs from 0x0000 0000 0000 0000 to 0x0000 00FF FFFF FFFF. xsseg goes from 0x4000 0000 0000 0000 to 0x4000 00FF FFFF FFFF, and is also 1 Tbyte long. Addressing of csseg is compatible with addressing of sseg in 32-bit mode; it begins at 0xFFFF FFFF C000 0000 and ends at 0xFFFF FFFF DFFF FFFF, covering 512 Mbytes.
Kernel
The processor enters Kernel mode when the EXL bit is set, the ERL bit is set, or the KSU Mode bits select Kernel. On exceptions, either EXL or ERL will be set. The processor remains in exception mode until an instruction to return from exception (eret) is executed, at which point the mode existing prior to detection of the exception is restored. The Kernel-mode program address space is shown in Figure 2.2.
[Figure 2.2. Kernel Mode Address Space.
32-bit panel: kuseg from 0x0000 0000 (mapped); kseg0 from 0x8000 0000 (unmapped, cached); kseg1 from 0xA000 0000 (unmapped, uncached); ksseg from 0xC000 0000 (mapped); kseg3 from 0xE000 0000 to 0xFFFF FFFF (mapped).
64-bit panel: xkuseg from 0x0000 0000 0000 0000 (mapped); xksseg from 0x4000 0000 0000 0000 (mapped); xkphys from 0x8000 0000 0000 0000 (unmapped); xkseg from 0xC000 0000 0000 0000 (mapped); ckseg0 from 0xFFFF FFFF 8000 0000 (unmapped, cached); ckseg1 (unmapped, uncached); cksseg (mapped); ckseg3 up to 0xFFFF FFFF FFFF FFFF (mapped). The gaps between these segments cause an Address Error.]
References to kseg0 and kseg1 are not mapped through the TLB. The physical address is defined by the low-order bits of the program address in kseg0 or kseg1. The cacheability and coherency of kseg0 is determined by settings in the Config register, while kseg1 is never cacheable.
The 64-bit xkuseg offers a special feature for the cache error handler. If the ERL bit of the Status register is set, the segment becomes an unmapped, uncached space, allowing the cache error exception code to operate uncached using r0 as a base register.
The segment xkphys is a set of eight physical spaces, each 2^36 bytes long. References to these spaces do not go through the TLB; the physical address is taken from bits 35:0. Bits 61:59 of the program address determine cacheability and coherency, as shown in Table 2.5.
The regions cksegx are compatible with their 32-bit counterparts ksegx.
Value (61:59)   Cacheability and Coherency Attributes                      Starting Address
0               Cacheable, noncoherent, write-through, no write allocate   0x8000 0000 0000 0000
1               Cacheable, noncoherent, write-through, write allocate      0x8800 0000 0000 0000
2               Uncached                                                   0x9000 0000 0000 0000
3               Cacheable, noncoherent                                     0x9800 0000 0000 0000
4-7             Reserved                                                   0xA000 0000 0000 0000
Table 2.5. Cacheability and Coherency Attributes
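As an illustration of how an xkphys program address splits into its two fields, the following C sketch extracts the attribute selector (bits 61:59) and the physical address (bits 35:0). The type and function names are hypothetical, and the check that bits 63:62 equal binary 10 is inferred from the xkphys starting address of 0x8000 0000 0000 0000.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a 64-bit xkphys program address, following the text:
 * bits 61:59 select the cacheability/coherency attribute (Table 2.5),
 * bits 35:0 are used directly as the physical address.
 * All names here are hypothetical. */
typedef struct {
    unsigned cache_attr;   /* value of bits 61:59 */
    uint64_t phys;         /* 36-bit physical address */
} xkphys_fields;

int is_xkphys(uint64_t va)
{
    return (va >> 62) == 2;    /* top two bits are binary 10 */
}

xkphys_fields decode_xkphys(uint64_t va)
{
    xkphys_fields f;
    assert(is_xkphys(va));                  /* caller must pass an xkphys address */
    f.cache_attr = (unsigned)((va >> 59) & 0x7u);
    f.phys = va & 0xFFFFFFFFFULL;           /* keep bits 35:0 */
    return f;
}
```

For example, decode_xkphys(0x9000000000001000ULL) yields attribute 2 (uncached in Table 2.5) and physical address 0x1000.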
BASIC ADDRESS SPACE R4650
Readers interested in the R4650 who have skipped the sections regarding R30xx addressing a few pages back are advised to review those sections before proceeding. Some general comments regarding the MIPS architecture in those sections are relevant even to the R4650.
The R4650 employs a simple mechanism to support mapping of program addresses to physical addresses. The TLB found in the R4600/R4700 is replaced by a "base-bounds" mechanism. When a program address is translated, its page number is first compared against the Bounds register. If the address is "in range," the base register is added to the program address to form the physical address. There is one set of base-bounds registers for instruction addresses (the IBase and IBounds registers) and another for data (DBase and DBounds). In addition to these registers, the Cache Algorithm (CAlg) register allows a mix of cache attributes in a single system.
In this processor, program addresses are 32-bits wide; the upper 32-bits of 64-bit registers are ignored. The physical address space is 4 Gbytes. The R4650 has two operating modes, User mode and Kernel mode. The address spaces are defined as follows:
useg. The address space from 0x0000 0000 to 0x7FFF FFFF (2 Gbytes) is labelled useg in User mode. This is the only space available in User mode. The same address space is available from Kernel mode as well, where its label is kuseg.
kseg0. The 512 Mbyte address space 0x8000 0000 through 0x9FFF FFFF is defined as kseg0 and is accessible in Kernel mode only. Addresses in kseg0 are not mapped using the base-bounds mechanism; their physical addresses are calculated by subtracting 0x8000 0000 from the program addresses. The CAlg register controls cacheability of this segment. On reset, kseg0 is cacheable.
kseg1. The 512 Mbyte address space 0xA000 0000 through 0xBFFF FFFF is defined as kseg1 and is accessible in Kernel mode only. Addresses in kseg1 are not mapped using the base-bounds mechanism; their physical addresses are calculated by subtracting 0xA000 0000 from the program addresses. The CAlg register controls cacheability of this segment. On reset, caches are disabled for the kseg1 address space, but this can be changed later using the CAlg register.
kseg2. The 1 Gbyte address space 0xC000 0000 through 0xFFFF FFFF is defined as kseg2 and is accessible in Kernel mode only. Addresses in kseg2 are not mapped using the base-bounds mechanism; their physical addresses are calculated by subtracting 0xC000 0000 from the program addresses. The CAlg register controls cacheability of this segment.
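Figure 2.3 shows the kernel mode address space.
[Figure 2.3. Kernel Mode Address Space (R4650). kuseg from 0x0000 0000 (mapped); kseg0 from 0x8000 0000 (unmapped, cached); kseg1 from 0xA000 0000 (unmapped, uncached); kseg2 from 0xC000 0000 to 0xFFFF FFFF (unmapped). Note: the cached and uncached attributes are default values and can be changed through the CAlg register.]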
The address translation from program address to physical address takes place using the same algorithm for data as well as instructions, although different base-bounds registers are used in each case. If addresses above 0x7FFF FFFF are generated in User mode, an address error exception is generated. For addresses in useg, bits 31:12 are compared to Bound register bits 30:12. If the program address is bigger than the bounds address, a Bound Exception occurs. Otherwise, the physical address equals (program address bits 31:12 + Base register bits 31:12) concatenated with program address bits 11:0. Program address bits 31:29 are used to select the appropriate CAlg fields to determine cacheability where applicable, as described earlier.
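The base-bounds calculation above can be sketched in C as follows. This is a minimal model of the rule as worded in the text, not a description of the hardware: the function name and parameters are hypothetical, the comparison is "greater than" because the text says "bigger than", and wrapping the page sum to 20 bits is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Model of R4650 useg translation as described in the text.
 * Returns true and stores the physical address in *pa on success;
 * returns false where the text says an exception would be raised.
 * Names and the 20-bit wrap of the page sum are assumptions. */
bool useg_translate(uint32_t va, uint32_t base, uint32_t bounds, uint32_t *pa)
{
    assert(pa != NULL);
    if (va > 0x7FFFFFFFu)
        return false;                                /* address error in User mode */

    uint32_t vpage = va >> 12;                       /* program address bits 31:12 */
    uint32_t limit = (bounds & 0x7FFFFFFFu) >> 12;   /* Bounds register bits 30:12 */
    if (vpage > limit)
        return false;                                /* Bound exception */

    uint32_t ppage = (vpage + (base >> 12)) & 0xFFFFFu;
    *pa = (ppage << 12) | (va & 0xFFFu);             /* page offset passes through */
    return true;
}
```

The same routine would be called with the IBase/IBounds pair for instruction fetches and the DBase/DBounds pair for data accesses.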
SYSTEM CONTROL CO-PROCESSOR ARCHITECTURE
CHAPTER
This chapter describes those aspects of the MIPS architecture that must be managed by the operating system. Most of these features are transparent to the application programmer; however, most embedded systems programmers will have a view of the underlying system architecture and will find this material important.
Co-processors
Opcodes are reserved and instruction fields defined for up to four "co-processors". Architecturally, the co-processors may be tightly coupled to the base integer CPU; for example, the ISA defines instructions to move data directly between memory and the co-processor, rather than requiring it to be moved into the integer processor first.
MIPS uses the term "co-processor" in both a traditional and a non-traditional sense. The FPA device is a traditional microprocessor co-processor: an optional part of the architecture, with its own particular instruction set. MIPS also uses the term "co-processor" for the functions required to manage the CPU environment, including exception management, cache control, and memory management. This segmentation insures that the chip architecture can be varied (e.g. cache architecture, interrupt controller, etc.) without impacting user mode software compatibility. These functions are grouped by MIPS into the on-chip "co-processor 0", or "system control co-processor", and these instructions implement the whole CPU control system.
Note that co-processor 0 has no independent existence, and is certainly not optional. It provides a standard encoding for the instructions which access the status register, so that, although the definition of the status register changes among implementations, programmers can use the same assembler for both CPUs. Similarly, the exception and memory management strategies can be varied among implementations, and these effects isolated to particular portions of the OS kernel.
CONTROL SUMMARY
This chapter, coupled with the chapters on cache management, memory management, and exception processing, provides the details of managing the machine state. The areas of interest include:
CPU control co-processor: how privileged instructions are organized, with shortform descriptions. There are relatively few privileged instructions; most of the low-level control over the CPU is exercised by reading and writing bit-fields within special registers.
Exceptions: external interrupts, invalid operations, and arithmetic errors all result in "exceptions", where control is transferred to an exception handler routine. MIPS exceptions are extremely simple; the hardware does the absolute minimum, allowing the programmer to tailor the exception mechanism to the needs of the particular system. A later chapter describes MIPS exceptions, why they are precise, exception vectors, and conventions about how to code exception handling routines. Special problems can arise with nested exceptions: exceptions occurring while the CPU is still handling an earlier exception. Hardware interrupts have their own style and rules. The Exception Management chapter includes an annotated example of a moderately-complicated exception handler.
Caches and cache management: all current R30xx and R4xxx implementations have dual caches (the I-cache for instructions, the D-cache for data). On-chip hardware is provided to manage the caches, and the programmer working with I/O devices, particularly with DMA devices, may need to explicitly manage the caches in particular situations. To manipulate the caches, the R30xx allows software to isolate them, inhibiting cache/memory traffic and allowing the processor to access cache as if it were simple memory; the R30xx can also swap the roles of the I-cache and D-cache (the only way to make the I-cache writable). The R4xxx provides direct access to both primary caches through the cache instruction. Caches must sometimes be cleared of stale or invalid/uninitialized data. Even following power-up, the caches are in a random state and must be cleaned up before they can be used. A later chapter will discuss the techniques used by software to manage the on-chip cache resources. In addition, techniques to determine the on-chip cache sizes will be shown (the greatest flexibility is achieved if software can be written to be independent of cache sizes). For the diagnostics programmer, techniques to test the cache memory and probe for particular entries will be discussed. On some implementations the system designer may make configuration choices about the cache (e.g. the R3081 and R3071 allow the cache organization to be selected between 16kB of I-cache/4kB of D-cache and 8kB of each cache). The cache management chapter will also discuss some of the considerations to apply to make the proper selection.
Write buffer: in the R30xx family CPUs the D-cache is always write through; all writes go to main memory as well as the cache. This simplifies the caches, but main memory won't be able to accept data as fast as the CPU can write it. Much of the performance loss can be made up by using a FIFO buffer to hold write cycles (both address and data). In the R30xx family, this FIFO, called the write buffer, is integrated on-chip. In the R4xxx, the D-cache can be either write-back or write-through. The FIFO store described above also exists in the R4xxx. System programmers may need to know that writes happen later than the code sequence suggests. The chapter on cache management discusses this.
CPU Reset: at reset almost nothing is defined, so the software must configure the CPU carefully. In MIPS CPUs, reset is implemented in almost exactly the same way as the exceptions. A later chapter on reset initialization discusses ways of finding out which CPU is executing the software, and how to get a ROM program to run. An example of a C runtime environment, attending to the stack and special registers, is provided.
Memory management and the TLB/Base-Bounds: a later chapter will discuss address translation and managing the translation hardware (the TLB, or the base-bounds mechanism in the R4650 and others). This section is mostly for OS programmers.
Power management: R4xxx processors can go into a mode called "standby" mode with the WAIT instruction. In this mode the internal core operates at considerably reduced power. For more information, refer to the RISC Microprocessor Application Guide.
CPU CONTROL "CO-PROCESSOR"
CPU control instructions
CPU control functions are implemented with registers (most of which consist of multiple bitfields). There are several CPU control instructions used in the memory management implementation, which are described later in this manual. Aside from the MMU, CPU control in the R30xx defines just one instruction beyond those necessary to move to and from the control registers.
mtc0 <nn> - Move to co-processor zero
Loads "co-processor 0" register number nn from a CPU general register. It is unusual, and not good practice, to refer to CPU control registers by their number in assembler sources; normal practice is to use the names listed in Table 3.1. In some tool-chains the names are defined by a C-style "include" file, with the C pre-processor run as a front-end to the assembler; the assembler manual should provide guidance on how to do this. This is the only way of setting bits in a CPU control register.
mfc0 - Move from co-processor zero
A general register is loaded with the values from CPU control register number nn. Once again, it is common to use a symbolic name and a macro-processor to save remembering the numbers. This is the only way of inspecting bits in a control register.
rfe - Restore from exception (R30xx)
This instruction is available in the R30xx only. Note that this is not "return from exception". This instruction restores the status register to go back to the state prior to the trap. To understand what it does, refer to the status register defined later in this chapter. The only secure way of returning to user mode from an exception is to return with a jr instruction which has the rfe in its delay slot.
eret - Exception return (R4xxx)
This is the R4xxx instruction which actually returns from an exception, interrupt or error trap. Unlike a branch or jump instruction, eret does not execute the next instruction.
The R4xxx has some additional instructions for CPU control. Doubleword counterparts of the mtc0/mfc0 instructions are also available as dmtc0/dmfc0, which allow 64-bit transfers. The wait instruction puts the CPU into a low-power standby mode. For more information about standby mode, refer to the IDT79R4600 and IDT79R4700 ORION Processor Hardware User's Manual.
Standard CPU control registers

Register Mnemonic   Description
PRId                CPU type and revision level.
SR                  (status register) CPU mode flags.
Cause               Describes the most recently recognized exception.
EPC                 Exception return address.
BadVaddr            Contains the last invalid program address which caused a trap. It is
                    set by address errors of all kinds, even if there is no MMU.
Config              CPU configuration (R3071, R3081, R3041, R4xxx only).
BusCtrl             (R3041 only) configures the bus interface signals. Needs to be set up to
                    match the hardware implementation.
PortSize            (R3041 only) used to flag some program address regions as 8- or 16-bits
                    wide. Must be programmed to match the hardware implementation.
Count               (R3041/R4xxx, read/write) a 24-bit counter incrementing with the CPU
                    clock. (32-bit in R4x00).
Compare             (R3041/R4xxx, read/write) a 24-bit value used to wrap around the Count
                    value and set an output signal. (32-bit in R4xxx).
Context             (R4600/R4700 only) pointer to the kernel virtual page table entry (PTE)
                    in 32-bit address spaces.
XContext            (R4600/R4700 only) pointer to the kernel virtual page table entry (PTE)
                    in 64-bit address spaces.
ECC                 (R4600/R4700/R4650 only) secondary-cache error checking and correcting
                    (ECC) and Primary parity.
CacheErr            (R4600/R4700/R4650 only) Cache Error and Status register.
ErrorEPC            (R4600/R4700/R4650 only) Error Exception Program Counter.
IWatch              (R4650 only, read/write) specifies the instruction program address that
                    causes a Watch exception.
DWatch              (R4650 only, read/write) specifies the data program address that causes
                    a Watch exception.
Table 3.1 CPU Control Register Summary (not MMU)
Control Register Formats
A note about reserved fields: many unused control register fields are marked "0." Bits in such fields are guaranteed to read as zero and should be written as zero. Other reserved fields are marked as reserved; software must always write them as zero and should not assume that it will get back zero, or any other particular value.
Figure 3.1 shows the layout of the fields in the PRId register, a read-only register. The Imp field should be related to the CPU control register set.
PRId Register
[Figure 3.1 PRId Register fields: bits 31:16 reserved, bits 15:8 Imp, bits 7:0 Rev.]
The encoding of Imp and Rev is described in Table 3.2:
type R3000A (including R3051, R3052, R3071, R3081) unique (R3041) R36100 R4600 R4700 R4650 "Imp" value 0x20 0x21 0x22 "Rev" value undefined undefined undefined undefined
Table 3.2 "Imp" and "Rev" values
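As a small illustration of reading these fields, the C sketch below extracts Imp and Rev from a PRId value and formats them in the conventional "x.y" decimal form. It assumes the standard MIPS layout of Imp in bits 15:8 and Rev in bits 7:0; the function names and the sample value 0x2010 are hypothetical, and obtaining the register value itself (with mfc0) is outside the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Field extraction for a PRId value, assuming Imp in bits 15:8
 * and Rev in bits 7:0. Function names are hypothetical. */
unsigned prid_imp(uint32_t prid) { return (prid >> 8) & 0xFFu; }
unsigned prid_rev(uint32_t prid) { return prid & 0xFFu; }

/* Formats the register as "Imp.Rev" in decimal; returns what snprintf
 * returns (the number of characters the full string needs). */
int prid_format(uint32_t prid, char *buf, size_t len)
{
    return snprintf(buf, len, "%u.%u", prid_imp(prid), prid_rev(prid));
}

/* Self-check using a made-up register value. */
void prid_selfcheck(void)
{
    char text[16];
    assert(prid_imp(0x00002010u) == 0x20u);
    assert(prid_rev(0x00002010u) == 0x10u);
    assert(prid_format(0x00002010u, text, sizeof text) == 5);  /* "32.16" */
}
```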
Note that when the Imp field indicates IDT unique, the revision number can be used to distinguish among various CPU implementations. Refer to the R3041 User's manual for the revision level appropriate for that device. Since the R3051, R3052, R3071 and R3081 are kernel compatible with the R3000A, they share the same Imp value.
When printing the value of this register, it is conventional to print it out as "x.y", where "x" and "y" are the decimal values of Imp and Rev, respectively. Try not to use this register and the CPU manuals to size things or to establish the presence or absence of particular features. Software will be more portable and robust if it is designed to include code sequences that probe for the existence of individual features. This manual will provide examples to determine cache sizes, the presence or absence of TLB, FPA, etc.
Status Register (R3xxx)
[Figure 3.2 Status Register Fields (SR) (R3xxx)]
Note that there are no modes such as non-translated or non-cached in MIPS CPUs; all translation and caching decisions are made on the basis of the program address. Fields are:
CU3, CU2: Bits (31:30) control the usability of "co-processors" 3 and 2 respectively. In the R30xx family, these might be enabled if software wishes to use the BrCond(3:2) input pins for polling, or to speed exception decoding.
CU1: Co-processor 1 usable: 1 to use the FPA if present, 0 to disable. When 0, all FPA instructions cause an exception, even for the kernel. It can be useful to turn off the FPA even when one is available; it may also be enabled in devices which do not include an FPA, if the intent is to use BrCond(1) as a polled input.
CU0: Co-processor 0 usable: set to 1 to use some nominally-privileged instructions in user mode (this is rarely, if ever, done). Co-processor 0 instructions are always usable in kernel mode, regardless of the setting of this bit.
RE: Reverse endianness in user mode. The MIPS processors can be configured, at reset time, with either "endianness" (byte ordering convention, discussed in the various CPU's User's Manuals and later in this manual). The RE bit allows applications with one byte ordering convention to be run on systems with the opposite convention, presuming OS software provides the necessary support. When the RE bit is active, user mode software runs as if the CPU had been configured with the opposite endianness.
BEV: Boot exception vectors: when BEV is 1, the CPU uses the ROM (kseg1) space exception entry point (described in a later chapter). BEV is usually set to zero in running systems; this relocates the exception vectors to RAM addresses, speeding accesses and allowing the use of "user supplied" exception service routines.
TS: TLB shutdown: in devices that implement the full R3000A MMU, TS gets set if a program address simultaneously matches two TLB entries. Prolonged operation in this state, in some implementations, could cause internal contention and damage to the chip. TLB shutdown is terminal, and can be cleared only by a hardware reset. In R30xx base family members, which do not include the TLB, this bit is set by reset; software can rely on this to determine the presence or absence of the TLB.
PE: Parity Error: set if a cache parity error has occurred. No exception is generated by this condition, which is really only useful for diagnostics. The MIPS architecture has cache diagnostic facilities because earlier versions of the CPU used external caches, and this provided a way to verify the timing of a particular system. For those implementations the cache parity error bit was an essential design debug tool. For CPUs with on-chip caches this feature is rarely needed; only the R3071 and R3081 implement parity over the on-chip caches.
CM: Cache Miss: set if a data cache miss occurred while the cache was isolated.
PZ: Parity Zero: when set, cache parity bits are written as zero and not checked. This was useful in R3000A systems which required external cache RAMs, but is of little relevance to the R30xx family.
SwC/IsC: Swap caches and Isolate (data) cache. Cache mode bits for cache management and diagnostics. For more details, see the chapter on cache management. These bits are undefined on reset. The system software should set these to known values before proceeding.
IsC set to 1: makes all loads and stores access only the data cache. In this mode, a partial-word store invalidates the cache entry. Note that when this bit is set, even uncached data accesses will not be seen on the bus; further, this bit is not initialized by reset. Boot-up software must insure this bit is properly initialized before relying on external data references.
SwC set to 1: reverses the roles of the I-cache and D-cache, so that software can access and invalidate I-cache entries.
IM: Interrupt mask: an 8-bit field defining which interrupt sources, when active, will be allowed to cause an exception. Six of the interrupt sources are external pins (one may be used by the FPA, which although it lives on the same chip is logically external); the other two are the software-writable interrupt bits in the Cause register. No interrupt prioritizing is provided by the CPU. This is described in greater detail in the chapter dealing with exceptions.
KUc/IEc: The two basic CPU protection bits. KUc is set to 0 when running with kernel privileges, 1 for user mode. In kernel mode, software can get at the whole program address space, and use privileged ("co-processor 0") instructions. User mode restricts software to program addresses between 0x0000 0000 and 0x7FFF FFFF, and can be denied permission to run privileged instructions; attempts to break the rules result in an exception. IEc is set to 0 to prevent the CPU taking any interrupt, 1 to enable.
KUp/IEp: KU previous, IE previous: on an exception, the hardware takes the values of KUc and IEc and saves them here, at the same time changing the values of KUc, IEc to 0, 0 (kernel mode, interrupts disabled). The instruction rfe can be used to copy KUp, IEp back into KUc, IEc.
KUo/IEo: KU old, IE old: on an exception the KUp, IEp bits are saved here. Effectively, the six KU/IE bits are operated as a 3-deep, 2-bit wide stack which is pushed on an exception and popped by an rfe. This provides an opportunity to cleanly recover from an exception occurring so early in an exception handling routine that the first exception has not yet saved the status register. It is particularly useful in allowing the user TLB refill code to be made shorter, as described in the memory management chapter.

Status Register (R4600/R4700)
The Status register (SR) is a read/write register that contains the operating mode, interrupt enabling, and the diagnostic states of the processor. The following list describes the more important Status register fields; Figure 3.3 shows the status register format and field names.
Status Register Format (R4600/R4700)
Figure 3.3 shows the format of the Status register. Table 3.3, which follows the figure, describes the Status register fields.
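Returning briefly to the R30xx status register, the three-deep KU/IE stack described above can be modelled in C as a shift of the low six bits. This is a sketch of the behaviour as the text describes it, assuming the usual R3000 layout (IEc bit 0, KUc bit 1, IEp bit 2, KUp bit 3, IEo bit 4, KUo bit 5); the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SR_KUIE_MASK 0x3Fu   /* KUo IEo KUp IEp KUc IEc in bits 5:0 (assumed layout) */

/* Exception entry pushes the stack: current -> previous, previous -> old,
 * and the current pair becomes 0,0 (kernel mode, interrupts disabled). */
uint32_t sr_on_exception(uint32_t sr)
{
    uint32_t pushed = (sr << 2) & SR_KUIE_MASK;
    return (sr & ~SR_KUIE_MASK) | pushed;
}

/* rfe pops the stack: previous -> current, old -> previous.
 * The old pair is left unchanged. */
uint32_t sr_on_rfe(uint32_t sr)
{
    uint32_t popped = (sr >> 2) & 0x0Fu;
    return (sr & ~0x0Fu) | popped;
}

/* Self-check: user mode with interrupts enabled, one exception, one rfe. */
void sr_stack_selfcheck(void)
{
    uint32_t sr = 0x03u;                 /* KUc = 1, IEc = 1 */
    sr = sr_on_exception(sr);
    assert(sr == 0x0Cu);                 /* saved in KUp/IEp, now kernel mode */
    sr = sr_on_rfe(sr);
    assert((sr & 0x0Fu) == 0x03u);       /* user mode restored */
}
```

A second exception taken before the first handler has saved the status register simply pushes once more, which is why two levels of history are kept.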
[Figure 3.3 Status Register (4600/4700)]
Field   Description
CU      (Cu3:Cu0) Controls the usability of each of the four coprocessor unit numbers.
        CP0 is always usable when in Kernel mode, regardless of the setting of the CU0 bit.
        1 -> usable, 0 -> unusable
FR      Enables additional floating-point registers.
        0 -> 16 registers, 1 -> 32 registers
RE      Reverse-Endian bit, valid in User mode.
BEV     Controls the location of TLB refill and general exception vectors.
        0 -> normal, 1 -> bootstrap
SR      1 -> Indicates a soft reset or NMI has occurred.
CH      Hit (tag match and valid state) or miss indication for the last CACHE Hit Invalidate,
        Hit Write Back Invalidate, Hit Write Back, or Virtual operation for a primary cache.
        0 -> miss, 1 -> hit
CE      Contents of the ECC register set or modify the check bits of the caches when CE is 1;
        see the description of the ECC register.
DE      Specifies that cache parity errors cannot cause exceptions.
        0 -> parity remains enabled, 1 -> disables parity
0       Reserved. Must be written as zeroes, and returns zeroes when read.
IM      Interrupt Mask: controls the enabling of each of the external, internal, and
        software interrupts. An interrupt is taken if interrupts are enabled, and the
        corresponding bits are set in both the Interrupt Mask field of the Status register
        and the Interrupt Pending field of the Cause register. IM[7:2] correspond to
        interrupts Int[5:0] and IM[1:0] to the software interrupts.
        0 -> disabled, 1 -> enabled
KX      Controls whether the TLB Refill Vector or the XTLB Refill Vector address is used
        for TLB misses on kernel addresses.
        0 -> TLB Refill Vector, 1 -> XTLB Refill Vector
SX      Enables 64-bit virtual addressing and operations in Supervisor mode. The
        extended-addressing TLB refill exception is used for TLB misses on supervisor
        addresses.
        0 -> 32-bit, 1 -> 64-bit
UX      Enables 64-bit virtual addressing and operations in User mode. The
        extended-addressing TLB refill exception is used for TLB misses on user addresses.
        0 -> 32-bit, 1 -> 64-bit
KSU     Mode bits.
        10 -> User, 01 -> Supervisor, 00 -> Kernel
ERL     Error Level.
        0 -> normal, 1 -> error
EXL     Exception Level.
        0 -> normal, 1 -> exception
        Note: When going from 0 to 1, IE should be disabled first. This would be done when
        preparing to return from an exception handler, such as before executing the ERET
        instruction.
IE      Interrupt Enable.
        0 -> disable interrupts, 1 -> enables interrupts
Table 3.3 Status Register Fields (4600/4700)
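The way the KSU, ERL, EXL, IE and IM fields of Table 3.3 combine can be sketched in C as below. The bit positions used (IE bit 0, EXL bit 1, ERL bit 2, KSU bits 4:3, IM bits 15:8, and the Cause register IP field in bits 15:8) follow the usual R4xxx layout and are assumptions here, since the figure does not survive in this text; all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed R4xxx Status register bit positions. */
#define SR_IE        (1u << 0)
#define SR_EXL       (1u << 1)
#define SR_ERL       (1u << 2)
#define SR_KSU_SHIFT 3
#define SR_IM_SHIFT  8
#define CAUSE_IP_SHIFT 8

typedef enum { MODE_KERNEL, MODE_SUPERVISOR, MODE_USER, MODE_UNDEFINED } cpu_mode;

/* EXL or ERL forces Kernel mode; otherwise KSU selects the mode. */
cpu_mode sr_mode(uint32_t sr)
{
    if (sr & (SR_EXL | SR_ERL))
        return MODE_KERNEL;
    switch ((sr >> SR_KSU_SHIFT) & 3u) {
    case 0:  return MODE_KERNEL;
    case 1:  return MODE_SUPERVISOR;
    case 2:  return MODE_USER;
    default: return MODE_UNDEFINED;   /* KSU = 11 is not listed in the table */
    }
}

/* An interrupt is taken when IE = 1, EXL = 0, ERL = 0, and some pending
 * interrupt bit in Cause is also set in the Status register mask. */
bool interrupt_taken(uint32_t sr, uint32_t cause)
{
    bool enabled = (sr & SR_IE) && !(sr & (SR_EXL | SR_ERL));
    uint32_t mask = (sr >> SR_IM_SHIFT) & 0xFFu;
    uint32_t pending = (cause >> CAUSE_IP_SHIFT) & 0xFFu;
    return enabled && (mask & pending) != 0;
}

/* Self-check that doubles as a usage example. */
void sr_mode_selfcheck(void)
{
    assert(sr_mode(0x10u) == MODE_USER);
    assert(sr_mode(0x12u) == MODE_KERNEL);      /* EXL set overrides KSU */
    assert(interrupt_taken(0x0401u, 0x0400u));
}
```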
Status Register Modes and Access States
Fields of the Status register set the modes and access states described in the sections that follow.
Interrupt Enable: Interrupts are enabled when all of the following conditions are true: IE = 1, EXL = 0, and ERL = 0. If these conditions are met, the settings of the IM bits identify the interrupt. Note: Setting IE may be delayed by a number of cycles. If performing nested interrupts, re-enable IE first.
Operating Modes: The following Status register settings are required for User, Kernel, and Supervisor modes (see Chapter 2 for more information about operating modes).
The processor is in User mode when KSU = 10 (binary), EXL = 0, and ERL = 0.
The processor is in Supervisor mode when KSU = 01 (binary), EXL = 0, and ERL = 0.
The processor is in Kernel mode when KSU = 00 (binary), or EXL = 1, or ERL = 1.
64-bit Virtual Addressing: The following Status register settings select 32- or 64-bit virtual addressing for User and Supervisor operating modes. Enabling 64-bit virtual addressing permits the execution of 64-bit opcodes and the translation of 64-bit virtual addresses. 64-bit virtual addressing for User and Supervisor modes can be set independently, but is always used for Kernel mode. The KX field controls whether the TLB Refill Vector or the XTLB Refill Vector address is used for TLB misses on Kernel addresses. 64-bit opcodes are always valid in Kernel mode. 64-bit addressing and operations are enabled for Supervisor mode when SX = 1. 64-bit addressing and operations are enabled for User mode when UX = 1.
Status Register Reset: The contents of the Status register are undefined at reset, except for the following bits: ERL = 1 and BEV = 1. The SR bit distinguishes between Reset and Soft Reset (Nonmaskable Interrupt [NMI]).

Status Register (R4650)
The Status register (SR) in the R4650 is similar to that in the R4600 for the most part. Please refer to the previous section for details. Figure 3.4 shows the format of the entire register in the R4650. Following the figure is a description of the fields that are unique to the R4650.
[Figure 3.4 Status Register (4650)]
Several bits are different in the R4650 compared to the R4600. Because the R4650 does not have a TLB, does not support 64-bit program addressing, and has only two operating modes, the corresponding bits are reserved. As noted in Table 3.4, the DL and IL bits are used for cache locking.

Field   Description
DL      Data cache lock, R4650. Does not prevent refills into set A when set A is invalid.
        Does not inhibit update of the D-cache on store operations.
        0 -> normal operation, 1 -> refill into set A disabled
IL      Instruction cache lock, R4650. Does not prevent refills into set A when set A is invalid.
        0 -> normal operation, 1 -> refill into set A disabled
UM      User Mode bit, R4650.
        1 -> User, 0 -> Kernel
        (Simplification of KSU; remains subject to EXL and ERL, as in the R4xxx.)
Table 3.4 Bits in the 4650 Status Register
Cause Register (R3xxx and R4600/R4700)
Figure 3.5 shows the fields in the Cause register, which are consulted to determine the kind of exception that happened and will be used to decide which exception routine to call.
[Figure 3.5 Fields in the Cause Register (R3xxx and R4600/R4700): BD, CE, IP, ExcCode]
BD: Branch Delay: if set, this bit indicates that the EPC does not point to the actual "exception" instruction, but rather to the branch instruction which immediately precedes it. When the exception restart point is an instruction which is in the "delay slot" following a branch, EPC has to point to the branch instruction; it is harmless to re-execute the branch, but if the CPU returned from the exception to the branch delay instruction itself the branch would not be taken and the exception would have broken the interrupted program. The only time software might be sensitive to this bit is if it must analyze the "offending" instruction (if BD is set, then the instruction is at EPC + 4). This would occur if the instruction needs to be emulated (e.g. a floating point instruction in a device with no hardware FPA, or a breakpoint placed in a branch delay slot).
CE: Co-processor error: if the exception is taken because a "co-processor" format instruction was for a "co-processor" which is not enabled by the CUx bit in SR, then this field has the co-processor number from that instruction.
IP: Interrupt Pending: shows the interrupts which are currently asserted (but may be "masked" from actually signalling an exception). These bits follow the CPU inputs for the six hardware levels. Bits 9 and 8 are read/writable, and contain the value last written to them. However, any of the 8 bits active when enabled by the appropriate IM bit and the global interrupt enable flag IEc in SR will cause an interrupt. IP is subtly different from the rest of the Cause register fields; it doesn't indicate what happened when the exception took place, but rather shows what is happening now.
ExcCode: a 5-bit code which indicates what kind of exception happened, as detailed in Table 3.5, "ExcCode Values: R3xxx/R4600/R4700 Exception differences".

ExcCode Value   Mnemonic      Description
0               Int           Interrupt
1               Mod           TLB modification
2, 3            TLBL, TLBS    TLB load / TLB store
4, 5            AdEL, AdES    Address error (on load/I-fetch or store respectively). Either an attempt to access outside kuseg when in user mode, or an attempt to read a word or half-word at a misaligned address.
6, 7            IBE, DBE      Bus error (instruction fetch or data load, respectively). External hardware has signalled an error of some kind; proper exception handling is system-dependent. The R30xx family CPUs can't take a bus error on a store; the write buffer would make such an exception "imprecise".
8               Syscall       Generated unconditionally by a syscall instruction.
9               Bp            Breakpoint: a break instruction.
10              RI            Reserved instruction.
11              CpU           Co-Processor unusable.
12              Ov            Arithmetic overflow. Note that the unsigned versions of instructions (e.g. addu) never cause this exception.
13              Tr            Trap exception on R4600/R4700; reserved on R3xxx.
14              -             Reserved.
15              FPE           Floating-Point exception.
16-31           -             Reserved.

Table 3.5: ExcCode Values: R3xxx/R4600/R4700 Exception differences
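For logging or debugging, the codes in Table 3.5 are often mapped to their mnemonics. The lookup below is a sketch under the assumption that the values follow the standard MIPS numbering (0 = Int through 15 = FPE, with 14 and 16-31 reserved); the function name is hypothetical.

```c
#include <stddef.h>

/* Mnemonics indexed by ExcCode value, assuming the standard MIPS numbering.
 * Code 14 is reserved on the parts covered here, so it has no mnemonic. */
static const char *const exccode_names[16] = {
    "Int", "Mod", "TLBL", "TLBS", "AdEL", "AdES", "IBE", "DBE",
    "Syscall", "Bp", "RI", "CpU", "Ov", "Tr", NULL, "FPE"
};

/* Returns a printable name for an ExcCode value (0..31). */
static const char *exccode_name(unsigned code)
{
    if (code < 16 && exccode_names[code] != NULL)
        return exccode_names[code];
    return "Reserved";
}
```

Note that code 13 (Tr) is only meaningful on the R4600/R4700; on R3xxx parts it is reserved, so a handler for that family could treat it like any other reserved value.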
Cause Register (R4650)

The Cause register fields (shown in Figure 3.6) are similar to those of the R4600, described in the previous section. Notable differences between the R4650 and R4600 Cause registers are described in Table 3.6.

Figure 3.6: Cause Register Format (R4650)
Field     Description
BD        Indicates whether the last exception taken occurred in a branch delay slot.
            1: delay slot
            0: normal
0         Reserved. Currently read as 0 and must be written as `0'.
CE        Coprocessor unit number referenced when a Coprocessor Unusable exception is taken.
DW        On a Watch exception, indicates that the DWatch register matched. On other exceptions this field is undefined.
IW        On a Watch exception, indicates that the IWatch register matched. On other exceptions this field is undefined.
IV        Enables the dedicated interrupt vector.
            1: interrupts use the dedicated exception vector (200)
            0: interrupts use the common exception vector (180)
IP        Indicates whether an interrupt is pending.
            1: interrupt pending
            0: no interrupt
ExcCode   Exception code field (see Table 3.5, "ExcCode Values: R3xxx/R4600/R4700 Exception differences").

Table 3.6: Cause Register fields (R4650)
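The effect of the dedicated interrupt vector enable can be illustrated with a one-line selection. This is only a sketch: it assumes the values (200) and (180) in the table are hexadecimal offsets from the exception vector base, and the helper name is hypothetical.

```c
#include <stdint.h>

/* Vector offset used when an interrupt is taken on the R4650.
 * Assumes (200) and (180) are hex offsets from the exception base.
 * iv_set is the state of the dedicated-vector enable bit in Cause. */
static inline uint32_t interrupt_vector_offset(int iv_set)
{
    return iv_set ? UINT32_C(0x200) : UINT32_C(0x180);
}
```

With the dedicated vector enabled, the interrupt handler no longer has to share its entry point with the general exception dispatcher.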
EPC Register

This is a 32-bit read/write register containing the 32-bit address of the return point for this exception (64 bits in the R4600/R4700). The instruction causing the exception is at EPC, unless BD is set in Cause, in which case EPC points to the previous (branch) instruction. The R4600/R4700 will not write the EPC if EXL is set. Also, the R4600, R4700, and R4650 use ErrorEPC for cache errors and soft reset.

BadVaddr Register (R3xxx)

A 32-bit register containing the address whose reference led to an exception; it is set on any MMU-related exception, on an attempt by a user program to access addresses outside kuseg, or if an address is wrongly aligned for the datum size referenced. After any other exception this register is undefined. Note in particular that it is not set after a bus error.

BadVaddr Register (R4xxx/R4650)

The 64-bit (32 bits in the R3xxx and R4650) Bad Virtual Address register (BadVAddr) is a read-only register that displays the most recent virtual address that caused one of the following exceptions: Address Error (e.g., unaligned access), TLB Invalid, TLB Modified, TLB Refill, Virtual Coherency Data Access, or Virtual Coherency Instruction Fetch. In the R4650, the bounds exception is recognized in place of the TLB exceptions because the TLB does not exist. The processor does not write to the BadVAddr register when the EXL bit in the Status register is set. The BadVAddr register does not save any information for bus errors, since bus errors are not addressing errors.
Processor-specific registers
Count and Compare Registers (R3041 only)

Only present in the R3041, these provide a simple 24-bit counter/timer running at the CPU cycle rate. Count counts up, and then wraps around to zero once it has reached the value in the Compare register. As it wraps around, the timer output is asserted. According to the CPU configuration (a bit in the BusCtrl register), the output will either remain active until reset by software (by re-writing Compare), or will pulse. In either case the counter just keeps counting. To generate an interrupt, the output must be connected to one of the interrupt inputs. From reset, Compare is set up to its maximum value (0xFFFFFF), so the counter runs up to 2^24-1 before wrapping around.

Count and Compare Registers (R4xxx only)

The 32-bit Count register acts as a timer, incrementing at a constant rate (half the maximum instruction issue rate) whether or not an instruction is executed, retired, or any forward progress is made through the pipeline. This register can be read or written. It can be written for diagnostic purposes or system initialization; for example, to synchronize processors.

The 32-bit Compare register acts as a timer; it maintains a stable value that does not change on its own. When the value of the Count register equals the value of the Compare register, interrupt bit IP(7) in the Cause register is set. This causes an interrupt as soon as the interrupt is enabled. Writing a value to the Compare register, as a side effect, clears the timer interrupt. For diagnostic purposes, the Compare register is a read/write register. In normal use, however, the Compare register is write-only.

Config Register (R3071 and R3081)
Figure: Fields in the R3071/81 Config Register (Lock, Slow Bus, DB Refill, FPInt, Halt, reserved)
Lock: set this to write the register for the last time; all future writes to Config will be ignored.

Slow Bus: hardware may require that this bit be set. It only matters when the CPU performs a store while running from a cached location. The system hardware design determines the proper setting for this bit; setting it should be permissible in any system, but loses some performance in memory systems able to support more aggressive bus performance. When set, an idle bus cycle is guaranteed between any read and write transfer. This enables additional time for bus tri-stating, control logic generation, etc.

DB Refill: set for "data cache block refill", to reload several words into the data cache on a miss; clear to reload just a single word. It can be initialized either way on the R3081, by a reset-time hardware input.

FPInt: controls the CPU interrupt level on which FPA interrupts are reported. On the original R3000 CPUs the FPA was external and this was determined by wiring; but the R3081's FPA is on the same chip, and it would be inefficient (and would jeopardize pin-compatibility) to send the interrupt off chip and on again. Set FPInt to the binary value of the CPU interrupt pin number which is dedicated to FPA interrupts. By default the field is initialized to "011" to select the pin Int3; MIPS convention put the FPA on external interrupt pin 3. For whichever pin is dedicated to the FPA, the CPU will then ignore the value on the external pin; the IP field of the Cause register will simply follow the FPA.

Note: the external pin Int3 corresponds to the bit numbered 5 in the IP field of the Cause register and the IM field of the Status register. That's because both the Cause and Status fields support two "software interrupts", numbered as bits 0 and 1.
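The pin-to-bit relationship in the note can be written out as a small calculation. This is a sketch rather than manual code: it assumes six external interrupt pins numbered 0 to 5 and that the IP field occupies Cause bits 15:8 (the usual MIPS layout); the helper names are hypothetical.

```c
#include <stdint.h>

/* External pin IntN maps to bit N + 2 of the IP (and IM) field, because
 * bits 0 and 1 of those fields are the two software interrupts. */
static inline unsigned int_pin_to_ip_bit(unsigned pin)
{
    return pin + 2u;
}

/* Mask for the same interrupt within the full 32-bit Cause register,
 * assuming the IP field starts at bit 8. */
static inline uint32_t int_pin_to_cause_mask(unsigned pin)
{
    return UINT32_C(1) << (8u + int_pin_to_ip_bit(pin));
}
```

For the default FPInt setting of "011", int_pin_to_ip_bit(3) gives 5, matching the note.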
On the R3071, the FPInt field is "reserved" and must be written as "000".

Halt: set to bring the CPU to a standstill. It will start again as soon as any interrupt input is asserted (regardless of the state of the interrupt mask). This is useful for power reduction, and can also be used to emulate the MC68000 "Halt" operation.

RF (reduced frequency): slows the CPU to 1/16th of the normal clock rate, to reduce power consumption. Illegal unless the CPU is running at 33MHz or higher. Note that the CPU's output clock (which is normally used to synchronize the interface logic) slows down too; the hardware design should also accommodate this feature if software desires to use it.

AC (alternate cache): selects between a 16K I-cache/4K D-cache configuration and an 8K I-cache/8K D-cache configuration.

Reserved: must only be written as zero. It will probably read as zero, but software should not rely on this.

Config Register (R3041)
Figure: Fields in the R3041 Config (Cache Configuration) Register
Lock: set to finally configure the register (additional writes will not have any effect until the next reset).

Fixed fields: must be written with exactly the value shown.

DBlockRefill (DBR): set to read several words into the cache on a miss; clear to refill just the word missed. The proper setting for a given system is dependent on a number of factors, and is best determined by measuring performance in each mode and selecting the best one. Note that it is possible for software to dynamically reconfigure the refill algorithm depending on the current code executing, presuming the register has not been "locked".

Force D-Cache Miss (FDM): set for an R3041-specific cache mode, where all loads result in data being fetched from memory (missing in the data cache), but the incoming data is still used to refill the cache. Stores continue to write the cache. This is useful when software desires to obtain the high bandwidth of cache and cache refills, but the corresponding main memory is "volatile" (e.g. a FIFO, or memory updated by DMA).

Config Register (R4600/R4700)

The Config register specifies various configuration options selected on R4600/R4700 processors; Table 3.7 lists these options. Some configuration options, as defined by Config bits 31:3, are set by the hardware during reset and are included in the Config register as read-only status bits for software access. The only read/write field (Config register bits 2:0) is controlled by software; on reset this field is undefined. The figure shows the format of the Config register; Table 3.7, which follows the figure, describes the Config register fields.
Figure: Config Register Format (R4600/R4700)
Field   Description
EC      System clock ratio. Seven encodings each select a divisor of the processor clock frequency; the remaining encoding is reserved.
EP      Writeback data rate (D = doubleword transferred, x = idle cycle), in encoding order:
          DDDD                1 doubleword every cycle
          DDxDDx              2 doublewords every 3 cycles
          DDxxDDxx            2 doublewords every 4 cycles
          DxDxDxDx            2 doublewords every 4 cycles
          DDxxxDDxxx          2 doublewords every 5 cycles
          DDxxxxDDxxxx        2 doublewords every 6 cycles
          DxxDxxDxxDxx        2 doublewords every 6 cycles
          DDxxxxxDDxxxxx      2 doublewords every 7 cycles
          DxxxDxxxDxxxDxxx    2 doublewords every 8 cycles
          Others              Reserved
BE      BigEndianMem: selects little endian or big endian memory.
IC      Primary I-cache size (I-cache size = 2^(12+IC) bytes). In the R4600/R4700 processor, this is 16 Kbytes (IC = 010).
DC      Primary D-cache size (D-cache size = 2^(12+DC) bytes). In the R4600/R4700 processor, this is 16 Kbytes (DC = 010).
IB      Primary I-cache line size, in bytes (words).
DB      Primary D-cache line size, in bytes (words).
K0      kseg0 coherency algorithm (see EntryLo0 and EntryLo1 registers).
0, 1    Reserved. Returns the indicated values when read.

Table 3.7: Config Register Fields (R4600/R4700)
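The cache size formula in Table 3.7 is easy to check numerically. The sketch below assumes the IC and DC fields sit at Config bits 11:9 and 8:6 respectively (the usual layout for this processor family, not recoverable from the table as printed here); the helper names are hypothetical.

```c
#include <stdint.h>

/* Assumed field positions: IC = Config bits 11:9, DC = Config bits 8:6. */
static inline unsigned config_ic(uint32_t config) { return (config >> 9) & 0x7u; }
static inline unsigned config_dc(uint32_t config) { return (config >> 6) & 0x7u; }

/* Cache size in bytes for a 3-bit IC or DC field value: 2^(12 + field). */
static inline uint32_t cache_size_bytes(unsigned field)
{
    return UINT32_C(1) << (12u + field);
}
```

For a field value of 010 (decimal 2), cache_size_bytes() gives 2^14 = 16384 bytes, i.e. 16 Kbytes.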
Config Register (R4650)

The Config register specifies various configuration options selected on R4650 processors. Some configuration options, as defined by Config bits 31:3, are set by the hardware during reset and are included in the Config register as read-only status bits for software access. Figure 3.10 shows the format of the Config register; Table 3.8, which follows the figure, describes the Config register fields.
Figure 3.10: Config Register Format (R4650)
Field   Description
        Pipeline clock ratio: each encoding selects a factor by which the processor input clock frequency is multiplied ...
