Version December 1998
2975 Stender Way, Santa Clara, California 95054
Telephone: (800) 345-7015
TWX: 910-338-2070
FAX: (408) 492-8674
Printed in U.S.A.
1998 Integrated Device Technology, Inc.

Integrated Device Technology, Inc. reserves the right to make changes to its products or specifications at any time, without notice, in order to improve design or performance and to supply the best possible product. IDT does not assume any responsibility for use of any circuitry described other than the circuitry embodied in an IDT product. The Company makes no representations that circuitry described herein is free from patent infringement or other rights of third parties which may result from its use. No license is granted by implication or otherwise under any patent, patent rights or other rights, of Integrated Device Technology, Inc.

LIFE SUPPORT POLICY

Integrated Device Technology's products are not authorized for use as critical components in life support devices or systems unless a specific written agreement pertaining to such intended use is executed between the manufacturer and an officer of IDT.

Life support devices or systems are devices or systems which are intended for surgical implant into the body, or which support or sustain life, and whose failure to perform, when properly used in accordance with instructions for use provided in the labeling, can be reasonably expected to result in a significant injury to the user.

A critical component is any component of a life support device or system whose failure to perform can be reasonably expected to cause the failure of the life support device or system, or to affect its safety or effectiveness.

The IDT logo is a registered trademark, and BiCameral, BurstRAM, BUSMUX, CacheRAM, DECnet, Double-Density, FASTX, Four-Port, FLEXI-CACHE, Flexi-PAK, Flow-thruEDC, IDT/c, IDTenvY, IDT/sae, IDT/sim, IDT/ux, MacStation, MICROSLICE, PalatteDAC, REAL8, RC3041, RC3051, RC3052, RC3081, RC36100, RC32364, RC4600, RC4640, RC4650, RC4700, RC5000, RC64474, RC64475, RISController, RISCore, RISC Subsystem, RISC Windows, SARAM, SmartLogic, SyncFIFO, SyncBiFIFO, SPC, TargetSystem and WideBus are trademarks of Integrated Device Technology, Inc.
MIPS is a registered trademark, and RISCompiler, RISComponent, RISComputer, RISCware, RISC/os, R3000, and R3010 are trademarks of MIPS Computer Systems, Inc. Postscript is a registered trademark of Adobe Systems, Inc. AppleTalk, LocalTalk, and Macintosh are registered trademarks of Apple Computer, Inc. Centronics is a registered trademark of Genicom, Inc. Ethernet is a registered trademark of Digital Equipment Corp. ... is a registered trademark of ... Corp.

This software developer's guide provides an introduction and design overview, as well as more detailed descriptions, of the following product families:

IDT79RC30xx family of 32-bit RISC controllers
IDT79RC323xx family of 32-bit enhanced MIPS-2 embedded devices
IDT79RC4xxx 64-bit RISController family of high-performance 64-bit CPUs
IDT79RC5000 family of MIPS-4 compatible devices

A reference for the real hardware (non-synthetic) assembler instructions is provided in a separate book, starting with the present revision of this Software Reference Manual.

Chapter 1, "Introduction," presents an overview of IDT's microprocessor families, including a discussion of the pipeline and a comparison of MIPS and CISC architectures.
Chapter 2, "MIPS Architecture," discusses the high-level architecture from the programmer's point of view, including comparisons of the basic address space of the RC30xx, RC4600/4700, and RC4650.
Chapter 3, "System Control Co-Processor Architecture," discusses aspects of the MIPS architecture that must be managed by the operating system, including details about the control co-processor.
Chapter 4, "Exception Management," examines the software techniques used to manage exceptions, and includes several code examples.
Chapter 5, "Cache Management," discusses IDT's implementation of on-chip caches for instructions (I-cache) and data (D-cache).
Chapter 6, "Memory Management," discusses memory management and the Translation Lookaside Buffer (TLB). Also included is a discussion of the RC4650's simple base-bounds mechanism, which it uses instead of a TLB.
Chapter 7, "Reset Initialization," reviews reset, compares it to an exception, and includes information on bootstrap sequences and starting up an application.
Chapter 8, "Floating Point Co-Processor," describes the operation of floating point and compares the implementations in various MIPS microprocessors.
Chapter 9, "Assembler Language Programming," discusses techniques and conventions for reading and writing MIPS assembler code, including a complete table of assembler instructions.
Chapter 10, "C Programming," provides an overview of the principles of designing an efficient C runtime environment, including a discussion of optimization.
Chapter 11, "Portability Considerations," discusses the main facets of designing for portability.
Chapter 12, "Writing Power-On Diagnostics," provides a pragmatic, hands-on look at producing usable diagnostics in a MIPS environment.
Chapter 13, "Instruction Timing and Optimization," discusses the scheduling implications of using MIPS instructions, and includes information about additional hazards.
Chapter 14, "Software Tools for Board Bring-Up," describes the software tools typically used when debugging a new board.
Chapter 15, "Software Design Examples," contains examples of programs and applications for embedded systems.
Chapter 16, "Assembly Language Programming Tips," contains tips for optimizing your programming in the MIPS environment.
Chapter 17, "Assembly Language Syntax," contains details of assembler directives and other assembler language programming syntax issues.
Table of Contents

Chapter 1, Introduction: Overview of IDT's Microprocessor Families; Pipeline; 32-bit and 64-bit CPUs; MIPS Architecture Levels; MIPS and CISC Architectures; Instruction Encoding Features; Addressing and Memory Accesses; Operations Not Directly Supported; Multiply and Divide Operations; Programmer-Visible Pipeline Effects; Notes on Machine and Assembler Language.

Chapter 2, MIPS Architecture: Programmer's View of the Processor Architecture; Registers; Conventional Names and Uses of General-Purpose Registers; Notes on Conventional Register Names; Integer Multiply Unit and Registers; Instruction Types; Instruction Terminology; Loading and Storing: Addressing Modes; Data Types in Memory and Registers; Integer Data Types; Unaligned Loads and Stores Using Assembler; Unaligned Loads and Stores Using C; Floating Point Data in Memory; Basic Address Space for RC3xxx; Summary of RC3xxx System Addressing; Kernel and User Mode; Memory Map for CPUs without MMU Hardware; Subsegments in the RC3041 and RC32364: Memory Width Configuration; Kernel Mode Virtual Addressing (RC36100); RC36100 Address Translation; Basic Address Space for RC4600/RC4700; Basic Address Space for RC4650.

Chapter 3, System Control Co-Processor Architecture: CPU Control Summary; CPU Control and "Co-processor 0"; CPU Control Instructions; Standard CPU Control Registers; Control Register Formats; PRId Register; Status Register (RC3xxx); Status Register (RC32364); Status Register (RC4600/RC4700); Status Register Format (RC4600/RC4700); Status Register Modes and Access States; Status Register Reset; Status Register (RC4650); Cause Register (RC3xxx and RC4600/RC4700); Cause Register (RC4650); Cause Register (RC32364); EPC Register; BadVaddr Register (RC3xxx); BadVaddr Register (RC4xxx/RC4650/RC32364); Processor-Specific Registers; Count and Compare Registers (RC3041 only); Count and Compare Registers (RC4xxx and RC32364 only); Config Register (RC3071 and RC3081); Config Register (RC3041); Config Register (RC32364); Config Register (RC4600/RC4700); Config Register (RC4650); BusCtrl Register (RC3041 only); PortSize Register (RC3041 only); Context Register (RC4600/RC4700 only); XContext Register (RC4600/RC4700 only); Error Checking and Correcting (ECC) Register (RC4600/RC4700/RC4650/RC32364 only); Cache Error (CacheErr) Register (RC4600/RC4700/RC4650/RC32364 only); Error Exception Program Counter (Error EPC) Register (RC4600/RC4700/RC4650/RC32364 only); IWatch Register (RC4650/RC32364 only); DWatch Register (RC4650/RC32364 only); TagLo Register (RC4650/RC32364 only); Registers and System Operation Support.

Chapter 4, Exception Management: Exceptions; Precise Exceptions; Exception Timing; Exception Vectors; Exception Handling Basics; Interrupts.

Chapter 5, Cache Management: Caches and Cache Management; RC30xx Cache Characteristics; Cache Locking; When to Use Cache Locking; Cache Locking in the RC32364; Cache Locking in the RC36100; Cache Isolation and Swapping; RC30xx Cache Characteristics; Initializing and Sizing the Caches; RC30xx Cache Sizing Code Sample; RC32364/RC4xxx/RC5000 Cache Sizing Code Sample; Initializing the RC30xx Cache; RC30xx Cache Initialization Code; Initializing the RC4xxx/RC32364/RC5000 Cache; RC4xxx/RC32364/RC5000 Specific Cache Initialization Code; Invalidation; Locking the RC4650 Caches; Example: Instruction Cache Locking; Testing and Probing; Configuration (RC3041/71/81 only); Write Buffer; Implementing wbflush().

Chapter 6, Memory Management: Translation Lookaside Buffer (TLB); Memory Management Base-bounds; Registers Description; Registers; EntryHi, EntryLo (RC30xx); EntryHi, EntryLo0; Index Register (RC30xx); Index Register (RC4600/RC4700/RC32364/RC5000); Random Register (RC30xx); Random Register (RC4600/RC4700/RC32364/RC5000); PageMask Register (RC4600/RC4700/RC32364/RC5000 only); Wired Register (RC4600/RC4700/RC32364/RC5000 only); Context Register; XContext Register (RC4600/RC4700/RC5000 only); IBase Register (RC4650 only); IBound Register (RC4650 only); DBase Register (RC4650 only); DBound Register (RC4650 only); CAlg Register (RC4650 only); Control Instructions; Programming the TLB; How Refills Occur; Using ASIDs; The Random Register and "Wired" Entries; Memory Translation Setup; Exception Sample Code; Basic Exception Handler; Fast kuseg Refill from Page Table; Simulating Dirty Bits; Debugging; Management Utilities.

Chapter 7, Reset Initialization: Starting Up; Probing and Recognizing the CPU; Bootstrap Sequences; Starting Up an Application.

Chapter 8, Floating Point Co-processor: What Is Floating Point?; IEEE Standard and Background; IEEE Exponent Field and Bias; IEEE Mantissa and Normalization; Reserved Exponent Values; MIPS Data Formats; MIPS Implementation of IEEE; Floating Point Registers (RC30xx); Floating Point Registers (RC4xxx/RC5000); Floating Point Exceptions/Interrupts; Floating Point Control/Status Register; Floating-point Implementation/Revision Register; Guide to Instructions; Load/store; Move Between Registers; 3-operand Arithmetic Operations; 4-operand Arithmetic Operations; Unary (sign-changing) Operations; Conversion Operations; Conditional Branch and Test Instructions; Other Floating Point Instructions; Instruction Timing Requirements; Instruction Timing for Speed; Initialization and Enable on Demand; Floating Point Emulation.

Chapter 9, Assembler Language Programming: Syntax Overview; Points to Note; Register-to-Register Instructions; Immediate (Constant) Operands; Multiply/Divide Instructions; Load/Store Instructions; Unaligned Load and Store Instructions; Addressing Modes; GP-relative Addressing; Jumps, Subroutine Calls and Branches; Conditional Branches; Coprocessor Conditional Branches; Compare; Coprocessor Transfers; Coprocessor Hazards; Assembler Directives; Sections; .text, .rdata, .data; .lit4, .lit8; .bss; .sdata, .sbss; Stack and Heap; Special Symbols; Data Definition and Alignment; .byte, .half, .word, .short; .hword expressions, .int expressions, .long expressions; .single, .float, .double; .ascii, .asciiz "str"; .string "str"; .align; .comm, .lcomm; .space size, fill; Symbol Binding Attributes; .globl symbol, .global symbol; .extern; .weakext; Function Directives; .ent, .end; .aent; .frame, .mask, .fmask; Assembler Control (.set); .set noreorder/reorder; .set volatile/novolatile; .set noat/at; .set nomacro/macro; .set nobopt/bopt; .set mipsn; Listing Controls; .eject; .list; .nolist; .subttl "subheading"; .psize lines, columns; .title "heading"; Complete Guide to Assembler Instructions; Alphabetic List of Assembler Instructions; List of RC30xx Instructions; Alphabetic List of RC4xxx Assembler Instructions; List of RC4xxx Instructions; Alphabetic List of RC5000 Assembler Instructions; Alphabetic List of RC32364 Assembler Instructions.

Chapter 10, C Programming: The Stack, Subroutine Linkage, and Parameter Passing; Stack Argument Structure; Which Arguments Go in Which Registers?; Examples from the C Library; Passing Structures; How printf() and varargs Work; Returning a Value from a Function; Stack-frame Allocation; Leaf Functions; Non-leaf Functions; Functions Needing Run-time Computed Stack Locations; Shared and Non-shared Libraries; Sharing Code in Single-address Space Systems; Sharing Code Across Address Spaces; Introduction to Optimization; Common Optimizations; How to Prevent Unwanted Effects from Optimization; Optimizer-unfriendly Code and How to Avoid It.

Chapter 11, Portability Considerations: Writing Portable C; C Language Standards; C Library Functions and POSIX; Data Representations and Alignment; Notes on Structure Layout and Padding; Isolating System Dependencies; Locating System Dependencies; Fixing Up Dependencies; Isolating Non-Portable Code; Using Assembler; Endianness; What It Means to the Programmer; Bitfield Layout and Endianness; Changing the Endianness of a MIPS CPU; Designing and Specifying for Configurable Endianness; Read-only Instruction Memory; Writable (volatile) Memory; Byte-lane Swapping; Configurable I/O Controllers; Portability and Endianness-independent Code; Endianness-independent Code; Compatibility Within the MIPS Family; Porting to MIPS: Frequently Encountered Issues; Considerations for Portability to Future Devices.

Chapter 12, Writing Power-On Diagnostics: Golden Rules for Diagnostics Programming; What Should Tests Do?; How to Test the Diagnostic Tests?; Overview of Algorithmics' Power-on Selftest; Starting Points; Control and Environment Variables; Reporting; Unexpected Exceptions During Test Sequence; Driving Test Output Devices; Restarting the System; Standard Test Sequence; Notes on the Test Sequence; Annotated Examples from the Test Code.

Chapter 13, Instruction Timing and Optimization: Notes and Examples; Additional Hazards; Early Modification; Bitfields in Control Registers; Hazards Specific to RC4xxx, RC32364 and RC5000; Hazards Specific to RC4650; Hazards Specific to RC32364; Non-obvious Hazards.

Chapter 14, Software Tools for Board Bring-Up: Tools Used in Debug; Initial Debugging; Porting Micromonitor; Running Micromonitor; Initial IDT/SIM Activity; A Final Note on IDT/KIT.

Chapter 15, Software Design Examples: Application Software; Memory Map; Starting Up; C Library Functions; Input and Output; Character Class Tests; String Functions; Mathematical Functions; Utility Functions; Diagnostics; Variable Argument Lists; Non-local Jumps; Signals; Date and Time; Running the Program; Debugging the Program; Embedded System Software; Memory Map; Starting Up; Embedded System Library Functions; Trap and Interrupt Handling; Simple Interrupt Routines; Floating-point Traps and Interrupts; Emulating Floating Point Instructions; Debugging; UNIX-Like System; Terminology; Components of a Process; System Calls and Protection; What the Kernel Does; Virtual Memory Implementation for MIPS; Interrupt Handling for MIPS; How It Works.

Chapter 16, Assembly Language Programming Tips: 32-bit Address or Constant Values; Use of "Set" Instructions; Use of "Set" with Complex Branch Operations; Carry, Borrow, Overflow, and Multi-precision Math; RC4xxx Features; RC5xxx Features.

Chapter 17, Assembly Language Syntax.

Index.

List of Tables

Conventional Register Names; Multiply and Divide Instruction Cycle Timing; Naming Conventions; Virtual and Physical Address Relationships in Base Versions; Cacheability and Coherency; Standard CPU Control Registers; "Imp" and "Rev" Values; Status Register Fields (RC32364); Status Register Fields (4600/4700); Bits in the 4650 Status Register; Cause Register Fields (RC3xxx and RC4600/RC4700); ExcCode Values: R3xxx/R4600/R4700 Exception Differences; Cause Register Field Descriptions; Config Register Fields (RC32364); Config Register Fields (RC4600/RC4700); Config Register Format (RC4650); Context Register Fields; XContext Register Fields; ECC Register Fields; CacheErr Register Format; CacheErr Register Fields; ErrorEPC Register Format; IWatch Register Format; IWatch Register Fields; DWatch Register Format; TagLo Register Format; DWatch Register Fields; TagLo Register Field Descriptions; Primary Cache State Values; RC32364 Exception Vectors; Exception Vector Addresses; Interrupt Bitfields and Interrupt Pins; Registers; Page Coherency Attributes; Index Register Fields; Random Register Fields; PageMask Register Fields; Wired Register Fields; XContext Register Fields; IBase Register Fields; IBound Register Fields; DBase Register Fields; DBound Register Fields; CAlg Register Fields; Floating Point Data Formats; Instructions that Require Operand; RC5000 Floating Point Unit Execution Rate; Instruction Requirements Between Instructions.

List of Figures

MIPS 5-stage Pipeline; MIPS ISA Relationships; The Pipeline and Branch Delays; The Pipeline and Load Delays; Virtual-to-Physical Address Translation in the RC36100; Kernel Mode Address Space; Kernel Mode Address Space; PRId Register Format; Status Register Format (RC3xxx); Status Register (RC32364); Status Register (4600/4700); Status Register (4650); Cause Register Format (RC4650); Config Register Format (RC32364); Config Register Format (RC4600/RC4700); Config Register Format (RC4650); Fields in the R3041 Bus Control (BusCtrl) Register; Context Register Format; XContext Register Format; ECC Register Format; Direct Mapped Cache; Cache Partitioning Example (RC36100); Two-way Set-associative Cache; 32-bit EntryHi Register Fields; 32-bit EntryLo0, EntryLo1 Register Fields in the RC32364; EntryLo0, EntryLo1 Register Fields in 32-bit Mode in the RC5000; Index Register; Random Register in the RC32364; Random Register in the RC4600/RC4700/RC5000; Wired Register Boundary; Wired Register; XContext Register Format; IBase Register; IBound Register; DBase Register; DBound Register; CAlg Register; Program Segments in Memory; Stackframe for a Non-leaf Function; Example of Data Alignment in Memory; Example of "pack" Pragma Layout; Example of "pack" Pragma Effect; Endian Data Structure; Data Structure Mapping, Big-Endian; Data Structure Mapping, Little-Endian; Example of Orientation with the Wrong Endianness; Byte-lane Swapper; Memory Layout of a Process.

Introduction

IDT offers a variety of MIPS ISA-compatible CPUs targeted at embedded applications. The variety of price and performance points enables system developers to design products around various family members quickly, reducing time-to-market and development cost. In the application segments these products typically serve, software development is an increasingly large part of system development. This manual is intended to augment the various device interface manuals, and is targeted at the firmware developer using IDT CPUs. The manual covers the MIPS architecture as seen by the programmer, and attempts to address the most common issues facing developers.

This manual draws upon concepts embodied in various IDT software development products: most notably IDT/c, a multi-host, multi-target C compiler for the IDT microprocessor family, and IDT/sim, the target-resident monitor/debugger for IDT-based systems. Many of the IDT/MIPS architecture concepts discussed here are supported in a similar fashion by toolchains from other vendors. The ultimate choice of toolchain is beyond the scope of this manual; it is not the purpose of this manual to guide developers toward one tool over another. For more information, ask your local IDT sales representative about the "AdvantageIDT" program.

IDT currently offers a wide variety of microprocessors. Since all of these devices are based on the MIPS architecture, software developed for one processor should be easily portable to other family members. However, the MIPS architecture does allow kernel-specific features to be varied by implementation; thus, minor changes to reset code, cache management code, or even exception code may need to occur when changing between certain family members. In addition, the instruction set architecture undergoes "constant improvement," whereby later cores offer architectural features not found in earlier generations. Management of these features also affects portability.
IDT currently offers these families of the MIPS architecture:

The RC30xx family of 32-bit RISC microcontrollers includes the RC3051, RC3052, RC3081 and RC3041 processors. The different members of the family offer different price/performance trade-offs by varying the presence of an FPA and/or TLB, and by varying cache sizes. All of these are based around the original MIPS-I R3000A core.

The RC36100 is an integrated RISC microprocessor/microcontroller. This device features the MIPS-I R3000A core integrated with cache and with system functions such as communications channels, memory controllers, and DMA controllers/channels. In general, descriptions of R30xx operations also apply to this device.

The RC4xxx 64-bit RISController family consists of high-performance 64-bit CPUs. These devices are realized around a proprietary implementation of an RC4400 compatible core. They implement the MIPS-3 ISA. Some devices feature extensions for DSP applications.

The RC5xxx family consists of MIPS-4 compatible devices. In this family, the current device features multiple instruction issue, large caches, and high frequency operation. The descriptions of RC4xxx operations (particularly kernel operations) also apply to these devices.

The RC32300 family is based on the proprietary RISCore32300 core. This family implements several features of MIPS-2 and MIPS-4, along with specialized extensions for DSP applications and software debugging. The RC32364 device is the first member of the family and brings RC4xxx performance levels at lower cost and lower power consumption.

Although most programming occurs using a high-level language (usually "C"), with little awareness of the underlying system or processor architecture, certain operations require the programmer to use assembly programming and/or be aware of the underlying system or processor structure. This manual is designed to be consulted when addressing these types of issues.
Device summary (device: CPU core, ISA level, comments):

RC3041: R3000A, MIPS-1; variable port width interface
RC3051: R3000A, MIPS-1
RC3052: R3000A, MIPS-1
RC3071: R3000A, MIPS-1; half-frequency option
RC3081: R3000A, MIPS-1; half-frequency option
RC36100: R3000A, MIPS-1; integrated system controller and peripherals
RC32364: RISCore32300, MIPS-2 with MIPS-4 extensions; DSP, low power consumption, enhanced JTAG
RC4600: proprietary 64-bit RISController, MIPS-3; 16KB/16KB caches
RC4700: enhanced 64-bit RISController, MIPS-3; 16KB/16KB caches
RC4650: 64-bit RISController, MIPS-3; single-precision FPA, base-bounds
RC4640: 64-bit RISController, MIPS-3; single-precision FPA, base-bounds
RC5000: R5000, MIPS-4; 32KB/32KB caches, multi-issue execution core

Further comments listed for the RC4xxx family are "enhanced multiply performance" and "cost reduced 64-bit RISController, 32-bit bus width". Cache sizes of 8KB/16KB and 512B and an "Optional" TLB are listed for members of the RC30xx family.

Figure: MIPS 5-stage pipeline (instruction sequence against time; each instruction passes through the I-cache, register file, ALU, D-cache, and register file).

Pipeline

Pipelined processors operate by breaking instruction execution into multiple small independent "stages"; since the stages are independent, multiple instructions can be in varying states of completion in any one cycle. This organization also tends to facilitate higher frequencies of operation, since very complex activities are broken down into "bite-sized" chunks. The result is that multiple instructions are executing at any one time, and that instructions are initiated (and completed) at a very high frequency.

Pipelining success depends on caches, which reduce the amount of time spent waiting for memory. The current IDT offerings use separate instruction and data caches, so the CPU can fetch an instruction and read or write a memory variable in the same clock phase. By combining high-frequency operation with high memory bandwidth, very high performance is achieved. The CPU normally runs from cache, and a cache miss (where data or instructions have to be fetched from memory) is seen as an infrequent event.

The figure "MIPS 5-stage pipeline" shows a typical pipeline for a CPU. This model assumes that instruction fetches and data accesses can be satisfied from the processor caches at the processor operation frequency.
Instructions are rigidly defined so that they follow the same sequence of pipestages, even where the instruction does nothing at some stage. The result is that, so long as it keeps hitting the cache, the CPU starts an instruction every clock. The pipeline stages are:

Instruction fetch (IF): gets the next instruction from the instruction cache (I-cache).
Read registers (RD): decodes the instruction and fetches the contents of any CPU registers it uses.
Arithmetic/logic unit (ALU): performs an arithmetic or logical operation in one clock (floating point math and integer multiply/divide can't be done in one clock and are handled differently; this is described later).
MEM: the stage where the instruction can read/write memory variables in the data cache (D-cache). In typical programs, three out of four instructions do nothing in this stage, but allocating the stage to each instruction ensures that the processor never has two instructions wanting the data cache at the same time.
Write back (WB): stores the value obtained from an operation back to the register file.

The pipeline limits the kinds of things instructions can do. For example:

Instruction length: all instructions are 32 bits (exactly one machine "word") long, so that they can be fetched in a constant time. This itself discourages complexity; there are not enough bits in the instruction to encode really complicated addressing modes, for example.
No arithmetic on memory variables: data from cache or memory is obtained only in the MEM stage, which is much too late to be available to the ALU. Memory accesses occur only as simple load or store instructions which move data to or from registers (this is described as a "load/store architecture").

MIPS CPUs have 32 general-purpose registers and 3-operand arithmetical/logical instructions, and avoid complex special-purpose instructions that compilers usually cannot generate. This makes the CPU an easy target for efficient optimizing compilers.

IDT offers both 32-bit and 64-bit CPUs; the MIPS architecture defines 64-bit CPUs such that they can cleanly run 32-bit applications. 32-bit and 64-bit processors operate the same with respect to 8-bit and 16-bit data, as described later in this manual. In the MIPS architecture, 64-bit CPUs implicitly sign-extend most 32-bit values, so that the value is interpreted the same when used as either a 32-bit value or a 64-bit value.
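As a small illustration of the implicit sign-extension described above, the following C sketch models what a 64-bit register would hold after a 32-bit value is loaded. It is a model under stated assumptions rather than a description of any IDT toolchain: the function names are hypothetical, and the zero-extending case corresponds to the MIPS-3 `lwu` instruction, which the text above does not name.

```c
#include <stdint.h>

/* Hypothetical model: contents of a 64-bit register after a
   sign-extending 32-bit load (the usual behavior for 32-bit values). */
static int64_t sign_extend32(uint32_t word)
{
    /* Portable sign extension: flip the sign bit, then subtract its weight. */
    return (int64_t)(word ^ 0x80000000u) - (int64_t)0x80000000;
}

/* Hypothetical model: contents after a zero-extending 32-bit load
   (assumed here to correspond to the MIPS-3 "lwu" instruction). */
static int64_t zero_extend32(uint32_t word)
{
    return (int64_t)word;
}

/* The low 32 bits of a register, i.e. what a 32-bit application sees. */
static uint32_t low32(int64_t reg)
{
    return (uint32_t)(uint64_t)reg;
}
```

For a word with the top bit set, such as 0xFFFFFFFF, the sign-extended register reads as -1 whether it is treated as a 32-bit or a 64-bit quantity, while the zero-extended register reads as 4294967295 when treated as a 64-bit quantity. This is why size-specific instructions are needed when the distinction matters.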
Additional instructions are provided for when the size of the data is important: for example, when performing load/store operations, or when testing for arithmetic carry out of 32-bit values. The resulting architecture allows either 32-bit applications or 64-bit applications to run on 64-bit processors.

In the reprogrammable computing world, the need for a 64-bit architecture is largely driven by the need to support large programs in large address spaces. In the embedded applications typically served by the IDT families, 64-bit addressing is rarely necessary. However, the ability to directly load, store, and manipulate 64-bit datums improves the performance of applications such as internetworking equipment and image decompression, which operate on large, volatile data streams.

Since 64-bit addressing is rarely needed, but 64-bit data sometimes are, most compiler tool chains allow the programmer to implement either an "A32D32" or an "A32D64" model: that is, 32-bit addresses and 32-bit data, or 32-bit addresses with 64-bit datums. Control over these widths is typically achieved by a combination of variable declarations ("long long" or "double") and/or compiler switches.

MIPS Architecture Levels

There are multiple generations of the MIPS architecture. The most commonly discussed are the MIPS-1, MIPS-2, MIPS-3, and MIPS-4 architectures. Successive generations implement the features of the previous generation, along with new instructions designed to solve problems or enhance performance. Note that these ISA levels do not necessarily imply a particular structure for the MMU, caches, exception model, or other kernel-specific resources. Thus, different implementations of ISA-compatible chips may require different kernels. The figure "MIPS ISA Relationships" illustrates the relationship of the MIPS ISA levels and the architecture extensions.

MIPS-1 is the ISA found in the R2000 and R3000 generation of CPUs. It is a 32-bit ISA, and defines the basic instruction set. Any user application written with the MIPS-1 instruction set will operate correctly on all generations of the architecture.
MIPS-2 is also 32-bit. It adds some instructions to speed floating point data movement, eliminate software interlocks, and support compiler-driven branch prediction, along with other minor enhancements. This ISA was first implemented in the MIPS R6000 microprocessor.
MIPS-3 is a 64-bit ISA. In addition to supporting all MIPS-1 and MIPS-2 instructions, MIPS-3 contains 64-bit equivalents of certain earlier instructions that are sensitive to operand size (e.g. load double and load word are both supported), including doubleword (64-bit) data movement and arithmetic. This ISA was first implemented in the R4000 as a clean transition from the existing 32-bit architecture.
MIPS-4 adds instructions to improve floating point performance, such as multiply-add, and conditional move instructions. This ISA was first found in the MIPS R8000, and is also present in the R10000 and R5000. It is a 64-bit ISA.

In addition, IDT has implemented small extensions to the ISA, notably in the RC4650 and RC4640. Although they are not strictly "MIPS extensions," they were added in cooperation with MIPS in the allocation of opcodes. Similar additions were also made to the RISCore32300 core (RC32364 part) which, while being a MIPS-2 core, implements some features of MIPS-4 that do not involve 64-bit-ness, and also adds new instructions altogether.

MIPS and CISC Architectures

Although the MIPS architecture is fairly straightforward, there are a few features, visible only to assembly programmers, that may appear surprising at first. In addition, operations familiar from CISC architectures are irrelevant to the MIPS architecture. For example, the MIPS architecture does not mandate a stack pointer or stack usage; thus, programmers may be surprised to find that push/pop instructions do not exist directly.

All instructions are 32 bits long, as mentioned above. This means, for example, that it is impossible to incorporate a 32-bit constant into a single instruction. The "load immediate" instruction is limited to a 16-bit value; a special "load upper immediate" must be followed by an "or immediate" to put a 32-bit constant value into a register. Note that this is true even for 64-bit instructions. That is, opcodes remain encoded in 32 bits, even though the data operated upon may be 64-bit.

Instruction actions must fit the pipeline: actions can only be carried out in the designated pipeline phase, and must be complete in one clock. For example, the register writeback phase provides for just one value to be stored in the register file, so instructions can only change one register.
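The two-instruction sequence for loading a 32-bit constant, described above, can be modeled in C as follows. This is only a sketch of the arithmetic: the function names are hypothetical, and the models assume that "load upper immediate" places its 16-bit immediate in bits 31..16 with the low half cleared, and that "or immediate" zero-extends its 16-bit immediate before the OR.

```c
#include <stdint.h>

/* Split a 32-bit constant into the two 16-bit immediates. */
static uint16_t upper_half(uint32_t value) { return (uint16_t)(value >> 16); }
static uint16_t lower_half(uint32_t value) { return (uint16_t)(value & 0xFFFFu); }

/* Model of "load upper immediate": immediate goes to bits 31..16,
   bits 15..0 are cleared. */
static uint32_t model_lui(uint16_t imm)
{
    return (uint32_t)imm << 16;
}

/* Model of "or immediate": the 16-bit immediate is zero-extended
   and ORed into the register. */
static uint32_t model_ori(uint32_t reg, uint16_t imm)
{
    return reg | (uint32_t)imm;
}

/* The two-step sequence: lui followed by ori. */
static uint32_t build_constant(uint32_t value)
{
    uint32_t reg = model_lui(upper_half(value));
    return model_ori(reg, lower_half(value));
}
```

Because the OR step zero-extends its immediate, the split needs no correction when bit 15 of the constant is set. A sequence that finished with a sign-extending add instead would have to adjust the upper half to compensate.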
3-operand instructions: arithmetic/logical operations don't have to specify memory locations, so there are plenty of instruction bits to define two independent source registers and one destination register. Compilers love 3-operand instructions, which give optimizers more scope to improve code which handles complex expressions.

32 registers: compilers like a large (but not necessarily too large) number of registers, but there is a cost in context-saving and in encoding the registers to be used by an instruction. Register $0 always returns zero, to give a compact encoding of that useful constant.

No condition codes: the MIPS architecture does not provide condition code flags implicitly set by arithmetical operations. The motivation is to make sure that execution state is stored in one place, the register file. Conditional branches (in MIPS) test a single register for sign/zero, or a pair of registers for equality/inequality.

Memory references are always register loads and stores: arithmetic on memory variables complicates, and therefore slows down, the pipeline. Memory references only occur as explicit load or store instructions. The large register file allows useful working data to be kept in registers.

Only one data addressing mode: all loads and stores define the memory location with a single base register value modified by a 16-bit signed displacement. Note that the assembler and compiler tools can use the assembler temporary register ($at), along with an immediate value, to synthesize additional addressing modes from this one directly supported mode.

Byte-addressing: the instruction set includes load/store operations for 8- and 16-bit variables (referred to as byte and halfword). Partial-word load instructions come in two flavors: sign-extend and zero-extend.

Loads/stores must be address-aligned: memory word operations can only load or store data from a single 4-byte aligned word; halfword operations must be aligned on halfword addresses. Techniques to handle unaligned data efficiently will be explained later.

Jump instructions: the op-code field of a MIPS instruction is 6 bits, leaving 26 bits to define the target of a jump. Since all instructions are 4-byte aligned in memory, the two least-significant address bits need not be stored, allowing an address range of 256Mbytes.
Rather than make this branch PC-relative, the target is interpreted as an absolute address within a 256Mbyte "segment". In theory, this could impose a limit on the size of a single program; in reality, it hasn't been a problem. Branches out of the segment can be achieved by using a jump register instruction, using the contents of a register as the target.

Conditional branches have only a 16-bit displacement field (a 2^18 byte range, since instructions are 4-byte aligned) which is interpreted as a signed PC-relative displacement. Compilers can only code a simple conditional branch instruction if they know that the target will be within 128Kbytes of the instruction following the branch.

Note: MIPS-4 does allow register+register addressing for floating-point operands.

Operations Not Directly Supported

No byte or halfword arithmetic: all arithmetical and logical operations are performed on 32-bit (or 64-bit) quantities. Byte and/or halfword arithmetic would require significant extra resources and many more op-codes. Where a program explicitly does arithmetic as short or char, the compiler must insert extra code to ensure that wraparound and overflows have the appropriate effect.

No special stack support: conventional MIPS assembler usage does define a stack pointer register, but the hardware treats it just like any other register. There is a recommended format for the stack frame layout of subroutines, so that programs can mix modules from different languages and compilers. It is recommended that programmers stick to these software conventions, but there are no hardware requirements.

Minimal subroutine overhead: there is one special feature; jump instructions have a "jump and link" option which stores the return address into a register. By default, for convenience and by convention, $31 becomes the "return address" register.

Minimal interrupt overhead: the MIPS architecture makes very few presumptions about system exception handling, allowing for fast response and a wide variety of software models. In the RC30xx family, the CPU stashes away the restart location in the special register EPC, modifies the machine state just enough to signal why the trap happened and to disallow further interrupts, and then jumps to a single predefined location. Everything else is determined by software.
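The jump and branch ranges discussed in this section follow from simple address arithmetic, which the sketch below models. The function names are hypothetical. One detail is an assumption taken from the general MIPS architecture definition rather than from this text: the upper four bits of a jump target are taken from the address of the delay-slot instruction (the instruction after the jump).

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed quantity."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(branch_pc, offset16):
    """Target of a conditional branch located at branch_pc.

    The 16-bit field counts words (4 bytes each) relative to the
    instruction following the branch, i.e. the delay slot."""
    delay_slot_pc = (branch_pc + 4) & 0xFFFFFFFF
    return (delay_slot_pc + (sign_extend16(offset16) << 2)) & 0xFFFFFFFF


def jump_target(jump_pc, target26):
    """Target of a j/jal located at jump_pc.

    The 26-bit field is a word address inside the current 256Mbyte
    segment; the top 4 bits come from the delay-slot address
    (assumption noted in the lead-in)."""
    delay_slot_pc = (jump_pc + 4) & 0xFFFFFFFF
    return (delay_slot_pc & 0xF0000000) | ((target26 & 0x03FFFFFF) << 2)
```

With an offset field of 0xFFFF (that is, -1), branch_target returns the address of the branch itself, and the largest positive offset reaches just under 128Kbytes past the delay slot.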
Note: on an interrupt or trap, a MIPS CPU does not store anything on the stack, write to memory, or preserve any registers by itself. By convention, two registers ($k0, $k1) are reserved so that interrupt/trap routines can "bootstrap" themselves; it is impossible to do anything on a MIPS CPU without using some registers. For a program running in any system which takes interrupts or traps, the values of these registers may change at any time, and thus they should not be used.

The MIPS CPU does have an asynchronous integer multiply/divide unit. With its special output registers, the multiply unit is relatively independent of the rest of the CPU.

Programmers of MIPS CPUs must also be aware of certain MIPS pipeline effects. Specifically, the results of certain operations may not be available in the next instruction; the programmer needs to be explicitly aware of such cases.

[Figure: pipeline and branch delays (branch, branch address, branch delay, branch target)]

Delayed branches: the pipeline structure of the MIPS CPU (see the figure on pipeline and branch delays) means that when a jump instruction reaches the "execute" phase and a new program counter is generated, the instruction after the jump will already have been decoded. Rather than discard this potentially useful work, the architecture rules state that the instruction after a branch is always executed before the instruction at the target of the branch. For the "branch likely" instructions introduced in the MIPS-2 ISA, the delay slot is "nullified" if the conditional branch is not taken.

The figure shows that a special path is provided to make the branch address available half a clock early, ensuring that there is only a one cycle delay before the outcome of the branch is determined and the appropriate instruction flow (branch taken or not taken) is initiated.

It is the responsibility of the compiler system or the assembler programmer to allow for the delay slot; frequently, the instruction which would otherwise have been placed before the branch can be moved into the delay slot. Where nothing useful can be done, the delay slot is filled with a "nop" (no-op, or no-operation) instruction. Many MIPS assemblers will hide this feature from the programmer unless explicitly told not to, as described later.
Load data is not available to the next instruction: another consequence of the pipeline is that a load instruction's data arrives from the cache/memory system AFTER the next instruction's ALU phase starts, so it is not possible to use the data from a load in the following instruction. The figure on pipeline and load delays shows this sequence. In the MIPS-1 architecture, the programmer must ensure that this rule is not violated.

[Figure: pipeline and load delays (load, D-cache, load delay, use of data)]

Again, most assemblers will hide this if they can. Frequently, the assembler can move an instruction which is independent of the load into the load delay slot; in the worst case, it can insert a nop to ensure proper program execution. MIPS-2 does not require a nop to be placed in unfilled load delay slots.

To simplify assembly level programming, the MIPS Corp's assembler (and many other MIPS assemblers) provides a set of "synthetic" instructions. A synthetic instruction is a common assembly level operation that the assembler will map into one or more true machine instructions. This mapping can be more intelligent than a mere macro expansion. For example, an immediate load may map into one instruction if the datum is small enough, or multiple instructions if the datum is larger. These instructions can dramatically simplify assembly level programming and improve assembly code readability. This is obviously useful, but can be confusing. This manual will use synthetic instructions sparingly, and indicate when this happens. Moreover, the instruction tables below will consistently distinguish between synthetic and machine instructions.

These features are there to help human programmers; most compilers generate instructions which correspond one-for-one with machine code. However, some compilers will generate synthetic instructions. These are some of the helpful operations that the assembler can perform:

32-bit load immediates: the programmer can code a load with any value (including a memory location which will be computed at link time), and the assembler will break it down into two instructions to load the high and low halves of the value.

Load from memory location: the programmer can code a load from a memory-resident variable. The assembler will normally replace this by loading a temporary register with the high-order half of the variable's address, followed by a load whose displacement is the low-order half of the address.
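The "load from memory location" expansion just described has one subtlety: the displacement of a load is a signed 16-bit value, so when the low half of the address has its top bit set, the high half loaded into the temporary register must be one larger to compensate. The sketch below models that arithmetic. The function names are hypothetical, and the code models the address calculation only, not any particular assembler's output.

```python
def split_address(addr):
    """Split a 32-bit address into (high_half, low_half) such that
    (high_half << 16) + sign_extend(low_half) == addr."""
    addr &= 0xFFFFFFFF
    low = addr & 0xFFFF
    # Adding 0x8000 before shifting carries one into the high half
    # whenever the low half will be treated as negative.
    high = ((addr + 0x8000) >> 16) & 0xFFFF
    return high, low


def effective_address(high, low):
    """Address formed by loading high into the upper half of a temporary
    register and then using low as a signed load displacement."""
    base = (high << 16) & 0xFFFFFFFF
    displacement = low - 0x10000 if low & 0x8000 else low
    return (base + displacement) & 0xFFFFFFFF
```

For an address such as 0x80018000 the low half is 0x8000, which is negative as a displacement, so the high half becomes 0x8002 rather than 0x8001.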
Of course, this does not apply to variables defined inside C functions, which are implemented either in registers or on the stack.

Efficient access to memory variables: some C programs contain many references to static or extern variables, and a two-instruction sequence to load/store each of them is expensive. Some compilation systems, with run-time support, get around this. Certain variables are selected at compile/assemble time (by default MIPS Corp's assembler selects variables which occupy 8 or fewer bytes of storage) and kept together in a single section of memory which must end up smaller than 64Kbytes. The run-time system then initializes one register ($28, or gp (global pointer), by convention) to point to the middle of this section. Loads and stores to these variables can now be coded as a single gp-relative load or store.

More types of branch condition: the assembler synthesizes a full set of branches conditional on an arithmetic test between two registers.

Simple or different forms of instructions: unary operations such as "not" and "neg" are produced using the zero-valued register $0. Two-operand forms of 3-operand instructions can be written; the assembler will put the result back into the first-specified register.

Hiding the branch delay slot: in normal coding most assemblers will not allow access to the branch delay slot, and may re-organize the instruction sequence substantially in search of something useful to do in the delay slot. An assembler directive, .set noreorder, is available where this must not happen.

Hiding the load delay: many assemblers will detect an attempt to use the result of a load in the next instruction, and will either move code around or insert a nop (for MIPS-1).

Unaligned transfers: the "unaligned" load/store instructions will fetch halfword and word quantities correctly, even if the target address turns out to be unaligned.

Other pipeline corrections: some instructions (such as those which use the integer multiply unit) have additional constraints that are implementation specific (see the appendix on hazards). Many assemblers will just "handle" these cases automatically, or at least warn the programmer about possible hazard violations.

Other optimizations: some MIPS instructions (particularly floating point) take multiple clocks to produce results.
However, the hardware is "interlocked", so the programmer does not need to be aware of these delays to write correct programs. MIPS Corporation's assembler is particularly aggressive in these circumstances and will perform substantial code movement to make code run faster. This may need to be considered when debugging.

In general, it is best to use a disassembler utility to disassemble the resulting binary during debug. This will show system designers the true code sequence being executed and "uncover" the modifications made by the assembler.

This chapter describes the assembly programmer's view of the CPU architecture, in terms of registers, instructions, and computational resources. This viewpoint corresponds to an assembly programmer writing user applications. Information about kernel software development (such as handling interrupts, traps, and cache and memory management) is described in later chapters.

There are 32 general purpose registers: $0 to $31. These are 32 bits wide in the RC30xx, and 64 bits wide in the RC4xxx and RC5000. Two, and only two, are special to the hardware:

$0 always returns zero, and writes to it are ignored.

$31 is used by the normal subroutine-calling instructions (jal, bgezal, bltzal) for the return address. Note that the call-by-register version (jalr) can use any register for the return address, though it commonly also uses $31.

In all other respects, the registers are identical and can be used in any instruction.

There is no programmer-visible program counter. The subroutine transfer instructions store the return address in a link register, which is used to return from the subroutine. Also, there are no condition codes or status bits needed by the user-level programmer.

There are two registers associated with the integer multiplier. These registers, referred to as "HI" and "LO", contain the product of a multiply operation, or the quotient and remainder of a divide. The result of a multiplication is 128 bits in the case of the RC4xxx and 64 bits in the case of the RC3xxx. HI/LO also function as accumulators for the "multiply-accumulate" instructions mad/madu in the RC4650 and RC32364. The RC4650 and RC32364 also have a true 3-operand multiply instruction which does not use the HI/LO registers at all.
The floating point math co-processor (called the floating point accelerator, or FPA, and sometimes referred to in this manual as co-processor 1), if available, adds 32 floating point registers (see the note below); in simple assembler language they are just called $0 to $31 again, the fact that these are floating point registers being implicitly defined by the instruction.

Actually, in the case of the RC30xx, only the 16 even-numbered registers are usable for math; they can be used for either single-precision (32-bit) or double-precision (64-bit) numbers. When performing double-precision arithmetic, the instruction specifies the even-numbered register, and the next higher numbered register holds the high-order bits of the value. Only moves between the integer unit and the FPA, or FPA load/store instructions, will refer to odd-numbered registers.

The RC4600/4700/RC5000 offers full 64-bit operations, and its floating point unit can be configured in the following ways:

When the FR bit in the Status register equals 0, the floating point unit is configured for sixteen 64-bit registers for double-precision values or thirty-two 32-bit registers for single-precision values.

When the FR bit in the Status register equals 1, the floating point unit is configured for thirty-two 64-bit registers. Each register can hold single- or double-precision values.

The RC4650 supports single precision floating point math only. Its floating point unit can be configured in the following ways:

When the FR bit in the Status register equals 0, the floating point unit is configured for sixteen 32-bit single-precision registers.

When the FR bit in the Status register equals 1, the floating point unit is configured for thirty-two 32-bit single-precision registers.

Note: the FPA also has a different set of registers called "co-processor registers" for control purposes. These are typically used to manage the actions/state of the FPA, and should not be confused with the FPA data registers.

Some processors also support memory address translation hardware (RC30xx, RC32364, RC4600, RC4700, RC5000). The RC4650 only supports base-bounds translation. There are dedicated registers to handle memory address translation.

Conventional Names and Uses of General-Purpose Registers

Although the hardware makes few rules about the use of registers, their practical use is governed by a number of conventions. These conventions allow interchangeability of tools and operating systems as well as library modules; they are compiler calling conventions that must be strictly followed.
With the conventional uses of the registers go a set of conventional names. Given the need to fit in with the conventions, use of the conventional names is pretty much mandatory. The common names are described in Table 2.1.

Reg No   Name     Used for
0        zero     Always returns 0; writes are ignored.
1        at       (assembler temporary) Used by the assembler (for synthetic instruction expansion).
2-3      v0-v1    Values (except floating point) returned by a subroutine.
4-7      a0-a3    (arguments) First four parameters for a subroutine.
8-15     t0-t7    (temporaries) Subroutines may use these without saving.
24-25    t8-t9    (temporaries) Subroutines may use these without saving.
16-23    s0-s7    Subroutine "register variables"; a subroutine which will change one of these must save the old value and restore it before it exits, so the calling routine sees their values preserved.
26-27    k0-k1    Reserved for use by the interrupt/trap handler.
28       gp       Global pointer; some runtime systems maintain this to give easy access to "static" or "extern" variables.
29       sp       Stack pointer.
30       s8/fp    Another register variable. Subroutines which need one can use this as a "frame pointer".
31       ra       Return address for a subroutine.

Table 2.1 Conventional Register Names

at: this register is often used inside the synthetic instructions generated by the assembler. If the programmer must use it explicitly, the directive .set noat stops the assembler from using it (there are then some synthetic instructions that will cause the assembler to issue warnings).

v0-v1: used when returning non-floating-point values from a subroutine. To return anything bigger than two registers, memory must be used (described later in this chapter).

a0-a3: used to pass the first four integer parameters to a subroutine; a different arrangement applies to a mixture of integer and floating point parameters. The actual convention is fully described later in this chapter.

t0-t9: by convention, subroutines may use these registers without preserving them. This makes them easy to use as "temporaries" when evaluating expressions, but a caller must assume that they may be destroyed by a subroutine call.

s0-s8: by convention, subroutines must guarantee that the values of these registers on exit are the same as they were on entry, either by not using them, or by saving them on the stack and restoring them before exit.

k0-k1: reserved for use by the trap/interrupt routines, which will not restore their original value; so they are of little use to anyone else.

gp (global pointer).
Not all compilation systems and loaders support gp. If it is supported, it will point to a load-time-determined location in the midst of your static data. This means that loads and stores to data lying within 32Kbytes either side of the gp value can be performed in a single instruction using gp as the base register.

Without the global pointer, loading data from a static memory area takes two instructions: one to load the most significant bits of the 32-bit constant address computed by the compiler and loader, and one to do the data load.

To use gp, a compiler must know at compile time that a datum will end up linked within a 64Kbyte range of memory locations. In practice it can only guess. The usual practice is to put "small" global data items in the area pointed to by gp, and to get the linker to fail if it still gets too big. The definition of what is "small" can typically be specified with a compiler switch (most compilers use "-G"). The most common default size is 8 bytes or less.

sp (stack pointer). Since it takes explicit instructions to raise and lower the stack pointer, this is generally done only on subroutine entry and exit, and it is the responsibility of the subroutine being called to do it. sp is normally adjusted, on entry, to the lowest point that the stack will need to reach at any point in the subroutine. The compiler can then access stack variables by a constant offset from sp. Stack usage conventions are explained later in this chapter.

fp (also known as s8). A subroutine will use a "frame pointer" to keep track of the stack if it extends the stack at run-time. Some languages may do this explicitly; and (for many toolchains) C programs which use the "alloca" library routine will do so. In this case, it is not possible to access stack variables from sp, so fp is initialized by the function prologue to a constant position relative to the function's stack frame. Note that a "frame pointer" subroutine may call, or be called by, subroutines that do not use the frame pointer; in either case the called subroutine must preserve the value of fp.

ra (return address). On entry to any subroutine, ra holds the address to which control should be returned, so a subroutine typically ends with the instruction "jr ra". Subroutines which themselves call subroutines must first save ra, usually on the stack.

Integer Multiply Unit and Registers

The multiply unit consumes a small amount of die area but dramatically improves performance (and cache performance) over "multiply step" operations.
Its basic operation is to multiply two 32-bit values together to produce a 64-bit result, which is stored in two 32-bit registers (called "hi" and "lo") which are private to the multiply unit. Instructions mfhi and mflo are defined to copy the result out into general registers.

In the RC4xxx, two 64-bit values can be multiplied to produce a 128-bit result. However, in the case of the RC4xxx, if the operands are only 32 bits long, they must be valid sign-extended values. In high level language programming this is not an issue, since the compiler will take care of sign extension requirements; but it should be checked when porting assembler-level code from the RC30xx to the RC4xxx.

Unlike results for integer operations, the multiply result registers are interlocked. An attempt to read out the results before the multiplication is complete results in the CPU being stopped until the operation completes.

The integer multiply unit will also perform an integer division between values in two general-purpose registers; in this case the "lo" register stores the quotient, and the "hi" register the remainder.

In the RC30xx family, multiply operations take 12 clocks and division takes 35. Instruction cycle timing for multiply and double multiply (64-bit), as well as divide and double divide, for members of the RC4xxx family and the RC32364 is listed in the table "Multiply and Divide Instruction Cycle Timing". The 3-operand multiply (MUL) and multiply-add (MAD) are available in the RC4650 and RC32364 only. Multiply-subtract (MSUB) is available in the RC32364 only.

[Table: Multiply and Divide Instruction Cycle Timing. Rows: MULT/U, DIV/U, DMULT/U, DDIV/U, MAD/U, MSUB/U. The cycle counts are not present in this copy.]

The assembler has a synthetic multiply operation which starts the multiply and then retrieves the result into an ordinary register. Note that an assembler may even substitute a series of shifts and adds for multiplication by a constant, to improve execution speed.

Multiply/divide results are written into "hi" and "lo" as soon as they are available; the effect is not deferred until the writeback pipeline stage, as with writes to general purpose (GP) registers.
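The division of results between "hi" and "lo" described above can be modelled in a few lines. This is a sketch of the unsigned 32-bit case only (the multu/divu style of operation); the function names are hypothetical, and timing, interlocks and the signed variants are not modelled. Raising an exception on a zero divisor is a choice of this model: the hardware itself does not trap, which is why compilers add their own check as discussed later.

```python
MASK32 = 0xFFFFFFFF


def multiply_unsigned(rs, rt):
    """Return (hi, lo) after an unsigned 32 x 32 -> 64-bit multiply."""
    product = (rs & MASK32) * (rt & MASK32)
    hi = (product >> 32) & MASK32   # upper 32 bits of the product
    lo = product & MASK32           # lower 32 bits of the product
    return hi, lo


def divide_unsigned(rs, rt):
    """Return (hi, lo) after an unsigned divide:
    lo holds the quotient, hi holds the remainder."""
    rs &= MASK32
    rt &= MASK32
    if rt == 0:
        # Model-only behaviour; software is expected to test for this.
        raise ValueError("division by zero is not defined in this model")
    return rs % rt, rs // rt
```

For example, multiplying 0xFFFFFFFF by itself leaves 0xFFFFFFFE in "hi" and 0x00000001 in "lo"; an mflo alone would recover only the low half of that product.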
If an mfhi or mflo instruction is interrupted by some kind of exception before it reaches the writeback stage of the pipeline, it will be aborted with the intention of restarting it. However, a subsequent multiply instruction which has passed the ALU stage will continue (in parallel with exception processing) and would overwrite the "hi" and "lo" register values, so that the re-execution of the mfhi would get wrong (i.e. new) data. For this reason it is recommended that a multiply should not be started within two instructions of an mfhi/mflo. The assembler will avoid doing this when possible.

Compilers will often generate code to trap on errors, particularly on divide by zero. Frequently, this instruction sequence is placed after the divide is initiated, to allow it to execute concurrently with the divide (and avoid a performance loss).

Instructions mthi and mtlo are defined to set up the internal registers from general-purpose registers. They are essential to restore the values of "hi" and "lo" when returning from an exception, but probably not for anything else.

The RC4650 and RC32364 provide a couple of multiplication instructions that set them apart from other members of the family. The mad (multiply accumulate) instruction and its unsigned counterpart madu use the "hi" and "lo" registers as accumulators. In addition to these, another instruction, mul, offers a true 3-operand multiplication and eliminates the extra step of moving the result from the "lo" register to a general purpose register.

The RC32364 provides one more multiplication instruction: the MSUB (multiply subtract) instruction and its unsigned counterpart MSUBU. These are similar to MAD and MADU except that they subtract from the accumulator instead of adding to it.

Instruction Types

A full list of RC30xx family integer instructions is presented in an appendix of this manual, and floating point instructions are listed in a separate appendix. MIPS uses three instruction encoding formats. For the most part, instructions are listed in numerical order. Occasionally, to simplify reading, the list is re-ordered for clarity.

The instruction encodings have been chosen to facilitate the design of a high-frequency CPU. Specifically, the instruction encodings reveal portions of the internal CPU design.
Although there are variable encodings, those fields which are required very early in the pipeline are encoded in a very regular way:

Source registers are always in the same place, so that the CPU can fetch two operands from the integer register file without any conditional decoding. Some instructions may not need both registers, but since the register file is designed to provide two source values on every clock, nothing has been lost.

The 16-bit constant is always in the same place, permitting the appropriate instruction bits to be fed directly into the ALU's input multiplexer, without conditional shifts.

Throughout this manual, the description of various instructions will also refer to various subfields of the instruction, as follows:

op: the basic op-code, which is 6 bits long. Instructions with large sub-fields (for example, large immediate values, such as required for the "long" j/jal instructions, or arithmetic with a 16-bit constant) have a unique "op" field. Other instructions are classified in groups sharing an "op" value, distinguished by other fields ("op2" etc.).

Source register fields (such as rs1): fields identifying the source registers.

Destination register field: the register to be written by this instruction.

Shift-amount: how far to shift, used in shift-by-constant instructions.

op2: sub-code field used for the 3-register arithmetic/logical group of instructions ("op" value of zero).

offset: a 16-bit signed word offset defining the destination of a "PC-relative" branch. The branch target will be the instruction "offset" words away from the delay slot instruction; so a branch-to-self has an offset of -1.

target: a 26-bit word address to be jumped to (it corresponds to a 28-bit byte address, which is always word-aligned). The high-order 4 bits of the target address can't be specified by this instruction, and are taken from the address of the jump instruction. This means that these instructions can reach anywhere in the 256Mbyte region around the instruction's location. To jump further, use a jump register instruction.

constant: a 16-bit integer constant for "immediate" arithmetic or logic operations. Whether it is sign extended or zero extended depends on whether the operation is arithmetic or logical.

Co-processor opcode field: another extended opcode field, this time used by "co-processor" type instructions.

Co-processor register field: a field which may hold a source or destination register.

Control register field: a field to hold the number of a CPU control register (different from the integer register file).
This field is called "crs"/"crd" in contexts where it must be a source/destination respectively.

Loading and Storing: Addressing Modes

operation dest-reg, offset(src-reg)
e.g.: lw $1, offset($2); sw $3, offset($4)

As mentioned above, there is only one basic addressing mode. Any load or store machine instruction can be written in the form shown. Any of the integer registers can be used as destination and source. The offset is a signed, 16-bit number (so it can be anywhere between -32768 and 32767); the program address used for the load is the sum of src-reg and the offset. This address mode is normally enough to pick out a particular member of a C structure ("offset" being the distance between the start of the structure and the member required); to implement an array indexed by a constant; to reference function variables from the stack or frame pointer; and to provide a reasonably sized global area around the gp value for static and extern variables.

The assembler synthesizes a simple direct addressing mode, to load the values of memory variables whose address can be computed at link time.

More complex modes such as double-register or scaled index must be implemented with sequences of instructions.

Data Types in Memory and Registers

RC30xx family CPUs can load or store between 1 and 4 bytes in a single operation. The naming conventions used in the documentation, and to build instruction mnemonics, are as follows:

C name      MIPS name    Size (bytes)   Assembler mnemonic
long long   doubleword   8              "d" (see notes)
int         word         4              "w"
long        word         4              "w"
short       halfword     2              "h"
char        byte         1              "b"

Table: Naming Conventions

Notes: the doubleword type applies to the RC4xxx and RC5000 only. Some C compilers for the RC4xxx will allow efficient 64-bit integer math with a special compile-time switch (e.g. the mint64 switch in IDT/C), where the integer size is 8 bytes and the assembler instructions "ld/sd" are used to load/store 8 bytes at a time. These are MIPS-III instructions, available on the RC4xxx.

Byte and halfword loads come in two flavors:

Sign-extend: the lb and lh instructions load the value into the least significant bits of the 32/64-bit register, but fill the high order bits by copying the "sign bit" (bit 7 of a byte, bit 15 of a half-word). This correctly converts a signed value to a 32/64-bit signed integer.

Zero-extend: the lbu and lhu instructions load the value into the least significant bits of a 32/64-bit register, with the high order bits filled with zero. This correctly converts an unsigned value in memory to the corresponding 32/64-bit unsigned integer value; so byte value 254 becomes 32/64-bit value 254.
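The two load flavors can be sketched as simple bit manipulation on a 32-bit register. The function names below are hypothetical and stand in for the effect of the sign-extending and zero-extending byte and halfword loads; only the 32-bit register case is modelled.

```python
def load_byte_signed(byte):
    """Sign-extending byte load: copy bit 7 into bits 31..8."""
    byte &= 0xFF
    return byte | 0xFFFFFF00 if byte & 0x80 else byte


def load_byte_unsigned(byte):
    """Zero-extending byte load: bits 31..8 are cleared."""
    return byte & 0xFF


def load_half_signed(half):
    """Sign-extending halfword load: copy bit 15 into bits 31..16."""
    half &= 0xFFFF
    return half | 0xFFFF0000 if half & 0x8000 else half


def load_half_unsigned(half):
    """Zero-extending halfword load: bits 31..16 are cleared."""
    return half & 0xFFFF
```

The two flavors differ only when the top bit of the memory value is set; for values below 0x80 (or 0x8000 for halfwords) they give the same register contents.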
For example, if the byte at the address in t1 holds the value 0xFE (-2, or 254 if interpreted as unsigned), then:

lb t2, 0(t1)
lbu t3, 0(t1)

will leave t2 holding the value 0xFFFF FFFE (-2 as signed 32-bit) and t3 holding the value 0x0000 00FE (254 as signed or unsigned 32-bit).

Subtle differences in the way shorter integers are extended to longer ones are a historical cause of C portability problems, and the modern C standard has elaborate rules. On machines like the MIPS, which does not support 8- or 16-bit precision arithmetic directly, expressions involving short or char variables are less efficient than word operations.

All loads and stores in the MIPS architecture must be aligned. Half-words may be loaded only from 2-byte boundaries, and words only from 4-byte boundaries; in the RC4xxx family, double words may be loaded only from 8-byte boundaries. A load instruction with an unaligned address will cause a trap. If needed, software can provide a trap handler which will emulate the desired load operation and hide this feature from the application, at a substantial performance cost.

The MIPS architecture provides a hardware mechanism to access unaligned data. The machine instructions are lwl (load word left), lwr (load word right), swl (store word left) and swr (store word right). In the RC4600/4700/5000, there are equivalent 64-bit instructions ldl (load double left), ldr (load double right), sdl (store double left) and sdr (store double right), which deal with 8 bytes as opposed to the 4 described in this section.

lwl loads up to four bytes from the least significant portion of the word, starting from the specified address, into the high (left) portion of the destination register; lwr loads up to four bytes from the most significant portion of the word, starting from the specified address, into the low (right) portion of the register. To load a word into register t0 from an arbitrary address in register a0, use the sequence

lwl t0, 0(a0)
lwr t0, 3(a0)

on a big endian machine, or the sequence

lwr t0, 0(a0)
lwl t0, 3(a0)

on a little endian machine.

[Figure: unaligned load, showing memory and register contents]

This sequence is generated by the macro-instruction ulw (unaligned load word). A macro-instruction ulh (unaligned load half) is also provided, and is synthesized by two loads and a shift. Note that the CPU allows the lwl/lwr instruction pair to target the same destination register without an intervening instruction; however, at least one instruction must be executed between the instruction pair and any use of the value in the destination register.
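The way the left/right pair assembles a word can be made concrete with a small model of the big endian case. This is a sketch under stated assumptions: memory is a Python bytes object indexed by byte address, registers are 32-bit integers, and the function names (load_word_left_be, load_word_right_be, unaligned_load_word_be) are hypothetical. It models only which bytes land where, not pipeline timing.

```python
MASK32 = 0xFFFFFFFF


def load_word_left_be(reg, mem, addr):
    """Big endian "load word left": bytes from addr to the end of its
    aligned word go into the high (left) end of the register; the
    remaining low bytes of the register are left unchanged."""
    k = addr & 3                   # byte position within the aligned word
    count = 4 - k                  # bytes fetched from this word
    data = int.from_bytes(mem[addr:addr + count], "big")
    keep = (1 << (8 * k)) - 1      # low k bytes of reg are preserved
    return ((data << (8 * k)) | (reg & keep)) & MASK32


def load_word_right_be(reg, mem, addr):
    """Big endian "load word right": bytes from the start of the aligned
    word up to addr go into the low (right) end of the register; the
    remaining high bytes of the register are left unchanged."""
    k = addr & 3
    base = addr - k                # start of the aligned word
    count = k + 1                  # bytes fetched from this word
    data = int.from_bytes(mem[base:addr + 1], "big")
    keep = MASK32 & ~((1 << (8 * count)) - 1)
    return (reg & keep) | data


def unaligned_load_word_be(mem, addr):
    """Model of the left/right pair: left at addr, right at addr + 3."""
    reg = load_word_left_be(0, mem, addr)
    return load_word_right_be(reg, mem, addr + 3)
```

When the address happens to be aligned, both halves of the pair read the same word and each one alone fills the whole register, so the sequence is safe to use whether or not the data is actually misaligned.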
swl stores up to four bytes from the high (left) portion of the source register to the least significant portion of the word, starting from the specified address; swr stores up to four bytes from the low (right) portion of the register to the most significant portion of the word, starting from the specified address. To store a word from register t0 to an arbitrary address in register a0, use the sequence

swl t0, 0(a0)
swr t0, 3(a0)

on a big endian machine, or the sequence

swr t0, 0(a0)
swl t0, 3(a0)

on a little endian machine.

[Figure: unaligned store, showing memory and register contents]

Note that swl/swr rely on hardware byte enables to control the effect of partial word writes; they will not work if the destination device does not honor byte enables, whereas sw will work with any word-wide device.

All data items declared by C code will be correctly aligned by default. But in certain embedded applications such as intelligent networking or datacom, data structures may be forced to have unaligned data, or data structures may be packed, with no padding bytes between data structures or between fields within a structure to force alignment, in order to minimize memory usage. In such cases, the programmer would otherwise have to descend to assembler coding to deal with unaligned data accesses. Some compilers, such as the IDT/C compiler, provide a mechanism to achieve unaligned data accesses through C itself.

The keyword __attribute__ allows the programmer to specify special attributes of variables or structure fields. This keyword is followed by an attribute specification inside double parentheses. The attribute of interest for achieving unaligned data accesses is "packed". The "packed" attribute forces byte alignment on the fields of a data structure. The compiler uses lwl/lwr for loading and swl/swr for storing unaligned data.

The following code does not use the "packed" attribute.
Study the assembler code generated after compiling:

    /* Begin code */
    struct {
        char a;
        int x[2];
    } foo;

Here is the code, and the generated assembler code, when "packed" is used:

    /* Begin code */
    struct {
        char a;
        int x[2] __attribute__ ((packed));
    } foo;

    main()
    {
        foo.a = 65;
        foo.x[0] = 18;
        foo.x[1] = 37;
    }
    /* end code; begin partial listing of assembler code generated from the above code */

    800201c8 <main+18> li  $v1,65
    800201cc <main+1c> sb  $v1,0($v0)
    800201d0 <main+20> li  $v1,18
    800201d4 <main+24> swl $v1,1($v0)    /* note offset of 1 byte */
    800201d8 <main+28> swr $v1,4($v0)
    800201dc <main+2c> li  $v1,37
    800201e0 <main+30> swl $v1,5($v0)
    800201e4 <main+34> swr $v1,8($v0)
    /* end assembler code */

The IDT/C compiler is efficient enough to recognize that if a field is larger than a certain number of bytes, it is better not to use lwl/lwr and swl/swr pairs for the entire data transfer. Instead it uses the pairs only up to the point of reaching word alignment, beyond which the regular lw and sw instructions prove more efficient, until the point where fewer than 4 bytes remain, which are transferred using lwl/lwr and swl/swr again.

Note that the "packed" attribute works only on structures and not on simple variables such as int or char. To achieve packing on a simple variable, put it inside a structure with that variable as the only element.

For floating point data, the programmer can load single-precision values by a load into an even-numbered floating point register; the programmer can also load a double-precision value by a macro instruction, so that:

    ldc1 $f2, 24(t1)

is expanded to two loads to consecutive registers:

    lwc1 $f2, 24(t1)
    lwc1 $f3, 28(t1)

The C compiler aligns 8-byte long double-precision floating point variables to 8-byte boundaries. RC30xx family hardware does not require this alignment; it is done to avoid compatibility problems with implementations of MIPS-2 or MIPS-3 CPUs such as the RC4600 (64-bit RISController), where the ldc1 instruction is a machine instruction and the alignment is necessary.

Basic Address Space (RC3xxx)

The way in which MIPS processors use and handle addresses is subtly different from that of traditional CISC CPUs, and may appear confusing. Read the first part of this section carefully. Here are some guidelines:

The addresses put into programs are rarely the same as the physical addresses which come out of the chip (sometimes they're close, but not the same).
This manual will refer to them as program addresses and physical addresses respectively. A more common name for program addresses is "virtual addresses"; note that the term "virtual address" does not necessarily imply that an operating system must perform virtual memory management (e.g. demand paging from disks), but rather that the address undergoes some transformation before being presented to physical memory. Although virtual address is a proper term, this manual will typically use the term "program address" to avoid confusing virtual addresses with virtual memory management requirements. However, it should be remembered that the CPU always uses virtual (program) addresses, which are translated to physical addresses.

There are two typical operating modes: user and kernel. In user mode, any address above 2Gbytes (most-significant address bit set) is illegal and causes a trap. Also, some instructions cause a trap in user mode.

The 32-bit program address space is divided into four big areas with traditional names; different things happen according to the area an address lies in:

kuseg, 0x0000 0000 to 0x7FFF FFFF (low 2Gbytes): these are the addresses permitted in user mode. In machines with an MMU, they will always be translated (more about the MMU in a later chapter). Software should not attempt to use these addresses unless the MMU is set up. For RC30xx CPUs without an MMU, the kuseg "program address" is transformed to a physical address by adding a fixed offset; the address transformations for "base versions" of the RC30xx family are described later in this chapter. Note, however, that many embedded applications do not use this address segment (those applications which do not require that the kernel and its resources be protected from user tasks).

kseg0, 0x8000 0000 to 0x9FFF FFFF (512 Mbytes): these addresses are "translated" into physical addresses by merely stripping off the top bit, mapping them contiguously into the low 512 Mbytes of physical memory. This transformation operates the same for both "base" and "E" family members. This segment is referred to as "unmapped" because "E" version devices cannot redirect this translation to a different area of physical memory. Addresses in this region are always accessed through the cache, so they may not be used until the caches are properly initialized.
They will be used for most programs and data in systems using "base" family members, and will be used for the OS kernel in systems which do use the MMU ("E" version devices).

kseg1, 0xA000 0000 to 0xBFFF FFFF (512 Mbytes): these addresses are mapped into physical addresses by stripping off the leading three bits, giving a duplicate mapping of the low 512 Mbytes of physical memory. However, kseg1 program address accesses will not use the cache. The kseg1 region is the only chunk of the memory map which is guaranteed to behave properly from system reset; that's why the after-reset starting point (0xBFC0 0000, commonly called the "reset exception vector") lies within it. The physical address of the starting point is 0x1FC0 0000, which means that the hardware should place the boot ROM at this physical address. Software will therefore use this region for the initial program ROM, and most systems also use it for I/O registers. In general, I/O devices should always be mapped to addresses that are accessible from kseg1, and the system ROM is always mapped to contain the reset exception vector. Note that code in the ROM can then be accessed uncacheably (during boot up) using kseg1 program addresses, and can also be accessed cacheably (for normal operation) using kseg0 program addresses.

kseg2, 0xC000 0000 to 0xFFFF FFFF (1 Gbyte): this area is only accessible in kernel mode. As for kuseg, in "E" devices program addresses are translated by the MMU into physical addresses; thus, these addresses must not be referenced prior to MMU initialization. For "base versions", physical addresses are generated to be the same as program addresses for kseg2. Note that many systems will not need this region. In "E" versions, it frequently contains OS structures such as page tables; simpler OS'es probably will have little need for kseg2. In the case of the RC32364, kseg2 actually runs from 0xC000 0000 to 0xFEFF FFFF (1008 Mbytes). The 16 Mbytes of space beyond that is reserved for memory mapped on-chip registers and the in-circuit emulator.

Summary of RC3xxx System Addressing

MIPS program addresses are rarely simply the same as physical addresses, but simple embedded software will probably use addresses in kseg0 and kseg1, where the program address is related in an obvious and unchangeable way to physical addresses.

Physical memory locations from 0x2000 0000 (512Mbyte) upward may be difficult to access.
In "E" versions of the RC30xx family, the only way to reach these addresses is through the MMU. In "base" family members, certain of these physical addresses can be reached using kseg2 or kuseg addresses: the address transformations for base RC30xx family members are described later in this chapter.

In kernel mode (the CPU resets into this state), all program addresses are accessible.

In user mode:

Program addresses above 2Gbytes (top bit set) are illegal and will cause a trap. Note that if the CPU has an MMU, this means all valid user mode addresses must be translated by the MMU; thus, User mode for "E" devices and the RC32364 typically requires the use of a memory-mapped OS. For "base" CPUs, kuseg addresses are mapped to a distinct area of physical memory. Thus, kernel memory resources (including I/O devices) can be made inaccessible to User mode software, without requiring a memory-mapping function from the OS. Alternately, the hardware can choose to "ignore" high-order address bits when performing address decoding, thus "condensing" kuseg, kseg2, kseg1, and kseg0 into the same physical memory.

Instructions beyond the standard user set become illegal. Specifically, the kernel can prevent User mode software from accessing the on-chip CP0 (system control coprocessor, which controls exception and machine state and performs the memory management functions of the CPU).

Thus, the primary differences between User and Kernel modes are:

User mode tasks can be inhibited from accessing kernel memory resources, including OS data structures and I/O devices. This also means that various user tasks can be protected from each other.

User mode tasks can be inhibited from modifying the basic machine state, by prohibiting accesses to CP0.

Note that the kernel/user mode bit does not change the interpretation of anything; just some things cease to be allowed in user mode. In kernel mode the CPU can access low addresses just as if it were in user mode, and they will be translated in the same way.

The MIPS Architecture treatment of kseg0 and kseg1 addresses is the same for all RC30xx CPUs. If the system can be implemented using only physical addresses in the low 512Mbytes, and system software can be written to use only kseg0 and kseg1, then the choice of "base" or "E" versions of the RC30xx family is not relevant.

In versions without the MMU ("base versions"), addresses in kuseg and kseg2 will undergo a fixed address translation, and provide the system designer the option to provide additional memory.
The base members of the RC30xx family provide the following address translations for kuseg and kseg2 program addresses:

kuseg: this region (the low 2Gbytes of program addresses) is translated to a contiguous 2Gbyte physical region between 1-3Gbytes. In effect, a 1Gbyte offset is added to each kuseg program address. In hex:

    Program address                 Physical address
    0x0000 0000 - 0x7FFF FFFF  ->   0x4000 0000 - 0xBFFF FFFF

kseg2: these program addresses are genuinely untranslated. So program addresses from 0xC000 0000 to 0xFFFF FFFF emerge as identical physical addresses.

This means that "base" versions can generate most physical addresses (without the use of an MMU), except for those between 512Mbyte and 1Gbyte (0x2000 0000 through 0x3FFF FFFF). As noted above, many systems may ignore high-order address bits when performing address decoding, thus condensing all physical memory into the lowest 512MB of addresses.

The RC3041 and RC32364 CPUs can be configured to access different regions of memory as either 32-, 16-, or 8-bits wide. Where the program requests a 32-bit operation to a narrow memory (either with an uncached access, a cache miss, or a store), the CPU may break the transaction into multiple data phases, to match the datum size to the memory port width.

The width configuration is applied independently to subsegments of the normal kseg regions, as follows:

kseg0 and kseg1: as usual, these are both mapped onto the low 512Mbytes. This common region is split into 8 subsegments (64Mbytes each), each of which can be programmed as 8-, 16-, or 32-bits wide. The width assignment affects both kseg0 and kseg1 accesses (that is, one can view these as subsegments of the corresponding "physical" addresses).

kuseg: is divided into four 512Mbyte subsegments, each independently programmable for width. Thus, kuseg can be broken into multiple portions, which may have varying widths. An example of this may be a 32-bit main memory with some 16-bit PCMCIA font cards and an 8-bit NVRAM.

kseg2: is divided into two 512Mbyte subsegments, independently programmable for width. Again, this means that kseg2 can support multiple memory subsystems, of varying port width.

Note that once the various memory port widths have been configured (typically at boot time), software does not have to be aware of the actual width of any memory system.
It can choose to treat all memory as 32-bit wide, and the CPU will automatically adjust when an access is made to a narrower memory region. This simplifies software development, and also facilitates porting to various system implementations (which may not choose the same memory port widths).

RC36100 Address Translation

When the 36100 processor is operating in Kernel mode, four distinct virtual address segments are simultaneously available. The segments are:

kuseg. The kernel may assert the same virtual address as a user process, and have the same virtual-to-physical address translation performed for it as the translation for the user task. This facilitates the kernel having direct access to user memory regions. The virtual-to-physical address translation, including Port Size attributes, is identical with User mode addressing to this segment.

kseg0. Kseg0 is a 512MB segment, beginning at virtual address 0x8000_0000. This segment is always translated to a linear 512MB region of the physical address space starting at physical address 0. All references through this segment are cacheable. When the most significant three bits of the virtual address are "100", the virtual address resides in kseg0. The physical address is constructed by replacing these three bits of the virtual address with the value "000". As these references are cacheable, kseg0 is typically used for kernel executable code and some kernel data.

kseg1. Kseg1 is also a 512MB segment, beginning at virtual address 0xa000_0000. This segment is also translated directly to the 512MB physical address space starting at address 0. All references through this segment are uncacheable. When the most significant three bits of the virtual address are "101", the virtual address resides in kseg1. The physical address is constructed by replacing these three bits of the virtual address with the value "000". Unlike kseg0, references through kseg1 are not cacheable. This segment is typically used for I/O registers, boot ROM code, and operating system data areas such as disk buffers.

kseg2. This segment is analogous to kuseg, but is accessible only from kernel mode. This segment contains 1GB of linear addresses, beginning at virtual address 0xc000_0000.
As with kuseg, the virtual-to-physical address translation depends on whether the processor is a base or extended architecture version. When the most significant two bits of the virtual address are "11", the virtual address resides in the 1024MB segment kseg2. The virtual-to-physical translation is done either through the TLB (extended versions of the processor) or through a direct segment mapping (base versions). An operating system would typically use this segment for stacks, per-process data that must be remapped at context switch, user page tables, and some dynamically allocated data areas.

Base versions of the RC30xx family (including the RC36100) are distinguishable from extended versions in software by examining the TS (TLB Shutdown) bit of the Status Register after reset, before the TLB is used. If the TS bit is set immediately after reset, indicating that the TLB is non-functional, then the current processor is a base version of the architecture. If the TS bit is cleared after reset, then the software is executing on an extended architecture version of the processor. The Processor Revision Identifier (PRId) register can be used to distinguish the RC36100 from other members of the RC30xx family.

Processors that only implement the base versions of memory management perform direct segment mapping of virtual-to-physical addresses, as illustrated in Figure 2.1. The mapping of kuseg and kseg2 is performed as follows:

Kuseg is always translated to a contiguous 2GB region of the physical address space, beginning at location 0x4000_0000. That is, the value "00" in the two highest order bits of the virtual address is translated to the value "01", and "01" is translated to "10", with the remaining 30 bits of the virtual address unchanged.

Virtual addresses in kseg2 are directly output as physical addresses; that is, references to kseg2 occur with the physical address unchanged from the virtual address.

Virtual addresses in kseg0 and kseg1 are both translated identically to the same physical address region.

The base versions of the architecture allow kernel software to be protected from user mode accesses, without requiring virtual page management software. User references to a kernel virtual address will result in an address error exception.
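The fixed segment mapping of the base versions can be summarised in a few lines of C. This is only an illustrative sketch of the rules stated above: the names segment_of and base_translate are hypothetical, the special on-chip register and cache-miss areas of the RC36100 are ignored, and the code models the address arithmetic rather than any real hardware interface.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, used only for this illustration. */
typedef enum { SEG_KUSEG, SEG_KSEG0, SEG_KSEG1, SEG_KSEG2 } segment_t;

/* Classify a 32-bit program (virtual) address by its top bits. */
static segment_t segment_of(uint32_t va)
{
    if ((va >> 31) == 0)
        return SEG_KUSEG;            /* top bit clear: low 2 Gbytes */
    switch (va >> 29) {
    case 0x4: return SEG_KSEG0;      /* "100": unmapped, cached     */
    case 0x5: return SEG_KSEG1;      /* "101": unmapped, uncached   */
    default:  return SEG_KSEG2;      /* "11x": top 1 Gbyte          */
    }
}

/* Program-to-physical translation as performed by a "base" (no TLB) CPU. */
static uint32_t base_translate(uint32_t va)
{
    switch (segment_of(va)) {
    case SEG_KUSEG: {
        uint32_t pa = va + 0x40000000u;          /* add the 1 Gbyte offset */
        assert(pa >= 0x40000000u && pa <= 0xBFFFFFFFu);
        return pa;
    }
    case SEG_KSEG0:
    case SEG_KSEG1:
        return va & 0x1FFFFFFFu;                 /* top three bits -> "000" */
    default:
        return va;                               /* kseg2: untranslated     */
    }
}
```

For example, base_translate(0xBFC00000u) yields 0x1FC00000, the physical location of the boot ROM, and no program address maps into the physical range 0x2000 0000 through 0x3FFF FFFF.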
Note that the special areas of the virtual address space shown in Figure 2.1 are translated to physical addresses identically with the remainder of their virtual address segment. In the RC30xx family, these address areas were indicated as "reserved" for compatibility with future devices.

[Figure 2.1: Virtual-to-Physical Address Translation in the RC36100. Virtual side: on-chip registers (uncached) at 0xfff00000 - 0xffffffff, Kernel Cached (kseg2) from 0xc0000000, Kernel Uncached (kseg1) from 0xa0000000, Kernel Cached (kseg0) from 0x80000000, Kernel/User Cached (kuseg) from 0x00000000. Physical side: on-chip registers at 0xfff00000 - 0xffffffff, Kernel Cached from 0xc0000000, Kernel/User Cached from 0x40000000 to 0xbfffffff, Inaccessible from 0x20000000 to 0x3fffffff, Kernel and Boot from 0x00000000 to 0x1fffffff.]

Some systems may elect to protect external physical memory as well. That is, the system may include distinct memory devices which can only be accessed from kernel mode. The physical address output determines whether the reference occurred from kernel or user mode, according to Table 2.4. Some systems may wish to limit accesses to some memory or I/O devices to those physical address bits which correspond to kernel mode virtual addresses.

Alternately, some systems may wish to have the kernel and user tasks share common areas of memory. Those systems could choose to have their address decoder ignore the high-order physical address bits, and compress all of memory into the lower region of physical memory. The high-order physical address bits may be useful as privilege mode status outputs in these systems.

    Physical address (31:29)    Virtual address segment
    '000'                       Kseg0 or Kseg1
    '001'                       Inaccessible
    '01x'                       Kuseg
    '10x'                       Kuseg
    '11x'                       Kseg2

Table 2.4: Virtual and Physical Address Relationships in Base Versions

Basic Address Space in the RC4600/RC4700

Readers interested in the RC4x00 who have skipped the preceding sections because those sections pertain to the RC30xx are advised to review those sections before proceeding.
Some of the general comments regarding the MIPS architecture in those sections are relevant even to RC4xxx processors.

Unlike the RC30xx family, the RC4xxx family does not have "base versions." All RC4600/RC4700 processors have memory management units (MMU). The RC4600/RC4700 uses an on-chip Translation Lookaside Buffer (TLB) to translate program addresses to physical addresses.

The RC4600/RC4700 has three modes of operation: User, Supervisor and Kernel. In the RC4600/RC4700, the program address space is either 32-bits or 64-bits wide depending on the mode of operation and the setting of the corresponding extended address bit in the Status Register (UX, SX, KX); if the bit is 0, addresses are 32-bits wide, and if it is 1, they are 64-bits wide. With a 36-bit Physical Address, a total of 64 Gigabytes of physical address space is available. Depending on the mode of operation of the processor, different program address spaces become available, as follows:

User mode: In User mode, a single, contiguous program address space called User Space is available. Its size is 2 Gbytes (2^31) in 32-bit mode, where it is called useg. In 64-bit mode its size is 1Tbyte (2^40) and the space label is xuseg. Legal 32-bit addresses are 0x0000 0000 to 0x7FFF FFFF, and legal 64-bit addresses are 0x0000 0000 0000 0000 to 0x0000 00FF FFFF FFFF. Presenting addresses outside these ranges while the processor is in User mode results in an Address Error exception. Cache accessibility is controlled by settings in the TLB entries.

Supervisor mode: Supervisor mode is designed for layered operating systems in which a true kernel runs in Kernel mode as described later, and the rest of the OS runs in Supervisor mode. In 32-bit Supervisor mode, two spaces named User Space and Supervisor Space can be addressed. Their labels are suseg and sseg respectively. The 2 Gbytes of suseg lie between 0x0000 0000 and 0x7FFF FFFF. sseg is 512 Mbytes, from 0xC000 0000 to 0xDFFF FFFF. In 64-bit Supervisor mode, three spaces named User Space (xsuseg), Current Supervisor Space (xsseg) and Separate Supervisor Space (csseg) are available. The 1 Tbyte xsuseg runs from 0x0000 0000 0000 0000 to 0x0000 00FF FFFF FFFF. xsseg goes from 0x4000 0000 0000 0000 to 0x4000 00FF FFFF FFFF, and is also 1 Tbyte long. Addressing csseg is compatible with addressing sseg in 32-bit mode; it begins at 0xFFFF FFFF C000 0000 and ends at 0xFFFF FFFF DFFF FFFF, covering 512 Mbytes.
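The User and Supervisor address ranges listed above can be expressed as simple range checks. The sketch below is illustrative only: the function names are hypothetical, 32-bit mode addresses are treated as plain 32-bit values, and the checks say nothing about whether a TLB mapping actually exists for a legal address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define XUSEG_TOP 0x000000FFFFFFFFFFull
static_assert(XUSEG_TOP == (1ull << 40) - 1, "xuseg/xsuseg is 1 Tbyte (2^40)");

/* 32-bit User mode: only useg is legal. */
static bool user32_legal(uint32_t va)
{
    return va <= 0x7FFFFFFFu;
}

/* 32-bit Supervisor mode: suseg or sseg. */
static bool super32_legal(uint32_t va)
{
    return va <= 0x7FFFFFFFu ||                         /* suseg */
           (va >= 0xC0000000u && va <= 0xDFFFFFFFu);    /* sseg  */
}

/* 64-bit User mode: only xuseg is legal. */
static bool user64_legal(uint64_t va)
{
    return va <= XUSEG_TOP;
}

/* 64-bit Supervisor mode: xsuseg, xsseg or csseg. */
static bool super64_legal(uint64_t va)
{
    return va <= XUSEG_TOP ||                                               /* xsuseg */
           (va >= 0x4000000000000000ull && va <= 0x400000FFFFFFFFFFull) ||  /* xsseg  */
           (va >= 0xFFFFFFFFC0000000ull && va <= 0xFFFFFFFFDFFFFFFFull);    /* csseg  */
}
```

An address for which the relevant function returns false is one that, according to the text, would raise an Address Error exception in that mode.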
Kernel mode: The processor enters Kernel mode when the ERL bit is set, or the EXL bit is set, or the KSU Mode bits select Kernel. On exceptions, either EXL or ERL will be set. The processor remains in exception mode until an instruction to return from exception (eret) is executed, at which point the mode existing prior to the detection of the exception is restored. The Kernel-mode program address space is shown in Figure 2.2.

[Figure 2.2: Kernel Mode Address Space. 32-bit mode: kuseg (from 0x0000 0000, mapped), kseg0 (from 0x8000 0000, unmapped, cached), kseg1 (from 0xA000 0000, unmapped, uncached), ksseg (from 0xC000 0000, mapped), kseg3 (from 0xE000 0000 to 0xFFFF FFFF, mapped). 64-bit mode: xkuseg (from 0x0000 0000 0000 0000, mapped), xksseg (from 0x4000 0000 0000 0000, mapped), xkphys (from 0x8000 0000 0000 0000, unmapped), xkseg (from 0xC000 0000 0000 0000, mapped), ckseg0 (from 0xFFFF FFFF 8000 0000, unmapped, cached), ckseg1 (from 0xFFFF FFFF A000 0000, unmapped, uncached), cksseg (from 0xFFFF FFFF C000 0000, mapped), ckseg3 (from 0xFFFF FFFF E000 0000, mapped); the gaps between the 64-bit segments cause an Address Error.]

References to kseg0 and kseg1 are not mapped through the TLB. The physical address is defined by the low-order bits of the program address in kseg0 and kseg1. The cacheability and coherency of kseg0 is determined by settings in the Config register, while kseg1 is never cacheable.

The 64-bit xkuseg offers a special feature for the cache error handler. If the ERL bit of the Status register is set, the segment becomes an unmapped, uncached space, allowing the exception code to operate uncached using r0 as a base register.

The segment xkphys is a set of eight physical spaces, each 2^36 bytes long. References to these spaces do not go through the TLB; the physical address is taken from bits 35:0. Bits 61:59 of the program address determine cacheability and coherency as shown in the following table. The regions cksegx are compatible with their 32-bit counterparts ksegx.

    Value (61:59)   Cacheability and Coherency Attributes                       Starting Address
    0               Cacheable, noncoherent, write-through, no write allocate    0x8000 0000 0000 0000
    1               Cacheable, noncoherent, write-through, write allocate       0x8800 0000 0000 0000
    2               Uncached                                                    0x9000 0000 0000 0000
    3               Cacheable, noncoherent                                      0x9800 0000 0000 0000
    4-7             Reserved                                                    0xA000 0000 0000 0000

Table: Cacheability and Coherency Attributes of xkphys
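Decoding an xkphys program address amounts to two bit-field extractions. The following sketch assumes only what the text states (bits 61:59 select the cache attribute, bits 35:0 form the physical address) plus the fact, read from the segment start address 0x8000 0000 0000 0000, that xkphys addresses have bits 63:62 equal to "10". The function names are hypothetical, and the sketch does not model the address-error regions inside each space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PHYS_ADDR_BITS 36
#define PHYS_ADDR_MASK ((1ull << PHYS_ADDR_BITS) - 1)   /* bits 35:0 */
static_assert(PHYS_ADDR_MASK == 0xFFFFFFFFFull, "36-bit physical address");

/* xkphys occupies the addresses whose bits 63:62 are "10". */
static bool is_xkphys(uint64_t va)
{
    return (va >> 62) == 0x2;
}

/* Bits 61:59 select the cacheability/coherency attribute (0..7). */
static unsigned xkphys_cache_attr(uint64_t va)
{
    assert(is_xkphys(va));
    return (unsigned)((va >> 59) & 0x7);
}

/* The physical address is taken directly from bits 35:0; no TLB lookup. */
static uint64_t xkphys_to_phys(uint64_t va)
{
    assert(is_xkphys(va));
    return va & PHYS_ADDR_MASK;
}
```

For instance, the program address 0x9000 0000 0000 1000 decodes to attribute value 2 (uncached) and physical address 0x1000.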
Basic Address Space in the RC4650

Readers interested in the RC4650 who have skipped the sections regarding RC30xx addressing a few pages back are advised to review those sections before proceeding. Some of the general comments regarding the MIPS architecture in those sections are relevant even to the RC4650.

The RC4650 employs a simple mechanism to support the mapping of program addresses to physical addresses. The TLB found in the RC4600/RC4700 is replaced by a "base-bounds" mechanism. When a program address is translated, the page number is first compared against the Bounds register. If the address is "in range," the base register is added to the program address to form the physical address. There is one set of base-bound registers for instruction addresses (the IBase and IBounds registers) and another for data (DBase and DBounds). In addition to these registers, the Cache Algorithm (CAlg) register allows a mix of cache attributes in a single system.

In this processor program addresses are 32-bits wide; the upper 32-bits of 64-bit registers are ignored. The physical address space is 4 Gbytes. The RC4650 has two operating modes, User mode and Kernel mode. The address spaces are defined as follows:

useg: The address space from 0x0000 0000 to 0x7FFF FFFF (2 Gbytes) is labelled useg in User mode. This is the only space available in User mode. The same address space is available from Kernel mode as well, where its label is kuseg.

kseg0: The 512 Mbyte address space 0x8000 0000 through 0x9FFF FFFF is defined as kseg0 and is accessible in Kernel mode only. Addresses in kseg0 are not mapped using the base-bounds mechanism; their physical addresses are calculated by setting the upper bits of the program addresses to zero. The CAlg register controls cacheability of this segment. At reset kseg0 is cacheable.

kseg1: The 512 Mbyte address space 0xA000 0000 through 0xBFFF FFFF is defined as kseg1 and is accessible in Kernel mode only. Addresses in kseg1 are not mapped using the base-bounds mechanism; their physical addresses are calculated by setting the upper bits of the program addresses to zero. The CAlg register controls cacheability of this segment. At reset caches are disabled for the kseg1 address space, but this can be changed later using the CAlg register.

kseg2: The 1 Gbyte address space 0xC000 0000 through 0xFFFF FFFF is defined as kseg2 and is accessible in Kernel mode only.
Addresses in kseg2 are not mapped using the base-bounds mechanism; their physical addresses are calculated by setting the upper bits of the program addresses to zero. The CAlg register controls cacheability of this segment. The figure below shows the kernel mode address space.

[Figure: Kernel Mode Address Space of the RC4650. kuseg (from 0x0000 0000, mapped), kseg0 (from 0x8000 0000, unmapped, cached*), kseg1 (from 0xA000 0000, unmapped, uncached*), kseg2 (up to 0xFFFF FFFF, unmapped). *Note: Default value; can be changed with the CAlg register.]

The address translation from program to physical address takes place using the same algorithm for data as well as instructions, although different base-bounds registers are used in each case. If addresses above 0x7FFF FFFF are generated in User mode, an address error exception is generated. For addresses in useg, bits 31:12 are compared to Bound register bits 30:12. If the program address is bigger than the bounds address, a Bound Exception occurs. Otherwise, the physical address equals (program address bits 31:12 + Base register bits 31:12) concatenated with program address bits 11:0. Program address bits 31:29 are used to select the appropriate CAlg fields to determine cacheability where applicable, as described earlier.

System Control Co-Processor Architecture

This chapter describes aspects of the MIPS architecture that must be managed by the operating system. Most of these features are transparent to the application programmer; however, most embedded systems programmers will have a view of the underlying system architecture and will find this material important.

Opcodes are reserved and instruction fields defined for up to four "co-processors". Architecturally, co-processors may be tightly coupled to the base integer CPU; for example, the ISA defines instructions to move data directly between memory and the coprocessor, rather than requiring it to be moved into the integer processor first.

MIPS uses the term "co-processor" both in the traditional and non-traditional sense. The FPA device is a traditional microprocessor co-processor: it is an optional part of the architecture, with its own particular instruction set.

MIPS also uses the term "co-processor" for the functions required to manage the CPU environment, including exception management, cache control, and memory management.
This segmentation ensures that the chip architecture can be varied (e.g. cache architecture, interrupt controller, etc.), without impacting user mode software compatibility.

These functions are grouped by MIPS into the on-chip "co-processor 0", or "system control co-processor"; these instructions implement the whole CPU control system. Note that co-processor 0 has no independent existence, and is certainly not optional. It provides a standard encoding for the instructions which access the status register; so that, although the definition of the status register changes among implementations, programmers can use the same assembler for both CPUs. Similarly, the exception and memory management strategies can be varied among implementations, and these effects isolated to particular portions of the OS kernel.

This chapter, coupled with the chapters on cache management, memory management, and exception processing, provides the details of managing the machine state.

The areas of interest include:

CPU control co-processor: how privileged instructions are organized, with shortform descriptions. There are relatively few privileged instructions; most of the low-level control over the CPU is exercised by reading and writing bit-fields within special registers.

Exceptions: external interrupts, invalid operations, and arithmetic errors all result in "exceptions", where control is transferred to an exception handler routine. MIPS exceptions are extremely simple; the hardware does the absolute minimum, allowing the programmer to tailor the exception mechanism to the needs of the particular system. A later chapter describes MIPS exceptions, why they are precise, exception vectors, and conventions about how to code exception handling routines. Special problems can arise with nested exceptions: exceptions occurring while the CPU is still handling an earlier exception. Hardware interrupts have their own style and rules. The Exception Management chapter includes an annotated example of a moderately-complicated exception handler.

Caches and cache management: all current RC30xx and RC4xxx implementations have dual caches (the I-cache for instructions, the D-cache for data). On-chip hardware is provided to manage the caches, and the programmer working with I/O devices, particularly with DMA devices, may need to explicitly manage the caches in particular situations.
To manipulate the caches, the RC30xx allows software to isolate them, inhibiting cache/memory traffic and allowing the processor to access the cache as if it were simple memory; and the RC30xx can swap the roles of the I-cache and D-cache (the only way to make the I-cache writable). The RC4xxx provides direct access to both primary caches through the cache instruction. The RC32364 also implements the cache instruction.

Caches must sometimes be cleared of stale or invalid/uninitialized data. Even following power-up, the caches are in a random state and must be cleaned up before they can be used. A later chapter will discuss the techniques used by software to manage the on-chip cache resources.

In addition, techniques to determine the on-chip cache sizes will be shown (greatest flexibility is achieved if software can be written to be independent of cache sizes).

For the diagnostics programmer, techniques to test the cache memory and probe for particular entries will be discussed.

On some implementations the system designer may make configuration choices about the cache (e.g. the RC3081 and RC3071 allow the cache organization to be selected between 16kB of I-cache/4kB of D-cache and 8kB of each cache). The cache management chapter will also discuss some considerations to apply in making the proper selection.

Write buffer: in the RC30xx family CPUs the D-cache is always write through; all writes go to main memory as well as to the cache. This simplifies the caches, but main memory won't be able to accept data as fast as the CPU can write it. Much of the performance loss can be made up by using a FIFO buffer to store write cycles (both address and data). In the RC30xx family, this FIFO, called the write buffer, is integrated on-chip. In the RC32364 and the RC4xxx, the D-cache can be either write-back or write-through. The FIFO store described above also exists in the RC32364 and RC4xxx.

System programmers may need to know that writes happen later than the code sequence suggests. The chapter on cache management discusses this.

Reset: at reset almost nothing is defined, so the software must configure the CPU carefully. In MIPS CPUs, reset is implemented in almost exactly the same way as the exceptions. A later chapter on reset initialization discusses ways of finding out which CPU is executing the software, and how to get a program to run. An example of a C runtime environment, attending to the stack and special registers, is provided.
Memory management and the TLB/Base-Bounds: a later chapter will discuss address translation and managing the translation hardware (the base-bounds mechanism in the RC4650 and the TLB in others). This section is mostly for OS programmers.

Power management: the RC4xxx and RC323xx processors can go into a mode called "standby" mode with the WAIT instruction. In this mode the internal core operates at considerably reduced power. For more information about this topic, refer to the RISC Microprocessor Application Guide.

CPU control functions are implemented with registers (most of which consist of multiple bitfields). There are several CPU control instructions used in the memory management implementation, which are described later in this manual. Aside from the MMU, CPU control in the RC30xx defines just one instruction beyond those necessary to move to and from the control registers.

mtc0 rs, <nn> - Move to co-processor zero
Loads "co-processor 0" register number nn from CPU general register rs. It is unusual, and not good practice, to refer to CPU control registers by their number in assembler sources; normal practice is to use the names listed in Table 3.2. In some tool-chains the names are defined by a C-style "include" file, and the C pre-processor is run as a front-end to the assembler; the assembler manual should provide guidance on how to do this. This is the only way of setting bits in a CPU control register.

mfc0 rd, <nn> - Move from co-processor zero
General register rd is loaded with the values from CPU control register number nn. Once again, it is common to use a symbolic name and a macro-processor to save remembering the numbers. This is the only way of inspecting bits in a control register.

rfe - Restore from exception (RC30xx)
This instruction is available in the RC30xx only. Note that this is not "return from exception". This instruction restores the status register to go back to the state prior to the trap. To understand what it does, refer to the status register defined later in this chapter. The only secure way of returning to user mode from an exception is to return with a jr instruction which has the rfe in its delay slot.

eret - Exception return (RC4xxx)
This is the RC4xxx instruction which actually returns from an exception, interrupt or error trap. Unlike a branch or jump instruction, eret does not execute the next instruction. This instruction is also available in the RC32364.
The RC4xxx has some additional instructions for CPU control. Doubleword counterparts of the mtc0/mfc0 instructions are also available as dmtc0/dmfc0, which allow 64-bit transfers. The wait instruction puts the CPU in low-power standby mode. For more information about standby mode, refer to the IDT79RC4600 and IDT79RC4700 64-bit RISController Processor Hardware User's Manual.

    Register   Description
    PRId       CPU type and rev level.
    SR         (status register) CPU mode flags.
    Cause      Describes the most recently recognized exception.
    EPC        Exception return address.
    BadVaddr   Contains the last invalid program address which caused a trap. It is set by address errors of all kinds, even if there is no MMU.
    Config     CPU configuration (RC3071, RC3081, RC3041, RC4xxx, RC32364 only).
    BusCtrl    (RC3041 only) configures bus interface signals. Needs to be set up to match the hardware implementation.
    PortSize   (RC3041 only) used to flag some program address regions as 8- or 16-bits wide. Must be programmed to match the hardware implementation.
    Count      (RC3041/RC4xxx/RC32364, read/write) a 24-bit counter incrementing with the CPU clock (32-bit in the RC4x00 and RC32364).
    Compare    (RC3041/RC4xxx/RC32364, read/write) a 24-bit value used to wrap around the Count value and set an output signal (32-bit in the RC4xxx and RC32364).
    Context    (RC4600/RC4700 only) pointer to the kernel virtual page table entry (PTE) in 32-bit address spaces.
    XContext   (RC4600/RC4700 only) pointer to the kernel virtual page table entry (PTE) in 64-bit address spaces.
    ECC        (RC4600/RC4700/RC4650/RC32364 only) secondary-cache error checking and correcting (ECC) and Primary parity.
    CacheErr   (RC4600/RC4700/RC4650/RC32364 only) Cache Error and Status register.
    ErrorEPC   (RC4600/RC4700/RC4650 only) Error Exception Program Counter.
    IWatch     (RC4650/RC32364 only, read/write) specifies the instruction program address that causes a Watch exception.
    DWatch     (RC4650/RC32364 only, read/write) specifies the data program address that causes a Watch exception.
    IEPC       (RC32364 only) Imprecise Exception Program Counter.
    DEPC       (RC32364 only) Debug Exception Program Counter.
    Debug      (RC32364 only) Debug Control and Status Register.
    TagLo      (RC32364 only) Cache Tag Register.

Table: CPU Standard Control Registers

CPU Control Register Formats

A note about reserved fields: many unused control register fields are marked "0." Bits in such fields are guaranteed to read zero and should be written as zero. Other reserved fields are marked "reserved"; software must write them as zero and should not assume that it will get back zero, or any other particular value.

PRId Register

The figure shows the layout of fields in the PRId register, a read-only register. The Imp field should be related to the CPU control register set.

[Figure: PRId Register Format (fields: reserved, Imp, Rev)]

The encoding of Imp is described in Table 3.2:

    CPU type                                             "Imp" value   "Rev" value
    RC3000A (including RC3051, RC3052, RC3071, RC3081)   ...           ...
    IDT unique (RC3041)                                  ...           ...
    RC36100                                              ...           ...
    RC32364                                              0x26          undefined
    RC4600                                               0x20          undefined
    RC4700                                               0x21          undefined
    RC4650                                               0x22          undefined

Table 3.2: "Imp" and "Rev" values

Note that when the Imp field indicates IDT unique, the revision number can be used to distinguish among various CPU implementations. Refer to the RC3041 User's manual for the revision level appropriate for that device. Since the RC3051, RC3052, RC3071 and RC3081 are kernel compatible with the RC3000A, they share the same Imp value.

When printing the value of this register, it is conventional to print it out as "x.y", where "x" and "y" are the decimal values of Imp and Rev, respectively. Do not use this register and the CPU manuals to size things or to establish the presence or absence of particular features. Software will be more portable and robust if it is designed to include code sequences that probe for the existence of individual features. This manual will provide examples to determine cache sizes, presence or absence of the TLB, FPA, etc.

Status Register (RC3xxx)

[Figure: Status Register Format (RC3xxx)]

Note that there are no modes such as non-translated or non-cached in MIPS CPUs; all translation and caching decisions are made on the basis of the program address.

Fields are:

CU3, CU2: Bits (31:30) control the usability of "co-processors" 3 and 2 respectively. In the RC30xx family, these might be enabled if software wishes to use the BrCond(3:2) input pins for polling, or to speed up exception decoding.

CU1: Co-processor 1 usable: 1 to use the FPA if present, 0 to disable.
When 0, all FPA instructions cause an exception, even for the kernel. It can be useful to turn off the FPA even when one is available; the bit may also be enabled in devices which do not include the FPA, if the intent is to use BrCond(1) as a polled input.

CU0: Co-processor 0 usable: 1 to be able to use some nominally-privileged instructions in user mode (this is rarely, if ever, done). Co-processor 0 instructions are always usable in kernel mode, regardless of the setting of this bit.

RE: Reverse endianness in user mode. MIPS processors can be configured, at reset time, with either "endianness" (byte ordering convention, discussed in the various CPU's User's Manuals and later in this manual). RE allows applications with one byte ordering convention to run on systems with the opposite convention, presuming OS software is provided with the necessary support. When RE is active, user mode software runs as if the CPU had been configured with the opposite endianness.

BEV: Boot exception vectors: when BEV == 1, the CPU uses the ROM (kseg1) space exception entry point (described in a later chapter). BEV is usually set to zero in running systems; this relocates the exception vectors to RAM addresses, speeding accesses and allowing the use of "user supplied" exception service routines.

TS: TLB shutdown: in devices that implement the full RC3000A MMU, TS gets set if a program address simultaneously matches two TLB entries. Prolonged operation in this state, in some implementations, could cause internal contention and damage to the chip. TLB shutdown is terminal, and can be cleared only by a hardware reset. In RC30xx base family members, which do not include the TLB, this bit is set at reset; software can rely on this to determine the presence or absence of the TLB.

PE: Parity Error: set if a cache parity error has occurred. No exception is generated by this condition, which is really only useful for diagnostics. The MIPS architecture has cache diagnostic facilities because earlier versions of the CPU used external caches, and this provided a way to verify the timing of a particular system. For those implementations the cache parity error bit was an essential design debug tool. For CPUs with on-chip caches this feature is rarely needed; only the RC3071 and RC3081 implement parity over the on-chip caches.

CM: Cache Miss: set if a data cache miss occurred while the cache was isolated.

PZ: Parity Zero: when set, cache parity bits are written as zero and not checked. This was useful in RC3000A systems which required external cache RAMs, but is of little relevance to the RC30xx family.
SwC/IsC: Swap caches and Isolate (data) cache. Cache mode bits for cache management and diagnostics. For more details, see the chapter on cache management. These bits are undefined on reset. System software should set these to known values before proceeding.

IsC set: makes all loads and stores access only the data cache. In this mode, a partial-word store invalidates the cache entry. Note that when this bit is set, even uncached data accesses will not be seen on the bus; further, this bit is not initialized by reset. Boot-up software must ensure this bit is properly initialized before relying on external data references.

SwC set: reverses the roles of the I-cache and D-cache, so that software can access and invalidate I-cache entries.

IM: Interrupt mask: an 8-bit field defining which interrupt sources, when active, will be allowed to cause an exception. Six of the interrupt sources are external pins (one may be used by the FPA, which, although it lives on the same chip, is logically external); the other two are the software-writable interrupt bits in the Cause register. No interrupt prioritizing is provided by the CPU. This is described in greater detail in the chapter dealing with exceptions.

KUc, IEc: The two basic CPU protection bits. KUc is 0 when running with kernel privileges, 1 for user mode. In kernel mode, software can get at the whole program address space, and use privileged ("co-processor 0") instructions. User mode restricts software to program addresses between 0x0000 0000 and 0x7FFF FFFF, and can be denied permission to run privileged instructions; attempts to break the rules result in an exception. IEc is set to 0 to prevent the CPU taking any interrupt, 1 to enable.

KUp, IEp: KU previous, IE previous: on an exception, the hardware takes the values of KUc and IEc and saves them here; at the same time changing the values of KUc, IEc to [0, 0] (kernel mode, interrupts disabled). The rfe instruction can be used to copy KUp, IEp back into KUc, IEc.

KUo, IEo: KU old, IE old: on an exception the KUp, IEp bits are saved here. Effectively, the six KU/IE bits are operated as a 3-deep, 2-bit wide stack which is pushed on an exception and popped by an rfe. This provides an opportunity to cleanly recover from an exception occurring so early in an exception handling routine that the first exception has not yet saved SR. It is particularly useful in allowing the user TLB refill code to be made shorter, as described in the memory management chapter.
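The push and pop behaviour of the KU/IE stack can be modelled in software. The sketch below is an illustration, not a description of the silicon: it assumes the conventional R3000-style placement of the six bits in SR bits 5:0 (KUo, IEo, KUp, IEp, KUc, IEc from bit 5 down to bit 0), which the register figure would normally confirm, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: bit 5 KUo, 4 IEo, 3 KUp, 2 IEp, 1 KUc, 0 IEc. */
#define SR_KUIE_MASK 0x3Fu

/* Model of the SR change when an exception is taken:
 * current -> previous, previous -> old, current := kernel, interrupts off. */
static uint32_t sr_on_exception(uint32_t sr)
{
    uint32_t stack = sr & SR_KUIE_MASK;
    uint32_t out;

    stack = (stack << 2) & SR_KUIE_MASK;   /* push; new KUc/IEc are 0,0 */
    out = (sr & ~SR_KUIE_MASK) | stack;
    assert((out & ~SR_KUIE_MASK) == (sr & ~SR_KUIE_MASK)); /* other fields untouched */
    return out;
}

/* Model of the SR change made by rfe:
 * previous -> current, old -> previous, old is left unchanged. */
static uint32_t sr_on_rfe(uint32_t sr)
{
    uint32_t stack = sr & SR_KUIE_MASK;
    uint32_t out;

    stack = (stack & 0x30u) | (stack >> 2); /* pop */
    out = (sr & ~SR_KUIE_MASK) | stack;
    assert((out & ~SR_KUIE_MASK) == (sr & ~SR_KUIE_MASK));
    return out;
}
```

Starting from user mode with interrupts enabled (low bits 0x03), one exception gives 0x0C and a second, nested exception gives 0x30; a third exception without an intervening save of SR would lose the original user-mode state, which is why the stack is described as three deep.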
KUc/IEc KUp/IEp KUo/IEo Status register (SR) read/write register that contains operating mode, interrupt enabling, diagnostic states processor. Figure shows format entire register Table explains fields. following list provides details more important Status register fields: 8-bit Interrupt Mask (IM) field controls individual enabling eight interrupt conditions. Interrupts must generally enabled before they cause exception set), corresponding bits both Interrupt Mask field Status register Interrupt Pending (IP) field Cause register (for more information, refer Interrupt Pending (IP) field Cause register).IM[1:0] masks software interrupts IM[7:2] correspond Int[5:0]. 4-bit Coprocessor Usability (CU) field controls usability possible coprocessors. Regardless setting, always usable Kernel mode. other cases, instruction access unusable coprocessor causes exception. 9-bit Diagnostic Status (DS) field (Status[24:16]) used self-testing checks cache virtual memory system. Reverse-Endian (RE) bit, reverses endianness machine. system reset, processor configured either little-endian big-endian. This selection always used Kernel Supervisor modes, also User mode when Setting inverts User mode endianness. (Cu3:.Cu0) Figure Status Register (RC32364) System Control Co-Processor Architecture Control Register Formats Controls usability each four coprocessor unit numbers. always usable when Kernel mode, regardless setting bit. usable0 unusable Enables Disables Non-Blocking Load Enable0 Disable Note: This will cleared whenever RC32364 takes imprecise exception caused nonblocking load instruction. responsibility exception handler turn this again. Reverse-Endian bit, valid User mode. Data Cache Lock enable. This enables data cache lock function. this during Data cache fill, cache line that particular will locked. Please refer "Cache Operation" section more detail disable Data cache locking enable Data cache locking Instruction Cache Lock enable. 
This bit enables the instruction cache lock function. If this bit is set during an Instruction cache fill, the cache line of that particular set will be locked. Please refer to the "Cache Operation" section for more detail. 0 = disable Instruction cache locking, 1 = enable Instruction cache locking.

BEV: Controls the location of the TLB refill and general exception vectors. 0 = normal, 1 = bootstrap.

SR: Indicates that a soft reset has occurred.

CE: Contents of the ECC register set or modify the check bits of the caches when this bit is set; see the description of the ECC register.

DE: Specifies that cache parity errors cannot cause exceptions. 0 = parity remains enabled, 1 = disables parity.

0: Reserved. Must be written as zeroes, and returns zeroes when read.

IM: Interrupt Mask: controls the enabling of each of the external, internal, and software interrupts. An interrupt is taken if interrupts are enabled and the corresponding bits are set in both the Interrupt Mask field of the Status register and the Interrupt Pending field of the Cause register. IM[7:2] correspond to interrupts Int[5:0] and IM[1:0] to the software interrupts. 0 = disabled, 1 = enabled.

UM: User Mode bit. 1 = User, 0 = Kernel.

ERL: Error Level. 0 = normal, 1 = error.

EXL: Exception Level. 0 = normal, 1 = exception. Note: When going from 0 to 1, IE should be disabled first. This would be done when preparing to return from the exception handler, such as before executing the ERET instruction.

IE: Interrupt Enable. 0 = disable interrupts, 1 = enable interrupts.

Table: Status Register Fields (RC32364)

For the RC4600/RC4700, the Status register (SR) is likewise a read/write register that contains the operating mode, interrupt enabling, and the diagnostic states of the processor. The figure shows the Status register format and field names. Table 3.4, which follows the figure, describes the Status register fields; the more important fields are then discussed in more detail.

Figure: Status Register (4600/4700), fields CU (Cu3..Cu0) and following.

CU: Controls the usability of each of the four coprocessor unit numbers. CP0 is always usable when in Kernel mode, regardless of the setting of the CU0 bit. 1 = usable, 0 = unusable.

FR: Enables additional floating-point registers. 0 = 16 registers, 1 = 32 registers.

RE: Reverse-Endian bit, valid in User mode.

BEV: Controls the location of the TLB refill and general exception vectors. 0 = normal, 1 = bootstrap.

SR: Indicates that a soft reset has occurred.
CH: "Hit" (tag match and valid state) or "miss" indication for the last CACHE Hit Invalidate, Hit Write Back Invalidate, Hit Write Back, or Hit Set Virtual operation for a primary cache. 0 = miss, 1 = hit.

CE: Contents of the ECC register set or modify the check bits of the caches when this bit is set; see the description of the ECC register.

DE: Specifies that cache parity errors cannot cause exceptions. 0 = parity remains enabled, 1 = disables parity.

0: Reserved. Must be written as zeroes, and returns zeroes when read.

IM: Interrupt Mask: controls the enabling of each of the external, internal, and software interrupts. An interrupt is taken if interrupts are enabled and the corresponding bits are set in both the Interrupt Mask field of the Status register and the Interrupt Pending field of the Cause register. IM[7:2] correspond to interrupts Int[5:0] and IM[1:0] to the software interrupts. 0 = disabled, 1 = enabled.

KX: Controls whether the TLB Refill Vector or the XTLB Refill Vector address is used for TLB misses on kernel addresses. 0 = TLB Refill Vector, 1 = XTLB Refill Vector.

SX: Enables 64-bit virtual addressing and operations in Supervisor mode. The extended-addressing TLB refill exception is used for TLB misses on supervisor addresses. 0 = 32-bit, 1 = 64-bit.

UX: Enables 64-bit virtual addressing and operations in User mode. The extended-addressing TLB refill exception is used for TLB misses on user addresses. 0 = 32-bit, 1 = 64-bit.

KSU: Mode bits. 10 = User, 01 = Supervisor, 00 = Kernel.

ERL: Error Level. 0 = normal, 1 = error.

EXL: Exception Level. 0 = normal, 1 = exception. Note: When going from 0 to 1, IE should be disabled first. This would be done when preparing to return from the exception handler, such as before executing the ERET instruction.

IE: Interrupt Enable. 0 = disable interrupts, 1 = enable interrupts.

Table 3.4: Status Register Fields (4600/4700)

Fields of the Status register set the modes of access and the states described in the sections that follow.

Interrupt Enable: Interrupts are enabled when all of the following conditions are true: IE = 1, EXL = 0, and ERL = 0. If these conditions are met, the settings of the IM bits identify the interrupt. Note: Setting the IE bit may be delayed by a number of cycles. If performing nested interrupts, re-enable the IE bit first.

Operating Modes: The following Status register bit settings are required for User, Kernel, and Supervisor modes (see the chapter on operating modes for more information).
The processor is in User mode when KSU = 10 (binary), EXL = 0, and ERL = 0.
The processor is in Supervisor mode when KSU = 01 (binary), EXL = 0, and ERL = 0.
The processor is in Kernel mode when KSU = 00 (binary), or EXL = 1, or ERL = 1.

64-bit Virtual Addressing: The following Status register bit settings select 32-bit or 64-bit virtual addressing for the User and Supervisor operating modes. Enabling 64-bit virtual addressing permits the execution of 64-bit opcodes and the translation of 64-bit virtual addresses. 64-bit virtual addressing for User and Supervisor modes can be set independently, but is always used for Kernel mode. The KX field controls whether the TLB Refill Vector or the XTLB Refill Vector address is used for TLB misses on Kernel addresses. 64-bit opcodes are always valid in Kernel mode.
64-bit addressing and operations are enabled for Supervisor mode when SX = 1.
64-bit addressing and operations are enabled for User mode when UX = 1.

Reset: The contents of the Status register are undefined at reset, except for the ERL and BEV bits, which are set to 1. The SR bit distinguishes between Reset and Soft Reset (Nonmaskable Interrupt [NMI]).

The Status register (SR) of the RC4650 is similar to that of the RC4600 for the most part. Please refer to the previous section for details. The figure shows the format of the entire register in the RC4650. Following the figure is a description of the fields that are unique to the RC4650.

Figure: Status Register (4650), fields CU (Cu3..Cu0) and following.

Some bits are different in the RC4650 compared to the RC4600. Because the RC4650 does not have a TLB, it does not support 64-bit program addressing and has only two operating modes, so the corresponding bits are reserved. As noted in Table 3.5, the DL and IL bits are used for cache locking.

DL: Data cache lock, RC4650. Does not prevent refills into the set when it is invalid. Does not inhibit update of the D-cache on store operations. 0 = normal operation, 1 = refill into the set disabled.

IL: Instruction cache lock, RC4650. Does not prevent refills into the set when it is invalid. 0 = normal operation, 1 = refill into the set disabled.

UM: User Mode bit, RC4650. 1 = User, 0 = Kernel. (A simplification of KSU; it remains subject to EXL and ERL, as in the RC4xxx.)

Table 3.5: Bits in the 4650 Status Register

The table that follows shows the fields in the Cause register, which is consulted to determine the kind of exception that happened and is used to decide which exception routine to call.
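The Status register rules given earlier for the RC4600/RC4700 (operating mode from KSU, EXL, and ERL; interrupts taken only when IE = 1, EXL = 0, ERL = 0 and a pending source is unmasked) can be sketched in C as follows. The bit positions are an assumption based on the conventional MIPS III layout (IE in bit 0, EXL in bit 1, ERL in bit 2, KSU in bits 4:3, IM and IP in bits 15:8), since the register figures are not reproduced here; all identifiers are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions (conventional MIPS III Status/Cause layout). */
#define SR_IE         (1u << 0)
#define SR_EXL        (1u << 1)
#define SR_ERL        (1u << 2)
#define SR_KSU_SHIFT  3
#define SR_KSU_MASK   (3u << SR_KSU_SHIFT)
#define SR_IM_MASK    (0xFFu << 8)
#define CAUSE_IP_MASK (0xFFu << 8)

typedef enum { MODE_KERNEL, MODE_SUPERVISOR, MODE_USER, MODE_RESERVED } cpu_mode;

/* Decode the operating mode: EXL or ERL forces Kernel mode,
 * otherwise KSU selects 00 = Kernel, 01 = Supervisor, 10 = User. */
cpu_mode sr_mode(uint32_t sr)
{
    if (sr & (SR_EXL | SR_ERL))
        return MODE_KERNEL;
    switch ((sr & SR_KSU_MASK) >> SR_KSU_SHIFT) {
    case 0:  return MODE_KERNEL;
    case 1:  return MODE_SUPERVISOR;
    case 2:  return MODE_USER;
    default: return MODE_RESERVED;
    }
}

/* An interrupt is taken only if interrupts are globally enabled
 * (IE = 1, EXL = 0, ERL = 0) and some pending bit in Cause.IP has
 * its matching mask bit set in Status.IM. */
bool interrupt_will_be_taken(uint32_t sr, uint32_t cause)
{
    bool enabled = (sr & SR_IE) && !(sr & (SR_EXL | SR_ERL));
    return enabled && ((sr & SR_IM_MASK) & (cause & CAUSE_IP_MASK)) != 0;
}
```

With these assumptions, a Status value of 0x10 decodes as User mode, while the same value with EXL set (0x12) decodes as Kernel mode, which is why a handler runs with kernel privileges without KSU being rewritten.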
Table: Cause Register Fields (RC3xxx and RC4600/RC4700)

BD: Branch Delay: if set, this bit indicates that the EPC does not point to the actual "exception" instruction, but rather to the branch instruction which immediately precedes it. When the exception restart point is an instruction which is in the "delay slot" following a branch, EPC has to point to the branch instruction; it is harmless to re-execute the branch, but if the CPU returned from the exception to the branch delay instruction itself, the branch would not be taken and the exception would have broken the interrupted program. The only time software might be sensitive to this bit is if it must analyze the "offending" instruction (if BD is set, then the instruction is at EPC + 4). This would occur if the instruction needs to be emulated (e.g. a floating point instruction in a device with no hardware FPA, or a breakpoint placed in a branch delay slot).

CE: Co-processor error: if the exception is taken because a "co-processor" format instruction was for a "co-processor" which is not enabled in SR, then this field holds the coprocessor number from that instruction.

IP: Interrupt Pending: shows the interrupts which are currently asserted (but may be "masked" from actually signalling an exception). The hardware interrupt bits follow the CPU inputs for the six hardware levels. The two software interrupt bits are read/writable, and contain the value last written to them. Any of the eight bits that is active, when enabled by the appropriate IM bit and the global interrupt enable flag in SR, will cause an interrupt. IP is subtly different from the rest of the Cause register fields; it doesn't indicate what happened when the exception took place, but rather shows what is happening now.

ExcCode: A 5-bit code which indicates what kind of exception happened, as detailed in Table 3.7.

ExcCode  Mnemonic    Description
0        Int         Interrupt.
1        Mod         TLB modification.
2, 3     TLBL, TLBS  TLB load / TLB store.
4, 5     AdEL, AdES  Address error (on load/I-fetch or store respectively). Either an attempt to access outside kuseg when in user mode, or an attempt to read a word or half-word at a misaligned address.
6, 7     IBE, DBE    Bus error (instruction fetch or data load, respectively). External hardware has signalled an error of some kind; proper exception handling is system-dependent. The RC30xx family CPUs can't take a bus error on a store; the write buffer would make such an exception "imprecise".
8        Syscall     Generated unconditionally by a syscall instruction.
9        Bp          Breakpoint: a break instruction.
10       RI          Reserved instruction.
11       CpU         Co-Processor unusable.
12       Ov          Arithmetic overflow. Note that "unsigned" versions of instructions (e.g. addu) never cause this exception.
13       Tr          Trap Exception in RC4600/RC4700/RC32364; reserved in RC3xxx.
14       -           Reserved.
15       FPE         Floating-Point exception; reserved in parts with no FPA.
16-31    -           Reserved.

Table 3.7: ExcCode Values: R3xxx/R4600/R4700 Exception differences

For the RC4650, the Cause register fields (shown in Figure 3.6) are similar to those of the RC4600, as described in the previous section. Notable differences between the RC4650 and the RC4600 Cause registers are described in Figure 3.6.

For the RC32364, the Cause register fields (shown in Figure 3.6) are identical to those of the RC4650, with just one exception: one bit that is zero in the RC4650 is defined in the RC32364, and is described in the last entry of Table 3.8.

Figure 3.6: Cause Register Format (RC4650)

BD: Indicates whether the last exception taken occurred in a branch delay slot. 1 = delay slot, 0 = normal.

0: Reserved. Currently read as 0; must be written as `0'.

CE: Coprocessor unit number referenced when a Coprocessor Unusable exception is taken.

DW: On a Watch exception, indicates that the DWatch register matched. On other exceptions this field is undefined.

IW: On a Watch exception, indicates that the IWatch register matched. On other exceptions this field is undefined.

IV: Enables the dedicated interrupt vector. 1 = interrupts use the new exception vector (200), 0 = interrupts use the common exception vector (180).

IP: Indicates that an interrupt is pending. 1 = interrupt pending, 0 = no interrupt.

ExcCode: Exception code field (see Table 3.7).

RC32364 only: Indicates that the last exception was imprecise. This occurs when the exception is taken on a Non-Blocking Load. 1 = imprecise, 0 = precise.

Table 3.8: Cause Register Field Descriptions

EPC: This is a 32-bit read/write register containing the 32-bit address of the return point for this exception (64 bits in the RC4600/RC4700). The instruction causing the exception is at EPC, unless BD is set in Cause, in which case EPC points to the previous (branch) instruction. The RC4600/RC4700/RC32364 will not write EPC if EXL is set. Also, the RC4600, RC4700, RC32364 and RC4650 use ErrorPC on cache errors and soft reset.
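An exception handler typically combines the Cause and EPC values as sketched below: it extracts ExcCode to choose a routine and, where it needs to inspect the offending instruction, adjusts EPC by the BD bit. The bit positions (BD in bit 31, ExcCode in bits 6:2) follow the conventional MIPS layout and are an assumption here, as are the function names; the sketch uses a 32-bit EPC, whereas the RC4600/RC4700 EPC is 64 bits wide.

```c
#include <stdint.h>

/* Assumed bit positions (conventional MIPS Cause register layout). */
#define CAUSE_BD            (1u << 31)
#define CAUSE_EXCCODE_SHIFT 2
#define CAUSE_EXCCODE_MASK  (0x1Fu << CAUSE_EXCCODE_SHIFT)

/* Extract the 5-bit exception code from a Cause register value. */
unsigned cause_exccode(uint32_t cause)
{
    return (cause & CAUSE_EXCCODE_MASK) >> CAUSE_EXCCODE_SHIFT;
}

/* Address of the instruction that actually caused the exception.
 * If BD is set, EPC points at the branch and the offending
 * instruction sits in its delay slot, one word later. */
uint32_t faulting_instruction_address(uint32_t cause, uint32_t epc)
{
    return (cause & CAUSE_BD) ? epc + 4u : epc;
}
```

Note that the handler still returns to EPC itself, not to the adjusted address, so that the branch is re-executed; the adjusted address is only for examining or emulating the instruction.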
BadVaddr (RC3xxx): A 32-bit register containing the address whose reference led to an exception. It is set on any MMU-related exception, on an attempt by a user program to access addresses outside kuseg, or if an address is wrongly aligned for the datum size referenced. After any other exception this register is undefined. Note in particular that it is not set after a bus error.

System Control Co-Processor Architecture: Processor-Specific Registers

The 64-bit (32 bits in the RC3xxx and RC4650) Bad Virtual Address register (BadVAddr) is a read-only register that displays the most recent virtual address that caused one of the following exceptions: Address Error (e.g., unaligned access), TLB Invalid, TLB Modified, TLB Refill, Virtual Coherency Data Access, or Virtual Coherency Instruction Fetch. In the RC4650, a bounds exception is recognized in place of the TLB exceptions because the TLB does not exist. The processor does not write to the BadVAddr register when the EXL bit in the Status register is set. The BadVAddr register does not save any information for bus errors, since bus errors are not addressing errors.

Count and Compare (RC3041): Only present in the RC3041, these provide a simple 24-bit counter/timer running at the CPU cycle rate. Count counts up, and then wraps around to zero once it has reached the value in the Compare register. As it wraps around, the timer output of the CPU is asserted. According to the CPU configuration (a bit in the BusCtrl register), the output will either remain active until reset by software (by re-writing Compare), or will pulse. In either case the counter just keeps counting. To generate an interrupt, the output must be connected to one of the interrupt inputs. From reset, Compare is set up to its maximum value (0xFF FFFF), so the counter runs up to 2^24 - 1 before wrapping around.

The 32-bit Count register acts as a timer, incrementing at a constant rate (half the maximum instruction issue rate) whether or not an instruction is executed, retired, or any forward progress is made through the pipe.