R30xx Family Software Reference Manual
©1994 Integrated Device Technology, Inc. Portions ©1994 Algorithmics, Ltd. Chapter contains some material that is ©1988 Prentice-Hall. Appendices contain material that is ©1994 Mips Technology, Inc.
About IDT
Integrated Device Technology, Inc. has been a MIPS semiconductor partner since 1988. Its efforts bring the high performance inherent in the MIPS architecture to embedded systems engineers. These efforts include derivatives of the MIPS R3xxx and R4xxx CPUs, development tools, and applications support. Additional information about IDT's RISC family can be obtained from your local sales representative. Alternately, IDT can be reached directly at:
Corporate Marketing: (800) 345-7015
RISC Applications "Hotline": (408) 492-8208
RISC Applications: (408) 492-8469
RISC Applications Internet: rischelp@idtinc.com
About Algorithmics
Much of this manual was written by Dominic Sweetman and Nigel Stephens of Algorithmics in London, England, under contract to IDT. Algorithmics were early enthusiasts for the MIPS architecture, designing their first MIPS systems and system software in 1986/87. A small engineering company, Algorithmics provide enabling technologies for companies designing in both the R30xx family of CPUs and the 64-bit R4x00 architecture. This includes training, toolkits, support, and evaluation boards. Dominic Sweetman can be reached at the following:
Dominic Sweetman, Algorithmics, Drayton Park, London, ENGLAND. phone: 3301 fax: 3400 email: dom@algor.co.uk
About This Manual
This manual is targeted to a systems programmer building an R30xx-based system. It contains the architecture-specific operations and programming conventions relevant to such a programmer. This manual is not intended to be a tutorial on structured programming, real-time operating systems, any particular high-level programming language, or any particular toolchain. Other references are better suited to those topics. This manual does contain specific code fragments and the most common programming conventions that are specific to the IDT R30xx RISController family.

The manual is consciously limited to the R30xx family; information relevant to the R4xxx family of processors may be found, but the device-specific programs (such as cache management, exception handling, etc.) shown as examples are specific to the R30xx family.

This manual contains references to the toolchains most commonly used by the authors (IDT, Inc., and Algorithmics, Ltd.). Code fragments shown are typically from software used by and/or provided by these companies, including development tools such as IDT/c and software utilities (such as IDT/kit, IDT/sim, and Micromonitor). A wide variety of other, third-party products are also available to support R30xx development, under the Advantage-IDT program. The reader of this manual is encouraged to look at all of the available tools to determine which toolchains and utilities best fit the system development requirements. Additional information on the IDT family of RISC processors, and their support tools, is available from your local IDT salesman.
Integrated Device Technology, Inc. reserves the right to make changes to its products or specifications at any time, without notice, in order to improve design or performance and to supply the best possible product. IDT does not assume any responsibility for use of any circuitry described other than the circuitry embodied in an IDT product. The Company makes no representations that circuitry described herein is free from patent infringement or other rights of third parties which may result from its use. No license is granted by implication or otherwise under any patent, patent rights or other rights, of Integrated Device Technology, Inc.
LIFE SUPPORT POLICY
Integrated Device Technology's products are not authorized for use as critical components in life support devices or systems unless a specific written agreement pertaining to such intended use is executed between the manufacturer and an officer of IDT. Life support devices or systems are devices or systems which are intended for surgical implant into the body, or which support or sustain life, and whose failure to perform, when properly used in accordance with instructions for use provided in the labeling, can be reasonably expected to result in a significant injury to the user. A critical component is any component of a life support device or system whose failure to perform can be reasonably expected to cause the failure of the life support device or system, or to affect its safety or effectiveness.

The IDT logo is a registered trademark, and BiCameral, BurstRAM, BUSMUX, CacheRAM, DECnet, Double-Density, FASTX, Four-Port, FLEXI-CACHE, Flexi-PAK, Flow-thruEDC, IDT/c, IDTenvY, IDT/sae, IDT/sim, IDT/ux, MacStation, MICROSLICE, Orion, PalatteDAC, REAL8, R3041, R3051, R3052, R3081, R3721, R4600, RISCompiler, RISController, RISCore, RISC Subsystem, RISC Windows, SARAM, SmartLogic, SyncFIFO, SyncBiFIFO, SPC, TargetSystem and WideBus are trademarks of Integrated Device Technology, Inc. MIPS is a registered trademark of MIPS Computer Systems, Inc. All others are trademarks of their respective companies.
CONTENTS

Chapter 1. Introduction
  What is RISC?
  Pipelines
  R3xxx Family CPUs
  MIPS Architecture Levels
  MIPS-1 Compared with CISC Architectures
  Unusual Instruction Encoding Features
  Addressing and Memory Accesses
  Operations Not Directly Supported
  Multiply and Divide Operations
  Programmer-visible Pipeline Effects
  A Note on Machine and Assembler Language

Chapter 2. MIPS-1 (R30xx) Architecture
  Programmer's View of the Processor Architecture
  Registers
  Conventional Names and Uses of General-Purpose Registers
  Notes on Conventional Register Names
  Integer Multiply Unit and Registers
  Instruction Types
  Loading and Storing: Addressing Modes
  Data Types in Memory and Registers
  Integer Data Types
  Unaligned Loads and Stores
  Floating Point Data in Memory
  Basic Address Space
  Summary of System Addressing
  Kernel vs. User Mode
  Memory Map for CPUs without MMU Hardware
  Subsegments in the R3041: Memory Width Configuration

Chapter 3. System Control Coprocessor Architecture
  CPU Control Summary
  CPU Control and ``CO-PROCESSOR 0''
  CPU Control Instructions
  Standard CPU Control Registers
  PRId Register
  SR Register
  Cause Register
  EPC Register
  BadVaddr Register
  R3041, R3071, and R3081 Specific Registers
  Count and Compare Registers (R3041 only)
  Config Register (R3071 and R3081)
  Config Register (R3041)
  BusCtrl Register (R3041 only)
  PortSize Register (R3041 only)
  What Registers Are Relevant When?

Chapter 4. Exception Management
  Exceptions
  Precise Exceptions
  When Exceptions Happen
  Exception Vectors
  Exception Handling Basics
  Nesting Exceptions
  An Exception Routine
  Interrupts
  Conventions and Examples

Chapter 5. Cache Management
  Caches and Cache Management
  Cache Isolation and Swapping
  Initializing and Sizing the Caches
  Invalidation
  Testing and Probing
  Configuration (R3041/71/81 only)
  Write Buffer
  Implementing wbflush()

Chapter 6. Memory Management
  Memory Management
  Registers Described
  EntryHi, EntryLo
  Index
  Random
  Context
  Control Instructions
  Programming Interface to the TLB
  How Refill Happens
  Using ASIDs
  The Random Register and Wired Entries
  Memory Translation Setup
  Exception Sample Code
  Basic Exception Handler
  Fast kuseg Refill from Page Table
  Simulating Dirty Bits
  Debugging
  Management Utilities

Chapter 7. Reset Initialization
  Starting Up
  Probing and Recognizing the CPU
  Bootstrap Sequences
  Starting Up an Application

Chapter 8. Floating Point Coprocessor
  The IEEE754 Standard and Its Background
  What is Floating Point?
  IEEE Exponent Field and Bias
  IEEE Mantissa and Normalization
  Strange Values Use Reserved Exponent Values
  MIPS Data Formats
  MIPS Implementation of IEEE754
  Floating Point Registers
  Floating Point Exceptions/Interrupts
  Floating Point Control/Status Register
  Floating Point Implementation/Revision Register
  Guide to Instructions
  Load/Store
  Move Between Registers
  3-Operand Arithmetic Operations
  Unary (Sign-Changing) Operations
  Conversion Operations
  Conditional Branch and Test Instructions
  Instruction Timing Requirements
  Instruction Timing for Speed
  Initialization and Enable on Demand
  Floating Point Emulation

Chapter 9. Assembler Language Programming
  Syntax Overview
  Points to Note
  Register-to-Register Instructions
  Immediate (Constant) Operands
  Multiply/Divide
  Load/Store Instructions
  Unaligned Loads and Store
  Addressing Modes
  Gp-Relative Addressing
  Jumps, Subroutine Calls and Branches
  Conditional Branches
  Co-processor Conditional Branches
  Compare
  Coprocessor Transfers
  Coprocessor Hazards
  Assembler Directives
  Sections
  .text, .rdata, and .data
  .lit4 and .lit8
  Program Segments in Memory
  .bss
  .sdata and .sbss
  Stack and Heap
  Special Symbols
  Data Definition and Alignment
  .byte, .half, and .word
  .float and .double
  .ascii and .asciiz
  .align
  .comm and .lcomm
  .space
  Symbol Binding Attributes
  .globl
  .extern
  .weakext
  Function Directives
  .ent and .end
  .aent
  .frame, .mask, and .fmask
  Assembler Control (.set)
  .set noreorder/reorder
  .set volatile/novolatile
  .set noat/at
  .set nomacro/macro
  .set nobopt/bopt
  The Complete Guide to Assembler Instructions
  Alphabetic List of Assembler Instructions

Chapter 10. C Programming
  The Stack, Subroutine Linkage, and Parameter Passing
  Stack Argument Structure
  Which Arguments Go in What Registers
  Examples from the C Library
  An Exotic Example; Passing Structures
  How Printf() and Varargs Work
  Returning a Value from a Function
  Macros for Prologues and Epilogues
  Stack-Frame Allocation
  Leaf Functions
  Non-Leaf Functions
  Functions Needing Run-Time Computed Stack Locations
  Shared and Non-Shared Libraries
  Sharing Code in Single-Address Space Systems
  Sharing Code Across Address Spaces
  An Introduction to Optimization
  Common Optimizations
  How to Prevent Unwanted Effects From Optimization
  Optimizer-Unfriendly Code and How to Avoid It

Chapter 11. Portability Considerations
  Writing Portable C
  C Language Standards
  C Library Functions and POSIX
  Data Representations and Alignment
  Notes on Structure Layout and Padding
  Isolating System Dependencies
  Locating System Dependencies
  Fixing Up Dependencies
  Isolating Non-Portable Code
  Using Assembler
  Endianness
  What It Means to the Programmer
  Bitfield Layout and Endianness
  Changing the Endianness of a MIPS CPU
  Designing and Specifying for Configurable Endianness
  Read-Only Instruction Memory
  Writable (Volatile) Memory
  Byte-Lane Swapping
  Configurable Controllers
  Portability and Endianness-Independent Code
  Endianness-Independent Code
  Compatibility Within the R30XX Family
  Porting to MIPS: Frequently Encountered Issues
  Considerations for Portability to Future Devices

Chapter 12. Writing Power-On Diagnostics
  Golden Rules for Diagnostics Programming
  What Should Tests Do?
  How to Test the Diagnostic Tests?
  Overview of Algorithmics' Power-On Selftest
  Starting Points
  Control and Environment Variables
  Reporting
  Unexpected Exceptions During Test Sequence
  Driving Test Output Devices
  Restarting the System
  Standard Test Sequence
  Notes on the Test Sequence
  Annotated Examples from the Test Code

Chapter 13. Instruction Timing and Optimization
  Notes and Examples
  Additional Hazards
  Early Modification
  Bitfields in Control Registers
  Non-Obvious Hazards

Chapter 14. Software Tools for Board Bring-Up
  Tools Used in Debug
  Initial Debugging
  Porting Micromonitor
  Running Micromonitor
  Initial IDT/SIM Activity
  A Final Note on IDT/KIT

Chapter 15. Software Design Examples
  Application Software
  Memory
  Starting Up
  Library Functions
  Input and Output
  Character Class Tests
  String Functions
  Mathematical Functions
  Utility Functions
  Diagnostics
  Variable Argument Lists
  Non-Local Jumps
  Signals
  Date and Time
  Running the Program
  Debugging the Program
  Embedded System Software
  Memory
  Starting Up
  Embedded System Library Functions
  Trap and Interrupt Handling
  Simple Interrupt Routines
  Floating-Point Traps and Interrupts
  Emulating Floating Point Instructions
  Debugging
  Unix-Like System
  Terminology
  Components of a Process
  System Calls and Protection
  What the Kernel Does
  Virtual Memory Implementation for MIPS
  Interrupt Handling for MIPS
  How It Works

Chapter 16. Assembly Language Programming Tips
  32-bit Address or Constant Values
  Use of "Set" Instructions
  Use of "Set" with Complex Branch Operations
  Carry, Borrow, Overflow, and Multi-Precision Math

Appendix A. Machine Instructions Reference
  Instruction Overview
  Instruction Classes
  Instruction Formats
  Instruction Notation Conventions
  Instruction Notation Examples
  Load and Store Instructions
  Jump and Branch Instructions
  Coprocessor Instructions
  System Control Coprocessor (CP0) Instructions
  Instruction Details
  Instruction Summary

Appendix B. Instruction Reference
  Instruction Details
  Instructions
  Floating-Point Data Transfer
  Floating-Point Conversions
  Floating-Point Arithmetic
  Floating-Point Register-to-Register Move
  Floating-Point Branch
  Computational Instructions Valid Operands
  Compare and Condition Values
  Register Specifiers
  32-bit Registers
  Register Access for 32-bit Registers
  Instruction Notation Conventions
  Load and Store Memory
  Instruction Descriptions
  Instruction Summary

Appendix C. Operation Reference
  Operation Details
  Operations
  Exception Operations
  Dand Register Movement Operations
  Operation Descriptions

Appendix D. Assembler Language Syntax

Appendix E. Object Code Formats
  Sections and Segments
  ECOFF Object File Format (RISC/OS)
  File Header
  Optional a.out Header
  Example Loader
  Further Reading
  ELF (MIPS ABI)
  File Header
  Program Header
  Example Loader
  Further Reading
  Object Code Tools

Glossary of Common "MIPS" Terms

DRAWINGS
  1.1 MIPS 5-Stage Pipeline
  1.2 Pipeline Branch Delays
  1.3 Pipeline Load Delays
  PRId Register Fields
  Fields in the Status Register
  Fields in the Cause Register
  Fields in the R3071/81 Config Register
  Fields in the R3041 Config (Cache Configuration) Register
  Fields in the R3041 Control (BusCtrl) Register
  Direct Mapped Cache
  EntryHi and EntryLo Register Fields
  Fields in the Index Register
  Fields in the Random Register
  Fields in the Context Register
  Control/Status Register Fields
  Implementation/Revision Register
  Program Segments in Memory
  10.1 Stackframe for a Non-Leaf Function
  11.1 Structure Layout and Padding in Memory
  11.2 Data Representation with #pragma Pack(1)
  11.3 Data Representation with #pragma Pack(2)
  11.4 Typical Big-Endian's Picture
  11.5 Little-Endian's Picture
  11.6 Bitfields and Big-Endian
  11.7 Bitfields and Little-Endian
  11.8 Garbled String Storage when Mixing Modes
  11.9 Byte-Lane Swapper
  15.1 Memory Layout of a Process
  Instruction Formats

TABLES
  1.1 R30xx Family Members Compared
  Conventional Names of Registers with Usage Mnemonics
  Summary of Control Registers (Not MMU)
  ExcCode Values: Different Kinds of Exceptions
  Reset and Exception Entry Points (Vectors) for the R30xx Family
  Interrupt Bitfields and Interrupt Pins
  Control Registers for Memory Management
  Floating Point Data Formats
  Rounding Modes Encoded in the Control/Status Register
  Move Instructions
  3-Operand Arithmetic
  Sign-Changing Operators
  Data Conversion Operations
  Test Instructions
  Assembler Register and Identifier Conventions
  Assembler Instructions
  12.1 Test Sequence in Brief
  16.1 32-bit Immediate Values
  16.2 Add-With-Carry
  16.3 Subtract-with-Borrow Operation
  Instruction Operation Notations
  Load and Store Common Functions
  Access Type Specifications for Load/Store
  Format Field Decoding
  Logical Negation of Predicates by Condition True/False
  Valid Operand Specifiers with 32-bit Coprocessor Registers
  Load and Store Common Functions
CHAPTER 1
INTRODUCTION
IDT's R30xx family of RISC microcontrollers includes the R3051, R3052, R3071, R3081 and R3041 processors. The different members of the family offer different price/performance trade-offs, but all are basically integrated versions of the MIPS R3000A CPU. The R3000A is well known for the high-performance Unix systems implemented around it; less publicized, but equally impressive, is the performance it has brought to a wide variety of embedded applications.

IDT's RISController family also includes devices built around MIPS R4000 64-bit microprocessor technology. These devices, such as the R4600 Orion microprocessor, offer even higher levels of performance than the R3000A derivative family. However, these devices also feature slightly different models, and allow 64-bit kernels and applications. Thus, they are sufficiently different from the R30xx family that this manual is focused exclusively on the R30xx family.

This manual is aimed at the programmer dealing with the IDT R30xx family components. Although most programming occurs using a high-level language (usually "C"), and with little awareness of the underlying system or processor architecture, certain operations require the programmer to use assembly programming, and/or to be aware of the underlying system or processor structure. This manual is designed to be consulted when addressing these types of issues.
WHAT IS RISC?
The MIPS architecture CPUs are among the "RISC'' CPUs, born out of a particularly fertile period of academic research and development. RISC CPUs (``Reduced Instruction Set Computer'') share a number of architectural attributes to facilitate the implementation of high-performance processors. Most new architectures (as opposed to implementations) since 1986 owe their remarkable performance to features developed a few years earlier by a couple of seminal research projects. Someone commented that ``a RISC is any computer architecture defined after 1984''; although meant as a jibe at the industry's use of the acronym, the comment's truth also derives from the widespread acceptance of the conclusions of that research.

One of these was the ``MIPS'' project at Stanford University. The project name MIPS puns the familiar ``millions of instructions per second'' by taking its name from the phrase ``Microcomputer without Interlocked Pipeline Stages''. The Stanford group's work showed that pipelining, although a well-known technique for speeding up computers, had been under-exploited by earlier architectures.
PIPELINES
Pipelined processors operate by breaking instruction execution into multiple small independent "stages"; since the stages are independent, multiple instructions can be in varying states of completion at any one time. Also, this organization tends to facilitate higher frequencies of operation, since very complex activities can be broken down into "bite-sized" chunks. The result is that multiple instructions are executing at any one time, and that instructions are initiated (and completed) at a very high frequency. MIPS has consistently been among the most aggressive in the utilization of these techniques.

Pipelining depends for its success on another technique: using caches to reduce the amount of time spent waiting for memory. The MIPS R3000A architecture uses separate instruction and data caches, so it can fetch an instruction and read or write a memory variable in the same clock phase. By mating high-frequency operation to high memory bandwidth, very high performance is achieved.

In CISC architectures, caches are often seen as part of memory. A RISC architecture makes more sense if the dual caches are regarded as very much part of the CPU; in fact, the pipelines of virtually all RISC processors require caches to maintain execution. The CPU normally runs from cache, and a cache miss (where data or instructions have to be fetched from memory) is seen as an exceptional event.

For the R3000A and its derivatives, instruction execution is divided into five phases (called pipestages), with each pipestage taking a fixed amount of time (see "Figure 1.1. MIPS 5-stage pipeline"). Again, note that this model assumes that instruction fetches and data accesses can be satisfied from the processor caches at the processor operation frequency. All instructions are rigidly defined so they can follow the same sequence of pipestages, even where the instruction does nothing at some stage. The net result is that, so long as it keeps hitting the cache, the CPU starts an instruction every clock. "Figure 1.1. MIPS 5-stage pipeline" illustrates this operation.

Instruction execution activity can be described as occurring in the individual pipestages:
IF (``instruction fetch'') gets the next instruction from the instruction cache (I-cache).
RD (``read registers'') decodes the instruction and fetches the contents of any registers it uses.
ALU (``arithmetic/logic unit'') performs an arithmetic or logical operation in one clock (floating point math and integer multiply/divide can't be done in one clock and are done differently; this is described later).
Figure 1.1. MIPS 5-stage pipeline (successive instructions in the instruction sequence plotted against time)
MEM is the stage where the instruction can read/write memory variables in the data cache (D-cache). Note that for typical programs, three out of four instructions do nothing in this stage; but allocating the stage for each instruction ensures that the processor never has two instructions wanting the data cache at the same time.
WB (``write back'') stores the value obtained from an operation back to the register file.

The rigid pipeline does limit the kinds of things instructions can do; in particular:
Instruction length: all instructions are 32 bits (exactly one machine ``word'') long, so that they can be fetched in a constant time. This itself discourages complexity; there are not enough bits in the instruction to encode really complicated addressing modes, for example.
No arithmetic on memory variables: data from cache or memory is obtained only in the MEM stage, which is much too late to be available to the ALU. Memory accesses occur only as simple load or store instructions which move the data to or from registers (this is described as a ``load/store architecture'').

However, the MIPS project architects also attended to the best thinking of the time about what makes a CPU an easy target for efficient optimizing compilers. MIPS CPUs have 32 general purpose registers and 3-operand arithmetical/logical instructions, and eschew complex special-purpose instructions which compilers can't usually generate.
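The "one instruction started every clock" claim above can be made concrete with a little arithmetic. The following is a minimal C sketch, assuming an ideal pipeline in which every fetch and data access hits the cache; the function names ideal_pipeline_clocks and stage_at_clock are hypothetical and are not taken from any IDT or Algorithmics software.

```c
#include <assert.h>
#include <stdint.h>

/* The five pipestages, in the order the text describes them. */
enum { STAGE_IF, STAGE_RD, STAGE_ALU, STAGE_MEM, STAGE_WB };
#define NUM_STAGES 5u

/* Hypothetical helper: clocks needed to complete n instructions when one
 * instruction enters the pipeline per clock (no cache misses, no stalls). */
uint32_t ideal_pipeline_clocks(uint32_t n)
{
    assert(n <= UINT32_MAX - (NUM_STAGES - 1u)); /* avoid wraparound */
    return n == 0u ? 0u : n + (NUM_STAGES - 1u);
}

/* Hypothetical helper: the stage that instruction `instr` (0-based) occupies
 * during clock `clock` (0-based), or -1 if it is not in the pipeline then. */
int stage_at_clock(uint32_t instr, uint32_t clock)
{
    if (clock < instr || clock - instr >= NUM_STAGES)
        return -1;
    return (int)(clock - instr);
}
```

Under these assumptions 100 instructions take 104 clocks rather than 500, which is the whole point of overlapping the stages; a real program would add clocks for cache misses and other stalls that this sketch ignores.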
R3xxx FAMILY CPUS
MIPS Corporation was formed in 1984 to make a commercial version of the Stanford MIPS CPU. The commercial CPU was enhanced with memory management hardware, first appearing late in 1985 as the R2000. An ambitious external floating point math co-processor (the R2010 FPA) first shipped in mid-87. The R3000, shipped in 1988, is almost identical from the programmer's viewpoint (although small hardware enhancements combined to give a substantial boost to performance). The R3000A was done in 1989, to improve the frequency of operation over the original R3000 (other minor enhancements were added, such as the ability for user tasks to operate with the opposite "endianness" from the kernel).

The R2000/R3000 chips include a cache controller, so that an implementation of external caches merely required industry standard SRAMs and some address latches. The math co-processor shares the cache buses to interpret instructions (in parallel with the integer CPU) and to transfer operands and results among memory, the integer CPU and the FPA. The division of function was ingenious, practical and workable, allowing the R2000/3000 generation to be built without extravagant ultra-high pin-count packages. However, as clock speeds increased, the very high-speed signals in the cache interface increased design complexity and limited operational frequency. In addition, the overall chip count for the basic execution core proved a limitation for area and power sensitive embedded systems.

The R3051, R3052, R3071, R3081 and R3041 are the members (so far) of a family of products defined, designed, and manufactured by IDT. The chips integrate the functions of the R3000A CPU, cache memory and (R3081 only) the math co-processor. This means that all the fastest logic is on chip; so the integrated chips are not only cheaper and smaller than the original implementation, but also much easier to use.

The parts differ in their cache sizes, whether they include an on-chip MMU and/or FPA, clock rates and packaging options. In addition, although the parts can be used pin-compatibly, certain products feature optional enhancements in their bus interface that serve to reduce system cost or complexity, and other subtle enhancements for cost or performance. The major differences are summarized in "Table 1.1. R30xx family members compared".
Part          Cache            Clock (MHz)   Package Options   System Interface
3051, 3051E                    20-40         PLCC              32-bit MUX'ed
3052, 3052E                    20-40         PLCC              32-bit MUX'ed
3081, 3081E   16K+4K/ 8K+8K    20-50         PLCC              Optional frequency operation; Optional Clock Input
3071, 3071E   16K+4K/ 8K+8K    33-50         PLCC              frequency operation; Clock Input
3041          0.5K             16-25         PLCC, TQFP        Variable port width interface

Table 1.1. R30xx family members compared
MIPS ARCHITECTURE LEVELS
There are multiple generations of the MIPS architecture. The most commonly discussed are the MIPS-1, MIPS-2, and MIPS-3 architectures.

MIPS-1 is the ISA found in the R2000 and R3000 generation CPUs. It is a 32-bit ISA, and defines the basic instruction set. Any application written with the MIPS-1 instruction set will operate correctly on all generations of the architecture.

MIPS-2 is also 32-bit. It adds some instructions to speed up floating point data movement, branch-likely instructions, and other minor enhancements. This was first implemented in the MIPS R6000 microprocessor.

MIPS-3 is the 64-bit ISA. In addition to supporting all MIPS-1 and MIPS-2 instructions, MIPS-3 contains 64-bit equivalents of certain earlier instructions that are sensitive to operand size (e.g. load double and load word are both supported), including doubleword (64-bit) data movement and arithmetic. This was first implemented in the R4000, as a clean ("seamless") transition from the existing 32-bit architecture.

Note that these ISA levels do not necessarily imply a particular structure for the MMU, caches, exception model, or other kernel-specific resources. Thus, different implementations of ISA-compatible chips may require different kernels. In the case of the R30xx family, all devices implement the MIPS-1 ISA. Many devices are also kernel compatible with the R3000A; some devices (most notably those without an MMU) may require small kernel changes or different boot modules.
Historically, many embedded MIPS applications have exclusively used the "kseg0 and kseg1" memory regions (described later in this book). For these applications, the presence or absence of the MMU is largely irrelevant.

MIPS-1 COMPARED WITH CISC ARCHITECTURES
Although the MIPS architecture is fairly straightforward, there are a few features, visible only to assembly programmers, which may at first appear surprising. In addition, operations familiar from CISC architectures are irrelevant to the MIPS architecture. For example, the MIPS architecture does not mandate a stack pointer or stack usage; thus, programmers may be surprised to find that push/pop instructions do not exist directly. The most notable of these features are summarized here.
Unusual instruction encoding features
All instructions are 32 bits long: as mentioned above. This means, for example, that it is impossible to incorporate a 32-bit constant into a single instruction (there would be no instruction bits left to encode the operation and the registers!). A ``load immediate'' instruction is limited to a 16-bit value; a special ``load upper immediate'' must be followed by an ``or immediate'' to put a 32-bit constant value into a register.

Instruction actions must fit the pipeline: actions can only be carried out in the designated pipeline phase, and must be complete in one clock. For example, the register writeback phase provides for just one value to be stored in the register file, so instructions can only change one register.

3-operand instructions: arithmetic/logical operations don't have to specify memory locations, so there are plenty of instruction bits to define two independent source registers and one destination register. Compilers love 3-operand instructions, which give optimizers much more scope to improve code which handles complex expressions.

32 registers: the choice of 32 has become universal; compilers like a large (but not necessarily too large) number of registers, but there is a cost in context-saving and in encoding the registers to be used by an instruction. Register $0 always returns zero, to give a compact encoding of that useful constant.

No condition codes: the MIPS architecture does not provide condition code flags implicitly set by arithmetical operations. The motivation is to make sure that execution state is stored in one place, the register file. Conditional branches (in MIPS) test a single register for sign/zero, or a pair of registers for equality.
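The two-instruction sequence for a 32-bit constant can be illustrated by computing the two 16-bit immediates it needs. This is a minimal C sketch rather than assembler output; the type imm_pair and the functions split_for_lui_ori and model_lui_ori are names invented for this illustration, and the model assumes the standard behavior that ``load upper immediate'' clears the low half of the register and ``or immediate'' zero-extends its operand.

```c
#include <assert.h>
#include <stdint.h>

/* The two 16-bit immediates carried by a lui / ori instruction pair. */
struct imm_pair {
    uint16_t upper;  /* immediate for "load upper immediate" */
    uint16_t lower;  /* immediate for "or immediate"         */
};

/* Split a 32-bit constant into the halves the two instructions carry. */
struct imm_pair split_for_lui_ori(uint32_t value)
{
    struct imm_pair p;
    p.upper = (uint16_t)(value >> 16);
    p.lower = (uint16_t)(value & 0xFFFFu);
    return p;
}

/* Model of the register contents after executing the pair. */
uint32_t model_lui_ori(struct imm_pair p)
{
    uint32_t reg = (uint32_t)p.upper << 16; /* lui: upper half set, lower half zero */
    reg |= (uint32_t)p.lower;               /* ori: OR in zero-extended immediate   */
    assert((reg >> 16) == p.upper);
    return reg;
}
```

Because the second instruction zero-extends, the simple split works for every constant, including ones whose low half has bit 15 set. A constant whose upper half is zero needs only the second instruction, which is the kind of decision an assembler can make on the programmer's behalf.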
Addressing and memory accesses
Memory references are always register loads and stores: arithmetic on memory variables upsets the pipeline, so it is not done. Memory references only occur in explicit load or store instructions. The large register file allows multiple variables to be "on-chip" simultaneously.

Only one data addressing mode: all loads and stores define the memory location with a single base register value modified by a 16-bit signed displacement. Note that the assembler/compiler tools can use the $0 register, along with the immediate value, to synthesize additional addressing modes from this one directly supported mode.

Byte-addressed: the instruction set includes load/store operations for 8- and 16-bit variables (referred to as byte and halfword). Partial-word load instructions come in two flavors: sign-extend and zero-extend.

Loads/stores must be address-aligned: memory word operations can only load or store data from a single 4-byte aligned word; halfword operations must be aligned on half-word addresses. Many CISC microprocessors will load/store a multi-byte item from any byte address (although unaligned transfers always take longer). Techniques to generate code which will handle unaligned data efficiently will be explained later.

Jump instructions: the smallest op-code field in a MIPS instruction is 6 bits, leaving 26 bits to define the target of a jump. Since all instructions are 4-byte aligned in memory, the two least-significant address bits need not be stored, allowing an address range of 256Mbytes. Rather than make this branch PC-relative, this is interpreted as an absolute address within a 256Mbyte ``segment''. In theory, this could impose a limit on the size of a single program; in reality, it hasn't been a problem. Branches out of segment can be achieved by using a jr instruction, which uses the contents of a register as the target.

Conditional branches have only a 16-bit displacement field (2^18 byte range, since instructions are 4-byte aligned) which is interpreted as a signed PC-relative displacement. Compilers can only code a simple conditional branch instruction if they know that the target will be within 128Kbytes of the instruction following the branch.
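The address arithmetic described above (base plus signed displacement, the 256Mbyte jump segment, and the 128Kbyte branch reach) can be sketched in C. This is an illustrative model only; the function names effective_address, jump_target and branch_target are hypothetical, and the narrowing casts assume the usual two's-complement behavior of mainstream compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Load/store address: base register plus a sign-extended 16-bit displacement. */
uint32_t effective_address(uint32_t base, uint16_t disp_field)
{
    return base + (uint32_t)(int32_t)(int16_t)disp_field;
}

/* Jump target: the 26-bit field is shifted left by 2 and combined with the
 * top 4 bits of the address of the instruction following the jump, so the
 * target lies in the same 256Mbyte segment as that instruction. */
uint32_t jump_target(uint32_t jump_pc, uint32_t field26)
{
    assert((jump_pc & 3u) == 0u && field26 < (1u << 26));
    return ((jump_pc + 4u) & 0xF0000000u) | (field26 << 2);
}

/* Conditional branch target: the signed 16-bit field counts instructions
 * (4 bytes each) relative to the instruction following the branch. */
uint32_t branch_target(uint32_t branch_pc, uint16_t field16)
{
    assert((branch_pc & 3u) == 0u);
    int32_t byte_offset = (int32_t)(int16_t)field16 * 4;
    return branch_pc + 4u + (uint32_t)byte_offset;
}
```

The largest forward field value, 0x7FFF, reaches 0x1FFFC bytes past the instruction after the branch, and the most negative, 0x8000, reaches 0x20000 bytes back, which is the 2^18-byte window mentioned in the text.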
Operations not directly supported
No byte or halfword arithmetic: all arithmetical and logical operations are performed on 32-bit quantities. Byte and/or halfword arithmetic would require significant extra resources and many more op-codes, and is an understandable omission. Most C programmers will use the int data type for most arithmetic, and for MIPS an int is 32 bits, so such arithmetic will be efficient. C's rules are to perform arithmetic in int whenever a source or destination variable is as long as int. However, where a program explicitly does arithmetic as short or char, the compiler must insert extra code to make sure that wraparound and overflows have the appropriate effect.

No special stack support: conventional MIPS assembler usage does define an sp register, but the hardware treats sp just like any other register. There is a recommended format for the stack frame layout of subroutines, so that programs can mix modules from different languages and compilers; it is recommended that programmers stick to these conventions, but they have no relationship to the hardware.

Minimal subroutine overhead: there is one special feature; jump instructions have a ``jump and link'' option which stores the return address into a register. By default this is $31, so for convenience and by convention $31 becomes the ``return address'' register.

Minimal interrupt overhead: the MIPS architecture makes very few presumptions about system exception handling, allowing fast response for a wide variety of software models. In the R30xx family, the CPU stashes away the restart location in the special register EPC, modifies the machine state just enough to signal why the trap happened and to disallow further interrupts; then it jumps to a single predefined location in memory. Everything else is up to software. Just to emphasize this: on an interrupt or trap a MIPS CPU does not store anything on the stack, write memory, or preserve any registers by itself. By convention, two registers ($k0, $k1; register conventions are explained in chapter 2) are reserved so that interrupt/trap routines can ``bootstrap'' themselves; it is impossible to do anything on a MIPS CPU without using some registers. For a program running in any system which takes interrupts or traps, the values of these registers may change at any time, and thus should not be used.

One particular kind of trap (a TLB miss on an address in the user-privilege address space) has a different dedicated entry point.
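The remark above about the extra code a compiler inserts for short arithmetic can be illustrated with a small model. The sketch below assumes a 16-bit short and shows, in portable C, the kind of re-normalization that has to follow a 32-bit add; the names wrap_to_short and add_as_short are hypothetical, and a real compiler would typically emit a shift pair rather than call a function.

```c
#include <assert.h>
#include <stdint.h>

/* Reduce a 32-bit register value to the signed 16-bit range, as a
 * shift-left-16 / arithmetic-shift-right-16 pair would. */
int32_t wrap_to_short(int32_t reg)
{
    uint32_t low = (uint32_t)reg & 0xFFFFu;           /* keep the low 16 bits */
    return (low & 0x8000u) ? (int32_t)low - 0x10000   /* sign-extend bit 15   */
                           : (int32_t)low;
}

/* Hypothetical model of `short c = a + b;` when only 32-bit adds exist. */
int32_t add_as_short(int16_t a, int16_t b)
{
    int32_t full = (int32_t)a + (int32_t)b;  /* the single 32-bit add        */
    int32_t wrapped = wrap_to_short(full);   /* the extra code the text notes */
    assert(wrapped >= INT16_MIN && wrapped <= INT16_MAX);
    return wrapped;
}
```

With int operands the second step disappears entirely, which is why the text suggests that int arithmetic is the efficient choice on this architecture.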
Multiply and divide operations
The MIPS CPU does have an integer multiply/divide unit; this is worth mentioning because many RISC machines don't have multiply hardware. The multiply unit is relatively independent of the rest of the CPU, with its own special output registers.
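As a rough illustration of why the multiply unit needs its own output registers: a 32x32 multiply produces a 64-bit result, and a divide produces both a quotient and a remainder, so neither fits the one-register-per-instruction writeback rule. The C sketch below models this under the assumption that the two output registers are the ones conventionally called HI and LO in MIPS documentation (this section does not name them); the struct and function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* The multiply unit's two output registers (conventional names HI and LO). */
struct hi_lo {
    uint32_t hi;
    uint32_t lo;
};

/* Signed multiply: upper 32 bits of the product in HI, lower 32 bits in LO. */
struct hi_lo model_mult(int32_t a, int32_t b)
{
    int64_t product = (int64_t)a * (int64_t)b;
    struct hi_lo r;
    r.hi = (uint32_t)((uint64_t)product >> 32);
    r.lo = (uint32_t)((uint64_t)product & 0xFFFFFFFFu);
    return r;
}

/* Signed divide: quotient in LO, remainder in HI. */
struct hi_lo model_div(int32_t a, int32_t b)
{
    /* Divide-by-zero and INT32_MIN / -1 are outside the scope of this sketch. */
    assert(b != 0 && !(a == INT32_MIN && b == -1));
    struct hi_lo r;
    r.lo = (uint32_t)(a / b);
    r.hi = (uint32_t)(a % b);
    return r;
}
```

Results are then moved into general-purpose registers by separate instructions, which is what lets the unit run alongside the main pipeline.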
Programmer-visible pipeline effects
In addition to the discussion above, programmers of R3xxx architecture CPUs also must be aware of certain effects of the MIPS pipeline. Specifically, the results of certain operations may not be available in the immediately subsequent instruction; the programmer may need to be explicitly aware of such cases.
Figure 1.2. Pipeline branch delays (labels: branch address, branch, branch delay, branch target)
Delayed branches: the pipeline structure of the MIPS CPU (see "Figure 1.2. Pipeline branch delays") means that when a jump instruction reaches the ``execute'' phase and a new program counter is generated, the instruction after the jump will already have been decoded. Rather than discard this potentially useful work, the architecture rules state that the instruction after a branch is always executed before the instruction at the target of the branch. "Figure 1.2. Pipeline branch delays" shows that a special path is provided through the ALU to make the branch address available a half-clock early, ensuring that there is only a one cycle delay before the outcome of the branch is determined and the appropriate instruction flow (branch taken or not taken) is initiated.

It is the responsibility of the compiler system or the assembler programmer to allow for and even to exploit this "branch delay slot"; it turns out that it is usually possible to arrange code such that the instruction in the ``delay slot'' does useful work. Quite often, the instruction which would otherwise have been placed before the branch can be moved into the delay slot. This can be a bit tricky on a conditional branch, where the branch delay instruction must be (at least) harmless on the path where it isn't wanted. Where nothing useful can be done, the delay slot is filled with a ``nop'' (no-op, or no-operation) instruction. Many MIPS assemblers will hide this feature from the programmer unless explicitly told not to, as described later.

Load data not available to next instruction: another consequence of the pipeline is that a load instruction's data arrives from the cache/memory system AFTER the next instruction's ALU phase starts, so it is not possible to use the data from a load in the following instruction. "Figure 1.3. Pipeline load delays" shows how this works. In the MIPS-1 architecture, the programmer must ensure that this rule is not violated.
[Figure 1.3. The pipeline and load delays. Diagram labels: load instruction, D-cache access, load delay instruction, data.]
Again, most assemblers will hide this if they can. Frequently, the assembler can move an instruction which is independent of the load into the load delay slot; in the worst case, it can insert a nop to ensure proper program execution.
A NOTE ON MACHINE AND ASSEMBLER LANGUAGE
To simplify assembly level programming, the MIPS Corp's assembler (and many other MIPS assemblers) provides a set of "synthetic" instructions. Typically, a synthetic instruction is a common assembly level operation that the assembler will map into one or more true instructions. This mapping can be more intelligent than a mere macro expansion: for example, an immediate load maps into one instruction if the datum is small enough, or into multiple instructions if the datum is larger. These instructions can dramatically simplify assembly level programming. For example, the programmer just writes a ``load immediate'' instruction and the assembler will figure out whether it needs to generate multiple machine instructions or can get by with just one (in this example, depending on the size of the immediate datum).
This is obviously useful, but can be confusing. This manual will use synthetic instructions sparingly, and indicate when it happens. Moreover, the instruction tables below will consistently distinguish between synthetic and machine instructions.
These features are there to help human programmers; most compilers generate instructions which are one-for-one with machine code. However, some compilers will in fact generate synthetic instructions.
Helpful things the assembler does:
32-bit load immediates: the programmer can code a load with any value (including a memory location which will be computed at link time), and the assembler will break it down into two instructions to load the high and low half of the value.
Load from memory location: the programmer can code a load from a memory-resident variable. The assembler will normally replace this by loading a temporary register with the high-order half of the variable's address, followed by a load whose displacement is the low-order half of the address. Of course, this does not apply to variables defined inside C functions, which are implemented either in registers or on the stack.
Efficient access to memory variables: some C programs contain many references to static or extern variables, and a two-instruction sequence to load/store any of them is expensive. Some compilation systems, with run-time support, get around this. Certain variables are selected at compile/assemble time (by default MIPS Corp's assembler selects variables which occupy 8 or less bytes of storage) and kept together in a single section of memory which must end up smaller than 64Kbytes. The run-time system then initializes one register ($28, or gp (global pointer) by convention) to point to the middle of this section. Loads and stores to these variables can now be coded as a single gp relative load or store.
More types of branch condition: the assembler synthesizes a full set of branches conditional on an arithmetic test between two registers.
Simple or different forms of instructions: unary operations such as not and neg are produced as a nor or sub with the zero-valued register $0. Two-operand forms of 3-operand instructions can be written; the assembler will put the result back into the first-specified register.
Hiding the branch delay slot: in normal coding most assemblers will not allow access to the branch delay slot. MIPS Corp.'s assembler, in particular, is exceptionally ingenious and may re-organize the instruction sequence substantially in search of something useful to do in the delay slot. An assembler directive ``.noreorder'' is available where this must not happen.
Hiding the load delay: many assemblers will detect an attempt to use the result of a load in the next instruction, and will either move code around or insert a nop.
Unaligned transfers: the ``unaligned'' load/store instructions will fetch halfword and word quantities correctly, even if the target address turns out to be unaligned.
Other pipeline corrections: some instructions (such as those which use the integer multiply unit) have additional constraints that are implementation specific (see the Appendix on hazards). Many assemblers will just "handle" these cases automatically, or at least warn the programmer about possible hazards violations.
Other optimizations: some MIPS instructions (particularly floating point) take multiple clocks to produce results. However, the hardware is ``interlocked'', so the programmer does not need to be aware of these delays to write correct programs. But MIPS Corp.'s assembler is particularly aggressive in these circumstances, and will perform substantial code movement to try to make the code run faster. This may need to be considered when debugging.
In general, it is best to use a dis-assembler utility to disassemble the resulting binary during debug. This will show the system designers the true code sequence being executed, and thus "uncover" the modifications made by the assembler or compiler.
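The "load from memory location" expansion described above has one subtlety: the 16-bit displacement of the second instruction is sign-extended by the hardware, so the high half loaded by the first instruction has to be rounded up whenever bit 15 of the address is set. The following is a minimal Python sketch of that arithmetic only; the function names are illustrative and do not correspond to any assembler's internals.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def split_hi_lo(addr):
    """Split a 32-bit address into (hi, lo) 16-bit halves.

    hi is what a load-upper-immediate would place in the top half of the
    temporary register; lo is the displacement of the following load.
    Adding 0x8000 before shifting compensates for lo being sign-extended.
    """
    addr &= 0xFFFFFFFF
    lo = addr & 0xFFFF
    hi = ((addr + 0x8000) >> 16) & 0xFFFF
    return hi, lo


def rebuild(hi, lo):
    """Recombine the halves the way the hardware would: (hi << 16) + signed lo."""
    return ((hi << 16) + sign_extend16(lo)) & 0xFFFFFFFF


if __name__ == "__main__":
    for address in (0x12345678, 0x8001ABCD, 0xFFFFFFFF):
        hi, lo = split_hi_lo(address)
        print(f"{address:#010x} -> hi={hi:#06x} lo={lo:#06x} "
              f"rebuilt={rebuild(hi, lo):#010x}")
```

For an address such as 0x8001ABCD the low half has bit 15 set, so the high half becomes 0x8002 rather than 0x8001; the negative displacement then brings the sum back to the intended address.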
MIPS-1 (R30xx) ARCHITECTURE
CHAPTER
Integrated Device Technology, Inc.
PROGRAMMER'S VIEW OF THE PROCESSOR ARCHITECTURE
This chapter describes the assembly programmer's view of the CPU architecture, in terms of registers, instructions, and computational resources. This viewpoint corresponds, for example, to an assembly programmer writing user applications (although more typically, such a programmer would use a high-level language). Information about kernel software development (such as handling interrupts, traps, and cache and memory management) is described in later chapters.
Registers
There are 32 general purpose registers: $0 to $31. Two, and only two, are special to the hardware: $0 always returns zero, no matter what software attempts to store to it; $31 is used by the normal subroutine-calling instruction (jal) for the return address. Note that the call-by-register version (jalr) can use any register for the return address, though practice is to use only $31. In all other respects all registers are identical and can be used in any instruction ($0 can be used as the destination of instructions; the value of $0 will remain unchanged, however, so the instruction would effectively be a NOP).
In the MIPS architecture the ``program counter'' is not a register, and it is probably better not to think of it that way. The return address of a jal is two instructions later in sequence (the instruction after the jump delay slot instruction); the instruction after the call is the call's ``delay slot'' and is typically used to set up the last parameter.
There are no condition codes, and nothing in the ``status register'' or other CPU internals is of any consequence to the user-level programmer.
There are two registers associated with the integer multiplier. These registers, referred to as "HI" and "LO", contain the 64-bit product result of a multiply operation, or the quotient and remainder of a divide.
The floating point math co-processor (called the FPA, for floating point accelerator), if available, adds 32 floating point registers; in simple assembler language they are just called $0 to $31 again, the fact that these are floating point registers being implicitly defined by the instruction. Actually, only the 16 even-numbered registers are usable for math; they can be used for either single-precision (32-bit) or double-precision (64-bit) numbers. When performing double-precision arithmetic, the odd numbered register $N+1 holds the remaining bits of the even numbered register identified as $N. Only moves between integer and FPA, or FPA load/store instructions, ever refer to odd-numbered registers (and even then the assembler helps the programmer forget.)
The FPA also has a different set of registers called ``co-processor registers'' for control purposes. These are typically used to manage the actions/state of the FPA, and should not be confused with the FPA data registers.
Conventional names and uses of general-purpose registers
Although the hardware makes few rules about the use of registers, their practical use is governed by a number of conventions. These conventions allow inter-changeability of tools, operating systems, and library modules. It is strongly recommended that these conventions be followed.
Reg No    Name     Used for
0         zero     Always returns 0
1         at       (assembler temporary) Reserved for use by assembler
2-3       v0-v1    Value (except FP) returned by subroutine
4-7       a0-a3    (arguments) First four parameters for a subroutine
8-15      t0-t7    (temporaries) Subroutines may use without saving
24-25     t8-t9    (temporaries, as t0-t7)
16-23     s0-s7    Subroutine ``register variables''; a subroutine which will write one of these must save the old value and restore it before it exits, so the calling routine sees their values preserved.
26-27     k0-k1    Reserved for use by interrupt/trap handler; may change under your feet
28        gp       global pointer; some runtime systems maintain this to give easy access to (some) ``static'' or ``extern'' variables.
29        sp       stack pointer
30        s8/fp    9th register variable. Subroutines which need one can use this as a ``frame pointer''.
31        ra       Return address for subroutine
Table 2.1. Conventional names of registers with usage mnemonics
With the conventional uses of the registers go a set of conventional names. Given the need to fit in with the conventions, use of the conventional names is pretty much mandatory. The common names are described in Table 2.1, "Conventional names of registers with usage mnemonics".
Notes on conventional register names
at: this register is reserved for use inside the synthetic instructions generated by the assembler. If the programmer must use it explicitly, the directive .noat stops the assembler from using it, but then there are some things the assembler won't be able to do.
v0-v1: used when returning non-floating-point values from a subroutine. To return anything too big to fit in these registers, memory must be used (described in a later chapter).
a0-a3: used to pass the first four non-FP parameters to a subroutine. That's an occasionally-false oversimplification; the actual convention is fully described in a later chapter.
t0-t9: by convention, subroutines may use these values without preserving them. This makes them easy to use as ``temporaries'' when evaluating expressions, but a caller must remember that they may be destroyed by a subroutine call.
s0-s8: by convention, subroutines must guarantee that the values of these registers on exit are the same as they were on entry, either by not using them, or by saving them on the stack and restoring them before exit. This makes them eminently suitable for use as ``register variables'' or for storing any value which must be preserved over a subroutine call.
k0-k1: reserved for use by the trap/interrupt routines, which will not restore their original value; so they are of little use to anyone else.
gp (global pointer): if present, it will point to a load-time-determined location in the midst of your static data. This means that loads and stores to data lying within 32Kbytes either side of the gp value can be performed in a single instruction using gp as the base register. Without the global pointer, loading data from a static memory area takes two instructions: one to load the most significant bits of the 32-bit constant address computed by the compiler and loader, and one to do the data load.
To use gp, a compiler must know at compile time that a datum will end up linked within a 64Kbyte range of memory locations. In practice it can't know, only guess. The usual practice is to put ``small'' global data items in the area pointed to by gp, and to get the linker to complain if it still gets too big. The definition of what is "small" can typically be specified with a compiler switch (most compilers use "-G"). The most common default size is 8 bytes or less. Not all compilation systems or OS loaders support gp.
sp (stack pointer): since it takes explicit instructions to raise and lower the stack pointer, it is generally done only on subroutine entry and exit; and it is the responsibility of the subroutine being called to do this. sp is normally adjusted, on entry, to the lowest point that the stack will need to reach at any point in the subroutine. The compiler can then access stack variables by a constant offset from sp. Stack usage conventions are explained in a later chapter.
fp (also known as s8): a subroutine will use a ``frame pointer'' to keep track of the stack if it wants to use operations which involve extending the stack by an amount which is determined at run-time. Some languages may do this explicitly; assembler programmers are always welcome to experiment; and (for many toolchains) C programs which use the ``alloca'' library routine will find themselves doing so. In this case it is not possible to access stack variables from sp, so fp is initialized by the function prologue to a constant position relative to the function's stack frame. Note that a ``frame pointer'' subroutine may call or be called by subroutines which do not use the frame pointer; so long as the functions it calls preserve the value of fp (as they should) this is OK.
ra (return address): on entry to any subroutine, ra holds the address to which control should be returned, so a subroutine typically ends with the instruction ``jr ra''. Subroutines which themselves call subroutines must first save ra, usually on the stack.
Integer multiply unit registers
MIPS' architects decided that integer multiplication was important enough to deserve a hard-wired instruction. This is not so common in RISCs, which might instead:
implement a ``multiply step'' which fits in the standard integer execution pipeline, and require software routines for every multiplication (e.g. Sparc or AM29000); or
perform integer multiplication in the floating point unit; a good solution, but one which compromises the optional nature of the MIPS floating point ``co-processor''.
The multiply unit consumes a small amount of die area, but dramatically improves performance (and cache performance) over "multiply step" operations. Its basic operation is to multiply two 32-bit values together to produce a 64-bit result, which is stored in two 32-bit registers (called ``hi'' and ``lo'') which are private to the multiply unit. Instructions mfhi, mflo are defined to copy the result out into general registers.
Unlike results for integer operations, the multiply result registers are interlocked. An attempt to read out the results before the multiplication is complete results in the CPU being stopped until the operation completes.
The integer multiply unit will also perform an integer division between values in two general-purpose registers; in this case the ``lo'' register stores the quotient, and the ``hi'' register the remainder.
In the R30xx family, multiply and divide operations each take multiple clocks to complete. The assembler has a synthetic multiply operation which starts the multiply and then retrieves the result into an ordinary register. Note that MIPS Corp.'s assembler may even substitute a series of shifts and adds for multiplication by a constant, to improve execution speed.
Multiply/divide results are written into ``hi'' and ``lo'' as soon as they are available; the effect is not deferred until the writeback pipeline stage, as with writes to general purpose (GP) registers. If a mfhi or mflo instruction is interrupted by some kind of exception before it reaches the writeback stage of the pipeline, it will be aborted with the intention of restarting it. However, a subsequent multiply instruction which has passed the ALU stage will continue (in parallel with exception processing) and would overwrite the ``hi'' and ``lo'' register values, so that the re-execution of the mfhi would get wrong (i.e. new) data. For this reason it is recommended that a multiply should not be started within two instructions after an mfhi/mflo. The assembler will avoid doing this where it can.
Integer multiply and divide operations never produce an exception, though divide by zero produces an undefined result. Compilers will often generate code to trap on errors, particularly on divide by zero. Frequently, this instruction sequence is placed after the divide is initiated, to allow it to execute concurrently with the divide (and avoid a performance loss).
Instructions mthi, mtlo are defined to set up the internal registers from general-purpose registers. They are essential to restore the values of ``hi'' and ``lo'' when returning from an exception, but probably not for anything else.
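The division of results between ``hi'' and ``lo'' described above can be modelled in a few lines. This is a hedged Python sketch of the signed multiply and divide results only: the function names are illustrative, timing and interlocks are not modelled, and because the text says divide by zero gives an undefined result, the model raises an error rather than invent one.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Interpret a 32-bit register value as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mult(rs, rt):
    """Signed 32 x 32 multiply; returns (hi, lo) halves of the 64-bit product."""
    product = (to_signed32(rs) * to_signed32(rt)) & 0xFFFFFFFFFFFFFFFF
    return (product >> 32) & MASK32, product & MASK32


def div(rs, rt):
    """Signed divide; returns (hi, lo) = (remainder, quotient).

    The quotient is truncated toward zero, so the remainder takes the
    sign of the dividend.
    """
    a, b = to_signed32(rs), to_signed32(rt)
    if b == 0:
        # The hardware result is undefined; this model refuses to guess.
        raise ValueError("divide by zero: result undefined")
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    r = a - q * b
    return r & MASK32, q & MASK32


if __name__ == "__main__":
    hi, lo = mult(0x00010000, 0x00010000)
    print(f"mult: hi={hi:#010x} lo={lo:#010x}")
    hi, lo = div(7, 2)
    print(f"div:  hi(remainder)={hi} lo(quotient)={lo}")
```

A compiler-generated divide-by-zero check corresponds to the test on `b` here; on real hardware that check can run while the divide is still in progress.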
Instruction types
The full list of R30xx family integer instructions is presented in an Appendix. Floating point instructions are listed in another Appendix of this manual. Currently, floating point instructions are only available in the R3081, and are described in the R3081 User's Manual.
MIPS-1 uses only three basic instruction encoding formats; this is one of the keys to the high frequencies attained by RISC architectures. Instructions are mostly in numerical order; to simplify reading, the list is occasionally re-ordered for clarity.
Throughout this manual, the description of various instructions will also refer to various subfields of the instruction. In general, the following typical nomenclature is used:
op: The basic op-code, which is 6 bits long. Instructions which have large sub-fields (for example, large immediate values, such as required for the ``long'' j/jal instructions, or arithmetic with a 16-bit constant) have a unique ``op'' field. Other instructions are classified in groups sharing an ``op'' value, distinguished by other fields (``op2'' etc.).
rs, rs1, rs2: One or two fields identifying source registers.
rd: The register to be changed by this instruction.
shift: Shift-amount: how far to shift, used in shift-by-constant instructions.
op2: Sub-code field used for the 3-register arithmetic/logical group of instructions (op value of zero).
offset: 16-bit signed word offset defining the destination of a ``PC-relative'' branch. The branch target will be the instruction ``offset'' words away from the ``delay slot'' instruction after the branch; so a branch-to-self has an offset of -1.
target: 26-bit word address to be jumped to (it corresponds to a 28-bit byte address, which is always word-aligned). The long j instruction is rarely used, so this format is pretty much exclusively for function calls (jal). The high-order 4 bits of the target address can't be specified by this instruction, and are taken from the address of the jump instruction. This means that these instructions can reach anywhere in the 256Mbyte region around the instruction's location. To jump further use a jr (jump register) instruction.
constant: 16-bit integer constant for ``immediate'' arithmetic or logic operations.
mf: Yet another extended opcode field, this time used by ``co-processor'' type instructions.
rg: Field which may hold a source or destination register.
crg: Field to hold the number of a CPU control register (different from the integer register file). Called ``crs''/``crd'' in contexts where it must be a source/destination respectively.
The instruction encodings have been chosen to facilitate the design of a high-frequency CPU, and they reveal portions of the internal CPU design. Although there are variable encodings, those fields which are required very early in the pipeline are encoded in a very regular way:
Source registers are always in the same place, so that the CPU can fetch the two source operands from the integer register file without any conditional decoding. Some instructions may not need both registers, but since the register file is designed to provide two source values on every clock, nothing has been lost.
The 16-bit constant is always in the same place, permitting the appropriate instruction bits to be fed directly into the ALU's input multiplexer, without conditional shifts.
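The ``offset'' and ``target'' field descriptions above translate directly into address arithmetic. The sketch below is illustrative Python, not part of any toolchain, and the function names are assumptions. One detail is stated as an assumption of the sketch: the high-order four bits of a jump are taken from the address of the delay-slot instruction, which differs from the jump's own address only when the jump is the last word of a 256Mbyte region.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(branch_pc, offset_field):
    """Target of a PC-relative branch located at byte address branch_pc.

    The 16-bit field counts words relative to the delay-slot instruction,
    which sits 4 bytes after the branch.
    """
    delay_slot = branch_pc + 4
    return (delay_slot + (sign_extend16(offset_field) << 2)) & MASK32


def jump_target(jump_pc, target_field):
    """Target of a j/jal located at byte address jump_pc.

    The 26-bit word address becomes a 28-bit byte address; the top four
    bits come from the delay-slot address (assumption noted in the lead-in).
    """
    delay_slot = (jump_pc + 4) & MASK32
    return (delay_slot & 0xF0000000) | ((target_field & 0x03FFFFFF) << 2)


if __name__ == "__main__":
    # An offset field of -1 (0xFFFF) branches back to the branch itself.
    print(hex(branch_target(0x80001000, 0xFFFF)))
    print(hex(jump_target(0x80001000, 0x0000400)))
```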
Loading and storing: addressing modes
As mentioned above, there is only one basic ``addressing mode''. Any load or store machine instruction can be written
    operation dest-reg, offset(src-reg)
e.g.:
    lw $1, offset($2); sw $3, offset($4)
Any of the GP registers can be used for the destination and source. The offset is a signed, 16-bit number (so it can be anywhere between -32768 and 32767); the program address used for the load is the sum of src-reg and the offset. This address mode is normally enough to pick out a particular member of a C structure (``offset'' being the distance between the start of the structure and the member required); it implements an array indexed by a constant; it is enough to reference function variables from the stack or frame pointer; and it provides a reasonable sized global area around the gp value for static and extern variables.
The assembler provides the semblance of a simple direct addressing mode, to load the values of memory variables whose address can be computed at link time.
More complex modes such as double-register or scaled index must be implemented with sequences of instructions.
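As a rough illustration of the single addressing mode, the Python sketch below computes the address a load or store would use, and then shows how a scaled-index access decomposes into a shift, an add, and an ordinary base-plus-offset access. The function names are hypothetical and the three-step sequence is just one plausible expansion, not a description of what any particular assembler or compiler emits.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(base_reg, offset_field):
    """Address used by 'operation dest-reg, offset(src-reg)'."""
    return (base_reg + sign_extend16(offset_field)) & MASK32


def indexed_word_address(base_reg, index_reg):
    """Emulate a scaled-index access to a word array with plain instructions.

    Step 1: shift the index left by 2 (multiply by the 4-byte word size).
    Step 2: add the array base address.
    Step 3: use the ordinary addressing mode with a zero offset.
    """
    tmp = (index_reg << 2) & MASK32
    tmp = (tmp + base_reg) & MASK32
    return effective_address(tmp, 0)


if __name__ == "__main__":
    print(hex(effective_address(0x80001000, 0xFFFC)))    # base - 4
    print(hex(indexed_word_address(0x80002000, 3)))      # element 3
```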
Data types in memory and registers
R30xx family CPUs can load or store between 1 and 4 bytes in a single operation. The naming conventions used in the documentation and to build instruction mnemonics are:
``C'' name    MIPS name    Size(bytes)    Assembler mnemonic
int           word         4              ``w''
long          word         4              ``w''
short         halfword     2              ``h''
char          byte         1              ``b''
Integer data types
Byte and halfword loads come in two flavors:
Sign-extend: the lb and lh instructions load the value into the least significant bits of the 32-bit register, but fill the high order bits by copying the ``sign bit'' (bit 7 of a byte, bit 15 of a half-word). This correctly converts a signed value to a 32-bit signed integer.
Zero-extend: the lbu and lhu instructions load the value into the least significant bits of a 32-bit register, with the high order bits filled with zero. This correctly converts an unsigned value in memory to the corresponding 32-bit unsigned integer value; so byte value 254 becomes 32-bit value 254.
If the byte-wide memory location whose address is in t1 contains the value 0xFE (-2, or 254 if interpreted as unsigned), then:
    lb t2, 0(t1)
    lbu t3, 0(t1)
will leave t2 holding the value 0xFFFF FFFE (-2 as signed 32-bit) and t3 holding the value 0x0000 00FE (254 as signed or unsigned 32-bit).
Subtle differences in the way shorter integers are extended to longer ones are a historical cause of C portability problems, and the modern C standards have elaborate rules. On machines like the MIPS, which does not perform 8- or 16-bit precision arithmetic directly, expressions involving short or char variables are less efficient than word operations.
Unaligned loads and stores
Normal loads and stores in the MIPS architecture must be aligned; halfwords may be loaded only from 2-byte boundaries, and words only from 4-byte boundaries. A load instruction with an unaligned address will produce a trap. Because CISC architectures such as the MC680x0 and iAPXx86 do handle unaligned loads and stores, this could complicate porting software from these architectures. The MIPS architecture does provide mechanisms to support this type of operation; in extremity, software can provide a trap handler which will emulate the desired load operation and hide this feature from the application.
All data items declared by C code will be correctly aligned. But when it is known in advance that the program will transfer a word from an address whose alignment is unknown and will be computed at run time, the architecture does allow for a special 2-instruction sequence (much more efficient than a series of byte loads, shifts and assembly). This sequence is normally generated by the macro-instruction ulw (unaligned load word).
(A macro-instruction ulh, unaligned load half, is also provided, and is synthesized by two loads, a shift, and a bitwise ``or'' operation.)
The special machine instructions are lwl and lwr (load word left, load word right). ``Left'' and ``right'' are arithmetical directions, as in ``shift left''; ``left'' is movement towards more significant bits, ``right'' is towards less significant bits. These instructions do three things:
load 1, 2, 3 or 4 bytes from within one aligned 4-byte (word) location;
shift that data to move the byte selected by the address to either the most-significant (lwl) or least-significant (lwr) end of a 32-bit field;
merge the bytes fetched from memory with the data already in the destination.
This breaks most of the rules the architecture usually sticks by; it does a logical operation on a memory variable, for example. Special hardware allows the lwl, lwr pair to be used in consecutive instructions, even though the second instruction uses the value generated by the first.
For example, on a CPU configured as big-endian the assembler instruction:
    ulw t1, 0(t2)
is implemented as:
    lwl t1, 0(t2)
    lwr t1, 3(t2)
Where:
lwl picks up the lowest-addressed byte of the unaligned 4-byte region, together with however many more bytes fit into an aligned word. It then shifts them left, to form the most-significant bytes of the register value.
lwr is aimed at the highest-addressed byte of the unaligned 4-byte region. It loads it, together with any bytes which precede it in the same memory word, and shifts it right to get the least significant bits of the register value. The merge leaves the high-order bits unchanged.
Although special hardware ensures that a nop is not required between the lwl and lwr, there is still a load delay between the second of them and a normal instruction.
Note that if t2 was in fact 4-byte aligned, then both instructions load the entire word; duplicating effort, but achieving the desired effect. The behavior when operating with little-endian byte order is described in a later chapter.
Floating point data in memory
Loads into floating point registers from 4-byte aligned memory move data without any interpretation; a program can load an invalid floating point number and no FP error will result until an arithmetic operation is requested with it as an operand. This allows a programmer to load single-precision values by a load into an even-numbered floating point register; but the programmer can also load a double-precision value by a macro instruction, so that:
    ldc1 $f2, 24(t1)
is expanded to two loads to consecutive registers:
    lwc1 $f2, 24(t1)
    lwc1 $f3, 28(t1)
The C compiler aligns 8-byte long double-precision floating point variables to 8-byte boundaries. R30xx family hardware does not require this alignment; but it is done to avoid compatibility problems with implementations of MIPS-2 or MIPS-3 CPUs such as the R4600 (Orion), where the ldc1 instruction is part of the machine code, and the alignment is necessary.
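Returning to the unaligned word load described earlier in this section, the load, shift, and merge behavior of the lwl/lwr pair can be modelled for the big-endian case as follows. This is a simplified Python sketch: memory is a `bytes` object, the function and constant names are illustrative, and neither the load delay nor little-endian operation is modelled.

```python
MASK32 = 0xFFFFFFFF

# Eight bytes of sample memory at addresses 0..7 (illustrative values).
SAMPLE = bytes([0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77])


def read_aligned_word(mem, addr):
    """Fetch the aligned big-endian word that contains byte address addr."""
    base = addr & ~3
    return int.from_bytes(mem[base:base + 4], "big")


def lwl(mem, addr, rt):
    """Load word left: bytes from addr to the end of its aligned word go to
    the most-significant end of the register; lower bits of rt are kept."""
    shift = 8 * (addr & 3)
    word = read_aligned_word(mem, addr)
    keep = rt & ((1 << shift) - 1)
    return ((word << shift) & MASK32) | keep


def lwr(mem, addr, rt):
    """Load word right: bytes from the start of the aligned word up to addr
    go to the least-significant end; higher bits of rt are kept."""
    shift = 8 * (3 - (addr & 3))
    word = read_aligned_word(mem, addr)
    keep = rt & ~(MASK32 >> shift) & MASK32
    return (word >> shift) | keep


def ulw(mem, addr, rt=0):
    """Unaligned load word as the pair lwl rt, 0(addr); lwr rt, 3(addr)."""
    rt = lwl(mem, addr, rt)
    return lwr(mem, addr + 3, rt)


if __name__ == "__main__":
    for a in range(5):
        print(a, hex(ulw(SAMPLE, a)))
```

When the address is already word-aligned, both helper calls return the whole word, matching the remark above that the two instructions duplicate effort but still give the desired result.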
BASIC ADDRESS SPACE
The way in which MIPS processors use and handle addresses is subtly different from that of traditional CISC CPUs, and may appear confusing. Read the first part of this section carefully. Here are some guidelines:
The addresses put into programs are rarely the same as the physical addresses which come out of the chip (sometimes they're close, but not the same). This manual will refer to them as program addresses and physical addresses respectively. A more common name for program addresses is "virtual addresses"; note that the term "virtual address" does not necessarily imply that an operating system must perform virtual memory management (e.g. demand paging from disks), but rather that the address undergoes some transformation before being presented to physical memory. Although virtual address is a proper term, this manual will typically use the term "program address" to avoid confusing virtual addresses with virtual memory management requirements.
A MIPS-1 CPU has two operating modes: user and kernel. In user mode, any address above 2Gbytes (most-significant bit of the address set) is illegal and causes a trap. Also, some instructions cause a trap in user mode.
The 32-bit program address space is divided into four big areas with traditional names; and different things happen according to the area an address lies in:
kuseg 0x0000 0000 - 0x7FFF FFFF (low 2Gbytes): these are the addresses permitted in user mode. In machines with an MMU ("E" versions of the R30xx family), they will always be translated (more about the R30xx MMU in a later chapter). Software should not attempt to use these addresses unless the MMU is set up. For machines without an MMU ("base" versions of the R30xx family), the kuseg "program address" is transformed to a physical address by adding a 1Gbyte offset; the address transformations for "base versions" of the R30xx family are described later in this chapter. Note, however, that many embedded applications do not use this address segment (those applications which do not require that the kernel and its resources be protected from user tasks).
kseg0 0x8000 0000 - 0x9FFF FFFF (512 Mbytes): these addresses are ``translated'' into physical addresses by merely stripping off the top bit, mapping them contiguously into the low 512 Mbytes of physical memory. This transformation operates the same for both "base" and "E" family members. This segment is referred to as "unmapped" because "E" version devices cannot redirect this translation to a different area of physical memory. Addresses in this region are always accessed through the cache, so may not be used until the caches are properly initialized. They will be used for most programs and data in systems using "base" family members; and will be used for the OS kernel in systems which do use the MMU ("E" version devices).
kseg1 0xA000 0000 - 0xBFFF FFFF (512 Mbytes): these addresses are mapped into physical addresses by stripping off the leading three bits, giving a duplicate mapping of the low 512 Mbytes of physical memory. However, kseg1 program address accesses will not use the cache. The kseg1 region is the only chunk of the memory map which is guaranteed to behave properly from system reset; that's why the after-reset starting point (0xBFC0 0000, commonly called the "reset exception vector") lies within it. The physical address of the starting point is 0x1FC0 0000, which means that the hardware should place the boot ROM at this physical address. Software will therefore use this region for the initial program ROM, and most systems also use it for I/O registers. In general, I/O devices should always be mapped to addresses that are accessible from kseg1, and system ROM is always mapped to contain the reset exception vector. Note that code in the ROM can then be accessed uncacheably (during boot up) using kseg1 program addresses, and can also be accessed cacheably (for normal operation) using kseg0 program addresses.
kseg2 0xC000 0000 - 0xFFFF FFFF (1 Gbyte): this area is only accessible in kernel mode. As for kuseg, in "E" devices program addresses are translated by the MMU into physical addresses; thus, these addresses must not be referenced prior to MMU initialization. For "base versions", physical addresses are generated to be the same as program addresses for kseg2. Note that many systems will not need this region. In "E" versions, it frequently contains OS structures such as page tables; simpler OS'es probably will have little need for kseg2.
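The four regions and the two fixed kseg0/kseg1 mappings can be summarised as a small classifier. The following Python sketch uses only the address ranges given above; the function names are illustrative.

```python
MASK32 = 0xFFFFFFFF


def segment(addr):
    """Name the region of the program address space that addr lies in."""
    addr &= MASK32
    if addr < 0x80000000:
        return "kuseg"
    if addr < 0xA0000000:
        return "kseg0"
    if addr < 0xC0000000:
        return "kseg1"
    return "kseg2"


def unmapped_physical(addr):
    """Physical address for a kseg0 or kseg1 program address.

    kseg0 strips the top bit; kseg1 strips the top three bits. Both land
    in the low 512 Mbytes of physical memory.
    """
    addr &= MASK32
    seg = segment(addr)
    if seg == "kseg0":
        return addr & 0x7FFFFFFF
    if seg == "kseg1":
        return addr & 0x1FFFFFFF
    raise ValueError(f"{seg} addresses are not translated by this fixed rule")


def is_cached(addr):
    """kseg0 accesses go through the cache; kseg1 accesses do not."""
    seg = segment(addr)
    if seg == "kseg0":
        return True
    if seg == "kseg1":
        return False
    raise ValueError(f"cacheability of {seg} is not fixed by the segment alone")


if __name__ == "__main__":
    reset_vector = 0xBFC00000
    print(segment(reset_vector), hex(unmapped_physical(reset_vector)))
    # The same ROM location seen through the cache:
    print(segment(0x9FC00000), hex(unmapped_physical(0x9FC00000)))
```

The last two lines show the point made above about boot ROM: the kseg1 and kseg0 views of the reset vector reach the same physical address, one uncached and one cached.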
SUMMARY OF SYSTEM ADDRESSING
MIPS program addresses are rarely simply the same as physical addresses, but simple embedded software will probably use addresses in kseg0 and kseg1, where the program address is related in an obvious and unchangeable way to physical addresses.
Physical memory locations from 0x2000 0000 (512Mbyte) upward may be difficult to access. In "E" versions of the R30xx family, the only way to reach these addresses is through the MMU. In "base" family members, certain of these physical addresses can be reached using kseg2 or kuseg addresses: the address transformations for base R30xx family members are described later in this chapter.
Kernel vs. user mode
In kernel mode (the CPU resets into this state), all program addresses are accessible. In user mode:
Program addresses above 2Gbytes (top bit set) are illegal and will cause a trap. Note that if the CPU has an MMU, this means all valid user mode addresses must be translated by the MMU; thus, User mode for "E" devices typically requires the use of a memory-mapped OS. For "base" CPUs, kuseg addresses are mapped to a distinct area of physical memory. Thus, kernel memory resources (including I/O devices) can be made inaccessible to User mode software, without requiring a memory-mapping function from the OS. Alternately, the hardware can choose to "ignore" high-order address bits when performing address decoding, thus "condensing" kuseg, kseg2, kseg1, and kseg0 into the same physical memory.
Instructions beyond the standard user set become illegal. Specifically, the kernel can prevent User mode software from accessing the on-chip CP0 (system control coprocessor, which controls exception and machine state and performs the memory management functions of the CPU).
Thus, the primary differences between User and Kernel modes are:
User mode tasks can be inhibited from accessing kernel memory resources, including OS data structures and I/O devices. This also means that various user tasks can be protected from each other.
User mode tasks can be inhibited from modifying the basic machine state, by prohibiting accesses to CP0.
Note that the kernel/user mode bit does not change the interpretation of anything; just some things cease to be allowed in user mode. In kernel mode the CPU can access low addresses just as if it was in user mode, and they will be translated in the same way.
Memory map for CPUs without MMU hardware
The treatment of kseg0 and kseg1 addresses is the same for all R30xx CPUs. If the system can be implemented using only physical addresses in the low 512Mbytes, and system software can be written to use only kseg0 and kseg1, then the choice of "base" vs. "E" versions of the R30xx family is not relevant.
For versions without the MMU ("base versions"), addresses in kuseg and kseg2 will undergo a fixed address translation, and provide the system designer the option to provide additional memory. The base members of the R30xx family provide the following address translations for kuseg and kseg2 program addresses:
kuseg: this region (the low 2Gbytes of program addresses) is translated to a contiguous 2Gbyte physical region between 1 and 3Gbytes. In effect, a 1Gbyte offset is added to each kuseg program address. In hex:
    Program address              Physical Address
    0x0000 0000 - 0x7FFF FFFF    0x4000 0000 - 0xBFFF FFFF
kseg2: these program addresses are genuinely untranslated. So program addresses from 0xC000 0000 - 0xFFFF FFFF emerge as identical physical addresses.
This means that "base" versions can generate most physical addresses (without the use of an MMU), except for those between 512Mbyte and 1Gbyte (0x2000 0000 through 0x3FFF FFFF). As noted above, many systems may ignore high-order address bits when performing address decoding, thus condensing all physical memory into the lowest 512MB addresses.
Subsegments in the R3041: memory width configuration
The R3041 CPU can be configured to access different regions of memory as either 32-, 16- or 8-bits wide. Where the program requests a 32-bit operation to a narrow memory (either with an uncached access, a cache miss, or a store), the CPU may break a transaction into multiple data phases, to match the datum size to the memory port width.
The width configuration is applied independently to subsegments of the normal kseg regions, as follows:
kseg0 and kseg1: as usual, these are both mapped onto the low 512Mbytes. This common region is split into 8 subsegments (64Mbytes each), each of which can be programmed as 8-, 16- or 32-bits wide. The width assignment affects both kseg0 and kseg1 accesses (that is, one can view these as subsegments of the corresponding "physical" addresses).
kuseg: is divided into four 512Mbyte subsegments, each with independently programmable width. Thus, kuseg can be broken into multiple portions, which may have varying widths. An example of this may be a 32-bit main memory with some 16-bit PCMCIA font cards and an 8-bit NVRAM.
kseg2: is divided into two 512Mbyte subsegments, with independently programmable width. Again, this means that kseg2 can support multiple memory subsystems, of varying port width.
Note that once the various memory port widths have been configured (typically at boot time), software does not have to be aware of the actual width of any memory system. It can choose to treat all memory as 32-bit wide, and the CPU will automatically adjust when an access is made to a narrower memory region. This simplifies software development, and also facilitates porting to various system implementations (which may or may not choose the same memory port widths).
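The fixed translations that this section gives for the "base" versions (a 1Gbyte offset for kuseg, identity for kseg2, and the usual bit-stripping for kseg0 and kseg1) can be combined into one function. The Python sketch below is illustrative: the names are assumptions, and it ignores the R3041 port-width configuration, which does not change the addresses produced.

```python
MASK32 = 0xFFFFFFFF


def base_physical(addr):
    """Physical address produced by a "base" (no MMU) R30xx for a program address."""
    addr &= MASK32
    if addr < 0x80000000:          # kuseg: add a 1Gbyte offset
        return addr + 0x40000000
    if addr < 0xA0000000:          # kseg0: strip the top bit
        return addr - 0x80000000
    if addr < 0xC0000000:          # kseg1: strip the top three bits
        return addr - 0xA0000000
    return addr                    # kseg2: untranslated


def reachable_without_mmu(phys):
    """True if some program address maps to this physical address.

    Everything is reachable except the gap between 512Mbyte and 1Gbyte.
    """
    phys &= MASK32
    return not (0x20000000 <= phys <= 0x3FFFFFFF)


if __name__ == "__main__":
    for program_address in (0x00000000, 0x7FFFFFFF, 0x80000000,
                            0xBFC00000, 0xC0000000):
        print(f"{program_address:#010x} -> "
              f"{base_physical(program_address):#010x}")
```

A board that decodes only the low address bits, as mentioned above, would see all four regions fold onto the same physical memory; that folding happens in the external decoder and is outside what this function models.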
SYSTEM CONTROL COPROCESSOR ARCHITECTURE
CHAPTER
Integrated Device Technology, Inc.
This chapter concentrates on the aspects of the R30xx family architecture that must be managed by the OS programmer. Note that most of these features are transparent to the user program author; however, the nature of embedded systems is such that most embedded systems programmers will have a view of the underlying CPU and system architecture, and thus will find this material important.
Co-processors
MIPS uses the term "co-processor" both in a traditional fashion and in a non-traditional fashion. Specifically, the FPA device is a traditional microprocessor co-processor: it is an optional part of the architecture, with its own particular instruction set. Opcodes are reserved and instruction fields defined for up to four ``coprocessors''. Architecturally, the co-processors are tightly coupled to the base integer CPU; for example, the ISA defines instructions to move data directly between memory and the coprocessor, rather than requiring it to be moved into the integer processor first.
However, MIPS also uses the term "co-processor" for the functions required to manage the CPU environment, including exception management, cache control, and memory management. This segmentation ensures that the chip architecture can be varied (e.g. cache architecture, interrupt controller, etc.) without impacting user mode software compatibility. These functions are grouped by MIPS into the on-chip "co-processor 0", or ``system control co-processor'', and these instructions implement the whole CPU control system.
Note that co-processor 0 has no independent existence, and is certainly not optional. It provides a standard way of encoding the instructions which access the status register, so that, although the definition of the status register changes among implementations, programmers can use the same assembler for different CPUs. Similarly, the exception and memory management strategies can be varied among implementations, and these effects isolated to particular portions of the OS kernel.
CPU CONTROL SUMMARY
This chapter, coupled with the chapters on cache management, memory management, and exception processing, provides details on managing the machine and OS state. The areas of interest include:
CPU control and co-processor 0: how privileged instructions are organized, with shortform descriptions. There are relatively few privileged instructions; most of the low-level control over the CPU is exercised by reading and writing bit-fields within special registers.
Exceptions: external interrupts, invalid operations, and arithmetic errors all result in ``exceptions'', where control is transferred to an exception handler routine. MIPS exceptions are extremely simple: the hardware does the absolute minimum, allowing the programmer to tailor the exception mechanism to the needs of the particular system. A later chapter describes MIPS exceptions, why they are ``precise'', exception vectors, and conventions about how to code exception handling routines. Special problems can arise with nested exceptions: exceptions occurring while the CPU is still handling an earlier exception.
Hardware interrupts have their own style and rules. The Exception Management chapter includes an annotated example of a moderately-complicated exception handler.
Caches and cache management: all R30xx implementations have dual caches (the I-cache for instructions, the D-cache for data). On-chip hardware is provided to manage the caches, but the programmer working with I/O devices, particularly with DMA devices, may need to explicitly manage the caches in particular situations. To manipulate the caches, the CPU allows software to isolate them, inhibiting cache/memory traffic and allowing the processor to access the cache as if it were simple memory; the CPU can also swap the roles of the I-cache and D-cache (the only way to make the I-cache writable). Caches must sometimes be cleared of stale or invalid/uninitialized data. Even following power-up, the R30xx caches are in a random state and must be cleaned up before they can be used. A later chapter will discuss the techniques used by software to manage the on-chip cache resources. In addition, techniques to determine the on-chip cache sizes will be shown (the greatest flexibility is achieved if software is written to be independent of cache sizes). For the diagnostics programmer, techniques to test the cache memory and probe for particular entries will be discussed. On some implementations the system designer may make configuration choices about the cache (e.g. the R3081 and R3071 allow the cache organization to be selected between a 16kB I-cache with a smaller D-cache, or equal-sized caches). The cache management chapter will also discuss some of the considerations that apply in making a proper selection.
Write buffer: in R30xx family CPUs the D-cache is always write through; all writes go to main memory as well as to the cache. This simplifies the caches, but main memory won't be able to accept data as fast as the CPU can write it. Much of the performance loss can be made up by using a FIFO store which holds a number of ``write cycles'' (it stores both address and data). In the R30xx family, this FIFO, called the write buffer, is integrated on-chip. System programmers may need to know that writes happen later than the code sequence suggests. The chapter on cache management discusses this.
Starting up: at reset almost nothing is defined, so the software must build carefully. In MIPS CPUs, reset is implemented in almost exactly the same way as the exceptions. A later chapter on reset initialization discusses ways of finding out which CPU is executing the software, and how to get a ROM program to run. An example of a C runtime environment, attending to the stack and special registers, is provided.
Memory management and the TLB: a later chapter will discuss address translation and managing the translation hardware (the TLB). This section is mostly for OS programmers.
CPU CONTROL AND ``CO-PROCESSOR 0''
CPU control instructions
Most control functions are implemented with registers (most of which consist of multiple bitfields). The MIPS architecture has an escape mechanism to define instructions for ``co-processors'', and the CPU control instructions are coded for ``co-processor 0''.
There are several CPU control instructions used in the memory management implementation, which are described in a later chapter. But leaving aside the MMU, CPU control defines just one instruction beyond the necessary moves to and from the control registers.
mtc0 rs, <nn> - Move to co-processor zero
Loads ``co-processor 0'' register number nn from CPU general register rs. It is unusual, and not good practice, to refer to CPU control registers by their number in assembler sources; normal practice is to use the names listed in Table 3.1, "Summary of CPU control registers (not MMU)". In some toolchains the names are defined by a C-style ``include'' file, with the C preprocessor used as a front-end to the assembler; the assembler manual should provide guidance on how to do this. This is the only way of setting bits in a CPU control register.
mfc0 rd, <nn> - Move from co-processor zero
General register rd is loaded with the value from CPU control register number nn. Once again, it is common to use a symbolic name and a macro-processor to save remembering the numbers. This is the only way of inspecting bits in a control register.
rfe - Restore from exception
Note that this is not ``return from exception''. This instruction restores the status register to the state it had prior to the trap. To understand what it does, refer to the status register (SR) defined later in this chapter. The only secure way of returning to user mode from an exception is to return with a jr instruction which has the rfe in its delay slot.
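The three instructions described above share one major opcode, so their encodings are easy to sketch. The following Python fragment is an illustration only: the function names are hypothetical, the bit layout is the standard MIPS-I co-processor 0 format, and the register numbers in CP0_REGS are the conventional ones, assumed here rather than taken from Table 3.1.

```python
# Hypothetical helpers for illustration; not part of any IDT toolchain.
COP0 = 0x10  # major opcode shared by all co-processor 0 instructions

# Conventional CP0 register numbers (assumed; the manual's Table 3.1 is authoritative).
CP0_REGS = {"BadVaddr": 8, "SR": 12, "Cause": 13, "EPC": 14, "PRId": 15}


def mfc0(gpr, cp0_name):
    """Encode 'mfc0 gpr, reg': general register <- control register."""
    return (COP0 << 26) | (0x00 << 21) | (gpr << 16) | (CP0_REGS[cp0_name] << 11)


def mtc0(gpr, cp0_name):
    """Encode 'mtc0 gpr, reg': control register <- general register."""
    return (COP0 << 26) | (0x04 << 21) | (gpr << 16) | (CP0_REGS[cp0_name] << 11)


def rfe():
    """Encode 'rfe': restores SR state only; it does not transfer control."""
    return (COP0 << 26) | (1 << 25) | 0x10


if __name__ == "__main__":
    # k0 is general register 26, t0 is general register 8.
    print(hex(mfc0(26, "Cause")), hex(mtc0(8, "SR")), hex(rfe()))
```

Keeping the numbers in one table mirrors the advice above: sources refer to control registers by name, and only one place needs to know the numbers.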
Standard control registers
This table describes the general CPU control registers (ignoring the MMU control registers). Also note that the typical convention is to reserve k0 and k1 for exception processing, although they are proper registers of the integer unit.
Register Mnemonic   Description
PRId                CPU type and revision level
SR                  (status register) CPU mode flags
Cause               Describes the most recently recognized exception
EPC                 Return address from trap
BadVaddr            Contains the last invalid program address which caused a trap. It is set by address errors of all kinds, even if there is no MMU
Config              CPU configuration (R3081 and R3041 only)
BusCtrl             (R3041 only) configures the bus interface signals. Needs to be set up to match the hardware implementation.
PortSize            (R3041 only) used to flag some program address regions as 8- or 16-bits wide. Must be programmed to match the hardware implementation.
Count               (R3041 only, read/write) a 24-bit counter incrementing with the CPU clock.
Compare             (R3041 only, read/write) a 24-bit value used to wrap around the Count value and set an output signal.
Table 3.1. Summary of CPU control registers (not MMU)
Encoding of control registers
The next section describes the format of the control registers, with a sketch of the function of each field. In most cases, more information about how things work is to be found in separate sections or chapters later.
A note about reserved fields is in order here. Many unused control register fields are marked ``0''. Bits in such fields are guaranteed to read as zero, and should be written as zero. Other reserved fields are marked ``reserved''; software must always write them as zero, and should not assume that it will get back zero or any other particular value.
Registers specific to the memory management system are described in a later chapter.
PRId Register
Figure 3.1. PRId Register fields (reserved, Imp, Rev)
Figure 3.1, "PRId Register fields" shows the layout of the PRId register, a read-only register to be consulted to identify the CPU type (more properly, this register describes CP0, allowing the kernel to dynamically configure itself for various CPU implementations). ``Imp'' should be related to the CPU control register set. The encoding is described below:
type R3000A (including R3051, R3052, R3071, R3081) unique (R3041) ``Imp'' value
Note that when the Imp field indicates IDT unique, the revision number can be used to distinguish among the various CPU implementations. Refer to the R3041 User's manual for the revision level appropriate for that device. Since the R3051, R3052, R3071, and R3081 are kernel compatible with the R3000A, they share the same Imp value.
When printing the value of this register, it is conventional to print it as ``x.y'' where ``x'' and ``y'' are the decimal values of Imp and Rev respectively.
Try not to use this register and the CPU manuals to size things, or to establish the presence or absence of particular features; software will be more portable and robust if it is designed to include code sequences which probe for the existence of individual features. This manual will provide numerous examples designed to determine cache sizes, presence or absence of the TLB, FPA, etc.
SR Register
Figure 3.2. Fields in status register (SR)
The MIPS CPU has remarkably few mode bits; those that exist are defined by fields in the status register SR, as shown in Figure 3.2, "Fields in status register (SR)". Note that there are no modes such as non-translated or non-cached in MIPS CPUs; all translation and caching decisions are made on the basis of the program address. Fields are:
CU3, CU2: Bits (31:30) control the usability of ``co-processors'' 3 and 2 respectively. In the R30xx family, these might be enabled if software wishes to use the BrCond(3:2) input pins for polling, or to speed exception decoding.
CU1: ``co-processor 1 usable'': set to use the FPA if present, clear to disable it. When clear, all FPA instructions cause an exception, even for the kernel. It can be useful to turn off an FPA even when one is available; the bit may also be enabled in devices which do not include an FPA, if the intent is to use the BrCond(1) pin as a polled input.
CU0: ``co-processor 0 usable'': set to be able to use some nominally-privileged instructions in user mode (this is rarely if ever done). The CPU control instructions encoded as ``co-processor 0'' type are always usable in kernel mode, regardless of the setting of this bit.
RE: ``reverse endianness in user mode''. The MIPS processors can be configured, at reset time, with either ``endianness'' (byte ordering convention, discussed in the various CPU's User's Manuals and later in this manual). RE allows binaries intended to be run with one byte ordering convention to be run in systems with the opposite convention, presuming OS software provides the necessary support. When the RE bit is active, user-privilege software runs as if the CPU had been configured with the opposite endianness. However, achieving cross-universe running would require a large software effort as well, and should not be necessary in embedded systems.
BEV: ``boot exception vectors'': when BEV is set, the CPU uses the ROM (kseg1) space exception entry point (described in a later chapter). BEV is usually set to zero in running systems; this relocates the exception vectors to RAM addresses, speeding accesses and allowing the use of "user supplied" exception service routines.
TS: ``TLB shutdown'': in devices which implement the full R3000A MMU, TS gets set if a program address simultaneously matches two TLB entries. Prolonged operation in this state, in some implementations, could cause internal contention and damage to the chip. TLB shutdown is terminal, and can be cleared only by a hardware reset. In base family members, which do not include the TLB, this bit is set by reset; software can rely on this feature to determine the presence or absence of TLB support hardware.
PE: set if a cache parity error has occurred. No exception is generated by this condition, which is really only useful for diagnostics. The MIPS architecture has cache diagnostic facilities because earlier versions of the CPU used external caches, and this provided a way to verify the timing of a particular system. For those implementations the cache parity error bit was an essential design debug tool. For CPUs with on-chip caches this feature is rarely needed; only the R3071 and R3081 implement parity over the on-chip caches.
CM: shows the result of the last load operation performed with the D-cache isolated (described in the chapter on cache management). CM is set if the cache really contained data for the addressed memory location (i.e. if the load would have hit in the cache even if the cache had not been isolated).
PZ: when set, cache parity bits are written as zero and not checked. This was useful in R3000A systems which required external cache RAMs, but is of little relevance to the R30xx family.
SwC, IsC: ``swap caches'' and ``isolate (data) cache''. These are cache mode bits for cache management and diagnostics; their use is described in detail in a later chapter on cache management. In simple terms:
IsC, when set, makes all loads and stores access only the data cache, and never memory; in this mode a partial-word store invalidates the cache entry. Note that when this bit is set, even uncached data accesses will not be seen on the bus; further, this bit is not initialized by reset. Boot-up software must ensure this bit is properly initialized before relying on external data references.
SwC, when set, reverses the roles of the I-cache and D-cache, so that software can access and invalidate I-cache entries.
IM: ``interrupt mask'': an 8-bit field defining which interrupt sources, when active, will be allowed to cause an exception. Six of the interrupt sources are external pins (one may be used by the FPA, which, although it lives on the same chip, is logically external); the other two are the software-writable interrupt bits in the Cause register. No interrupt prioritization is provided by the CPU: the hardware treats all interrupt bits the same. This is described in greater detail in the chapter dealing with exceptions.
KUc, IEc: the two basic CPU protection bits. KUc records whether the CPU is running with kernel privileges or in user mode. In kernel mode, software can get at the whole program address space, and use privileged (``co-processor 0'') instructions. User mode restricts software to program addresses between 0x0000 0000 and 0x7FFF FFFF, and can be denied permission to run privileged instructions; attempts to break the rules result in an exception. IEc is cleared to prevent the CPU taking any interrupt, and set to enable interrupts.
KUp, IEp: ``KU previous, IE previous'': on an exception, the hardware takes the values of KUc and IEc and saves them here, at the same time changing the values of KUc, IEc to select kernel mode with interrupts disabled. The rfe instruction can be used to copy KUp, IEp back into KUc, IEc.
KUo, IEo: ``KU old, IE old'': on an exception the KUp, IEp bits are saved here. Effectively, the six KU/IE bits are operated as a 3-deep, 2-bit wide stack which is pushed on an exception and popped by an rfe. This provides a chance of recovering cleanly from an exception occurring so early in an exception handling routine that the first exception has not yet saved SR. The circumstances in which this can be done are limited, and it is probably only really of use in allowing the user TLB refill code to be made a little shorter, as described in the chapter on memory management.
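The push/pop behaviour of the six KU/IE bits can be modelled in a few lines. This is only a sketch under stated assumptions: it places the stack in SR bits 5..0 (KUo, IEo, KUp, IEp, KUc, IEc from high to low) and assumes the R3000 hardware encoding in which KUc = 0 selects kernel mode and IEc = 0 disables interrupts, so an exception pushes in the pair 00. The function names are hypothetical.

```python
KUIE_MASK = 0x3F  # assumed SR bits 5..0: KUo IEo KUp IEp KUc IEc
WORD = 0xFFFFFFFF


def exception_entry(sr):
    """Push: old <- previous <- current; current becomes 00 (kernel, interrupts off)."""
    stack = sr & KUIE_MASK
    pushed = (stack << 2) & KUIE_MASK  # the low pair becomes 00
    return (sr & ~KUIE_MASK & WORD) | pushed


def rfe(sr):
    """Pop: current <- previous <- old; the old pair is left unchanged."""
    stack = sr & KUIE_MASK
    popped = (stack & 0x30) | ((stack >> 2) & 0x0F)
    return (sr & ~KUIE_MASK & WORD) | popped


if __name__ == "__main__":
    user_sr = 0x10000003            # hypothetical SR value: user mode, interrupts on
    in_handler = exception_entry(user_sr)
    print(hex(in_handler), hex(rfe(in_handler)))
```

Because the stack is three deep, a second exception taken before the handler has saved SR still leaves the original user-mode pair recoverable in the ``old'' position; a third would lose it.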
Cause Register
Figure 3.3. Fields in the Cause register
Figure 3.3, "Fields in the Cause register" shows the fields in the Cause register, which are consulted to determine the kind of exception which happened and will be used to decide which exception routine to call.
BD: ``branch delay'': if set, this bit indicates that EPC does not point to the actual "exception" instruction, but rather to the branch instruction which immediately precedes it. When the exception restart point is an instruction which is in the ``delay slot'' following a branch, EPC has to point to the branch instruction; it is harmless to re-execute the branch, but if the CPU returned from the exception to the branch delay instruction itself the branch would not be taken and the exception would have broken the interrupted program. The only time software might be sensitive to this bit is if it must analyze the ``offending'' instruction (if BD is set then the instruction is at EPC + 4). This would occur if the instruction needs to be emulated (e.g. a floating point instruction in a device with no hardware FPA, or a breakpoint placed in a branch delay slot).
CE: ``co-processor error'': if the exception is taken because a ``coprocessor'' format instruction was for a ``co-processor'' which is not enabled by the corresponding CU bit in SR, then this field holds the coprocessor number from that instruction.
IP: ``Interrupt Pending'': shows the interrupts which are currently asserted (but may be "masked" from actually signalling an exception). Six of these bits follow the CPU inputs for the hardware interrupt levels. The two software interrupt bits are read/writable, and contain the value last written to them. Any of the IP bits, when active and enabled by the appropriate IM bit and the global interrupt enable flag IEc in SR, will cause an interrupt. IP is subtly different from the rest of the Cause register fields; it doesn't indicate what happened when the exception took place, but rather shows what is happening now.
ExcCode: a 5-bit code which indicates what kind of exception happened, as detailed in Table 3.2, "ExcCode values: different kinds of exceptions".
ExcCode Value   Mnemonic     Description
0               Int          Interrupt
1               Mod          ``TLB modification''
2, 3            TLBL, TLBS   ``TLB load/TLB store''
4, 5            AdEL, AdES   Address error (on load/I-fetch or store respectively). Either an attempt to access outside kuseg when in user mode, or an attempt to read a word or half-word at a misaligned address.
6, 7            IBE, DBE     Bus error (instruction fetch or data load, respectively). External hardware has signalled an error of some kind; proper exception handling is system-dependent. The R30xx family CPUs can't take a bus error on a store; the write buffer would make such an exception "imprecise".
8               Syscall      Generated unconditionally by a syscall instruction.
9               Bp           Breakpoint: a break instruction.
10              RI           ``reserved instruction''
11              CpU          ``Co-Processor unusable''
12              Ov           ``arithmetic overflow''. Note that ``unsigned'' versions of instructions (e.g. addu) never cause this exception.
13-31           -            reserved. Some are already defined for MIPS CPUs such as the R6000 and R4xxx.
Table 3.2. ExcCode values: different kinds of exceptions
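As a rough illustration of how the Cause fields and Table 3.2 are used together, the sketch below splits a Cause value into its fields and works out which interrupts could actually be delivered. The bit positions are the conventional R3000 ones and are assumptions here (BD in bit 31, CE in bits 29:28, IP in bits 15:8, ExcCode in bits 6:2, IM in SR bits 15:8, IEc in SR bit 0); all function names are hypothetical.

```python
# ExcCode mnemonics for values 0..12; higher values are reserved on the R30xx.
EXC_NAMES = {
    0: "Int", 1: "Mod", 2: "TLBL", 3: "TLBS", 4: "AdEL", 5: "AdES",
    6: "IBE", 7: "DBE", 8: "Syscall", 9: "Bp", 10: "RI", 11: "CpU", 12: "Ov",
}


def decode_cause(cause):
    """Split a 32-bit Cause value into its fields (assumed R3000 layout)."""
    code = (cause >> 2) & 0x1F
    return {
        "BD": (cause >> 31) & 1,
        "CE": (cause >> 28) & 0x3,
        "IP": (cause >> 8) & 0xFF,
        "ExcCode": code,
        "name": EXC_NAMES.get(code, "reserved"),
    }


def hw_pin_to_ip_bit(pin):
    """External Int<pin> sits above the two software interrupt bits in IP and IM."""
    return pin + 2


def deliverable_interrupts(cause, sr):
    """IP bits that are pending, unmasked by IM, and globally enabled by IEc."""
    if not (sr & 1):                  # IEc clear: nothing can be delivered
        return 0
    return ((cause >> 8) & 0xFF) & ((sr >> 8) & 0xFF)


if __name__ == "__main__":
    print(decode_cause(0x80000020))
    print(bin(deliverable_interrupts(0x00002000, 0x00002001)))
```

Since the hardware offers no prioritization, a handler would typically scan the value returned by `deliverable_interrupts` in whatever order the system designer considers most urgent.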
EPC Register
This is a 32-bit register containing the 32-bit address of the return point for this exception. The instruction causing (or suffering) the exception is at EPC, unless BD is set in Cause, in which case EPC points to the previous (branch) instruction.
BadVaddr Register
A 32-bit register containing the address whose reference led to an exception; it is set on any MMU-related exception, on an attempt by a user program to access addresses outside kuseg, or if an address is wrongly aligned for the datum size referenced. After any other exception this register is undefined. Note in particular that it is not set after a bus error.
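The relationship between EPC, the BD bit, and the instruction that actually took the exception can be written as a one-line calculation. The sketch below assumes BD is bit 31 of Cause (the conventional R3000 position); the function name is hypothetical.

```python
def faulting_instruction_address(epc, cause):
    """Address of the instruction that caused (or suffered) the exception.

    EPC is always the restart address. When BD is set the restart address is
    the branch, and the offending instruction is the delay slot after it.
    """
    bd = (cause >> 31) & 1            # assumed position of the BD bit
    return (epc + 4 * bd) & 0xFFFFFFFF


if __name__ == "__main__":
    breakpoint_code = 9 << 2          # ExcCode 9 (Bp) in bits 6:2
    print(hex(faulting_instruction_address(0x80001000, breakpoint_code)))
    print(hex(faulting_instruction_address(0x80001000, 0x80000000 | breakpoint_code)))
```

Note that only software which has to examine or emulate the instruction needs this adjustment; the return path always uses EPC unmodified.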
R3041, R3071, R3081 specific registers
Count and Compare Registers (R3041 only)
Only present in the R3041, these provide a simple 24-bit counter/timer running at the CPU cycle rate. Count counts up, and then wraps around to zero once it has reached the value in the Compare register. As it wraps around, the TC* CPU output is asserted. According to the CPU configuration (the TC bit of the BusCtrl register), TC* will either remain active until reset by software (by re-writing Compare), or will pulse. In either case the counter just keeps counting. To generate an interrupt, TC* must be connected to one of the interrupt inputs.
From reset Compare is set up to its maximum value (0xFF FFFF), so the counter runs up to 2^24 - 1 before wrapping around.
Config Register (R3071 and R3081)
Figure 3.4. Fields in the R3071/81 Config Register (Lock, Slow Bus, DB Refill, FPInt, Halt, RF, AC, reserved)
Lock: set to write to the register for the last time; all future writes to Config will be ignored. The intention is that initialization software will set up the register and then lock it, in case some ill-behaved piece of software developed for some earlier version of the MIPS architecture tries to stomp on Config; this would have had no effect on earlier CPUs.
Slow Bus: hardware may require that this bit be set. It only matters when the CPU performs a store while running from a cached location. The system hardware design determines the proper setting of this bit; setting it should be permissible for any system, but loses some performance in memory systems able to support more aggressive bus performance. If set, an idle bus cycle is guaranteed between any read and write transfer. This allows additional time for bus tri-stating, control logic generation, etc.
DB Refill: ``data cache block refill'': set to reload a block of words into the data cache on any miss, clear to reload just one word. It can be initialized either way on the R3081, by a reset-time hardware input.
FPInt: controls the CPU interrupt level on which FPA interrupts are reported. On original R3000 CPUs the FPA was external and this was determined by wiring; but the R3081's FPA is on the same chip, and it would be inefficient (and jeopardize pin-compatibility) to send the interrupt off chip and on again. Set FPInt to the binary value of the CPU interrupt pin number which is dedicated to FPA interrupts. By default the field is initialized to "011" to select Int3, following the MIPS convention for the FPA's external interrupt pin. For whichever pin is dedicated to the FPA, the CPU will then ignore the value on the external pin; the IP field of the Cause register will simply follow the FPA. On the R3071, this field is "reserved", and must be written as "000".
Halt: set to bring the CPU to a standstill. It will start again as soon as any interrupt input is asserted (regardless of the state of the interrupt mask). This is useful for power reduction, and can also be used to emulate the MC68000 "Halt" operation.
RF: slows the CPU to 1/16th of the normal clock rate, to reduce power consumption. Illegal unless the CPU is running at 33MHz or higher. Note that the CPU's output clock (which is normally used to synchronize all the interface logic) slows down too; the hardware design should also accommodate this feature if software desires to use it.
AC: ``alternate cache'': selects between the two I-cache/D-cache size organizations.
Reserved: must only be written as zero. It will probably read as zero, but software should not rely on this.
Config Register (R3041)
Figure 3.5. Fields in the R3041 Config (Cache Configuration) Register
Take care: the external pin Int3 corresponds to the bit numbered ``5'' in the IP field of the Cause register or the IM field of the SR register. That's because both the Cause and SR fields support two ``software interrupts'', numbered as bits 0 and 1.
Lock: set to finally configure the register (additional writes will not have any effect until the CPU is reset).
Fields shown as 1 or 0: write exactly the value shown.
DBR: ``DBlockRefill'': set to read four words into the cache on a miss, clear to refill just the word missed. The proper setting for a given system depends on a number of factors, and may best be determined by measuring performance in each mode and selecting the best one. Note that it is possible for software to dynamically reconfigure the refill algorithm depending on the current code executing, presuming the register has not been "locked".
FDM: "Force D-Cache Miss", an R3041-specific cache mode in which all loads result in data being fetched from memory (missing the data cache), but the incoming data is still used to refill the cache. Stores continue to write the cache. This is useful when software desires to obtain the high bandwidth of the cache and cache refills, but the corresponding main memory is "volatile" (e.g. a FIFO, or memory updated by DMA).
BusCtrl Register (R3041 only)
The R3041 has many hardware interface options not available on other members of the R30xx family, which are intended to allow the use of simpler and cheaper interface and memory components. The BusCtrl register does most of the configuration work. It needs to be set strictly in accordance with the needs of the hardware implementation. Note also that its default settings (from reset) leave the interface compatible with other R30xx family members. Figure 3.6, "Fields in the R3041 Bus Control (BusCtrl) Register" shows the layout of the fields; their uses are provided for completeness.
Figure 3.6. Fields in the R3041 Bus Control (BusCtrl) Register
Lock: when software has initialized BusCtrl to its desired state it may write this bit to prevent the contents being changed again until the system is reset.
Fields shown as fixed numbers: write exactly the specified bit pattern to the field (hex is used for the larger ones, the others are given in binary). Improper values may cause test modes and other unexpected side effects.
Mem: ``MemStrobe* control''. Set this field to a two-bit binary value, where one bit makes the strobe active on reads and the other makes it active on writes.
ED: ``ExtDataEn* control''. Encoded as for ``Mem''. Note that the BR bit must be zero for this pin to function as an output.
IO: ``IOStrobe* control''. Encoded as for ``Mem''. Note that the BR bit must be zero for this pin to function as an output.
BE16: ``BE16(1:0)* read control'': selects whether these pins are active on write cycles only.
BE: ``BE(3:0)* read control'': selects whether these pins are active on write cycles only.
BTA: ``Bus turn around time''. Program with a binary number giving the number of cycles of guaranteed delay between the end of a read cycle and the start of the address phase of the next cycle. This field enables the use of devices with slow tri-state times, and enables the system designer to save cost by omitting data transceivers.
DMA: ``DMA Protocol Control'': enables the ``DMA pulse protocol''. When set, the CPU uses its DMA control pins to communicate its desire for the bus even while a DMA is in progress.
TC: ``TC* negation control''. TC* is the output pin which is activated when the internal timer register Count reaches the value stored in Compare. Set TC to zero to make the TC* pin just pulse for a couple of clock periods; leave TC set and TC* will be asserted on a compare and remain asserted until software explicitly clears it (by re-writing Compare with any value). If TC* is used to generate a timer interrupt, then use the default; the pulse is more useful when the output is being used by external logic (e.g. to signal a DRAM refresh).
BR: ``SBrCond(3:2) control''. Set to zero to recycle the SBrCond(3:2) pins as IOStrobe and ExtDataEn respectively.
PortSize Register (R3041 only)
The PortSize register is used to flag different parts of the program address space for accesses to 8-, 16-, or 32-bit wide memory. This register must be set up to the values mandated by the hardware design. See the ``IDT79R3041 Hardware User's Manual'' for details.
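The interaction between Count, Compare, and the TC negation control described above can be modelled with a small simulation. This is a simplified sketch, not a cycle-accurate description: the class and method names are hypothetical, the `latch` flag stands in for the TC bit of BusCtrl, and the pulse is modelled as lasting exactly one tick, whereas the text only says ``a couple of clock periods''.

```python
class CountCompareTimer:
    """Toy model of the R3041 24-bit Count/Compare timer (assumptions noted above)."""

    MASK = 0xFFFFFF                       # both registers are 24 bits wide

    def __init__(self, latch=True):
        self.count = 0
        self.compare = self.MASK          # reset value: maximum, per the text
        self.latch = latch                # True: TC* stays asserted until cleared
        self.tc = False                   # modelled state of the TC* output

    def tick(self):
        """Advance one CPU clock."""
        if not self.latch:
            self.tc = False               # pulse mode: the output drops by itself
        if self.count == self.compare:
            self.count = 0                # wrap to zero after reaching Compare
            self.tc = True
        else:
            self.count = (self.count + 1) & self.MASK

    def write_compare(self, value):
        """Re-writing Compare (with any value) clears a latched TC*."""
        self.compare = value & self.MASK
        self.tc = False


if __name__ == "__main__":
    timer = CountCompareTimer(latch=True)
    timer.write_compare(3)
    for _ in range(4):
        timer.tick()
    print(timer.count, timer.tc)
```

In this model the period is Compare + 1 clocks, and the counter keeps running whether or not the output has been acknowledged, which matches the statement that ``the counter just keeps counting''.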
What registers are relevant when?
The various registers and their fields provide support at specific times during system operation.
After hardware reset: software must initialize SR to get the CPU into the right state to bootstrap itself.
Hardware configuration at start-up: an R3041, R3071, or R3081 requires initialization of Config, BusCtrl, and/or PortSize before very much will work. The system hardware implementation will dictate the proper configuration of these registers.
After any exception: any MIPS exception (apart from one particular MMU event) invokes a single common ``general exception handler'' routine, at a fixed address. On entry, no program registers are saved, only the return address in EPC. The MIPS hardware knows nothing about stacks. In any case the exception routine cannot use the user-mode stack for any purpose; the exception might have been a TLB miss on stack memory. Exception software will need to use at least one of k0 and k1 to point to some ``safe'' (exception-proof) memory space. Key information can be saved, using the other of k0 and k1 to stage data from control registers where necessary. Consult the Cause register to find out what kind of exception it was and dispatch accordingly.
Returning from exception: control must eventually be returned to the value stored in EPC on entry. Whatever kind of exception it was, software will have to adjust SR back upon return from the exception. The special instruction rfe does the job; but note that it does not transfer control. To make the jump back, software must load the original EPC value back into a general-purpose register and use a jr operation.
Interrupts: SR is used to adjust the interrupt masks, to determine which (if any) interrupts will be allowed ``higher priority'' than the current one. The hardware offers no interrupt prioritization, but the software can do whatever it likes.
Instructions which always cause exceptions: these are often used (for system calls, breakpoints, and to emulate some kinds of instruction). They sometimes require partial decoding of the offending instruction, which can usually be found at the location given by EPC. But there is a complication; suppose that an exception occurs just after a branch but in time to prevent the branch delay slot instruction from running. Then EPC will point to the branch instruction (resuming execution starting at the delay slot would cause the branch to be ignored), and the BD bit will be set. This Cause register bit flags this event; to find the instruction at which the exception occurred, add 4 to the EPC value when the BD bit is set.
Cache management routines: SR contains bits defining special modes for cache management. In particular they allow software to isolate the data cache, and to swap the roles of the instruction and data caches.
The subsequent chapters will describe appropriate treatment of these registers, and provide software examples of their use.
CHAPTER
MIPS-1 (R30xx) ARCHITECTURE
Conventional names and uses of general-purpose registers
Although the hardware makes few rules about the use of registers, their practical use is governed by a number of conventions. These conventions allow inter-changeability of tools, operating systems, and library modules. It is strongly recommended that these conventions be followed.
Reg No   Name     Used for
0        zero     Always returns 0
1        at       (assembler temporary) Reserved for use by the assembler
2-3      v0-v1    Value (except FP) returned by a subroutine
4-7      a0-a3    (arguments) First four parameters for a subroutine
8-15     t0-t7    (temporaries) subroutines may use without saving
24-25    t8-t9    (temporaries) subroutines may use without saving
16-23    s0-s7    Subroutine ``register variables''; a subroutine which will write one of these must save the old value and restore it before it exits, so the calling routine sees their values preserved.
26-27    k0-k1    Reserved for use by the interrupt/trap handler; may change under your feet
28       gp       global pointer; some runtime systems maintain this to give easy access to (some) ``static'' or ``extern'' variables.
29       sp       stack pointer
30       s8/fp    9th register variable. Subroutines which need one can use this as a ``frame pointer''.
31       ra       Return address for subroutine
Table 2.1. Conventional names of registers with usage mnemonics
With the conventional uses of the registers go a set of conventional names. Given the need to fit in with the conventions, use of the conventional names is pretty much mandatory. The common names are described in Table 2.1, "Conventional names of registers with usage mnemonics".
Notes on conventional register names
at: this register is reserved for use inside the synthetic instructions generated by the assembler. If the programmer must use it explicitly, the directive .noat (written .set noat in GNU as) stops the assembler from using it, but then there are some things the assembler won't be able to do.
v0-v1: used when returning non-floating-point values from a subroutine. To return anything which does not fit in these two 32-bit registers, memory must be used (described in a later chapter).
a0-a3: used to pass the first four non-FP parameters to a subroutine. That's an occasionally-false oversimplification; the actual convention is fully described in a later chapter.
t0-t9: by convention, subroutines may use these registers without preserving them. This makes them easy to use as ``temporaries'' when evaluating expressions, but a caller must remember that they may be destroyed by a subroutine call.
s0-s8: by convention, subroutines must guarantee that the values of these registers on exit are the same as they were on entry, either by not using them, or by saving them on the stack and restoring them before exit. This makes them eminently suitable for use as ``register variables'' or for storing any value which must be preserved over a subroutine call.
k0-k1: reserved for use by the trap/interrupt routines, which will not restore their original value; so they are of little use to anyone else.
gp: (global pointer). If present, it will point to a load-time-determined location in the midst of your static data. This means that loads and stores to data lying within 32Kbytes either side of the gp value can be performed in a single instruction using gp as the base register. Without the global pointer, loading data from a static memory area takes two instructions: one to load the most significant bits of the 32-bit constant address computed by the compiler and loader, and one to do the data load. To use gp a compiler must know at compile time that a datum will end up linked within a 64Kbyte range of memory locations. In practice it can't know, only guess. The usual practice is to put ``small'' global data items in the area pointed to by gp, and to get the linker to complain if it still gets too big. The definition of what is "small" can typically be specified with a compiler switch (most compilers use "-G"). The most common default size is 8 bytes or less. Not all compilation systems or OS loaders support gp.
sp: (stack pointer). Since it takes explicit instructions to raise and lower the stack pointer, it is generally done only on subroutine entry and exit, and it is the responsibility of the subroutine being called to do this. sp is normally adjusted, on entry, to the lowest point that the stack will need to reach at any point in the subroutine. The compiler can then access stack variables at a constant offset from sp. Stack usage conventions are explained in a later chapter.
fp: (also known as s8). A subroutine will use a ``frame pointer'' to keep track of the stack if it wants to use operations which involve extending the stack by an amount which is determined at run-time. Some languages may do this explicitly; assembler programmers are always welcome to experiment; and (for many toolchains) C programs which use the ``alloca'' library routine will find themselves doing so. In this case it is not possible to access stack variables from sp, so fp is initialized by the function prologue to a constant position relative to the function's stack frame. Note that a ``frame pointer'' subroutine may call or be called by subroutines which do not use the frame pointer; so long as the functions it calls preserve the value of fp (as they should) this is OK.
ra: (return address). On entry to any subroutine, ra holds the address to which control should be returned, so a subroutine typically ends with the instruction ``jr ra''. Subroutines which themselves call subroutines must first save ra, usually on the stack.
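The 32Kbyte reach of gp and the two-instruction alternative described above both follow from the 16-bit signed offset in load/store instructions. The sketch below illustrates the arithmetic; the function names are hypothetical, and the high/low split shown is the usual convention for pairing a load-upper instruction with a signed 16-bit offset, assumed here rather than quoted from this manual.

```python
def gp_reachable(address, gp):
    """True if 'address' can be reached with a signed 16-bit offset from gp."""
    offset = address - gp
    return -0x8000 <= offset <= 0x7FFF


def split_hi_lo(address):
    """Split a 32-bit address for a two-instruction access.

    Returns (hi, lo) such that (hi << 16) + lo == address, where lo is a
    signed 16-bit offset. The high part is rounded up when lo is negative.
    """
    lo = address & 0xFFFF
    if lo >= 0x8000:
        lo -= 0x10000                 # interpret the low half as signed
    hi = ((address - lo) >> 16) & 0xFFFF
    return hi, lo


if __name__ == "__main__":
    gp = 0x10008000                   # hypothetical gp value
    print(gp_reachable(0x10008000 + 0x7FFF, gp), gp_reachable(0x10008000 + 0x8000, gp))
    hi, lo = split_hi_lo(0x80018000)
    print(hex(hi), lo)
```

The asymmetry of the signed range (32768 bytes below, 32767 above) is why the total window is described as 64Kbytes centred on the gp value.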
Integer multiply unit registers
MIPS' architects decided that integer multiplication was important enough to deserve a hard-wired instruction. This is not so common in RISCs, which might instead:
implement a ``multiply step'' which fits in the standard integer execution pipeline, and require software routines for every multiplication (e.g. Sparc or AM29000); or
perform integer multiplication in the floating point unit, a good solution but one which compromises the optional nature of the MIPS floating point ``co-processor''.
The multiply unit consumes a small amount of die area, but dramatically improves performance (and cache performance) over "multiply step" operations. Its basic operation is to multiply two 32-bit values together to produce a 64-bit result, which is stored in two 32-bit registers (called ``hi'' and ``lo'') which are private to the multiply unit. Instructions mfhi, mflo are defined to copy the result out into general registers.
Unlike results for integer operations, the multiply result registers are interlocked. An attempt to read out the results before the multiplication is complete results in the CPU being stopped until the operation completes.
The integer multiply unit will also perform an integer division between values in two general-purpose registers; in this case the ``lo'' register stores the quotient, and the ``hi'' register the remainder.
In the R30xx family, multiply and divide operations each take multiple clocks, with division the slower of the two. The assembler has a synthetic multiply operation which starts the multiply and then retrieves the result into an ordinary register. Note that MIPS Corp.'s assembler may even substitute a series of shifts and adds for multiplication by a constant, to improve execution speed.
Multiply/divide results are written into ``hi'' and ``lo'' as soon as they are available; the effect is not deferred until the writeback pipeline stage, as it is with writes to general purpose (GP) registers. If a mfhi or mflo instruction is interrupted by some kind of exception before it reaches the writeback stage of the pipeline, it will be aborted with the intention of restarting it. However, a subsequent multiply instruction which has passed the ALU stage will continue (in parallel with exception processing) and would overwrite the ``hi'' and ``lo'' register values, so that the re-execution of the mfhi would get wrong (i.e. new) data. For this reason it is recommended that a multiply should not be started within two instructions of an mfhi/mflo. The assembler will avoid doing this where it can.
Integer multiply and divide operations never produce an exception, though divide by zero produces an undefined result. Compilers will often generate code to trap on errors, particularly on divide by zero. Frequently, this instruction sequence is placed after the divide is initiated, to allow it to execute concurrently with the divide (and avoid a performance loss).
Instructions mthi, mtlo are defined to set up the internal registers from general-purpose registers. They are essential to restore the values of ``hi'' and ``lo'' when returning from an exception, but probably not for anything else.
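To make the ``hi''/``lo'' results concrete, the sketch below models the signed multiply and divide on 32-bit register values. It is an illustration under assumptions: the function names are hypothetical, the quotient is assumed to truncate toward zero with the remainder taking the sign of the dividend (the usual MIPS behaviour), and the divide-by-zero case raises an error only because the hardware result is undefined and there is nothing sensible to model.

```python
WORD = 0xFFFFFFFF


def to_signed32(value):
    """Interpret a 32-bit register value as a two's-complement integer."""
    value &= WORD
    return value - (1 << 32) if value & 0x80000000 else value


def mult(rs, rt):
    """Signed 32x32 multiply; returns (hi, lo) halves of the 64-bit product."""
    product = (to_signed32(rs) * to_signed32(rt)) & ((1 << 64) - 1)
    return (product >> 32) & WORD, product & WORD


def div(rs, rt):
    """Signed divide; returns (hi, lo) = (remainder, quotient)."""
    dividend, divisor = to_signed32(rs), to_signed32(rt)
    if divisor == 0:
        # Hardware raises no exception here; the result is simply undefined.
        raise ZeroDivisionError("hi/lo are undefined after divide by zero")
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient              # truncate toward zero (assumed)
    remainder = dividend - quotient * divisor
    return remainder & WORD, quotient & WORD


if __name__ == "__main__":
    print([hex(x) for x in mult(0x10000, 0x10000)])
    print([hex(x) for x in div(7, 2)])
```

A compiler-generated divide-by-zero check corresponds to the `divisor == 0` test here, except that on the real CPU it can run while the divide is still in progress.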
Instruction types
A full list of R30xx family integer instructions is presented in an appendix. Floating point instructions are also listed in an appendix of this manual. Currently, floating point instructions are only available in the R3081, and are described in the R3081 User's Manual.

MIPS-1 uses only three basic instruction encoding formats; this is one of the keys to the high frequencies attained by RISC architectures. Instructions are mostly in numerical order; to simplify reading, the list is occasionally re-ordered for clarity.

Throughout this manual, the description of various instructions will also refer to various subfields of the instruction. In general, the following typical nomenclature is used:

op: The basic op-code, which is 6 bits long. Instructions which have large sub-fields (for example, large immediate values, such as required for the ``long'' j/jal instructions, or arithmetic with a 16-bit constant) have a unique ``op'' field. Other instructions are classified in groups sharing an ``op'' value, distinguished by other fields (``op2'' etc.).

rs, rs1, rs2: One or two fields identifying source registers.

rd: The register to be changed by this instruction.

sa: Shift-amount: how far to shift, used in shift-by-constant instructions.

op2: Sub-code field used for the 3-register arithmetic/logical group of instructions (``op'' value of zero).

offset: 16-bit signed word offset defining the destination of a ``PC-relative'' branch. The branch target will be the instruction ``offset'' words away from the ``delay slot'' instruction after the branch; so a branch-to-self has an offset of -1.

target: 26-bit word address to be jumped to (it corresponds to a 28-bit byte address, which is always word-aligned). The long j instruction is rarely used, so this format is pretty much exclusively for function calls (jal). The high-order 4 bits of the target address can't be specified by this instruction, and are taken from the address of the jump instruction. This means that these instructions can reach anywhere in the 256Mbyte region around the instruction's location. To jump further, use a jr (jump register) instruction.

constant: 16-bit integer constant for ``immediate'' arithmetic or logic operations.

mf: Yet another extended opcode field, this time used by ``coprocessor'' type instructions.

rg: Field which may hold a source or destination register.

crg: Field to hold the number of a CPU control register (different from the integer register file). Called ``crs''/``crd'' in contexts where it must be a source/destination respectively.

The instruction encodings have been chosen to facilitate the design of a high-frequency CPU, and they do reveal portions of the internal design. Although there are variable encodings, those fields which are required very early in the pipeline are encoded in a very regular way:

• Source registers are always in the same place, so that the CPU can fetch two operands from the integer register file without any conditional decoding. Some instructions may not need both registers, but since the register file is designed to provide two source values on every clock nothing has been lost.

• The 16-bit constant is always in the same place, permitting the appropriate instruction bits to be fed directly into the ALU's input multiplexer, without conditional shifts.
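The field descriptions above can be made concrete with a short Python sketch. It is an illustration only: the function names are hypothetical, and the bit positions used are the standard MIPS-1 layout, which the text above does not spell out. The jump model takes the high-order 4 bits from the address of the delay-slot instruction (jump address plus 4); that differs from using the jump's own address only when the jump is the last word of a 256Mbyte region.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(value):
    """Treat the low 16 bits of value as a signed quantity."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def decode(insn):
    """Split a 32-bit instruction word into its fixed-position fields."""
    return {
        "op":       (insn >> 26) & 0x3F,   # 6-bit basic op-code
        "rs1":      (insn >> 21) & 0x1F,   # first source (often written rs)
        "rs2":      (insn >> 16) & 0x1F,   # second source (often written rt)
        "rd":       (insn >> 11) & 0x1F,   # register to be changed
        "sa":       (insn >> 6) & 0x1F,    # shift amount
        "op2":      insn & 0x3F,           # sub-code when op == 0
        "constant": insn & 0xFFFF,         # 16-bit immediate / branch offset
        "target":   insn & 0x03FFFFFF,     # 26-bit word address for j/jal
    }

def branch_target(branch_pc, offset16):
    """Target is `offset` words away from the delay-slot instruction."""
    delay_slot = branch_pc + 4
    return (delay_slot + (sign_extend16(offset16) << 2)) & MASK32

def jump_target(jump_pc, target26):
    """Low 28 bits come from the instruction, top 4 from the delay slot."""
    delay_slot = (jump_pc + 4) & MASK32
    return (delay_slot & 0xF0000000) | ((target26 & 0x03FFFFFF) << 2)

fields = decode(0x00221821)     # encoding of: addu $3, $1, $2
print(fields["op"], fields["rs1"], fields["rs2"], fields["rd"])  # prints: 0 1 2 3
```

With these definitions, branch_target(pc, 0xFFFF) returns pc itself, matching the statement that a branch-to-self has an offset of -1.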
Loading storing: addressing modes
As mentioned above, there is only one basic ``addressing mode''. Any load or store machine instruction can be written

    operation dest-reg, offset(src-reg)

e.g.:

    lw $1, offset($2); sw $3, offset($4)

Any of the GP registers can be used for the destination and source. The offset is a signed, 16-bit number (so it can be anywhere between -32768 and 32767); the program address used for the load is the sum of src-reg and the offset.

This address mode is normally enough to pick out a particular member of a C structure (``offset'' being the distance between the start of the structure and the member required); it implements an array indexed by a constant; it is enough to reference function variables from the stack or frame pointer; and it provides a reasonable sized global area around the gp value for static and extern variables.

The assembler provides the semblance of a simple direct addressing mode, to load the values of memory variables whose address can be computed at link time.

More complex modes such as double-register or scaled index must be implemented with sequences of instructions.
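The single addressing mode amounts to one addition. The following Python sketch (hypothetical names, and an invented structure address used purely for illustration) computes the program address for a load or store and checks whether a given distance can be expressed in the 16-bit offset at all.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(value):
    """Treat the low 16 bits of value as a signed quantity."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def effective_address(base, offset16):
    """Program address = contents of src-reg + sign-extended offset."""
    return (base + sign_extend16(offset16)) & MASK32

def fits_in_offset(distance):
    """True if a byte distance is reachable with a single load/store."""
    return -32768 <= distance <= 32767

struct_base = 0x80001000   # hypothetical address of a C structure
print(hex(effective_address(struct_base, 8)))        # prints: 0x80001008
print(hex(effective_address(struct_base, 0xFFFC)))   # prints: 0x80000ffc
print(fits_in_offset(40000))                         # prints: False
```

A distance that fails fits_in_offset is the case where an instruction sequence is needed to build the address in a register first.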
Data types in memory and registers
R30xx family CPUs can load or store between 1 and 4 bytes in a single operation. Naming conventions are used in the documentation and to build instruction mnemonics:

    ``C'' name    MIPS name    Size(bytes)    Assembler mnemonic
    int           word         4              ``w'' as in lw
    long          word         4              ``w'' as in lw
    short         halfword     2              ``h'' as in lh
    char          byte         1              ``b'' as in lb
Integer data types

Byte and halfword loads come in two flavors:

• Sign-extend: lb and lh load the value into the least significant bits of the 32-bit register, but fill the high order bits by copying the ``sign bit'' (bit 7 of a byte, bit 15 of a half-word). This correctly converts a signed value to a 32-bit signed integer.

• Zero-extend: instructions lbu and lhu load the value into the least significant bits of a 32-bit register, with the high order bits filled with zero. This correctly converts an unsigned value in memory to the corresponding 32-bit unsigned integer value; so byte value 254 becomes 32-bit value 254.

If the byte-wide memory location whose address is in t1 contains the value 0xFE (-2, or 254 if interpreted as unsigned), then:

    lb  t2, 0(t1)
    lbu t3, 0(t1)

will leave t2 holding the value 0xFFFF FFFE (-2 as signed 32-bit) and t3 holding the value 0x0000 00FE (254 as signed or unsigned 32-bit).

Subtle differences in the way shorter integers are extended to longer ones are a historical cause of C portability problems, and the modern C standards have elaborate rules. On machines like the MIPS, which does not perform 8- or 16-bit precision arithmetic directly, expressions involving short or char variables are less efficient than word operations.

Unaligned loads and stores

Normal loads and stores in the MIPS architecture must be aligned; halfwords may be loaded only from 2-byte boundaries, and words only from 4-byte boundaries. A load instruction with an unaligned address will produce a trap. Because CISC architectures such as the MC680x0 and iAPXx86 do handle unaligned loads and stores, this could complicate porting software from these architectures. The MIPS architecture does provide mechanisms to support this type of operation; in extremity, software can provide a trap handler which will emulate the desired load operation and hide this feature from the application.

All data items declared by C code will be correctly aligned. But when it is known in advance that the program will transfer a word from an address whose alignment is unknown and will be computed at run time, the architecture does allow for a special 2-instruction sequence (much more efficient than a series of byte loads, shifts and assembly). This sequence is normally generated by the macro-instruction ulw (unaligned load word).
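Returning to the two flavors of byte and halfword load described under ``Integer data types'', the difference is easy to see in a small model. This Python sketch is illustrative only; the functions are named after the instructions they imitate but take the already-fetched memory value as their argument, which is a simplification.

```python
def lb(byte):
    """Sign-extending byte load: copy bit 7 into bits 8..31."""
    byte &= 0xFF
    return (byte | 0xFFFFFF00) if byte & 0x80 else byte

def lbu(byte):
    """Zero-extending byte load: bits 8..31 are cleared."""
    return byte & 0xFF

def lh(half):
    """Sign-extending halfword load: copy bit 15 into bits 16..31."""
    half &= 0xFFFF
    return (half | 0xFFFF0000) if half & 0x8000 else half

def lhu(half):
    """Zero-extending halfword load: bits 16..31 are cleared."""
    return half & 0xFFFF

print(hex(lb(0xFE)), hex(lbu(0xFE)))   # prints: 0xfffffffe 0xfe
```

The printed values correspond to the 0xFFFF FFFE and 0x0000 00FE results of the lb/lbu example in the text.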
A macro-instruction ulh, unaligned load half, is also provided, and is synthesized by two loads, a shift, and a bitwise ``or'' operation.

The special machine instructions are lwl and lwr (load word left, load word right). ``Left'' and ``right'' are arithmetical directions, as in ``shift left''; ``left'' is movement towards more significant bits, ``right'' is towards less significant bits. These instructions do three things:

• load 1, 2, 3 or 4 bytes from within one aligned 4-byte (word) location;

• shift that data to move the byte selected by the address to either the most-significant (lwl) or least-significant (lwr) end of a 32-bit field;

• merge the bytes fetched from memory with the data already in the destination.

This breaks most of the rules the architecture usually sticks by; it does a logical operation on a memory variable, for example. Special hardware allows the lwl, lwr pair to be used in consecutive instructions, even though the second instruction uses the value generated by the first.

For example, on a CPU configured as big-endian the assembler instruction:

    ulw t1, 0(t2)

is implemented as:

    lwl t1, 0(t2)
    lwr t1, 3(t2)

Where:

• lwl picks up the lowest-addressed byte of the unaligned 4-byte region, together with however many more bytes fit into an aligned word. It then shifts them left, to form the most-significant bytes of the register value.

• lwr is aimed at the highest-addressed byte of the unaligned 4-byte region. It loads it, together with any bytes which precede it in the same memory word, and shifts it right to get the least significant bits of the register value. The merge leaves the high-order bits unchanged.

• Although special hardware ensures that a nop is not required between the lwl and lwr, there is still a load delay between the second of them and a normal instruction.

• Note that if t2 was in fact 4-byte aligned, then both instructions load the entire word; duplicating effort, but achieving the desired effect.

The CPU behavior when operating with little-endian byte order is described in a later chapter.

Floating point data in memory

Loads into floating point registers from 4-byte aligned memory move data without any interpretation; a program can load an invalid floating point number and no FP error will result until an arithmetic operation is requested with it as an operand. This allows a programmer to load single-precision values by a load into an even-numbered floating point register; but the programmer can also load a double-precision value by a macro instruction, so that:
ldc1 $f2, 24(t1)
is expanded to two loads to consecutive registers:

    lwc1 $f2, 24(t1)
    lwc1 $f3, 28(t1)

The C compiler aligns 8-byte long double-precision floating point variables to 8-byte boundaries. R30xx family hardware does not require this alignment; it is done to avoid compatibility problems with implementations of MIPS-2 or MIPS-3 CPUs such as the R4600 (Orion), where the ldc1 instruction is part of the machine code, and the alignment is necessary.
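The big-endian lwl/lwr pair described under ``Unaligned loads and stores'' can be checked with a behavioural model. The Python sketch below is an illustration under stated assumptions: memory is a bytes object indexed from address 0, byte order is big-endian only, and the function names (lwl_be, lwr_be, ulw_be) are hypothetical. Each function returns the new register value given the old one, so the merge step is explicit.

```python
MASK32 = 0xFFFFFFFF

def aligned_word_be(mem, addr):
    """Fetch the aligned big-endian word that contains byte address addr."""
    base = addr & ~3
    return int.from_bytes(mem[base:base + 4], "big")

def lwl_be(mem, addr, reg):
    """Load word left: bytes addr..end of word go to the top of reg."""
    n = addr & 3                              # byte position within the word
    loaded = (aligned_word_be(mem, addr) << (8 * n)) & MASK32
    keep = (1 << (8 * n)) - 1                 # low n bytes of reg survive
    return loaded | (reg & keep)

def lwr_be(mem, addr, reg):
    """Load word right: bytes start of word..addr go to the bottom of reg."""
    n = addr & 3
    loaded = aligned_word_be(mem, addr) >> (8 * (3 - n))
    keep = (MASK32 << (8 * (n + 1))) & MASK32  # high 3-n bytes of reg survive
    return (reg & keep) | loaded

def ulw_be(mem, addr, reg=0):
    """The two-instruction sequence: lwl at addr, then lwr at addr + 3."""
    return lwr_be(mem, addr + 3, lwl_be(mem, addr, reg))

mem = bytes(range(0x10, 0x20))     # byte at address i holds 0x10 + i
print(hex(ulw_be(mem, 1)))         # prints: 0x11121314
print(hex(ulw_be(mem, 4)))         # prints: 0x14151617
```

The second call shows the aligned case: both halves fetch the whole word, and the result is still correct.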
BASIC ADDRESS SPACE
The way in which MIPS processors use and handle addresses is subtly different from that of traditional CISC CPUs, and may appear confusing. Read the first part of this section carefully. Here are some guidelines:

• The addresses put into programs are rarely the same as the physical addresses which come out of the chip (sometimes they're close, but not the same). This manual will refer to them as program addresses and physical addresses respectively. A more common name for program addresses is "virtual addresses"; note that the term "virtual address" does not necessarily imply that an operating system must perform virtual memory management (e.g. demand paging from disks), but rather that the address undergoes some transformation before being presented to physical memory. Although virtual address is a proper term, this manual will typically use the term "program address" to avoid confusing virtual addresses with virtual memory management requirements.

• A MIPS-1 CPU has two operating modes: user and kernel. In user mode, any address above 2Gbytes (most-significant bit of the address set) is illegal and causes a trap. Also, some instructions cause a trap in user mode.

• The 32-bit program address space is divided into four areas with traditional names; different things happen according to the area an address lies in:

kuseg 0x0000 0000 - 0x7FFF FFFF (low 2Gbytes): these are the addresses permitted in user mode. In machines with an MMU ("E" versions of the R30xx family), they will always be translated (more about the R30xx MMU in a later chapter). Software should not attempt to use these addresses unless the MMU is set up. For machines without an MMU ("base" versions of the R30xx family), the kuseg "program address" is transformed to a physical address by adding an offset; the address transformations for "base versions" of the R30xx family are described later in this chapter. Note, however, that many embedded applications do not use this address segment (those applications which do not require that the kernel and its resources be protected from user tasks).

kseg0 0x8000 0000 - 0x9FFF FFFF (512 Mbytes): these addresses are ``translated'' into physical addresses by merely stripping off the top bit, mapping them contiguously into the low 512 Mbytes of physical memory. This transformation operates the same for both "base" and "E" family members. This segment is referred to as "unmapped" because "E" version devices cannot redirect this translation to a different area of physical memory. Addresses in this region are always accessed through the cache, so they may not be used until the caches are properly initialized. They will be used for most programs and data in systems using "base" family members; and will be used for the OS kernel in systems which do use the MMU ("E" version devices).
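The user-mode rule and the kseg0 transformation described so far reduce to simple bit tests. A minimal Python sketch follows, with hypothetical function names; it deliberately covers only kuseg legality and kseg0, and does not model the MMU or the "base" version kuseg offset.

```python
def legal_in_user_mode(addr):
    """User mode may only use addresses with the most-significant bit clear."""
    return (addr & 0x80000000) == 0

def in_kseg0(addr):
    """True for program addresses 0x8000 0000 through 0x9FFF FFFF."""
    return 0x80000000 <= addr <= 0x9FFFFFFF

def kseg0_to_physical(addr):
    """kseg0 'translation' is just stripping the top address bit."""
    if not in_kseg0(addr):
        raise ValueError("not a kseg0 program address: %#010x" % addr)
    return addr & 0x7FFFFFFF

print(legal_in_user_mode(0x7FFFFFFF))        # prints: True
print(legal_in_user_mode(0x80000000))        # prints: False
print(hex(kseg0_to_physical(0x80001234)))    # prints: 0x1234
```

The highest kseg0 address, 0x9FFF FFFF, maps to physical 0x1FFF FFFF, the top of the low 512 Mbytes.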
kseg1 0xA000 0000 - 0xBFFF FFFF (512 Mbytes): these addresses are mapped into physical addresses by stripping off the leading three bits, giving a duplicate mapping of the low 512 Mbytes of physical memory. However, kseg1 program address accesses will not use the cache. The kseg1 region is the only chunk of the memory map which is guaranteed to behave properly from system reset; that's why the after-reset starting point (0xBFC0 0000, commonly called the "reset exception vector") lies within it. The physical address of the starting point is 0x1FC0 0000, which means that the hardware should place the boot ROM at this physical address. Software will therefore use this region for the initial program ROM, and most systems also use it for I/O registers. In general
