Version September 1998
2975 Stender Way, Santa Clara, California 95054 Telephone: (800) 345-7015 TWX: 910-338-2070 FAX: (408) 492-8674 Printed U.S.A. ©1998 Integrated Device Technology, Inc.
©1998 Integrated Device Technology, Inc. All rights reserved. 4-98. Printed in the United States of America. The information in this document is subject to change without notice and is provided "AS IS." Integrated Device Technology, Inc. (hereafter sometimes referred to as "IDT") makes no warranty of any kind, either expressed or implied, including, but not limited to, the implied warranties of merchantability or fitness for a specific purpose. No patents, copyrights, or other intellectual property rights or licenses, either express or implied, are granted by the information in this document. IDT does not warrant that the information meets your requirements or that the information is free of errors. The information may include technical inaccuracies or typographical errors. Changes may be made to the information and incorporated in new editions of this document. The IDT logo is a registered trademark, and IDT, Integrated Device Technology, IDT79RV5000, IDT79RC4700, IDT79RC4650, IDT79RC4640, IDT79RC3041, IDT79R3051, IDT79RC3052, IDT79RC3081, IDT79R3000, BiCameral, BurstRAM, BUSMUX, CacheRAM, DECnet, Double-Density, FASTX, Four-Port, FLEXI-CACHE, Flexi-PAK, Flow-thruEDC, IDT/ IDTenvY, IDT/sae, IDT/sim, IDT/ux, MacStation, MICROSLICE, PalatteDAC, REAL8, RC3041, RC3051, RC3052, RC3081, RC36100, R3721, RC4600, RC4650, RC4700, RC5000, RISController, RISCore, RISC Subsystem, RISC Windows, SARAM, SmartLogic, SyncFIFO, SyncBiFIFO, SPC, TargetSystem and WideBus are trademarks of Integrated Device Technology, Inc. MIPS is a registered trademark, and RISCompiler, RISComponent, RISComputer, RISCware, RISC/os, R3000, and R3010 are trademarks of MIPS Computer Systems, Inc. All other trademarks or registered trademarks are the property of their respective owners.

Integrated Device Technology, Inc. reserves the right to make changes to its products or specifications at any time, without notice, in order to improve design or performance and to supply the best possible product. IDT does not assume any responsibility for use of any circuitry described other than the circuitry embodied in an IDT product. The Company makes no representations that circuitry described herein is free from patent infringement or other rights of third parties which may result from its use. No license is granted by implication or otherwise under any patent, patent rights, or other rights of Integrated Device Technology, Inc.

LIFE SUPPORT POLICY
Integrated Device Technology's products are not authorized for use as critical components in life support devices or systems unless a specific written agreement pertaining to such intended use is executed between the manufacturer and an officer of IDT. Life support devices or systems are devices or systems which are intended for surgical implant into the body, or which support or sustain life, and whose failure to perform, when properly used in accordance with instructions for use provided in the labeling, can be reasonably expected to result in a significant injury to the user. A critical component is any component of a life support device or system whose failure to perform can be reasonably expected to cause the failure of the life support device or system, or to affect its safety or effectiveness.
Chapter 1: Performance; Upward Compatibility; Block Diagram; Features; Instruction Pipeline; Dual Issue; Integer (CPU) Pipeline; Floating-Point Unit (FPU) Pipeline; Virtual-to-Physical Address Mapping; Joint TLB; Cache; Instruction Cache; Data Cache; Write buffer; Clocks; System Interface
Chapter 2: Introduction; Registers; Coprocessors (CP0-CP2) Their Registers; Data Formats Addressing; Instruction Summary; Instruction Formats; Instruction Types; Load Store Instructions; Scheduling Load Delay Slot; Defining Access Types; Computational Instructions; 64-bit Operations; Cycle Timing Multiply Divide Instructions; Jump Branch Instructions; Special Instructions; Coprocessor Instructions; MIPS Instruction Additions Instructions; Prefetch; Integer Conditional Moves
Chapter 3: Introduction; Floating-Point General Registers (FGRs); Floating-Point Registers (FPRs); Floating-Point Control Registers (FCRs); Implementation Revision Register (FCR0); Control/Status Register (FCR31); Accessing Control/Status Register; IEEE Standard 754; Control/Status Register Bit; Control/Status Register Condition; Control/Status Register Cause, Flag, Enable Fields; Cause Bits; Enable Bits; Flag Bits; Control/Status Register Rounding Mode Control Bits; Data Formats; Floating-Point Formats; Binary Fixed-Point Format; Floating-Point Instruction Summary; Floating-Point Load, Store, Move Instructions; Transfers Between Memory; Transfers Between; Load Delay Hardware Interlocks; Data Alignment; Endianness; Floating-Point Conversion Instructions; Floating-Point Computational Instructions; Branch Condition Instructions; Floating-Point Compare Operations; MIPS Instruction Additions Instructions; Indexed Floating-Point Load; Indexed Floating-Point Store; Branch Floating-Point Coprocessor; Floating-Point Multiply-Add/Subtract; Floating-Point Compare; Floating-Point Conditional Moves; Reciprocal's; FPU-Instruction Latencies
Chapter 4: Introduction; Instruction Pipeline Stages; Dual Issue; Branch Delay; Load Delay; Interlock Exception Handling; Stall Conditions; Slip Conditions; Write Buffer
Chapter 5: Introduction; Exception Processing Registers; Context Register; Virtual Address Register (BadVAddr); Count Register (9); Compare Register (11); Status Register (12); Cause Register (13); Exception Program Counter (EPC) Register (14); XContext Register (20); Error Checking Correcting (ECC) Register (26); Cache Error (CacheErr) Register (27); Error Exception Program Counter (Error EPC) Register (30); Overview Exception Types Handling; Sample Hardware Processes Various Exceptions; Reset; Cache Error; Soft Reset NMI; General Exceptions; Exception Vector Locations; Priority Exceptions; Causes, Hardware Processing Software Servicing Exceptions; Reset Exception; Soft Reset Exception; Nonmaskable Interrupt (NMI) Exception; Address-Error Exception; Exceptions; Refill Exception; Invalid Exception; Modified Exception; Cache Error Exception; Error Exception; Integer Overflow Exception; Trap Exception; System Call Exception; Breakpoint Exception; Reserved Instruction Exception; Coprocessor Unusable Exception; Floating-Point Exception; Interrupt Exception; Exception Handling Servicing Flowcharts
Chapter 6: Introduction; Exception Types; Exception Trap Processing; Trap Handlers IEEE Standard Exceptions; Flags; Exceptions; Inexact Exception (I); Invalid Operation Exception (V); Division-by-Zero Exception (Z); Overflow Exception; Underflow Exception; Unimplemented Instruction Exception; Saving Restoring State
Chapter 7: Introduction; Address Spaces; Virtual Address Space; Physical Address Space; Virtual-to-Physical Address Translation; 32-bit Mode Virtual Address Translation; 64-bit Mode Virtual Address Translation; Operating Modes; User Mode Operations; 32-bit User Mode (useg); 64-bit User Mode (xuseg); Supervisor Mode Operations; 32-bit Supervisor Mode, User Space (suseg); 32-bit Supervisor Mode, Supervisor Space (sseg); 64-bit Supervisor Mode, User Space (xsuseg); 64-bit Supervisor Mode, Current Supervisor Space (xsseg); 64-bit Supervisor Mode, Separate Supervisor Space (csseg); Kernel Mode Operations; 32-bit Kernel Mode, User Space (kuseg); 32-bit Kernel Mode, Kernel Space (kseg0); 32-bit Kernel Mode, Kernel Space (kseg1); 32-bit Kernel Mode, Supervisor Space (ksseg); 32-bit Kernel Mode, Kernel Space (kseg3); 64-bit Kernel Mode, User Space (xkuseg); 64-bit Kernel Mode, Current Supervisor Space (xksseg); 64-bit Kernel Mode, Physical Spaces (xkphys); 64-bit Kernel Mode, Kernel Space (xkseg); 64-bit Kernel Mode, Compatibility Spaces; System Control Coprocessor; Translation Lookaside Buffer (TLB); Format Entry; Registers; Index Register; Random Register (1); EntryLo0 (2), EntryLo1 Registers; PageMask Register; Wired Register (6); EntryHi Register (CP0 Register); Processor Revision Identifier (PRId) Register (15); Config Register (16); Load Linked Address (LLAddr) Register (17); Cache Registers [TagLo (28) TagHi (29)]; Virtual-to-Physical Address Translation Process; Hits Misses; Multiple Matches; Invalid Accesses; Instructions
Chapter 8: Introduction; Primary Caches; Cache Line Size; Cache Organization Accessibility; Organization Primary Instruction Cache (I-Cache); Organization Primary Data Cache (D-Cache); Accessing Primary Caches; Secondary Cache Controller; Organization; Interface Block Diagram; Secondary Cache Operations; Secondary Cache Mode Configuration; Secondary Cache Software Enable; Cache-Line States; Cache-Line Ownership; Cache Write Policy; Cache-State Transitions; Cache Coherency; Cache Coherency Attributes; Uncached Attribute; Noncoherent Attribute; Multiprocessor Synchronization Support; Test-and-Set; Counter; Load Linked Store Conditional
Chapter 9: Introduction; System Interface Signals; Clock Interface Signals; Secondary Cache Interface Signals; Interrupt Interface Signals; JTAG Interface Signals; Initialization Interface Signals
Chapter 10: Introduction; Terminology; Processor Requests; Rules Processor Requests; Processor Read Request; Processor Write Request; External Requests; External Read Request; External Write Request; Read Response; Secondary Cache Transactions; Secondary Cache Probe, Invalidate, Clear; Secondary Cache Write; Secondary Cache Read; Handling Requests; Load Miss; Store Miss; Store; Uncached Loads Stores; Uncached Instruction Fetch; Load Linked Store Conditional Operation; Branch-Target Alignment
Chapter 11: Introduction; Address Data Cycles; Issue Cycles; Handshake Signals; System Interface Operation; Master Slave States; External Arbitration; Uncompelled Change Slave State; Processor Request Protocols; Processor Read Request Protocol; Processor Write Request Protocol; Processor Request Flow Control; External Request Protocols; External Arbitration Protocol; External Read Request Protocol; External Null Request Protocol; External Write Request Protocol; Read Response Protocol; Secondary Cache Protocols; Secondary Cache Read Protocol; Secondary Cache Read; Secondary Cache Read Miss; Secondary Cache Read Miss with Error; Secondary Cache Write; Secondary Cache Line Invalidate; Secondary Cache Probe Protocol; Secondary Cache Block Clear Protocol; SysADC[7:0] Protocol; Data Rate Control; Data-Transfer Patterns; Independent Transmissions SysAD Bus; System Interface Endianness; System Interface Cycle Time; Release Latency; System Interface Commands/Data Identifiers; Command Data Identifier Syntax; System Interface Command Syntax; Read Requests; Write Requests; Null Requests; System Interface Data Identifier Syntax; Noncoherent Data; Data Identifier Definitions; System Interface Addresses; Addressing Conventions; Subblock Ordering; Valid Byte Lanes During Partial-Word Transfers; Processor Internal Address
Chapter 12: Introduction; Asserting Interrupts
Chapter 13: Introduction; Parity Generation Checking System Interface; Summary Parity Generation Checking
Chapter 14: Reset Signals; Power-on Reset; Cold Reset; Warm Reset; Processor Reset State; Initialization Sequence; Boot-Mode Settings; Driver Strength Control
Chapter 15: SysClock; PClock; Phase-Locked Loop (PLL); Analog Power Filtering; Standby Mode
Tables: Integer Pipeline; Example Integer (CPU) Instruction Latencies; System Control Coprocessor (CP0) Register Definitions; Some Integer-Instruction Latencies; Load Store Instructions; Byte Access Within Doubleword; Arithmetic Instructions (ALU Immediate); Arithmetic (3-Operand, R-Type); Shift Instructions; Multiply Divide Instructions; Multiply/Divide Instruction Latency Repeat Rates; Jump Branch Instructions; Special Instructions; Coprocessor Instructions; Instructions; Exception Instructions; Floating-Point Control Register Assignments; FCR0 Fields; Control/Status Register Fields; Rounding Mode Decoding; Calculating Values Single Double-Precision Formats; Floating-Point Format Parameter Values; Minimum Maximum Floating-Point Values; Binary Fixed-Point Format Fields; Instruction Summary: Load, Move Store Instructions; Instruction Summary: Conversion Instructions; Instruction Summary: Computational Instructions; Instruction Summary: Compare Branch Instructions; Mnemonics Compare-Instruction Conditions; Floating-Point Instruction Latencies; Relationship CPU-Pipeline Stage Interlock Condition; CPU-Pipeline Exceptions; CPU-Pipeline Interlocks; Exception Processing Registers; Context Register Fields; Status Register Fields; Status Register Diagnostic Status Bits; Cause Register Fields; Cause Register ExcCode Fields; XContext Register Fields; Register Fields; CacheErr Register Fields; Exception Vector Base Addresses; Exception Vector Offsets; Exception Priority Order; Default Exception Actions; Exception-Causing Conditions; 32-bit 64-bit User Mode Segments; 32-bit 64-bit Supervisor Mode Segments; 32-bit Kernel Mode Segments; 64-bit Kernel Mode Segments; Cacheability Coherency Attributes; Page Coherency Values; Index Register Field Descriptions; Random Register Field Descriptions; Mask Field Values Page Sizes; Wired Register Field Descriptions; PRId Register Fields; Config Register Fields; Cache Register Fields; Instructions; Cache States; Cache Write Policy; Coherency Attributes Processor Behavior; System Interface Signals; Clock Interface Signals; Secondary Cache Interface Signals; Interrupt Interface Signals; JTAG Interface Signals; Initialization Interface Signals; Action Taken Load Miss Primary Data Cache; Store Miss Primary Secondary Data Caches; System Interface Requests; Transmit Data Rates Patterns; Release Latency External Requests; Encoding SysCmd(7:5) System Interface Commands; Encoding SysCmd(4:3) Read Requests; Encoding SysCmd(2:0) Block Read Request; Read Request Data Size Encoding SysCmd(2:0); Write Request Encoding SysCmd(4:3); Block Write Request Encoding SysCmd(2:0); Write Request Data Size Encoding SysCmd(2:0); External Null Request Encoding SysCmd(4:3); Processor Data Identifier Encoding SysCmd(7:3); External Data Identifier Encoding SysCmd(7:3); Subblock Ordering Sequence: Address (three tables); Partial-Word Transfer Byte Lane Usage; Parity Generation Checking Operations; Boot Mode Settings; Boot Mode Bits Drive Strength
Figures: RV5000 Block Diagram; Integer (CPU) Pipeline; Dual-Issue Mechanism, Showing Pipelines; Typical RV5000 System Block Diagram; R5000 Registers; Registers; Big-Endian Byte Ordering; Little-Endian Byte Ordering; Little-Endian Data Doubleword; Big-Endian Data Doubleword; Big-Endian Misaligned Word Addressing; Little-Endian Misaligned Word Addressing; Instruction Formats; Functional Block Diagram; Registers; Implementation/Revision Register; Control/Status Register Assignments; Control/Status Register Cause, Flag, Enable Fields; Single-Precision Floating-Point Format; Double-Precision Floating-Point Format; Binary Fixed-Point Format; Instruction Pipeline Stages; Integer (CPU) Pipeline Activities; Dual-Issue Mechanism, Showing Pipelines; CPU-Pipeline Branch Delay; CPU-Pipeline Load Delay; CPU-Pipeline Exception Detection Mechanism; CPU-Pipeline Servicing Data Cache Miss; Slips During Instruction-Cache Miss; Context Register Format; BadVAddr Register Format; Count Register Format; Compare Register Format; Status Register; Status Register Field; Cause Register Format; Register Format; XContext Register Format; Register Format; CacheErr Register Format; ErrorEPC Register Format; Reset Exception Processing; Cache Error Exception Processing; Soft Reset Exception Processing; General Exception Processing; General Exception Handler (HW); General Exception Servicing Guidelines (SW); TLB/XTLB Miss Exception Handler (HW); TLB/XTLB Exception Servicing Guidelines (SW); Cache Error Exception Handling (HW) Servicing Guidelines; Reset, Soft Reset Exception Handling; Control/Status Register Exception/Flag/Trap/Enable Bits; Overview Virtual-to-Physical Address Translation; 32-bit Mode Virtual Address Translation; 64-bit Mode Virtual Address Translation; User Mode Virtual Address Space; Supervisor Mode Address Space; Kernel Mode Address Space; Registers; Format Entry; Fields PageMask EntryHi Registers; Fields EntryLo0 EntryLo1 Registers; Index Register; Random Register; Wired Register Boundary; Wired Register; Processor Revision Identifier Register Format; Config Register Format; LLAddr Register Format; TagLo TagHi Register (P-cache) Formats; TagLo TagHi Register (S-cache) Formats; Address Translation; Logical Hierarchy Memory; Primary I-Cache Line Format; 8-Word Primary Data-Cache Line Format; Primary Cache Data Organization; Secondary Cache Block Diagram; Miss Read-Followed-By-Write Cycles; Read Write Cycles; Data Block Diagram; Block Diagram; Primary Data Cache State Diagram; Synchronization with Test-and-Set; Synchronization Using Counter; Test-and-Set using; Counter Using; R5000 Processor Signals; Requests System Events; Processor Requests External Agent; Processor Request Flow Control; External Requests Processor (except Read Response); External Request Arbitration; External Agent Read Response Processor; Processor Requests Secondary Cache External Agent; Secondary Cache Invalidate Clear; Secondary Cache Probe; Secondary Cache Write Through; Secondary Cache Read; Secondary Cache Read Miss; State RdRdy* Signal Read Requests; State WrRdy* Signal Write Requests; System Interface Register-to-Register Operation; Symbol Undocumented Cycles; Processor Read Request Protocol; Processor Noncoherent Single Word Write Request Protocol; Processor Non-Coherent, Non-Secondary Cache Block Write Request; Processor Request Flow Control; Processor Write Requests with Second Write Delayed; R4000-Compatible Back-to-Back Write Cycle Timing; Write Reissue; Pipelined Writes; Arbitration Protocol External Requests; External Read Request, System Interface Master State; System Interface Release External Null Request; External Write Request, with System Interface Initially Master; Processor Word Read Request, followed Word Read Response; Block Read Response, System Interface already Slave State; Secondary Cache Read; Secondary Cache Read Miss; Secondary Cache Read Miss with Error; Secondary Cache Write Operation; Secondary Cache Line Invalidate; Secondary Cache Probe (Tag Read); Secondary Cache Block Clear; Read Response, Reduced Data Rate, System Interface Slave State; System Interface Command Syntax Definition; Read Request SysCmd Definition; Write Request SysCmd Definition; Null Request SysCmd Definition; Data Identifier SysCmd Definition; Retrieving Data Block Sequential Order; Retrieving Data Subblock Order; Interrupt Register Bits Enables; R5000 Interrupt Signals; R5000 Nonmaskable Interrupt Signal; Masking R5000 Interrupt; Power-On Reset Timing Diagram; Cold Reset Timing Diagram; Warm Reset Timing Diagram; SysClock Timing; Input Timing; Output Timing; Filter Circuit
The IDT79RV5000 microprocessor (RV5000) is a 64-bit dual-issue RISC processor that serves many performance-critical applications. The processor's Dhrystone MIPS performance and 4Gbytes/sec aggregate bandwidth make it ideal for embedded applications, such as high-end internetworking systems, color printers, and graphics terminals, as well as low-cost general computing with special emphasis on floating-point operations and memory management. For such applications, the RV5000 offers:
  - A high-performance upgrade path for existing embedded customers in the internetworking, office automation, and visualization markets.
  - Significant improvements in floating-point performance in a moderately priced chip.
  - An improved desktop-system memory hierarchy through implementation of large primary instruction and data caches (32KB each) and an on-chip secondary cache controller.
  - Improved performance through the MIPS-IV instruction-set architecture (ISA).
The RV5000 conforms to the MIPS-IV Instruction Set Architecture (ISA) and provides complete upward application-software compatibility with the IDT79R3xxx and IDT79R4xxx families of microprocessors. An array of operating systems and development tools facilitates rapid development of systems that support thousands of application programs. The RV5000 is a true 64-bit processor that is also fully compatible with 32-bit operating systems and applications. The RV5000 enables 32-bit applications to effortlessly access 64-bit compute power. In embedded applications, the power and bandwidth of 64-bit data types can be used without the memory expansion of 64-bit addressing.
The following figure is a block diagram of the RV5000's functional units.
Figure: RV5000 Block Diagram
Labeled blocks: ITLB, DTLB, Store Buffer, Write Buffer, Read Buffer, SysAD, IBus, DBus, Address Buffer, Instruction Register, Program Counter, Branch Adder, Incrementer, Integer Control, Integer Register File, Integer/Address Adder, Shifter/Store Aligner, Logic Unit, Load Aligner, Integer Divide, Floating-point Control, Floating-point Register File, Unpacker/Packer, Floating-point Add/Sub/Cvt/Div/Sqrt, Floating-point/Integer Multiply, Coprocessor, System/Memory Control, AuxTag, Phase Lock Loop, Clocks.
True 64-bit Microprocessor
  - 64-bit integer (CPU) operations
  - 64-bit floating-point (FPU) IEEE-754 operations
  - 64-bit registers
  - 64-bit virtual addresses
Execution Resources
  - Integer ALU, with bypassing
  - Integer multiply/divide unit
  - Floating-point ALU, pipelined to allow a single-cycle repeat rate for single-precision operations
  - Floating-point divide/square-root unit
  - Load unit
  - Store unit
  - Branch unit
Selectable Frequencies
  - Bus-to-pipeline frequency ratios
High Performance
  - Dhrystone MIPS
  - 4Gbytes/sec aggregate bandwidth
  - 200MHz pipeline clock frequency
Efficient Memory Hierarchy
  - 32KB two-way associative instruction cache
  - 32KB two-way associative data cache
  - On-chip secondary cache controller
  - 64Gbyte physical address space
  - 1 terabyte (2^40) maximum user process size
  - Flexible MMU with 48-entry TLB
  - 1.6Gbytes/sec cache bandwidth (each cache) at 200MHz pipeline frequency
Software Compatibility
  - MIPS 64-bit instruction set, including CP1X functional units
  - Software-compatible with the R3xxx and R4xxx families
Compatible with Multiple Operating Systems
  - C-executive
  - Windows®
  - Works
  - PSOS
Development Tools
  - Cross compilers
  - Logic models
  - Logic analyzer support
Low-Power Operation
  - 3.3V power supply
  - 25mW/MHz internal power dissipation (at 200MHz, 3.3V)
  - Active power management, including WAIT operation
272-pin SuperBGA Package
The RV5000 has a dual-issue, five-stage instruction pipeline that operates at a multiple of the frequency of the input system clock. The pipeline has two parallel paths, one for integer (CPU) instructions and the other for floating-point (FPU) instructions. Each stage of the CPU-instruction path takes one processor clock.

Dual-issue instruction pairing in a given clock can consist of one floating-point ALU instruction and one instruction of any other type. These instruction classes are pre-decoded as they are brought on-chip, and the pre-decoded information is stored in the instruction cache. If there are no pending resource conflicts, the RV5000 can issue one instruction per class per pipeline clock cycle. Long-latency operations, such as floating-point SQRT or integer (CPU) multiply, slow the issue of instructions. The RV5000 does not perform out-of-order or speculative execution; instead, the pipeline slips until the required resource becomes available. There are no alignment restrictions on dual-issue instruction pairs. However, since the RV5000 performs aligned fetches of two instructions per cycle from the instruction cache, compilers should attempt to align branch targets to allow dual-issue on the first target cycle, in order to optimize performance.
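The pairing rule and the branch-target alignment hint can be sketched in a few lines of Python. This is an illustration only, not IDT's issue logic: the `InstrClass` names are invented for the example, resource conflicts and long-latency slips are ignored, and the 8-byte fetch width is an assumption derived from two 4-byte instructions per aligned fetch.

```python
from enum import Enum, auto


class InstrClass(Enum):
    """Coarse instruction classes; the names are illustrative, not IDT's."""
    FP_ALU = auto()   # floating-point ALU instruction
    OTHER = auto()    # any other type (integer, load/store, branch, ...)


def can_dual_issue(first: InstrClass, second: InstrClass) -> bool:
    """True when the pair holds exactly one FP ALU instruction and one other.

    Resource conflicts and long-latency operations are not modelled.
    """
    return {first, second} == {InstrClass.FP_ALU, InstrClass.OTHER}


def is_fetch_aligned(address: int, fetch_bytes: int = 8) -> bool:
    """True when a branch target starts an aligned two-instruction fetch.

    fetch_bytes=8 is an assumption: two 4-byte instructions per fetch.
    """
    return address % fetch_bytes == 0
```

A branch target at an address such as 0x1004 falls in the second slot of an aligned fetch, so only one instruction is available in the first target cycle; this is the case the alignment advice tries to avoid.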
The RV5000 implements a traditional five-stage pipeline, as shown in the Integer (CPU) Pipeline figure and in Table 1.1:
Figure: Integer (CPU) Pipeline

Pipeline stages (two phases each):
  - Instruction Fetch, Phase 1 and Phase 2
  - Register Read, Phase 1 and Phase 2
  - Execution, Phase 1 and Phase 2
  - Data Load/Store, Phase 1 and Phase 2
  - Write Back, Phase 1 and Phase 2

Pipeline activities:
  - Instruction-cache access
  - Instruction virtual-to-physical address translation
  - Register-file read, bypass calculation, instruction decode, branch-address calculation
  - Issue or slip decision, data virtual-address calculation, branch decision
  - Integer add, logical, shift
  - Store align
  - Data-cache access and load align
  - Data virtual-to-physical address translation
  - Resolve exceptions
  - Register-file write

Table 1.1 Integer Pipeline
Typical integer (CPU) execution latencies are shown in Table 1.2. The RV5000's short pipeline keeps load and branch latencies low. The caches allow any combination of loads and stores to execute in back-to-back cycles without requiring pipeline slips or stalls, provided the operation hits in the cache.
Table 1.2 Example Integer (CPU) Instruction Latencies
Rows: Load; Store; MULT/MULTU; DMULT/DMULTU; DIV/DIVU; DDIV/DDIVU; Other Integer; Branch; Jump
The on-chip floating-point unit (FPU) coprocessor includes a 64-bit floating-point register file. It forms a seamless interface with the integer (CPU) pipeline, decoding and executing instructions in parallel with the CPU. It supports single- and double-precision arithmetic, as specified in IEEE Standard 754. It also supports fully precise floating-point exceptions while allowing both overlapped and pipelined operations. Precise exceptions are extremely important in mission-critical environments and highly desirable in a debugging environment. As described above, the two instructions issued in a given clock are one floating-point ALU instruction and one instruction of another type. The following figure shows a simplified diagram of this dual-issue mechanism.
Figure: Dual-Issue Mechanism, Showing Pipelines
Labeled elements: instruction cache, 2-deep instruction buffer, integer register-file read, integer execution, integer load/store, integer register-file write, and a parallel path with register-file read, execution, load/store, and register-file write stages.
The RV5000 provides three modes of operation: user mode, supervisor mode, and kernel mode. This mechanism is available to system software to provide a secure environment for user processes. Bits in a status register determine the mode of operation. When operating in kernel mode, four distinct virtual-address spaces totalling 1024Gbytes are simultaneously available, differentiated by the high-order bits of the virtual address. The RV5000 also supports a supervisor mode in which the virtual address space is 256Gbytes, divided into three regions based on the high-order bits of the virtual address. When the RV5000 uses 64-bit virtual addresses, the address space layouts are an upward-compatible extension of the 32-bit virtual address space layout.
For fast virtual-to-physical address decoding, the RV5000 uses a fully associative joint TLB, which maps virtual pages to their corresponding physical addresses. The TLB is organized as pairs of even-odd entries, and maps a virtual address and address space identifier into the 64Gbyte physical address space.

Two mechanisms assist in controlling the amount of mapped space and the replacement characteristics of various memory regions. First, the page size can be configured, on a per-entry basis, as 4Kbytes, 16Kbytes, 64Kbytes, 256Kbytes, 1Mbyte, 4Mbytes, or 16Mbytes. A register is loaded with the page size of a mapping, and that size is entered into the TLB when a new entry is written. Thus, operating systems can provide special-purpose maps; for example, a typical frame buffer can be memory mapped using only one TLB entry.

The second mechanism controls the replacement algorithm when a TLB miss occurs. The RV5000 implements a random replacement algorithm to select a TLB entry to be written with a new mapping. However, the processor provides a mechanism whereby a system-specific number of mappings can be locked into the TLB, and thus avoid being randomly replaced. This facilitates the design of real-time systems by allowing deterministic access to critical software.

The joint TLB also contains information to control the cache coherency protocol for each page. Specifically, each page has attribute bits that determine whether the coherency algorithm is uncached, non-coherent write-back, non-coherent write-through with write-allocate, non-coherent write-through without write-allocate, sharable, exclusive, or update. Non-coherent write-back is typically used for both code and data on the RV5000. The write-through modes support more efficient frame-buffer accesses than the R4000 family. The coherent modes are supported for R4000 compatibility and generate different transaction types on the system interface; however, cache coherency is not supported.
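The page sizes listed above form a geometric series, each four times the previous. As a rough illustration of how a paired-entry TLB might split a virtual address, the Python sketch below derives the sizes and separates an address into a pair number, an even/odd selector, and a page offset. The function name is hypothetical, and the assumption that the bit just above the page offset selects the even or odd page of a pair is not stated in this section.

```python
from typing import Tuple

# Page sizes named in the text: 4 Kbytes up to 16 Mbytes, each 4x the previous.
PAGE_SIZES = [4096 * 4**i for i in range(7)]


def split_virtual_address(vaddr: int, page_size: int) -> Tuple[int, int, int]:
    """Split vaddr into (pair number, odd-page flag, offset within the page).

    Assumption: the bit directly above the page offset selects the even (0)
    or odd (1) page of an even-odd entry pair.
    """
    if page_size not in PAGE_SIZES:
        raise ValueError(f"unsupported page size: {page_size}")
    offset_bits = page_size.bit_length() - 1   # log2 of the page size
    offset = vaddr & (page_size - 1)
    odd = (vaddr >> offset_bits) & 1
    pair = vaddr >> (offset_bits + 1)
    return pair, odd, offset
```

With 4Kbyte pages, `split_virtual_address(0x00403ABC, 4096)` yields pair `0x201`, the odd page, and offset `0xABC`. With the largest page size, a single even-odd pair spans 32Mbytes, which is consistent with the remark that a frame buffer can be mapped with one entry.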
The RV5000 incorporates on-chip instruction and data caches that can be accessed in a single processor cycle. The processor also includes an on-chip secondary cache controller for simple interfacing to a large, high-speed second-level cache SRAM. Both on-chip primary caches are 32KB in size, two-way associative, virtually indexed, and physically tagged. Because the caches are virtually indexed, the virtual-to-physical address translation occurs in parallel with the cache access. Each cache has its own 64-bit data path and can be accessed in parallel in each pipeline cycle. The cache subsystem provides the integer unit (CPU) and floating-point unit (FPU) with an aggregate bandwidth of 3.2Gbytes per second at a pipeline clock frequency of 200MHz.
instruction cache 64-bits wide. Cache lines eight instructions bytes). Instruction fetches bytes cycle, peak instruction-cache bandwidth 1.6Gbytes/sec 200MHz. instruction cache protected with word parity. holds 24-bit physical address valid bit, parity protected.
data cache 64-bits wide. Cache lines bytes. Data loads bytes cycle, peak data-cache bandwidth 1.6Gbytes/sec 200MHz addition 1.6Gbytes/sec instruction-cache bandwidth). data cache protected with byte parity protected with single parity bit. virtually indexed physically tagged allow simultaneous address translation data-cache accesses. normal write policy writeback. Software can, however, select writethrough per-page basis, such frame buffers. data cache associated store buffer. When RV5000 executes store instruction, this single-entry buffer gets written with store data while comparison performed. matches, data written into data cache next cycle that data cache accessed (the next non-load cycle). store buffer allows RV5000 execute store every processor cycle perform back-to-back stores without penalty.
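The bandwidth figures quoted for the caches follow directly from the data path width and the pipeline clock. A small sketch of that arithmetic (the function name is hypothetical):

```python
def peak_bandwidth_bytes_per_sec(path_width_bits, clock_hz):
    """Peak bandwidth of a data path that moves one full width per cycle."""
    return (path_width_bits // 8) * clock_hz

PIPELINE_CLOCK_HZ = 200_000_000  # 200MHz, as quoted in the text

icache = peak_bandwidth_bytes_per_sec(64, PIPELINE_CLOCK_HZ)  # instruction cache
dcache = peak_bandwidth_bytes_per_sec(64, PIPELINE_CLOCK_HZ)  # data cache
print(icache, dcache, icache + dcache)
```

Each 64-bit path gives 1.6Gbytes/sec, and the two paths together give the 3.2Gbytes/sec aggregate figure.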
Writes to external memory, whether cache-miss writebacks or stores to uncached or write-through addresses, use the on-chip write buffer. The write buffer holds four 64-bit address and data pairs, which is enough for one cache line being written back. The entire buffer is used for a data-cache writeback and allows the processor to proceed in parallel with the memory update.
The RV5000 uses the system interface clock as its input clock. The pipeline speed is derived from this clock by multiplying the input reference. It is assumed that the system designer manages the system clock distribution according to the needs of the system. Thus, the RV5000 does not output a system reference clock, but rather operates in synchronization with the input clock. The RV5000 outputs one low-frequency reference clock: Mode Clock. This clock operates at 1/256 the rate of the input clock, and is used to clock in the serial initialization stream during reset.
The RV5000 supports a 64-bit multiplexed system interface that is compatible with the R4xxx system interface. The interface consists of a 64-bit address/data bus with check bits and a 9-bit command bus. In addition, there are handshake signals and interrupt inputs. The interface has a simple timing specification and is capable of transferring data between the processor and memory at a peak rate of 800Mbytes/sec at 100MHz. The following figure shows a typical system using the RV5000.
Figure: Typical RV5000 System Block Diagram (blocks shown: RV5000, Memory Controller, Boot, DRAM (80ns), Cache (Optional), SCSI, ENET; signal groups: Address, Control)
In the MIPS instruction-set architecture, the central processing unit (CPU) executes integer and system instructions, and the floating-point unit (FPU) coprocessor executes floating-point instructions. This chapter describes only the CPU. Chapter 3 deals with the FPU.
The R5000 integer unit has thirty-two general purpose registers. These registers are used for scalar integer operations and address calculation. The register file consists of two read ports and one write port, and is fully bypassed to minimize operation latency in the pipeline. The following figure shows the R5000 CPU registers.
Figure: R5000 CPU Registers (General Purpose Registers; Multiply and Divide Registers; Program Counter)
Two of the general purpose registers have assigned functions:
r0 is hardwired to a value of zero, and can be used as the target register for any instruction whose result is to be discarded. r0 can also be used as a source when a zero value is needed.
r31 is used as the implicit return destination address register by the Jump and Link series of instructions.
The CPU has three special purpose registers:
PC  Program Counter register
HI  Multiply and Divide register, higher result
LO  Multiply and Divide register, lower result
The two Multiply and Divide registers (HI, LO) store the product of integer multiply operations, or the quotient and remainder of integer divide operations. The R5000 has no Program Status Word (PSW) register as such; this is covered by the Status and Cause registers incorporated within the System Control Coprocessor (CP0), described in the next section, below.
The MIPS instruction set architecture defines three coprocessors, designated CP0, CP1, and CP2. The R5000 implements the first two:
Coprocessor 0 (CP0) supports the virtual memory system and exception handling. It is also referred to as the System Control Coprocessor, and is described below.
Coprocessor 1 (CP1) implements the MIPS floating-point instruction set and is used as the FPU. The registers associated with CP1 are described in Chapter 3.
Coprocessor 2 (CP2) is reserved for future use.
The registers associated with CP0 are shown in the figure and described in Table 2.1. CP0 translates virtual addresses into physical addresses and manages exceptions and transitions between kernel, supervisor, and user states. It also controls the cache subsystem, controls power management, and provides diagnostic control and error recovery facilities. Accessing a reserved or undefined CP0 register gives undefined results. Power management is implemented with a standby mode, which reduces the power consumption of the core. Standby mode is entered by executing the WAIT instruction with the SysAD bus idle, and is exited by any interrupt.

Figure: CP0 Registers (grouped into memory management registers and exception processing registers)

Number  Register  Description
0       Index     Programmable pointer into TLB array
1       Random    Pseudo-random pointer into TLB array (read only)
2       EntryLo0  Low half of TLB entry for even virtual page (VPN)
3       EntryLo1  Low half of TLB entry for odd virtual page (VPN)
4       Context   Pointer to kernel virtual page table entry (PTE) in 32-bit address spaces
5       PageMask  TLB page mask
6       Wired     Number of wired TLB entries
7                 Reserved
8       BadVAddr  Bad virtual address
9       Count     Timer count
10      EntryHi   High half of TLB entry
11      Compare   Timer compare
12      Status    Status register
13      Cause     Cause of last exception
14      EPC       Exception program counter
15      PRId      Processor revision identifier
16      Config    Configuration register
17      LLAddr    Load linked address
18-19             Reserved
20      XContext  Pointer to kernel virtual PTE table in 64-bit address spaces
21-25             Reserved
26      ECC       Secondary-cache error checking and correcting (ECC) and primary parity
27      CacheErr  Cache error and status register
28      TagLo     Cache tag register
29      TagHi     Cache tag register
30      ErrorEPC  Error exception program counter
31                Reserved

Table 2.1 System Control Coprocessor (CP0) Register Definitions
The R5000 processor uses four data formats: a 64-bit doubleword, a 32-bit word, a 16-bit halfword, and an 8-bit byte. Byte ordering within the halfword, word, and doubleword data formats can be configured in either big-endian or little-endian order. Endianness refers to the location of byte 0 within a multi-byte data structure. The following figures show the ordering of bytes within words and the ordering of words within multiple-word structures for the big-endian and little-endian conventions. When the R5000 processor is configured as a big-endian system, byte 0 is the most-significant (left-most) byte, thereby providing compatibility with 68000 conventions. The following figure illustrates this configuration.
Figure: Big-Endian Byte Ordering (word addresses run from the lower address to the higher address)
When configured as a little-endian system, byte 0 is always the least-significant (right-most) byte, which is compatible with little-endian conventions. The following figure illustrates this configuration.
Figure: Little-Endian Byte Ordering (word addresses run from the lower address to the higher address)
In this text, bit 0 is always the least-significant (right-most) bit; thus, bit designations are always little-endian (although no instructions explicitly designate bit positions within words). The following figures show little-endian and big-endian byte ordering in doublewords.
Figure: Little-Endian Data in a Doubleword (labels: most-significant byte, least-significant byte, word, halfword, byte, bits in a byte)
Figure: Big-Endian Data in a Doubleword (labels: most-significant byte, least-significant byte, word, halfword, byte, bits in a byte)
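The two byte orderings can be illustrated with a short sketch. It only models how the bytes of one value are laid out at increasing byte addresses; the function name is hypothetical.

```python
def bytes_in_memory(value, size_bytes, big_endian):
    """Return the bytes of `value` in order of increasing byte address."""
    order = "big" if big_endian else "little"
    return list(value.to_bytes(size_bytes, order))

word = 0x11223344
# Big-endian: byte 0 (lowest address) is the most-significant byte.
print([hex(b) for b in bytes_in_memory(word, 4, big_endian=True)])
# Little-endian: byte 0 (lowest address) is the least-significant byte.
print([hex(b) for b in bytes_in_memory(word, 4, big_endian=False)])
```

The same function applies to halfwords (2 bytes) and doublewords (8 bytes).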
The CPU uses byte addressing for halfword, word, and doubleword accesses with the following alignment constraints:
Halfword accesses must be aligned on an even byte boundary (0, 2, 4...).
Word accesses must be aligned on a byte boundary divisible by four (0, 4, 8...).
Doubleword accesses must be aligned on a byte boundary divisible by eight (0, 8, 16...).
The following special instructions load and store words that are not aligned on 4-byte (word) or 8-byte (doubleword) boundaries: LWL, LWR, SWL, SWR, LDL, LDR, SDL, SDR.
These instructions are used in pairs to provide addressing of misaligned words. Addressing misaligned data incurs one additional instruction cycle over that required for addressing aligned data. This extra cycle is the result of the extra instruction in the "pair" (e.g., LWL and LWR form a pair). Also note that this moves the unaligned data at the same rate as a hardware mechanism. The following figures show the access of a misaligned word.
Figure: Big-Endian Misaligned Word Addressing (addresses run from the lower address to the higher address)
Figure: Little-Endian Misaligned Word Addressing (addresses run from the lower address to the higher address)
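The alignment constraints above amount to a simple divisibility check. The sketch below is illustrative only; the function names are hypothetical, and the instruction count reflects the text's statement that a misaligned word or doubleword is handled by a left/right instruction pair.

```python
ACCESS_SIZE_BYTES = {"halfword": 2, "word": 4, "doubleword": 8}

def is_aligned(address, access):
    """True if `address` satisfies the alignment constraint for `access`."""
    return address % ACCESS_SIZE_BYTES[access] == 0

def instructions_needed(address, access):
    """1 for an aligned access, 2 for a misaligned word/doubleword (left/right pair)."""
    if is_aligned(address, access):
        return 1
    if access in ("word", "doubleword"):
        return 2
    raise ValueError("this sketch has no left/right pair for misaligned halfwords")

print(is_aligned(0x1004, "word"), is_aligned(0x1003, "word"))
print(instructions_needed(0x1003, "word"))
```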
The R5000 executes the MIPS IV instruction set, which is a superset of the MIPS III instruction set and is backward-compatible with MIPS III. Each CPU instruction consists of a single 32-bit word, aligned on a word boundary. There are three instruction formats: immediate (I-type), jump (J-type), and register (R-type). The small number of instruction formats simplifies instruction decoding, allowing the compiler to synthesize more complicated (and less frequently used) operations and addressing modes from these three formats as needed. The following table gives an overview of R5000 CPU-instruction latencies. A summary of the MIPS IV instruction set additions is given in the remainder of this section, along with a brief explanation of each instruction. For more information on the MIPS IV instruction set, refer to the MIPS Microprocessor Family Software Manual.
Arithmetic Logical Shift Load Store Multiply (32-bit) Multiply (64-bit) Divide (32-bit) Divide (64-bit)
Table Some Integer-Instruction Latencies
The three instruction formats are shown in Figure 2.9.

I-Type (Immediate):  op | rs | rt | immediate
J-Type (Jump):       op | target
R-Type (Register):   op | rs | rt | rd | sa | funct

op         6-bit operation code
rs         5-bit source register specifier
rt         5-bit target (source/destination) register or branch condition
immediate  16-bit immediate value, branch displacement, or address displacement
target     26-bit jump target address
rd         5-bit destination register specifier
sa         5-bit shift amount
funct      6-bit function field

Figure 2.9 Instruction Formats
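The three formats can be unpacked from a 32-bit instruction word with shifts and masks. The sketch below assumes the conventional MIPS field names (op, rs, rt, rd, sa, funct) and assumes the fields are packed from the most-significant bit downward in the order the figure lists them; only the field widths come from the text.

```python
def decode_instruction(word):
    """Extract every field of a 32-bit instruction word.

    All three formats share `op`; which of the other fields are meaningful
    depends on the format the opcode selects.
    """
    word &= 0xFFFFFFFF
    return {
        "op": (word >> 26) & 0x3F,     # 6-bit operation code
        "rs": (word >> 21) & 0x1F,     # 5-bit source register
        "rt": (word >> 16) & 0x1F,     # 5-bit target register
        "rd": (word >> 11) & 0x1F,     # 5-bit destination register (R-type)
        "sa": (word >> 6) & 0x1F,      # 5-bit shift amount (R-type)
        "funct": word & 0x3F,          # 6-bit function field (R-type)
        "immediate": word & 0xFFFF,    # 16-bit immediate (I-type)
        "target": word & 0x03FFFFFF,   # 26-bit jump target (J-type)
    }

def signed_immediate(word):
    """Interpret the 16-bit immediate field as a signed value."""
    imm = word & 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

fields = decode_instruction(0x012A4020)   # an R-type word
print(fields["op"], fields["rs"], fields["rt"], fields["rd"], hex(fields["funct"]))
print(signed_immediate(0x27BDFFE0))       # an I-type word with a negative immediate
```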
The instruction set includes the following types of instructions:
Load and Store instructions move data between memory and the general registers. They are all immediate (I-type) instructions, since the only addressing mode supported is base register plus 16-bit, signed immediate offset.
Computational instructions perform arithmetic, logical, shift, multiply, and divide operations on values in registers. They include register (R-type, in which both the operands and the result are stored in registers) and immediate (I-type, in which one operand is a 16-bit immediate value) formats.
Jump and Branch instructions change the control flow of a program. Jumps are always made to a paged, absolute address formed by combining a 26-bit target address with the high-order bits of the Program Counter (J-type format) or to a register address (R-type format). Branches have 16-bit offsets relative to the program counter (I-type). Jump and Link instructions save their return address in register r31.
Coprocessor instructions perform operations in the coprocessors. Coprocessor load and store instructions are I-type.
Coprocessor 0 (system coprocessor) instructions perform operations on CP0 registers to control the memory management and exception handling facilities of the processor, and the standby mode for power management.
Special instructions perform system calls and breakpoint operations. These instructions are always R-type.
Exception instructions cause a branch to the general exception-handling vector based upon the result of a comparison. These instructions occur in both R-type (both the operands and the result are registers) and I-type (one operand is a 16-bit immediate value) formats.
Load and store are immediate (I-type) instructions that move data between memory and the general registers. The only addressing mode that load and store instructions directly support is base register plus 16-bit signed immediate offset. The following table lists the load and store instructions.
LB     Load Byte
LBU    Load Byte Unsigned
LD     Load Doubleword
LDL    Load Doubleword Left
LDR    Load Doubleword Right
LH     Load Halfword
LHU    Load Halfword Unsigned
LL     Load Linked
LLD    Load Linked Doubleword
LW     Load Word
LWL    Load Word Left
LWR    Load Word Right
LWU    Load Word Unsigned
PREF   Prefetch, Register + Offset (*)
PREFX  Prefetch Indexed, Register + Register (*)
SB     Store Byte
SC     Store Conditional
SCD    Store Conditional Doubleword
SD     Store Doubleword
SDL    Store Doubleword Left
SDR    Store Doubleword Right
SH     Store Halfword
SW     Store Word
SWL    Store Word Left
SWR    Store Word Right
SYNC   Sync

(*) Prefetch is not implemented in the R5000; these instructions are no-ops.

Table: Load and Store Instructions
A load instruction that does not allow its result to be used by the instruction immediately following is called a delayed load instruction. The instruction slot immediately following this delayed load instruction is referred to as the load delay slot. In the R5000, the instruction immediately following a load instruction can reference the contents of the loaded register; however, in such cases hardware interlocks insert additional real cycles. Consequently, scheduling load delay slots is desirable, both for performance and for R-Series processor compatibility. However, scheduling of load delay slots is not required.
Access type indicates the size of an R5000 data item to be loaded or stored. The access type is determined by the load/store instruction opcode. Regardless of access type or byte ordering (endianness), the address specifies the low-order byte in the addressed field. In a big-endian configuration, the low-order byte is the most-significant byte; in a little-endian configuration, the low-order byte is the least-significant byte. The access type, together with the three low-order bits of the address, defines the bytes accessed within the addressed doubleword (shown in Table 2.4). Only the combinations shown in Table 2.4 are permissible; other combinations cause address-error exceptions.
Doubleword Septibyte Sextibyte Quintibyte Word Triplebyte
Halfword
Byte
Table Byte Access Within Doubleword
Computational instructions can be either in register (R-type) format, in which both operands are registers, or in immediate (I-type) format, in which one operand is a 16-bit immediate. Computational instructions perform the following operations on register values: arithmetic, logical, shift, multiply, and divide. These operations fit into the following four categories of computational instructions: ALU Immediate instructions, three-operand Register-type instructions, shift instructions, and multiply and divide instructions. The following tables list the computational instructions.
ADDI    Add Immediate
ADDIU   Add Immediate Unsigned
ANDI    And Immediate
DADDI   Doubleword Add Immediate
DADDIU  Doubleword Add Immediate Unsigned
LUI     Load Upper Immediate
ORI     Or Immediate
SLTI    Set on Less Than Immediate
SLTIU   Set on Less Than Immediate Unsigned
XORI    Exclusive Or Immediate

Table: Arithmetic Instructions (ALU Immediate)
ADD    Add
ADDU   Add Unsigned
AND    And
DADD   Doubleword Add
DADDU  Doubleword Add Unsigned
DSUB   Doubleword Subtract
DSUBU  Doubleword Subtract Unsigned
NOR    Nor
OR     Or
SLT    Set on Less Than
SLTU   Set on Less Than Unsigned
SUB    Subtract
SUBU   Subtract Unsigned
XOR    Exclusive Or

Table: Arithmetic Instructions (3-Operand, R-Type)
DSLL    Doubleword Shift Left Logical
DSRL    Doubleword Shift Right Logical
DSRA    Doubleword Shift Right Arithmetic
DSLLV   Doubleword Shift Left Logical Variable
DSRLV   Doubleword Shift Right Logical Variable
DSRAV   Doubleword Shift Right Arithmetic Variable
DSLL32  Doubleword Shift Left Logical + 32
DSRL32  Doubleword Shift Right Logical + 32
DSRA32  Doubleword Shift Right Arithmetic + 32
SLL     Shift Left Logical
SLLV    Shift Left Logical Variable
SRA     Shift Right Arithmetic
SRAV    Shift Right Arithmetic Variable
SRL     Shift Right Logical
SRLV    Shift Right Logical Variable

Table: Shift Instructions
DIV     Divide
DIVU    Divide Unsigned
DMULT   Doubleword Multiply
DMULTU  Doubleword Multiply Unsigned
DDIV    Doubleword Divide
DDIVU   Doubleword Divide Unsigned
MFHI    Move From HI
MTHI    Move To HI
MFLO    Move From LO
MOVF    Move Conditional on Condition Code False
MOVN    Move on Register Not Equal to Zero
MOVT    Move Conditional on Condition Code True
MOVZ    Move on Register Equal to Zero
MTLO    Move To LO
MULT    Multiply
MULTU   Multiply Unsigned

Table: Multiply and Divide Instructions
When operating in 64-bit mode, 32-bit operands must be sign-extended. 32-bit operand opcodes include all non-doubleword operations, such as ADD, ADDU, SUB, SUBU, ADDI, SLL, SRA, SLLV, etc. The result of operations on incorrectly sign-extended 32-bit values is unpredictable.
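The sign-extension requirement can be made concrete with a short sketch. It models a 64-bit register as a Python integer; the function names are hypothetical.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def sign_extend_32(value):
    """Sign-extend the low 32 bits of `value` to a 64-bit register image."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:                 # bit 31 set: replicate it into bits 63:32
        value |= 0xFFFFFFFF00000000
    return value

def is_sign_extended_32(reg64):
    """True if a 64-bit register holds a correctly sign-extended 32-bit value."""
    return sign_extend_32(reg64) == (reg64 & MASK64)

print(hex(sign_extend_32(0x80000000)))
print(is_sign_extended_32(0x0000000080000000))
```

A register holding 0x0000000080000000 is not a valid 32-bit operand in 64-bit mode, because bits 63:32 do not match bit 31.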
The MFHI and MFLO instructions are interlocked so that any attempt to read them before prior instructions complete delays the execution of these instructions until the prior instructions finish. The following table gives the number of processor cycles (PCycles) required to resolve an interlock or stall between various multiply or divide instructions and a subsequent MFHI or MFLO instruction.
MULT/MULTU DIV/DIVU DMULT/DMULTU DDIV/DDIVU
Table Multiply/Divide Instruction Latency Repeat Rates
Jump and branch instructions change the control flow of a program. All jump and branch instructions occur with a delay of one instruction; that is, the instruction immediately following the jump or branch (the instruction in the delay slot) always executes while the target instruction is being fetched from storage. Subroutine calls in high-level languages are usually implemented with Jump or Jump and Link instructions, both of which are J-type instructions. In J-type format, the 26-bit target address shifts left 2 bits and combines with the high-order bits of the current program counter to form an absolute address. Returns, dispatches, and large cross-page jumps are usually implemented with the Jump Register or Jump and Link Register instructions. Both are R-type instructions that take the 32-bit or 64-bit byte address contained in one of the general purpose registers. Branch-instruction target addresses are computed by adding the address of the instruction in the delay slot to the 16-bit offset (shifted left 2 bits and sign-extended to 64 bits). All branches occur with a delay of one instruction. If a conditional branch likely is not taken, the instruction in the delay slot is nullified.
Table 2.10 lists the jump and branch instructions.
BCzFL    Branch on Coprocessor z False Likely
BCzTL    Branch on Coprocessor z True Likely
BEQ      Branch on Equal
BEQL     Branch on Equal Likely
BGEZ     Branch on Greater Than or Equal to Zero
BGEZAL   Branch on Greater Than or Equal to Zero And Link
BGEZALL  Branch on Greater Than or Equal to Zero And Link Likely
BGEZL    Branch on Greater Than or Equal to Zero Likely
BGTZ     Branch on Greater Than Zero
BGTZL    Branch on Greater Than Zero Likely
BLEZ     Branch on Less Than or Equal to Zero
BLEZL    Branch on Less Than or Equal to Zero Likely
BLTZ     Branch on Less Than Zero
BLTZL    Branch on Less Than Zero Likely
BLTZAL   Branch on Less Than Zero And Link
BLTZALL  Branch on Less Than Zero And Link Likely
BNE      Branch on Not Equal
BNEL     Branch on Not Equal Likely
J        Jump
JAL      Jump And Link
JALR     Jump And Link Register
JR       Jump Register

Table 2.10 Jump and Branch Instructions
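The target-address calculations described for J-type jumps and for branches can be sketched as below. This is an illustration under stated assumptions: addresses are modeled as 32 bits wide, so the "high-order bits" kept by a jump are the top 4 bits, and the jump combines with the address of the delay-slot instruction, as is conventional for MIPS. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def jump_target(delay_slot_addr, target26):
    """J-type: 26-bit target shifted left 2, merged with the high-order address bits."""
    return (delay_slot_addr & 0xF0000000) | ((target26 & 0x03FFFFFF) << 2)

def branch_target(delay_slot_addr, offset16):
    """I-type branch: delay-slot address plus the sign-extended offset shifted left 2."""
    offset = offset16 & 0xFFFF
    if offset & 0x8000:            # negative displacement
        offset -= 0x10000
    return (delay_slot_addr + (offset << 2)) & MASK32

print(hex(jump_target(0x80001004, 0x400)))
print(hex(branch_target(0x80001004, 0xFFFF)))
```

A jump can therefore only reach addresses within the same 256Mbyte region in this 32-bit model, which is why the text recommends Jump Register for large cross-page jumps.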
Special instructions allow software to initiate traps. They are always R-type. Table 2.11 lists the special instructions.

SYSCALL  System Call
BREAK    Break

Table 2.11 Special Instructions
Coprocessor instructions perform operations in their respective coprocessors. Coprocessor loads and stores are I-type, and coprocessor computational instructions have coprocessor-dependent formats. CP0 instructions perform operations specifically on the System Control Coprocessor registers to manipulate the memory management and exception handling facilities of the processor. Table 2.12 and Table 2.13 list the coprocessor instructions. Table 2.14 lists the instructions used for exception processing.
BCzT   Branch on Coprocessor z True
BCzF   Branch on Coprocessor z False
CFCz   Move Control From Coprocessor z
COPz   Coprocessor z Operation
CTCz   Move Control To Coprocessor z
DMFCz  Doubleword Move From Coprocessor z
DMTCz  Doubleword Move To Coprocessor z
LDCz   Load Double Coprocessor z
LWCz   Load Word Coprocessor z
MFCz   Move From Coprocessor z
MTCz   Move To Coprocessor z
SDCz   Store Double Coprocessor z
SWCz   Store Word from Coprocessor z

Table 2.12 Coprocessor Instructions
CACHE  Cache Operation
DCTR   Data Cache Read
DCTW   Data Cache Write
DMFC0  Doubleword Move From CP0
DMTC0  Doubleword Move To CP0
ERET   Exception Return
MFC0   Move from CP0
MTC0   Move to CP0
TLBP   Probe TLB for Matching Entry
TLBR   Read Indexed TLB Entry
TLBW   Write TLB Entry
TLBWI  Write Indexed TLB Entry
TLBWR  Write Random TLB Entry
WAIT   Enter Standby Mode

Table 2.13 CP0 Instructions
TEQ    Trap if Equal
TEQI   Trap if Equal Immediate
TGE    Trap if Greater Than or Equal
TGEI   Trap if Greater Than or Equal Immediate
TGEIU  Trap if Greater Than or Equal Immediate Unsigned
TGEU   Trap if Greater Than or Equal Unsigned
TLT    Trap if Less Than
TLTI   Trap if Less Than Immediate
TLTIU  Trap if Less Than Immediate Unsigned
TLTU   Trap if Less Than Unsigned
TNE    Trap if Not Equal
TNEI   Trap if Not Equal Immediate

Table 2.14 Exception Instructions
The following additions to the MIPS III instruction set are included in the MIPS IV instruction set.
PREF   Register + Offset Format
PREFX  Register + Register Format
The R5000 does not implement prefetch actions; these instructions are executed as no-ops, fully compatible with the MIPS IV instruction set architecture. In their normal implementation, rather than as no-ops, prefetch instructions allow the compiler to issue instructions early so that the corresponding data can be fetched and placed as close as possible to the CPU. Each instruction contains a 5-bit hint field, which gives the coherency status of the line being prefetched. The line can be either shared, exclusive clean, or exclusive dirty. The contents of the general register specified by base are added either to the sign-extended offset or to the contents of the general register specified by index to form a virtual address. This address, together with the hint field, is sent to the cache controller and a memory access is initiated. The region bits, 63:62, of the effective address must be supplied by base. If the addition alters these bits, an address exception occurs. The prefetch instruction never generates TLB-related exceptions. (The PREF instruction is considered a standard processor instruction, while the PREFX instruction is considered a standard Coprocessor 1 instruction.)
MOVT  Move Conditional on Condition Code True
MOVF  Move Conditional on Condition Code False
MOVN  Move Conditional on Register Not Equal to Zero
MOVZ  Move Conditional on Register Equal to Zero
The four integer conditional move instructions are used to test a condition code or a general register and then conditionally perform an integer move. The value of the floating-point condition code specified in the instruction by the 3-bit condition code specifier, or the value of the register indicated by the 5-bit general register specifier, is compared to zero. If the result indicates that the move should be performed, the contents of the specified source register are copied into the specified destination register.
The R5000 floating-point unit (FPU) operational functions consist of an adder, a multiplier, and a divider. The FPU, with associated system software, conforms fully to ANSI/IEEE Standard 754-1985, IEEE Standard for Binary Floating-Point Arithmetic. In addition, the MIPS architecture fully supports the recommendations of the standard and precise exceptions. The FPU operates as a coprocessor for the CPU (it is assigned coprocessor label CP1), and extends the CPU instruction set to perform arithmetic operations on floating-point values. The FPU has the following basic features:
32-bit and 64-bit Operation. The CPU Status register controls the selection of 32-bit or 64-bit mode. Each register can hold single- or double-precision values.
Load and Store Instruction Set. Like the CPU, the FPU uses a load- and store-oriented instruction set, with single-cycle load and store operations.
Tightly Coupled Coprocessor Interface. The FPU resides on-chip to form a tightly coupled unit with a seamless integration of floating-point and fixed-point instruction sets. Since each unit receives and executes instructions in parallel, some floating-point instructions can execute at the same single-cycle-per-instruction rate as fixed-point instructions.
The following figure illustrates the functional organization of the FPU.
Figure: FPU Functional Block Diagram (blocks shown: Data Cache, Floating-Point Control, Control, Bypass, Pipeline Chain, Add/Sub/Cvt, Div/Sqrt, Register File)
The FPU has a set of Floating-Point General Registers (FGRs) that can be accessed in the following ways:
As 32 general purpose registers (32 FGRs), each of which is 32 bits wide, when the FR bit in the CPU Status register equals 0; or as 32 general purpose registers (32 FGRs), each of which is 64 bits wide, when FR equals 1. The CPU accesses these registers through move, load, and store instructions.
As 16 floating-point registers (see the next section for a description of FPRs), each of which is 64 bits wide, when the FR bit in the CPU Status register equals 0. The FPRs hold values in either single- or double-precision floating-point format. Each FPR corresponds to adjacently numbered FGRs, as shown in Figure 3.2.
As 32 floating-point registers (see the next section for a description of FPRs), each of which is 64 bits wide, when the FR bit in the CPU Status register equals 1. The FPRs hold values in either single- or double-precision floating-point format. Each FPR corresponds to an FGR, as shown in Figure 3.2.
Figure 3.2 FPU Registers. In one panel, the even-numbered Floating-Point Registers (FPR0, FPR2, ... FPR28, FPR30) are each formed from a (least, most) pair of adjacent Floating-Point General Registers (FGR0/FGR1, FGR2/FGR3, ... FGR28/FGR29, FGR30/FGR31). In the other panel, FPR0 through FPR31 correspond one-to-one to the FGRs. The Floating-Point Control Registers (FCR) shown are the Control/Status register (FCR31) and the Implementation/Revision register (FCR0).
The FPU provides the FPRs shown in Figure 3.2. These 64-bit registers hold floating-point values during floating-point operations and are physically formed from the Floating-Point General Registers (FGRs). When the FR bit equals 1, each FPR references a single 64-bit FGR and all FPR numbers are valid. When FR equals 0, only even numbers (the least register, as shown in Figure 3.2) can be used to address FPRs. When FR equals 0 during a double-precision floating-point operation, the general registers are accessed in double pairs. Thus, in a double-precision operation, selecting Floating-Point Register 0 (FPR0) actually addresses the adjacent Floating-Point General Purpose registers FGR0 and FGR1.
The following table lists the floating-point control registers (FCRs). These can only be accessed by Move operations. The FCRs include:

FCR0           Implementation/Revision register: holds revision information about the FPU.
FCR1 to FCR30  Reserved
FCR31          Control/Status register: controls and monitors exceptions, holds the result of compare operations, and establishes rounding modes.

Table: Floating-Point Control Register Assignments
The read-only Implementation and Revision register (FCR0) specifies the implementation and revision number of the FPU. This information can determine the coprocessor revision and performance level, and can also be used by diagnostic software. The figure shows the layout of the register, and the table describes the register fields.

Figure: Implementation/Revision Register (FCR0)

Field  Description
Imp    Implementation number (0x23)
Rev    Revision number in the form y.x
0      Reserved. Must be written as zeroes, and returns zeroes when read.

Table: FCR0 Fields
The revision number is a value of the form y.x, where y is a major revision number held in bits 7:4, and x is a minor revision number held in bits 3:0. The revision number distinguishes some chip revisions; however, MIPS does not guarantee that changes to its chips are necessarily reflected by the revision number, or that changes to the revision number necessarily reflect real chip changes. For this reason, revision number values are not listed, and software should not rely on the revision number to characterize the chip.
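Extracting the y.x revision number from a value read from FCR0 is a matter of two masks. A minimal sketch, using only the bit positions given in the text (the function name and the sample value are hypothetical):

```python
def decode_fcr0_revision(fcr0):
    """Return (major, minor) from bits 7:4 and 3:0 of an FCR0 value."""
    major = (fcr0 >> 4) & 0xF   # y: bits 7:4
    minor = fcr0 & 0xF          # x: bits 3:0
    return major, minor

major, minor = decode_fcr0_revision(0x2310)   # sample value for illustration only
print(f"{major}.{minor}")
```

As the text cautions, the decoded value is informational and should not be used to characterize the chip.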
The Control/Status register (FCR31) contains control and status information that can be accessed by instructions in either Kernel or User mode. FCR31 also controls the arithmetic rounding mode and enables User-mode traps, as well as identifying any exceptions that may have occurred in the most recently executed instruction, along with any exceptions that may have occurred without being trapped. The figures show the format of the Control/Status register and its Cause, Flag, and Enable fields, and the table describes the register fields.

Figure: FP Control/Status Register (FCR31) Bit Assignments. Fields: FS, Condition (C), Cause (E V Z O U I), Enables (V Z O U I), Flags (V Z O U I), and RM.
Legend: E = Unimplemented Operation, V = Invalid Operation, Z = Division by Zero, O = Overflow, U = Underflow, I = Inexact Operation.

Field    Description
FS       When set, denormalized results are flushed to 0 instead of causing an unimplemented operation exception.
C        Condition bit. See the description of the Control/Status register Condition bit.
Cause    Cause bits. See the description of the Control/Status register Cause, Flag, and Enable bits.
Enables  Enable bits. See the description of the Control/Status register Cause, Flag, and Enable bits.
Flags    Flag bits. See the description of the Control/Status register Cause, Flag, and Enable bits.
RM       Rounding mode bits. See the description of the Control/Status register Rounding Mode Control bits.

Table: Control/Status Register Fields
When the Control/Status register is read by a Move Control From Coprocessor 1 (CFC1) instruction, all unfinished instructions in the pipeline are completed before the contents of the register are moved to the main processor. If a floating-point exception occurs as the pipeline empties, the exception is taken and the CFC1 instruction is re-executed after the exception is serviced. The bits in the Control/Status register can be set or cleared by writing to the register using a Move Control To Coprocessor 1 (CTC1) instruction. FCR31 must only be written to when the FPU is not actively executing floating-point operations; this can be ensured by reading the contents of the register to empty the pipeline.
The IEEE Standard 754 specifies that floating-point operations detect certain exceptional cases, raise flags, and can invoke an exception handler when an exception occurs. These features are implemented in the MIPS architecture with the Cause, Enable, and Flag fields of the Control/Status register. The Flag bits implement IEEE-754 exception-status flags, and the Cause and Enable bits implement exception handling.
When the FS bit is set, denormalized results are flushed to 0 instead of causing an unimplemented operation exception.
When a floating-point Compare operation takes place, the result is stored in the Condition bit, to save or restore the state of the condition line. The bit is set if the condition is true; it is cleared if the condition is false. The Condition bit is affected only by Compare and Move Control To FPU instructions.
The following figure illustrates the Cause, Flag, and Enable fields of the Control/Status register.
Figure: Control/Status Register Cause, Flag, and Enable Fields (Cause, Enable, and Flag bits for Inexact Operation, Underflow, Overflow, Division by Zero, and Invalid Operation; the Cause field also has an Unimplemented Operation bit)
Bits 17:12 in the Control/Status register contain Cause bits, as shown in Figure 3.5. The Cause bits reflect the results of the most recently executed instruction. The bits are a logical extension of the CP0 Cause register; they identify the exceptions raised by the last floating-point operation and raise an interrupt or exception if the corresponding enable bit is set. If more than one exception occurs on a single instruction, each appropriate bit is set. The Cause bits are written by each floating-point operation (but not by Load, Store, or Move operations). The Unimplemented Operation (E) bit is set to 1 if software emulation is required; otherwise it remains 0. The other bits are set to 1 or cleared to 0 to indicate the occurrence or non-occurrence (respectively) of an IEEE-754 exception. When a floating-point exception is taken, no results are stored, and the only state affected is the Cause bit.
A floating-point exception is generated any time a Cause bit and the corresponding Enable bit are set. A floating-point operation that sets an enabled Cause bit forces an immediate exception, as does setting both Cause and Enable bits with CTC1. There is no enable for Unimplemented Operation (E). Setting Unimplemented Operation always generates a floating-point exception. Before returning from a floating-point exception, software must first clear the enabled Cause bits with a CTC1 instruction to prevent a repeat of the interrupt. Thus, User-mode programs can never observe enabled Cause bits set; if this information is required in a User-mode handler, it must be passed somewhere other than the Status register.
For a floating-point operation that sets only unenabled Cause bits, no exception occurs and the default result defined by IEEE-754 is stored. In this case, the exceptions that were caused by the immediately previous floating-point operation can be determined by reading the Cause field.
When an exception case is detected and the exception Enable bit is not set, the corresponding flag bit is set. If an exception is taken, none of the flag bits are modified. Note, however, that system software may set the flag bits before invoking a user exception handler. The Flag bits are cumulative and indicate that an exception was raised by an operation that was executed since they were explicitly reset. Flag bits are set to 1 if an IEEE-754 exception is raised; otherwise they remain unchanged. The Flag bits are never cleared as a side effect of floating-point operations; however, they can be set or cleared by writing a new value into the Status register, using a Move To Coprocessor Control instruction.
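The interaction of the Cause, Enable, and Flag bits can be sketched as a small model. The Cause field position (bits 17:12) comes from the text; the Enable and Flag field positions (bits 11:7 and 6:2) are an assumption based on the conventional MIPS FCR31 layout, and all names are hypothetical. The model only updates the three fields; it does not compute arithmetic results.

```python
CAUSE_SHIFT = 12    # Cause bits 17:12 (from the text)
ENABLE_SHIFT = 7    # assumption: Enable bits 11:7
FLAG_SHIFT = 2      # assumption: Flag bits 6:2

# Condition masks in field order E V Z O U I (E exists only in the Cause field).
E, V, Z, O, U, I = 0x20, 0x10, 0x08, 0x04, 0x02, 0x01

def apply_fp_operation(fcr31, raised):
    """Model one FP operation that raises the conditions in `raised`.

    Returns (new_fcr31, exception_taken).
    """
    enables = (fcr31 >> ENABLE_SHIFT) & 0x1F
    # Cause bits are rewritten by every floating-point operation.
    fcr31 = (fcr31 & ~(0x3F << CAUSE_SHIFT)) | ((raised & 0x3F) << CAUSE_SHIFT)
    # E always traps; the others trap only when the matching Enable bit is set.
    taken = bool(raised & E) or bool(raised & 0x1F & enables)
    if not taken:
        # No trap: accumulate the sticky Flag bits.
        fcr31 |= (raised & 0x1F) << FLAG_SHIFT
    return fcr31, taken

fcr31 = O << ENABLE_SHIFT                       # enable the Overflow trap only
print(apply_fp_operation(fcr31, I))             # inexact, not enabled: flag is set
print(apply_fp_operation(fcr31, O | I))         # overflow enabled: exception taken
```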
Bits 1:0 in the Control/Status register constitute the Rounding Mode (RM) field. As shown in Table 3.4, these bits specify the rounding mode that the FPU uses for all floating-point operations.

RM(1:0)  Mnemonic  Description
0        RN        Round result to the nearest representable value; round to the value with least-significant bit 0 when the two nearest representable values are equally near.
1        RZ        Round toward 0: round to the value closest to and not greater in magnitude than the infinitely precise result.
2        RP        Round toward +infinity: round to the value closest to and not less than the infinitely precise result.
3        RM        Round toward -infinity: round to the value closest to and not greater than the infinitely precise result.

Table 3.4 Rounding Mode Bit Decoding
The FPU supports both floating-point and fixed-point data formats. The floating-point formats are single-precision binary and double-precision binary. The fixed-point formats are 32-bit and 64-bit binary.
The FPU performs both 32-bit (single-precision) and 64-bit (double-precision) IEEE standard floating-point operations. The 32-bit single-precision format has a 24-bit signed-magnitude fraction field (f+s) and an 8-bit exponent (e), as shown in Figure 3.6.
Figure 3.6 Single-Precision Floating-Point Format (fields, from most-significant bit: Sign, Exponent, Fraction)
The 64-bit double-precision format has a 53-bit signed-magnitude fraction field (f+s) and an 11-bit exponent, as shown in Figure 3.7.
Figure 3.7 Double-Precision Floating-Point Format (fields, from most-significant bit: Sign, Exponent, Fraction)
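As shown in the above figures, numbers in floating-point format are composed of three fields: a sign field, $s$; a biased exponent, $e = E + \text{bias}$; and a fraction, $f = .b_1 b_2 \ldots b_{p-1}$. The range of the unbiased exponent $E$ includes every integer between the two values $E_{min}$ and $E_{max}$ inclusive, together with two other reserved values: $E_{min} - 1$ (to encode $\pm 0$ and denormalized numbers) and $E_{max} + 1$ (to encode $\pm\infty$ and NaNs [Not a Number]). For single- and double-precision formats, each representable non-zero numerical value has just one encoding. For single- and double-precision formats, the value of a number, $v$, is determined by the equations shown in Table 3.5.

(1) if $E = E_{max} + 1$ and $f \neq 0$, then $v$ is NaN, regardless of $s$
(2) if $E = E_{max} + 1$ and $f = 0$, then $v = (-1)^s \infty$
(3) if $E_{min} \le E \le E_{max}$, then $v = (-1)^s \, 2^E \, (1.f)$
(4) if $E = E_{min} - 1$ and $f \neq 0$, then $v = (-1)^s \, 2^{E_{min}} \, (0.f)$
(5) if $E = E_{min} - 1$ and $f = 0$, then $v = (-1)^s \, 0$

Table 3.5 Calculating Values in Single and Double-Precision Formats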
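The value equations can be checked with a short decoder that applies them case by case to a raw bit pattern. This is a sketch for illustration; the function name and parameter names are hypothetical, and the defaults describe the single-precision format (8-bit exponent, 23 stored fraction bits).

```python
import math
import struct

def decode_float_bits(bits, exp_bits=8, frac_bits=23):
    """Compute the value v of a raw floating-point bit pattern from s, E and f."""
    bias = (1 << (exp_bits - 1)) - 1
    emax, emin = bias, 1 - bias
    s = (bits >> (exp_bits + frac_bits)) & 1
    e = (bits >> frac_bits) & ((1 << exp_bits) - 1)
    f = bits & ((1 << frac_bits) - 1)
    E = e - bias                        # unbiased exponent
    sign = -1.0 if s else 1.0
    frac = f / (1 << frac_bits)         # the value 0.f
    if E == emax + 1:                   # reserved: NaN or infinity
        return math.nan if f else sign * math.inf
    if E == emin - 1:                   # reserved: zero or denormalized
        return sign * 2.0 ** emin * frac
    return sign * 2.0 ** E * (1.0 + frac)   # normalized number

# Cross-check against the platform's own single-precision decoding.
raw = 0x3FC00000
print(decode_float_bits(raw), struct.unpack(">f", raw.to_bytes(4, "big"))[0])
```

Passing `exp_bits=11, frac_bits=52` applies the same equations to the double-precision format.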
For all floating-point formats, if v is NaN, the most-significant bit of f determines whether the value is a signaling or quiet NaN: v is a signaling NaN if the most-significant bit of f is set; otherwise, v is a quiet NaN. Table 3.6 defines the values for the format parameters; minimum and maximum floating-point values are given in Table 3.7.

Parameter                   Single  Double
Emax                        +127    +1023
Emin                        -126    -1022
Exponent bias               +127    +1023
Exponent width in bits      8       11
Integer bit                 hidden  hidden
f (Fraction width in bits)  24      53
Format width in bits        32      64

Table 3.6 Floating-Point Format Parameter Values
Float Minimum        1.40129846e-45
Float Minimum Norm   1.17549435e-38
Float Maximum        3.40282347e+38
Double Minimum       4.9406564584124654e-324
Double Minimum Norm  2.2250738585072014e-308
Double Maximum       1.7976931348623157e+308

Table 3.7 Minimum and Maximum Floating-Point Values
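The minimum and maximum values follow from the format parameters. The sketch below derives them with exact rational arithmetic; the function name is hypothetical, and `stored_frac_bits` is the number of fraction bits actually stored (23 for single precision, 52 for double precision, one fewer than the fraction width because the integer bit is hidden).

```python
from fractions import Fraction

def format_limits(emin, emax, stored_frac_bits):
    """Return (smallest denormalized, smallest normalized, largest) values."""
    minimum = Fraction(2) ** (emin - stored_frac_bits)       # 2^Emin * 0.00...01
    minimum_norm = Fraction(2) ** emin                        # 2^Emin * 1.0
    maximum = (2 - Fraction(1, 2 ** stored_frac_bits)) * Fraction(2) ** emax  # 2^Emax * 1.11...1
    return minimum, minimum_norm, maximum

for name, params in (("Float", (-126, 127, 23)), ("Double", (-1022, 1023, 52))):
    lo, lo_norm, hi = format_limits(*params)
    print(name, float(lo), float(lo_norm), float(hi))
```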
Binary fixed-point values are held in two's complement format. Unsigned fixed-point values are not directly provided by the floating-point instruction set. The figure illustrates the binary fixed-point format; the table lists the binary fixed-point format fields.

Figure: Binary Fixed-Point Format (fields, from most-significant bit: Sign, Integer)

Field    Description
sign     sign bit
integer  integer value

Table: Binary Fixed-Point Format Fields
All FPU instructions are 32 bits long, aligned on a word boundary. They can be divided into the following groups:
Load, Store, and Move instructions move data between memory, the main processor, and the FPU General Purpose registers.
Conversion instructions perform conversion operations between the various data formats.
Computational instructions perform arithmetic operations on floating-point values in the FPU registers.
Compare instructions perform comparisons of the contents of registers and set a conditional bit based on the results.
Branch on FPU Condition instructions perform a branch to the specified target if the specified coprocessor condition is met.
In the instruction formats shown in Table 3.9 through Table 3.12, the fmt appended to the instruction opcode specifies the data format: S specifies single-precision binary floating-point, D specifies double-precision binary floating-point, W specifies 32-bit binary fixed-point, and L specifies 64-bit (long) binary fixed-point.
OpCode    Description
LWC1      Load Word to FPU
LWXC1     Load Word Indexed to FPU
SWC1      Store Word from FPU
SWXC1     Store Word Indexed from FPU
LDC1      Load Doubleword to FPU
LDXC1     Load Doubleword Indexed to FPU
SDC1      Store Doubleword From FPU
SDXC1     Store Doubleword Indexed From FPU
MTC1      Move Word To FPU
MFC1      Move Word From FPU
CTC1      Move Control Word To FPU
CFC1      Move Control Word From FPU
DMTC1     Doubleword Move To FPU
DMFC1     Doubleword Move From FPU
PREF      Prefetch Register + Offset
PREFX     Prefetch Indexed Register + Register
Table 3.9 FPU Instruction Summary: Load, Move and Store Instructions
OpCode         Description
CVT.S.fmt      Floating-Point Convert to Single
CVT.D.fmt      Floating-Point Convert to Double
CVT.W.fmt      Floating-Point Convert to 32-bit Fixed Point
CVT.L.fmt      Floating-Point Convert to 64-bit Fixed Point
ROUND.W.fmt    Floating-Point Round to 32-bit Fixed Point
ROUND.L.fmt    Floating-Point Round to 64-bit Fixed Point
TRUNC.W.fmt    Floating-Point Truncate to 32-bit Fixed Point
TRUNC.L.fmt    Floating-Point Truncate to 64-bit Fixed Point
CEIL.W.fmt     Floating-Point Ceiling to 32-bit Fixed Point
CEIL.L.fmt     Floating-Point Ceiling to 64-bit Fixed Point
FLOOR.W.fmt    Floating-Point Floor to 32-bit Fixed Point
FLOOR.L.fmt    Floating-Point Floor to 64-bit Fixed Point
Table 3.10 FPU Instruction Summary: Conversion Instructions
OpCode      Description
ADD.fmt     Floating-Point Add
SUB.fmt     Floating-Point Subtract
MUL.fmt     Floating-Point Multiply
DIV.fmt     Floating-Point Divide
ABS.fmt     Floating-Point Absolute Value
MOV.fmt     Floating-Point Move
NEG.fmt     Floating-Point Negate
SQRT.fmt    Floating-Point Square Root
RECIP       Floating-Point Reciprocal
RSQRT       Floating-Point Reciprocal Square Root
Table 3.11 FPU Instruction Summary: Computational Instructions
OpCode        Description
C.cond.fmt    Floating-Point Compare
BC1T          Branch on FPU True
BC1F          Branch on FPU False
BC1TL         Branch on FPU True Likely
BC1FL         Branch on FPU False Likely
Table 3.12 FPU Instruction Summary: Compare and Branch Instructions
This section discusses the manner in which the FPU uses the load, store, and move instructions listed in Table 3.9.
All data movement between the FPU and memory is accomplished using the following instructions:
Load Word To Coprocessor 1 (LWC1) and Store Word From Coprocessor 1 (SWC1) instructions, which reference a single 32-bit word of the FPU general registers.
Load Doubleword (LDC1) and Store Doubleword (SDC1) instructions, which reference a 64-bit doubleword.
These load and store operations are unformatted; no format conversions are performed, and therefore no floating-point exceptions can occur due to these operations.
Data can also be moved directly between the FPU and the CPU using the following instructions:
Move To Coprocessor 1 (MTC1).
Move From Coprocessor 1 (MFC1).
Doubleword Move To Coprocessor 1 (DMTC1).
Doubleword Move From Coprocessor 1 (DMFC1).
Like the floating-point load and store operations, these operations perform no format conversions and never cause floating-point exceptions.
The instruction immediately following a load can use the contents of the loaded register. In such cases the hardware interlocks, requiring additional real cycles; for this reason, scheduling load-delay slots is desirable, although not required.
All coprocessor loads and stores reference the following aligned data items:
For word loads and stores, the access type is always WORD, and the low-order 2 bits of the address must always be 0.
For doubleword loads and stores, the access type is always DOUBLEWORD, and the low-order 3 bits of the address must always be 0.
Regardless of the byte-numbering order (endianness) of the data, the address specifies the byte that has the smallest byte address in the addressed field. For a big-endian system, it is the leftmost byte; for a little-endian system, it is the rightmost byte.
Conversion instructions perform conversions between the various data formats, such as single- or double-precision and fixed- or floating-point formats.
Computational instructions perform arithmetic operations on floating-point values in registers. There are two categories of computational instructions:
3-operand register-type instructions, which perform floating-point addition, subtraction, multiplication, and division.
2-operand register-type instructions, which perform floating-point absolute value, move, negate, and square-root operations.
For a detailed description of each instruction, refer to the MIPS Microprocessor Family Software Manual.
Branch on FPU (coprocessor unit 1) condition instructions test the result of the FPU compare (C.cond) instructions. For a detailed description of each instruction, refer to the MIPS Microprocessor Family Software Manual.
The floating-point compare (C.fmt.cond) instructions interpret the contents of two FPU registers (fs, ft) in the specified format (fmt) and arithmetically compare them. A result is determined based on the comparison and the conditions (cond) specified in the instruction. Table 3.13 lists the mnemonics for the compare-instruction conditions.
True Ordered Equal Ordered Less Than Greater Than Unordered Greater Than Equal Ordered Greater Than Unordered Greater Than Ordered Greater Than Signaling True Greater Than, Less Than Equal Signaling Equal Greater Than Less Than Less Than Greater Than Equal Less Than Equal Greater Than
NGLE
False Unordered Equal Unordered Equal Ordered Less Than Unordered Less Than Ordered Less Than Equal Unordered Less Than Equal Signaling False Greater Than Less Than Equal Signaling Equal Greater Than Less Than Less Than Greater Than Equal Less Than Equal Greater Than
Table 3.13 Mnemonics Compare-Instruction Conditions
The following additions to the MIPS III instruction set are included in the MIPS IV instruction set.
LWXC1 Load word indexed to Coprocessor 1
LDXC1 Load doubleword indexed to Coprocessor 1
The Indexed Floating-Point Load instructions transfer floating-point data types from memory to the floating-point registers using a register + register addressing mode. The contents of the general register specified by base are added to the contents of the general register specified by index to form a virtual address. The contents of the word or doubleword specified by the effective address are loaded into the floating-point register specified in the instruction. There are no indexed loads to general registers. The region bits (63:62) of the effective address must be supplied by the base. If the addition alters these bits, an address exception occurs. Also, if the address is not aligned, an address exception occurs.
SWXC1 Store word indexed from Coprocessor 1
SDXC1 Store doubleword indexed from Coprocessor 1
The Indexed Floating-Point Store instructions transfer floating-point data types from the floating-point registers to memory using a register + register addressing mode. The contents of the general register specified by base are added to the contents of the general register specified by index to form a virtual address. The contents of the floating-point register specified in the instruction are stored to the memory location specified by the effective address. The region bits (63:62) of the effective address must be supplied by the base. If the addition alters these bits, an address exception occurs. Also, if the address is not aligned, an address exception occurs.
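The address formation and the two fault conditions described for the indexed loads and stores can be sketched as below. The names `indexed_address` and `AddressError` are hypothetical and only model the rules stated in the text; they are not part of any MIPS tool or API.

```python
MASK64 = (1 << 64) - 1

class AddressError(Exception):
    """Stands in for the address exception described in the text."""

def indexed_address(base, index, size):
    """Form base + index and apply the region-bit and alignment checks.

    size is the access size in bytes: 4 for LWXC1/SWXC1, 8 for LDXC1/SDXC1.
    """
    base &= MASK64
    ea = (base + index) & MASK64           # 64-bit wrap-around addition
    if (ea >> 62) != (base >> 62):         # region bits 63:62 must come from base
        raise AddressError("index altered region bits 63:62")
    if ea & (size - 1):                    # low-order 2 (word) or 3 (doubleword) bits
        raise AddressError("effective address not aligned")
    return ea
```

For example, `indexed_address(0x1000, 0x8, 8)` yields `0x1008`, while an index of `0x4` with a doubleword access raises the modeled exception because the result is not doubleword-aligned.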
BC1T Branch on FPU Condition True
BC1F Branch on FPU Condition False
BC1TL Branch on FPU Condition True Likely
BC1FL Branch on FPU Condition False Likely
The four Branch on Floating-Point Coprocessor instructions are extensions of the branch instructions of various prior MIPS instruction sets, with which they are upward-compatible. The BC1T and BC1F instructions are extensions of MIPS I. BC1TL and BC1FL are extensions of MIPS III. These instructions test one of eight floating-point condition codes. If no condition code is specified, condition code zero is selected. This encoding is downward-compatible with previous MIPS architectures. The branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended to 64 bits. If the contents of the floating-point condition code specified in the instruction are equal to the test value, the target address is branched to with a delay of one instruction. If the conditional branch is not taken and the nullify delay bit in the instruction is set, the instruction in the branch delay slot is nullified.
MADD Floating-Point Multiply-Add
MSUB Floating-Point Multiply-Subtract
NMADD Floating-Point Negative Multiply-Add
NMSUB Floating-Point Negative Multiply-Subtract
The four Floating-Point Multiply-Add/Subtract instructions compute two floating-point operations with one instruction. Each of these instructions performs intermediate rounding.
C.cond Compare C.cond Implies cc=0
The Floating-Point Compare instructions are upward-compatible extensions of the floating-point compare instructions in the MIPS instruction set and produce a boolean result which is stored in one of the condition codes. The contents of the two FPU source registers specified in the instruction are interpreted and arithmetically compared. A result is determined based on the comparison and the conditions specified in the instruction. If one of the values is not a number and the high-order bit of the condition field is set, an invalid operation trap occurs. Comparisons are exact and neither overflow nor underflow.
implication compiler code scheduling that compare instruction immediately followed dependent floating-point conditional move instruction, immediately followed dependent branch floating-point coprocessor condition instruction dependent integer conditional move instruction. This restriction applies only condition code specified 3-bit condition code specifier instruction. other condition codes unaffected.
MOVT.fmt Floating-Point Conditional Move on condition code true
MOVF.fmt Floating-Point Conditional Move on condition code false
MOVN.fmt Floating-Point Conditional Move on register not equal to zero
MOVZ.fmt Floating-Point Conditional Move on register equal to zero
The four Floating-Point Conditional Move instructions are used to test a condition code or a general register and then conditionally perform a floating-point move. The value of the floating-point condition code specified by the 3-bit condition code specifier, or the value of the register indicated by the 5-bit general register specifier, is compared to zero. If the result indicates that the move should be performed, the contents of the specified FPU source register are copied into the specified FPU destination register. All of these conditional floating-point move operations are non-arithmetic. Consequently, no IEEE-754 exceptions occur as a result of these instructions.
RECIP.fmt Reciprocal Approximation
RSQRT.fmt Reciprocal Square Root Approximation
The Reciprocal Approximation instruction performs a reciprocal approximation of a floating-point value. The reciprocal of the value in the floating-point source register is approximated and placed in a destination register. The numerical accuracy of this operation is implementation-dependent, based on the rounding mode used.
The Reciprocal Square Root Approximation instruction performs a reciprocal square root approximation of a floating-point value. The reciprocal of the positive square root of the value in the floating-point source register is approximated and placed in a destination register. The numerical accuracy of this operation is implementation-dependent, based on the rounding mode used.
The term approximation refers to the fact that neither of these instructions meets IEEE accuracy requirements. In both cases a small amount of precision has been sacrificed, thereby significantly reducing execution time. For example, in the case of the RECIP instruction, a quotient can be computed by taking the reciprocal of the divisor and multiplying that result by the dividend. The reduced execution time of the reciprocal operation allows RECIP followed by a MUL (multiply) instruction to be executed faster than a single DIV (divide) instruction. The performance difference between the RSQRT instruction and SQRT followed by a RECIP instruction is implementation-dependent. In the R5000, the RECIP instruction has the same latency as the DIV instruction, and RSQRT is faster than SQRT followed by RECIP.
Table 3.14 shows the execution-stage latencies and repeat throughput of the FPU instructions. The values assume that the result of the operation is immediately used by the succeeding operation.
Absolute BC1T BC1F BC1TL BC1FL CEIL.w CEIL.l CFC1 Compare CTC1 CVT.s.d CVT.s.w CVT.s.l2 CVT.d.s CVT.d.w CVT.d.l
CVT.w.s CVT.w.d CVT.l.s CVT.l.d DIV.s DIV.d DMFC1 DMTC1 FLOOR.w FLOOR.l LDC1 Load Load Indexed LWC1 MADD.s MADD.d MFC1 Move
Trap greater than bits significance Trap greater than bits significance.
Table 3.14 Floating-Point Instruction Latencies
Move Conditional MSUB.s MSUB.d MTC1 MUL.s MUL.d Negative NMADD.s NMADD.d NMSUB.s NMSUB.d Prefetch Prefetch Indexed RECIP.s RECIP.d ROUND.w ROUND.l
RSQRT.s RSQRT.d SDC1 SQRT.s SQRT.d Store Store Indexed Subtract SWC1 TRUNC.w TRUNC.l
The R5000 processor has a dual-issue, five-stage instruction pipeline with two parallel paths, one for integer (CPU) instructions and the other for floating-point (FPU) instructions. Each stage in the CPU-instruction path takes one PCycle (one cycle of the processor clock, which runs at a multiple of the frequency of the system clock, SysClock). Thus, the execution of each instruction takes at least five PCycles. An instruction can take longer; for example, if the required data is not in the cache, the data must be retrieved from main memory. In the FPU-instruction path, most instructions require more than one PCycle in the execution stage. Once the pipeline has been filled, five instructions are executed simultaneously. The figure below shows the five stages of the instruction pipeline.
Figure Instruction Pipeline Stages (each stage takes one PCycle and has two phases; the stages are Instruction Fetch, Register Read, Execution, Data Load/Store, and Write Back)
The figure below shows the pipeline activities occurring during each pipeline stage for load, store, and branch instructions.
Clock Phase IFetch Decode Load/Store ITLBM ITLBR IDEC DCAD DTLBM Branch DCAA JTLB1 DTLBR DCLA JTLB2
ITLBM IDEC DCAA JTLB1 DTLBM
Instruction-cache address decode Instruction address translation match Instruction check Instruction address translation phase Execute operation phase Data virtual-address calculation Data-cache array access JTLB address translation phase Data address translation match Data check Data-cache write
ITLBR DCAD DCLA JTLB2 DTLBR
Instruction-cache array access Instruction address translation read Register operand read Execute operation phase Write back register file Data-cache address decode Data-cache load align JTLB address translation phase Data address translation read Store align Branch address calculation
Figure Integer (CPU) Pipeline Activities
The R5000 dual-issue mechanism allows two instructions to be dispatched per processor cycle (PCycle) under the following condition: one floating-point operation can be dispatched along with any other type of instruction, as long as the other instruction is not another floating-point operation. In this context, "any other type of instruction" includes integer instructions as well as floating-point loads and stores. The figure below shows a simplified diagram of the dual-issue mechanism.
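The dual-issue condition reduces to a simple predicate: exactly one instruction of the pair is a floating-point operation. The sketch below assumes hypothetical instruction-kind labels (`"fp_op"`, `"int"`, `"fp_load"`, `"fp_store"`) chosen only for illustration.

```python
def can_dual_issue(kind_a, kind_b):
    """True when exactly one of the two instructions is a floating-point operation.

    Floating-point loads and stores count as "any other type of instruction",
    so they may pair with a floating-point operation.
    """
    return (kind_a == "fp_op") != (kind_b == "fp_op")
```

Under this model a floating-point operation pairs with an integer instruction or a floating-point load, while two floating-point operations, or two instructions that are both not floating-point operations, are not dispatched together.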
Figure Dual-Issue Mechanism, Showing Pipelines (the instruction cache feeds a 2-deep instruction buffer; each of the two pipelines then proceeds through Register File Read, Execution, Load/Store, and Register File Write stages)
The events that occur in each stage are:
Instruction Fetch Stage
Two instructions are fetched from the instruction cache and placed in the 2-deep instruction buffer. Issue logic determines the type of each instruction and the pipeline to which the instruction is routed. Also, the instruction cache tag is checked against the page frame number (PFN) obtained from the ITLB.
Register Read Stage
Any required operands are fetched from the appropriate register file, and the decision is made to either proceed or slip the instruction based on any interlock conditions. For a branch instruction, the branch address is calculated.
Execution Stage
The appropriate ALU begins the arithmetic, logical, or shift operation. The data virtual address is calculated for any load or store instructions. The appropriate ALU determines whether the branch condition is true. The data cache access is started.
Data Load/Store Stage
The data cache access is completed. Data is shifted down and extended. Data address translation in the DTLB completes. The virtual-to-physical address translation in the JTLB is performed. The data cache tag is checked against the PFN from the DTLB or JTLB for any data cache access.
Write Back Stage
The processor resolves all exceptions. For register-to-register and load instructions, the result is written back to the appropriate register file.
The CPU pipeline has a branch delay of one cycle and a load delay of one cycle. The one-cycle branch delay is a result of the branch comparison logic operating during one pipeline stage of the branch. This allows the branch-target address calculated in the previous stage to be used for the instruction access in the following stage. The figure below illustrates the branch delay.
Figure CPU-Pipeline Branch Delay (annotations: Branch; Delay Slot; branch and fall-through address calculated; address selection made)
The completion of a load at the end of its pipeline stage produces an operand that is available to the subsequent instruction following the load delay slot. The figure below shows the load delay.
Figure CPU-Pipeline Load Delay (annotations: Load; Delay Slot)
Smooth pipeline flow is interrupted when cache misses or exceptions occur, or when data dependencies are detected. Interruptions handled using hardware, such as cache misses, are referred to as interlocks, while those that are handled using software are called exceptions. There are two types of interlocks:
Stalls, which are resolved by halting the pipeline.
Slips, which require one part of the pipeline to advance while another part of the pipeline is held static.
At each cycle, exception and interlock conditions are checked for all active instructions. Because each exception or interlock condition corresponds to a particular pipeline stage, a condition can be traced back to the particular instruction in the exception/interlock stage. For instance, a Reserved Instruction (RI) exception is raised in the execution stage.
Stall I
Slip
MDSt FCBusy
Exceptions
ITLB
IPErr
DTLB DTMod Intr
Reset DPErr Trap
Table Relationship CPU-Pipeline Stage Interlock Condition
ITLB Intr IPErr ExTrap DTLB TLBMod DPErr Reset External Interrupt IBus Error Reserved Instruction Breakpoint System Call Coprocessor Unusable Instruction Parity Error Integer Overflow Interrupt Stage Traps
Instruction Translation Address Exception
Data Translation Address Exception Modified Data Error Data Parity Error Non-maskable Interrupt Reset Table CPU-Pipeline Exceptions
IICM CPBE MDSt FCBsy Instruction Miss Instruction Cache Miss Coprocessor Possible Exception Data Cache Miss Load Interlock Multiply/Divide Start Busy
Table CPU-Pipeline Interlocks
When an exception condition occurs, the relevant instruction and all those that follow it in the pipeline are cancelled. Accordingly, any stall conditions and any later exception conditions that may have referenced this instruction are inhibited; there is no benefit in servicing stalls for a cancelled instruction.
When an exception condition is detected, the processor aborts the instruction which caused the exception, as well as all subsequent instructions. When this instruction reaches the Write Back stage, three events occur: the exception flag causes the instruction to write various CP0 registers with the exception state, the current PC is changed to the appropriate exception vector address, and the exception bits of earlier pipeline stages are cleared.
This implementation allows all instructions which occurred before the exception to complete, and all instructions which occurred after the instruction to be aborted. Hence the value of the EPC is such that execution can be restarted. In addition, all exceptions are guaranteed to be taken in order. The figure below illustrates the exception detection mechanism for a Reserved Instruction (RI) exception.
Figure CPU-Pipeline Exception Detection Mechanism (annotations: Exception; Instruction Aborted; Exception Vector Address)
A stall condition is used to suspend the pipeline for conditions detected late in the pipeline. When a stall occurs, the processor resolves the condition and then restarts the pipeline. Once the interlock is removed, the restart sequence begins before the pipeline resumes execution. The restart sequence reverses the pipeline overrun by inserting the correct information into the pipeline. The figure below shows a data cache miss stall.
Figure CPU-Pipeline Servicing of a Data Cache Miss (steps: 1. Detect cache miss; 2. Start moving dirty cache line data to write buffer; 3. Fetch first doubleword into cache and restart pipeline; 4. Load remainder of cache line into cache)
The data cache miss is detected in the pipeline. If the cache line to be replaced is dirty, the data is moved to the internal write buffer in the next cycle. The squiggly line in the figure indicates the memory access. Once memory is accessed and the first doubleword of data is returned, the pipeline is restarted. The remainder of the cache line is returned in subsequent cycles. The dirty data in the write buffer is written to memory after the cache-line fill is completed.
During pipeline stages, internal logic determines whether possible start current instruction this cycle. required source operands available, well hardware resources needed complete operation, instruction issued. Otherwise, instruction slips. Slipped cycles retried subsequent cycles until they issued. Pipeline stages advance normally during slips attempt resolve conflict. NOPs automatically inserted into bubbles which created pipeline. Instructions caused "branch likely" instructions, ERET, exceptions cause slips. Figure shows instruction slip during instruction-cache miss.
Figure Slips During an Instruction-Cache Miss (instructions issue, slip for several cycles while the miss is serviced, then issue again; steps: 1. Detect cache miss; 2. Start moving dirty cache line data to write buffer; 3. Fetch first doubleword into cache and restart pipeline; 4. Load remainder of cache line into cache)
Instruction-cache misses detected R-stage pipeline. Slips detected stage. Instruction-cache misses never require writeback operation because writes allowed instruction cache. Unlike data cache, early restart, where pipeline restarted after only portion cache-line fill occurred, implemented instruction cache. requested cache line loaded into instruction cache entirety before pipeline restarted.
The R5000 processor has a write buffer which improves the performance of write operations to external memory. All write cycles use the write buffer. The write buffer holds four 64-bit address and data pairs. On a cache miss requiring a write-back, the entire buffer is used for the write-back data and allows the processor to proceed in parallel with the memory update. For uncached and write-through stores, the write buffer decouples the CPU from the write to memory. If the write buffer is full, additional stores are stalled until there is room for them in the write buffer.
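The stall-when-full behavior of the four-entry write buffer can be modeled as a small FIFO. This is a behavioral sketch only; the class and method names are hypothetical, and the in-order draining is an assumption rather than something the text states.

```python
from collections import deque

class WriteBuffer:
    """Four-entry buffer of (address, 64-bit data) pairs, drained in order."""
    DEPTH = 4

    def __init__(self):
        self.entries = deque()

    def try_store(self, address, data):
        """Queue a store; return False when the store would have to stall."""
        if len(self.entries) == self.DEPTH:
            return False                       # buffer full: store stalls
        self.entries.append((address, data & 0xFFFFFFFFFFFFFFFF))
        return True

    def drain_one(self):
        """Retire the oldest pending write to memory, if any."""
        return self.entries.popleft() if self.entries else None
```

A fifth back-to-back store returns `False` until `drain_one()` frees an entry, which corresponds to the stall described above.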
This section describes integer exception processing done by the CPU, including an explanation of exception processing, followed by the format of each exception register. Floating-point exception processing is described later.
The processor receives exceptions from a number of sources, including translation lookaside buffer (TLB) misses, arithmetic overflows, I/O interrupts, and system calls. When the CPU detects one of these exceptions, the normal sequence of instruction execution is suspended and the processor enters Kernel mode. The processor then disables interrupts and forces execution of a software exception processor (called a handler) located at a fixed address. The handler typically saves the context of the processor, including the contents of the program counter, the current operating mode (User or Supervisor), and the status of the interrupts (enabled or disabled). This context is saved so it can be restored when the exception has been serviced.
When an exception occurs, the CPU loads the Exception Program Counter (EPC) register with a location where execution can restart after the exception has been serviced. The restart location in the EPC register is the address of the instruction that caused the exception or, if the instruction was executing in a branch-delay slot, the address of the branch instruction immediately preceding the delay slot.
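The choice of restart location can be expressed in a few lines. The sketch assumes 4-byte instructions, so the branch immediately preceding a delay slot is at the faulting address minus 4; the function name is hypothetical.

```python
def restart_location(faulting_pc, in_branch_delay_slot):
    """Return (epc, bd): the restart address and the branch-delay indication."""
    if in_branch_delay_slot:
        # Restart at the branch so that branch and delay slot both re-execute.
        return faulting_pc - 4, 1
    return faulting_pc, 0
```

For an instruction at `0x80001004` sitting in a delay slot, the restart address is the branch at `0x80001000`.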
System Control Coprocessor (CP0) registers used exception processing. Table lists these registers their unique register numbers. instance, register register number remaining registers used memory management described Chapter Software examines registers during exception processing determine cause exception state time exception occurred. registers Table used exception processing, described sections that follow.
Context BadVAddr (Bad Virtual Address) Count Compare register Status Cause (Exception Program Counter) XContext CacheErr (Cache Error Status) ErrorEPC (Error Exception Program Counter)
diagnostics, this register R/W.
Table Exception Processing Registers
general registers interlocked result instruction normally used next instruction; result available right away, processor stalls until available. registers interlocked, however; there some delay before value written instruction available following instructions. This delay need explicitly coded software.
The Context register is a read/write register containing the pointer to an entry in the page table entry (PTE) array; this array is an operating system data structure that stores virtual-to-physical address translations. When there is a TLB miss, the CPU loads the TLB with the missing translation from the PTE array. Normally, the operating system uses the Context register to address the current page map, which resides in the kernel-mapped segment, kseg3. The Context register duplicates some of the information provided in the BadVAddr register, but the information is arranged in a form that is more useful for a software TLB exception handler. The figure below shows the format of the Context register; the table below describes the Context register fields.
Figure Context Register Format (fields in both 32-bit mode and 64-bit mode: PTEBase, BadVPN2)
Field      Description
BadVPN2    This field is written by hardware on a miss. It contains the virtual page number (VPN) of the most recent virtual address that did not have a valid translation.
PTEBase    This field is for use by the operating system. It is normally written with a value that allows the operating system to use the Context register as a pointer into the current PTE array in memory.
Table Context Register Fields
The 19-bit BadVPN2 field contains bits 31:13 of the virtual address that caused the TLB miss; bit 12 is excluded because a single TLB entry maps an even-odd page pair. For a 4-Kbyte page size, this format can directly address the pair-table of 8-byte PTEs. For other page and PTE sizes, shifting and masking this value produces the appropriate address.
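The way the Context value doubles as a pointer into the PTE array can be illustrated as follows. The field positions used here (PTEBase in bits 31:23, BadVPN2 in bits 22:4, low four bits zero, 32-bit mode) are an assumption based on the conventional layout for this register family, since the format figure is not reproduced; the function name is hypothetical.

```python
def context_value(pte_base, bad_vaddr):
    """Combine PTEBase with BadVPN2 derived from the faulting virtual address."""
    bad_vpn2 = (bad_vaddr >> 13) & 0x7FFFF     # VA bits 31:13 (bit 12 dropped)
    # Shifting by 4 scales by 16 bytes: one even/odd pair of 8-byte PTEs.
    return (pte_base & 0xFF800000) | (bad_vpn2 << 4)
```

Two addresses in the same even-odd page pair, such as `0x00002000` and `0x00003FFF`, produce the same value, which points at the 16-byte pair of PTEs for those pages.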
The Bad Virtual Address register (BadVAddr) is a read-only register that displays the most recent virtual address that caused one of the following exceptions: TLB Invalid, TLB Modified, TLB Refill, Virtual Coherency Data Access, or Virtual Coherency Instruction Fetch. The figure below shows the format of the BadVAddr register. The BadVAddr register does not save any information for bus errors, since bus errors are not addressing errors.
Figure BadVAddr Register Format (a single Bad Virtual Address field in both 32-bit mode and 64-bit mode)
The Count register acts as a timer, incrementing at a constant rate (half the maximum instruction issue rate) whether or not an instruction is executed, retired, or any forward progress is made through the pipeline. This register can be read or written. It can be written for diagnostic purposes or system initialization; for example, to synchronize processors. The figure below shows the format of the Count register.
Figure Count Register Format (a single Count field)
The Compare register acts as a timer (see also the Count register); it maintains a stable value that does not change on its own. When the value of the Count register equals the value of the Compare register, interrupt bit IP(7) in the Cause register is set. This causes an interrupt as soon as the interrupt is enabled. Writing a value to the Compare register, as a side effect, clears the timer interrupt. For diagnostic purposes, the Compare register is a read/write register. In normal use, however, the Compare register is write-only. The figure below shows the format of the Compare register.
Figure Compare Register Format (a single Compare field)
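The interaction between Count, Compare, and IP(7) can be modeled with a small class. This is a behavioral sketch under the assumption that both registers are 32 bits wide and that Count wraps around; the class and attribute names are hypothetical.

```python
class CountCompareTimer:
    """Behavioral model: IP(7) is set when Count reaches Compare."""
    MASK32 = 0xFFFFFFFF

    def __init__(self):
        self.count = 0
        self.compare = 0
        self.ip7 = False                   # timer interrupt pending

    def tick(self):
        """Advance Count by one and raise IP(7) on a match."""
        self.count = (self.count + 1) & self.MASK32
        if self.count == self.compare:
            self.ip7 = True

    def write_compare(self, value):
        """Writing Compare also clears the pending timer interrupt."""
        self.compare = value & self.MASK32
        self.ip7 = False
```

A handler that wants a periodic tick would typically call `write_compare(count + interval)` each time the interrupt fires, which both rearms the timer and clears the pending bit.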
The Status register (SR) is a read/write register that contains the operating mode, interrupt enabling, and the diagnostic states of the processor. The following list describes the more important Status register fields:
The 8-bit Interrupt Mask (IM) field controls the enabling of eight interrupt conditions. Interrupts must be enabled before they can be asserted, and the corresponding bits must be set in both the Interrupt Mask field of the Status register and the Interrupt Pending field of the Cause register. IM[1:0] are software interrupt masks, while IM[7:2] correspond to Int[5:0].
The 4-bit Coprocessor Usability (CU) field controls the usability of 4 possible coprocessors. Regardless of the CU0 bit setting, CP0 is always usable in Kernel mode. For all other cases, an access to an unusable coprocessor causes an exception.
The 9-bit Diagnostic Status (DS) field is used for self-testing, and checks the cache and virtual memory system.
The Reverse-Endian (RE) bit reverses the endianness of the machine. The processor can be configured as either little-endian or big-endian at system reset; the reverse-endian selection is used in Kernel and Supervisor modes, and in User mode when the RE bit is 0. Setting the RE bit to 1 inverts the User mode endianness.
The figure below shows the format of the Status register, and the Status Register Fields table describes its fields. A further figure and table provide additional information on the Diagnostic Status (DS) field. All bits in the DS field except TS are readable and writable.
Figure Status Register (includes the coprocessor usability bits Cu3..Cu0)
Controls usability each four coprocessor unit numbers. always usable when Kernel mode, regardless setting bit. Setting enables MIPS instruction set, usable unusable Reserved. Enables additional floating-point registers registers registers Reverse-Endian bit, valid User mode. Diagnostic Status field (see Figure 5.6). Interrupt Mask: controls enabling each external, internal, software interrupts. interrupt taken interrupts enabled, corresponding bits both Interrupt Mask field Status register Interrupt Pending field Cause register. disabled enabled Enables 64-bit addressing Kernel mode. extended-addressing refill exception used misses kernel addresses. 32-bit 64-bit Enables 64-bit addressing operations Supervisor mode. extended-addressing refill exception used misses supervisor addresses. 32-bit 64-bit Enables 64-bit addressing operations User mode. extended-addressing refill exception used misses user addresses. 32-bit 64-bit Mode bits User Supervisor Kernel Error Level; processor when Reset, Soft Reset, NMI, Cache Error exception taken. normal error Exception Level; processor when exception other than Reset, Soft Reset, NMI, Cache Error exception taken. normal exception Interrupt Enable disable interrupts enables interrupts Table Status Register Fields
Diagnostic Status Field
Figure Status Register Field
Controls location refill general exception vectors. normal bootstrap Reserved. Must written zeroes. Returns zeroes when read. Indicates that soft reset occurred. (tag match valid state) miss indication last CACHE Invalidate, Write Back Invalidate, Write Back, Virtual, Create Dirty Exclusive secondary cache. miss Contents register modify check bits caches when description register. Specifies that cache parity errors cannot cause exceptions. parity/ECC remain enabled disables parity/ECC Reserved. Must written zeroes, returns zeroes when read. Table Status Register Diagnostic Status Bits
Fields Status register following modes access states: Interrupt Enable: Interrupts enabled settings bits when following conditions true: Operating Modes: following Status register settings required User, Kernel, Supervisor modes. User Mode: 102, Supervisor Mode: 012, Kernel Mode: 002, 64-bit Modes: following Status register settings select 64-bit operation User, Kernel, Supervisor operating modes. Enabling 64-bit operation permits execution 64-bit opcodes translation 64-bit addresses. 64-bit operation User, Kernel Supervisor modes independently. 64-bit addressing Kernel mode enabled when 64-bit operations always valid Kernel mode. 64-bit addressing operations enabled Supervisor mode when 64-bit addressing operations enabled User mode when Access kernel address space allowed when processor Kernel mode. Access supervisor address space allowed when processor Kernel Supervisor operating mode. Access user address space allowed three operating modes.
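The mode and interrupt-enable rules above can be sketched as bit tests. The mode encodings 10, 01, and 00 (binary) come from the text; everything else is a loudly labeled assumption based on the conventional layout for this register family: the bit positions (IE in bit 0, EXL in bit 1, ERL in bit 2, the mode field in bits 4:3, IM in bits 15:8, IP in Cause bits 15:8), the rule that interrupts require IE = 1 with EXL = 0 and ERL = 0, and the rule that EXL or ERL forces Kernel mode. The function names are hypothetical.

```python
# Assumed bit positions (conventional layout, not recoverable from this copy).
IE, EXL, ERL = 1 << 0, 1 << 1, 1 << 2
MODE_SHIFT, MODE_MASK = 3, 0b11
IM_SHIFT, IP_SHIFT = 8, 8

def operating_mode(status):
    """Decode User/Supervisor/Kernel from the mode bits, honoring EXL/ERL."""
    if status & (EXL | ERL):               # assumed: exception/error level forces Kernel
        return "kernel"
    mode = (status >> MODE_SHIFT) & MODE_MASK
    return {0b10: "user", 0b01: "supervisor", 0b00: "kernel"}.get(mode, "reserved")

def interrupts_enabled(status):
    """Assumed rule: IE set, and neither EXL nor ERL set."""
    return bool(status & IE) and not (status & (EXL | ERL))

def interrupt_taken(status, cause):
    """An interrupt is taken when enabled and some IM bit matches a pending IP bit."""
    im = (status >> IM_SHIFT) & 0xFF
    ip = (cause >> IP_SHIFT) & 0xFF
    return interrupts_enabled(status) and (im & ip) != 0
```

With `status = 0x8001` and `cause = 0x8000`, the timer interrupt IP(7) is both pending and unmasked, so `interrupt_taken` returns `True`; setting EXL as well suppresses it.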
contents Status register undefined reset, except bits, which distinguishes between Reset exception Soft Reset exception (caused either Reset* Nonmaskable Interrupt [NMI]).
The 32-bit read/write Cause register describes the cause of the most recent exception. The figure below shows the fields of this register, and the Cause Register Fields table describes them. All bits in the Cause register, with the exception of the IP(1:0) bits, are read-only; IP(1:0) are used for software interrupts.
Indicates whether last exception taken occurred branch-delay slot. delay slot normal Coprocessor unit number referenced when Coprocessor Unusable exception taken. Indicates interrupt pending. interrupt pending interrupt Exception code field (see Table 5.6) Reserved. Must written zeroes, returns zeroes when read. Table Cause Register Fields
Figure Cause Register Format (includes the exception code field, ExcCode)
TLBL TLBS AdEL AdES Interrupt
modification exception exception (load instruction fetch) exception (store) Address error exception (load instruction fetch) Address error exception (store) error exception (instruction fetch) error exception (data reference: load store) Syscall exception Breakpoint exception Reserved instruction exception Table Cause Register ExcCode Fields
16-31
-FPE
Coprocessor Unusable exception Arithmetic Overflow exception Trap exception Reserved Floating-Point exception Reserved
Table Cause Register ExcCode Fields
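A handler typically begins by pulling the fields described above out of the Cause register. The sketch below assumes the conventional field positions for this register family (branch-delay indication in bit 31, coprocessor number in bits 29:28, IP in bits 15:8, ExcCode in bits 6:2) and the conventional numbering of the exception codes; neither is recoverable from the tables as reproduced here, so both are assumptions, and the names are hypothetical.

```python
# Assumed ExcCode numbering (conventional MIPS values; other codes are reserved).
EXC_NAMES = {
    0: "Int", 1: "Mod", 2: "TLBL", 3: "TLBS", 4: "AdEL", 5: "AdES",
    6: "IBE", 7: "DBE", 8: "Sys", 9: "Bp", 10: "RI", 11: "CpU",
    12: "Ov", 13: "Tr", 15: "FPE",
}

def decode_cause(cause):
    """Split a 32-bit Cause value into its fields (assumed bit positions)."""
    code = (cause >> 2) & 0x1F
    return {
        "BD": (cause >> 31) & 1,           # exception taken in a branch-delay slot
        "CE": (cause >> 28) & 0x3,         # coprocessor number for Coprocessor Unusable
        "IP": (cause >> 8) & 0xFF,         # pending interrupts
        "ExcCode": code,
        "name": EXC_NAMES.get(code, "Reserved"),
    }
```

For example, `decode_cause(0x80000028)` reports a Reserved Instruction exception (`"RI"`) taken in a branch-delay slot.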
The Exception Program Counter (EPC) is a read/write register that contains the address at which processing resumes after an exception has been serviced. For synchronous exceptions, the EPC register contains either:
the virtual address of the instruction that was the direct cause of the exception, or
the virtual address of the immediately preceding branch or jump instruction (when the instruction is in a branch-delay slot, and the Branch Delay bit in the Cause register is set).
The processor does not write to the EPC register when the Exception Level (EXL) bit in the Status register is set. The figure below shows the format of the EPC register.
Figure EPC Register Format (a single EPC field in both 32-bit mode and 64-bit mode)
read/write XContext register contains pointer entry page table entry (PTE) array, operating system data structure that stores virtual-to-physical address translations. When there miss, operating system software loads with missing translation from array. XContext register duplicates some information provided BadVAddr register, puts form useful software exception handler. XContext register with XTLB refill handler, which loads entries references 64-bit address space, included solely operating system use. operating system sets base field register, needed. Normally, operating system uses Context register address current page map, which resides kernel-mapped segment kseg3. Figure shows format XContext register; Table describes XContext register fields.
Figure XContext Register Format (fields: PTEBase, R, BadVPN2)
The 27-bit BadVPN2 field holds bits 39:13 of the virtual address that caused the TLB miss; bit 12 is excluded because a single TLB entry maps to an even-odd page pair. For a 4-Kbyte page size, this format can be used directly to address the pair-table of 8-byte PTEs. For other page sizes, shifting and masking this value produces the appropriate address.
Field     Description
BadVPN2   The Bad Virtual Page Number/2 field is written by hardware on a miss. It contains the VPN of the most recent invalidly translated virtual address.
R         The Region field contains bits 63:62 of the virtual address.
          00 = user, 01 = supervisor, 11 = kernel
PTEBase   The Page Table Entry Base read/write field is normally written with a value that allows the operating system to use the Context register as a pointer into the current PTE array in memory.
Table XContext Register Fields
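
To make the field packing concrete, here is a hedged C sketch that forms the XContext value hardware would present for a given faulting address. The bit positions (BadVPN2 in bits 30:4, R in bits 32:31, PTEBase in bits 63:33) are the conventional 64-bit MIPS layout and are an assumption, since the format figure did not survive in this copy; the helper names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

#define XCTX_BADVPN2_SHIFT 4
#define XCTX_BADVPN2_MASK  0x7FFFFFFULL   /* 27-bit field */
#define XCTX_R_SHIFT       31
#define XCTX_PTEBASE_SHIFT 33

/* Build the XContext value for a faulting virtual address.
 * ptebase_field is the 31-bit value the operating system wrote earlier. */
uint64_t xcontext_for(uint64_t ptebase_field, uint64_t bad_vaddr)
{
    uint64_t vpn2   = (bad_vaddr >> 13) & XCTX_BADVPN2_MASK; /* VA bits 39:13 */
    uint64_t region = (bad_vaddr >> 62) & 0x3u;              /* VA bits 63:62 */
    assert(ptebase_field <= 0x7FFFFFFFULL);
    return (ptebase_field << XCTX_PTEBASE_SHIFT)
         | (region << XCTX_R_SHIFT)
         | (vpn2 << XCTX_BADVPN2_SHIFT);
}

uint64_t xcontext_badvpn2(uint64_t xcontext)
{
    return (xcontext >> XCTX_BADVPN2_SHIFT) & XCTX_BADVPN2_MASK;
}

unsigned xcontext_region(uint64_t xcontext)
{
    return (unsigned)((xcontext >> XCTX_R_SHIFT) & 0x3u);
}
```

Because BadVPN2 starts at bit 4, consecutive page pairs land 16 bytes apart, which matches two 8-byte PTEs per pair for 4-Kbyte pages.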
The 8-bit Error Checking and Correcting (ECC) register reads or writes primary-cache data parity bits for cache initialization, cache diagnostics, or cache error processing. (Tag parity is loaded from and stored to the TagLo register.) Figure 5.10 shows the format of the ECC register; the ECC Register Fields table describes the register fields.

The ECC register is loaded by the Data Cache Index Load operation. The content of the ECC register is:
- written into the primary data cache on store instructions (instead of the computed parity) when the CE bit of the Status register is set;
- substituted for the computed instruction parity on the Instruction Cache Line Fill operation.
Figure 5.10 ECC Register Format
Field   Description
ECC     An 8-bit field specifying the parity bits read from or written to a primary cache.
0       Reserved. Must be written as zeroes, and returns zeroes when read.
Table ECC Register Fields
The 32-bit read-only CacheErr register processes errors in the secondary cache and parity errors in the primary cache. The register holds cache index and status bits that indicate the source and nature of the error; it is loaded when a Cache Error exception is asserted. Parity errors cannot be corrected. Figure 5.11 shows the format of the CacheErr register and the CacheErr Register Fields table describes the CacheErr register fields.
Figure 5.11 CacheErr Register Format
Field   Description
ER      Type of reference.
        0 = instruction, 1 = data
EC      Cache level of the error.
        0 = primary, 1 = reserved
ED      Indicates whether a data field error occurred.
        0 = no error, 1 = error
ET      Indicates whether a tag field error occurred.
        0 = no error, 1 = error
EE      This bit is set if the error occurred on the SysAD bus.
EB      This bit is set if a data error occurred in addition to the instruction error (indicated by the remainder of the bits). If so, this requires flushing the data cache after fixing the instruction error.
0       Reserved. Must be written as zeroes, and returns zeroes when read.
Table CacheErr Register Fields
The read/write ErrorEPC register is similar to the EPC register, except that ErrorEPC is used on parity-error exceptions. It is also used to store the program counter (PC) on Reset, Soft Reset, and nonmaskable interrupt (NMI) exceptions.

The ErrorEPC register contains the virtual address at which instruction processing can resume after servicing an error. This address can be:
- the virtual address of the instruction that caused the exception, or
- the virtual address of the immediately preceding branch or jump instruction, when this address is in a branch-delay slot.

There is no branch-delay slot indication for the ErrorEPC register. Figure 5.12 shows the format of the ErrorEPC register.
Figure 5.12 ErrorEPC Register Format (32-bit mode and 64-bit mode)
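When the EXL and ERL bits in the Status register are both 0, either User, Supervisor, or Kernel operating mode is specified by the KSU bits in the Status register. When either the EXL or the ERL bit is 1, the processor is in Kernel mode.

When the processor takes an exception, the EXL bit is set to 1, which means the system is in Kernel mode. After saving the appropriate state, the exception handler typically changes KSU to Kernel mode and resets the EXL bit back to 0. When restoring the state and restarting, the handler restores the previous value of the KSU field and sets the EXL bit back to 1. Returning from an exception also resets the EXL bit to 0.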
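
The mode rule above can be expressed as a small C function. This is a sketch under assumptions: the bit positions (EXL in bit 1, ERL in bit 2, KSU in bits 4:3) and the KSU encoding (00 = Kernel, 01 = Supervisor, 10 = User) are the conventional MIPS III values and are not stated in this copy of the text; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SR_EXL       (1u << 1)   /* assumed position */
#define SR_ERL       (1u << 2)   /* assumed position */
#define SR_KSU_SHIFT 3           /* assumed: KSU occupies bits 4:3 */

typedef enum { MODE_KERNEL, MODE_SUPERVISOR, MODE_USER, MODE_RESERVED } cpu_mode;

/* EXL or ERL forces Kernel mode; otherwise KSU selects the mode. */
cpu_mode current_mode(uint32_t sr)
{
    if (sr & (SR_EXL | SR_ERL))
        return MODE_KERNEL;
    switch ((sr >> SR_KSU_SHIFT) & 0x3u) {
    case 0:  return MODE_KERNEL;
    case 1:  return MODE_SUPERVISOR;
    case 2:  return MODE_USER;
    default: return MODE_RESERVED;
    }
}

/* Usage example: a user-mode process takes an exception (EXL becomes 1). */
void mode_example(void)
{
    uint32_t sr = 0x10u;                       /* KSU = User, EXL = ERL = 0 */
    assert(current_mode(sr) == MODE_USER);
    assert(current_mode(sr | SR_EXL) == MODE_KERNEL);
}
```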
In the following sections, sample hardware processes for various exceptions are shown, together with the servicing required by the handler (software).
Figure 5.13 shows the Reset exception process.

    T: undefined
    Random   <- TLBENTRIES-1
    Wired    <- 0
    Config   <- (fields from the boot-time mode stream) || undefined
    ErrorEPC <- PC
    SR       <- SR31:23 || 1 || 0 || 0 || SR19:3 || 1 || SR1:0
    PC       <- 0xFFFF FFFF BFC0 0000

Figure 5.13 Reset Exception Processing
Figure 5.14 shows the Cache Error exception process.

    ErrorEPC <- PC
    CacheErr <- error status and cache index
    SR       <- SR31:3 || 1 || SR1:0
    if SR22 = 1 then                           /* What is the BEV bit setting */
        PC <- 0xFFFF FFFF BFC0 0200 + 0x100    /* Access boot-PROM area */
    else
        PC <- 0xFFFF FFFF A000 0000 + 0x100    /* Access main memory area */
    endif

Figure 5.14 Cache Error Exception Processing
Figure 5.15 shows the Soft Reset exception process.

    ErrorEPC <- PC
    SR       <- SR31:23 || 1 || 0 || 1 || SR19:3 || 1 || SR1:0
    PC       <- 0xFFFF FFFF BFC0 0000

Figure 5.15 Soft Reset Exception Processing
Figure 5.16 shows the process used for exceptions other than Reset, Soft Reset, NMI, and Cache Error.

    Cause <- BD || 0 || CE || 0 || Cause15:8 || 0 || ExcCode || 0
    if SR1 = 0 then    /* System in User or Supervisor mode with no current exception */
        EPC <- PC
    endif
    SR <- SR31:2 || 1 || SR0
    if SR22 = 1 then
        PC <- 0xFFFF FFFF BFC0 0200 + vector    /* access to uncached space */
    else
        PC <- 0xFFFF FFFF 8000 0000 + vector    /* access to cached space */
    endif

Figure 5.16 General Exception Processing
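
The general exception sequence can be modeled in software for study or simulation. The following C sketch is one possible rendering, not the hardware's definition: the `cpu_state` structure, the function name, and the bit positions (BD = Cause bit 31, CE = bits 29:28, IP = bits 15:8, ExcCode = bits 6:2, EXL = Status bit 1, BEV = Status bit 22) are assumptions based on the conventional MIPS III layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SR_EXL (1u << 1)    /* assumed position */
#define SR_BEV (1u << 22)   /* SR22 in the figure */

typedef struct {
    uint64_t pc;     /* on entry: the restart address for this exception */
    uint64_t epc;
    uint32_t sr;
    uint32_t cause;
} cpu_state;

/* Model of general exception entry: update Cause, conditionally save EPC,
 * set EXL, and jump to base + vector offset. */
void take_general_exception(cpu_state *s, unsigned exccode, unsigned ce,
                            int bd, uint64_t vector)
{
    assert(s != NULL);

    /* Keep the pending-interrupt bits 15:8; rewrite BD, CE and ExcCode. */
    s->cause = ((uint32_t)(bd ? 1u : 0u) << 31)
             | ((uint32_t)(ce & 0x3u) << 28)
             | (s->cause & 0x0000FF00u)
             | ((uint32_t)(exccode & 0x1Fu) << 2);

    /* EPC is written only when no exception is already being handled. */
    if ((s->sr & SR_EXL) == 0)
        s->epc = s->pc;

    s->sr |= SR_EXL;

    s->pc = ((s->sr & SR_BEV) ? 0xFFFFFFFFBFC00200ULL
                              : 0xFFFFFFFF80000000ULL) + vector;
}
```

Calling it twice in a row without clearing EXL leaves EPC at its first value, which is the nested-exception behavior the `if SR1 = 0` test protects.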
The Reset, Soft Reset, and NMI exceptions are always vectored to location 0xFFFF_FFFF_BFC0_0000. Addresses for other exceptions are a combination of a vector offset and a base address. The vector associated with a general exception is called the common exception vector; the base address is determined by the BEV bit of the Status register.

Table 5.10 shows the 64-bit-mode vector base address for all exceptions; the 32-bit mode address is the low-order 32 bits (for instance, the base address for NMI in 32-bit mode is 0xBFC0 0000). Table 5.11 shows the vector offset added to the base address to create the exception address.

When BEV = 0, the vector base address for the cache error exception changes from kseg0 (0xFFFF FFFF 8000 0000) to kseg1 (0xFFFF FFFF A000 0000). This change indicates that the caches are initialized and that the vector can be cached. When BEV = 1, the vector base for the cache error exception is 0xFFFF FFFF BFC0 0200. This is an uncached and unmapped space, allowing the exception to bypass the cache and TLB.
BEV   Vector Base Address
0     0xFFFF FFFF 8000 0000
1     0xFFFF FFFF BFC0 0200
Table 5.10 Exception Vector Base Address
Exception                                         Vector Offset
TLB refill, EXL = 0                               0x000
XTLB refill, EXL = 0 (X = 64-bit TLB)             0x080
Cache Error                                       0x100
Others                                            0x180
Reset, Soft Reset, NMI                            none
Table 5.11 Exception Vector Offsets
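
Putting the base addresses and offsets together, the exception address can be computed as in the C sketch below. The enum and function names are hypothetical; the rule that a TLB or XTLB refill taken with EXL = 1 falls back to the common offset 0x180 is conventional MIPS behavior and is an assumption here.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    EXC_RESET, EXC_SOFT_RESET, EXC_NMI,
    EXC_TLB_REFILL, EXC_XTLB_REFILL, EXC_CACHE_ERROR, EXC_OTHER
} exc_kind;

/* 64-bit-mode exception address for a given exception kind, BEV and EXL. */
uint64_t exception_vector(exc_kind kind, int bev, int exl)
{
    const uint64_t base = bev ? 0xFFFFFFFFBFC00200ULL : 0xFFFFFFFF80000000ULL;

    switch (kind) {
    case EXC_RESET:
    case EXC_SOFT_RESET:
    case EXC_NMI:
        return 0xFFFFFFFFBFC00000ULL;             /* fixed vector, no offset */
    case EXC_CACHE_ERROR:                          /* always uncached */
        return (bev ? 0xFFFFFFFFBFC00200ULL : 0xFFFFFFFFA0000000ULL) + 0x100;
    case EXC_TLB_REFILL:
        return base + (exl ? 0x180 : 0x000);       /* EXL = 1 case is assumed */
    case EXC_XTLB_REFILL:
        return base + (exl ? 0x180 : 0x080);       /* EXL = 1 case is assumed */
    case EXC_OTHER:
        return base + 0x180;
    }
    assert(0 && "unknown exception kind");
    return 0;
}

/* The 32-bit-mode address is the low-order 32 bits of the 64-bit address. */
uint32_t vector_32bit(uint64_t vector64)
{
    return (uint32_t)vector64;
}
```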
Table 5.12 describes exceptions in order of highest to lowest priority. While more than one exception can occur for a single instruction, only the exception with the highest priority is reported. Generally, the exceptions described in the following sections are first processed by hardware, then serviced by software.
Reset (highest priority)
Soft Reset
Nonmaskable Interrupt (NMI)
Address error - Instruction fetch
TLB refill - Instruction fetch
TLB invalid - Instruction fetch
Cache error - Instruction fetch
Bus error - Instruction fetch
Integer overflow, Trap, System Call, Breakpoint, Reserved Instruction, Coprocessor Unusable, or Floating-Point Exception
Address error - Data access
TLB refill - Data access
TLB invalid - Data access
TLB modified - Data write
Cache error - Data access
Bus error - Data access
Interrupt (lowest priority)
Table 5.12 Exception Priority Order
Reset Exception

Cause
The Reset exception occurs when the ColdReset* signal is asserted and then deasserted. This exception is not maskable.

Processing
The CPU provides a special interrupt vector for this exception: location 0xFFFF_FFFF_BFC0_0000 in 64-bit mode. The Reset vector resides in unmapped and uncached address space, so the hardware need not initialize the TLB or the cache to process this exception. It also means the processor can fetch and execute instructions while the caches and virtual memory are in an undefined state.

The contents of all registers are undefined when this exception occurs, except for the following register fields:
- The Random register is initialized to the value of its upper bound.
- The Wired register is initialized to 0.
- Some Config register bits are initialized from the boot-time mode stream.
- In the Status register, the SR bit is cleared to 0 and the ERL and BEV bits are set to 1; all other bits are undefined.

See Figure 5.13 for additional information on this process.

Servicing
The Reset exception is serviced by:
- initializing all processor registers, coprocessor registers, caches, and the memory system
- performing diagnostic tests
- bootstrapping the operating system
Soft Reset Exception

Cause
The Soft Reset exception occurs in response to assertion of the Reset* input signal. Execution begins at the Reset vector when the Reset* signal is negated. The Soft Reset exception is not maskable.

Processing
The Reset vector is used for this exception. The Reset vector is located within uncached and unmapped address space. Hence, the cache and TLB need not be initialized in order to process the exception. Regardless of the cause, when this exception occurs the SR bit of the Status register is set, distinguishing this exception from a Reset exception.

Cache and memory states are undefined when a Soft Reset exception occurs, because a Soft Reset can abort cache and bus operations. The primary purpose of the Soft Reset exception is to reinitialize the processor after a fatal error during normal operation. Unlike an NMI, all cache and bus state machines are reset by this exception.

When a Soft Reset exception occurs, all register contents are preserved with the following exceptions:
- the ErrorEPC register, which contains the restart PC
- the ERL, SR, and BEV bits of the Status Register, each of which is set to 1

See Figure 5.15 for additional information on this process.

Servicing
The Soft Reset exception is serviced by saving the current processor state for diagnostic purposes, and reinitializing as for the Reset exception.
Nonmaskable Interrupt (NMI) Exception

Cause
The Nonmaskable Interrupt (NMI) exception occurs in response to the falling edge of the NMI* signal, or an external write to the Int*[6] bit of the Interrupt Register. The NMI is not maskable and occurs regardless of the settings of the EXL, ERL, and IE bits of the Status Register.

Processing
The Reset vector is used for this exception. The Reset vector is located within uncached and unmapped address space. Hence, the cache and TLB need not be initialized in order to process the exception. Regardless of the cause, when this exception occurs the SR bit of the Status register is set, distinguishing this exception from a Reset exception.

Because an NMI can occur in the midst of another exception, it is typically not possible to continue program execution after servicing an NMI. An NMI exception is taken only at instruction boundaries. The state of the caches and memory system is preserved.

When an NMI exception occurs, all register contents are preserved with the following exceptions:
- the ErrorEPC register, which contains the restart PC
- the ERL, SR, and BEV bits of the Status Register, each of which is set to 1

See Figure 5.15 for additional information on this process.

Servicing
The NMI exception is serviced by saving the current processor state for diagnostic purposes, and reinitializing as for the Reset exception.
Address Error Exception

Cause
The Address Error exception occurs when an attempt is made to execute one of the following:
- load or store a doubleword that is not aligned on a doubleword boundary
- load, fetch, or store a word that is not aligned on a word boundary
- load or store a halfword that is not aligned on a halfword boundary
- reference the kernel address space from User or Supervisor mode
- reference the supervisor address space from User mode

This exception is not maskable.

Processing
The common exception vector is used for this exception. The AdEL or AdES code in the Cause register is set, indicating whether the instruction caused the exception with an instruction reference, load operation, or store operation, as shown by the EPC register and the BD bit in the Cause register.

When this exception occurs, the BadVAddr register retains the virtual address that was not properly aligned or that referenced protected address space. The contents of the VPN field of the Context and EntryHi registers are undefined, as are the contents of the EntryLo register.

The EPC register contains the address of the instruction that caused the exception, unless this instruction is in a branch-delay slot. If it is in a branch-delay slot, the EPC register contains the address of the preceding branch instruction and the BD bit of the Cause register is set as an indication.

Servicing
The process executing at the time is handed a segmentation violation signal. This error is usually fatal to the process incurring the exception.
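
The alignment conditions listed under Cause reduce to a single mask test, sketched below in C. Only the alignment checks are modeled; the privilege checks on kernel and supervisor address space are omitted. The function names are hypothetical, and the returned codes 4 (AdEL) and 5 (AdES) are the conventional ExcCode values.

```c
#include <assert.h>
#include <stdint.h>

/* True when vaddr is not aligned to an access of size_bytes (1, 2, 4 or 8). */
int is_misaligned(uint64_t vaddr, unsigned size_bytes)
{
    /* The mask trick only works for power-of-two sizes. */
    assert(size_bytes != 0 && (size_bytes & (size_bytes - 1)) == 0);
    return (vaddr & (uint64_t)(size_bytes - 1)) != 0;
}

/* Return the ExcCode an access would raise on alignment grounds:
 * 4 (AdEL) for a load or fetch, 5 (AdES) for a store, -1 if aligned. */
int address_error_code(uint64_t vaddr, unsigned size_bytes, int is_store)
{
    if (!is_misaligned(vaddr, size_bytes))
        return -1;
    return is_store ? 5 : 4;
}
```

For instance, a word load from an address ending in 0x2 yields code 4, while a halfword store to the same address is properly aligned and raises nothing.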
Three types of TLB exceptions can occur:
- TLB Refill occurs when there is no TLB entry that matches an attempted reference to a mapped address space.
- TLB Invalid occurs when a virtual address reference matches a TLB entry that is marked invalid.
- TLB Modified occurs when a store operation virtual address reference to memory matches a TLB entry which is marked valid but is not dirty (the entry is not writable).

The following three sections describe these TLB exceptions.
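
The three cases can be told apart with a simple decision sequence, shown here as a C sketch. The inputs are an abstract description of a TLB lookup result, not a model of any particular TLB entry format; all names are hypothetical.

```c
#include <assert.h>

typedef enum { TLB_HIT, TLB_REFILL, TLB_INVALID, TLB_MODIFIED } tlb_outcome;

/* Classify a mapped-space reference from the result of the TLB lookup.
 * matched: an entry matched the virtual address
 * valid:   the matching entry is marked valid
 * dirty:   the matching entry is marked dirty (writable)
 * is_store: the reference is a store */
tlb_outcome classify_tlb_access(int matched, int valid, int dirty, int is_store)
{
    if (!matched)
        return TLB_REFILL;
    if (!valid)
        return TLB_INVALID;
    if (is_store && !dirty)
        return TLB_MODIFIED;
    return TLB_HIT;
}

/* Usage example: a store to a valid but clean page raises TLB Modified. */
void tlb_example(void)
{
    assert(classify_tlb_access(1, 1, 0, 1) == TLB_MODIFIED);
    assert(classify_tlb_access(1, 1, 0, 0) == TLB_HIT);
}
```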
TLB Refill Exception

Cause
The TLB refill exception occurs when there is no TLB entry to match a reference to a mapped address space. This exception is not maskable.

Processing
There are two special exception vectors for this exception; one for references to 32-bit address spaces, and one for references to 64-bit address spaces. The UX, SX, and KX bits of the Status register determine whether the user, supervisor or kernel address spaces referenced are 32-bit or 64-bit spaces. All references use these vectors when the EXL bit is set to 0 in the Status register. This exception sets the TLBL or TLBS code in the ExcCode field of the Cause register. This code indicates whether the instruction, as shown by the EPC register and the BD bit in the Cause register, caused the miss by an instruction reference, load operation, or store operation.

When this exception occurs, the BadVAddr, Context, XContext and EntryHi registers hold the virtual address
