PRELIMINARY DATA

SH-4 CPU Core Architecture

Last updated 12 September 2002 2:29
ADCS 7182230F
STMicroelectronics and Hitachi, Ltd.

Issued by the MCDT Documentation Group on behalf of STMicroelectronics

Information furnished is believed to be accurate and reliable. However, STMicroelectronics assumes no responsibility for the consequences of use of such information nor for any infringement of patents or other rights of third parties which may result from its use. No license is granted by implication or otherwise under any patent or patent rights of STMicroelectronics. Specifications mentioned in this publication are subject to change without notice. This publication supersedes and replaces all information previously supplied. STMicroelectronics products are not authorized for use as critical components in life support devices or systems without the express written approval of STMicroelectronics.

Notice: When using this document, keep the following in mind:
1. This document may, wholly or partially, be subject to change without notice.
2. All rights are reserved: No one is permitted to reproduce or duplicate, in any form, the whole or part of this document without Hitachi's permission.
3. Hitachi will not be held responsible for any damage to the user that may result from accidents or any other reasons during operation of the user's unit according to this document.
4. Circuitry and other examples described herein are meant merely to indicate the characteristics and performance of Hitachi's semiconductor products. Hitachi assumes no responsibility for any intellectual property claims or other problems that may result from applications based on the examples described herein.
5. No license is granted by implication or otherwise under any patents or other rights of any third party or Hitachi, Ltd.
6. MEDICAL APPLICATIONS: Hitachi's products are not authorized for use in MEDICAL APPLICATIONS without the written consent of the appropriate officer of Hitachi's sales company. Such use includes, but is not limited to, use in life support systems. Buyers of Hitachi's products are requested to notify the relevant Hitachi sales offices when planning to use the products in MEDICAL APPLICATIONS.

The ST logo is a registered trademark of STMicroelectronics.
SuperH is a registered trademark for products originally developed by Hitachi, Ltd. and is owned by Hitachi Ltd.
© 2000, 2001, 2002 STMicroelectronics and Hitachi, Ltd. All Rights Reserved.

STMicroelectronics Group of Companies
Australia - Brazil - Canada - China - Finland - France - Germany - Hong Kong - India - Israel - Italy - Japan - Malaysia - Malta - Morocco - Singapore - Spain - Sweden - Switzerland - United Kingdom - U.S.A.
http://www.st.com

Contents

Preface
1 Overview
  1.1 SH-4 CPU core features
  1.2 Block diagram
2 Programming model
  2.1 General registers
  2.2 System registers
  2.3 Control registers
  2.4 Floating-point registers
  2.5 Memory-mapped registers
  2.6 Data format in registers
  2.7 Data formats in memory
  2.8 Processor states
    2.8.1 Reset state
    2.8.2 Exception-handling state
    2.8.3 Program execution state
    2.8.4 Power-down state
  2.9 Processor modes
3 Memory management unit (MMU)
  3.1 Overview
  3.2 Role of the MMU
  3.3 Register descriptions
    3.3.1 Page table entry high register (PTEH)
    3.3.2 Page table entry low register (PTEL)
    3.3.3 Translation table base register (TTB)
    3.3.4 TLB exception address register (TEA)
    3.3.5 MMU control register (MMUCR)
  3.4 Address space
    3.4.1 Physical address space
    3.4.2 External memory space
    3.4.3 Virtual address space
    3.4.4 On-chip RAM space
    3.4.5 Address translation
    3.4.6 Single virtual memory mode and multiple virtual memory mode
    3.4.7 Address space identifier (ASID)
  3.5 TLB functions
    3.5.1 Unified TLB (UTLB) configuration
    3.5.2 Instruction TLB (ITLB) configuration
    3.5.3 Address translation method
  3.6 MMU functions
    3.6.1 MMU hardware management
    3.6.2 MMU software management
    3.6.3 MMU instruction (LDTLB)
    3.6.4 Hardware ITLB miss handling
    3.6.5 Avoiding synonym problems
  3.7 Handling MMU exceptions
    3.7.1 ITLBMULTIHIT
    3.7.2 ITLBMISS
    3.7.3 EXECPROT
    3.7.4 OTLBMULTIHIT
    3.7.5 TLBMISS
    3.7.6 READPROT
    3.7.7 FIRSTWRITE
  3.8 Memory-mapped TLB configuration
    3.8.1 ITLB address array
    3.8.2 ITLB data array 1
    3.8.3 UTLB address array
    3.8.4 UTLB data array 1
4 Caches
  4.1 Overview
    4.1.1 Features
  4.2 Register descriptions
    4.2.1 Cache control register (CCR)
    4.2.2 Queue address control register 0 (QACR0)
    4.2.3 Queue address control register 1 (QACR1)
  4.3 Operand cache (OC)
    4.3.1 Configuration
    4.3.2 Read operation
    4.3.3 Write operation
    4.3.4 Write-back buffer
    4.3.5 Write-through buffer
    4.3.6 RAM mode
    4.3.7 OC index mode
    4.3.8 Coherency between cache and external memory
    4.3.9 Prefetch operation
  4.4 Instruction cache (IC)
    4.4.1 Configuration
    4.4.2 Read operation
    4.4.3 IC index mode
  4.5 Memory-mapped cache configuration
    4.5.1 IC address array
    4.5.4 IC data array
    4.5.5 OC address array
    4.5.6 OC data array
  4.6 Store queues
    4.6.1 SQ configuration
    4.6.2 SQ writes
    4.6.3 SQ reads (implementation dependent)
    4.6.4 Transfer to external memory
5 Exceptions
  5.1 Overview
  5.2 Register descriptions
    5.2.1 Exception event register (EXPEVT)
    5.2.2 Interrupt event register (INTEVT)
    5.2.3 TRAPA exception register (TRA)
  5.3 Exception handling functions
    5.3.1 Exception handling flow
    5.3.2 Exception handling vector addresses
  5.4 Exception types and priorities
  5.5 Exception flow
    5.5.1 Exception flow
    5.5.2 Exception source acceptance
    5.5.3 Exception requests and BL bit
    5.5.4 Return from exception handling
  5.6 Description of exceptions
    5.6.1 Resets
    5.6.2 General exceptions
    5.6.3 Interrupts
    5.6.4 Priority order with multiple exceptions
  5.7 Usage notes
6 Floating-point unit
  6.1 Overview
  6.2 Floating-point format
    6.2.1 Non-numbers (NaN)
    6.2.2 Denormalized numbers
  6.3 Rounding
  6.4 Floating-point exceptions
  6.5 Graphics support functions
    6.5.1 Geometric operation instructions
    6.5.2 Pair single-precision data transfer
7 Instruction set
  7.1 Execution environment
  7.2 Addressing modes
  7.3 Instruction set summary
8 Instruction specification
  8.1 Overview
  8.2 Variables and types
    8.2.1 Integer
    8.2.2 Boolean
    8.2.3 Bit-fields
    8.2.4 Arrays
    8.2.5 Floating point values
  8.3 Expressions
    8.3.1 Integer arithmetic operators
    8.3.2 Integer shift operators
    8.3.3 Integer bitwise operators
    8.3.4 Relational operators
    8.3.5 Boolean operators
    8.3.6 Single-value functions
  8.4 Statements
    8.4.1 Undefined behavior
    8.4.2 Assignment
    8.4.3 Conditional
    8.4.4 Repetition
    8.4.5 Exceptions
    8.4.6 Procedures
  8.5 Architectural state
  8.6 Memory model
    8.6.1 Support functions
    8.6.2 Reading memory
    8.6.3 Prefetching memory
    8.6.4 Writing memory
  8.7 Cache model
  8.8 Floating-point model
    8.8.1 Functions to access SR and FPSCR
    8.8.2 Functions to model floating-point behavior
    8.8.3 Floating-point special cases and exceptions
  8.9 Abstract sequential model
    8.9.1 Initial conditions
    8.9.2 Instruction execution loop
    8.9.3 State changes
  8.10 Example instructions
    8.10.1 ADD #imm, Rn
    8.10.2 FADD FRm, FRn
9 Instruction descriptions
  9.1 Alphabetical list of instructions
10 Pipelining
  10.1 Pipelines
  10.2 Parallel executables
  10.3 Execution cycles and pipeline stalling
A Address list
B Instruction prefetch side effects
Index

Preface

This document is part of the SuperH Documentation Suite detailed below. Comments on this or other manuals in the SuperH Documentation Suite should be made by contacting your local STMicroelectronics Limited Sales Office or distributor.

Document identification and control

Each book carries a unique identifier in the form:
ADCS nnnnnnnx
where nnnnnnn is the document number and x is the revision. When commenting on a document, quote the complete identification ADCS nnnnnnnx.

ST40 Micro Toolset Getting Started
ADCS 7379953. This manual provides an introduction to the ST40 Micro Toolset and instructions for getting a simple OS21 application running on an STMicroelectronics MediaRef platform. It also describes how to boot OS21 applications from ROM and how to port applications which use STMicroelectronics' STLite/OS20 operating systems to OS21.

OS21 User's Manual
ADCS 7358306. This manual describes the generic use of OS21 across supported platforms. It describes all the core features of OS21 and their use, and details the OS21 function definitions. It also explains how OS21 differs from STLite/OS20, the API targeted at ST20.

OS21 for ST40 User Manual
ADCS 7358673. This manual describes the use of OS21 on ST40 platforms.
It describes how specific ST40 facilities are exploited by the OS21 API. It also describes the OS21 board support packages for ST40 platforms.

32-Bit RISC Series, SH-4 CPU Core Architecture
ADCS 7182230. This manual describes the architecture and instruction set of the SH4-1xx (previously known as ST40-C200) core as used by STMicroelectronics.

32-Bit RISC Series, SH-4, ST40 System Architecture
This manual describes the ST40 family system architecture. It is split into four volumes:
ST40 System Architecture - Volume 1 System - ADCS 7153464.
ST40 System Architecture - Volume 2 Bus Interfaces - ADCS 7171720.
ST40 System Architecture - Volume 3 Video Devices - ADCS 7225754.
ST40 System Architecture - Volume 4 I/O Devices - ADCS 7225754.

Conventions used in this guide

General notation
The notation in this document uses the following conventions:
· Sample code, keyboard input and file names,
· Variables and code variables,
· Equations and math,
· Screens, windows and dialog boxes,
· Instructions.

Hardware notation
The following conventions are used for hardware notation:
· REGISTER NAMES and FIELD NAMES,
· PIN NAMES and SIGNAL NAMES.

Software notation
Syntax definitions are presented in a modified Backus-Naur Form (BNF). Briefly:
1 Terminal strings of the language, that is those not built up by rules of the language, are printed in teletype font. For example, void.
2 Nonterminal strings of the language, that is those built up by rules of the language, are printed in italic teletype font. For example, name.
3 If a nonterminal string of the language starts with a nonitalicized part, it is equivalent to the same nonterminal string without that nonitalicized part. For example, vspace-name.
4 Each phrase definition is built up using a double colon and an equals sign to separate the two sides.
5 Alternatives are separated by vertical bars (`|').
6 Optional sequences are enclosed in square brackets (`[' and `]').
7 Items which may be repeated appear in braces (`{' and `}').

1 Overview

1.1 SH-4 CPU core features (note 1)

This manual describes the architecture of the SH-4 CPU core. The core is a highly encapsulated design component that can be integrated into any product. You will therefore find no references to clock speeds, system facilities, pin-outs or similar data in this manual. For this information, refer to the Datasheet and/or System Architecture Manual of the appropriate product.

The SH-4 is a 32-bit RISC (reduced instruction set computer) microprocessor, featuring object code upward-compatibility with Hitachi SuperH SH-1, SH-2, SH-3, and SH-3E microcomputers. It includes an instruction cache, an operand cache that can be switched between copy-back and write-through modes, a 4-entry fully associative instruction TLB (translation lookaside buffer), and an MMU (memory management unit) with a 64-entry fully associative shared TLB.

The SH-4's 16-bit fixed-length instruction set enables program code size to be reduced by almost 50% compared with 32-bit instructions.

The SH-4 200 series includes an enhanced mode which enables 2-way set associative instruction and operand caches (rather than direct mapped, as for the SH-4 100 series and for the SH-4 200 series when running in default compatibility mode). In particular, the SH4-202 has a 32 Kbyte 2-way operand cache and a 16 Kbyte 2-way instruction cache. On power up this behaves as a 16 Kbyte direct mapped operand cache and an 8 Kbyte direct mapped instruction cache.

Note 1. Naming conventions:
SH-4: for non-variant specific information
SH-4 100/200 series: for series specific features
SH4-103/202: for variant specific features

The features of the SH-4 CPU core are summarized as follows:

CPU
· Original Hitachi SH architecture
· 32-bit internal data bus
· General register file:
  - Sixteen 32-bit general registers (and eight 32-bit shadow registers)
  - Seven 32-bit control registers
  - Four 32-bit system registers
· RISC-type instruction set (upward-compatible with SH Series)
  - Fixed 16-bit instruction length for improved code efficiency
  - Load-store architecture
  - Delayed branch instructions
  - Conditional execution
· Superscalar architecture: Parallel execution of two instructions
· Instruction execution time: Maximum 2 instructions/cycle
· Virtual address space: 4 Gbytes (448-Mbyte external memory space)
· Space identifier ASIDs: 8 bits, 256 virtual address spaces
· On-chip multiplier
· Five-stage pipeline

FPU
· On-chip floating-point coprocessor
· Supports single-precision (32 bits) and double-precision (64 bits)
· Supports IEEE754-compliant data types and exceptions
· Two rounding modes: Round to Nearest and Round to Zero
· Handling of denormalized numbers: Truncation to zero or interrupt generation for compliance with IEEE754
· Floating-point registers:
  - 2 banks of sixteen 32-bit single precision registers or,
  - 2 banks of eight 64-bit double precision registers or,
  - 2 banks of four 128-bit vector registers (each vector is 4 single precision elements)
· 32-bit CPU-FPU floating-point communication register (FPUL)
· Supports FMAC (multiply-and-accumulate) instruction
· Supports FDIV (divide) and FSQRT (square root) instructions
· Supports FLDI0/FLDI1 (load constant 0/1) instructions
· Instruction execution times
  - Latency (FMAC/FADD/FSUB/FMUL): 3 cycles (single-precision), 8 cycles (double-precision)
  - Pitch (FMAC/FADD/FSUB/FMUL): 1 cycle (single-precision), 6 cycles (double-precision)
  - Note: FMAC is supported for single-precision only.
· 3-D graphics instructions (single-precision only):
  - 4-dimensional vector conversion and matrix operations (FTRV): 4 cycles (pitch), 7 cycles (latency)
  - 4-dimensional vector (FIPR) inner product: 1 cycle (pitch), 4 cycles (latency)
· Five-stage pipeline

Power-down
· Power-down modes
  - Sleep mode
  - Standby mode
  - Module standby function

MMU
· 4-Gbyte address space, 256 address space identifiers (8-bit ASIDs)
· Single virtual memory mode and multiple virtual memory mode
· Supports multiple page sizes: 1 kbyte, 4 kbytes, 64 kbytes, 1 Mbyte
· 4-entry fully-associative TLB for instructions
· 64-entry fully-associative TLB for instructions and operands
· Supports software-controlled replacement and random-counter replacement algorithm
· TLB contents can be accessed directly by address mapping
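The four page sizes in the MMU feature list imply four different splits of a 32-bit virtual address into a page number and an offset within the page. The following is a minimal sketch of that arithmetic only. The names PAGE_SIZES and split_virtual_address are illustrative and do not come from this manual, and the sketch says nothing about how the TLB stores or compares page numbers (see Chapter 3 for that).

```python
# Page sizes from the MMU feature list, in bytes.
# PAGE_SIZES and split_virtual_address are illustrative names, not part of the manual.
PAGE_SIZES = {
    "1 kbyte": 1 << 10,
    "4 kbytes": 1 << 12,
    "64 kbytes": 1 << 16,
    "1 Mbyte": 1 << 20,
}


def split_virtual_address(vaddr, page_size):
    """Split a 32-bit address into (page number, offset within the page)."""
    if page_size not in PAGE_SIZES.values():
        raise ValueError("page size must be 1 kbyte, 4 kbytes, 64 kbytes or 1 Mbyte")
    if not 0 <= vaddr <= 0xFFFFFFFF:
        raise ValueError("address must fit in 32 bits")
    offset_bits = page_size.bit_length() - 1  # log2 of a power of two
    return vaddr >> offset_bits, vaddr & (page_size - 1)
```

With 4-kbyte pages, for instance, split_virtual_address(0x12345678, 1 << 12) separates the address into page number 0x12345 and offset 0x678; with 1-Mbyte pages the same address gives page number 0x123 and offset 0x45678.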
Cache memory

SH4-103
· Instruction cache (IC)
  - 8 kbytes, direct mapping
  - 256 entries, 32-byte block length
  - Normal mode (8-kbyte cache)
  - Index mode
· Operand cache (OC)
  - 16 kbytes, direct mapping
  - 512 entries, 32-byte block length
  - Normal mode (16-kbyte cache)
  - Index mode
  - RAM mode (8-kbyte cache + 8-kbyte RAM)
  - Choice of write method (copy-back or write-through)
· Single-stage copy-back buffer, single-stage write-through buffer
· Cache memory contents can be accessed directly by address mapping (usable as on-chip memory)
· Store queue (32 bytes x 2 entries)

SH4-202
· Instruction cache (IC):
  - 16 Kbyte, 2-way set associative
  - 512 entries, 32-byte block length
  - Compatibility mode (8 Kbyte direct mapped)
  - Index mode (a)
· Operand cache (OC)
  - 32 Kbyte, 2-way set associative
  - 1024 entries, 32-byte block length
  - Compatibility mode (16 Kbyte direct mapped)
  - Index mode
  - RAM mode (16 Kbyte cache + 16 Kbyte RAM)
· Single-stage copy-back buffer, single-stage write-through buffer
· Cache memory contents can be accessed directly by address mapping (usable as on-chip memory)
· Store queue (32 bytes x 2 entries)

a. Index mode (IC and OC) is only supported when in SH4-1xx compatibility mode.

1.2 Block diagram

Figure 1 shows an internal block diagram of the SH-4 32-bit CPU core.

[Figure 1: SH-4 32-bit CPU core. Blocks: CPU, FPU, I cache, O cache, ITLB, UTLB, CCN. Paths: address (instruction), data (instruction), address (data), data (load), data (store), lower data.]

CCN: Cache and TLB controller
FPU: Floating point unit
ITLB: Instruction translation lookaside buffer
UTLB: Unified translation lookaside buffer
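The cache figures in the feature summary can be cross-checked with simple arithmetic: the number of entries multiplied by the block length gives the capacity, and dividing the entries by the number of ways gives the number of sets. The sketch below does that check. The helper name cache_geometry is hypothetical, and treating "entries" as the total number of cache blocks is an assumption that happens to reproduce the quoted capacities; Chapter 4 is the authoritative description of the cache organisation.

```python
def _is_power_of_two(value):
    return value > 0 and value & (value - 1) == 0


def cache_geometry(entries, block_bytes, ways=1):
    """Derive capacity, set count and address field widths for a cache.

    Hypothetical helper: 'entries' is assumed to be the total number of
    blocks across all ways, which matches the figures in the feature list.
    """
    if entries % ways != 0:
        raise ValueError("entries must divide evenly across ways")
    sets = entries // ways
    if not (_is_power_of_two(block_bytes) and _is_power_of_two(sets)):
        raise ValueError("block size and set count must be powers of two")
    return {
        "capacity_bytes": entries * block_bytes,
        "sets": sets,
        "offset_bits": block_bytes.bit_length() - 1,  # byte within a block
        "index_bits": sets.bit_length() - 1,          # set selection
    }
```

Calling cache_geometry(256, 32) reproduces the 8-kbyte SH4-103 instruction cache, and cache_geometry(1024, 32, ways=2) reproduces the 32 Kbyte SH4-202 operand cache with 512 sets.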
2 Programming model

The SH-4 CPU core has two processor modes, user mode and privileged mode. The SH-4 normally operates in user mode, and switches to privileged mode when an exception occurs or an interrupt is accepted.

There are four kinds of registers:
· general registers
  There are 16 general registers, R0 to R15. General registers R0 to R7 are banked registers which are switched by a processor mode change.
· system registers
  Access to these registers does not depend on the processor mode.
· control registers
· floating-point registers
  There are thirty-two floating-point registers, FR0-FR15 and XF0-XF15. FR0-FR15 and XF0-XF15 can be assigned to either of two banks (FPR0_BANK0-FPR15_BANK0 or FPR0_BANK1-FPR15_BANK1).

The registers that can be accessed differ in the two processor modes.

Register values after a reset are shown in Table 1.

Type                      Registers                    Initial value (a)
General registers         R0_BANK0-R7_BANK0,           Undefined
                          R0_BANK1-R7_BANK1, R8-R15
Control registers         SR                           MD bit = 1, RB bit = 1, BL bit = 1, FD bit = 0, I3-I0 = 1111 (0xF), reserved bits = 0, others undefined
                          GBR, SSR, SPC, SGR, DBR      Undefined
                          VBR                          0x00000000
System registers          MACH, MACL, PR, FPUL         Undefined
                          PC                           0xA0000000
                          FPSCR                        0x00040001
Floating-point registers  FR0-FR15, XF0-XF15           Undefined

Table 1: Initial register values
a. Initialized by a power-on reset and manual reset

2.1 General registers

Figure 2 shows the relationship between the processor modes and the general registers. The SH-4 CPU core has twenty-four 32-bit general registers (R0_BANK0-R7_BANK0, R0_BANK1-R7_BANK1, and R8-R15). However, only 16 of these can be accessed as general registers, R0-R15, in either processor mode. The assignment of R0-R7, in both modes, is shown below.
· R0_BANK0-R7_BANK0
  In user mode (SR.MD = 0), R0-R7 are always assigned to R0_BANK0-R7_BANK0.
  In privileged mode (SR.MD = 1), R0-R7 are assigned to R0_BANK0-R7_BANK0 only when SR.RB = 0.
· R0_BANK1-R7_BANK1
  In user mode, R0_BANK1-R7_BANK1 cannot be accessed. In privileged mode, R0-R7 are assigned to R0_BANK1-R7_BANK1 only when SR.RB = 1.

[Figure 2: General registers. When SR.MD = 0, or SR.MD = 1 and SR.RB = 0, R0-R7 map to R0_BANK0-R7_BANK0. When SR.MD = 1 and SR.RB = 1, R0-R7 map to R0_BANK1-R7_BANK1. R8-R15 are the same registers in both cases.]

Programming Note: As the user's R0-R7 are assigned to R0_BANK0-R7_BANK0, and after an exception or interrupt R0-R7 are assigned to R0_BANK1-R7_BANK1, it is not necessary for the interrupt handler to save and restore the user's R0-R7 (R0_BANK0-R7_BANK0).

After a reset, the values of R0_BANK0-R7_BANK0, R0_BANK1-R7_BANK1, and R8-R15 are undefined.

2.2 System registers

MACH
  Size: 32; Initial value: Undefined; Synopsis: Multiply-and-accumulate register high
  Operation: MACH is used for the added value in a MAC instruction, and to store a MAC instruction or MUL instruction operation result.
MACL
  Size: 32; Initial value: Undefined; Synopsis: Multiply-and-accumulate register low
  Operation: MACL is used for the added value in a MAC instruction, and to store a MAC instruction or MUL instruction operation result.
PR
  Size: 32; Initial value: Undefined; Synopsis: Procedure register
  Operation: The return address is stored in PR when a subroutine call is made using a BSR, BSRF or JSR instruction.
  PR is referenced by the subroutine return instruction (RTS).
PC
  Size: 32; Initial value: 0xA000 0000; Synopsis: Program counter
  Operation: PC indicates the executing instruction address.
FPSCR
  Size: 32; Initial value: 0x0004 0001; Synopsis: Floating-point status/control register
  Operation: Refer to Table 3: FPSCR register description.
FPUL
  Size: 32; Initial value: Undefined; Synopsis: Floating-point communication register
  Operation: Data transfer between FPU registers and CPU registers is carried out via the FPUL register. The FPUL register is a system register, and is accessed from the CPU side by means of LDS and STS instructions. For example, to convert the integer stored in general register R1 to a single-precision floating-point number, the processing flow is as follows:
  R1 -> (LDS instruction) -> FPUL -> (single-precision FLOAT instruction) -> FR1

Table 2: System registers

FPSCR

RM
  Bits: [0,1]; Size: 2; Synopsis: Rounding mode.
  Type: RW; Power-on reset: 1
  Operation: RM = 00: Round to Nearest. RM = 01: Round to Zero. RM = 10: Reserved. RM = 11: Reserved. For details see Section 6.3: Rounding.
Flag inexact
  Bits: 2; Size: 1; Synopsis: FPU inexact exception flag.
  Type: RW; Power-on reset: 0
  Operation: Set to 1 if an inexact exception occurs.
Flag underflow
  Bits: 3; Size: 1; Synopsis: FPU underflow exception flag.
  Type: RW; Power-on reset: 0
  Operation: Set to 1 if an underflow exception occurs.
Flag overflow
  Bits: 4; Size: 1; Synopsis: FPU overflow exception flag.
  Type: RW; Power-on reset: 0
  Operation: Set to 1 if an overflow exception occurs.
Flag division by zero
  Bits: 5; Size: 1; Synopsis: FPU division by zero exception flag.
  Type: RW; Power-on reset: 0
  Operation: Set to 1 if a division by zero exception occurs.
Flag invalid operation
  Bits: 6; Size: 1; Synopsis: FPU invalid operation exception flag.
  Type: RW; Power-on reset: 0
  Operation: Set to 1 if an invalid operation exception occurs.
Enable inexact
  Bits: 7; Size: 1; Synopsis: FPU inexact exception enable field.
  Type: RW; Power-on reset: 0
  Operation: Set to 1 to cause a trap when an inexact exception occurs.
Enable underflow
  Bits: 8; Size: 1; Synopsis: FPU underflow exception enable field.
  Type: RW; Power-on reset: 0
  Operation: Set to 1 to cause a trap when an underflow exception occurs.
Enable overflow
  Bits: 9; Size: 1; Synopsis: FPU overflow exception enable field.
  Type: RW; Power-on reset: 0
  Operation: Set to 1 to cause a trap when an overflow exception occurs.
Enable division by zero
  Bits: 10; Size: 1; Synopsis: FPU division by zero exception enable field.
  Type: RW; Power-on reset: 0
  Operation: Set to 1 to cause a trap when a division by zero exception occurs.
Enable invalid
  Bits: 11; Size: 1; Synopsis: FPU invalid exception enable field.
  Type: RW; Power-on reset: 0
  Operation: Set to 1 to cause a trap when an invalid exception occurs.
Cause inexact
  Bits: 12; Size: 1; Synopsis: FPU inexact exception cause field.
  Type: RW; Power-on reset: 0
  Operation: Set to 0 before an FPU instruction is executed. Set to 1 if an inexact exception occurs.
Cause underflow
  Bits: 13; Size: 1; Synopsis: FPU underflow exception cause field.
  Type: RW; Power-on reset: 0
  Operation: Set to 0 before an FPU instruction is executed. Set to 1 if an underflow exception occurs.
Cause overflow
  Bits: 14; Size: 1; Synopsis: FPU overflow exception cause field.
  Type: RW; Power-on reset: 0
  Operation: Set to 0 before an FPU instruction is executed. Set to 1 if an overflow exception occurs.
Cause division by zero
  Bits: 15; Size: 1; Synopsis: FPU division by zero exception cause field.
  Type: RW; Power-on reset: 0
  Operation: Set to 0 before an FPU instruction is executed. Set to 1 if a division by zero exception occurs.
Cause invalid
  Bits: 16; Size: 1; Synopsis: FPU invalid exception cause field.
  Type: RW; Power-on reset: 0
  Operation: Set to 0 before an FPU instruction is executed. Set to 1 if an invalid exception occurs.
Cause FPU error
  Bits: 17; Size: 1; Synopsis: FPU error exception cause field.
  Type: RW; Power-on reset: 0
  Operation: Set to 0 before an FPU instruction is executed. Set to 1 if an FPU error exception occurs.
DN
  Bits: 18; Size: 1; Synopsis: Denormalization mode.
  Type: RW; Power-on reset: 0 (the FPSCR reset value 0x00040001 given in Table 1 has bit 18 set to 1)
  Operation: DN = 0: A denormalized number is treated as such. DN = 1: A denormalized number is treated as zero.
PR
  Bits: 19; Size: 1; Synopsis: Precision mode.
  Type: RW; Power-on reset: 1 (the FPSCR reset value 0x00040001 given in Table 1 has bit 19 cleared to 0)
  Operation: PR = 0: Floating point instructions are executed as single-precision operations. PR = 1: Floating point instructions are executed as double-precision operations (the result of instructions for which double-precision is not supported is undefined). Mode setting [SZ = 1, PR = 1] is reserved. FPU operation results are undefined in this mode.
SZ
  Bits: 20; Size: 1; Synopsis: Transfer size mode.
  Type: RW; Power-on reset: 0
  Operation: SZ = 0: The data size of the FMOV instruction is 32 bits. SZ = 1: The data size of the FMOV instruction is a 32-bit register pair (64 bits).
  Programming note: When SZ = 1 and big endian mode is selected, FMOV can be used for double-precision floating-point data load or store operations. In little endian mode, two 32-bit data size moves must be executed, with SZ = 0, to load or store a double-precision floating-point number.
FR
  Bits: 21; Size: 1; Synopsis: Floating-point register bank.
  Type: RW; Power-on reset: 0
  Operation: FR = 0: FPR0_BANK0-FPR15_BANK0 are assigned to FR0-FR15; FPR0_BANK1-FPR15_BANK1 are assigned to XF0-XF15. FR = 1: FPR0_BANK1-FPR15_BANK1 are assigned to FR0-FR15; FPR0_BANK0-FPR15_BANK0 are assigned to XF0-XF15.
RES
  Bits: [22,31]; Size: 10; Synopsis: Bits reserved
  Type: RW; Power-on reset: Undefined

Table 3: FPSCR register description

2.3 Control registers

SR
  Size: 32; Initial value: See Table 5 for individual bits.
  Privilege protection: Yes; Synopsis: Status register
  Operation: Refer to Table 5: SR register description.
SSR
  Size: 32; Initial value: Undefined
  Privilege protection: Yes; Synopsis: Saved status register
  Operation: The current contents of SR are saved to SSR in the event of an exception or interrupt.
SPC
  Size: 32; Initial value: Undefined
  Privilege protection: Yes; Synopsis: Saved program counter
  Operation: The address of an instruction at which an interrupt or exception occurs is saved to SPC.
GBR
  Size: 32; Initial value: Undefined
  Privilege protection: No; Synopsis: Global base register
  Operation: GBR is referenced as the base address in a GBR-referencing MOV instruction.
VBR
  Size: 32; Initial value: 0x0000 0000
  Privilege protection: Yes; Synopsis: Vector base register
  Operation: VBR is referenced as the branch destination base address in the event of an exception or interrupt. For details, see Chapter 5: Exceptions.
SGR
  Size: 32; Initial value: Undefined
  Privilege protection: Yes; Synopsis: Saved general register
  Operation: The contents of R15 are saved to SGR in the event of an exception or interrupt.
DBR
  Size: 32; Initial value: Undefined
  Privilege protection: Yes; Synopsis: Debug base register
  Operation: When the user break debug function is enabled (BRCR.UBDE = 1), DBR is referenced as the user break handler branch destination address instead of VBR.

Table 4: Control registers

SR

T
  Bits: 0; Size: 1; Synopsis: True/False condition or carry/borrow bit.
  Type: RW; Power-on reset: Undefined
  Operation: Refer to individual instruction descriptions, which affect the T bit.
S
  Bits: 1; Size: 1; Synopsis: Specifies a saturation operation for a MAC instruction.
  Type: RW; Power-on reset: Undefined
  Operation: Refer to individual instruction descriptions, which affect the S bit.
IMASK
  Bits: [4,7]; Size: 4; Synopsis: Interrupt mask level.
  Type: RW; Power-on reset: 1 (Table 1 gives I3-I0 = 1111)
  Operation: External interrupts of a lower level than IMASK are masked.
Q
  Bits: 8; Size: 1; Synopsis: State for divide step.
  Type: RW; Power-on reset: Undefined
  Operation: Used by the DIV0S, DIV0U and DIV1 instructions.
M
  Bits: 9; Size: 1; Synopsis: State for divide step.
  Type: RW; Power-on reset: Undefined
  Operation: Used by the DIV0S, DIV0U and DIV1 instructions.
FD
  Bits: 15; Size: 1; Synopsis: FPU disable bit (cleared to 0 by a reset).
  Type: RW; Power-on reset: 0
  Operation: FD = 1: An FPU instruction causes a general FPU disable exception, and if the FPU instruction is in a delay slot, a slot FPU disable exception is generated. For further details see the FPUDIS description in Section 6.4: Floating-point exceptions.
BL
  Bits: 28; Size: 1; Synopsis: Exception/interrupt block bit (set to 1 by a reset, exception, or interrupt).
  Type: RW; Power-on reset: 1
  Operation: BL = 1: Interrupt requests are masked. If a general exception, other than a user break, occurs while BL = 1, the processor switches to the reset state.
RB
  Bits: 29; Size: 1; Synopsis: General register bank specifier in privileged mode (set to 1 by a reset, exception or interrupt).
  Type: RW; Power-on reset: 1
  Operation: RB = 0: R0_BANK0-R7_BANK0 are accessed as general registers R0-R7. (R0_BANK1-R7_BANK1 can be accessed using LDC/STC R0_BANK-R7_BANK instructions.) RB = 1: R0_BANK1-R7_BANK1 are accessed as general registers R0-R7. (R0_BANK0-R7_BANK0 can be accessed using LDC/STC R0_BANK-R7_BANK instructions.)
MD
  Bits: 30; Size: 1; Synopsis: Processor mode.
  Type: RW; Power-on reset: 1
  Operation: MD = 0: User mode (some instructions cannot be executed, and some resources cannot be accessed). MD = 1: Privileged mode.
RES
  Bits: [2,3], [10,14], [16,27], 31; Size: 20; Synopsis: Bits reserved
  Type: RW; Power-on reset: Undefined

Table 5: SR register description

2.4 Floating-point registers

Figure 3 shows the floating-point registers. There are thirty-two 32-bit floating-point registers, divided into two banks (FPR0_BANK0-FPR15_BANK0 and FPR0_BANK1-FPR15_BANK1). These 32 registers are referenced as FR0-FR15, DR0/2/4/6/8/10/12/14, FV0/4/8/12, XF0-XF15, XD0/2/4/6/8/10/12/14, or XMTRX.
The correspondence between FPRn_BANKi and the reference name is determined by the FR bit in FPSCR.

· Floating-point registers, FPRn_BANKi (32 registers)
· Single-precision floating-point registers, FRi (16 registers)
  FPSCR.FR = 0: FR0-FR15 are assigned to FPR0_BANK0-FPR15_BANK0.
  FPSCR.FR = 1: FR0-FR15 are assigned to FPR0_BANK1-FPR15_BANK1.
· Double-precision floating-point registers or single-precision floating-point register pairs, DRi (8 registers): A DR register comprises two FR registers.
  DR0 = {FR0, FR1}, DR2 = {FR2, FR3}, DR4 = {FR4, FR5}, DR6 = {FR6, FR7}, DR8 = {FR8, FR9}, DR10 = {FR10, FR11}, DR12 = {FR12, FR13}, DR14 = {FR14, FR15}
· Single-precision floating-point vector registers, FVi (4 registers): An FV register comprises four FR registers.
  FV0 = {FR0, FR1, FR2, FR3}, FV4 = {FR4, FR5, FR6, FR7}, FV8 = {FR8, FR9, FR10, FR11}, FV12 = {FR12, FR13, FR14, FR15}
· Single-precision floating-point extended registers, XFi (16 registers)
  FPSCR.FR = 0: XF0-XF15 are assigned to FPR0_BANK1-FPR15_BANK1.
  FPSCR.FR = 1: XF0-XF15 are assigned to FPR0_BANK0-FPR15_BANK0.
· Single-precision floating-point extended register pairs, XDi (8 registers): An XD register comprises two XF registers.
  XD0 = {XF0, XF1}, XD2 = {XF2, XF3}, XD4 = {XF4, XF5}, XD6 = {XF6, XF7}, XD8 = {XF8, XF9}, XD10 = {XF10, XF11}, XD12 = {XF12, XF13}, XD14 = {XF14, XF15}
· Single-precision floating-point extended register matrix, XMTRX: XMTRX comprises all 16 XF registers.
  XMTRX = | XF0  XF4  XF8   XF12 |
          | XF1  XF5  XF9   XF13 |
          | XF2  XF6  XF10  XF14 |
          | XF3  XF7  XF11  XF15 |

Figure 3: Floating-point registers (shows, for FPSCR.FR = 0 and FPSCR.FR = 1, how FPR0_BANK0-FPR15_BANK0 and FPR0_BANK1-FPR15_BANK1 are referenced as FR0-FR15, DR0-DR14 and FV0-FV12, or as XF0-XF15, XD0-XD14 and XMTRX)

Programming Note: After a reset, the values of FPR0_BANK0-FPR15_BANK0 and FPR0_BANK1-FPR15_BANK1 are undefined.

2.5 Memory-mapped registers

Appendix A summarizes how the control registers are mapped into the address space. The control registers are double-mapped to the following two memory areas. All registers have two addresses.

  0x1F00 0000-0x1FFF FFFF
  0xFF00 0000-0xFFFF FFFF

These two areas are used as follows.

· 0x1F00 0000-0x1FFF FFFF
  This area must be accessed in address translation mode using the TLB. Since external memory area is defined as a 29-bit address space in the SH-4 CPU core architecture, the TLB's physical page numbers do not cover a 32-bit address space.
  In address translation, the page numbers of this area can be set in the corresponding field of the TLB by accessing a memory-mapped register. The page numbers of this area should be used as the actual page numbers set in the TLB. When address translation is not performed, the operation of accesses to this area is undefined.
· 0xFF00 0000-0xFFFF FFFF
  Access to area 0xFF00 0000-0xFFFF FFFF in user mode will cause an address error. Memory-mapped registers can be referenced in user mode by means of access that involves address translation.

Note: Do not access undefined locations in either area. The operation of an access to an undefined location is undefined. Memory-mapped registers must be accessed using a load/store instruction of an equal size to that of the register. The operation of an access using an invalid data size is undefined.

2.6 Data format in registers

Register operands are always longwords (32 bits). When a memory operand is only a byte (8 bits) or a word (16 bits), it is sign-extended into a longword when loaded into a register.

2.7 Data formats in memory

Memory can be accessed in 8-bit byte, 16-bit word, or 32-bit longword form. A memory operand less than 32 bits in length is sign-extended before being loaded into a register.

A word operand must be accessed starting from a word boundary (even address of a 2-byte unit: address 2n), and a longword operand starting from a longword boundary (even address of a 4-byte unit: address 4n). An address error will result if this rule is not observed. A byte operand can be accessed from any address.

Big endian or little endian byte order can be selected for the data format. This endian selection cannot be changed dynamically and is selected by the system during power-on reset. Refer to the system architecture manual of the relevant product for details of how to perform endian selection.
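The boundary and sign-extension rules of sections 2.6 and 2.7 can be sketched in C as follows. This is only an illustration: the function names are hypothetical and do not come from this manual, and the code models the rules in software rather than describing how the hardware implements them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if `addr` satisfies the boundary rule for an
 * operand of `size` bytes (1 = byte, 2 = word, 4 = longword).
 * A word must start at address 2n and a longword at address 4n. */
static bool operand_is_aligned(uint32_t addr, unsigned size)
{
    return (addr & (size - 1u)) == 0u;
}

/* Hypothetical helper: sign-extend a byte operand to a longword,
 * as happens when a byte is loaded into a register. */
static int32_t sign_extend_byte(uint8_t value)
{
    return (int32_t)(value ^ 0x80u) - 0x80;
}

/* Hypothetical helper: sign-extend a word operand to a longword. */
static int32_t sign_extend_word(uint16_t value)
{
    return (int32_t)(value ^ 0x8000u) - 0x8000;
}
```

An access for which operand_is_aligned() returns false corresponds to the case in which the text above says an address error results.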
Bit positions are numbered left to right from most-significant to least-significant. Thus, in a 32-bit longword, the left-most bit, bit 31, is the most significant bit and the right-most bit, bit 0, is the least significant bit.

The data format in memory is shown in Figure 4.

Figure 4: Data formats in memory (shows, for big endian and little endian order, the layout of bytes 0 to 3 at address A, words 0 and 1 at address A + 4, and a longword at address A + 8)

Note: The SH-4 CPU core does not support endian conversion for the 64-bit data format. Therefore, if double-precision floating-point format (64-bit) access is performed in little endian mode, the upper and lower 32 bits will be reversed.

2.8 Processor states

The SH-4 CPU core has four processor states. Transitions between the states are shown in Figure 5.

2.8.1 Reset state

In this state the CPU is reset. The CPU can be placed in one of two reset states, either power-on reset or manual reset. Which of these is selected is determined by the system architecture. Refer to the relevant system architecture manual for details. For more information on resets, see Chapter 5: Exceptions.

The purpose of having two reset modes is to allow some flexibility over which system components are reset. Typically:

· power-on reset will cause all system components to be reset,
· manual reset may, for example, avoid resetting DRAM controllers so that memory contents are preserved.

2.8.2 Exception-handling state

This is a transient state during which the CPU's processor state flow is altered by a reset, general exception, or interrupt exception source.
In the case of a reset, the CPU branches to address 0xA000 0000 and starts executing the user-coded exception handling program.

In the case of a general exception or interrupt, the program counter (PC) contents are saved in the saved program counter (SPC), the status register (SR) contents are saved in the saved status register (SSR), and the R15 contents are saved in the saved general register (SGR). The CPU branches to the start address of the user-coded exception service routine, found from the sum of the contents of the vector base address and the vector offset. See Chapter 5: Exceptions, for more information on resets, general exceptions, and interrupts.

2.8.3 Program execution state

In this state the CPU executes program instructions in sequence.

2.8.4 Power-down state

The power-down state is entered by executing a SLEEP instruction. In this state the CPU stops executing instructions and signals to the system that the CPU has been put to sleep. The system response to receiving this signal is described in the System Architecture Manual of the appropriate product. The CPU is restarted by raising an interrupt.

Figure 5: Processor state transitions (states shown: reset state, comprising power-on reset state and manual reset state; exception-handling state; program execution state; power-down state, comprising sleep mode, entered by a SLEEP instruction with the STBY bit cleared, and standby mode, entered by a SLEEP instruction with the STBY bit set)

Note: For conditions determining state transitions, see the System Architecture Manual.

2.9 Processor modes

There are two processor modes: user mode and privileged mode. The processor mode is determined by the processor mode bit (MD) in the status register (SR).
User mode is selected when the MD bit is cleared to 0, and privileged mode when the MD bit is set to 1. When the reset state or exception-handling state is entered, the MD bit is set to 1. When exception handling ends, the MD bit returns to the value held before the exception occurred.

3 Memory management unit (MMU)

3.1 Overview

The SH-4 CPU core manages a 29-bit external memory space by providing 8-bit address space identifiers, and a 32-bit logical (virtual) address space. Address translation from virtual address to physical address is performed using the memory management unit (MMU), built into the SH-4 CPU core. The MMU performs high-speed address translation by caching user-created address translation table information in an address translation buffer (translation lookaside buffer: TLB). The SH-4 has four instruction TLB (ITLB) entries and 64 unified TLB (UTLB) entries. UTLB copies are stored in the ITLB by hardware. It is possible to set the virtual address space access right, and implement storage protection independently, for privileged mode and user mode.

3.2 Role of the MMU

The main purpose of an MMU is to ensure that efficient use is made of physical memory, which in most systems is a limiting resource. The MMU is normally managed by the OS, which allocates physical pages of memory to virtual pages of memory, as required by a task. Pages which are switched out by the OS are placed in a secondary storage device, such as a hard disk.

A page refers to a contiguous range of addresses, which can all be translated by a single translation table entry. On SH-4 there is support for 4 page sizes: 1-kbyte, 4-kbyte, 64-kbyte and 1-Mbyte.

Memory protection functions are provided to prevent physical memory from inadvertently being accessed and reset by a process.
Although the functions of the MMU could be implemented by software alone, having address translation performed by software each time a process accessed physical memory would be very inefficient. For this reason, a buffer for address translation (TLB) is provided in hardware, and frequently used address translation information is placed here. The TLB can be described as a cache for address translation information. However, unlike a cache, if address translation fails (that is, if an exception occurs), switching of the address translation information is normally performed by software. Thus memory management can be performed in a flexible manner by software.

3.3 Register descriptions

There are six MMU-related registers.

Name                               Abbreviation  R/W  Initial value (a)  P4 address (b)  Area 7 address (b)  Access size
Page table entry high register     PTEH          R/W  Undefined          0xFF00 0000     0x1F00 0000         32
Page table entry low register      PTEL          R/W  Undefined          0xFF00 0004     0x1F00 0004         32
Translation table base register    TTB           R/W  Undefined          0xFF00 0008     0x1F00 0008         32
TLB exception address register     TEA           R/W  Undefined          0xFF00 000C     0x1F00 000C         32
MMU control register               MMUCR         R/W  0x0000 0000        0xFF00 0010     0x1F00 0010         32

Table 6: MMU registers

a. The initial value is the value after a power-on reset or manual reset.
b. This is the address when using the virtual/physical address space P4 region. When making an access from physical address space Area 7 using the TLB, the upper 3 bits of the address are ignored.

Note: Behavior is undefined if an area designated as a reserved area in this manual is accessed.

3.3.1 Page table entry high register (PTEH)

Longword access to PTEH can be performed from 0xFF00 0000 in the P4 region, and 0x1F00 0000 in Area 7.
When an MMU exception or address error exception occurs, the VPN of the virtual address at which the exception occurred is set in the VPN field by hardware. VPN varies according to the page size, but the VPN set by hardware when an exception occurs always consists of the upper 22 bits of the virtual address which caused the exception. VPN setting can also be carried out by software. The number of the currently executing process is set in the ASID field by software. ASID is not updated by hardware. VPN and ASID are recorded in the UTLB by means of the LDTLB instruction.

PTEH fields

ASID (bits [0,7], size 8, RW): Address space identifier.
  Operation: Indicates the process that can access a virtual page. In single virtual memory mode and user mode, or in multiple virtual memory mode, if the SH bit is 0, this identifier is compared with the ASID in PTEH when address comparison is performed. See Section 3.4.7: Address space identifier (ASID).
  Power-on reset: Undefined
VPN (bits [10,31], size 22, RW): Virtual page number.
  Operation: For 1-kbyte: upper 22 bits of virtual address. For 4-kbyte: upper 20 bits of virtual address. For 64-kbyte: upper 16 bits of virtual address. For 1-Mbyte: upper 12 bits of virtual address.
  Power-on reset: Undefined

Table 7: PTEH register description

3.3.2 Page table entry low register (PTEL)

Longword access to PTEL can be performed from 0xFF00 0004 in the P4 region, and 0x1F00 0004 in Area 7. PTEL is used to hold the physical page number and page management information to be recorded in the UTLB by means of the LDTLB instruction. The contents of this register are not changed unless a software directive is issued.

PTEL fields

WT (bit 0, size 1, RW): Write-through bit.
  Operation: Specifies the cache write mode. 0: Copy-back mode. 1: Write-through mode.
  Power-on reset: Undefined
SH (bit 1, size 1, RW): Share status bit.
  Operation: 0: Pages are not shared by processes. 1: Pages are shared by processes.
  Power-on reset: Undefined
D (bit 2, size 1, RW): Dirty bit.
  Operation: Indicates whether a write has been performed to a page. 0: Write has not been performed. 1: Write has been performed.
  Power-on reset: Undefined
C (bit 3, size 1, RW): Cacheability bit.
  Operation: Indicates whether a page is cacheable. 0: Not cacheable. 1: Cacheable. When a control register is mapped, this bit must be cleared to 0.
  Power-on reset: Undefined
SZ0 (bit 4, size 1, RW): Page size bit.
  Operation: Specifies the page size together with SZ1.
    SZ1  SZ0  Page size
    0    0    1-kbyte
    0    1    4-kbyte
    1    0    64-kbyte
    1    1    1-Mbyte
  Power-on reset: Undefined
PR (bits [5,6], size 2, RW): Protection key data.
  Operation: 2-bit data expressing the page access right as a code. 00: Can be read only in privileged mode. 01: Can be read and written in privileged mode. 10: Can be read only, in privileged or user mode. 11: Can be read and written in privileged or user mode.
  Power-on reset: Undefined
SZ1 (bit 7, size 1, RW): Page size bit.
  Operation: Refer to SZ0 for operation details.
  Power-on reset: 0
V (bit 8, size 1, RW): Validity bit.
  Operation: Indicates whether the entry is valid. 0: Invalid. 1: Valid. Cleared to 0 by a power-on reset. Not affected by a manual reset.
  Power-on reset: Undefined
PPN (bits [10,28], size 19, RW): Physical page number.
  Operation: Upper 19 bits of the physical address. With a 1-kbyte page, PPN bits [28:10] are valid. With a 4-kbyte page, PPN bits [28:12] are valid. With a 64-kbyte page, PPN bits [28:16] are valid. With a 1-Mbyte page, PPN bits [28:20] are valid. The synonym problem must be taken into account when setting the PPN (Section 3.6.5: Avoiding synonym problems on page 64).
  Power-on reset: Undefined
RES (bits 9, [29,31], size 4, RW): Bits reserved.
  Power-on reset: Undefined

Table 8: PTEL register description

3.3.3 Translation table base register (TTB)

Longword access to the TTB can be performed from 0xFF00 0008 in the P4 region, and 0x1F00 0008 in Area 7. The contents of the TTB are not changed unless a software directive is issued. This register can be freely used by software.

TTB fields

TTB (bits [0,31], size 32, RW): Translation table base register.
  Operation: TTB is used, for example, to hold the base address of the currently used page table.
  Power-on reset: Undefined

Table 9: TTB register description

3.3.4 TLB exception address register (TEA)

Longword access to TEA can be performed from 0xFF00 000C in the P4 region and 0x1F00 000C in Area 7. The contents of this register can be changed by software.

TEA fields

TEA (bits [0,31], size 32, RW): TLB exception address register.
  Operation: After an MMU exception or address error exception occurs, the virtual address at which the exception occurred is set in TEA by hardware.
  Power-on reset: Undefined

Table 10: TEA register description

3.3.5 MMU control register (MMUCR)

Longword access to MMUCR can be performed from 0xFF00 0010 in the P4 region, and 0x1F00 0010 in Area 7. The individual bits perform MMU settings as shown below. Therefore, MMUCR rewriting should be performed by a program in the P1 or P2 region. After MMUCR is updated, an instruction that performs data access to the P0, P3, U0, or store queue region should be located at least four instructions after the MMUCR update instruction. Also, a branch instruction to the P0, P3, or U0 region should be located at least eight instructions after the MMUCR update instruction.

MMUCR contents can be changed by software.
The LRUI bits and URC bits may also be updated by hardware.

MMUCR fields

AT (bit 0, size 1, RW): Address translation bit.
  Operation: Specifies MMU enabling or disabling. 0: MMU disabled. 1: MMU enabled. MMU exceptions are not generated when the AT bit is 0. Therefore, in the case of software that does not use the MMU, the AT bit should be cleared to 0.
  Power-on reset: 0
TI (bit 2, size 1, RW): TLB invalidate.
  Operation: Writing 1 to this bit invalidates (clears to 0) all valid UTLB/ITLB bits. This bit always returns 0 when read.
  Power-on reset: 0
SV (bit 8, size 1, RW): Single virtual mode bit.
  Operation: Bit that switches between single virtual memory mode and multiple virtual memory mode. 0: Multiple virtual memory mode. 1: Single virtual memory mode. When this bit is changed, ensure that 1 is also written to the TI bit.
  Power-on reset: 0
SQMD (bit 9, size 1, RW): Store queue mode bit.
  Operation: Specifies the right of access to the store queues. 0: User/privileged access possible. 1: Privileged access possible (address error exception in case of user access).
  Power-on reset: 0
URC (bits [10,15], size 6, RW): UTLB replace counter.
  Operation: Random counter for indicating the UTLB entry for which replacement is to be performed with an LDTLB instruction. URC is incremented each time the UTLB is accessed. When URB > 0, URC is reset to 0 when the condition URC = URB occurs. Also note that, if a value is written to URC by software which results in the condition URC > URB, incrementing is first performed in excess of URB until URC = 0x3F. URC is not incremented by an LDTLB instruction.
  Power-on reset: 0
URB (bits [18,23], size 6, RW): UTLB replace boundary.
  Operation: Bits that indicate the UTLB entry boundary at which replacement is to be performed. Valid only when URB > 0.
  Power-on reset: 0
LRUI (bits [26,31], size 6, RW): Least recently used ITLB.
  Operation: The LRU (least recently used) method is used to decide the ITLB entry to be replaced in the event of an ITLB miss. The entry to be purged from the ITLB can be confirmed using the LRUI bits. LRUI is updated by means of the algorithm shown below. A dash in this table means that updating is not performed.

                                 [5]  [4]  [3]  [2]  [1]  [0]
    When ITLB entry 0 is used     0    0    0    -    -    -
    When ITLB entry 1 is used     1    -    -    0    0    -
    When ITLB entry 2 is used     -    1    -    1    -    0
    When ITLB entry 3 is used     -    -    1    -    1    1
    Other than the above          -    -    -    -    -    -

  When the LRUI bit settings are as shown below, the corresponding ITLB entry is updated by an ITLB miss. An asterisk in this table means "don't care".

                                 [5]  [4]  [3]  [2]  [1]  [0]
    ITLB entry 0 is updated       1    1    1    *    *    *
    ITLB entry 1 is updated       0    *    *    1    1    *
    ITLB entry 2 is updated       *    0    *    0    *    1
    ITLB entry 3 is updated       *    *    0    *    0    0
    Other than the above          Setting prohibited

  Ensure that values for which "Setting prohibited" is indicated in the above table are not set at the discretion of software. After a power-on or manual reset the bits are initialized to 0, and therefore a prohibited setting is never made by a hardware update.
  Power-on reset: 0
RES (bits 1, [3,7], [16,17], [24,25], size 10, RW): Bits reserved.
  Power-on reset: Undefined

Table 11: MMUCR register description

3.4 Address space

3.4.1 Physical address space

The SH-4 CPU core supports a 32-bit (4-Gbyte) physical address space. When the MMUCR.AT bit is cleared to 0 and the MMU is disabled, the address space accessed by the program is this physical address space.
The physical address space is divided into a number of regions, as shown in Figure 7. The region is selected using the top 3 bits of the physical address.

                           Region accessed
Bit 31  Bit 30  Bit 29     Privileged mode   User mode
0       0       0          P0                U0
0       0       1          P0                U0
0       1       0          P0                U0
0       1       1          P0                U0
1       0       0          P1                Address error
1       0       1          P2                Address error
1       1       0          P3                Address error
1       1       1          P4                Address error (a)

Table 12: Region selection

a. Except for addresses from 0xE000 0000 to 0xE3FF FFFF, which the user can use to access the store queues.

The region selected determines how the remaining 29 bits are interpreted. For example, P0, P1 and P3 all access the 29 bits of external memory via the cache. P4 is used exclusively to access the core's internal devices. See the system architecture manual for more details of the internal devices available on a particular product.

3.4.2 External memory space

The SH-4 CPU core supports a 29-bit external memory space. The external memory space is divided into eight Areas as shown in Figure 7. Areas 0 to 6 relate to memory; Area 7 is a reserved area, and is only accessed via the P4 region.

Start address   Area
0x0000 0000     Area 0
0x0400 0000     Area 1
0x0800 0000     Area 2
0x0C00 0000     Area 3
0x1000 0000     Area 4
0x1400 0000     Area 5
0x1800 0000     Area 6
0x1C00 0000     Area 7 (reserved area), ending at 0x1FFF FFFF

Figure 6: External memory space
Start address   Privileged mode               User mode
0x0000 0000     P0 region (cacheable)         U0 region (cacheable)
0x8000 0000     P1 region (cacheable)         Address error
0xA000 0000     P2 region (non-cacheable)     Address error
0xC000 0000     P3 region (cacheable)         Address error
0xE000 0000     P4 region (non-cacheable)     Store queue region
0xE400 0000     P4 region (non-cacheable)     Address error (to 0xFFFF FFFF)

Each region maps onto Areas 0 to 7 of the external memory space. Area 7 is reserved.

Figure 7: Physical address space (MMUCR.AT = 0)

P0, P1, P3, U0 Regions: The P0, P1, P3, and U0 regions can be accessed using the cache. Whether or not the cache is used is determined by the cache control register (CCR). When the cache is used, with the exception of the P1 region, switching between the copy-back method and the write-through method for write accesses is specified by the CCR.WT bit. For the P1 region, switching is specified by the CCR.CB bit. Zeroing the upper 3 bits of an address in these regions gives the corresponding external memory space address. However, since Area 7 in the external memory space is a reserved Area, a reserved area also appears in these regions.

P2 Region: The P2 region cannot be accessed using the cache. In the P2 region, zeroing the upper 3 bits of an address gives the corresponding external memory space address. However, since Area 7 in the external memory space is a reserved Area, a reserved area also appears in this region.

P4 Region: The P4 region is mapped onto SH-4 CPU core on-chip I/O channels. This region cannot be accessed using the cache. The P4 region is shown in detail in Table 13.

Start address   End address    Function
0xE000 0000     0xE3FF FFFF    Comprises addresses for accessing the store queues (SQs). When the MMU is disabled (MMUCR.AT = 0), the SQ access right is specified by the MMUCR.SQMD bit. For details, see Section 4.6: Store queues on page 101.
0xF000 0000     0xF0FF FFFF    Used for direct access to the instruction cache address array. For details, see Section 4.5.1: IC address array on page 95.
0xF100 0000     0xF1FF FFFF    Used for direct access to the instruction cache data array. For details, see Section 4.5.4: IC data array on page 97.
0xF200 0000     0xF2FF FFFF    Used for direct access to the instruction TLB address array. For details, see Section 3.8.1: ITLB address array on page 70.
0xF300 0000     0xF3FF FFFF    Used for direct access to instruction TLB data arrays 1 and 2. For details, see Section 3.8.2: ITLB data array 1 on page 71.
0xF400 0000     0xF4FF FFFF    Used for direct access to the operand cache address array. For details, see Section 4.5.5: OC address array on page 98.
0xF500 0000     0xF5FF FFFF    Used for direct access to the operand cache data array. For details, see Section 4.5.6: OC data array on page 99.
0xF600 0000     0xF6FF FFFF    Used for direct access to the unified TLB address array. For details, see Section 3.8.3: UTLB address array on page 72.
0xF700 0000     0xF7FF FFFF    Used for direct access to unified TLB data arrays 1 and 2. For details, see Section 3.8.4: UTLB data array 1 on page 74.
0xFC00 0000     0xFFFF FFFF    Control register area.

Table 13: P4 area

3.4.3 Virtual address space

Setting the MMUCR.AT bit to 1 enables the P0, P3, and U0 regions of the address space in the SH-4 CPU core to be mapped onto any external memory space in 1-, 4-, or 64-kbyte, or 1-Mbyte, page units. Mapping from virtual address space to 29-bit external memory space is carried out using the TLB. When accessed using virtual addressing, Area 7 is equivalent to the P4 region in physical address space. Virtual address space is illustrated in Figure 8.
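For the untranslated case (MMUCR.AT = 0, or the P1 and P2 regions at any time), the region selection of Table 12 and the rule that zeroing the upper 3 bits of an address gives the external memory space address can be sketched in C as follows. The type and function names are hypothetical and chosen for this illustration only; the Area size of 0x0400 0000 bytes is taken from the start addresses in Figure 6.

```c
#include <stdint.h>

/* Hypothetical names for the regions of Table 12.
 * P0 (privileged mode) and U0 (user mode) share the same address range. */
enum sh4_region { REGION_P0_U0, REGION_P1, REGION_P2, REGION_P3, REGION_P4 };

/* Select the region from the top 3 bits (31, 30, 29) of the address. */
static enum sh4_region region_of(uint32_t addr)
{
    switch (addr >> 29) {
    case 4u: return REGION_P1;      /* 1 0 0 */
    case 5u: return REGION_P2;      /* 1 0 1 */
    case 6u: return REGION_P3;      /* 1 1 0 */
    case 7u: return REGION_P4;      /* 1 1 1 */
    default: return REGION_P0_U0;   /* bit 31 = 0 */
    }
}

/* Zero the upper 3 bits to obtain the 29-bit external memory address. */
static uint32_t external_address(uint32_t addr)
{
    return addr & 0x1FFFFFFFu;
}

/* Area number (0 to 7) of a 29-bit external memory address.
 * Each Area starts on a multiple of 0x0400 0000. */
static unsigned area_of(uint32_t ext_addr)
{
    return (unsigned)((ext_addr >> 26) & 7u);
}
```

For example, under these assumptions the P1 address 0x8C00 1234 and the P2 address 0xAC00 1234 both correspond to external memory address 0x0C00 1234 in Area 3; the first is accessed through the cache and the second is not.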
Privileged mode:
  P0 region: cacheable, address translation possible
  P1 region: cacheable, address translation not possible
  P2 region: non-cacheable, address translation not possible
  P3 region: cacheable, address translation possible
  P4 region: non-cacheable, address translation not possible
User mode:
  U0 region: cacheable, address translation possible
  Address error
  Store queue region
  Address error
External memory space: Area 0 to Area 7

Figure 8: Virtual memory space (MMUCR.AT = 1)

P0, P3, U0 Regions: The P0 region (excluding addresses 0x7C00 0000 to 0x7FFF FFFF), P3 region, and U0 region allow access using the cache, and address translation using the TLB. These regions can be mapped onto any external memory space in 1-, 4-, or 64-kbyte, or 1-Mbyte, page units. When CCR is in the cache-enabled state, and the TLB cacheability bit (C bit) is 1, accesses can be performed using the cache. In write accesses to the cache, switching between the copy-back method and the write-through method is indicated by the TLB write-through bit (WT bit), and is specified in page units.

Only when the P0, P3, and U0 regions are mapped onto external memory space by means of the TLB are addresses 0x1C00 0000 to 0x1FFF FFFF of Area 7 in external memory space allocated to the control register area. This enables control registers to be accessed from the U0 region in user mode. In this case, the C bit for the corresponding page must be cleared to 0.

P1, P2, P4 Regions: Address translation using the TLB cannot be performed for the P1, P2, or P4 region (except for the store queue region). Accesses to these regions are the same as for physical address space. The store queue region can be mapped onto any external memory space by the MMU. However, operation in the case of an exception differs from that for normal P0, U0, and P3 spaces.
For details, see Section 4.6: Store queues.

3.4.4 On-chip RAM space

In the SH-4 CPU core, half of the 16-kbyte operand cache can be used as on-chip RAM. This can be done by changing the CCR settings. When the operand cache is used as on-chip RAM (CCR.ORA = 1), the P0/U0 region addresses 0x7C00 0000 to 0x7FFF FFFF are an on-chip RAM area. Data accesses (byte/word/longword/quadword) can be used in this area. This area can only be used in RAM mode.

Note: It is not possible to execute instructions out of this on-chip RAM.

3.4.5 Address translation

In the SH-4 CPU core, the ITLB is used for instruction accesses and the UTLB for data accesses. In the event of an access to a region other than the P4 region, the accessed virtual address is translated to a physical address. If the virtual address belongs to the P1 or P2 region, the physical address is uniquely determined without accessing the TLB. If the virtual address belongs to the P0, U0, or P3 region, the TLB is searched using the virtual address, and if the virtual address is recorded in the TLB, a TLB hit is made and the corresponding physical address is read from the TLB. If the accessed virtual address is not recorded in the TLB, a TLB miss exception is generated and processing switches to the TLB miss exception handling routine. In the TLB miss exception handling routine, the address translation table in external memory is searched, and the corresponding physical address and page management information are recorded in the TLB. After the return from the exception handling routine, the instruction which caused the TLB miss exception is re-executed.
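The TLB search described above can be modelled in software roughly as follows. This is a sketch under stated assumptions, not a description of the hardware: the structure layout and all names are hypothetical, the VPN and PPN are held as full addresses whose low bits are ignored, and the protection (PR), dirty (D) and cache (C, WT) checks are left out. The ASID comparison rule follows the PTEH ASID description in Table 7 and the decision shown in Figure 11: ASIDs are compared when SH = 0 and (MMUCR.SV = 0 or SR.MD = 0).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of one TLB entry (field names follow PTEH/PTEL). */
struct tlb_entry {
    uint32_t vpn;   /* virtual page address; bits below the page size are ignored */
    uint32_t ppn;   /* physical page address, bits [28:10] */
    uint8_t  asid;  /* address space identifier */
    uint8_t  sz;    /* SZ1:SZ0: 0 = 1 kbyte, 1 = 4 kbyte, 2 = 64 kbyte, 3 = 1 Mbyte */
    bool     sh;    /* share status bit */
    bool     v;     /* validity bit */
};

enum tlb_result { TLB_HIT, TLB_MISS, TLB_MULTIPLE_HIT };

/* Mask of the in-page offset bits for each page size. */
static uint32_t offset_mask(uint8_t sz)
{
    static const uint32_t masks[4] = { 0x3FFu, 0xFFFu, 0xFFFFu, 0xFFFFFu };
    return masks[sz & 3u];
}

/* Search `n` entries for virtual address `va`.
 * `asid` is the current PTEH.ASID, `sv` is MMUCR.SV, `md` is SR.MD.
 * On a hit, *pa receives the 29-bit external memory address.
 * On a multiple hit, *pa holds the translation of the first matching entry. */
static enum tlb_result tlb_lookup(const struct tlb_entry *tlb, size_t n,
                                  uint32_t va, uint8_t asid,
                                  bool sv, bool md, uint32_t *pa)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        const struct tlb_entry *e = &tlb[i];
        uint32_t mask = offset_mask(e->sz);

        /* The entry must be valid and the page-number bits must match. */
        if (!e->v || ((e->vpn ^ va) & ~mask) != 0u)
            continue;

        /* ASIDs are compared only when SH = 0 and (SV = 0 or MD = 0). */
        bool check_asid = !e->sh && (!sv || !md);
        if (check_asid && e->asid != asid)
            continue;

        if (hits++ == 0)
            *pa = (e->ppn & 0x1FFFFC00u & ~mask) | (va & mask);
    }
    if (hits == 0)
        return TLB_MISS;
    return hits == 1 ? TLB_HIT : TLB_MULTIPLE_HIT;
}
```

In this model, a TLB_MISS result corresponds to the point at which the TLB miss exception handling routine would search the address translation table in external memory and record a new entry before the access is retried.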
3.4.6 Single virtual memory mode and multiple virtual memory mode

There are two virtual memory systems, either of which can be selected with the MMUCR.SV bit:

· single virtual memory
  A number of processes run simultaneously, using non-overlapping virtual address spaces, so that the physical address corresponding to a particular virtual address is uniquely determined.
· multiple virtual memory
  A number of processes run with overlapping virtual address spaces. Consequently, virtual addresses may need to be translated into different physical addresses depending on the process ID.

The only difference between the single virtual memory and multiple virtual memory systems in terms of operation is in the TLB address comparison method (see Section 3.5.3: Address translation method on page 59).

3.4.7 Address space identifier (ASID)

In multiple virtual memory mode, the 8-bit address space identifier (ASID) is used to distinguish between processes running simultaneously, while sharing the virtual address space. Software can set the ASID of the currently executing process in PTEH in the MMU. The TLB does not have to be purged when processes are switched by means of ASID.

In single virtual memory mode, ASID is used to provide memory protection for processes running simultaneously while using the virtual memory space on an exclusive basis.

3.5 TLB functions

3.5.1 Unified TLB (UTLB) configuration

The unified TLB (UTLB) is so called because of its use for the following two purposes:

1 To translate a virtual address to a physical address in a data access
2 As a table of address translation information, to be recorded in the instruction TLB in the event of an ITLB miss

Information in the address translation table located in external memory is cached into the UTLB.
The address translation table contains virtual page numbers and address space identifiers, and corresponding physical page numbers and page management information. Figure 9 shows the overall configuration of the UTLB. The UTLB consists of 64 fully-associative type entries.

[Figure 9: UTLB configuration. Entries 0 to 63 each hold ASID [7:0], VPN [31:10], V, PPN [28:10], SZ [1:0], SH, C, PR [1:0], D, and WT.]

3.5.2 Instruction TLB (ITLB) configuration

The ITLB is used to translate a virtual address to a physical address in an instruction access. Information in the address translation table located in the UTLB is cached into the ITLB. Figure 10 shows the overall configuration of the ITLB. The ITLB consists of 4 fully-associative type entries. The address translation information is almost the same as that in the UTLB, but with the following differences:

1 D and WT bits are not supported.
2 There is only one PR bit, corresponding to the upper of the PR bits in the UTLB.

[Figure 10: ITLB configuration. Entries 0 to 3 each hold ASID [7:0], VPN [31:10], V, PPN [28:10], SZ [1:0], SH, C, and PR.]

3.5.3 Address translation method

Figure 11 and Figure 12 show flowcharts of memory accesses using the UTLB and ITLB.
[Figure 11: Flowchart of memory access using UTLB. A data access is first classified by region (P4, P2, P1, or P0/U0/P3). With MMUCR.AT = 1, accesses to P0, U0, or P3 are compared against the UTLB entries: VPNs match and V = 1, and additionally ASIDs match when SH = 0 and (MMUCR.SV = 0 or SR.MD = 0). No match gives a data TLB miss exception; more than one match gives a data TLB multiple hit exception. A single match is checked against SR.MD, PR, and the access type (data TLB protection violation exception) and, for writes, against the D bit (initial page write exception). If C = 1 and CCR.OCE = 1, the access is a cache access in copy-back mode (WT = 0) or write-through mode (WT = 1); otherwise it is a non-cacheable memory access.]

[Figure 12: Flowchart of memory access using ITLB. An instruction access to the P4 area is prohibited. With MMUCR.AT = 1, accesses to P0, U0, or P3 are compared against the ITLB entries using the same VPN, ASID, and V comparison as for the UTLB. On an ITLB miss, hardware ITLB miss handling searches the UTLB and records a matching entry in the ITLB; if there is no match, an instruction TLB miss exception results. More than one matching ITLB entry gives an instruction TLB multiple hit exception. In user mode (SR.MD = 0), PR = 0 gives an instruction TLB protection violation exception. If C = 1 and CCR.ICE = 1, the access is a cache access; otherwise it is a non-cacheable memory access.]

3.6 MMU functions

3.6.1 MMU hardware management

The SH-4 CPU core supports the following MMU functions.

1 The MMU decodes the virtual address to be accessed by software, and performs address translation by controlling the UTLB/ITLB, in accordance with the MMUCR settings.
2 The MMU determines the cache access status on the basis of the page management information read during address translation (C, WT bits).
3 If address translation cannot be performed normally in a data access or instruction access, the MMU notifies software by means of an MMU exception.
4 If address translation information is not recorded in the ITLB in an instruction access, the MMU searches the UTLB, and if the necessary address translation information is recorded in the UTLB, the MMU copies this information into the ITLB in accordance with MMUCR.LRUI.

3.6.2 MMU software management

Software processing for the MMU consists of the following:

1 Setting of MMU-related registers. Some registers are also partially updated by hardware automatically.
2 Recording, deletion, and reading of TLB entries. There are two methods of recording UTLB entries: by using the LDTLB instruction, or by writing directly to the memory-mapped UTLB. ITLB entries can only be recorded by writing directly to the memory-mapped ITLB. For deleting or reading UTLB/ITLB entries, it is possible to access the memory-mapped UTLB/ITLB.
3 MMU exception handling. When an MMU exception occurs, processing is performed based on information set by hardware.

3.6.3 MMU instruction (LDTLB)

A TLB load instruction (LDTLB) is provided for recording UTLB entries. When an LDTLB instruction is issued, the SH-4 CPU core copies the contents of PTEH and PTEL to the UTLB entry indicated by MMUCR.URC. ITLB entries are not updated by the LDTLB instruction, and therefore address translation information purged from the UTLB entry may still remain in the ITLB entry. As the LDTLB instruction changes address translation information, ensure that it is issued by a program in the P1 or P2 region. The operation of the LDTLB instruction is shown in Figure 13.
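The effect of LDTLB described in 3.6.3 can be modelled on a host machine as in the sketch below. It is a simplified software model, not a description of the hardware implementation. The register field positions are assumptions: URC in MMUCR bits [15:10] is read from Figure 13, the PTEH fields follow the address array data field (VPN [31:10], ASID [7:0]), and the PTEL fields follow the UTLB data array 1 data field described in 3.8.4. Treating bit [7] as the upper SZ bit is also an assumption, and the type and function names are hypothetical. Hardware updates of URC are not modelled.

```c
#include <assert.h>
#include <stdint.h>

#define UTLB_ENTRIES 64

/* One UTLB entry, holding the fields listed in Figure 9. */
typedef struct {
    uint32_t vpn;   /* virtual page number, VPN [31:10] */
    uint32_t ppn;   /* physical page number, PPN [28:10] */
    uint8_t  asid;  /* address space identifier */
    uint8_t  v, sz, pr, c, d, sh, wt;
} utlb_entry;

/* Hypothetical model of the MMU state touched by LDTLB. */
typedef struct {
    uint32_t   pteh, ptel, mmucr;
    utlb_entry utlb[UTLB_ENTRIES];
} mmu_model;

/* Extract bits [hi:lo] of x (field widths here are below 32 bits). */
static inline uint32_t field(uint32_t x, unsigned hi, unsigned lo)
{
    return (x >> lo) & ((1u << (hi - lo + 1)) - 1u);
}

/* Copy PTEH and PTEL into the UTLB entry selected by MMUCR.URC.
 * Returns the index of the entry that was written. */
static inline unsigned ldtlb_model(mmu_model *m)
{
    assert(m != 0);
    unsigned urc = field(m->mmucr, 15, 10);  /* assumed URC position */
    utlb_entry *e = &m->utlb[urc];

    e->vpn  = field(m->pteh, 31, 10);
    e->asid = (uint8_t)field(m->pteh, 7, 0);

    e->ppn = field(m->ptel, 28, 10);
    e->v   = (uint8_t)field(m->ptel, 8, 8);
    /* SZ is split across bits [7] and [4]; bit [7] assumed to be the upper bit. */
    e->sz  = (uint8_t)((field(m->ptel, 7, 7) << 1) | field(m->ptel, 4, 4));
    e->pr  = (uint8_t)field(m->ptel, 6, 5);
    e->c   = (uint8_t)field(m->ptel, 3, 3);
    e->d   = (uint8_t)field(m->ptel, 2, 2);
    e->sh  = (uint8_t)field(m->ptel, 1, 1);
    e->wt  = (uint8_t)field(m->ptel, 0, 0);
    return urc;
}
```

As stated above, the ITLB is deliberately left untouched in this model, which mirrors the point that a stale translation may remain in an ITLB entry after LDTLB.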

[Figure 13: Operation of LDTLB instruction. The figure shows the MMUCR fields (LRUI, URB, URC, SQMD, SV, TI, AT), with URC providing the entry specification, and the PTEH fields (VPN, ASID), PTEL fields (PPN, V, SZ, PR, C, D, SH, WT), and PTEA fields (TC, SA) being written to the selected one of UTLB entries 0 to 63.]

3.6.4 Hardware ITLB miss handling

In an instruction access, the SH-4 CPU core searches the ITLB. If it cannot find the necessary address translation information (i.e. in the event of an ITLB miss), the UTLB is searched by hardware, and if the necessary address translation information is present, it is recorded in the ITLB. This procedure is known as hardware ITLB miss handling. If the necessary address translation information is not found in the UTLB search, an instruction TLB miss exception is generated and processing passes to software.

3.6.5 Avoiding synonym problems

When 1-kbyte or 4-kbyte pages are recorded in TLB entries, a synonym problem may arise. The problem is that, when a number of virtual addresses are mapped onto a single physical address, the same physical address data may be recorded in a number of cache entries, and it becomes impossible to guarantee data integrity. This problem does not occur with the instruction TLB or instruction cache. In the SH-4 CPU core, line selection is performed using bits [13:5] of the virtual address, as this avoids the cache having to go via the TLB and thus achieves faster operand cache operation.
However, bits [13:10] of the virtual address in the case of a 1-kbyte page, and bits [13:12] of the virtual address in the case of a 4-kbyte page, are subject to address translation. As a result, bits [13:10] of the physical address after translation may differ from bits [13:10] of the virtual address.

Great care must therefore be taken whenever translations are set up that could cause synonyms. In particular, if two operand translations map to the same physical page but their virtual addresses differ in their synonym bits:

· Do not allow both translations to be active at the same time.
· Always separate activations of the two translations by an appropriate cache purge.

3.7 Handling MMU exceptions

There are seven MMU exceptions.

3.7.1 ITLBMULTIHIT

An instruction TLB multiple hit exception occurs when more than one ITLB entry matches the virtual address to which an instruction access has been made. If multiple hits occur when the UTLB is searched by hardware during hardware ITLB miss handling, a data TLB multiple hit exception results.

When an instruction TLB multiple hit exception occurs, a reset is executed, and cache coherency is not guaranteed.

Hardware processing
See Chapter 5: Exceptions on page 105, ITLBMULTIHIT - Instruction TLB Multiple-Hit Exception on page 118.

Software processing (reset routine)
The ITLB entries which caused the multiple hit exception are checked in the reset handling routine. This exception is intended for use in program debugging, and should not normally be generated.

3.7.2 ITLBMISS

An instruction TLB miss exception occurs when address translation information for the virtual address to which an instruction access is made is not found in the UTLB entries by the hardware ITLB miss handling procedure. The instruction TLB miss exception processing, carried out by software, is shown below.
This is the same as the processing for a data TLB miss exception.

Hardware processing
See Chapter 5: Exceptions on page 105, ITLBMISS - Instruction TLB Miss Exception on page 122.

Software processing (instruction TLB miss exception handling routine)
Software is responsible for searching the external memory page table and assigning the necessary page table entry. Software should carry out the following processing in order to find and assign the necessary page table entry.

1 Write to PTEL the values of the PPN, PR, SZ, C, D, SH, V, and WT bits in the page table entry recorded in the external memory address translation table.
2 When the entry to be replaced in entry replacement is specified by software, write that value to URC in the MMUCR register. If URC is greater than URB at this time, the value should be changed to an appropriate value after issuing an LDTLB instruction.
3 Execute the LDTLB instruction and write the contents of PTEH and PTEL to the TLB.
4 Finally, execute the exception handling return instruction (RTE), terminate the exception handling routine, and return control to the normal flow. The RTE instruction should be issued at least one instruction after the LDTLB instruction.

3.7.3 EXECPROT

An instruction TLB protection violation exception occurs when, even though an ITLB entry contains address translation information matching the virtual address to which an instruction access is made, the actual access type is not permitted by the access right specified by the PR bit. The instruction TLB protection violation exception processing, carried out by software, is shown below.

Hardware processing
See Chapter 5: Exceptions on page 105, EXECPROT - Instruction TLB Protection Violation Exception on page 126.

Software processing (instruction TLB protection violation exception handling routine)
Resolve the instruction TLB protection violation, execute the exception handling return instruction (RTE), terminate the exception handling routine, and return control to the normal flow. The RTE instruction should be issued at least one instruction after the LDTLB instruction.

3.7.4 OTLBMULTIHIT

An operand TLB multiple hit exception occurs when more than one UTLB entry matches the virtual address to which a data access has been made. A data TLB multiple hit exception is also generated if multiple hits occur when the UTLB is searched in hardware ITLB miss handling.

When an operand TLB multiple hit exception occurs, a reset is executed, and cache coherency is not guaranteed. The contents of PPN in the UTLB prior to the exception may also be corrupted.

Hardware processing
See Chapter 5: Exceptions on page 105, OTLBMULTIHIT - Operand TLB Multiple-Hit Exception on page 119.

Software processing (reset routine)
The UTLB entries which caused the multiple hit exception are checked in the reset handling routine. This exception is intended for use in program debugging, and should not normally be generated.

3.7.5 TLBMISS

A data TLB miss exception occurs when address translation information for the virtual address to which a data access is made is not found in the UTLB entries. The data TLB miss exception processing, carried out by software, is shown below.

Hardware processing
See Chapter 5: Exceptions on page 105, RTLBMISS - Read Data TLB Miss Exception on page 120.

Software processing (data TLB miss exception handling routine)
Software is responsible for searching the external memory page table and assigning the necessary page table entry. Software should carry out the following processing in order to find and assign the necessary page table entry.
1 Write to PTEL the values of the PPN, PR, SZ, C, D, SH, V, and WT bits in the page table entry recorded in the external memory address translation table.
2 When the entry to be replaced in entry replacement is specified by software, write that value to URC in the MMUCR register. If URC is greater than URB at this time, the value should be changed to an appropriate value after issuing an LDTLB instruction.
3 Execute the LDTLB instruction and write the contents of PTEH and PTEL to the UTLB.
4 Finally, execute the exception handling return instruction (RTE), terminate the exception handling routine, and return control to the normal flow. The RTE instruction should be issued at least one instruction after the LDTLB instruction.

3.7.6 READPROT

A data TLB protection violation exception occurs when, even though a UTLB entry contains address translation information matching the virtual address to which a data access is made, the actual access type is not permitted by the access right specified by the PR bit. The data TLB protection violation exception processing, carried out by software, is shown below.

Hardware processing
See Chapter 5: Exceptions on page 105, READPROT - Data TLB Protection Violation Exception on page 124.

Software processing (data TLB protection violation exception handling routine)
Resolve the data TLB protection violation, execute the exception handling return instruction (RTE), terminate the exception handling routine, and return control to the normal flow. The RTE instruction should be issued at least one instruction after the LDTLB instruction.

3.7.7 FIRSTWRITE

An initial page write exception occurs when the D bit is 0 even though a UTLB entry contains address translation information matching the virtual address to which a data access (write) is made, and the access is permitted.
The initial page write exception processing, carried out by software, is shown below.

Hardware processing
See Chapter 5: Exceptions on page 105, FIRSTWRITE - Initial Page Write Exception on page 123.

Software processing (initial page write exception handling routine)
The following processing should be carried out as the responsibility of software:

1 Retrieve the necessary page table entry from external memory.
2 Write 1 to the D bit in the external memory page table entry.
3 Write to PTEL the values of the PPN, PR, SZ, C, D, WT, SH, and V bits in the page table entry recorded in external memory.
4 When the entry to be replaced in entry replacement is specified by software, write that value to URC in the MMUCR register. If URC is greater than URB at this time, the value should be changed to an appropriate value after issuing an LDTLB instruction.
5 Execute the LDTLB instruction and write the contents of PTEH and PTEL to the UTLB.
6 Finally, execute the exception handling return instruction (RTE), terminate the exception handling routine, and return control to the normal flow. The RTE instruction should be issued at least one instruction after the LDTLB instruction.

3.8 Memory-mapped TLB configuration

To enable the ITLB and UTLB to be managed by software, their contents can be read and written by a P2 region program, with a MOV instruction in privileged mode. Operation is not guaranteed if access is made from a program in another region. A branch to a region other than the P2 region should be made at least 8 instructions after this MOV instruction. The ITLB and UTLB are allocated to the P4 region in physical address space. VPN, V, and ASID in the ITLB can be accessed as an address array, and PPN, V, SZ, PR, C, and SH as data array 1. VPN, D, V, and ASID in the UTLB can be accessed as an address array, and PPN, V, SZ, PR, C, D, WT, and SH as data array 1.
V and D can be accessed from both the address array side and the data array side. Only longword access is possible. Instruction fetches cannot be performed in these regions. For reserved bits, a write value of 0 should be specified; their read value is undefined.

3.8.1 ITLB address array

The ITLB address array is allocated to addresses 0xF200 0000 to 0xF2FF FFFF in the P4 region. An address array access requires a 32-bit address field specification (when reading or writing), and a 32-bit data field specification (when writing). Information for selecting the entry to be accessed is specified in the address field, and VPN, V, and ASID to be written to the address array are specified in the data field.

In the address field, bits [31:24] have the value 0xF2 indicating the ITLB address array, and the entry is selected by bits [9:8]. As longword access is used, 0 should be specified for address field bits [1:0].

In the data field, VPN is indicated by bits [31:10], V by bit [8], and ASID by bits [7:0].

The following two kinds of operation can be used on the ITLB address array:

1 ITLB address array read
VPN, V, and ASID are read into the data field from the ITLB entry corresponding to the entry set in the address field.
2 ITLB address array write
VPN, V, and ASID specified in the data field are written to the ITLB entry corresponding to the entry set in the address field.

[Figure 14: Memory-mapped ITLB address array. Address field: bits [31:24] = 1111 0010, E (entry) in bits [9:8]. Data field: VPN (virtual page number) in bits [31:10], V (validity bit) in bit [8], ASID (address space identifier) in bits [7:0]. All other bits are reserved (0 write value, undefined read value).]

3.8.2 ITLB data array 1

ITLB data array 1 is allocated to addresses 0xF300 0000 to 0xF37F FFFF in the P4 region.
A data array access requires a 32-bit address field specification (when reading or writing), and a 32-bit data field specification (when writing). Information for selecting the entry to be accessed is specified in the address field, and PPN, V, SZ, PR, C, and SH to be written to the data array are specified in the data field.

In the address field, bits [31:23] have the value 0xF30 indicating ITLB data array 1, and the entry is selected by bits [9:8].

In the data field, PPN is indicated by bits [28:10], V by bit [8], SZ by bits [7] and [4], PR by bit [6], C by bit [3], and SH by bit [1].

The following two kinds of operation can be used on ITLB data array 1:

1 ITLB data array 1 read
PPN, V, SZ, PR, C, and SH are read into the data field from the ITLB entry corresponding to the entry set in the address field.
2 ITLB data array 1 write
PPN, V, SZ, PR, C, and SH specified in the data field are written to the ITLB entry corresponding to the entry set in the address field.

[Figure 15: Memory-mapped ITLB data array 1. Address field: bits [31:23] = 1111 0011 0, E (entry) in bits [9:8]. Data field: PPN (physical page number) in bits [28:10], V (validity bit) in bit [8], SZ (page size bits) in bits [7] and [4], PR (protection key data) in bit [6], C (cacheability bit) in bit [3], SH (share status bit) in bit [1]. All other bits are reserved (0 write value, undefined read value).]

3.8.3 UTLB address array

The UTLB address array is allocated to addresses 0xF600 0000 to 0xF6FF FFFF in the P4 region. An address array access requires a 32-bit address field specification (when reading or writing), and a 32-bit data field specification (when writing). Information for selecting the entry to be accessed is specified in the address field, and VPN, D, V, and ASID to be written to the address array are specified in the data field.

In the address field, bits [31:24] have the value 0xF6 indicating the UTLB address array, and the entry is selected by bits [13:8]. Address field bit [7], the association bit (A bit), specifies whether or not address comparison is performed when writing to the UTLB address array.

In the data field, VPN is indicated by bits [31:10], D by bit [9], V by bit [8], and ASID by bits [7:0].

The following three kinds of operation can be used on the UTLB address array:

1 UTLB address array read
VPN, D, V, and ASID are read into the data field from the UTLB entry corresponding to the entry set in the address field. In a read, associative operation is not performed, regardless of whether the association bit specified in the address field is 1 or 0.
2 UTLB address array write (non-associative)
VPN, D, V, and ASID specified in the data field are written to the UTLB entry corresponding to the entry set in the address field. The A bit in the address field should be cleared to 0.
3 UTLB address array write (associative)
When a write is performed with the A bit in the address field set to 1, comparison of all the UTLB entries is carried out using the VPN specified in the data field and PTEH.ASID. The usual address comparison rules are followed, but if a UTLB miss occurs, the result is no operation, and an exception is not generated. If the comparison identifies a UTLB entry corresponding to the VPN specified in the data field, D and V specified in the data field are written to that entry. If there is more than one matching entry, a data TLB multiple hit exception results. This associative operation is simultaneously carried out on the ITLB, and if a matching entry is found in the ITLB, V is written to that entry. Even if the UTLB comparison results in no operation, a write to the ITLB side only is performed as long as there is an ITLB match.
If there is a match in both the UTLB and ITLB, the UTLB information is also written to the ITLB.

[Figure 16: Memory-mapped UTLB address array. Address field: bits [31:24] = 1111 0110, E (entry) in bits [13:8], A (association bit) in bit [7]. Data field: VPN (virtual page number) in bits [31:10], D (dirty bit) in bit [9], V (validity bit) in bit [8], ASID (address space identifier) in bits [7:0]. All other bits are reserved (0 write value, undefined read value).]

3.8.4 UTLB data array 1

UTLB data array 1 is allocated to addresses 0xF700 0000 to 0xF77F FFFF in the P4 region. A data array access requires a 32-bit address field specification (when reading or writing), and a 32-bit data field specification (when writing). Information for selecting the entry to be accessed is specified in the address field, and PPN, V, SZ, PR, C, D, SH, and WT to be written to the data array are specified in the data field.

In the address field, bits [31:23] have the value 0xF70 indicating UTLB data array 1, and the entry is selected by bits [13:8].

In the data field, PPN is indicated by bits [28:10], V by bit [8], SZ by bits [7] and [4], PR by bits [6:5], C by bit [3], D by bit [2], SH by bit [1], and WT by bit [0].

The following two kinds of operation can be used on UTLB data array 1:

1 UTLB data array 1 read
PPN, V, SZ, PR, C, D, SH, and WT are read into the data field from the UTLB entry corresponding to the entry set in the address field.
2 UTLB data array 1 write
PPN, V, SZ, PR, C, D, SH, and WT specified in the data field are written to the UTLB entry corresponding to the entry set in the address field.
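
The address and data field layouts given in 3.8.3 and 3.8.4 can be composed with helpers such as the following. This is a minimal sketch that only computes the 32-bit values; it does not perform the access itself, which on a real device would be a longword MOV issued by a privileged program running in the P2 region. The helper names are hypothetical, and treating data field bit [7] as the upper SZ bit and bit [4] as the lower SZ bit is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define UTLB_ADDR_ARRAY_BASE   0xF6000000u  /* address field bits [31:24] = 0xF6 */
#define UTLB_DATA_ARRAY1_BASE  0xF7000000u  /* start of UTLB data array 1 */
#define UTLB_ADDR_ASSOC_BIT    (1u << 7)    /* A bit in the address field */

/* Address field for a UTLB address array access to the given entry. */
static inline uint32_t utlb_addr_array_address(unsigned entry, int associative)
{
    assert(entry < 64);
    return UTLB_ADDR_ARRAY_BASE | ((uint32_t)entry << 8) |
           (associative ? UTLB_ADDR_ASSOC_BIT : 0u);
}

/* Address field for a UTLB data array 1 access to the given entry. */
static inline uint32_t utlb_data_array1_address(unsigned entry)
{
    assert(entry < 64);
    return UTLB_DATA_ARRAY1_BASE | ((uint32_t)entry << 8);
}

/* Data field for the address array: VPN [31:10], D [9], V [8], ASID [7:0]. */
static inline uint32_t utlb_addr_array_data(uint32_t va, unsigned asid,
                                            int d, int v)
{
    return (va & 0xFFFFFC00u) | (d ? 1u << 9 : 0u) | (v ? 1u << 8 : 0u) |
           (asid & 0xFFu);
}

/* Data field for data array 1: PPN [28:10], V [8], SZ [7] and [4],
 * PR [6:5], C [3], D [2], SH [1], WT [0]. sz and pr are 2-bit values. */
static inline uint32_t utlb_data_array1_data(uint32_t pa, unsigned sz,
                                             unsigned pr, int v, int c,
                                             int d, int sh, int wt)
{
    return (pa & 0x1FFFFC00u) |
           (v ? 1u << 8 : 0u) |
           (((sz >> 1) & 1u) << 7) |   /* upper SZ bit (assumed) */
           ((pr & 3u) << 5) |
           ((sz & 1u) << 4) |          /* lower SZ bit (assumed) */
           (c ? 1u << 3 : 0u) | (d ? 1u << 2 : 0u) |
           (sh ? 1u << 1 : 0u) | (wt ? 1u : 0u);
}
```

All reserved bits are left at 0 in the composed values, in line with the requirement that reserved bits be written as 0.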

[Figure 17: Memory-mapped UTLB data array 1. Address field: bits [31:23] = 1111 0111 0, E (entry) in bits [13:8]. Data field: PPN (physical page number) in bits [28:10], V (validity bit) in bit [8], SZ (page size bits) in bits [7] and [4], PR (protection key data) in bits [6:5], C (cacheability bit) in bit [3], D (dirty bit) in bit [2], SH (share status bit) in bit [1], WT (write-through bit) in bit [0]. All other bits are reserved (0 write value, undefined read value).]

4 Caches

4.1 Overview

4.1.1 Features

Note: This chapter details both the SH4-103 and SH4-202 variants. Please refer to your datasheet for specific core details.

The SH-4 CPU core has an on-chip 8-kbyte instruction cache (IC) for instructions and 16-kbyte operand cache (OC) for data. Half of the memory of the operand cache (8 kbytes) can also be used as on-chip RAM. The features of these caches are summarized in Table 14.

The SH4-202 has an on-chip 16-kbyte instruction cache (IC) for instructions and 32-kbyte operand cache (OC) for data. Half of the operand cache (16 kbytes) can also be used as on-chip RAM. The features of these caches are summarized in Table 14 and Table 15.

The SH-4 CPU supports two 32-byte store queues (SQ) to perform high-speed writes to external memory. The features of the SQ are summarized in Table 16.

Item        Instruction cache   Operand cache
Capacity    8-kbyte cache       16-kbyte cache or 8-kbyte cache + 8-kbyte RAM
Type        Direct mapping      Direct mapping
Line size   32 bytes            32 bytes

Table 14: Cache features (SH4-103, SH4-202 in compatibility mode)

Item           Instruction cache   Operand cache
Entries        256                 512
Write method   -                   Copy-back/write-through selectable

Table 14: Cache features (SH4-103, SH4-202 in compatibility mode)

Item                Instruction cache       Operand cache
Capacity            16-kbyte cache          32-kbyte cache or 16-kbyte cache + 16-kbyte RAM
Type                2-way set associative   2-way set associative
Line size           32 bytes                32 bytes
Entries             256 entries/way         512 entries/way
Write method        -                       Copy-back/write-through selectable
Replace algorithm   LRU                     LRU

Table 15: Cache features (SH4-202 in the enhanced mode)

Item           Store queues
Capacity       2 × 32 bytes
Addresses      0xE000 0000 to 0xE3FF FFFF
Write          Store instruction
Write-back     Prefetch instruction
Access right   MMU off: according to MMUCR.SQMD MM