ARM1026EJ-S™
Revision: r0p2
Technical Reference Manual
Copyright © 2003 ARM Limited. All rights reserved.
ARM DDI 0244C

Release Information

Change history
Date                 Issue  Change
24 September, 2002   A      First release.
20 December, 2002    B      Second release. Updated for ARM1026EJ-S r0p1 processor.
20 June, 2003        C      Third release. Updated for ARM1026EJ-S r0p2 processor.

Proprietary Notice

Words and logos marked with ® or ™ are registered trademarks or trademarks of ARM Limited in the EU and other countries, except as otherwise stated below in this proprietary notice. Other brands and names mentioned herein may be the trademarks of their respective owners.

Neither the whole nor any part of the information contained in, or the product described in, this document may be adapted or reproduced in any material form except with the prior written permission of the copyright holder.

The product described in this document is subject to continuous developments and improvements. All particulars of the product and its use contained in this document are given by ARM Limited in good faith. However, all warranties implied or expressed, including but not limited to implied warranties of merchantability, or fitness for purpose, are excluded.

This document is intended only to assist the reader in the use of the product. ARM Limited shall not be liable for any loss or damage arising from the use of any information in this document, or any error or omission in such information, or any incorrect use of the product.

Confidentiality Status

This document is Open Access. It has no restriction on distribution.

Product Status

The information in this document is final (information on a developed product).

Web Address

http://www.arm.com
Contents

Preface
    About this document
    Feedback

Chapter 1 Introduction
    1.1 About the processor
    1.2 Components of the processor
    1.3 Silicon revision information

Chapter 2 Integer Core
    2.1 About the integer core
    2.2 Pipeline
    2.3 Prefetch unit
    2.4 Typical ALU/multiply operations
    2.5 Load/store unit
    2.6 Typical load/store operations

Chapter 3 Programmer's Model
    3.1 About the programmer's model
    3.2 Program status registers
    3.3 About the CP15 system control coprocessor registers
    3.4 CP15 register descriptions
    3.5 CP15 instruction summary

Chapter 4 Clocking and Reset Timing
    4.1 About clock and reset signals
    4.2 Clock interfaces
    4.3 Reset

Chapter 5 Prefetch Unit
    5.1 About the prefetch unit
    5.2 Branch prediction activity
    5.3 Branch instruction cycle summary
    5.4 Instruction memory barriers

Chapter 6 Bus Interface
    6.1 About the bus interface
    6.2 Bus transfer characteristics
    6.3 Bus transfer cycle timing
    6.4 Topology
    6.5 Endianness of BIU transfers
    6.6 64-bit and 32-bit AHB data buses

Chapter 7 Coprocessor Interface
    7.1 About the coprocessor interface
    7.2 Coprocessor interface signals
    7.3 Design considerations
    7.4 Parallel execution
    7.5 Rules for the interface
    7.6 Pipeline signal assertion
    7.7 Instruction issue
    7.8 Hold signals
    7.9 Instruction cancelation
    7.10 Bounced instructions
    7.11 Data buses

Chapter 8 Debug
    8.1 About the debug unit
    8.2 Register descriptions
    8.3 Software lockout function
    8.4 Halt mode
    8.5 Monitor mode
    8.6 Values in the link register after exceptions
    8.7 Comms channel

Chapter 9 Debug Test Access Port
    9.1 Debug test access port and halt mode
    9.2 DBGTAP instructions
    9.3 Scan chain descriptions

Chapter 10 Memory Management Unit
    10.1 About the MMU
    10.2 MMU software-accessible registers
    10.3 Address translation
    10.4 MMU memory access control
    10.5 MMU cachable and bufferable information
    10.6 MMU and pending write buffer
    10.7 Fault checking sequence
    10.8 Fault priority
    10.9 MMU aborts and external aborts
    10.10 Memory parity

Chapter 11 Memory Protection Unit
    11.1 About the MPU
    11.2 MPU software-accessible registers
    11.3 Configuring the MPU
    11.4 Overlapping protection regions
    11.5 Fault priority
    11.6 MPU aborts and external aborts

Chapter 12 Caches
    12.1 About the caches
    12.2 Enabling the caches
    12.3 Cache and TCM access priorities
    12.4 Cache MVA and set/way formats
    12.5 Cache size support
    12.6 Cache support for external aborts
    12.7 Castout functionality, DCache only
    12.8 Cache support for MBIST
    12.9 Cache memory parity
    12.10 Code examples of CP15 cache operations

Chapter 13 Pending Write Buffer
    13.1 About the pending write buffer
    13.2 External aborts

Chapter 14 Interrupt Latency
    14.1 About interrupt latency
    14.2 Worst-case interrupt latency
    14.3 Tuning interrupt latency

Chapter 15 Noncachable Instruction Fetches
    15.1 About noncachable instruction fetches
    15.2 External aborts

Chapter 16 External Aborts
    16.1 About external aborts
    16.2 External abort reporting
    16.3 External abort rules of conduct

Chapter 17 Tightly-Coupled Memories
    17.1 About the tightly-coupled memories
    17.2 Programming the TCM
    17.3 Interface timing
    17.4 TCM parity

Chapter 18 Vectored Interrupt Controller Port
    18.1 About vectored interrupt controllers
    18.2 About the VIC port
    18.3 Timing of the VIC port

Chapter 19 Power Management
    19.1 About power management
    19.2 Wait for interrupt mode
    19.3 Leakage control

Chapter 20 Design for Test
    20.1 ARM1026EJ-S processor
    20.2 Test signal connections
    20.3 MBIST

Chapter 21 Instruction Cycle Count
    21.1 Cycle timing considerations
    21.2 Instruction cycle counts
    21.3 Interlocks

Appendix A Signal Descriptions
    A.1 AHB signals in normal mode
    A.2 Coprocessor signals
    A.3 Debug interface signals
    A.4 DFT signals
    A.5 MBIST signals
    A.6 ETM signals
    A.7 TCM signals
    A.8 Interrupt signals
    A.9 Memory parity signals
    A.10 Other signals

Glossary

Index

List of Tables

Change history
Register notation conventions
Table 3-1  CP15 register summary
Table 3-2  Address types
Table 3-3  Encoding of the Device ID Register
Table 3-4  Encoding of the Cache Type Register
Table 3-5  Encoding of the TCM Status Register
Table 3-6  Control Register instructions
Table 3-7  Encoding of the Control Register
Table 3-8  Effects of Control Register on caches
Table 3-9  Effects of Control Register on TCM interface
Table 3-10  Encoding of the Auxiliary Control Register
Table 3-11  Translation Table Base Register instructions
Table 3-12  Encoding of the Translation Table Base Register
Table 3-13  L2C and L2B encoding
Table 3-14  DCache and ICache Configuration Register instructions
Table 3-15  Encoding of the DCache and ICache Configuration Registers
Table 3-16  Domain Access Control Register instructions
Table 3-17  Encoding of the Domain Access Control Register
Table 3-18  Access permission summary when using the MMU
Table 3-19  Write Buffer Control Register instructions
Table 3-20  Encoding of the Write Buffer Control Register
Table 3-21  Data and Instruction Fault Status Register instructions
Table 3-22  Encoding of the Data and Instruction Fault Status Registers
Table 3-23  MMU and MPU faults
Table 3-24  DEAPR and IEAPR instructions
Table 3-25  Encoding of the DEAPR and IEAPR
Table 3-26  Encoding of the extended access permission bit fields
Table 3-27  DSAPR and ISAPR instructions
Table 3-28  Encoding of the DSAPR and ISAPR
Table 3-29  Encoding of the standard access permission bit fields
Table 3-30  DFAR and IFAR instructions
Table 3-31  Protection Region Registers instructions
Table 3-32  Encoding of the Protection Region Registers
Table 3-33  Cache operation instructions
Table 3-34  Encoding of the cache operations bit fields in MVA format
Table 3-35  Encoding of the cache operation bit fields in set/way format
Table 3-36  TLB operation instructions
Table 3-37  Encoding of the invalidate single TLB entry bit fields
Table 3-38  DCache and ICache Lockdown Register instructions
Table 3-39  Encoding of the DCache and ICache Lockdown Registers
Table 3-40  DTCM and ITCM Region Register instructions
Table 3-41  Encoding of the DTCM and ITCM Region Registers
Table 3-42  TLB Lockdown Register instructions
Table 3-43  Encoding of the TLB Lockdown Register
Table 3-44  FCSE Process ID Register instructions
Table 3-45  Encoding of the FCSE Process ID Register
Table 3-46  Context ID Register instructions
Table 3-47  Debug Override Register instructions
Table 3-48  Encoding of the Debug Override Register
Table 3-49  Prefetch Unit Debug Override Register instructions
Table 3-50  Encoding of the Prefetch Unit Override Register
Table 3-51  Debug and Test Address Register instructions
Table 3-52  Memory Region Remap Register instructions
Table 3-53  Encoding of the Memory Region Remap Register
Table 3-54  Encoding of the remap fields
Table 3-55  MMU test operation instructions
Table 3-56  Encoding of the main TLB entry-select bit fields
Table 3-57  Encoding of the TLB MVA tag bit fields
Table 3-58  Encoding of the TLB entry PA and AP bit fields
Table 3-59  Encoding of the lockdown TLB entry-select bit fields
Table 3-60  Cache Debug Control Register instructions
Table 3-61  Encoding of the Cache Debug Control Register
Table 3-62  MMU Debug Control Register instructions
Table 3-63  Encoding of the MMU Debug Control Register
Table 3-64  CP15 instruction summary
Table 5-1  Penalty for a mispredicted branch
Table 5-2  ARM and Thumb branch instruction cycle counts
Table 6-1  DBIU transfer characteristics
Table 6-2  IBIU transfer characteristics
Table 6-3  Definition of variables in cache linefills with 64-bit interface
Table 6-4  Symbols used in linefill cycle counts with 64-bit AHB
Table 6-5  Definition of variables in cache linefills with 32-bit interface
Table 6-6  Symbols used in linefill cycle counts with a 32-bit AHB
Table 6-7  Definition of variables in castouts
Table 6-8  Symbols used in linefill cycle counts with 64-bit AHB
Table 6-9  Definition of variables in level 1 and level 2 table walks
Table 6-10  Symbols used in level 1 and level 2 table walk cycle counts
Table 6-11  Definition of variables in NC loads and NCNB stores
Table 6-12  Symbols used in NC load and NCNB store cycle counts
Table 7-1  Pipeline stages and active signals
Table 7-2  CPINSTR interactions with other signals
Table 7-3  CPINSTRV interactions with other signals
Table 7-4  CPVALIDD interactions with other signals
Table 7-5  CPLSLEN interactions with other signals
Table 7-6  CPLSSWP interactions with other signals
Table 7-7  CPLSDBL interactions with other signals
Table 7-8  Hold signals summary
Table 7-9  ASTOPCPD interactions with other signals
Table 7-10  ASTOPCPE interactions with other signals
Table 7-11  LSHOLDCPE interactions with other signals
Table 7-12  LSHOLDCPM interactions with other signals
Table 7-13  CPBUSYE interactions with other signals
Table 7-14  CPLSBUSY interactions with other signals
Table 7-15  ACANCELCP interactions with other signals
Table 7-16  AFLUSHCP interactions with other signals
Table 7-17  CPBOUNCEE interactions with other signals
Table 7-18  STCMRCDATA interactions with signals
Table 7-19  LDCMRCDATA interactions with signals
Table 8-1  CP14 registers and scan chain numbers
Table 8-2  Debug ID Register instructions
Table 8-3  Encoding of the Debug ID Register
Table 8-4  Debug Status and Control Register instructions
Table 8-5  Encoding of Debug Status and Control Register
Table 8-6  DSCR bits from the core
Table 8-7  Data Transfer Register instructions
Table 8-8  Breakpoint Address Register instructions
Table 8-9  Breakpoint Control Register instructions
Table 8-10  Encoding of Breakpoint Control Registers
Table 8-11  Watchpoint Address Register instructions
Table 8-12  Watchpoint Control Register instructions
Table 8-13  Encoding of Watchpoint Control Registers
Table 8-14  Read PC value after debug state entry
Table 8-15  Link register values after exceptions
Table 9-1  Supported public JTAG instructions
Table 10-1  CP15 MMU registers
Table 10-2  Access type encoding in a level 1 descriptor
Table 10-3  Access type encoding in a coarse page table descriptor
Table 10-4  Access type encoding in a fine page table descriptor
Table 10-5  Domain access encoding
Table 10-6  MMU memory access control
Table 10-7  C and B bit access control
Table 10-8  MMU faults
Table 10-9  MMU TLB parity interfaces
Table 11-1  CP15 MPU registers
Table 11-2  MPU faults
Table 12-1  Enabling the ICache with the processor configured for MMU operation
Table 12-2  Enabling the ICache with the processor configured for MPU operation
Table 12-3  Enabling the DCache with the processor configured for MMU operation
Table 12-4  Enabling the DCache with the processor configured for MPU operation
Table 12-5  Enabling data caching and buffering with the C and B bits
Table 12-6  Priorities of instruction accesses to the TCMs and caches
Table 12-7  Priorities of data accesses to the TCMs and caches
Table 12-8  Cache size and number of sets
Table 12-9  ICache and DCache size configurations
Table 12-10  Aborts on linefills and castouts
Table 12-11  ICache parity interfaces
Table 12-12  DCache parity interfaces
Table 14-1  Worst-case interrupt latency cycle count
Table 14-2  Tuning interrupt latency with a 1:1 HCLK-to-CLK ratio
Table 14-3  Tuning interrupt latency with a 4:1 HCLK-to-CLK ratio
Table 14-4  LDM restricted to nine registers
Table 14-5  TLB locking and write-through caches
Table 14-6  LDM restricted to nine registers, TLB locking, and write-through caches
Table 16-1  External abort summary
Table 17-1  ITCM initialization
Table 17-2  TCM mapping of chip select and byte enable mapping
Table 17-3  ITCM parity interface
Table 17-4  DTCM parity interface
Table 18-1  VIC port signals
Table 20-1  Selecting mode of operation of dedicated wrapper cells
Table 20-2  Wrapper scan chains
Table 20-3  Test port signals during internal test
Table 20-4  Test port connections in internal test mode
Table 20-5  Test port connections in functional mode
Table 20-6  Test port connections in external test mode
Table 20-7  MBIST interface in test mode
Table 20-8  MBISTTX external interface
Table 20-9  MBISTRXCGR[2:0] and MBISTRXTCM[2:0] external interface
Table 20-10  Memory test interface cycle counts
Table 20-11  Scanout formats of fail data
Table 20-12  Array enables
Table 21-1  Subcategories of data processing instructions
Table 21-2  Cycle counts of data processing instructions
Table 21-3  Cycle counts of multiply instructions
Table 21-4  Cycle counts of branch instructions
Table 21-5  Cycle counts of MRS and MSR instructions
Table 21-6  Cycle counts of load instructions
Table 21-7  Cycle counts of store instructions
Table 21-8  Cycle counts of load multiple and store multiple instructions
Table 21-9  Cycle counts of preload instructions
Table 21-10  Cycle counts of coprocessor instructions
Table 21-11  Cycle counts of swap instructions
Table 21-12  Cycle counts of Thumb data processing instructions
Table 21-13  Cycle count of the Thumb multiply instruction
Table 21-14  Cycle counts of Thumb branch instructions
Table 21-15  Cycle counts of Thumb load instructions
Table 21-16  Cycle counts of Thumb store instruction
Table 21-17  Cycle counts of Thumb load/store multiple instructions
Table A-1  AHB signals
Table A-2  Coprocessor signals
Table A-3  Debug interface signals
Table A-4  DFT signals
Table A-5  MBIST signals
Table A-6  ETM signals
Table A-7  TCM signals
Table A-8  Interrupt signals
Table A-9  Memory parity signals
Table A-10  Other signals

List of Figures

Key to timing diagram conventions
Figure 1-1  ARM1026EJ-S processor block diagram
Figure 2-1  Integer core block diagram
Figure 2-2  Pipeline stages of the ARM1026EJ-S processor
Figure 2-3  Pipeline stages of a typical ALU operation
Figure 2-4  Pipeline stages of a typical multiply operation
Figure 2-5  Pipeline stages of a load or store operation
Figure 2-6  Pipeline stages of a load multiple or store multiple operation
Figure 3-1  Program Status Registers
Figure 3-2  CP15 MCR and MRC instruction format
Figure 3-3  Device ID Register
Figure 3-4  Cache Type Register
Figure 3-5  TCM Status Register
Figure 3-6  Control Register
Figure 3-7  Auxiliary Control Register
Figure 3-8  Translation Table Base Register
Figure 3-9  DCache and ICache Configuration Registers
Figure 3-10  Domain Access Control Register
Figure 3-11  Write Buffer Control Register
Figure 3-12  Data and Instruction Fault Status Registers
Figure 3-13  Data and Instruction Extended Access Permission Registers
Figure 3-14  Data and Instruction Standard Access Permission Registers
Figure 3-15  Data and Instruction Fault Address Registers
Figure 3-16  Protection Region Registers 0-7
Figure 3-17  Rd format for cache operations in MVA format
Figure 3-18  Rd format for cache operations in set/way format
Figure 3-19  Rd format for invalidate single TLB entry operations
Figure 3-20  DCache and ICache Lockdown Registers
Figure 3-21  DTCM and ITCM Region Registers
Figure 3-22  TLB Lockdown Register
Figure 3-23  FCSE Process ID Register
Figure 3-24  FCSE address mapping
Figure 3-25  Context ID Register
Figure 3-26  Debug Override Register
Figure 3-27  Prefetch Unit Debug Override Register
Figure 3-28  Debug and Test Address Register
Figure 3-29  Memory Region Remap Register
Figure 3-30  Memory region attribute resolution
Figure 3-31  Rd format for selecting main TLB entry
Figure 3-32  Rd format for accessing MVA tag of main or lockdown TLB entry
Figure 3-33  Rd format for accessing PA and AP data of main or lockdown TLB entry
Figure 3-34  Rd format for selecting lockdown TLB entry
Figure 3-35  Cache Debug Control Register
Figure 3-36  MMU Debug Control Register
Figure 4-1  HCLK derivation
Figure 4-2  TCK derivation
Figure 4-3  HRESETn assertion
Figure 6-1  Cache linefill cycle count with 64-bit AHB
Figure 6-2  Cache linefill cycle count with 32-bit AHB
Figure 6-3  Cache castout cycle count with 64-bit AHB interface
Figure 6-4  Cache castout cycle count with 32-bit AHB interface
Figure 6-5  Level 1 and level 2 table walk cycle count
Figure 6-6  Cycle count of NC loads and NCNB stores with one data phase
Figure 6-7  Cycle count of NC loads and NCNB stores with two data phases
Figure 6-8  Bus interface block diagram
Figure 6-9  Endianness of byte lane strobes
Figure 6-10  AHB bus alignment
Figure 7-1  ARM1026EJ-S and CP pipeline stages
Figure 7-2  ARM1026EJ-S coprocessor inputs
Figure 7-3  Instruction issue example
Figure 7-4  ASTOPCPD example
Figure 7-5  ASTOPCPE example
Figure 7-6  LSHOLDCPE example
Figure 7-7  LSHOLDCPM example
Figure 7-8  CPBUSYE example
Figure 7-9  CPBUSYE ignored due to ASTOPCPD assertion
Figure 7-10  CPBUSYE asserted before ASTOPCPD
Figure 7-11  ASTOPCPD with CPBUSYE
Figure 7-12  CPBUSYE ignored due to ASTOPCPE assertion
Figure 7-13  CPBUSYE asserted before ASTOPCPE
Figure 7-14  I2 held up by ASTOPCPE and CPBUSYE
Figure 7-15  I1 held up by ASTOPCPE and I2 held up by CPBUSYE
Figure 7-16  I1 held up by CPBUSYE and I2 held up by ASTOPCPD
Figure 7-17  ACANCELCP example
Figure 7-18  ACANCELCP with ASTOPCPE example
Figure 7-19  ACANCELCP with CPBUSYE example
Figure 7-20  AFLUSHCP example
Figure 7-21  CPBOUNCEE example
Figure 7-22  CPBOUNCEE with ASTOPCPE example
Figure 7-23  CPBOUNCEE with CPBUSYE example
Figure 8-1  Debug ID Register
Figure 8-2  Debug Status and Control Register
Figure 8-3  Data Transfer Register
Figure 8-4  Breakpoint Address Registers
Figure 8-5  Breakpoint Control Registers
Figure 8-6  Watchpoint Address Registers
Figure 8-7  Watchpoint Control Registers
Figure 8-8  Comms channel output
Figure 9-1  JTAG DBGTAP state diagram
Figure 9-2  Bypass Register bit order
Figure 9-3  TAP ID Register
Figure 9-4  TAP ID Register bit order
Figure 9-5  Instruction Register bit order
Figure 9-6  Scan Chain Select Register bit order
Figure 9-7  Scan chain 0 bit order
Figure 9-8  Scan chain 1 bit order
Figure 9-9  Scan chain 2 bit order
Figure 9-10  Scan chain 4 bit order
Figure 9-11  Scan chain 5 bit order
Figure 10-1  Address translation
Figure 10-2  Translating a level 1 descriptor address
Figure 10-3  Level 1 descriptor formats
Figure 10-4  Translating a section base address
Figure 10-5  Level 2 descriptor formats
Figure 10-6  Translating a coarse page table address
Figure 10-7  Translating a large page or subpage address from a coarse page table
Figure 10-8  Translating a small page or subpage address from a coarse page table
Figure 10-9  Translating a fine page table address
Figure 10-10  Translating a large page or subpage address from a fine page table
Figure 10-11  Translating a small page or subpage address from a fine page table
Figure 10-12  Translating a tiny page address
Figure 10-13  Fault checking flowchart
Figure 11-1  MPU block diagram
Figure 11-2  Overlapping protection regions
Figure 12-1  Cache read block diagram
Figure 17-1  TCM interface timing
Figure 17-2  TCM controller and DMA arbitration state diagram
Figure 17-3  TCM reads with zero wait states
Figure 17-4  TCM reads with one wait state
Figure 17-5  TCM reads with four wait states
Figure 17-6  TCM writes with zero wait states
Figure 17-7  TCM writes with one wait state
Figure 17-8  TCM writes with two wait states
Figure 17-9  TCM reads and writes with wait states of varying length
Figure 17-10  TCM and DMA interaction
Figure 18-1  VIC port timing example with HCLK:CLK = 1:1
Figure 18-2  VIC port timing example with HCLK:CLK = 2:1
Figure 19-1  Using STANDBYWFI to control system clocks
Figure 19-2  Deassertion of STANDBYWFI after an IRQ interrupt
Figure 19-3  Using STANDBYWFI to control ARM1026EJ-S clocks
Figure 19-4  Cache power-down
Figure 20-1  Dedicated input wrapper cell
Figure 20-2  Dedicated output wrapper cell
Figure 20-3  Shared input wrapper cell
Figure 20-4  Shared output wrapper cell
Figure 20-5  Wrapper segments
Figure 20-6  HWDATA bus output ports
Figure 20-7  HRDATA bus input ports
Figure 20-8  Wrapper falling-edge logic
Figure 20-9  Reset synchronizer
Figure 20-10  RSTSAFE signal
Figure 20-11  Reset wrapper cell
Figure 20-12  MBIST block diagram
Figure 20-13  ATPG view of read datapath
Figure 20-14  Chip-select implementation example
Figure 20-15  Data RAM MBIST arrays
Figure 20-16  Instruction RAM MBIST arrays
Figure 20-17  MMU RAM MBIST array
Figure 20-18  TCM MBIST array
Figure 20-19  MBIST Instruction Register
Figure 20-20  MBIST test start waveforms
Figure 20-21  MBIST test end waveforms
Figure 21-1  Pipeline forwarding paths

Preface

This preface introduces the ARM1026EJ-S r0p2 Technical Reference Manual. It contains the following sections:
· About this document
· Feedback.

About this document

This is the technical reference manual for the ARM1026EJ-S r0p2 processor.

Intended audience

This document is written to help designers develop systems around the ARM1026EJ-S processor.

Using this document

This document is organized into the following chapters:
Chapter 1 Introduction
    Learn about the features and components of the ARM1026EJ-S processor.
Chapter 2 Integer Core
    Learn how overlapping pipeline stages and simultaneous execution of instructions achieve a peak throughput of one instruction per cycle.
Chapter 3 Programmer's Model
    Learn how to use CP15 registers to configure, control, and monitor the ARM1026EJ-S system.
Chapter 4 Clocking and Reset Timing
    Learn about the clock signals and clock enable signals that control the ARM1026EJ-S integer unit and the AHB and JTAG interfaces.
Chapter 5 Prefetch Unit
    Learn how the ARM1026EJ-S processor prefetches and buffers instructions, predicts branches and subroutine calls and returns, and how instruction memory barriers flush the prefetch buffer.
Chapter 6 Bus Interface
    Learn how the separate instruction and data bus interfaces handle AMBA™ transfers.
Chapter 7 Coprocessor Interface
    Learn how multiple coprocessors interact with the ARM1026EJ-S processor.
xviii Copyright © 2003 ARM Limited. All rights reserved. ARM DDI 0244C 0244C Preface Chapter 8 Debug Learn about the ARM1026EJ-S ARM1026EJ-S debug functionality. Chapter 9 Debug Test Access Port Learn about the JTAG-based ARM1026EJ-S ARM1026EJ-S Debug Test Access Port (DBGTAP). Chapter 10 Memory Management Unit Learn how the MMU translates modified virtual addresses to physical addresses and controls access to external memory. Chapter 11 Memory Protection Unit Learn to partition external memory into protection regions with different sizes and access attributes. Chapter 12 Caches Learn about cache structure and operation, including CP15 cache operations and cache and TCM priorities. Chapter 13 Pending Write Buffer Learn about the programmable eight-entry buffer for loads and stores and the parallel eviction buffer. Chapter 14 Interrupt Latency Learn to calculate latency from a worst-case example and to use techniques for improving latency. Chapter 15 Noncachable Instruction Fetches Learn how to use the noncachable instruction prefetch buffer to support speculative prefetching and instruction streaming. Chapter 16 External Aborts Learn how the ARM1026EJ-S ARM1026EJ-S processor handles and reports precise and imprecise aborts on critical and noncritical words. Chapter 17 Tightly-Coupled Memories Learn to initialize and operate the ITCM and DTCM and see examples of the timing of TCM transactions. Chapter 18 Vectored Interrupt Controller Port Learn how to connect an external VIC and to enable the ARM1026EJ-S ARM1026EJ-S processor to read IRQ address vectors from the VIC port. ARM DDI 0244C 0244C Copyright © 2003 ARM Limited. All rights reserved. xix Preface Chapter 19 Power Management Learn to use dynamic power management to idle all external interfaces and static power management to turn off cache and MMU RAMs. Chapter 20 Design for Test Learn to integrate the ARM1026EJ-S ARM1026EJ-S DFT and MBIST features into an SoC. 
Chapter 21 Instruction Cycle Count Learn the cycle-by-cycle behavior of the ARM and ThumbTM instruction sets. Appendix A Signal Descriptions Refer to Appendix A for a summary of ARM1026EJ-S ARM1026EJ-S processor signals. Product revision status The rnpn identifier indicates the revision status of the product described in this document, where: rn Identifies the major revision of the product. pn Identifies the minor revision or modification status of the product. Typographical conventions The following typographical conventions are used in this book: italic bold Denotes signal names. Also used for terms in descriptive lists, where appropriate. monospace Denotes text that can be entered at the keyboard, such as commands, file and program names, and source code. monospace Denotes a permitted abbreviation for a command or option. The underlined text can be entered instead of the full command or option name. monospace italic Denotes arguments to commands and functions where the argument is to be replaced by a specific value. monospace bold xx Introduces special terminology. Also denotes cross-references. Denotes language keywords when used outside example code. Copyright © 2003 ARM Limited. All rights reserved. ARM DDI 0244C 0244C Preface Timing diagram conventions The figure explains the symbols used in timing diagrams. Any variations are clearly labeled when they occur. Therefore, you must attach no additional meaning unless specifically stated. Clock HIGH to LOW Transient HIGH/LOW to HIGH Bus stable Bus to high impedance Bus change High impedance to stable bus Key to timing diagram conventions Shaded bus and signal areas are undefined, so the bus or signal can assume any value within the shaded area at that time. The actual level is unimportant and does not affect normal operation. ARM DDI 0244C 0244C Copyright © 2003 ARM Limited. All rights reserved. xxi Preface Register notation conventions The table shows the terms and abbreviations used in register descriptions. 
In all cases, reading or writing any fields, including those specified as Unpredictable, Should Be One, or Should Be Zero, does not cause any physical damage to the chip.

Register notation conventions
Term                   Description
Unpredictable (UNP)    Reading returns an Unpredictable value. Writing causes Unpredictable behavior or an Unpredictable change in device configuration.
Undefined (UND)        An instruction that accesses this field in the manner indicated takes the Undefined instruction trap.
Should Be Zero (SBZ)   When writing to this field, write only zeros. Writing ones has Unpredictable results.
Should Be One (SBO)    When writing to this field, write only ones. Writing zeros has Unpredictable results.

Further reading

This section lists publications by ARM Limited and by third parties. ARM periodically provides updates and corrections to its documentation. See http://www.arm.com for current errata sheets and addenda, and the ARM Frequently Asked Questions list.

ARM publications

This document contains information that is specific to the ARM1026EJ-S processor. Refer to the following documents for other relevant information:
· ARM Architecture Reference Manual (ARM DDI 0100)
· ARM AMBA Specification (ARM IHI 0001)
· ARM102600E Test Chip Implementation Guide (ARM DXI 0143)
· ARM VFP10 Technical Reference Manual (ARM DDI 0106)
· ARM ETM10RV Technical Reference Manual (ARM DDI 0245)
· Jazelle VI Architecture Reference Manual (ARM DDI 0225).

Other publications

This section lists relevant documents published by third parties:
· IEEE Standard, Test Access Port and Boundary-Scan Architecture specification 1149.1-1990 (JTAG).

Feedback

ARM Limited welcomes feedback on both the ARM1026EJ-S processor and its documentation.
Feedback on the ARM1026EJ-S processor

If you have any comments or suggestions about this product, contact your supplier giving:
· the product name
· a concise explanation of your comments.

Feedback on this document

If you have any comments on this document, send email to errata@arm.com giving:
· the document title
· the document number
· the page number(s) to which your comments refer
· a concise explanation of your comments.
General suggestions for additions and improvements are also welcome.

Chapter 1 Introduction

This chapter describes the components and features of the ARM1026EJ-S processor. It contains the following sections:
· About the processor
· Components of the processor
· Silicon revision information.

1.1 About the processor

The ARM1026EJ-S processor is a member of the ARM10 family and implements the ARMv5TEJ architecture. It is a high-performance, low-power, cached processor that provides full virtual memory capabilities. It is designed to run high-end embedded applications and sophisticated operating systems such as Linux, Microsoft WindowsCE, NetBSD, and EPOC-32 from Symbian. It supports the 32-bit ARM, 16-bit Thumb®, and 8-bit Jazelle™ instruction sets.
The synthesizable ARM1026EJ-S processor consists of:
· the ARM10EJ-S integer core
  - prefetch unit
  - integer unit
  - load/store unit
  - EmbeddedICE-RT™ logic for JTAG-based debug
· CP14 debug coprocessor and CP15 system control coprocessor
· external coprocessor interface for application-specific acceleration hardware
· Memory Management Unit (MMU) or Memory Protection Unit (MPU)
· separate ICache and DCache configurable to 0KB or 4KB-128KB sizes
· Tightly Coupled Memory (TCM) interface with:
  - separate externally-instantiated instruction and data TCMs configurable to 0KB or 4KB-1MB sizes
  - zero-wait-state memory support
  - DMA support
· write-back Physical Address (PA) TAG RAM
· pending write buffer
· separate Advanced Micro Bus Architecture (AMBA) High-performance Bus (AHB) instruction and data bus interfaces with independently configurable 32-bit or 64-bit widths
· Embedded Trace Macrocell (ETM) interface
· Vectored Interrupt Controller (VIC) port.

Features of the ARM1026EJ-S processor include:
· a six-stage pipeline
· branch prediction that supports branch folding (zero-cycle branches)
· full 64-bit interfaces between the integer core and:
  - caches
  - pending write buffer
  - bus interface unit instruction side and data side
  - coprocessors
· multilayer AHB support through independent 32-bit or 64-bit AHB interfaces for instruction and data sides
· power management support
· enhanced debug support.

See the ARM Architecture Reference Manual for a detailed ARM1026EJ-S instruction set specification.
1.2 Components of the processor

The main blocks of the ARM1026EJ-S processor are:
· Integer core
· Memory management unit
· Memory protection unit
· Instruction and data caches and pending write buffer
· Instruction and data TCMs
· Branch prediction and prefetch unit
· AMBA interface
· Coprocessor interface
· Debug
· Instruction cycle summary and interlocks
· Design-for-test features
· Power management
· Clocking and reset
· ETM interface logic.

Figure 1-1 shows the structure of the ARM1026EJ-S processor.

[Figure 1-1 ARM1026EJ-S processor block diagram. Blocks shown: ARM10EJ-S integer core, FCSE, MMU or MPU (main TLB and lockdown TLB, with MPU entries in the lockdown TLB), ICache, DCache, PA tag RAM, pending write buffer, eviction write buffer, linefill buffer, TCM interface (IRAM and DRAM), external coprocessor interface, ETM interface, and the instruction and data AHB interfaces (64/32).]

1.2.1 Integer core

The ARM1026EJ-S processor is built around the ARM10EJ-S integer core in an ARMv5TEJ implementation that runs the 32-bit ARM, 16-bit Thumb, and 8-bit Jazelle instruction sets. You can balance high performance against code size and extract maximum performance from 8-bit, 16-bit, and 32-bit memory.
The processor contains EmbeddedICE-RT logic and a JTAG debug interface to enable hardware debuggers to communicate with the processor. See Chapter 2 Integer Core for details of the pipeline stages and instruction progression. See Chapter 3 Programmer's Model for system coprocessor programming information.

1.2.2 Memory management unit

The Memory Management Unit (MMU) has a single Translation Lookaside Buffer (TLB) for both instructions and data. The MMU includes a 1KB tiny page mapping size to enable a smaller RAM and ROM footprint for embedded systems and operating systems such as WindowsCE that have many small mapped objects. The ARM1026EJ-S processor implements the Fast Context Switch Extension (FCSE) and high vectors extension that are required to run Microsoft WindowsCE. See Chapter 10 Memory Management Unit for more information.

Enable the MMU by tying the MMUnMPU pin HIGH.

1.2.3 Memory protection unit

The Memory Protection Unit (MPU) enables you to partition external memory into eight protection regions. The protection regions can have different sizes and protection attributes.

Enable the MPU by tying the MMUnMPU pin LOW.

1.2.4 Instruction and data caches and pending write buffer

The ARM1026EJ-S Instruction Cache (ICache) and Data Cache (DCache) are configurable to 0KB or 4KB-128KB in powers of two. The DCache regions are individually programmable for Write-Through (WT) or Write-Back (WB) operation. Configuring large caches enables you to obtain high performance from memory systems by reducing:
· the read bandwidth required of main memory
· the write bandwidth required of main memory when write-back caching is used
· overall system power consumption by reducing accesses to off-chip memory.

The ARM1026EJ-S pending write buffer holds up to eight 8-bit, 16-bit, 32-bit, or 64-bit values, each at an independent or sequential address.
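The cache size rule stated above (0KB, or a power of two from 4KB to 128KB) can be expressed as a simple check. The following is only an illustrative sketch: the function and constant names are hypothetical and are not part of any ARM tool or of the processor configuration flow.

```python
KB = 1024

# Limits taken from the text: 0KB, or 4KB-128KB in powers of two.
MIN_CACHE_BYTES = 4 * KB
MAX_CACHE_BYTES = 128 * KB


def is_valid_cache_size(size_bytes: int) -> bool:
    """Return True if size_bytes is a permitted ICache or DCache size."""
    if size_bytes == 0:
        return True  # a cache can be configured out entirely
    is_power_of_two = size_bytes > 0 and (size_bytes & (size_bytes - 1)) == 0
    return is_power_of_two and MIN_CACHE_BYTES <= size_bytes <= MAX_CACHE_BYTES


def valid_cache_sizes() -> list:
    """List every permitted cache size in bytes, smallest first."""
    sizes = [0]
    size = MIN_CACHE_BYTES
    while size <= MAX_CACHE_BYTES:
        sizes.append(size)
        size *= 2
    return sizes


if __name__ == "__main__":
    print([s // KB for s in valid_cache_sizes()])
```

A 12KB cache, for example, is rejected because it is not a power of two, even though it lies inside the 4KB-128KB range.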
See Chapter 12 Caches and Chapter 13 Pending Write Buffer for more information.

1.2.5 Instruction and data TCMs

You can individually configure the Instruction TCM (ITCM) and Data TCM (DTCM) to sizes of 0KB or 4KB-1MB, located anywhere in the memory map. For flexibility in optimizing the TCM subsystem for performance, power, and RAM type, the TCMs are external to the ARM1026EJ-S processor. The INITRAM pin enables booting from the ITCM. Both the ITCM and DTCM support wait states and DMA activity. See Chapter 17 Tightly-Coupled Memories for more information.

1.2.6 Branch prediction and prefetch unit

The prefetch unit is part of the ARM10EJ-S integer core. It fetches instructions from the ICache, ITCM, or from external memory and predicts the outcome of branches in the instruction stream. Refer to Chapter 5 Prefetch Unit for more information.

1.2.7 AMBA interface

The bus interface unit provides a multimaster AHB interface to memory and peripherals. The AHB is an on-chip multilayer bus with configurable 32-bit or 64-bit data buses.

On the data side, the address bus is 32 bits wide, and the data buses are configurable as:
· a 64-bit read data bus plus a 64-bit write data bus
· a 32-bit read data bus plus a 32-bit write data bus.

On the instruction side, the address bus is 32 bits wide, and the read data bus is configurable to 32 or 64 bits. See Chapter 6 Bus Interface for more information.

1.2.8 Coprocessor interface

Chapter 7 Coprocessor Interface describes the interface for on-chip coprocessors such as floating-point or other application-specific hardware acceleration units.

1.2.9 Debug

The debug coprocessor, CP14, implements a full range of debug features described in Chapter 8 Debug and Chapter 9 Debug Test Access Port.
1.2.10 Instruction cycle summary and interlocks

Chapter 21 Instruction Cycle Count describes instruction cycles and gives examples of interlock timing.

1.2.11 Design-for-test features

The ARM1026EJ-S processor is designed to be embedded into large System-on-a-Chip (SoC) designs. The EmbeddedICE-RT logic debug facilities, AMBA on-chip system bus, and test methodology are all designed for efficient use of the processor when integrated into a larger IC. See Chapter 20 Design for Test for details of testing.

1.2.12 Power management

Power management features are described in Chapter 19 Power Management.

1.2.13 Clocking and reset

The ARM1026EJ-S processor has one clock input, CLK. The design is fully static. When CLK is stopped, the internal state of the processor is preserved indefinitely.

CLK drives the internal logic in the processor and both AHB interfaces. To enable the data and instruction interfaces of the AHB to run at synchronous multiples of CLK, the AHB interfaces have separate clock enable signals, HCLKEND and HCLKENI. See Chapter 4 Clocking and Reset Timing for details.

1.2.14 ETM interface logic

An optional external ETM can be connected to the ARM1026EJ-S processor to provide real-time tracing of instructions and data in an embedded system. The processor includes the logic and interface to enable you to trace program execution and data transfers using the ETM10RV. Further details are in the Embedded Trace Macrocell Specification. See Table A-6 for descriptions of ETM-related signals.

1.3 Silicon revision information

This manual is for revision r0p2 of the ARM1026EJ-S processor. See Product revision status for details of revision numbering.
Updates in the r0p1 ARM1026EJ-S processor are:
· corrections for r0p0 errata
· update to the AHB address bus during IDLE cycles in locked SWP instructions so that the address bus maintains the same value during the locked period
· update to the CP15 c0 Device ID Register to reflect the r0p1 release.
There are no other functional differences between the ARM1026EJ-S r0p0 and ARM1026EJ-S r0p1 processors.

Updates in the r0p2 ARM1026EJ-S processor are:
· corrections for r0p1 errata
· update to the CP15 c0 Device ID Register to reflect the r0p2 release.
There are no other functional differences between the ARM1026EJ-S r0p1 and ARM1026EJ-S r0p2 processors.

Chapter 2 Integer Core

This chapter describes the ARM1026EJ-S integer core. It contains the following sections:
· About the integer core
· Pipeline
· Prefetch unit
· Typical ALU/multiply operations
· Load/store unit
· Typical load/store operations.

2.1 About the integer core

By overlapping the stages of operation, the integer core increases the number of instructions executed per cycle. The integer core has multiple execution units, enabling multiple instructions to exist in the same pipeline stage, and enabling simultaneous execution of some instructions. As a result, it delivers a peak throughput of one instruction per cycle.

The integer core consists of:

Prefetch unit
    The prefetch unit fetches instructions from the ICache, ITCM, or external memory. To reduce the number of pipeline refills, it predicts the outcome of branches whenever it can.

Integer unit
    The integer unit decodes instructions sent from the prefetch unit.
    It contains the barrel shifter, Arithmetic Logic Unit (ALU), and multiplier, and executes data processing instructions such as MOV, ADD, and MUL. The integer unit helps the load/store unit to execute loads, stores, and coprocessor transfer instructions such as LDR, STM, LDC, and MCRR. It also contains the main instruction sequencer that takes care of multicycle data processing instructions, mode changes, exceptions, and debug events.

Load/store unit
    If the data address is 64-bit aligned, the Load/Store Unit (LSU) can load or store two registers (64 bits) per cycle. In a load or store multiple instruction (LDM or STM), the LSU remains in lockstep with the integer unit for the duration of the LDM or STM.

    Note
    Unlike the ARM1020E and ARM1022E processors, the ARM1026EJ-S LSU does not support Hit-Under-Miss (HUM) operation.

Figure 2-1 shows the integer core components.

[Figure 2-1 Integer core block diagram. Blocks shown: prefetch unit (PC queue, branch predictor/return stack, prefetch buffer), integer unit (shift and ALU, multiplier, register bank with A, B, and W ports), and load/store unit (register ports L1, L2, S1, and S2, with rotate and sign extend and halfword replicate logic). Buses shown: Instruction address (IA), Instruction read data (IRD, 64 bits), Data address (DA), Data write data (DWD), and Data read data (DRD, 64 bits).]

2.2 Pipeline

The ARM1026EJ-S pipeline has six stages to maximize instruction throughput:

Fetch
    ICache access. Branch prediction for instructions that have already been fetched. Prediction of fetch path ahead of execution of branch instructions. The Fetch stage uses a First-In-First-Out (FIFO) prefetch buffer that can hold up to four instructions.
Issue
    Initial instruction decode. Can contain one instruction with up to one branch in parallel.

Decode
    Final instruction decode, register reads for ALU operation, data access address calculation, forwarding, and initial interlock resolution. Can contain one instruction with up to one branch in parallel.

Execute
    Data processing shift, shift and saturate, ALU operation, first stage of multiplications, flag setting, condition code check, branch mispredict detection, first stage of store data register read, and DCache access request.

Memory
    Second stage of multiplications and saturations, second stage of store data register read, and DCache memory access.

Write
    Byte rotation, sign extension, register writes, and instruction retirement.

The Execute, Memory, and Write stages can simultaneously contain the following:
· a predicted branch
· an ALU, multiply, or load/store instruction.

Figure 2-2 shows the stages of the ARM1026EJ-S pipeline.

[Figure 2-2 Pipeline stages of the ARM1026EJ-S processor. The figure shows the Fetch, Issue, Decode, Execute, Memory, and Write stages for the ALU pipeline and the LSU pipeline.]

2.3 Prefetch unit

The prefetch unit operates in the Fetch stage of the pipeline. It can fetch 64 bits every cycle from the ICache, but it can issue only one 32-bit instruction per cycle to the integer unit. Because it can fetch more instructions than it can issue, the prefetch unit puts pending instructions in the prefetch buffer.
While an instruction is in the prefetch buffer, the branch prediction logic can decode it to see if it is a predictable branch. Where possible, the branch prediction logic removes branches from the instruction stream. If the branch is predicted to be taken, then the instruction address is redirected to the branch target address. If the branch is predicted not to be taken, then the instruction address continues to progress through the instructions following the branch instruction. If the instruction following the branch is already in the prefetch buffer, it can be issued in place of the branch and the branch effectively takes no cycles.

When there is not enough time to completely remove the branch, the fetch address is redirected anyway, because this still helps to reduce the branch penalty.

The prefetch unit and branch prediction are described in detail in Chapter 5 Prefetch Unit.

2.4 Typical ALU/multiply operations

Figure 2-3 shows the stages of a typical data processing operation.

[Figure 2-3 Pipeline stages of a typical ALU operation. The ALU pipeline is shown over cycles 1 to 6; the LSU pipeline is not used.]

Figure 2-4 shows the stages of a typical multiply operation. The MUL loops in the Execute stage until it passes through the first part of the multiplier array enough times. Then it progresses to the Memory stage where it passes once through the second half of the array to produce the final result.
[Figure 2-4 Pipeline stages of a typical multiply operation. The ALU pipeline is shown across the Fetch, Issue, Decode, Execute, Memory, and Write stages; the LSU pipeline is not used.]

2.5 Load/store unit

If the data address is 64-bit aligned, the LSU can load or store two 32-bit registers per transfer. This does not speed up single load or store instructions (LDR or STR) but it does considerably speed up load and store multiple instructions (LDM and STM). Load and store double instructions (LDRD and STRD) also take advantage of the available bandwidth.

Accesses that are not 64-bit aligned have to take place over two cycles. If an LDM or STM address is not 64-bit aligned, then only one 32-bit register is transferred on the first access. After that, two registers can be transferred each cycle.

Single loads and all cycles of multiple loads and stores work in cooperation with the integer unit. A DCache load access that misses stalls the LSU and integer unit until the data is returned from the cache.

The LSU calculates the address for the data access using a dedicated adder. A separate adder in the ALU calculates a base register write-back value if it is required. The A and B register ports of the integer unit read the operands for both adders. For complex, scaled-register addressing modes that require the barrel shifter, the ALU has to calculate the shifted value. This costs one extra cycle.

The LSU has two dedicated register bank read ports, S1 and S2, and two dedicated write ports, L1 and L2. These are used to read data to be stored and to write data that is loaded.
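The alignment rule for LDM and STM described in this section can be illustrated with a small calculation. This sketch counts data transfers only, under the assumptions that every access hits in the DCache, that no abort occurs, and that the base address is word-aligned. It is not a cycle-count model (see Chapter 21 Instruction Cycle Count for instruction timing), and the function name is hypothetical.

```python
def lsu_transfers(base_address: int, register_count: int) -> int:
    """Count LSU data transfers for an LDM or STM of 32-bit registers.

    Assumptions (not from the manual): word-aligned base address,
    every access hits in the DCache, and no access aborts.
    """
    if register_count <= 0:
        return 0
    if base_address % 4 != 0:
        raise ValueError("this sketch expects a word-aligned base address")

    transfers = 0
    remaining = register_count
    if base_address % 8 != 0:
        # Not 64-bit aligned: the first access moves only one register.
        transfers += 1
        remaining -= 1
    # Each later access is 64-bit aligned and moves up to two registers.
    transfers += (remaining + 1) // 2
    return transfers


if __name__ == "__main__":
    print(lsu_transfers(0x1000, 8))  # 64-bit aligned base address
    print(lsu_transfers(0x1004, 8))  # word-aligned only
```

Under these assumptions, eight registers starting at a 64-bit aligned address need four transfers, while the same eight registers starting at an address that is only word-aligned need five.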
2.6 Typical load/store operations

Figure 2-5 shows a simple LDR/STR operation that hits in the DCache.

[Figure 2-5 Pipeline stages of a load or store operation. The figure shows the ALU pipeline and the LSU pipeline over cycles 1 to 6.]

Figure 2-6 shows an LDM/STM operation progressing through the load/store pipeline. The LDM/STM iterates in the LSU pipeline until it completes. Because any LDM/STM memory access can abort, the LSU stalls all integer pipeline activity until the last LDM/STM memory access completes.

[Figure 2-6 Pipeline stages of a load multiple or store multiple operation. The figure shows the ALU pipeline and the LSU pipeline, with the LSU stages repeating over several cycles.]

See Chapter 21 Instruction Cycle Count for further details of instruction cycles and timing.

Chapter 3 Programmer's Model

This chapter describes the ARM1026EJ-S registers and provides information for programming the microprocessor. It contains the following sections:
· About the programmer's model
· Program status registers
· About the CP15 system control coprocessor registers
· CP15 register descriptions
· CP15 instruction summary.
3.1 About the programmer's model

The ARM1026EJ-S processor implements the ARMv5TEJ architecture. This includes the:
· 32-bit ARM instruction set
· 16-bit Thumb instruction set
· 8-bit Jazelle instruction set.

For details of both the ARM and Thumb instruction sets, and the ARM programmer's model, see the ARM Architecture Reference Manual. For details of the Jazelle instruction set and the Jazelle programmer's model, see the Jazelle VI Architecture Reference Manual.

The ARM1026EJ-S programmer's model is the same as that described in the ARM Architecture Reference Manual and the Jazelle VI Architecture Reference Manual, but extended in the following ways:
· The Current Program Status Register, CPSR, and the Saved Program Status Registers, SPSRs, have an additional J bit to indicate Jazelle state and an additional A bit to mask imprecise aborts.
· The system control coprocessor, CP15, provides additional registers for system configuration and control.
· The CP14 debug registers provide support for debug functionality. See Chapter 8 Debug for a description of the CP14 debug registers.

3.2 Program status registers

To support exception handling, the ARM1026EJ-S processor has one CPSR and five SPSRs. The Program Status Registers:
· hold information about the most recently performed ALU operation
· control enabling and disabling of interrupts
· set the processor operating mode.

Bits      Field
[31]      N
[30]      Z
[29]      C
[28]      V
[27]      Q
[26:25]   Reserved
[24]      J
[23:9]    Reserved
[8]       A
[7]       I
[6]       F
[5]       T
[4:0]     Mode

Figure 3-1 Program Status Registers

3.2.1 The J bit

The J bit in the CPSR indicates when the ARM1026EJ-S processor is in Jazelle state. When J is set, the processor is in Jazelle state. When J is clear, the processor is in ARM or Thumb state, depending on the T bit.
Note
· Setting both J and T causes the next instruction executed to take the Undefined Instruction exception. Entering the exception handler causes the processor to enter ARM state, and the exception handler can detect that setting both J and T caused the exception.
· The MSR instruction cannot be used to change the J bit in the CPSR.
· The position of the J bit avoids using the status or extension bytes in code run on ARMv5TE or earlier processors. This ensures that operating system code that uses the deprecated CPSR, SPSR, CPSR_all, or SPSR_all syntax for the destination of an MSR instruction still works.

3.2.2 The A bit

An imprecise abort is separated from the instruction that caused the error response. The abort can occur many cycles after the error-generating instruction retires.

The AHB error response leading to an imprecise abort can occur at a time when the processor is already in Abort mode, or when the processor has entered the interrupt handler from Abort mode. To avoid the loss of the Abort mode state (R14_abt and SPSR_abt) in these cases, which leads to the processor entering an unrecoverable state, the existence of a pending imprecise abort must be held by the processor until a time when the Abort mode can safely be entered.

The A mask is added to the CPSR to indicate whether an imprecise abort can be accepted. When the A bit is set, an imprecise abort is held until the mask is cleared. When the A bit is cleared, a pending imprecise abort is recognized, and the abort is taken. The A bit is set automatically on entry into Abort mode, IRQ mode, FIQ mode, and on reset.

3.2.3 Other bits

All other bits of the CPSR and the SPSRs are as described in the ARM Architecture Reference Manual.
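The status register fields described in this section can be decoded with a few shifts and masks. The sketch below assumes the bit positions read from Figure 3-1 and the ARM Architecture Reference Manual (N, Z, C, V, Q in bits 31-27, J in bit 24, A, I, F, T in bits 8-5, and Mode in bits 4-0). The function names are hypothetical, and the abort check is a simplified model of the A bit behavior, not a description of the hardware.

```python
# Bit positions assumed from Figure 3-1 (see the lead-in above).
PSR_FLAG_BITS = {
    "N": 31, "Z": 30, "C": 29, "V": 28, "Q": 27,
    "J": 24, "A": 8, "I": 7, "F": 6, "T": 5,
}
PSR_MODE_MASK = 0x1F


def decode_psr(value: int) -> dict:
    """Split a 32-bit CPSR or SPSR value into named single-bit flags and Mode."""
    fields = {name: (value >> bit) & 1 for name, bit in PSR_FLAG_BITS.items()}
    fields["Mode"] = value & PSR_MODE_MASK
    return fields


def instruction_state(value: int) -> str:
    """Name the instruction set state selected by the J and T bits."""
    fields = decode_psr(value)
    if fields["J"] and fields["T"]:
        # The next instruction takes the Undefined Instruction exception.
        return "invalid (J and T both set)"
    if fields["J"]:
        return "Jazelle"
    return "Thumb" if fields["T"] else "ARM"


def imprecise_abort_taken(value: int, abort_pending: bool) -> bool:
    """Simplified model: a pending imprecise abort is taken only when A is clear."""
    return abort_pending and decode_psr(value)["A"] == 0


if __name__ == "__main__":
    print(decode_psr(0x600001D3))
    print(instruction_state(0x01000010))
```

For the example value 0x600001D3, the Z, C, A, I, and F bits are set and the Mode field is 0x13, so a pending imprecise abort would be held until software clears the A bit.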
3.3 About the CP15 system control coprocessor registers

The programmer's model of the ARM1026EJ-S processor includes a system control coprocessor, CP15, that provides additional registers for system configuration and control.

3.3.1 Accessing CP15 registers

CP15 registers can be accessed only with MCR and MRC instructions in a privileged mode. Figure 3-2 shows the MCR and MRC instruction format.

Bits      Field
[31:28]   cond
[27:24]   1110
[23:21]   Opcode_1
[20]      L (0 for MCR, 1 for MRC)
[19:16]   CRn
[15:12]   Rd
[11:8]    1111
[7:5]     Opcode_2
[4]       1
[3:0]     CRm

Figure 3-2 CP15 MCR and MRC instruction format

The assembly code for these instructions is:

    MCR{cond} P15, opcode_1, Rd, CRn, CRm, opcode_2
    MRC{cond} P15, opcode_1, Rd, CRn, CRm, opcode_2

In User mode, coprocessor instructions take the Undefined instruction trap. See the ARM Architecture Reference Manual for a description of the MCR and MRC instructions.

3.3.2 Summary of CP15 registers

Table 3-1 lists the 16 CP15 registers and their accessibility. The MMU or MPU enabled column indicates whether you can access the register only when the MMU is enabled, only when the MPU is enabled, or when either the MMU or MPU is enabled.
Table 3-1 CP15 register summary

Register   Register name                                        MMU or MPU enabled   Access
CP15 c0    Device ID Register                                   MMU or MPU           Read-only
           Cache Type Register                                  MMU or MPU           Read-only
           TCM Status Register                                  MMU or MPU           Read-only
CP15 c1    Control Register                                     MMU or MPU           Read/write
           Auxiliary Control Register                           MMU or MPU           Read-only
CP15 c2    TTB Register                                         MMU only             Read/write
           DCache Configuration Register                        MPU only             Read/write
           ICache Configuration Register                        MPU only             Read/write
CP15 c3    Domain Access Control Register                       MMU only             Read/write
           Write Buffer Control Register                        MPU only             Read/write
CP15 c4    Reserved                                             -                    Undefined
CP15 c5    Data Fault Status Register when using MMU            MMU only             Read/write
           Instruction Fault Status Register when using MMU     MMU only             Read/write
           Data Extended Access Permission Register             MPU only             Read/write
           Instruction Extended Access Permission Register      MPU only             Read/write
           Data Standard Access Permission Register             MPU only             Read/write
           Instruction Standard Access Permission Register      MPU only             Read/write
           Data Fault Status Register when using MPU            MPU only             Read/write
           Instruction Fault Status Register when using MPU     MPU only             Read/write
CP15 c6    Data Fault Address Register when using MMU           MMU only             Read/write
           Instruction Fault Address Register when using MMU    MMU only             Read/write
           Protection Region Registers 0-7                      MPU only             Read/write
           Data Fault Address Register when using MPU           MPU only             Read/write
           Instruction Fault Address Register when using MPU    MPU only             Read/write
CP15 c7    Cache operations                                     MMU or MPU           Read/write
CP15 c8    TLB operations                                       MMU only             Write-only
Table 3-1 CP15 register summary (continued)

Register   Register name                                        MMU or MPU enabled   Access
CP15 c9    DCache Lockdown Register                             MMU or MPU           Read/write
           ICache Lockdown Register                             MMU or MPU           Read/write
           DTCM Region Register                                 MMU or MPU           Read/write
           ITCM Region Register                                 MMU or MPU           Read/write
CP15 c10   TLB Lockdown Register                                MMU only             Read/write
CP15 c11   Reserved                                             -                    Undefined
CP15 c12   Reserved                                             -                    Undefined
CP15 c13   FCSE Process ID Register                             MMU only             Read/write
           Context ID Register                                  MMU or MPU           Read/write
CP15 c14   Reserved                                             -                    Undefined
CP15 c15   Debug Override Register                              MMU or MPU           Read/write
           Prefetch Unit Debug Override Register                MMU or MPU           Read/write
           Debug and Test Address Register                      MMU or MPU           Read/write
           Memory Region Remap Register                         MMU or MPU           Read/write
           MMU test operations                                  MMU only             Read/write
           Cache Debug Control Register                         MMU or MPU           Read/write
           MMU Debug Control Register                           MMU only             Read/write

3.3.3 Address types

The ARM processor uses three address types:
· Virtual Address (VA)
· Modified Virtual Address (MVA)
· Physical Address (PA).

Table 3-2 shows the parts of the ARM processor that use each address type.

Table 3-2 Address types

Processor unit       Address type
Integer unit         Virtual address
Caches and TLBs      Modified virtual address
TCM and AMBA bus     Physical address
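The MCR and MRC instruction format described in section 3.3.1 can be illustrated by assembling the 32-bit instruction word from its fields. This is a minimal sketch, assuming the standard ARM coprocessor register transfer layout (bit 20 selects MRC when set, bits 11-8 hold the coprocessor number 15, and bit 4 is 1) and the condition code 0b1110 for unconditional execution. The function name is hypothetical and is not part of any assembler.

```python
def encode_cp15_access(is_read: bool, opcode_1: int, rd: int,
                       crn: int, crm: int, opcode_2: int,
                       cond: int = 0xE) -> int:
    """Build the instruction word for an MRC (is_read) or MCR access to CP15.

    cond defaults to 0xE, the 'always' condition code (an assumption taken
    from the ARM architecture, not from this section).
    """
    widths = {
        "cond": (cond, 4), "opcode_1": (opcode_1, 3), "rd": (rd, 4),
        "crn": (crn, 4), "crm": (crm, 4), "opcode_2": (opcode_2, 3),
    }
    for name, (value, width) in widths.items():
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")

    return (
        (cond << 28)
        | (0b1110 << 24)        # fixed bits [27:24]
        | (opcode_1 << 21)
        | (int(is_read) << 20)  # L bit: 1 for MRC, 0 for MCR
        | (crn << 16)
        | (rd << 12)
        | (0b1111 << 8)         # coprocessor number 15
        | (opcode_2 << 5)
        | (1 << 4)              # fixed bit [4]
        | crm
    )


if __name__ == "__main__":
    # MRC p15, 0, r0, c0, c0, 0
    print(hex(encode_cp15_access(True, 0, 0, 0, 0, 0)))
```

Building the word this way makes it easy to cross-check a disassembly listing against the operands given in the register descriptions that follow.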
3.4 CP15 register descriptions

This section describes the CP15 registers:
· CP15 c0 Device ID Register
· CP15 c0 Cache Type Register
· CP15 c0 TCM Status Register
· CP15 c1 Control Register
· CP15 c1 Auxiliary Control Register
· CP15 c2 Translation Table Base Register
· CP15 c2 DCache and ICache Configuration Registers
· CP15 c3 Domain Access Control Register
· CP15 c3 Write Buffer Control Register
· CP15 c4 Reserved
· CP15 c5 Data and Instruction Fault Status Registers
· CP15 c5 Data and Instruction Extended Access Permission Registers
· CP15 c5 Data and Instruction Standard Access Permission Registers
· CP15 c6 Data and Instruction Fault Address Registers
· CP15 c5 Prot