Alchemy Solutions Au1500 Processor PCI Performance
Application Note

Revision: 30275A
Issue Date: April 2003

2003 Advanced Micro Devices, Inc. All rights reserved.

The contents of this document are provided in connection with Advanced Micro Devices, Inc. ("AMD") products. AMD makes no representations or warranties with respect to the accuracy or completeness of the contents of this publication and reserves the right to make changes to specifications and product descriptions at any time without notice. No license, whether express, implied, arising by estoppel or otherwise, to any intellectual property rights is granted by this publication. Except as set forth in AMD's Standard Terms and Conditions of Sale, AMD assumes no liability whatsoever, and disclaims any express or implied warranty, relating to its products including, but not limited to, the implied warranty of merchantability, fitness for a particular purpose, or infringement of any intellectual property right.

AMD's products are not designed, intended, authorized or warranted for use as components in systems intended for surgical implant into the body, or in other applications intended to support or sustain life, or in any other application in which the failure of AMD's product could create a situation where personal injury, death, or severe property or environmental damage may occur. AMD reserves the right to discontinue or make changes to its products at any time without notice.

Contacts
www.amd.com
pcs.support@amd.com

Trademarks
AMD, the AMD Arrow logo, Alchemy, and combinations thereof, and Au1500 are trademarks of Advanced Micro Devices, Inc. Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.

Introduction

This document describes the performance characteristics of the PCI controller integrated into the Au1500 processor. This document assumes the reader is familiar with the PCI specification found in the PCI Local Bus Specification Rev. 2.2 (see "References").

PCI Controller Overview

The Au1500 processor features an integrated PCI controller for connecting external PCI peripherals. The PCI controller is designed to support a 32-bit wide PCI interface at 33MHz or 66MHz.
The PCI controller can initiate PCI master cycles and can also serve as the target of PCI device-initiated master cycles into the SDRAM memory of the Au1500 processor. The PCI controller supports a maximum of four PCI loads and arbitration for five devices, including the core. Separate Au1500 application notes describe the clocking schemes for the PCI controller and the software techniques for utilizing PCI (see "References").

The general arrangement of the PCI controller is depicted in the figure "Au1500 Processor's PCI controller".

[Figure: Au1500 Processor's PCI controller. Blocks: Core, SBUS, SDRAM Cntrlr, SDRAM, PCI Device(s). PCI signals: AD[31:0], CBE[3:0]#, FRAME#, IRDY#, TRDY#, STOP#, PERR#, DEVSEL#, IDSEL, REQ[3:0], GNT[3:0], INT[D:A]#, RST#.]

PCI supports a variety of cycle types. This document focuses on the following types:

- core initiated single-beat read from PCI
- core initiated single-beat write to PCI
- PCI device initiated single-beat read from Au1500 SDRAM
- PCI device initiated single-beat write to Au1500 SDRAM
- PCI device initiated burst read from Au1500 SDRAM
- PCI device initiated burst write to Au1500 SDRAM

PCI performance is measured as throughput, the amount of data that can be transferred to/from PCI in a given time period. These cycle types represent a significant majority of the PCI cycles that occur in a running system. Thus, the throughput estimates of each cycle type can be combined to provide an overall estimate of PCI throughput.

PCI Cycles and Performance

In the examples below, the core is assumed to be operating at 396MHz, the system bus (SBUS) at 198MHz (5.1ns SBUS clock), PCI at 66MHz (15.2ns PCI clock), and the SDRAM interface at 99MHz (10.1ns SDRAM clock). The operating frequencies of the core, system bus, and SDRAM are controlled by the sys_cpupll and sys_powerctrl registers.

Note: Further information on the SDRAM timings used in this document is available in the "Au1x00 SDRAM Performance" application note.

Core Initiated Single-Beat Read From PCI

The core initiates single-beat reads with a load access, typically from PCI device registers or memory. Software frequently uses single-beat read accesses while managing the operation of a PCI device. The core accesses are non-cacheable, which in turn initiate single-beat PCI accesses.
Note: The core can initiate cacheable load accesses to PCI address space via the PCI cacheable memory window controlled by pci_cmem. Cacheable load accesses to PCI address space will initiate a burst read from PCI, not a single-beat read.

A core initiated PCI access traverses both the system bus and the PCI bus. A PCI timing diagram of a single-beat read is provided in "Figure 3-5: Basic Read Operation" of the PCI Local Bus Specification Rev. 2.2. The activity on the buses during such an access is depicted in the figure "Au1 Core Initiated Single-Beat Read from PCI".

[Figure: Au1 Core Initiated Single-Beat Read from PCI (bus activity timing diagram)]

The minimum time necessary for a PCI single-beat read is the sum of the active SBUS and PCI times: 5 SBUS clocks + 11 PCI clocks + 3 SBUS clocks. The first 5 SBUS clocks cover synchronization and arbitration. The 11 PCI clocks comprise internal state machine synchronization plus 1 clock each for arbitration, address/command, turnaround, data, and the state machine. The final 3 SBUS clocks return the data to the core to complete the access. This yields 5*5.1ns + 11*15.2ns + 3*5.1ns = 208.0ns for a core initiated single-beat read access.

At 208.0ns per single-beat read, the theoretical maximum number of single-beat reads possible is 4,807,692 per second, which yields a theoretical maximum throughput of 19.2MB/s (4,807,692 * 4 bytes).

In reality, the timing of a single-beat read usually exceeds the minimum time outlined above for reasons provided in "Detrimental Influences on PCI Performance", in particular the assertion of DEVSEL# and TRDY#. Furthermore, a PCI read must stall the core until the data is returned. As illustrated above, this time can be considerable depending upon the ability of the PCI device to return data in a timely fashion.

Core Initiated Single-Beat Write to PCI

The core initiates single-beat writes with a store access, typically to PCI device registers or memory. Software frequently uses single-beat write accesses while managing the operation of a PCI device. For graphics devices, write accesses dominate other accesses when drawing frame buffer contents. The core accesses are non-cacheable, which in turn initiate single-beat PCI accesses.

Note: The core can initiate cacheable store accesses to PCI address space via the PCI cacheable memory window controlled by pci_cmem.
Cacheable store accesses to PCI address space do not initiate a PCI burst write immediately; the burst write is initiated after the data cache casts out the corresponding cache line(s), or the corresponding cache line(s) are flushed.

Note: By programming the TLB with CCA=7 when mapping PCI memory spaces, the write buffer can gather core stores, which in turn leads to more efficient burst writes into PCI memory space.

A core initiated PCI access traverses both the system bus and the PCI bus. A PCI timing diagram of a single-beat write is provided in "Figure 3-6: Basic Write Operation" of the PCI Local Bus Specification Rev. 2.2. The activity on the buses during such an access is depicted in the figure "Au1 Core Initiated Single-Beat Write to PCI".

[Figure: Au1 Core Initiated Single-Beat Write to PCI (bus activity timing diagram)]

The minimum time necessary for a PCI single-beat write is the sum of the active SBUS and PCI times: 5 SBUS clocks + 9 PCI clocks. The 5 SBUS clocks cover synchronization, arbitration, and the start of the write access. The 9 PCI clocks comprise state machine synchronization plus 1 clock each for arbitration, address/command, and data. This yields 5*5.1ns + 9*15.2ns = 162.3ns for a core initiated single-beat write access.

At 162.3ns per single-beat write, the theoretical maximum number of single-beat writes possible is 6,161,429 per second, which yields a theoretical maximum throughput of 24.6MB/s (6,161,429 * 4 bytes).

In reality, the timing of a single-beat write usually exceeds the minimum time outlined above for reasons provided in "Detrimental Influences on PCI Performance", in particular the assertion of DEVSEL# and TRDY#. However, the PCI write FIFO can reduce the single-beat timing; see 4.1.3 "PCI Write FIFO".

PCI Device Initiated Single-Beat Read from Au1500 Processor SDRAM

A PCI device initiates single-beat reads of Au1500 processor SDRAM during operation, typically while processing a ring buffer or similar data structure.

A PCI device access to Au1500 SDRAM traverses the PCI bus, the system bus, and the SDRAM interface. A PCI timing diagram of a single-beat read is provided in "Figure 3-5: Basic Read Operation" of the PCI Local Bus Specification Rev. 2.2. The activity on the buses during the access is depicted in the figure "PCI Device Initiated Single-Beat Read from Au1500 Processor SDRAM".
[Figure: PCI Device Initiated Single-Beat Read from Au1500 Processor SDRAM (bus activity timing diagram)]

The minimum time necessary for a PCI device single-beat read is the sum of the active PCI bus, SBUS and SDRAM times: 4 PCI clocks + 5 SBUS clocks + 6 SDRAM clocks. The PCI clocks are 1 clock each for arbitration, address/command, and turnaround, plus 1 clock for data to complete the access. The 5 SBUS clocks cover synchronization and arbitration, and the 6 SDRAM clocks perform the single-beat read. This yields 4*15.2ns + 5*5.1ns + 6*10.1ns = 146.9ns for a PCI device initiated single-beat read access to Au1500 SDRAM.

At 146.9ns per single-beat read, the theoretical maximum number of single-beat reads possible is 6,807,351 per second, which yields a theoretical maximum throughput of 27.2MB/s (6,807,351 * 4 bytes).

In reality, the timing of the access usually exceeds the minimum time outlined above for reasons provided in "Detrimental Influences on PCI Performance", in particular arbitration for the system bus and PCI retries.

PCI Device Initiated Single-Beat Write to Au1500 Processor SDRAM

A PCI device initiates single-beat writes to Au1500 processor SDRAM during operation, typically while processing/updating a ring buffer or similar data structure.

A PCI device access to Au1500 SDRAM traverses the PCI bus, the system bus, and the SDRAM interface. A PCI timing diagram of a single-beat write is provided in "Figure 3-6: Basic Write Operation" of the PCI Local Bus Specification Rev. 2.2. The activity on the buses during the access is depicted in the figure "PCI Device Initiated Single-Beat Write to Au1500 Processor SDRAM".

[Figure: PCI Device Initiated Single-Beat Write to Au1500 Processor SDRAM (bus activity timing diagram)]

The minimum time necessary for a PCI device single-beat write is the sum of the active PCI bus, SBUS and SDRAM times: 6 PCI clocks + 5 SBUS clocks + 6 SDRAM clocks. The PCI clocks comprise state machine synchronization plus 1 clock each for arbitration, address/command, and data. The 5 SBUS clocks cover synchronization and arbitration, and the 6 SDRAM clocks perform the single-beat write. This yields 6*15.2ns + 5*5.1ns + 6*10.1ns = 177.3ns for a PCI device initiated single-beat write access to Au1500 SDRAM.

At 177.3ns per single-beat write, the theoretical maximum number of single-beat writes possible is 5,640,157 per second, which yields a theoretical maximum throughput of 22.6MB/s (5,640,157 * 4 bytes).
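The four single-beat estimates above all follow the same recipe: multiply the clock count in each clock domain by that domain's period, sum the results, and divide the 4 bytes transferred by the total time. The following is a minimal Python sketch of that arithmetic. The clock periods and clock counts are the ones quoted in the text; the dictionary layout and the function names (access_time_ns, throughput_mb_per_s) are illustrative assumptions, not part of any AMD tool.

```python
# Rounded clock periods used in this document, in nanoseconds.
PERIOD_NS = {"sbus": 5.1, "pci": 15.2, "sdram": 10.1}

# Clock counts per domain for each single-beat cycle type, taken from the
# "This yields ..." sums in the text. The labels are illustrative only.
SINGLE_BEAT_CLOCKS = {
    "core read from PCI": {"sbus": 5 + 3, "pci": 11},
    "core write to PCI": {"sbus": 5, "pci": 9},
    "PCI device read from SDRAM": {"pci": 4, "sbus": 5, "sdram": 6},
    "PCI device write to SDRAM": {"pci": 6, "sbus": 5, "sdram": 6},
}


def access_time_ns(clocks):
    """Minimum access time: sum of (clock count * clock period) per domain."""
    return sum(count * PERIOD_NS[domain] for domain, count in clocks.items())


def throughput_mb_per_s(time_ns, bytes_per_access=4):
    """Theoretical maximum throughput in MB/s (1 MB = 10**6 bytes)."""
    accesses_per_second = 1e9 / time_ns
    return accesses_per_second * bytes_per_access / 1e6


if __name__ == "__main__":
    for name, clocks in SINGLE_BEAT_CLOCKS.items():
        t = access_time_ns(clocks)
        print(f"{name}: {t:.1f} ns, {throughput_mb_per_s(t):.1f} MB/s")
```

Because the periods are the rounded values used in the text (for example 15.2ns rather than the exact period of a 66MHz clock), the sketch matches the quoted figures rather than an exact hardware measurement.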
In reality, the timing of the access usually exceeds the minimum time outlined above for reasons provided in "Detrimental Influences on PCI Performance", in particular arbitration for the system bus and PCI retries.

Once the data has been moved onto the system bus (to SDRAM), the next PCI cycle can be initiated; if that cycle is an access to Au1500 processor SDRAM, it stalls until the previous write completes. For example, PCI-to-PCI cycles can continue while data is transferred to Au1500 SDRAM.

PCI Device Initiated Burst Read from Au1500 Processor SDRAM

A PCI device initiates burst reads of Au1500 processor SDRAM during operation, typically while utilizing bus-mastering to transmit network packets or when writing disk blocks. PCI burst transfers permit efficient movement of data and thus better performing I/O.

A PCI device access to Au1500 SDRAM traverses the PCI bus, the system bus, and the SDRAM interface. A PCI timing diagram of a burst read is provided in "Figure 3-5: Basic Read Operation" of the PCI Local Bus Specification Rev. 2.2. The activity on the buses during the access is depicted in the figure "PCI Device Initiated Burst Read from Au1500 Processor SDRAM".

[Figure: PCI Device Initiated Burst Read from Au1500 Processor SDRAM (bus activity timing diagram)]

The minimum time necessary for a PCI device burst read of eight words from Au1500 SDRAM is the sum of the active PCI bus, SBUS and SDRAM times: 14 PCI clocks + 5 SBUS clocks + 12 SDRAM clocks. The PCI clocks comprise state machine synchronization, 1 clock each for arbitration, address/command, and turnaround, and the clocks for the eight words of data. The 5 SBUS clocks cover synchronization and arbitration, and the 12 SDRAM clocks perform the burst read. This yields 14*15.2ns + 5*5.1ns + 12*10.1ns = 359.5ns for a PCI device initiated burst read access to Au1500 processor SDRAM.

At 359.5ns per eight word burst, the theoretical maximum number of burst reads possible is 2,781,641 per second, which yields a theoretical maximum throughput of 89.0MB/s (2,781,641 * 32 bytes).

In reality, the timing of the access usually exceeds the minimum time outlined above for reasons provided in "Detrimental Influences on PCI Performance", in particular arbitration for the system bus and PCI retries.
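The advantage of a burst is that the fixed costs of arbitration, address/command and synchronization are paid once for 32 bytes instead of once for 4 bytes. The following Python sketch compares the PCI device initiated burst read with the PCI device initiated single-beat read, using only the clock counts and rounded clock periods quoted in this document. The variable and function names are illustrative assumptions.

```python
# Rounded clock periods used in this document, in nanoseconds.
PERIOD_NS = {"pci": 15.2, "sbus": 5.1, "sdram": 10.1}


def access_time_ns(clocks):
    """Minimum access time: sum of (clock count * clock period) per domain."""
    return sum(count * PERIOD_NS[domain] for domain, count in clocks.items())


def throughput_mb_per_s(time_ns, bytes_per_access):
    """Bytes per nanosecond scaled to MB/s (1 MB = 10**6 bytes)."""
    return bytes_per_access / time_ns * 1e3


# Clock counts from the text for PCI device initiated reads of Au1500 SDRAM.
single_read_clocks = {"pci": 4, "sbus": 5, "sdram": 6}    # 4 bytes per access
burst_read_clocks = {"pci": 14, "sbus": 5, "sdram": 12}   # 32 bytes per access

single_ns = access_time_ns(single_read_clocks)
burst_ns = access_time_ns(burst_read_clocks)

single_tp = throughput_mb_per_s(single_ns, 4)
burst_tp = throughput_mb_per_s(burst_ns, 32)

# The burst moves 8x the data in well under 8x the time.
speedup = burst_tp / single_tp
ns_per_byte_single = single_ns / 4
ns_per_byte_burst = burst_ns / 32

if __name__ == "__main__":
    print(f"single-beat read: {single_ns:.1f} ns, {single_tp:.1f} MB/s")
    print(f"burst read:       {burst_ns:.1f} ns, {burst_tp:.1f} MB/s")
    print(f"burst speedup:    {speedup:.2f}x")
```

Under these minimum timings the burst read takes roughly 2.4 times as long as a single-beat read but carries eight times the data, which is where the better than threefold throughput improvement comes from.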
PCI Device Initiated Burst Write to Au1500 Processor SDRAM

A PCI device initiates burst writes to Au1500 processor SDRAM during operation, typically while utilizing bus-mastering to receive network packets or when reading disk blocks. PCI burst transfers permit efficient movement of data and thus better performing I/O.

A PCI device access to Au1500 SDRAM traverses the PCI bus, the system bus, and the SDRAM interface. A PCI timing diagram of a burst write is provided in "Figure 3-6: Basic Write Operation" of the PCI Local Bus Specification Rev. 2.2. The activity on the buses during the access is depicted in the figure "PCI Device Initiated Burst Write to Au1500 Processor SDRAM".

[Figure: PCI Device Initiated Burst Write to Au1500 Processor SDRAM (bus activity timing diagram)]

The minimum time necessary for a PCI device burst write of eight words to Au1500 SDRAM is the sum of the active PCI bus, SBUS and SDRAM times: 13 PCI clocks + 5 SBUS clocks + 12 SDRAM clocks. The PCI clocks comprise state machine synchronization, 1 clock each for arbitration and address/command, and the clocks for the eight words of data. The 5 SBUS clocks cover synchronization and arbitration, and the 12 SDRAM clocks perform the burst write. This yields 13*15.2ns + 5*5.1ns + 12*10.1ns = 344.3ns for a PCI device initiated burst write access to Au1500 SDRAM.

At 344.3ns per eight word burst, the theoretical maximum number of burst writes possible is 2,904,443 per second, which yields a theoretical maximum throughput of 92.9MB/s (2,904,443 * 32 bytes).

In reality, the timing of the access usually exceeds the minimum time outlined above for reasons provided in "Detrimental Influences on PCI Performance", in particular arbitration for the system bus and PCI retries.

Once the data has been moved onto the system bus (to SDRAM), the next PCI cycle can be initiated; if that cycle is an access to Au1500 processor SDRAM, it stalls until the previous write completes. For example, PCI-to-PCI cycles can continue while data is transferred to Au1500 SDRAM.

PCI Performance

The overall throughput of the PCI interface is dominated by the access cycles described previously.
The maximum throughput of the PCI controller can be approximated by this equation:

PCI throughput = (AUSBRTP * AUSBRR) + (AUSBWTP * AUSBWR) + (PDSBRTP * PDSBRR) + (PDSBWTP * PDSBWR) + (PDBRTP * PDBRR) + (PDBWTP * PDBWR)

where

AUSBRTP is the core initiated single-beat read maximum throughput
AUSBWTP is the core initiated single-beat write maximum throughput
PDSBRTP is the PCI device initiated single-beat read from Au1500 processor SDRAM maximum throughput
PDSBWTP is the PCI device initiated single-beat write to Au1500 processor SDRAM maximum throughput
PDBRTP is the PCI device initiated burst read from Au1500 processor SDRAM maximum throughput
PDBWTP is the PCI device initiated burst write to Au1500 processor SDRAM maximum throughput

These variables are the maximum throughput values identified in the discussion of each access cycle type. The remaining variables are:

AUSBRR is the ratio of core initiated single-beat reads
AUSBWR is the ratio of core initiated single-beat writes
PDSBRR is the ratio of PCI device initiated single-beat reads from Au1500 processor SDRAM
PDSBWR is the ratio of PCI device initiated single-beat writes to Au1500 processor SDRAM
PDBRR is the ratio of PCI device initiated burst reads from Au1500 processor SDRAM
PDBWR is the ratio of PCI device initiated burst writes to Au1500 processor SDRAM

The ratios must sum to 1 to represent 100% utilization. The actual ratios for a given system depend upon the types of PCI devices connected to the PCI bus.

This equation is only an approximation and tends to yield a realistic upper bound on PCI throughput. The dynamic nature and types of PCI devices connected will often result in less than optimal throughput on the PCI bus. Examples are provided in this discussion. The variety of PCI devices and their interaction with the overall system influences PCI throughput.

Positive Influences on PCI Performance

The following items can improve PCI throughput.

- By utilizing the fast back-to-back capabilities of the PCI device and the Au1500 processor's PCI controller, arbitration cycles are reduced, thus shortening the access time.
- The Au1500 processor's PCI controller features a coherency setting (pci_config[NC]=0) whereby PCI requests to Au1500 SDRAM are snooped by the data cache. If the request hits in the data cache, the data cache fulfills the request immediately, thus avoiding the need to access external SDRAM.
- The Au1500 processor's PCI controller features a cacheable window into PCI memory address space (the pci_cmem register). Accesses to this window initiate PCI burst transfers rather than single-beat transfers.
- By programming the TLB with CCA=7 when mapping PCI memory spaces, the write buffer can gather core stores, which in turn leads to more efficient burst writes into PCI memory space.
- To improve PCI performance, the Au1500 processor implements a write FIFO between the system bus and PCI. This FIFO effectively shortens the core write cycle to just the system bus time, if the FIFO has an available slot.

The pci_cmem feature, CCA=7, and the PCI write FIFO items warrant additional discussion, as these performance features can make a significant, positive improvement in PCI throughput.

4.1.1 PCI Cacheable Memory Window

The Au1500 processor's PCI controller features a cacheable window into PCI memory space via the pci_cmem register. The cacheable window can be used with pre-fetchable PCI memory space, enabling the core to cache the PCI memory window contents. This has mutually beneficial effects: the core caches the PCI space for improved processing performance, and the data cache initiates burst transfers to/from PCI for better PCI throughput.

The TLB mapping that covers pci_cmem must use CCA=4. This encoding fetches words in order starting with the first word (as opposed to critical word first), to match the PCI specification. Also note that CCA=4 is a non-coherent configuration; therefore the data cache does not snoop PCI memory space accesses. For example, consider a PCI memory space that is both mapped by pci_cmem and is either the source and/or destination of a PCI target-to-target transfer. In this scenario the target-to-target transfer is contained solely within the PCI bus, so the core cache cannot snoop the transfer; as a result, either the target or the data cache might contain stale data.

When the core reads from this window, the data cache initiates a burst read transfer. The timing of the burst read access is depicted in the figure "Au1 Core Initiated Burst Read from PCI".
[Figure: Au1 Core Initiated Burst Read from PCI (bus activity timing diagram)]

The minimum time necessary for a PCI burst read is the sum of the active SBUS and PCI times: the SBUS clocks for synchronization and arbitration; the PCI clocks for internal state machine synchronization, arbitration (1 clock), address/command (1 clock), turnaround (1 clock), data, and the state machine (1 clock); and the SBUS clocks to return the data to the core to complete the access. This yields 350.1ns for a core initiated burst read access, or 91.4MB/s of throughput, vastly improved compared to the single-beat read throughput of 19.2MB/s.

On a cast-out of a dirty cache line, the data cache initiates a burst write transfer. The timing of the burst write access is depicted in the figure "Au1 Core Initiated Burst Write to PCI".

[Figure: Au1 Core Initiated Burst Write to PCI (bus activity timing diagram)]

The minimum time necessary for a PCI burst write is the sum of the active SBUS and PCI times: the SBUS clocks for synchronization, arbitration and the burst write access; and the PCI clocks for state machine synchronization, arbitration (1 clock), address/command (1 clock), and data. This yields 258.4ns for a core initiated burst write access, or 123.8MB/s of throughput, greatly improved compared to the single-beat write throughput of 24.6MB/s. The PCI write buffer can improve performance as well; see the discussion in 4.1.3 "PCI Write FIFO".

If the software environment/application permits, the PCI cacheable window allows the core to cache PCI memory for improved PCI throughput and overall system performance.

4.1.2 CCA=7

PCI TLB mappings must use a non-cacheable CCA setting (the exception being the pci_cmem window previously discussed). Typically the CCA value selected is non-cached, non-mergeable, and non-gatherable, but by programming a CCA value of 7 the write buffer can merge and gather core stores into more efficient burst writes into PCI memory space.

The throughput advantage of burst writes was discussed previously; the actual effect on overall system performance is positive but difficult to determine due to the dynamic nature of a run-time system. The PCI write buffer can further improve performance; see the discussion in 4.1.3 "PCI Write FIFO".

4.1.3 PCI Write FIFO

To improve PCI performance, the Au1500 processor implements a write FIFO between the system bus and the PCI bus.
This FIFO effectively shortens core write cycles to just the system bus time, if the FIFO has an available slot to accept the write. That is, from the core perspective, the PCI write completes as soon as the FIFO accepts it rather than waiting until the PCI write target completes.

The FIFO can accept any combination of two single-beat write accesses or burst write accesses. A third write access stalls the core (and the system bus) until a slot is available.

The PCI write FIFO improves overall system performance by buffering write accesses to PCI, thus enabling the core and system bus to continue with other activities while the PCI writes complete.

Detrimental Influences on PCI Performance

The following items can reduce PCI throughput.

- PCI runs asynchronously to the core and system bus. As a result, for each access several clock cycles are consumed synchronizing between the different clock domains.
- If a 33MHz PCI device is connected to the PCI bus, then the entire PCI bus operates at 33MHz, even for devices that can operate at 66MHz. The examples above were calculated for a 66MHz PCI bus; reducing to 33MHz will have a detrimental impact on PCI throughput.
- The PCI clock may not be exactly 66MHz. When using the internally generated PCI clock, a 64MHz (or 32MHz) PCI clock is common.
- The system bus is shared by a number of masters (Au1 core, PCI, USB, Ethernet, DMA). As such, PCI cycles that need the system bus to access SDRAM can experience increased latency until PCI wins arbitration of the system bus. Furthermore, accesses to the static bus controller (e.g. Flash, PCMCIA, etc.) by the core can occupy the system bus for tens to hundreds of nanoseconds, adding latency to arbitration.
- Core initiated reads of PCI space prevent a PCI device that is attempting to access Au1500 SDRAM from winning arbitration of the system bus (because the core has won arbitration of the system bus).
- The DEVSEL# timing of a given PCI device can increase the access time, which in turn decreases throughput. PCI devices can assert DEVSEL# as fast, medium or slow (see the PCI Configuration space Status register).
- Many PCI devices are unable to satisfy a read or write request immediately. The TRDY# signal is de-asserted by the PCI device to insert wait states into the access.
- For PCI cycles that access Au1500 processor SDRAM, if the PCI controller is unable to win system bus arbitration, then a PCI retry is signalled. In this situation, the access time to Au1500 SDRAM is extended by PCI clocks while the PCI controller attempts to win system bus arbitration.
- The discussions above assume 4 bytes of data per single-beat access, and 32 bytes of data per burst access. In reality, not all accesses will transfer 32 bytes per access; transferring less will further decrease PCI throughput.
- The discussions above ignored other PCI cycle types (e.g. as selected by the C/BE encoding). Initiating other PCI cycle types further decreases the bandwidth available for the data movement cycle types.

Of the above factors, the PCI clock, the target TRDY# timing, and the number of active system bus masters are the most detrimental influences on PCI throughput.

PCI Performance Examples

Here are typical examples to illustrate PCI throughput. In a real design, the system should be profiled in order to determine more accurate ratios and thus a better estimate of PCI throughput.

4.3.1 Network Device Example

A high performance PCI-based networking device uses bus-mastering to transfer packets to/from Au1500 SDRAM. The device also uses ring buffers in Au1500 processor SDRAM to provide queues of incoming and outgoing packets. In this environment, the ratios may be similar to the following:

AUSBRR = 0.10 for managing device operation, servicing interrupts, etc.
AUSBWR = 0.10 for managing device operation, servicing interrupts, etc.
PDSBRR = 0.10 for reading ring buffer contents
PDSBWR = 0.10 for updating ring buffer status
PDBRR = 0.30 for transmitting outgoing packets
PDBWR = 0.30 for receiving incoming packets

By substituting values, the PCI throughput becomes:

PCI throughput = (19.2MB/s * 0.10) + (24.6MB/s * 0.10) + (27.2MB/s * 0.10) + (22.6MB/s * 0.10) + (89.0MB/s * 0.30) + (92.9MB/s * 0.30) = 63.9MB/s

This example is also indicative of a high performance disk controller.

4.3.2 Graphics Device Example

A high performance PCI-based graphics device can be programmed with drawing commands to perform drawing locally (i.e. hardware acceleration); drawing can also be performed by the core directly writing into frame buffer memory. In this environment, the ratios may be similar to the following:

AUSBRR = 0.20 for moderate read-modify-write pixel operations (blits)
AUSBWR = 0.80 for drawing commands and frame buffer updates (blits)
PDSBRR = 0.00
PDSBWR = 0.00
PDBRR = 0.00
PDBWR = 0.00

By substituting values, the PCI throughput becomes:

PCI throughput = (19.2MB/s * 0.20) + (24.6MB/s * 0.80) + (27.2MB/s * 0.00) + (22.6MB/s * 0.00) + (89.0MB/s * 0.00) + (92.9MB/s * 0.00) = 23.5MB/s

The 23.5MB/s throughput possible with the Au1500 processor enables good motion video decode. For instance, the frame rates typically required for full motion video decode are achievable with the Au1500 processor:

- A video clip with a resolution of 800x600 at 16bpp requires 960,000 bytes per frame. This yields a frame rate of approximately 24 frames per second (23.5MB/s / 960,000 bytes per frame).
- A video clip with a resolution of 640x480 at 16bpp requires 614,400 bytes per frame. This yields a frame rate of approximately 38 frames per second (23.5MB/s / 614,400 bytes per frame).

This throughput translates into very good graphics experiences.

Conclusion

The PCI controller integrated into the Au1500 processor is capable of handling common PCI-based peripherals such as networking, graphics and disk controllers. The actual PCI performance is dependent upon the PCI devices connected and can be estimated using the equation provided.

References

- Alchemy Au1500 Internet Edge Processor Data Book, Alchemy Semiconductor, 2001.
- PCI Local Bus Specification Rev. 2.2, PCI Special Interest Group, 1998.
- PCI Clock Generation for the Alchemy Au1500 Processor from AMD Application Note, AMD, 2002.
- PCI Software Support for the Alchemy Au1500 Processor from AMD Application Note, AMD, 2002.
- Alchemy Solutions Au1000, Au1100 and Au1500 Processors SDRAM Performance Application Note, AMD, 2003.