Increasing System Bandwidth with CDS
June 2001, ver. 1.0
Application Note 162
Introduction
As system speeds have increased, semiconductor and board designers have turned to source-synchronous clocking and differential signaling to improve chip-to-chip data transfer rates. While source-synchronous clocking does meet this need, it is not very flexible. Designers must closely match the clock and data line lengths, complicating board design. Every chip-to-chip data transfer must have a clock as well as data lines, so every connection introduces a new clock domain. A device that receives data from several devices must have dedicated circuitry for each connection and manage data flow among several clock domains.
A new clocking technique called clock-data synchronization (CDS) combines the advantages of traditional synchronous clocking and source-synchronous clocking by providing high-speed data transfer without the need to closely match clock and data lines. Unlike clock-data recovery (CDR), CDS does not require data to be encoded or scrambled to meet a run-length requirement. This application note discusses how CDS works and how it can be used in a variety of systems.
The look-up table (LUT)-based APEX II device family incorporates CDS circuitry in its differential I/O circuitry. These devices offer four banks of high-speed differential I/O pins: two output banks and two input banks. Each bank contains 18 channels and one clock and supports the LVDS, LVPECL, PCML, and HyperTransport I/O standards at up to one gigabit per second (Gbps). The two input banks incorporate CDS, providing the advantages described below.
Source-Synchronous Clocking
Source-synchronous clocking has become a popular technique for high-speed designs. With this technique, the transmitting device sends a clock along with the data. The advantage of this approach is that the maximum performance is no longer computed from the clock-to-out delay, propagation delay, and setup times of the devices and board. Instead, the maximum performance is related to the maximum edge rate of the driver and the skew between the data signals and the clock signals. Using this technique, data can be transferred at a 1-Gbps rate (1-ns bit period) even though the propagation delay from transmitter to receiver may exceed 1 ns. Figure 1 shows an example of source-synchronous transfer.
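The contrast between the two timing budgets can be sketched numerically. The model below is a simplification, and every delay value in it is a hypothetical placeholder rather than an APEX II specification; the function names are illustrative only.

```python
def sync_min_period_ns(t_co, t_prop, t_su):
    """Traditional synchronous clocking: a bit must leave the transmitter
    (clock-to-out), cross the board (propagation), and meet setup at the
    receiver before the next edge of the shared clock."""
    return t_co + t_prop + t_su


def source_sync_min_period_ns(skew_pp, t_su, t_h):
    """Source-synchronous clocking (simplified): clock and data travel
    together, so the bit period only has to cover the peak-to-peak
    clock-to-data skew plus the receiver's setup and hold window.
    Propagation delay does not appear in the budget."""
    return skew_pp + t_su + t_h


# Hypothetical numbers, chosen only to illustrate the difference.
sync_period = sync_min_period_ns(t_co=2.0, t_prop=1.5, t_su=0.5)
ss_period = source_sync_min_period_ns(skew_pp=0.4, t_su=0.2, t_h=0.2)

print(f"synchronous:        {sync_period:.2f} ns per bit")
print(f"source-synchronous: {ss_period:.2f} ns per bit")
```

With these assumed values the source-synchronous link fits inside a 1-ns bit period even though the 1.5-ns propagation delay alone exceeds it.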
Altera Corporation
A-AN-162-01
Figure 1. Source-Synchronous Transfer
In a source-synchronous system, trace lengths must be matched to minimize skew between data traces and the clock trace.
[Diagram labels: Transmitter, Receiver, Data1, Data2, Clock]
However, there are some drawbacks to the source-synchronous clocking technique. The board design must be tightly controlled so that there is minimal skew between the data and clock paths. Additionally, each set of data driven from a device must be sent with a clock signal. Therefore, if a device receives data from four other devices, that device must also receive four clocks. These clocks can complicate the design of the receiver, as the design now has to manage four clock domains using first-in first-out (FIFO) buffers.
Clock-Data Synchronization
CDS is a new solution to this design challenge. With CDS, the receiving device can synchronize multiple incoming streams of data to its own system clock. This technique simplifies board design because skew between data channels and the clock is no longer an issue. A receiver can use CDS to correct any amount of clock-to-channel or channel-to-channel skew. CDS allows designers to easily implement various system topologies. Multiple devices can now feed into one receiving device, which processes all incoming data in one clock domain. Figure 2 shows an example of a system using CDS.
Figure 2. System Using CDS
[Diagram labels: APEX II Device 1, APEX II Device 2, APEX II Device 3, APEX II Device 4, Clock Signal]
CDR has been used to address similar skew and topology requirements. CDR has an advantage over CDS because the data transmitters can operate on multiple crystals as the receiver recovers individual clocks from each incoming data channel. Every channel can have phase variation as well as frequency variation within a specified limit. Although CDR provides flexibility, the receiver design is more complicated because every data channel has its own clock domain. With CDS, the data channels may vary in phase, but must all be precisely the same frequency. To ensure that all channels are the same frequency, all transmitters must be clocked from the same system clock.
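A short calculation illustrates why CDS tolerates a fixed phase offset but not a frequency offset. The sketch below assumes a hypothetical transmitter running 100 ppm fast relative to the receiver; that figure is an illustrative assumption, not a value from this application note.

```python
def phase_drift_bits(bit_rate_bps, freq_offset_ppm, elapsed_s):
    """Number of bit periods by which a transmitter slips relative to the
    receiver clock after elapsed_s seconds, given a frequency offset in
    parts per million."""
    return bit_rate_bps * freq_offset_ppm * 1e-6 * elapsed_s


# Same frequency: whatever phase offset exists stays constant forever.
print(phase_drift_bits(1e9, 0, 1.0))

# Hypothetical 100 ppm offset on a 1-Gbps channel, after 10 microseconds.
print(phase_drift_bits(1e9, 100, 10e-6))
```

Under that assumption the channel slips by about one full bit period within 10 µs, so a sampling point chosen once during training would no longer be valid. With all transmitters on the same system clock the drift term is zero and only the static phase offset remains to be corrected.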
Compared to CDR, CDS has an advantage in data transmission efficiency. For a CDR receiver to recover the clock and data, the data channel must toggle periodically. The longest permitted interval without a transition is known as the maximum run length. For example, a common CDR technique is to use 8B/10B encoding, which ensures that no more than five consecutive ones or zeros are ever transmitted. However, this encoding scheme creates inefficiency on the data channel: a 1.25-Gbps data channel can carry only 1 Gbps of payload once the stream is 8B/10B-encoded. CDS does not have a run-length requirement, so there is no need to encode the data stream. Therefore, the entire bandwidth of the transmission channel can be used for system data: a 1.25-Gbps data channel can transmit 1.25 Gbps of data.
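The efficiency figures above follow directly from the ratio of data bits to code bits. As a minimal sketch (the function name is illustrative):

```python
def payload_rate_gbps(line_rate_gbps, data_bits=8, code_bits=10):
    """Payload throughput of a channel whose line code expands every
    data_bits of payload into code_bits on the wire."""
    return line_rate_gbps * data_bits / code_bits


# 8B/10B-encoded CDR channel: 8 payload bits per 10 line bits.
print(payload_rate_gbps(1.25))

# Unencoded CDS channel: every line bit is a payload bit.
print(payload_rate_gbps(1.25, data_bits=1, code_bits=1))
```

The first call gives the 1-Gbps payload rate of the encoded channel; the second gives the full 1.25 Gbps available when no encoding is required.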
CDS Implementation
Figure 3. CDS Implementation
[Diagram labels: Input Data; Synchronized Data; Control Logic Selects Register; System Clock; PLL (1) with 0° Output and 90° Output]
Note to Figure 3:
(1) PLL: phase-locked loop.
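Figure 3 indicates that the input data is captured by registers clocked from different PLL phases and that control logic selects one register. One plausible selection rule, sketched below, is to pick the clock phase that samples farthest from the data transition. This is an illustrative model only: the four-phase set, the selection criterion, and all names are assumptions, not a description of the actual APEX II circuitry.

```python
def circular_distance(a, b):
    """Distance between two positions within one bit period, where
    positions are fractions in [0, 1) and the period wraps around."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)


def select_phase(transition_pos, phases_deg=(0, 90, 180, 270)):
    """Return the sampling phase (in degrees) farthest from the data
    transition.

    transition_pos is the point in the bit period, as a fraction of the
    period relative to the system clock, at which the incoming data
    toggles. The phase set is hypothetical.
    """
    return max(
        phases_deg,
        key=lambda p: circular_distance(p / 360.0, transition_pos),
    )


# Data toggles right at the 0-degree edge: sample mid-bit instead.
print(select_phase(0.0))   # 180

# Data toggles 30% of the way into the period.
print(select_phase(0.3))   # 270
```

In this model a different phase can be chosen for each channel, which is how a receiver could absorb arbitrary channel-to-channel skew while keeping all outputs in one clock domain.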
When using source-synchronous clocking, the data stream can be automatically byte-aligned. For example, if the data stream is eight times as fast as the clock, the most significant bit (MSB) of each byte is the data transmitted during the third bit period after the clock. This relationship holds because skew between clock and data is limited. There is no limit on skew between clock and data in a CDS system. Therefore, the designer cannot use the relationship between clock and data to byte-align the two signals. However, in a CDS system, a byte alignment pattern is sent to the receiver after the training pattern. The receiving device uses this pattern to byte-align the data.
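Byte alignment from a known pattern can be sketched as a search over the possible bit offsets. The pattern below is hypothetical, since this application note does not specify the one used by APEX II devices, and the function names are illustrative.

```python
# Hypothetical alignment pattern; every rotation of it is distinct,
# so the matching offset is unambiguous.
ALIGN_PATTERN = [1, 1, 1, 1, 0, 0, 0, 0]


def find_byte_offset(bits, pattern=ALIGN_PATTERN):
    """Return the bit offset (0 to len(pattern) - 1) at which the
    alignment pattern starts in the received stream, or None."""
    n = len(pattern)
    for offset in range(n):
        if bits[offset:offset + n] == pattern:
            return offset
    return None


def to_bytes(bits, offset, width=8):
    """Group the stream into MSB-first bytes starting at offset."""
    values = []
    for i in range(offset, len(bits) - width + 1, width):
        value = 0
        for bit in bits[i:i + width]:
            value = (value << 1) | bit
        values.append(value)
    return values


# Received stream: three stray bits of skew, two copies of the
# alignment pattern, then one payload byte (0xA5).
stream = [0, 0, 0] + ALIGN_PATTERN * 2 + [1, 0, 1, 0, 0, 1, 0, 1]
offset = find_byte_offset(stream)
print(offset)                                      # 3
print([hex(v) for v in to_bytes(stream, offset)])  # ['0xf0', '0xf0', '0xa5']
```

Once the offset is known, the receiver applies the same offset to all subsequent data on that channel.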
It takes only a few clock cycles to transmit and process this training and byte-alignment sequence, and this is performed once upon system power-up. If multiple transmitting devices are on the same board, they are subject to the same voltage and temperature variation, so skews between them will not vary and retraining is not necessary. All transmitting devices send the training pattern simultaneously so that the receiver can self-adjust for all skews simultaneously. However, if the transmitting devices are on different boards or subsystems, they may experience different voltage and temperature variation, and the design may need to periodically resend the training pattern depending on the variation that the system sees. Although additional clock cycles are necessary to resend the training pattern, a CDS system is still more efficient than CDR encoding schemes.
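The efficiency comparison can be made concrete with a rough calculation. The training length and retraining interval below are hypothetical values chosen for illustration; this application note states only that training takes a few clock cycles.

```python
EIGHT_B_TEN_B_EFFICIENCY = 8 / 10  # fixed overhead of 8B/10B encoding


def cds_efficiency(training_cycles, retrain_interval_cycles):
    """Fraction of cycles left for payload when training_cycles out of
    every retrain_interval_cycles are spent resending the training and
    byte-alignment patterns."""
    if retrain_interval_cycles <= 0:
        raise ValueError("retrain_interval_cycles must be positive")
    return 1.0 - training_cycles / retrain_interval_cycles


# Hypothetical: 32 training cycles resent once every million cycles.
eff = cds_efficiency(training_cycles=32, retrain_interval_cycles=1_000_000)
print(f"CDS with periodic retraining: {eff:.6f}")
print(f"8B/10B-encoded CDR:           {EIGHT_B_TEN_B_EFFICIENCY:.6f}")
```

In this model, periodic retraining would have to consume 20 percent of all cycles before a CDS link became as inefficient as an 8B/10B-encoded one.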
CDS System Applications
CDS improves system efficiency in many ways. It can correct for skew that cables and connectors introduce to data channels. CDS also adds flexibility to overall system designs. Two examples are implementing a switched backplane and breaking up large designs into multiple devices.
Many systems, including communications and storage systems, incorporate a backplane to transmit data from one subsystem to another. Historically, these designs have used a shared backplane (such as PCI). However, the need for faster data transfer has revealed limitations of this approach. A shared backplane can only support one transaction at a time, and the bus speed cannot increase fast enough to support the data requirements.
The switched backplane approach is a solution to higher data transfer requirements. Rather than sharing a common bus, each card communicates on a point-to-point link to a master switch. The switch transfers the data to the destination point. Differential I/O standards are well-suited to this architecture, as each point-to-point link can run at very high speeds. Furthermore, since the bus is not shared, multiple transactions can be executed simultaneously, as shown in Figure 4.
Figure 4. Switched Backplane Application
[Diagram labels: APEX II Device 1, APEX II Device 2, APEX II Device 3, APEX II Device 4, Clock Signal]
With source-synchronous clocking, every point-to-point link must have its own clock. The master switch must implement multiple clock domains and manage data and clock skew across the backplane. CDS is a good solution to these concerns because all cards use a system clock. The master switch can use CDS to correct for any skews caused by system clock skew, device-to-device variation, or data skew. Using CDS for this architecture simplifies the overall system design by keeping the entire system synchronized to one clock. The CDS circuitry in the APEX II device family provides the flexibility necessary to easily implement a switched backplane system.
Another example of a CDS application is design partitioning. Many complex designs, such as packet processing, cannot easily fit into one device or are partitioned for other reasons. For example, while software running on network processors is useful for general packet processing, ASICs or programmable logic devices (PLDs) are often used for accelerating specific functions. Network processors and PLDs implement different functions within the system. For example, classification and queuing control are important to assure quality of service, and encryption is important for security. These functions can be implemented at a higher speed in a PLD than in a network processor.
The size of these functions may prevent them all from being incorporated into one PLD. Historically, partitioning these functions into multiple devices has resulted in very inefficient use of the devices. Each individual device would usually use up all its I/O pins before using all of its logic. High-speed differential interconnects in conjunction with CDS enable very high bandwidth data transfer from device to device, so the required chip-to-chip data transfer can be implemented using only a few I/O pins.
Figure 5 shows a block diagram of an OC-192 data path. In this design, the packet processing is divided between a network processor and multiple PLDs. CDS is used to implement high-speed data transfer among the multiple devices that make up the packet-processing function.
Figure 5. OC-192 Design Partitioning
[Diagram labels: SRAM and SDRAM Blocks; CDR Circuitry; PMD Device (1); Transceiver; Framer; Packet Processing; Switch Fabric; Packet Processing CDS System; Host Processor; APEX II Device]
Note to Figure 5:
(1) PMD: physical medium dependent.
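The pin savings from high-speed differential links can be estimated with simple arithmetic. In the sketch below, the 1-Gbps channel rate comes from the APEX II figures quoted in the introduction and the OC-192 line rate is approximately 9.953 Gbps; the 100-Mbps single-ended rate is a hypothetical comparison point, and the model ignores clock and control pins.

```python
import math


def pins_needed(bandwidth_gbps, rate_per_channel_gbps, wires_per_channel):
    """Data pins required to carry bandwidth_gbps between two devices."""
    channels = math.ceil(bandwidth_gbps / rate_per_channel_gbps)
    return channels * wires_per_channel


OC192_GBPS = 9.953  # approximate OC-192 line rate

# Differential channels at 1 Gbps each, two wires per channel.
print(pins_needed(OC192_GBPS, 1.0, wires_per_channel=2))   # 20

# Hypothetical single-ended pins at 100 Mbps each.
print(pins_needed(OC192_GBPS, 0.1, wires_per_channel=1))   # 100
```

Under these assumptions an OC-192 data stream crosses a device boundary on about a fifth of the pins, leaving more I/O available for the rest of the partitioned design.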
Because CDS enables easier design partitioning, it is also useful for ASIC prototyping. In many cases, a designer takes advantage of the flexibility and easy reconfiguration of programmable logic to prototype a design, and then moves a very large or extremely high-volume design to an ASIC. Since the ultimate capacity of a standard-cell device is larger than that of a programmable logic device, the designer will partition this design into multiple PLDs. As discussed earlier, this may lead to inefficient use of the logic within these devices. By using CDS, the designer can implement the required data transfer between devices and use the full logic capacity of the PLDs.
Summary
Increasing demand for data services has driven higher bandwidth requirements for system designers. Differential signaling has been successfully used to address this need. CDS builds on the success of differential signaling, giving designers more flexibility in the design of their boards and of their overall systems. By using CDS in APEX II devices, designers can enhance their systems to provide flexibility and performance.