# **Optimum Data Flow Solutions for High-Performance Systems**

The most difficult problem for designers of high-performance, real-time digital signal processing systems is simply moving data within the system. Invariably, the problem is caused by data throughput limitations.

Many traditional methods of handling I/O tasks are no longer viable because they can't keep up with the speeds of new DSP and RISC processors. To improve system efficiencies, several new techniques to support fast data movement throughout the system have been created. Direct connections using private, high-speed data paths like those provided by mezzanines, front panel serial and parallel interfaces, and backplane fabrics can eliminate or significantly reduce data flow bottlenecks and arbitration for shared resources.

### New Processor I/O Demands

The ability to take maximum advantage of the newest RISC and DSP processors in high-performance, open-architecture embedded systems depends entirely on how well connected they are to each other and to system I/O devices.

During the last few years, the industry has witnessed the introduction of several fast DSP and RISC processors with impressive benchmarks for popular algorithms and sophisticated hardware engines for caching, moving data, and addressing. These processors also support external data bus speeds that outstrip virtually every backplane available.

To better understand the scope of supporting these new DSP and RISC devices, Figure 1 shows a few metrics for data transfer speeds. As an example, the Texas Instruments TMS320C6203 DSP executes eight 32-bit instructions in parallel within a 3.33 nsec instruction cycle time, yielding 2400 MIPS operation. An on-chip multiple-path ALU and four-channel DMA controller are coupled to support extremely high-speed I/O peripherals. With dual 32-bit parallel data buses, it can move data to I/O devices at a combined speed of 1200 MB/sec!

The new Motorola G4 AltiVec MPC7410 PowerPC running with a 400 MHz clock is capable of performing 20 operations per cycle. Using its external 64-bit data bus, it can handle peak I/O rates to peripherals at 1067 MB/sec.

For these processors, the speed of the computing engine may no longer be the critical path in the real-time equation. Instead, some of these recent gains in computational power can be quickly sacrificed due to bottlenecks in moving data to and from peripheral devices.
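The peak figures quoted above for the C6203 and the MPC7410 follow from simple arithmetic on clock rate and bus width. The short Python sketch below reproduces two of them; the function names are illustrative only, and the clock rates are rounded from the cycle times given in the text and in Figure 1 (3.33 nsec taken as 300 MHz, 7.5 nsec taken as 133.33 MHz).

```python
def peak_mips(instructions_per_cycle: int, clock_mhz: float) -> float:
    """Peak instruction rate in MIPS: instructions per cycle times clock in MHz."""
    return instructions_per_cycle * clock_mhz


def peak_bus_rate(bus_width_bits: int, bus_clock_mhz: float) -> float:
    """Peak bus rate in MB/sec: bytes per transfer times millions of transfers per second."""
    return (bus_width_bits / 8) * bus_clock_mhz


# TMS320C6203: eight instructions per 3.33 nsec cycle (about 300 MHz).
c6203_mips = peak_mips(8, 300)

# MPC7410: 64-bit external data bus with a 7.5 nsec cycle (about 133.33 MHz).
mpc7410_rate = peak_bus_rate(64, 133.33)

print(f"C6203 peak: {c6203_mips:.0f} MIPS")
print(f"MPC7410 peak I/O: {mpc7410_rate:.0f} MB/sec")
```

These are peak rates that assume one transfer on every bus cycle; sustained rates in a real system will be lower.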

### Mezzanine Buses

Mezzanine buses offer alternative parallel data paths to the common backplane in bus systems and can dramatically improve real-time performance in several ways. First, mezzanine buses can provide a direct, dedicated data path between system peripherals and the processor, so that data transfers can be guaranteed to meet real-time demands. Second, the data transfers on the backplane bus are reduced, making this bus more available for other traffic. And finally, since there can be multiple mezzanine buses within a system, the aggregate data transfer rates can be increased quite dramatically and in a modular manner. A few popular, high-speed mezzanines include PMC (PCI Mezzanine Card) and VIM (Velocity Interface Mezzanine). Figure 2 provides data transfer speeds for some mezzanine standards.

PMC has gained the support of virtually every board manufacturer in the high-end system bus community, first with VMEbus vendors, and now increasingly with Compact PCI vendors. The PMC module is attached to the carrier board using two, three or four 64-pin compact connectors, depending on the application. The PMC specification allows for direct connection through the front panel of the VME board. A separate PMC front panel can accommodate any specialized I/O connectors required by the module. Most PMC modules utilize the 32-bit interface and are capable of moving data in block transfers at 132 MB/sec.

Figure 1. Data transfer speeds for new DSP and RISC processors.

| PROCESSOR                | 21160               |                     | C6201               | C6203                 | MPC7410               |
|--------------------------|---------------------|---------------------|---------------------|-----------------------|-----------------------|
| No. of Buses             | 1                   | 1                   | 1                   | 2                     | 1                     |
| Address Bus Width        | 32                  | 24                  | 24                  | 24                    | 32                    |
| Data Bus Width           | 64                  | 32                  | 32                  | 32+32                 | 64                    |
| Bus Cycle Time           | 15 nsec<br>(66 MHz) | 6 nsec<br>(167 MHz) | 5 nsec<br>(200 MHz) | 3.3 nsec<br>(300 MHz) | 7.5 nsec<br>(133 MHz) |
| Bus I/O Rate<br>(MB/sec) | 528                 | 667                 | 800                 | 1200                  | 1067                  |




Leading the performance race, the VIM mezzanine specification was developed to meet the needs of new high-speed processors like the Texas Instruments C6000 and the Motorola PowerPC. The VIM specification, developed by Pentek, provides a dedicated 400 MB/sec data channel to each of four processors on a quad processor 6U VMEbus board. Four 160-pin processor node connectors allow peripherals to deliver data directly to the private resources of each processor and include three types of electrical interface: high-speed parallel data, serial data, and control and status.

| MEZZANINE                          | IND. PACK | MIX | PMC     | VIM  |
|------------------------------------|-----------|-----|---------|------|
| Data Bus Width<br>(bits)           | 16        | 32  | 32/64   | 32   |
| Bus Cycle Time<br>(nsec)           | 250       | 100 | 30      | 10   |
| Bus Cycle Rate<br>(MHz)            | 4         | 10  | 33      | 100  |
| I/O Bandwidth<br>(MB/sec)          | 8         | 40  | 132/264 | 400  |
| Buses per VME Slot                 | 4         | 1   | 1       | 4    |
| I/O Bandwidth per Slot<br>(MB/sec) | 32        | 40  | 132/264 | 1600 |

Figure 2. Data transfer speeds for mezzanines.
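The bandwidth rows in Figure 2 can be derived from the width, cycle rate, and buses-per-slot rows. The sketch below recomputes them as a consistency check; the `Mezzanine` class and its field names are hypothetical, introduced only for this illustration, and PMC is entered twice to cover its 32-bit and 64-bit variants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mezzanine:
    name: str
    width_bits: int       # data bus width from Figure 2
    cycle_rate_mhz: int   # bus cycle rate from Figure 2
    buses_per_slot: int   # independent buses per VME slot from Figure 2

    @property
    def bandwidth(self) -> int:
        """Peak I/O bandwidth of one bus in MB/sec."""
        return self.width_bits // 8 * self.cycle_rate_mhz

    @property
    def bandwidth_per_slot(self) -> int:
        """Aggregate peak bandwidth of all buses in one VME slot in MB/sec."""
        return self.bandwidth * self.buses_per_slot


MEZZANINES = [
    Mezzanine("IND. PACK", 16, 4, 4),
    Mezzanine("MIX", 32, 10, 1),
    Mezzanine("PMC (32-bit)", 32, 33, 1),
    Mezzanine("PMC (64-bit)", 64, 33, 1),
    Mezzanine("VIM", 32, 100, 4),
]

for m in MEZZANINES:
    print(f"{m.name:<13} {m.bandwidth:>4} MB/sec per bus, "
          f"{m.bandwidth_per_slot:>4} MB/sec per slot")
```

The per-slot column is where the modular advantage shows: four independent 400 MB/sec VIM channels in a single slot give 1600 MB/sec aggregate.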

Some VIM module functions currently available include digital receivers and transmitters, high-speed A/D converters, and FPDP and RACEway interfaces. VIM modules can also be custom-designed using the design specification available free of charge from Pentek. In all configurations, custom or off-the-shelf, the processor board and attached VIM modules occupy the same single VMEbus slot. The obvious benefits are flexibility, higher density, lower cost and much faster I/O paths.

### Front Panel I/O

As an alternative to using backplanes and mezzanines, several front panel data interconnect schemes have evolved which serve to reduce backplane traffic by sending data between boards using high-speed parallel and serial links. These include Front Panel Data Port (FPDP) and front-panel serial ribbon cables.

FPDP, an ANSI standard, provides a 32-bit parallel front panel bus between two or more VME boards. It is a unidirectional, synchronous bus providing well-defined data transfer speeds and delivers data at either 80 MB/sec or 160 MB/sec. Now in use by dozens of manufacturers, FPDP has proven itself as a simple, fast and inexpensive means for moving high-speed data between system components. FPDP II will deliver 400 MB/sec performance.

### Backplane I/O

Front panel cable buses solve many types of interconnection problems for the system designer but suffer from the complication of having to fabricate, document, install and maintain cables of various types. For applications where high availability is a significant issue, front panel connections of any type are undesirable.

A better solution for very high-speed I/O movement in such applications is RACEway, created by Mercury Computer. RACEway is a backplane interconnection fabric that allows multiple boards to send and receive data simultaneously at rates far exceeding the basic VMEbus specification.

RACEway uses a separate overlay printed circuit board assembly which is attached to the VMEbus backplane, using mating sockets that engage the 64 user-defined tail pins on the P2 connectors. It joins as few as two and as many as twenty VME slots. Each RACEway path operates synchronously at a clock rate of 40 MHz, providing a data transfer rate of 160 MB/sec. RACE++ technology, just now becoming available, operates at a clock speed of 66.66 MHz and delivers 267 MB/sec transfers.

### Putting it All Together

With these high-speed interfaces available to meet the needs of the new generation of DSP and RISC chips, we will now show how they can be implemented on a multiprocessor board. As shown in Figure 3, a very high-performance one-slot data acquisition and signal processing system can be built using the VIM mezzanines and the RACEway interface.

The Quad C6203 DSP processor board is equipped with two VIM mezzanine modules. Each is a dual 100 MHz A/D module, so together they provide four channels of 100 MHz 12-bit A/D conversion and deliver four parallel data streams of 200 MB/sec each into the four processor nodes. Using dedicated interprocessor FIFOs, preprocessed data is transferred across the processor nodes for final processing.

Finally, packets of processed data are sent through the RACE++ interface for delivery to a RACEway target via the 267 MB/sec backplane. Each of these data transfers takes place using no shared resources, completely eliminating data flow conflicts normally found in more traditional architectures.
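A rough data flow budget for this configuration can be worked out from the numbers already given. The sketch below is only an illustration under stated assumptions: each 12-bit sample is assumed to occupy a 16-bit (2-byte) word, which is what the quoted 200 MB/sec per channel implies, all four channels are assumed to run continuously, and all output is assumed to leave over a single RACE++ path. The constant names are hypothetical.

```python
ADC_SAMPLE_RATE_MHZ = 100    # per-channel A/D sample rate from the text
BYTES_PER_SAMPLE = 2         # assumption: 12-bit sample carried in a 16-bit word
ADC_CHANNELS = 4             # two dual-channel A/D VIM modules
VIM_NODE_MB_PER_SEC = 400    # dedicated VIM channel per processor node
RACEPP_MB_PER_SEC = 267      # RACE++ backplane transfer rate

# Input rate per channel and for the whole board, in MB/sec.
per_channel = ADC_SAMPLE_RATE_MHZ * BYTES_PER_SAMPLE
aggregate_input = per_channel * ADC_CHANNELS

# Fraction of each node's VIM channel consumed by its A/D stream.
vim_utilization = per_channel / VIM_NODE_MB_PER_SEC

# Factor by which processing must shrink the data to fit one RACE++ path.
min_reduction = aggregate_input / RACEPP_MB_PER_SEC

print(f"Per-channel input:  {per_channel} MB/sec")
print(f"Aggregate input:    {aggregate_input} MB/sec")
print(f"VIM utilization:    {vim_utilization:.0%} per node")
print(f"Required reduction: {min_reduction:.1f}x before RACE++ output")
```

Under these assumptions each A/D stream uses half of its node's VIM channel, and the DSPs must reduce the aggregate input by roughly a factor of three (for example through filtering or decimation) before the results can leave over the backplane fabric.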



Figure 3. Quad C6203 DSP with VIM mezzanines for A/Ds occupies one slot.

PENTEK