

## Spring 2002 • Vol. 11, No. 1

#### Pentek, Inc.

One Park Way, Upper Saddle River, NJ 07458 Tel: (201) 818-5900 · Fax: (201) 818-5904 email: pipeline@pentek.com http://www.pentek.com

© 2002 Pentek, Inc. Newsletter Editor: TNT Resources Trademarks are properties of their respective owners. Specifications are subject to change without notice.

A quarterly publication for engineering system design and applications.

# Processor Comparison: TI C6000 DSP and Motorola G4 PowerPC

With Motorola's introduction of the fourth generation (G4) PowerPC with AltiVec™ technology, digital signal processing applications are beginning to migrate from the traditional DSP environment to the RISC environment. While Motorola has been advancing the processing power of the PowerPC, Texas Instruments has been introducing new members of its C6000 family that add more speed and flexibility to an already impressive DSP portfolio.

This article compares and contrasts hardware, software and development tools and highlights strengths and weaknesses of each processor. Two application examples, one for each processor, show how their respective strengths can be utilized based on the application.

#### **Inside the Chips**

Two unique technologies provide the power that makes these processors capable of handling the demanding requirements of many real-time, high-throughput, and calculation-intensive applications.

The C6000 DSPs are the first to use TI's high-performance VelociTI<sup>™</sup> architecture with its advanced Very Long Instruction Word (VLIW) engine for instruction-level parallelism. As shown in Figure 1, the core consists of eight functional units operating on two register files. Each of the eight units uses a 32-bit instruction, so a 256-bit VLIW is fetched on every clock cycle. To take advantage of this powerful resource, the Texas Instruments optimizing C compiler and optimizing assembler tools are designed to maximize the number of units executing useful operations during each cycle.
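The fetch-width arithmetic above can be sketched in a few lines. This is only an illustration: the unit count and instruction width come from the article, while the clock frequency in the demo is a hypothetical value chosen for the example, and the function names are ours.

```python
# Sketch: size of one VLIW fetch and the peak issue rate it implies.
FUNCTIONAL_UNITS = 8    # from the article
INSTRUCTION_BITS = 32   # from the article


def fetch_packet_bits(units: int = FUNCTIONAL_UNITS,
                      bits: int = INSTRUCTION_BITS) -> int:
    """Width in bits of the instruction word fetched on each clock cycle."""
    return units * bits


def peak_instructions_per_second(clock_hz: float,
                                 units: int = FUNCTIONAL_UNITS) -> float:
    """Upper bound: every unit executes one instruction on every cycle."""
    return clock_hz * units


if __name__ == "__main__":
    example_clock_hz = 250e6  # hypothetical clock, not a figure from the article
    print(fetch_packet_bits(), "bits per fetch")
    print(peak_instructions_per_second(example_clock_hz) / 1e6, "MIPS peak")
```

The peak figure is reached only when the compiler or assembler finds useful work for all eight units in a cycle, which is why the optimizing tools matter so much on this architecture.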

Motorola's G4 PowerPC architecture combines the existing RISC design found in prior generations with the new AltiVec vector parallel processing engine. Figure 2 is a simplified block diagram of the main processor blocks. The AltiVec unit operations are performed on multiple data elements by a single instruction. This is often referred to as SIMD (single instruction, multiple data) parallel processing. The unit operates on 128-bit data and performs both fixed- and floating-point functions. The AltiVec unit can process several data formats including 8-, 16- and 32-bit signed and unsigned integers, and 32-bit IEEE floating-point words.
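As a rough illustration of the SIMD idea, the sketch below counts how many elements of each width fit in one 128-bit vector and emulates a single vector add in plain Python. The helper names are hypothetical, and the wraparound (modulo) behavior is an assumption made to keep the example short; it is a model of the concept, not AltiVec code.

```python
# Sketch: element lanes in a 128-bit vector and an emulated lane-wise add.
VECTOR_BITS = 128  # AltiVec operand width, from the article


def lanes(element_bits: int, vector_bits: int = VECTOR_BITS) -> int:
    """Number of elements of the given width processed by one instruction."""
    if element_bits <= 0 or vector_bits % element_bits:
        raise ValueError("element width must divide the vector width")
    return vector_bits // element_bits


def simd_add(a, b, element_bits: int):
    """Add two vectors lane by lane, wrapping each unsigned lane on overflow."""
    n = lanes(element_bits)
    if len(a) != n or len(b) != n:
        raise ValueError(f"expected {n} elements per vector")
    mask = (1 << element_bits) - 1
    return [(x + y) & mask for x, y in zip(a, b)]


if __name__ == "__main__":
    for width in (8, 16, 32):
        print(f"{width}-bit elements: {lanes(width)} lanes")
    print(simd_add([1] * 8, [2] * 8, 16))
```

One instruction therefore covers 16, 8 or 4 elements depending on the data format, which is where the vector unit gets its throughput advantage on narrow data.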

## Fixed- vs. Floating-Point

When choosing a processor, a fundamental question is whether the application is best served by a fixed-point or a floating-point processor. C6000-series DSPs are available in both fixed- and floating-point varieties. For instance, in the C6201 and C6203, all eight functional units are fixed-point. In the C6701, six of the eight units are floating-point. The 7410 PowerPC contains both fixed- and floating-point units.

Because of their lower cost and power, fixed-point processors are best suited for high volume, heavily embedded applications. For fixed-point processors, the additional code complexity required for scaling may be offset by the lower cost of the silicon.

Floating-point processors are best for applications that require extensive floating-point arithmetic, or in custom applications where the code is likely to change and the user can exploit the faster development effort.
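To make the scaling overhead of fixed-point code concrete, here is a minimal sketch of a multiply in Q15, a common 16-bit fractional format. The choice of Q15, the rounding step and the function names are assumptions for illustration; the article does not prescribe a particular format. In floating point the same operation is simply `a * b`.

```python
# Sketch: the scaling a fixed-point multiply needs (Q15 chosen as an example).
Q = 15
Q15_MAX = (1 << Q) - 1    # 32767
Q15_MIN = -(1 << Q)       # -32768


def saturate(value: int) -> int:
    """Clamp a result to the 16-bit signed range."""
    return max(Q15_MIN, min(Q15_MAX, value))


def to_q15(x: float) -> int:
    """Quantize a value in [-1, 1) to Q15, saturating at the limits."""
    return saturate(int(round(x * (1 << Q))))


def from_q15(v: int) -> float:
    """Convert a Q15 integer back to a float for inspection."""
    return v / (1 << Q)


def q15_mul(a: int, b: int) -> int:
    """Multiply two Q15 values: the raw product is Q30, so round and rescale."""
    product = a * b + (1 << (Q - 1))   # add half an LSB for rounding
    return saturate(product >> Q)      # shift back to Q15 and clamp


if __name__ == "__main__":
    half = to_q15(0.5)
    print(from_q15(q15_mul(half, half)))
```

Every multiply carries the rescale and the overflow check, and the programmer has to track the format of each intermediate value. That bookkeeping is the extra development effort a floating-point part removes.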

## System I/O Requirements

As processor cores become faster, the main bottleneck to performance quickly becomes the paths for getting data on and off the chip. Many applications require sustained high-speed data streams to maintain real-time processing. In most embedded systems, the I/O requirements are as critical as the processing requirements. Whether the core is a C6000 or a PowerPC, the same general guidelines apply to most high-throughput, real-time embedded applications:

• A direct path to each processor helps guarantee that data will not be stalled because of routing through intermediate resources.

• Paths that are shared by more than one processor can be problematic when one data channel must wait for another channel to complete its transfer.

• In closed-loop systems, low latency in the control and feedback paths is essential for maintaining real-time performance.



Figure 2. G4 PowerPC Core Diagram




• A FIFO in the data path can buffer the data and keep it continuously moving in or out of the processor.
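The buffering role of a FIFO can be estimated with a simple model: data arrives in bursts at one rate and is drained continuously at a lower rate. The sketch below computes the depth needed so the FIFO never overflows. The model (FIFO empty at the start of each burst, reader draining whenever data is present), the function name and all numbers in the demo are assumptions for illustration, not figures from any Pentek board.

```python
# Sketch: FIFO depth needed to absorb a bursty writer with a steady reader.
def fifo_depth_needed(burst_rate: float, drain_rate: float,
                      burst_time: float, period: float) -> float:
    """Return the peak FIFO occupancy in bytes.

    burst_rate: input rate during a burst (bytes/s)
    drain_rate: rate at which the reader empties the FIFO (bytes/s)
    burst_time: duration of each burst (s)
    period:     time from the start of one burst to the next (s)
    """
    if drain_rate * period < burst_rate * burst_time:
        raise ValueError("reader cannot keep up with the average input rate")
    # While the burst lasts, the FIFO fills at the difference of the two rates.
    return max(0.0, (burst_rate - drain_rate) * burst_time)


if __name__ == "__main__":
    # Hypothetical example: 100 MB/s bursts of 1 ms every 10 ms, 20 MB/s drain.
    depth = fifo_depth_needed(100e6, 20e6, 1e-3, 10e-3)
    print(f"FIFO depth needed: {depth / 1e3:.0f} kbytes")
```

The reader only has to sustain the average rate, while the FIFO absorbs the difference during each burst.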

#### C6203 Processor Node Design

Figure 3 shows a processor node design for the C6203 that is optimized for I/O throughput. It is one of four identical nodes on the Pentek Model 4292 Quad C6203 VME processor board. Each node has the following resources:

• VIM Interface, which provides a control/status path, a fast BI-FIFO interface for data streaming, and presents two of the C6203's serial ports to I/O modules.

• 8 Mbytes of external SDRAM for fast access to code and data.

• 1 Mbyte FLASH for processor boot code and user-installed programs.

• BI-FIFO for interprocessor communication allows each processor to transfer data to its two adjoining processors.

• Bus I/O BI-FIFO allows data to be shared by any of the board's global resources.

• PCI Interface for the processor to access the PCI Bus resources and master the VMEbus interface.

#### PowerPC Node Design

Figure 4 shows a similar node built around the 7410. It is one of four identical nodes on the Pentek Model 4294 Quad G4 PowerPC VME processor board. Each node has the following resources:

• Node Controller, which is the major difference between this and the previous node; the PowerPC requires a node controller interface chip to connect external resources.

• VIM Interface, which contains the same resources as the C6203 node; the node controller provides all the necessary connections to the VIM.

• 2 Mbyte L2 Cache to supplement the PowerPC on-chip cache.

• 32 Mbytes of local SDRAM for fast code and data storage.

• 4 Mbyte FLASH for processor boot code and user-installed programs.

• 64-bit/66 MHz PCI Bus to provide communication to all global resources.

• Another 64-bit/66 MHz PCI Bus to provide interprocessor communication.

Whether using a C6203 or a PowerPC, the architectures shown provide exceptional data movement capability, which is a critical requirement in many high-speed, real-time applications.

A complete list of processor boards and VIM I/O modules may be found on Pentek's website at <u>www.pentek.com</u>.

# Operating Systems and Code Generation Tools

DSPs, in general, have been used successfully in applications both with and without an operating system (OS). In applications where the data enters and leaves the processor at consistent rates and the processing tasks can be performed in a known amount of time, the application may be written with interrupt-driven code and may not need an OS to schedule tasks.
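One way to judge whether an interrupt-driven design without an OS is feasible is to compare the worst-case processing time per data block against the interval between blocks. The sketch below does that check; the function name, block size, sample rate and task times are all hypothetical values chosen for illustration.

```python
# Sketch: does the per-block processing fit inside the block arrival period?
def real_time_budget(block_samples: int, sample_rate_hz: float,
                     worst_case_task_times_s):
    """Return (fits, utilization) for one block of input data.

    fits is True when the summed worst-case task times do not exceed the
    time it takes for the next block to arrive.
    """
    block_period_s = block_samples / sample_rate_hz
    busy_s = sum(worst_case_task_times_s)
    return busy_s <= block_period_s, busy_s / block_period_s


if __name__ == "__main__":
    # Hypothetical: 1024-sample blocks at 1 MHz, two tasks per block.
    fits, load = real_time_budget(1024, 1e6, [0.3e-3, 0.4e-3])
    print(f"fits: {fits}, processor load: {load:.0%}")
```

When the load is comfortably below 100% and does not vary from block to block, a simple interrupt service routine per block is often enough. Variable workloads or many competing tasks are where a scheduler starts to pay for itself.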

Tools to develop code on the C6000 are available from Texas Instruments. Its premier development tool is Code Composer Studio, a complete code development environment that runs on Windows workstations. Pentek's ReadyFlow Board Support Libraries, fully compatible with TI's Code Composer Studio and DSP/BIOS, offer high-level C-callable functions to speed development.

The PowerPC is most commonly used with an OS. The majority of G4s deployed are in Apple computers running the Mac OS. In embedded applications, the most common OS is Wind River's VxWorks. Other operating systems are also available, some of them based on Linux.

The most common software development tools for embedded applications on the PowerPC are contained in Wind River's Tornado package. Pentek offers a complete Board Support Package (BSP) with Tornado-compatible data transfer and control functions for each specific board architecture, as well as device drivers for peripherals and interfaces.



Figure 3. C6000 Node Design Diagram

## **Radar Data Acquisition System**

In this system example, we will look at a 4-channel radar data acquisition system used to capture the outgoing pulse for storage or processing. The radar pulse width is 4 msec and it repeats every 40 msec. Since the pulse bandwidth is 32 MHz, we need to sample it with an A/D converter running at 80 MHz. With two bytes per sample during the pulse, we need to acquire data at a peak rate of 160 megabytes per second for each channel. However, since we need to capture data only while the pulse lasts, and since it lasts for only 10% of the time, the average data rate is one-tenth the peak rate. The system must be able to handle both peak and average rates.
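The rate figures above follow directly from the pulse parameters. A short sketch of the arithmetic, using the article's numbers with function names of our own choosing:

```python
# Sketch: peak and average data rates for one radar channel.
SAMPLE_RATE_HZ = 80_000_000    # A/D sample rate, from the article
BANDWIDTH_HZ = 32_000_000      # pulse bandwidth, from the article
BYTES_PER_SAMPLE = 2           # from the article
PULSE_WIDTH_MS = 4             # from the article
PULSE_PERIOD_MS = 40           # from the article


def satisfies_nyquist(sample_rate_hz: int, bandwidth_hz: int) -> bool:
    """The sample rate must be at least twice the signal bandwidth."""
    return sample_rate_hz >= 2 * bandwidth_hz


def peak_rate(sample_rate_hz: int, bytes_per_sample: int) -> int:
    """Bytes per second produced while the pulse is being captured."""
    return sample_rate_hz * bytes_per_sample


def average_rate(peak: int, pulse_width_ms: int, pulse_period_ms: int) -> float:
    """Peak rate scaled by the fraction of time the pulse is present."""
    return peak * pulse_width_ms / pulse_period_ms


if __name__ == "__main__":
    peak = peak_rate(SAMPLE_RATE_HZ, BYTES_PER_SAMPLE)
    avg = average_rate(peak, PULSE_WIDTH_MS, PULSE_PERIOD_MS)
    print("Nyquist satisfied:", satisfies_nyquist(SAMPLE_RATE_HZ, BANDWIDTH_HZ))
    print(f"peak {peak / 1e6:.0f} MB/sec, average {avg / 1e6:.0f} MB/sec")
```

Sampling at 80 MHz leaves margin above the 64 MHz minimum that a 32 MHz bandwidth requires.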

The system shown in Figure 5 uses two Pentek Model 6231 VIM-2 modules, each a 2-channel 80 MHz A/D converter, attached to a Pentek Model 4292 Quad C6203 VME processor board with a RACE++ interface.

Looking at just one of the four channels, the 160 MB/sec stream from the A/D is sent across the VIM interface into the processor. Both sides of the VIM FIFO can easily handle the peak rate and the processor uses its second bus and DMA controller to store the data into the local SDRAM, which can operate at up to 600 MB/sec. After capture, data is delivered to the RACE++ interface at up to 264 MB/sec. Therefore, even with four channels at an average rate of 16 MB/sec each, there's still enough headroom.

In this application, we have taken good advantage of the VIM interface with the high-speed BI-FIFOs and the local fast memory of each processor to capture the 640 kbytes of real-time data storage required for each pulse. Finally, we've used the four Bus I/O FIFOs to buffer the data for backplane I/O.
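The storage and headroom figures for this design can be checked with a few lines of arithmetic. The constants below are the article's numbers; the function names and the idea of expressing headroom as a capacity-to-load ratio are ours.

```python
# Sketch: per-pulse storage and path headroom for the radar capture system.
PEAK_RATE = 160_000_000       # bytes/s per channel during the pulse
AVERAGE_RATE = 16_000_000     # bytes/s per channel, averaged over time
PULSE_WIDTH_MS = 4
CHANNELS = 4
SDRAM_RATE = 600_000_000      # bytes/s, local SDRAM
RACEPP_RATE = 264_000_000     # bytes/s, RACE++ interface


def bytes_per_pulse(peak_rate: int, pulse_width_ms: int) -> int:
    """Data captured per channel for one pulse."""
    return peak_rate * pulse_width_ms // 1000


def headroom(capacity: int, load: int) -> float:
    """Ratio of what a path can carry to what it is asked to carry."""
    return capacity / load


if __name__ == "__main__":
    print(bytes_per_pulse(PEAK_RATE, PULSE_WIDTH_MS) // 1000, "kbytes per pulse")
    print("SDRAM headroom at peak:", headroom(SDRAM_RATE, PEAK_RATE))
    print("RACE++ headroom, all channels:",
          headroom(RACEPP_RATE, CHANNELS * AVERAGE_RATE))
```

Both ratios are well above 1, which is the headroom the text refers to: the SDRAM absorbs the peak rate on each node, and the backplane interface only has to carry the combined average.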



## Multichannel Receiver System

This system is an HF receiver capable of locating and downconverting 32 radio channels, each with up to 2.0 MHz bandwidth. All channels must be demodulated, decoded, and analyzed using floating-point calculations. Results are to be communicated across the backplane to an embedded host processor.

The system shown in Figure 6 uses a Pentek Model 6230 VIM-4 module, a 32-channel digital receiver, attached to a Pentek Model 4294 Quad G4 PowerPC VME processor board. In addition to the 32 narrowband receivers, Model 6230 includes four 80 MHz 14-bit A/D converters and two configurable FPGAs.

The wideband A/D data may be brought directly to any of the processors, so they can perform an FFT to detect the frequencies of signals anywhere in the HF band. To support this bandwidth, each path must deliver data to the processors at 160 MB/sec. Signal information can be easily communicated across the four processors via the interprocessor PCI Bus. Once the signals are located, the 32 receiver channels are tuned to their frequencies, so the signals can be demodulated by the PowerPCs.

Each narrowband signal represents a data rate of 10 MB/sec. If eight channels are sent to each processor, each processor has to handle 80 MB/sec. The four VIM mezzanine interfaces support both the 160 MB/sec signal detection mode and the 80 MB/sec narrowband mode. The interprocessor PCI Bus handles the tuning data transfers and the backplane PCI Bus handles the output commands to the host.
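The per-processor load in narrowband mode can be sketched as a simple channel assignment. The channel count, processor count and rates are from the article; the round-robin assignment and the function names are assumptions made for illustration, since the article does not say how channels are mapped to processors.

```python
# Sketch: spreading 32 receiver channels across four processors.
NUM_CHANNELS = 32
NUM_PROCESSORS = 4
NARROWBAND_RATE = 10_000_000    # bytes/s per channel, from the article
WIDEBAND_RATE = 160_000_000     # bytes/s per path in detection mode


def assign_channels(num_channels: int, num_processors: int):
    """Assign channel numbers to processors round-robin (an assumed policy)."""
    groups = [[] for _ in range(num_processors)]
    for channel in range(num_channels):
        groups[channel % num_processors].append(channel)
    return groups


def processor_load(channels, rate_per_channel: int) -> int:
    """Input data rate one processor must sustain for its channels."""
    return len(channels) * rate_per_channel


if __name__ == "__main__":
    for cpu, chans in enumerate(assign_channels(NUM_CHANNELS, NUM_PROCESSORS)):
        load = processor_load(chans, NARROWBAND_RATE)
        print(f"processor {cpu}: {len(chans)} channels, {load / 1e6:.0f} MB/sec")
```

Because the narrowband load is half the wideband detection rate, a path sized for the 160 MB/sec detection mode covers both modes.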

For more information, visit: www.pentek.com/dspcentral/powerpc/articles.cfm

# **Upcoming Seminars**

Sign up for Pentek's Online Seminar, "Optimizing PowerPC Performance for Radar and Wideband Wireless Systems". It's being held on April 10, 2002 at 1:00 p.m. EST. Go to http://seminar.techonline.com/pentekapr1002 to register.

















Model 4996 VxWorks BSP & Drivers provide software developers with a complete library of hardware initialization, control and application functions for the Model 4294 Quad G4 PowerPC processor board. Used in conjunction with Wind River's Tornado II software environment, the Model 4996 speeds application development by providing a high-level API to access all of the board's memory and communication resources and control of its I/O interfaces and VIM modules.

The VxWorks BSP is designed to reduce development time not only during the initial stages of software development, but any time new I/O hardware is added to the system. VIM module drivers, each designed to control the specific hardware features of the I/O interface being used, are built with a consistent style and function naming convention, thereby allowing immediate familiarity with new VIM hardware as it's added. This can greatly shorten the application development learning curve when a system is modified or expanded.

MPI Software Technology's VSI/Pro is an implementation of VSIPL functions optimized specifically for the PowerPC and AltiVec architectures and is designed to be fully compliant with the emerging VSIPL standard. It includes functions for linear algebra, signal processing (FFTs, windowing, filtering and convolution routines), image processing, as well as scalar functionality, and vector and matrix view functionalities. The library is compatible with Wind River's VxWorks real-time operating system and supports the ANSI C and C++ languages.





**Model 4986** 



**Model 4294** 
