

### Fall 2015 • Vol. 24, No. 3

#### Pentek, Inc.

One Park Way, Upper Saddle River, NJ 07458 Tel: (201) 818-5900 • Fax: (201) 818-5904 email: pipeline@pentek.com http://www.pentek.com

#### © 2015 Pentek, Inc.

Trademarks are properties of their respective owners. Specifications are subject to change without notice.

A quarterly publication for engineering system design and applications.

### In This Issue

- **Feature:** The feature article in this issue describes the roles CPUs and FPGAs play in embedded systems and how they are combined in SoC devices. AMBA interfaces and development tool suites also are discussed.

"In spite of their profound differences, CPUs and FPGAs have each staked out roles as essential elements in embedded systems. Recognizing this symbiotic relationship, many vendors now offer SoC (system-on-chip) devices combining CPUs and FPGAs in a single monolithic silicon device."

Rodger Hosking, Pentek Vice President and Co-founder

- **Product Focus: FlexorSets**
- **Q&A with Pentek**

### Free Resources

- Subscribe to The Pentek Pipeline
- To receive automatic notification about a Pentek product's documentation and life cycle, set up a YourPentek profile
- **Technical Handbooks** Helpful information about various technical topics
- **Catalogs** Pentek's segment catalogs highlight products by function




## Advances in CPUs, FPGAs, and SoC Technology

by Rodger Hosking, Pentek, Inc.

In almost every aspect, CPUs and FPGAs are radically different devices. And yet, they often compete for some of the same embedded system tasks. Choosing the best approach depends not only on the capabilities of each device, but also on the often disparate expertise of engineers promoting their respective development methodologies.

To make matters even more complicated, SoC (system-on-chip) technology now combines CPUs and FPGAs within the same device. Here, efficient interoperability becomes essential to meet stringent real-time performance levels. This article presents these challenges along with some strategies for developing successful solutions.

### **Two Different Devices**

FPGAs are user-configurable hardware logic, while CPUs are fixed arithmetic engines executing user programs. Table 1, below, shows how these considerable differences translate into application tasks and implementation.

One of the latest CPU designs, the ARM Cortex-A72, sports up to four 64-bit ARMv8-A processor cores operating at clock rates up to 2.5 GHz. It targets power-sensitive, high-performance mobile applications, and features a NEON 128-bit SIMD engine for efficient fixed- and floating-point vector processing. Interfaces to other processors and external memory are based on AMBA, discussed later.

|                      | CPU                                                                                        | FPGA                                                                                                             |
|----------------------|--------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------|
| Computation          | Fixed arithmetic engines                                                                   | User-configurable logic, DSP blocks and data flow                                                                |
| Appropriate tasks    | Decision making<br>Complex analysis<br>Lower data rate computation<br>Block-oriented tasks | Compute-intensive algorithms<br>Massively parallel operations<br>Higher data rate computation<br>Streaming tasks |
| I/O                  | Fixed, dedicated I/O ports                                                                 | User-configurable I/O ports                                                                                      |
| Programmability      | Program execution                                                                          | Registers determine modes<br>and define operating parameters                                                     |
| Ease of programming  | C programming simplifies development tasks                                                 | HDL programming mandates<br>hardware awareness                                                                   |
| Maintenance/Upgrades | Less difficult                                                                             | More difficult                                                                                                   |

Table 1. Embedded System Factors for CPUs and FPGAs




The latest FPGAs from Xilinx, such as the UltraScale+, provide over 11,000 DSP engines, the essential building blocks for signal processing algorithms. They are aimed at high-performance embedded computing requirements with configurable interfaces for exotic peripherals and standard resources like DDR4 memory, PCIe Gen 4, and 100 GbE.

These differences drive the typical task assignments shown in Table 1. The complex aspects of high-level decisions and data analysis are usually easier to implement in a CPU. Compute-intensive signal processing or data crunching tasks can take excellent advantage of the numerous DSP blocks found in FPGAs. Common examples, such as FFTs, matrix processing, and digital filtering, can exploit the benefits of thousands of DSP blocks operating in parallel. Furthermore, FPGA hardware surrounding these blocks can be tailored for each application. This includes local data buffers, specialized FIFOs, and optimized interfaces to and from external sensors, storage devices, networks, and system components.
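To make the digital filtering case concrete, here is a minimal Python sketch of a direct-form FIR filter. It is only an illustration: the function name `fir_filter` and the example taps are assumptions introduced here, not part of any vendor library. Each output sample is a sum of multiply-accumulate operations, one per tap. A CPU steps through them in a loop, whereas an FPGA could assign each tap to its own DSP block and evaluate them in parallel.

```python
def fir_filter(samples, taps):
    """Direct-form FIR filter: each output is a sum of tap * delayed-sample products."""
    history = [0] * len(taps)            # delay line, newest sample first
    output = []
    for x in samples:
        history = [x] + history[:-1]     # shift the new sample into the delay line
        # One multiply-accumulate per tap. Software runs these sequentially;
        # FPGA logic could compute all of them in the same clock cycle.
        output.append(sum(t * h for t, h in zip(taps, history)))
    return output


# Hypothetical example: a 4-tap moving-sum filter applied to a unit step.
step_response = fir_filter([1, 1, 1, 1, 1, 1], [1, 1, 1, 1])
```

The delay line and per-tap products map naturally onto registers and DSP blocks, which is why filters of this kind are common candidates for FPGA implementation.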

Choosing between an FPGA and a CPU for a given function is sometimes obvious because of its nature, but other times it could go either way. If so, the deciding vote is often cast for the CPU because a C program is easier to develop, maintain and upgrade. Another important factor: it is often easier to hire a C programmer than an FPGA design engineer!

In spite of their profound differences, CPUs and FPGAs have each staked out roles as essential elements in embedded systems.

### System-on-Chip Devices

Recognizing this symbiotic relationship, many vendors now offer SoC (system-on-chip) devices combining CPUs and FPGAs in a single monolithic silicon device. It is important to note that "SoC" also refers to highly integrated devices that include analog interfaces, video and network ports, human interfaces, as well as RF and wireless interfaces, but not necessarily FPGAs. These SoCs are used extensively for consumer market products such as vehicles, smart phones, tablets, appliances, printers, and entertainment systems.

But to address the toughest requirements, real-time embedded systems often need a much narrower class of SoCs with the extra horsepower of large FPGAs. Leading the industry for such SoCs are Xilinx and Altera.

Xilinx offers the Zynq family of SoCs that combine ARM processors with Xilinx FPGAs. Their latest offering is the Zynq UltraScale+ series, whose processing section includes a Quad-Core ARM Cortex-A53 application processor, a Dual-Core ARM Cortex-R5 real-time processor, and a Mali GPU (graphics processing unit). To match a wide range of embedded applications, the programmable logic section includes a different mix of 16 nm FPGA resources in each of the eleven members of the series. With almost a million logic cells and over 3,500 DSP slices, they deliver significant computational power.

Figure 1. Different Types of Possible AXI Connections for a Software Radio Transceiver

Altera competes with the Stratix 10 family of SoC devices, also using the Quad-Core ARM Cortex-A53 CPU. Altera's latest 14 nm FPGA technology offers ten different resource-balanced versions, one topping the list with over 5 million logic cells and 5,760 DSP blocks. Unlike Xilinx's counterpart, the DSP blocks of Stratix 10 can handle not only single- and double-precision fixed-point operations, but also single-precision IEEE 754 floating-point functions. This allows designers to achieve a much higher dynamic range for sensitive signal processing applications, and saves the often tedious task of optimizing scaling to avoid the saturation and underflow conditions that can plague fixed-point hardware.
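The scaling problem can be illustrated with a short Python sketch. It models 16-bit Q15 fixed-point arithmetic, a format assumed here purely for illustration since the article does not name one, and compares it with IEEE 754 single precision. All helper names are hypothetical.

```python
import struct

INT16_MIN, INT16_MAX = -32768, 32767


def saturate16(value):
    """Clamp an integer to the signed 16-bit range, as saturating hardware would."""
    return max(INT16_MIN, min(INT16_MAX, value))


def q15_multiply(a, b):
    """Multiply two Q15 numbers, where the integer n represents n / 32768."""
    return saturate16((a * b) >> 15)


def to_float32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Saturation: the true sum 40000 does not fit in 16 bits, so it clips.
clipped = saturate16(30000 + 10000)

# Underflow: the product of two small Q15 values falls below one LSB and becomes 0.
small = 100                                   # represents 100 / 32768, about 0.003
fixed_product = q15_multiply(small, small)

# Single-precision floating point keeps the same product as a nonzero value.
float_product = to_float32((small / 32768) ** 2)
```

In the fixed-point case the designer has to rescale intermediate results by hand to keep them between these two limits; the floating-point exponent does that work automatically.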

Because of parallel hardware structures connected directly to I/O ports, FPGAs can process and deliver high-rate continuous streaming data. CPUs are much more effective when processing data blocks in system memory. The advent of SoCs has thus created fundamental interface and data flow inconsistencies between FPGAs and CPUs.
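One common way to bridge the two styles is to collect streaming samples into fixed-size blocks before handing them to software. The Python sketch below models that idea with a generator. The name `blocks_from_stream` and the decision to discard a trailing partial block are assumptions made for illustration; in hardware, a FIFO or similar buffer would play this role.

```python
def blocks_from_stream(stream, block_size):
    """Group a continuous sample stream into fixed-size blocks.

    Only complete blocks are yielded; a trailing partial block is dropped.
    """
    block = []
    for sample in stream:
        block.append(sample)
        if len(block) == block_size:
            yield block          # hand one full block to block-oriented code
            block = []


# Ten streaming samples become two complete blocks of four.
blocks = list(blocks_from_stream(range(10), 4))
```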

### **AMBA Interfaces**

To help resolve this challenge, ARM, Ltd. developed the Advanced Microcontroller Bus Architecture or AMBA nearly two decades ago. Since then, it has been widely adopted as an open, well-documented, license-free interface protocol between CPUs and peripherals, including FPGAs.

One of the most prevalent versions of AMBA is the AXI4 (Advanced eXtensible Interface Rev 4) specification. It presents a comprehensive standard for transferring data between master and slave devices for data widths from 32 to 1024 bits, in burst lengths of 1 to 256 for incrementing bursts. A master and a slave device that both have AXI4-compliant interfaces can be connected together and communicate, regardless of the nature or function of the devices.

Another popular variation is the AXI4-Lite specification, a subset of AXI4 for very simple devices that may not need the extra interface overhead required for full AXI4. Here, the data width is only 32 or 64 bits and the burst length is limited to single transfers. This is ideal for reading and writing to memory-mapped status and control registers, often satisfying the needs of most small peripheral devices.
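As a rough software model of this register-style access, the Python sketch below treats a peripheral as an array of 32-bit registers reached by aligned, single-word reads and writes. The class name, the register offsets, and the enable bit are all hypothetical; a real peripheral defines its own register map.

```python
class RegisterFile:
    """Toy model of a memory-mapped peripheral with 32-bit registers."""

    def __init__(self, size_bytes):
        self._regs = [0] * (size_bytes // 4)

    def _check(self, offset):
        # Every access is one aligned 32-bit word; there are no bursts.
        if offset % 4 or not 0 <= offset < 4 * len(self._regs):
            raise ValueError(f"invalid register offset: {offset:#x}")

    def write32(self, offset, value):
        self._check(offset)
        self._regs[offset // 4] = value & 0xFFFFFFFF

    def read32(self, offset):
        self._check(offset)
        return self._regs[offset // 4]


CTRL_OFFSET = 0x00      # hypothetical control register
STATUS_OFFSET = 0x04    # hypothetical status register
CTRL_ENABLE = 0x1       # hypothetical enable bit

regs = RegisterFile(size_bytes=16)
regs.write32(CTRL_OFFSET, CTRL_ENABLE)
enabled = bool(regs.read32(CTRL_OFFSET) & CTRL_ENABLE)
```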

Yet another derivative is the AXI4-Stream specification, which eliminates the addressing of AXI4 and AXI4-Lite. Instead, data bytes can be organized in packets of convenient size, and packets can be combined into frames tailored to a wide range of applications like specialized video and imaging. Each byte can be a data byte, a position byte to mark relative location of data bytes, or a null byte to serve as a filler. AXI4-Stream supports only unidirectional transfers from the master device to the slave device.
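The three byte types can be modeled in a few lines of Python. This sketch assumes the signal names TKEEP, TSTRB, and TLAST and their encodings as recalled from the AXI4-Stream specification (the article itself does not name them), and it is a software illustration rather than hardware code. The function names are hypothetical.

```python
def classify_byte(keep, strobe):
    """Classify one byte lane from its TKEEP and TSTRB bits."""
    if keep and strobe:
        return "data"
    if keep and not strobe:
        return "position"
    if not keep and not strobe:
        return "null"
    raise ValueError("TKEEP low with TSTRB high is a reserved combination")


def extract_packets(beats):
    """Return the data bytes of each packet, where TLAST closes a packet.

    Each beat is a tuple (tdata, tkeep, tstrb, tlast) with one TKEEP bit
    and one TSTRB bit per byte of tdata.
    """
    packets, current = [], bytearray()
    for tdata, tkeep, tstrb, tlast in beats:
        for byte, keep, strobe in zip(tdata, tkeep, tstrb):
            if classify_byte(keep, strobe) == "data":
                current.append(byte)
        if tlast:
            packets.append(bytes(current))
            current = bytearray()
    return packets


# Two 4-byte beats: the second carries two data bytes and two null bytes.
beats = [
    (b"\x01\x02\x03\x04", (1, 1, 1, 1), (1, 1, 1, 1), False),
    (b"\x05\x06\x00\x00", (1, 1, 0, 0), (1, 1, 0, 0), True),
]
packets = extract_packets(beats)
```

Note that there is no address anywhere in the model: the receiver simply consumes bytes in order, which is what makes the protocol a good fit for converter and video data.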

One important aspect for all of these AXI4 specifications is the concept of "interconnects". An interconnect is circuitry that joins one or more master interfaces to one or more slave interfaces, providing not only the required data path connectivity, but also adjusting the required data width and clocking for all devices. Nevertheless, if a single master needs to connect to a single slave, and the data widths and clocks are the same, they can be connected directly without the interconnect.
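The rule for when an interconnect is needed, and one of the jobs it performs, can be sketched in Python as follows. The `AxiPort` type, the function names, and the choice to place the first narrow word in the least significant lanes are assumptions for illustration only; actual interconnect IP is generated by the vendor tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AxiPort:
    data_width_bits: int
    clock_mhz: float


def needs_interconnect(masters, slaves):
    """A direct link works only for one master, one slave, same width and clock."""
    if len(masters) != 1 or len(slaves) != 1:
        return True
    m, s = masters[0], slaves[0]
    return m.data_width_bits != s.data_width_bits or m.clock_mhz != s.clock_mhz


def upsize(words, in_bits, out_bits):
    """Pack narrow words into wider beats, first word in the lowest lanes."""
    ratio = out_bits // in_bits
    mask = (1 << in_bits) - 1
    beats = []
    for i in range(0, len(words), ratio):
        beat = 0
        for lane, word in enumerate(words[i:i + ratio]):
            beat |= (word & mask) << (lane * in_bits)
        beats.append(beat)
    return beats


direct = needs_interconnect([AxiPort(32, 100.0)], [AxiPort(32, 100.0)])
widened = upsize([0x11111111, 0x22222222], in_bits=32, out_bits=64)
```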

Figure 1 shows how AXI4 interfaces connect some typical blocks of a software radio transceiver. Note the examples of AXI4-Stream for the A/D and D/A converters, and AXI4-Lite for a simple FPGA peripheral in IP7. The AXI interconnect contains the FPGA logic that allows the CPU to access three IP blocks. Direct AXI4 connections between two IP blocks are possible when the clocking and data widths match.

AXI4 makes life much easier for SoC developers by supporting connections among a diverse range of components through a common interface standard, with interconnect blocks to realize system topology and reconcile data widths.



Figure 2. Development Tool Methodology for SoC Design




Another important point is that AXI4 can be highly effective in reducing power and boosting transfer rates compared to competing strategies. This matters most for high-performance FPGAs in real-time embedded computing systems.

### **Tools Make It All Happen**

Because of these benefits, both Altera and Xilinx have harnessed AXI technology for their latest development tool suites, not only for SoC development but even for IP in processor-less FPGAs. Figure 2 illustrates the development tool methodology for SoC design. Tasks are created to satisfy system requirements, and then initially partitioned as candidates for execution by either the CPU or the FPGA. During the development and modeling of each task, it may become apparent that the task needs to be reassigned to the other resource. Additional reassignment or optimization may occur when CPU and FPGA tasks are combined and tested during system integration.

Xilinx's SDSoC Development Environment supports their Zynq SoC devices. Familiar C/C++ design inputs to Eclipse compiler tools help developers determine which tasks dominate the CPU workload. Such tasks might be shifted to the FPGA programmable logic to help achieve the required real-time performance. SDSoC coordinates execution of both the CPU and FPGA tasks, showing the effects of different partitioning and implementations of tasks within each partition.
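The partitioning idea can be illustrated with a small Python sketch. This is not how SDSoC works internally; it is a simplified greedy model, with hypothetical task names and timings, that moves the most expensive CPU tasks to the FPGA until the remaining CPU load fits within a real-time budget.

```python
def offload_candidates(cpu_ms_per_frame, budget_ms):
    """Pick the heaviest CPU tasks to move until the rest fits the budget.

    Returns the list of tasks to offload and the remaining CPU time per frame.
    """
    remaining = sum(cpu_ms_per_frame.values())
    chosen = []
    heaviest_first = sorted(cpu_ms_per_frame.items(),
                            key=lambda item: item[1], reverse=True)
    for task, cost in heaviest_first:
        if remaining <= budget_ms:
            break                    # the CPU can now keep up in real time
        chosen.append(task)
        remaining -= cost
    return chosen, remaining


# Hypothetical profile: CPU milliseconds spent per frame in each task.
profile = {"fft": 6.0, "filter": 3.0, "control": 0.5, "logging": 0.5}
to_fpga, cpu_left_ms = offload_candidates(profile, budget_ms=2.0)
```

A real flow would also weigh FPGA resource usage and the cost of moving data across the CPU/FPGA boundary, which this sketch ignores.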

Tasks assigned to the FPGA are directed towards the Vivado Design Suite, which uses HLS (high-level synthesis) to create IP from the C/C++ design input. Alternative design input choices include HDL using Verilog or VHDL and block diagram tools like MATLAB using System Generator. In addition, the Vivado IP Catalog is an extensive collection of plug-and-play IP modules for signal processing, communication, imaging, matrix processing, data manipulation, coding and formatting. Third-party IP and RTL design entries can be turned into compatible IP modules using the Vivado IP Packager.

Regardless of the Vivado design input, all of these newly created IP modules use AXI4 interfaces compatible with the existing IP Catalog modules. Vivado IP Integrator streamlines the installation of AXI4 interconnects as required to ensure interoperability between IP modules. SDSoC helps link these AXI4 interfaces to compatible AXI4 links on the ARM CPU. Together, SDSoC and Vivado thus produce a fully synthesized modular SoC design complete with memory mapping, modeling, debugging tools, test benches, and timing analysis.

Altera's SoC Embedded Design Suite includes the Altera edition of the ARM DS-5 Development Studio to support the ARM CPU on Arria and Stratix SoCs. Based on Eclipse Tools, this open source extensible development environment includes a compiler, a debugger, and an execution tracer.

Altera's QSYS System Integration Tool supports FPGA development tasks by graphically connecting IP modules from Altera and IP partners. Because these modules are equipped with AXI4 interfaces, QSYS can automatically configure the interconnects required to implement the subsystem. QSYS creates custom IP using schematic or HDL design inputs. Quartus II System Level Software integrates the Embedded Design Suite with QSYS for a complete development environment. It includes Altera's IP modules, and resources for modeling, analyzing, and debugging the interaction between the ARM CPU and FPGA resources. It optionally includes DSP Builder and support for OpenCL.

It is clear that both Xilinx and Altera are competing directly for high-end SoC designs by offering powerful ARM CPUs tightly coupled to powerful FPGAs, not only at the device level, but also with comprehensive and ambitious design tool suites. In fact, system integrators may be tempted to choose the SoC vendor based upon the effectiveness of the tools, more so than on silicon features. But switching SoC vendors is a major commitment for any company, and the potential benefits must be carefully weighed. The decision to acquire the training, skills, design methodology, expertise, culture, and effective points of contact for support from a new vendor is often made only at the highest levels of corporate management.

Since we are still in the early days of SoC offerings, embedded systems developers can expect significant advances in performance over the next few years as vendors continue to boost silicon resources and race to provide tools to most easily take advantage of them.



## VIDEO: A Quick Look at Pentek's SPARK Development Systems

Pentek's SPARK Development Systems are the quickest way to get up and running and start your application development. Each SPARK PC system includes a high-performance rackmount PC with Pentek hardware and software installed and tested, allowing immediate "out of the box" deployment. SPARK Development Systems are available in PC and VPX platforms.



Click <u>here</u> to watch a short, informative video.





New generations of FPGAs provide a level of processing performance and I/O bandwidth that generally surpasses conventional CPU designs. However, the I/O interfaces have become closely tied to the FPGAs, limiting reuse of FPGA designs because boards are designed with a specific type of I/O. Therefore, it has been challenging to design FPGA boards with an I/O suitable for a wide range of customers.

A modular design solves this problem, and the VITA 57 standard for an FPGA Mezzanine Card (FMC) was created for this purpose. The VITA 57 standard separates the FPGA from the I/O by defining two separate components: a mezzanine card that provides the I/O, and an FPGA-based carrier card. If a different I/O design is needed, the mezzanine card can be changed, and the FPGA-based carrier can be adapted to the new I/O requirements: the FPGA design can be reconfigured and used with the new I/O module.

### Pentek Analog & Digital I/O Catalog

This catalog provides information about Pentek's analog and digital I/O products, including the Flexor family. Click <u>here</u> to download it.

In this way, the VITA 57 FMC specification solves a major quandary with FPGA-based I/O: how to get optimum I/O bandwidth, and yet be able to change the I/O functionality.

Although VITA 57 makes it possible to use FMCs and carriers from different manufacturers, it is not a simple matter. The 38-question compatibility checklist in the VITA 57 specification is an indication of this. If an FMC from one vendor and a carrier from another prove to be compatible based on the checklist and additional investigation, FPGA IP and control software still must be developed. So it is possible to make an FMC and a carrier from different vendors work together, but it takes some time and effort.

For this reason, Pentek developed FMC/carrier sets called FlexorSets™ as part of Pentek's Flexor® family of FMC products.

### **FlexorSet Advantages**

- Pentek designs all the FPGA logic to use the Flexor FMC on a Flexor carrier.
- Pentek's GateFlow<sup>®</sup> FPGA Design Kit provides an entire FPGA design that you can modify or replace as needed for your specific application.

- A FlexorSet saves time and work because you do not need to create any FPGA IP.
- A FlexorSet includes many useful functions built into the FPGA logic.
- Pentek's GateXpress<sup>®</sup> PCIe Configuration Manager supports dynamic FPGA reconfiguration through software commands as part of the runtime application. This provides an efficient way to quickly reload the FPGA.
- Software functions to support all of a Pentek board's built-in functions are already written and ready to use in the ReadyFlow<sup>®</sup> software that Pentek provides with its products.
- Pentek's FMC carriers offer an optional VITA 66.4 optical interface.
- Pentek guarantees the analog and digital performance of Pentek products. We design our carriers and PCBs to deliver the highest possible analog performance.
- Pentek's FlexorSets are tested, shipped, and supported as an assembled unit from one company (Pentek), with Pentek's quality support on the whole board set.



Model 3312 FMC and Model 5973 FMC Carrier



# **Pentek's FlexorSets**

Signal Interface Solutions for Radar, Communications, or Data Acquisition:

# FlexorSet Model 5973-317 for 3U VPX and Model 7070-317 for PCIe

The Flexor Model 3316 8-channel A/D FMC is installed on either of two Flexor FMC carriers containing Pentek's eight-channel digital down converter (DDC) intellectual property (IP), which is ideally matched to the eight 250 MHz, 16-bit A/Ds on the FMC.

"The FlexorSet product strategy of bundling FMCs, carriers, and critical IP allows us to deliver complete solutions addressing a broad range of customer requirements," said Rodger Hosking, vice-president of Pentek. "Our building block product strategy puts the latest technology into our customers' hands quickly and affordably. Developing FPGA IP and supporting software is a difficult process, and FlexorSets leverage our years of experience to help our customers deliver systems on time."

In FlexorSet <u>Model 5973-317</u> and <u>Model 7070-317</u>, the FMC front end accepts eight analog HF or IF inputs on front panel connectors with transformer coupling into four Texas Instruments ADS42LB69 dual A/D converters, boosting density for high-channel-count systems.

Each DDC has an independent 32-bit tuning frequency ranging from DC to the A/D sampling frequency. Each DDC can have its own unique decimation setting, supporting as many as eight different output bandwidths. Decimations can be programmed from 2 to 65,536, providing a wide range to satisfy virtually all applications.
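These figures can be explored with a short Python sketch. It assumes the tuning frequency is set through a conventional 32-bit phase-accumulator word (frequency divided by sample rate, scaled by 2^32) and that the output sample rate equals the input rate divided by the decimation. Both are standard DDC conventions but are assumptions here, since the article does not describe the register format. Function names are hypothetical.

```python
def tuning_word(tune_hz, sample_rate_hz, bits=32):
    """Phase increment per sample for an NCO with a 2**bits phase accumulator."""
    return round(tune_hz / sample_rate_hz * 2**bits) % 2**bits


def tuning_resolution_hz(sample_rate_hz, bits=32):
    """Smallest tuning step available with a tuning word of the given width."""
    return sample_rate_hz / 2**bits


def output_rate_hz(sample_rate_hz, decimation):
    """Output sample rate after decimation, limited to the range 2 to 65,536."""
    if not 2 <= decimation <= 65536:
        raise ValueError("decimation must be between 2 and 65,536")
    return sample_rate_hz / decimation


FS = 250e6                                   # A/D sample rate from the text
word = tuning_word(62.5e6, FS)               # tune one DDC to fs/4
step_hz = tuning_resolution_hz(FS)           # well under 0.1 Hz
widest = output_rate_hz(FS, 2)
narrowest = output_rate_hz(FS, 65536)
```

Under these assumptions, a 32-bit word gives sub-hertz tuning steps at a 250 MHz sample rate, and the decimation range spans output rates from 125 MHz down to a few kilohertz.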






### Optical Interface Solutions for Radar, Communications, or Data Acquisition:

# FlexorSet Model 5973-312 for 3U VPX and Model 7070-312 for PCIe

These FlexorSets use the Flexor Model 3312 4-channel A/D and 2-channel D/A FMC installed on either of two Flexor FMC carriers. These FlexorSets combine the high performance of the Virtex-7 and optical interconnects with the flexibility of the multichannel FMC data converter, creating a complete, powerful radar and software radio sub-system.

"Combining FMCs and carriers from different vendors can be challenging. In addition to checking hardware compatibility, custom FPGA IP must be developed along with supporting software libraries. The FlexorSet product strategy of bundling FMCs, carriers, critical IP, and software allows Pentek to deliver complete solutions ready to be integrated into the customer's system," said Bob Sgandurra, Director of Product Management. He added, "Pentek also becomes the first supplier to offer VITA 66.4 backplane optical solutions on a VPX carrier, and compatible optical cable interfaces on a PCIe FMC carrier."

The <u>Model 5973-312</u> 3U VPX FlexorSet features a high pin-count VITA 57.1 FMC site, 4 GB of DDR3 SDRAM, PCI Express (Gen. 1, 2 and 3) interface up to x8, optional user-configurable gigabit serial I/O and optional LVDS connections to the FPGA for custom I/O.




The <u>Model 5973-312</u> 3U VPX FlexorSet delivers new levels of I/O performance by incorporating the emerging VITA 66.4 standard for half-size MT optical interconnect, providing 12 optical duplex lanes to the backplane. With the installation of a serial protocol, the VITA 66.4 interface enables gigabit backplane communications between boards independent of the PCIe interface.

The <u>Model 7070-312</u> PCIe FlexorSet features all of the same resources, except its optical interfaces are brought to MTP connectors compatible with industry-standard cables.

# Limited Time Discount Offer!





### Multichannel, High-Speed A/D and D/A:

# FlexorSet Model 5973-324 for 3U VPX and Model 7070-324 for PCIe

These FlexorSets use the Flexor Model 3324 4-channel A/D and 4-channel D/A installed on either of two carriers (<u>Model 5973-324</u> and <u>Model 7070-324</u>). The carriers contain optimized Pentek FPGA IP for A/D acquisition and D/A waveform playback, which is ideally matched to the four 500 MHz, 16-bit A/Ds and the four 1.5 GHz, 16-bit D/As with digital up-converters on the FMC.

The Model 3324 FMC front end accepts four analog HF or IF inputs on front panel connectors with transformer coupling into two Texas Instruments ADS54J69 dual A/D converters, boosting density for high-channel-count systems.

On the output side, a Texas Instruments DAC38J84 D/A converter accepts baseband real or complex data streams from the FPGA. Each stream then passes through digital interpolation and upconversion stages before delivery to the D/A. Output sampling rates up to 1.5 GHz are supported, with or without translation.

FlexorSet carriers support the 3324 FMC with a choice of Virtex-7 FPGAs to match the specific requirements of the processing task. Optional optical interfaces for gigabit serial inter-board communication and optional LVDS connections to the Virtex-7 FPGA for custom I/O afford flexible configuration of the platform.

"These FlexorSet carriers include critical IP for data converters, memory and system interfaces plus additional signal processing resources for custom tasks," said Rodger Hosking, vice-president of Pentek. "The Model 3324 FMC delivers a high-density front end, nicely balanced with four wideband analog inputs and outputs and superior signal quality. Pentek now can offer twice the number of channels at twice the sampling rate of previous models."




# Bundling for Seamless Integration

All FlexorSets come pre-configured with a suite of built-in functions for data capture, synchronization, time tagging and formatting, all tailored and optimized for the FMC and carrier. This IP enables high-performance capture and delivery of data to provide an ideal signal interface for radar, communications, or general data acquisition applications, eliminating the integration effort that is typically left to the user when combining an FMC and a carrier.

# Development Tools and Software Support

FlexorSet presents system integrators with an ideal development and deployment platform for custom IP. The Pentek GateFlow FPGA design kit gives users access to the complete factory installed IP at the source level, allowing them to extend or even replace the built-in functions.

The Pentek GateXpress PCIe configuration manager supports dynamic FPGA reconfiguration through software commands as part of the runtime application. This provides an efficient way to quickly reload the FPGA, which reduces development time during testing. For deployed environments, GateXpress enables reloading the FPGA without the need to reset the host system, ideal for applications that require dynamic access to multiple processing IP algorithms.

The Pentek ReadyFlow Board Support Package is available for Windows and Linux. The ReadyFlow C-callable library contains a complete suite of initialization, control and status functions, as well as a rich set of precompiled examples, which help accelerate application development.

For more information, contact Pentek or go to <u>pentek.com/go/flexorinfo</u>.



### Pentek Radar & SDR I/O Catalog

This catalog provides information about Pentek's radar and software-defined radio (SDR) I/O products, including the Flexor family. Click <u>here</u> to download it.

# **Q&A** with Pentek

### Q: How do I select the correct Talon analog signal recorder?

**A:** There are two major characteristics that differentiate one Talon analog signal recorder from another: the form factor and the maximum sample rate of the A/D converter.

The maximum sample rate of the A/D converter is listed in the main description of every Talon analog signal recorder. For example, the Model 2746 200 MS/s RF/IF Rugged Rackmount Recorder provides A/Ds with a maximum sample rate of 200 MHz. This sample rate dictates the maximum bandwidth signal that this recorder can accurately sample and record. In the case of a 200 MHz A/D, we know that the maximum signal bandwidth that can be captured is 0.8 × fs/2, assuming a user-supplied 80% anti-aliasing filter. This means that this A/D can capture signals that are 80 MHz wide. A 3.6 GHz A/D product can capture signals that are 1440 MHz wide, assuming the same 80% filter.
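The rule of thumb above is easy to check in code. The following Python sketch simply evaluates 0.8 × fs/2 for the two sample rates mentioned; the function name and the default filter fraction parameter are assumptions introduced for illustration.

```python
def max_signal_bandwidth_hz(sample_rate_hz, filter_fraction=0.8):
    """Usable bandwidth: a fraction of the Nyquist bandwidth (half the sample rate)."""
    return filter_fraction * sample_rate_hz / 2


bw_200_mhz_adc = max_signal_bandwidth_hz(200e6)    # the 200 MHz A/D example
bw_3_6_ghz_adc = max_signal_bandwidth_hz(3.6e9)    # the 3.6 GHz A/D example
```

Passing a different `filter_fraction` shows how a sharper or gentler anti-aliasing filter would change the usable bandwidth for the same converter.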

For more information about selecting a recorder based on form factor, see the Talon FAQ "What is the difference between the RTV, RTS, RTR and RTX recorders?"

### Q: What signal types do the Talon digital signal recorders support?

**A:** Talon digital signal recorders include support for serial FPDP, Gigabit Ethernet, 10 Gigabit Ethernet, 40 Gigabit Ethernet, and LVDS.



VIDEO: A Quick Look at Pentek's New "A" Series of Talon Data Recorders



Click here to watch the video.

With a wide range of analog and digital interfaces, the Talon RTR "A" series is ideal for high bandwidth data recording from the lab to harsh field environments. For more information, <u>go to pentek.com/go/talonportA.</u>