The ZeptoSDR is one of Nutaq’s latest products. A low-cost, flexible platform, it packages the Zynq-based Zedboard and a Radio420 FPGA mezzanine card (FMC) in an air-cooled case. Nutaq includes a reference design and an API to help you quickly get started on your software-defined radio applications.
In this blog post, I walk through the ZeptoSDR reference design and show how to use it as an arbitrary waveform generator (AWG). Several configurations are possible thanks to its architecture (an FPGA and a dual-core ARM processor embedded on the same chip) and its connections to an external host computer. I discuss the benefits and drawbacks of each configuration.
To get the most out of this post, I recommend you first familiarize yourself with the ZeptoSDR by reading the ZeptoSDR: Architecture and API blog post.
Figure 1: Waveform generation in the FPGA
The reference design shows how to connect the 12-bit I/Q signal in the FPGA logic to the Radio420 so that the baseband signal is transmitted to the radio’s RF front-end. The ARM processors can control the RF front-end through the C-language API.
This configuration enables the full bandwidth of the Radio420 hardware (28 MHz) since the FPGA can operate at much higher rates. However, developing FPGA code is not trivial, and the FPGA has little on-chip memory (560 KByte maximum), so only short stored waveforms can be played.
Fortunately, the waveform can instead be generated by real-time signal processing functions, which avoids the use of memory. In that case, the complexity of the waveform depends on the operations performed by the FPGA logic. The desired waveform may be hard to generate inside the FPGA if the modulation requires too many mathematical operations. For example, a finite impulse response (FIR) filter is easy to implement, but the available FPGA resources limit the number of taps it can have. The same is true for other signal processing operations such as Fast Fourier Transforms (FFTs).
The Zynq 7020 (the FPGA/ARM chip used on the Zedboard) contains 220 DSP slices. A symmetrical FIR filter, for example, requires 1 DSP slice every 2 taps.
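The rule of thumb above can be turned into a quick budget check. The sketch below is only an illustration: it assumes a fully parallel filter, that the whole 220-slice budget is free for filtering, and (in the second call) that I and Q each get their own identical filter. The function names are hypothetical.

```python
import math

DSP_SLICES_ZYNQ_7020 = 220  # figure quoted in the text


def dsp_slices_for_symmetric_fir(taps: int) -> int:
    """DSP slices for one symmetric FIR filter, at 1 slice per 2 taps.

    An odd tap count still needs a slice for the center tap, hence the ceil.
    """
    return math.ceil(taps / 2)


def max_symmetric_taps(dsp_slices: int, filters: int = 1) -> int:
    """Largest tap count per filter when `filters` identical filters
    share the DSP slice budget equally."""
    return 2 * (dsp_slices // filters)


if __name__ == "__main__":
    # One filter using every slice.
    print(max_symmetric_taps(DSP_SLICES_ZYNQ_7020))
    # Assumption: separate I and Q filters splitting the budget.
    print(max_symmetric_taps(DSP_SLICES_ZYNQ_7020, filters=2))
```

In practice the rest of the design also consumes DSP slices, so the usable tap count will be lower than this upper bound.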
Embedded processor configurations
Figure 2: Waveform streaming from the ARM through the RTDEx
Nutaq provides a driver with its reference design for exchanging data between the FPGA’s programmable logic and the ARM’s processing system. From the FPGA’s perspective, the Real-Time Data Exchange (RTDEx) core can be seen as two first-in/first-out (FIFO) memories, one in each direction, that act like a data pipe for transmitting and receiving raw data with the ARM processor. On the processor side, the API provides simple functions to exchange raw data with the FPGA logic: RTDExOpenAxi, RTDExSend, and RTDExReceive.
In the reference design, the ARM processor runs an Ubuntu distribution from Linaro. Simple demos and their source code are available. One of the applications loads a binary file from the ZeptoSDR’s SD card to its internal memory and then sends it to the FPGA via the RTDEx. The binary data is composed of raw I and Q 16-bit interleaved samples. The FPGA RTDEx 32-bit bus splits them into two I and Q signals and sends them to the radio.
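A file in this kind of format can be produced on any computer. The following is a minimal sketch that writes a complex tone as interleaved 16-bit samples. The byte order (little-endian), the signed representation, and the I-before-Q ordering are assumptions not stated by the reference design, and the file name and function name are hypothetical; check the demo source code before relying on them.

```python
import math
import struct


def make_iq_tone(num_samples, tone_hz, sample_rate_hz, amplitude=0.5):
    """Return a complex tone as interleaved I/Q bytes.

    Assumed layout: signed 16-bit little-endian, I first, then Q,
    so each complex sample occupies 4 bytes (one 32-bit word).
    """
    scale = amplitude * 32767
    out = bytearray()
    for n in range(num_samples):
        phase = 2.0 * math.pi * tone_hz * n / sample_rate_hz
        i = int(round(scale * math.cos(phase)))
        q = int(round(scale * math.sin(phase)))
        out += struct.pack("<hh", i, q)
    return bytes(out)


if __name__ == "__main__":
    # 1 MHz tone at an assumed 20 MS/s complex sample rate.
    data = make_iq_tone(1000, 1e6, 20e6)
    with open("tone_iq.bin", "wb") as f:  # hypothetical file name
        f.write(data)
    print(len(data), "bytes written")
```

Keeping the amplitude below full scale leaves headroom and avoids overflowing the 16-bit range.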
Figure 3 shows a default reference design application. The user can connect to the ZeptoSDR by a serial connection or SSH and launch the application’s binary file with a set of arguments. In this example, the Radio420 operating frequency is set to 40 MHz in order to create a waveform with a 20-MHz bandwidth.
Figure 3: Streaming I and Q samples from a binary file
This application requires an 80 MB/s sustained transfer rate between the ARM processor and the FPGA logic. While the average transfer rate of the RTDEx driver from the ARM to the FPGA can be greater than 400 MB/s, the limited buffer size in the FPGA (currently set to 65536 bytes) means an underflow condition may occur when data is transferred from the RTDEx FIFO to the Radio420 core. This is because the ARM processor is not set up for real-time operation. If the ARM processor has to service other interrupts, the next DMA transfer from the ARM memory to the FPGA logic may be delayed, and the RTDEx FIFO will underflow once it runs out of samples.
Underflows start to appear at approximately 80 MB/s, so with this approach the RF bandwidth is limited to around 20 MHz. One way to increase this bandwidth is to increase the RTDEx FIFO size. A larger FIFO relaxes the timing requirement between DMA transfers, provided the average transfer rate remains higher than the rate at which the system consumes data.
Figure 4 shows that no transmission underflow happened after the application’s initialization. This information can be obtained by reading the RTDEx driver status (located at /proc/driver/rtdex).
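To see why the FIFO size matters, it helps to work out how long a full FIFO can feed the radio when no DMA transfer arrives. The sketch below uses the 65536-byte FIFO and the 4 bytes per complex sample (16-bit I plus 16-bit Q) described above; the 20 MS/s sample rate is the value that corresponds to 80 MB/s under that assumption, with 1 MB taken as 10^6 bytes. The function names are hypothetical.

```python
BYTES_PER_IQ_SAMPLE = 4  # 16-bit I + 16-bit Q


def required_rate_bytes_per_s(sample_rate_hz):
    """Sustained data rate needed to feed the radio at a given sample rate."""
    return sample_rate_hz * BYTES_PER_IQ_SAMPLE


def fifo_hold_time_s(fifo_bytes, sample_rate_hz):
    """Time a full FIFO can supply samples with no new DMA transfer."""
    return fifo_bytes / required_rate_bytes_per_s(sample_rate_hz)


if __name__ == "__main__":
    rate = required_rate_bytes_per_s(20e6)
    print(f"required rate: {rate / 1e6:.0f} MB/s")
    for size in (65536, 131072, 262144):
        hold_us = fifo_hold_time_s(size, 20e6) * 1e6
        print(f"FIFO {size} bytes -> {hold_us:.1f} us of margin")
```

With the current FIFO size, the processor has well under a millisecond to deliver the next transfer, which is tight for a non-real-time Linux system. Doubling the FIFO doubles that margin.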
Figure 4: RTDEx driver status
The TX underflow status shows one underflow, but it occurred during initialization, before the first RTDEx DMA transfer was performed. No other underflows occurred during the subsequent 15,910,386 DMA transfers of 4096 bytes.
If the application or the RTDEx driver cannot sustain the required data rate, the TX underflow counter is incremented each time an underflow happens between two DMA transfers.
The Zedboard has 512 MB of internal DDR3 memory for the ARM processor. This limits the length of the waveform that can be played. The Zedboard includes a 4 GB SD card, but its transfer rate (a few MB/s) is not fast enough to stream the waveform in real time.
The waveform can also be computed by a real-time digital processing algorithm that doesn’t require memory to store a pre-generated waveform. However, the throughput of this method is limited by the complexity of the waveform. Since the data is computed in parallel with the data streaming, the processor needs to sustain the data rate required by the system configuration while generating the waveform. Even though the dual-core ARM Cortex-A9 processor (running at 666 MHz) is a fast embedded microprocessor, its limits can quickly be reached when filtering at MHz sampling rates.
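The memory limit translates directly into a maximum playback time. The sketch below gives an upper bound only: it assumes 4 bytes per complex sample, that 512 MB means 512 × 2^20 bytes, and that the entire DDR3 memory is available for the waveform, which is optimistic because the Linux system and the application also use it. The function name is hypothetical.

```python
BYTES_PER_IQ_SAMPLE = 4  # 16-bit I + 16-bit Q


def max_waveform_seconds(memory_bytes, sample_rate_hz):
    """Longest non-repeating waveform that fits in `memory_bytes`."""
    return memory_bytes / (sample_rate_hz * BYTES_PER_IQ_SAMPLE)


if __name__ == "__main__":
    ddr_bytes = 512 * 2**20  # assumption: all of the DDR3 holds samples
    for rate in (5e6, 10e6, 20e6):
        secs = max_waveform_seconds(ddr_bytes, rate)
        print(f"{rate / 1e6:.0f} MS/s -> at most {secs:.1f} s")
```

At the 80 MB/s rate used in the example application, that is only a few seconds of unique waveform, so longer signals have to be looped, generated on the fly, or streamed from elsewhere.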
External host computer
Figure 5: Streaming I and Q samples from an external computer
Another approach is to stream the raw I and Q samples from an external computer. The reference design includes a simple TCP server application that runs on the ZeptoSDR’s ARM processor and acts as a bridge between an external host computer and the RTDEx driver. With this method, the ARM processor only decodes TCP packets and forwards the raw data directly to the FPGA.
The host computer can read the raw I and Q data from a file or generate it dynamically with signal processing algorithms. Compared to the embedded processor approach, the host computer doesn’t have the same limitations: the file can be as large as required, since the computer’s hard drive can keep up with the required throughput. Furthermore, a high-end computer has multiple cores running at gigahertz clock rates that can perform complex and computationally intensive signal processing algorithms.
In addition, Nutaq’s GNU Radio plug-in can be used to rapidly create waveforms from blocks freely available from the GNU Radio open-source community.
The drawback of this approach is that the TCP/IP protocol is relatively complex, so the ARM processor cannot take full advantage of its Gigabit Ethernet port. Moreover, the TCP communication between the host computer and the ARM must have a constant throughput, which this kind of protocol does not guarantee. If too much time elapses between TCP packets, the RTDEx FIFO can underflow and cause discontinuities in the waveform.
The maximum sustained transmission rate of the TCP server is only around 40 MB/s, which means a bandwidth of about 10 MHz can be handled with this technique. The bottleneck is the ARM processor: at this rate, one of its cores is used at 100% of its capacity. Using a simpler protocol such as UDP could help increase the bandwidth, because the ARM processor could then handle a higher data rate.
All of the methods shown in this blog post can be used to turn the ZeptoSDR into a highly configurable arbitrary waveform generator. Each method has its own pros and cons, and the best one depends on your requirements. Hybrid configurations can also be created, using the processing power available at the different stages: FPGA logic, ARM processor, and external processor.
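On the host side, the streaming loop can be very small. The sketch below is not the protocol of Nutaq’s TCP server, which is not described here; it assumes a hypothetical server that accepts a plain byte stream of interleaved I/Q samples with no header or framing. The function name, address, port, and file name are placeholders.

```python
import socket


def stream_iq(sock, data, chunk_bytes=4096):
    """Send `data` over a connected socket in fixed-size chunks.

    Returns the number of bytes sent. Assumes the receiver expects
    raw interleaved I/Q bytes with no framing (an assumption).
    """
    view = memoryview(data)
    sent = 0
    while sent < len(view):
        chunk = view[sent:sent + chunk_bytes]
        sock.sendall(chunk)  # blocks until the whole chunk is queued
        sent += len(chunk)
    return sent


def stream_file(host, port, path, chunk_bytes=4096):
    """Connect to a (hypothetical) raw-stream server and send a file."""
    with open(path, "rb") as f:
        data = f.read()
    with socket.create_connection((host, port)) as sock:
        return stream_iq(sock, data, chunk_bytes)
```

A call such as stream_file("192.168.0.101", 5000, "tone_iq.bin") would send the whole file, where the address, port, and file name are placeholders. Because sendall blocks when the receiver falls behind, TCP flow control paces the sender, but nothing in this loop guarantees the constant throughput the radio needs.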