With an ever-increasing demand for RF spectrum, the wireless industry will have to break paradigms to achieve what once was only a supposition. Two very important challenges for the future of wireless communications are spectral efficiency and energy efficiency. Several strategies are currently under study to meet these challenges. One of them is to drastically increase the number of transceivers on wireless nodes, an approach referred to as “massive MIMO”.

Data interface requirements

As Nutaq described in a comprehensive six-post blog series, using very large scale antenna arrays with many transceivers (hundreds) on a base station could be a solution. However, a gap between theory and practice still remains, and such a large system involves many engineering challenges.

As Nutaq thoroughly explained in a previous post, using many transceivers to cover a large bandwidth results in very high data rates within the system. To make the situation even more difficult, all the data must be routed to a central processing unit for the massive MIMO baseband processing. Simple calculations indicate an approximately 126 Gbps requirement for the particular case of a 128-transceiver system covering a 20 MHz bandwidth (the calculations assume 16-bit digital sampling at a 30.72 MHz sampling rate for both I and Q samples). Most data interfaces are not suited to support this kind of throughput. This blog post reviews the different options and explains, on a technical basis, what is required to enable their application.
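The ~126 Gbps figure can be reproduced with a quick back-of-the-envelope calculation. This is a sketch using only the assumptions stated above (16-bit I and Q samples, 30.72 MHz sampling rate, one stream per transceiver):

```python
# Back-of-the-envelope check of the ~126 Gbps figure quoted above.
# Assumptions (from the post): 16-bit I and 16-bit Q samples at the
# 30.72 MHz LTE sampling rate, one stream per transceiver.
SAMPLE_RATE_HZ = 30.72e6
BITS_PER_SAMPLE = 16
IQ_COMPONENTS = 2  # I and Q

def aggregate_throughput_gbps(num_transceivers: int) -> float:
    """Raw baseband data rate that must reach the central processing unit."""
    bps = num_transceivers * SAMPLE_RATE_HZ * BITS_PER_SAMPLE * IQ_COMPONENTS
    return bps / 1e9

print(aggregate_throughput_gbps(128))  # ~125.8 Gbps, i.e. the ~126 Gbps quoted above
```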

Table 1 shows the expected throughputs for different data interfaces. Figure 1 shows the limitation of each interface as the number of channels (transceivers) grows.

Interface             Latency  Typical rate (Mbps)
PCIe Gen1 x4          µs       6,400
PCIe Gen2 x8          µs       28,800
1 Gigabit Ethernet    ms       896
10 Gigabit Ethernet   ms       9,000

Table 1: Data rates of different types of interfaces
Figure 1: Data throughput requirement in Gbps as a function of number of MIMO channels for a system covering a 20 MHz bandwidth (light blue)
We see in Figure 1 that a single PCIe Gen 2 x8 link would support only slightly more than 25 RF transceivers before the data rate exceeds what the infrastructure can carry. PCIe Gen 2 is the fastest data link currently used on shared-backplane industrial shelves.
Furthermore, this is for a hypothetical 20 MHz bandwidth system. Technologies under development target higher bandwidths, which increase the data throughput requirement even further. Unless the data routing architecture is completely changed, it is impossible to route all the data to a central processing unit using available interfaces.
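Using the raw rates from Table 1, the same per-channel arithmetic gives an upper bound on how many transceivers each interface could carry. This sketch ignores protocol overhead, which is one reason Figure 1 shows somewhat lower practical limits than the raw arithmetic suggests:

```python
# Per-transceiver rate: 16-bit I and Q samples at 30.72 MHz (as above).
PER_CHANNEL_GBPS = 30.72e6 * 16 * 2 / 1e9  # ~0.983 Gbps per transceiver

def max_channels(link_rate_gbps: float) -> int:
    """Upper bound on transceivers a link can carry, ignoring protocol overhead."""
    return int(link_rate_gbps // PER_CHANNEL_GBPS)

# Raw rates taken from Table 1.
for name, rate_mbps in [("PCIe Gen1 x4", 6400), ("PCIe Gen2 x8", 28800),
                        ("1 Gigabit Ethernet", 896), ("10 Gigabit Ethernet", 9000)]:
    print(f"{name}: up to {max_channels(rate_mbps / 1e3)} channels")
```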

The alternatives

Systems from National Instruments are based on PXI Express chassis technology, which uses the PCI Express (PCIe) interface for the communication backplane.
The philosophy behind PXI Express is to provide modular systems with throughput sufficient for single-module real-time operation or multi-module recording and playback applications [1]. However, massive MIMO systems are far from single-module and require full-bandwidth real-time functionality. Moreover, PCIe has the downside of implying a point-to-multipoint architecture in which system components often share a common bus (Figure 2).
Figure 2: A typical point-to-multipoint PCIe architecture
The maximum raw data rate of one PCIe Gen2 x4 link in a PCIe chassis is around 16 Gbps [1] (typically rated around 14,400 Mbps in practice). Even with only one link per PXI chassis, there is not enough throughput for even 15 channels covering a 20 MHz spectral bandwidth with a radio interface (16 bits per sample at the 30.72 MHz LTE sample clock rate).

Nutaq’s TitanMIMO-4 solution

Nutaq implements its data routing based on a point-to-point architecture optimized for direct communication between the field-programmable gate arrays (FPGAs). This architecture enables direct access to all the FPGA’s high-speed serial transceiver resources (GTX) for data transfer. With all 28 available GTX transceivers routed to the rear transition modules (RTMs), seven Aurora pipes of four lanes each can be formed per FPGA board. Each Aurora pipe can sustain an approximate throughput of 16 Gbps. This results in a potential 112 Gbps of bandwidth in and out of every single FPGA in Nutaq’s TitanMIMO-4 (in its base configuration – higher throughputs can be reached when options are added). If a single FPGA is used for data processing, the total system bandwidth is 112 Gbps. Referring back to Figure 1, this enables slightly above 100 channels in real time at a 20 MHz bandwidth.
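The 112 Gbps figure follows directly from the lane counts given above. The channel count below is a raw upper bound before link overhead, which is why the text cites "slightly above 100" channels rather than the raw result:

```python
# TitanMIMO-4 per-FPGA backplane arithmetic, as described above.
GTX_PER_FPGA = 28      # high-speed serial transceivers routed to the RTM
LANES_PER_PIPE = 4     # each Aurora pipe bonds four lanes
PIPE_RATE_GBPS = 16.0  # approximate sustained throughput per Aurora pipe

pipes_per_fpga = GTX_PER_FPGA // LANES_PER_PIPE   # 7 Aurora pipes
total_gbps = pipes_per_fpga * PIPE_RATE_GBPS      # 112 Gbps per FPGA

# Raw upper bound on 20 MHz channels (16-bit I/Q at 30.72 MHz), before overhead.
per_channel_gbps = 30.72e6 * 16 * 2 / 1e9
print(total_gbps)                                 # 112.0
print(int(total_gbps // per_channel_gbps))        # raw bound, before link overhead
```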
Figure 3: Point-to-point architecture using 16 Gbps Aurora links
Aurora is a link-layer protocol used to move data across point-to-point serial links. It therefore avoids the protocol complexity of PCIe’s shared, bus-based architecture and does not introduce any bottleneck within its structure.
This architecture also lets the user connect the FPGAs in a mesh topology, enabling research on distributed processing schemes. These schemes could soften the requirement for a single processing unit architecture, thus decreasing the required throughput within the system and enabling wider-bandwidth real-time processing. With this idea in mind, one can see that Nutaq’s massive MIMO system is well suited for research on 5G wireless technology involving massive MIMO and large bandwidths (e.g., 100 MHz).
Moreover, the TitanMIMO-4 can be coupled to the Kermode-xv6, an eight-FPGA array connected in a mesh topology, in order to provide the system with the most powerful FPGA-based processing unit ever built for baseband processing.


[1] PXI Express Specification Tutorial, http://www.ni.com/white-paper/2876