5G/4G/3G Subscriber Session Coherent Load Balancing

Solution Outline

The correct and on-time distribution of GTP sessions is critical to achieving high-performance analysis of the content within the GTP tunnels that carry data services for mobile devices and users. This analysis is vital for operators to understand and capitalize on the information and business insights hidden in the vast data flows.

The mobile operator’s BSS/OSS systems need to receive every correlated packet of a user’s or application’s GTP session within a specific timeframe. Otherwise, the usefulness of these expensive backend systems is severely reduced, and their ROI drops significantly. The mobile operator’s ability to track subscriber activities, resolve subscriber issues promptly and be proactively notified of issues with critical revenue-generating services is also negatively impacted.

The Silicom Session Based GTP Distribution solution addresses the processing-scalability challenge faced by analysis appliances and systems that monitor subscriber data in mobile networks. It does so through correlation and load balancing that keep each subscriber’s control plane and user plane traffic together in the distribution.

The Silicom way

At the core of the Silicom solution is a standard PCIe card fitted with an FPGA that can operate in any standard server or custom appliance.

The FPGA lets the Silicom solution decode, filter, and match/look up packets at line rate. Beyond raw performance, this makes the solution very flexible, because these functions are programmed directly into the FPGA. The programming can therefore be updated and extended to support new capabilities and features, continuously adapting to the requirements of the monitoring solution and to the ever-changing networks, standards, and topologies the solution operates within. This makes for a long-lived solution that can evolve without replacing hardware or changing architectures. Silicom even provides the function at several different link speeds, which can be used on the same piece of hardware.

Key applications
  • Monitoring & analyzing data traffic in mobile networks
  • Performance and service assurance
  • End-user experience monitoring
  • Dynamic monitoring of specific GTP sessions/users
  • Session based analysis
  • Cyber security
  • Bulk Data collection
  • VIP subscriber segment analysis
  • Forensic and lawful intercept of subscribers

The Silicom solution further takes advantage of the FPGA sitting on a PCIe card hosted in the monitoring device or server: it uses the host RAM and a single CPU core to assist in session correlation and in the continuous creation and updating of end-points needed to support subscriber mobility. This is done without ever transferring the high-volume user plane traffic to the host for the purpose of correlation. All or part of the sessionized traffic can be relayed to the monitoring processes on the host system, limited only by the PCIe bandwidth of the host, just as it can be relayed to other devices via the Tx side of the Ethernet interface.

Having the solution on a PCIe card supported by host RAM and CPU means the session load balancing function can be fully integrated into monitoring solutions, giving the application full control over configuration and on-the-fly alterations for rebalancing. It can also offload an integrating solution by transferring packets to host memory for processing with zero packet loss and zero CPU cycles. The FPGA can even do this in parallel for any non-control/user plane traffic that is also part of a solution’s monitoring. More importantly, it allows an integrating solution to be self-contained, without relying on external third-party devices such as packet brokers, switches, or mirror/SPAN ports that are controlled by a network owner and steered through whatever proprietary interface they may expose for controlling the load balancing. The power of the Silicom solution is that with a single FPGA card and the Silicom software package, any standard server can be turned into a high-performance and highly advanced pivot point: a subscriber-aware mobile data aggregator, load balancer, primary session distribution point, and session filtering solution.

All of this is in stark contrast to other solutions that base user plane load balancing on inflexible ASIC hardware, as some switches and packet brokers do. In most cases these do not actually correlate the control plane to the user plane, but merely load balance the user plane based on inner IPs. They may “broadcast” the control plane to multiple processes, or centralize the analysis of all control plane traffic in one process, leaving every user plane process dependent on it for any semblance of real-time subscriber correlation.

The latter strategy is also the one chosen by some alternative FPGA-based packet capture solutions, i.e., they load balance all user plane traffic to host buffers based on the inner IPs and leave it to the application to deal with the substantial correlation task still needed to combine the control plane with the user plane. This task grows with load, and it is prohibitive for any real-time monitoring solution that seeks to combine the user plane with subscriber-level information for prioritization or triggers, or for filtering or segmenting the monitoring based on information such as a subscriber’s IMSI, session APN or session QCI. Such capabilities can be essential for a solution doing targeted monitoring of high-value subscribers, or even basic interception tasks that should include all data.

So… What does it do?

The design of the Silicom solution allows for deployment as a stand-alone or an integrated monitoring solution. It only requires traffic from the interfaces surrounding the UPF/PGW, but it also allows other control plane interfaces to be correlated and delivered to analyzing processes or external monitoring devices.

The correlation and load balancing keep the individual subscriber’s user plane together with the subscriber’s control plane, regardless of whether it is GTP-U, GTP-C or PFCP. As an integrated element it becomes part of the monitoring solution, giving direct management and control of the solution without imposing a management interface with the network infrastructure. As a stand-alone deployment, its platform and management are detached from both the monitored network and the monitoring devices, while being connected to both only through network interfaces.
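
The idea of keeping a subscriber’s user plane together with the control plane can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the Silicom implementation: the `SessionTable` class, its method names and the CRC32-based channel choice are hypothetical, and real GTP-C/PFCP decoding is left out entirely. The sketch assumes that control plane signalling announces tunnel endpoints (an IP address plus a TEID) for an IMSI, and that user plane packets are then matched on the same endpoint.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class SessionTable:
    """Toy control/user plane correlator keyed on GTP tunnel endpoints."""

    num_channels: int
    # (endpoint IP, TEID) -> IMSI, learned from control plane signalling.
    endpoints: dict = field(default_factory=dict)

    def channel_for_imsi(self, imsi: str) -> int:
        # A stable hash, so one subscriber always maps to the same channel.
        return zlib.crc32(imsi.encode("ascii")) % self.num_channels

    def on_control_plane(self, imsi: str, endpoint_ip: str, teid: int) -> int:
        """Record an endpoint announced via GTP-C or PFCP; return its channel."""
        self.endpoints[(endpoint_ip, teid)] = imsi
        return self.channel_for_imsi(imsi)

    def on_user_plane(self, dst_ip: str, teid: int):
        """Return the channel for a GTP-U packet, or None if uncorrelated."""
        imsi = self.endpoints.get((dst_ip, teid))
        return None if imsi is None else self.channel_for_imsi(imsi)


table = SessionTable(num_channels=8)
control_channel = table.on_control_plane("001010123456789", "10.0.0.1", 0x1234)
# The user plane packet on the announced tunnel lands on the same channel.
user_channel = table.on_user_plane("10.0.0.1", 0x1234)
```

Because the channel is derived from the IMSI rather than from the tunnel, a new endpoint announced for the same subscriber (for example after a handover) maps to the same channel as before.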

The Session Based GTP Distribution allows for advanced filtering and segmentation of subscribers and all their traffic from control plane and user plane interfaces, which can be correlated together. This includes GTP-U, GTPv1-C, GTPv2-C and PFCP, the latter for 4G CUPS and 5G. Additionally, network elements can be employed in filtering, such as specific NB, eNB and gNB, as well as APNs as used in the session establishment.


Depending on the configuration, the forwarding and load balancing toward analyzing CPU cores or probes can be segmented in various ways. For example, the control plane packets may be duplicated and isolated for some subscriber segments’ traffic, while both user plane and control plane packets can be provided for other segments. This allows in-depth analysis of some segments of subscribers while a shallower approach is employed for others. Monitoring solutions can thus focus on the subscribers most important for the operator’s revenue, while keeping the analysis resources at the level needed for a fully covering, but targeted, analysis.
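
A segmentation policy of this kind could look roughly as follows. This is a hedged sketch only: the segment names, the `SEGMENT_POLICY` table and the helper functions are invented for illustration and do not reflect the actual Silicom configuration syntax.

```python
from enum import Enum


class Plane(Enum):
    CONTROL = "control"
    USER = "user"


# Hypothetical policy: which planes are forwarded for each subscriber segment.
SEGMENT_POLICY = {
    "vip": {Plane.CONTROL, Plane.USER},  # in-depth analysis
    "default": {Plane.CONTROL},          # shallower, control plane only
}


def segment_of(imsi: str, vip_imsis: set) -> str:
    """Classify a subscriber; here simply by membership in a VIP IMSI list."""
    return "vip" if imsi in vip_imsis else "default"


def should_forward(imsi: str, plane: Plane, vip_imsis: set) -> bool:
    """Decide whether a correlated packet is forwarded for analysis."""
    return plane in SEGMENT_POLICY[segment_of(imsi, vip_imsis)]


vips = {"001010000000001"}
vip_user_plane = should_forward("001010000000001", Plane.USER, vips)
other_user_plane = should_forward("001010000000002", Plane.USER, vips)
```

In this sketch the VIP subscriber’s user plane is forwarded while the other subscriber’s is not, and both subscribers’ control plane traffic is forwarded.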

Not all control plane traffic of a subscriber needs to be available to perform the session distribution. The correlation can rely on either GTP-C or PFCP, or both, which is then correlated to the subscriber’s GTP-U traffic. This flexibility is important for meeting operator deployment scenarios where the topology and colocation of nodes are not a given; in many cases the nodes are not colocated at all. Monitoring around the user plane nodes is essential for the correlation and for efficient data movement.

Distribution can go into regular distribution channels or into VIP channels. These channels are mapped to RAM buffers, Tx interfaces, or VLANs on Tx interfaces for logical segmentation on the same physical distribution interface. VLAN channels can be used to send coherent traffic to external devices, which can then load balance on the basis of the initial correlation simply by load balancing on the channelized VLAN IDs.

This distribution can be exclusively into RAM buffers on the solution’s deployment host, via Tx network links to secondary devices, or a combination of the two. The combination ensures efficient utilization of the deployment hardware by letting the correlating host also handle analysis workloads.

A single host’s resources can be optimally utilized by using the Session Based GTP Distribution to load balance the coherent data to multiple RAM buffers. This allows multiple CPU cores to run multiple instances of the analyzing process, which ensures optimal utilization of the available CPU resources.

Key features

  • Independent of network infrastructure
  • Supports 5G/4G CUPS/4G/3G/2G
  • Coherent subscriber session load balancing
  • Distribution of PFCP, GTP-C and GTP-U based on correlation with IMSI or generic ID
  • Subscriber or session unique key
  • Distribution channel derived from IMSI, for persistence of distribution between sessions
  • Subscriber (IMSI) filtering for interception
  • Subscriber segmentation and bulk IMSI filtering supporting 10 million+ IMSIs
  • Flexible and weighted distribution
  • Standalone solution for network and monitoring system independence
  • System integrated distribution solution for optimal flexibility and resource utilization
  • RAM buffer load balancing, Direct distribution, Daisy chain distribution with fractional load balancing
  • HW traffic filter features, with BCD style syntax
  • True IP fragment handling in distribution
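
Two of the features above, a distribution channel derived from the IMSI and weighted distribution, can be combined as in the following minimal sketch. The function name, the integer weights and the use of CRC32 as the hash are assumptions made for illustration; the source does not describe the actual hashing or weighting scheme.

```python
import zlib
from bisect import bisect_right
from itertools import accumulate


def weighted_channel(imsi: str, weights) -> int:
    """Pick a channel index for an IMSI in proportion to integer weights.

    The result depends only on the IMSI and the weights, so a subscriber
    keeps the same channel across sessions while the weights are unchanged.
    """
    cumulative = list(accumulate(weights))  # weights [2, 1, 1] -> [2, 3, 4]
    slot = zlib.crc32(imsi.encode("ascii")) % cumulative[-1]
    # The first channel whose cumulative weight exceeds the slot owns it.
    return bisect_right(cumulative, slot)


# Channel 0 is given twice the share of channels 1 and 2.
channel = weighted_channel("001010123456789", [2, 1, 1])
```

A channel with weight 0 is never selected in this scheme, which is one simple way to drain a channel before rebalancing.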

Scaling beyond the power of a single host

Distribution to RAM in a single host is limited by the bandwidth across PCIe, about 110 Gbps per card, and by the individual server’s own ability to scale processing power with CPU and available RAM speeds, as well as its ability to handle the analysis results.

To be truly scalable, distribution via Tx can be employed to bring more compute power to the task of analyzing the data.

The Silicom solution can utilize 2 concurrent 2x100GE for the highest bandwidth performance, but many different line speeds and port counts can be used. To pass the load balancing on to secondary processing nodes, the Tx ports of the initial correlating FPGA PCIe card are connected to these secondary nodes.

The load balancing can be configured as either a daisy chain or a direct distribution. With the daisy chain, all correlated traffic is channeled via VLANs and sent over one or more Tx links to the next processing node. This node filters in a number of VLANs matching the load per channel and its processing capacity.

The secondary node’s FPGA PCIe card duplicates and regenerates the received signal and transmits it out of its own Tx ports, thus passing the entire channeled traffic on to the next node in the daisy chain. That node in turn filters in the next subset of VLAN channels it will process, and so on. In this way any number of nodes can be employed in the processing.
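
The daisy chain can be modelled as a short sketch: every node sees all channeled frames, keeps only its own subset of VLAN channels, and passes everything on. The function names and the capacity-proportional split are assumptions for illustration, and the sketch treats all channels as carrying equal load, whereas the text notes that the real selection also takes the load per channel into account.

```python
def assign_vlans(vlan_ids, capacities):
    """Split VLAN channels across chain nodes in proportion to capacity."""
    total = sum(capacities)
    assignment, start, cumulative = [], 0, 0
    for capacity in capacities:
        cumulative += capacity
        end = len(vlan_ids) * cumulative // total
        assignment.append(set(vlan_ids[start:end]))
        start = end
    return assignment


def run_chain(frames, assignment):
    """Pass every frame down the chain; each node keeps only its own VLANs."""
    kept = [[] for _ in assignment]
    for vlan, payload in frames:
        for node, vlans in enumerate(assignment):
            # Each node filters in its subset and forwards the frame unchanged.
            if vlan in vlans:
                kept[node].append((vlan, payload))
    return kept


# Eight VLAN channels; the first node has twice the capacity of the others.
assignment = assign_vlans(list(range(100, 108)), [2, 1, 1])
kept = run_chain([(100, b"a"), (105, b"b"), (107, b"c")], assignment)
```

With these inputs the first node takes VLANs 100 to 103, the second 104 and 105, and the third 106 and 107, so each of the three example frames is kept by exactly one node.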

In the direct distribution, each processing node is directly attached to the correlating and distributing node. This removes the need for VLAN channels, but they can still be beneficial, as they allow the processing node to load balance across its CPUs simply by using VLAN IDs.

This allows for optimal utilization of the hardware platforms employed. Through dual level GTP session distribution, an unprecedented effectiveness and scalability can be achieved with new or existing analysis engines in multiple nodes, while preserving the session coherency for analyzing data for vertical and horizontal metrics.

Key benefits

  • Fully offload session correlation tasks and eliminate typical performance bottlenecks
  • Coherent user sessions
  • Load balance up to 400 Gbps of User and control plane traffic
  • Enables processing of a large number of sessions
  • Frees CPU cycles for analysis tasks
  • Subscriber targeted monitoring, supporting tens of millions of IMSIs
  • More than 500M tunnel endpoints supported
  • Easy integration into host system
  • Scale performance of existing solution’s SW/probe architecture
  • Zero packet loss

By utilizing the power and flexibility of FPGAs and the fbCAPTURE framework, Silicom Denmark’s Session Based GTP Distribution can raise the capacity of analysis systems to a new level, using existing implementations and including legacy as well as cutting-edge equipment.

Figure: fb2CG@KU15P Dual Port FPGA Card

The Silicom Denmark GTP solution can ensure that monitoring systems keep up with the growth in traffic volumes while providing the same full and rich analysis as before. This can be realized on standard commodity hardware for analysis, rather than the ever more expensive hardware needed when trying to scale through pure processing power alone.

At the core of Silicom Denmark’s Session Based GTP Distribution are hardware-integrated network traffic decoders and a GTP/PFCP tracking engine, all based on high-performance hardware that harnesses the power and flexibility of FPGAs. The solution consists of one or more FPGA PCIe line cards for 1 to 100 Gbps, with port counts from 2 to 16 ports per card, and the programming of these cards to ensure efficient traffic handling and correlation.

Additionally, configuration tools and control plane tracking software that runs on the host system are provided, requiring just a single CPU core. An API for low-level control and alternate correlation mechanisms can be provided.