Speeding Up Your Cloud Environment On Intel® Architecture

By Quoc-Thai V Le, Published: 05/15/2013, Last Updated: 05/15/2013

In my previous blog, “Ways to Speeding up Your Cloud Environment…”, I discussed several approaches; I will continue that thread here by introducing the topic of Software Defined Networks (SDN).  The industry has long depended on proprietary networking equipment and appliances, essentially creating an environment of vertically integrated software running on dedicated hardware.  With millions of new connected devices and ever-increasing traffic in the cloud computing environment, network congestion is challenging this vertical networking business model.  As a result, the cloud computing community is looking to network virtualization solutions.

This blog focuses on speeding up your data packet networking by using the Intel® Data Plane Development Kit (Intel® DPDK) on Intel® Architecture.  With an Intel® Xeon® processor E5-2600 series (or later), an integrated DDR3 memory controller, an integrated PCI Express controller, and the Intel DPDK, you can potentially see a significant increase in small-packet throughput in your cloud computing environment.  Before going into the Intel DPDK, I want to provide some background for those unfamiliar with SDN terminology.

Common Terminology for SDN (from Wikipedia)

  • SDN is a form of network virtualization in which the control plane is separated from the data plane and implemented in a software application. SDN architecture gives network administrators programmable control of network traffic without needing direct access to the network's hardware devices.
  • The control plane is the part of the router architecture that builds the routing table, which defines what to do with incoming packets.
  • The data plane is the part of the router architecture that forwards packets arriving on an inbound interface, according to the control plane's routing decisions.
  • Network virtualization is the process of combining hardware and software network resources and network functionality into a single software-based administrative entity.
  • A virtual network is a network whose links do not consist of physical (wired or wireless) connections between computing devices.  Network virtualization involves platform virtualization, often combined with resource virtualization.
  • Platform virtualization hides the physical characteristics of a computing platform from users, instead showing another abstract computing platform.
  • A hypervisor is the software that controls virtualization.

Intel’s 4:1 Workload Consolidation Strategy

Intel’s strategy is to consolidate these workloads (application, control plane, packet, and signal processing) into a more scalable and simplified solution on Intel® Xeon® processor platforms.  Figure 1 depicts this software-based approach, Intel’s 4:1 workload consolidation strategy.  Figure 2 and Figure 3 show the performance increases across various generations of Intel architecture processor-based platforms.

Figure 1. Intel's 4:1 Workload Consolidation Strategy

Note: Performance tests and ratings below are HW/SW configuration dependent and were measured using specific computer systems and/or components. Any difference in configuration will be reflected in the test results.

Figure 2. Breakthrough data performance with Intel® Data Plane Development Kit (Intel® DPDK) L3 packet forwarding

Note: The measurement is in millions of packets per second (Mpps), and each packet is 64 bytes. The L3 packet-forwarding performance indicates that you can achieve higher throughput by applying the Intel DPDK in your Linux environment.

Figure 3. IPv4 Layer 3 Forwarding performance for various generations of Intel Architecture Processor-based platforms

Figures 2 and 3 show the small-packet performance achievable using Intel architecture with the Intel DPDK.  The hardware elements that contribute to the performance increase are the integrated memory controller, the integrated PCI Express* controller, and the increased number of processor cores per chip in the latest Intel processors.

The system configurations used for collecting the data in Figure 2 and Figure 3 were:

  • Dual Intel® Xeon® processor E5540 (2.53 GHz, 4 cores) processed 42 Mpps.
  • Dual Intel® Xeon® processor E5645 (2.40 GHz, 6 cores) processed 55 Mpps.
  • A single Intel® Xeon® processor E5-2600 (2.0 GHz, 8 cores) processed 80 Mpps (with Intel® Hyper-Threading Technology (Intel® HT Technology) disabled).
  • Dual Intel® Xeon® processor E5-2600 (2.0 GHz, 8 cores) processed 160 Mpps (with Intel® HT Technology disabled), with 4x 10GbE dual-port PCI Express* Gen2 NICs on each processor.
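To put Mpps figures like these in context, here is some background arithmetic (mine, not from the measurements above): a 64-byte Ethernet frame occupies 84 bytes on the wire once the 7-byte preamble, 1-byte start-of-frame delimiter, and 12-byte inter-frame gap are counted, which caps what a single port can deliver.

```c
#include <stdio.h>

/* Theoretical packets per second at a given link speed for a given
 * Ethernet frame size.  On the wire, each frame also carries a 7-byte
 * preamble, a 1-byte start-of-frame delimiter, and a 12-byte
 * inter-frame gap: 20 bytes of overhead in total. */
static double line_rate_pps(double bits_per_sec, unsigned frame_bytes)
{
    const unsigned wire_overhead_bytes = 7 + 1 + 12;
    return bits_per_sec / ((frame_bytes + wire_overhead_bytes) * 8.0);
}

static void print_10gbe_64b_rate(void)
{
    /* One 10GbE port forwarding minimum-size (64-byte) frames. */
    printf("10GbE line rate at 64B: %.2f Mpps\n",
           line_rate_pps(10e9, 64) / 1e6);
}
```

This works out to roughly 14.88 Mpps per 10GbE port, which is why 64-byte forwarding is the standard stress test for packet-processing platforms.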

Intel DPDK Overview

The Intel DPDK is based on simple embedded-system concepts and allows users to build efficient, high-performance small-packet (64-byte) applications.  It consists of a growing number of libraries (Figure 4) designed for high-speed data packet networking, and it offers a simple software programming model that scales from Intel® Atom™ processors to the latest Intel® Xeon® processors. The source code is available for developers to use and/or modify in a production network element.

  • The Environment Abstraction Layer (EAL) provides access to low-level resources (hardware, memory space, logical cores, etc.) through a generic interface that hides the environment specifics from the applications and libraries.
  • The Memory Pool Manager allocates NUMA-aware pools of objects in memory.  The pools are created in huge-page memory space to increase performance by reducing translation lookaside buffer (TLB) misses, and a ring is used to store free objects.  It also provides an alignment helper to ensure objects are distributed evenly across all DRAM channels, thus balancing memory bandwidth utilization across the channels.
  • The Buffer Manager reduces the amount of time the system spends allocating and de-allocating buffers.  The Intel DPDK pre-allocates fixed size buffers, which are stored in memory pools for fast, efficient cache-aligned memory allocation and de-allocation from NUMA-aware memory pools.  Each core has a dedicated buffer cache to the memory pools, which is replenished as required.  This provides a fast and efficient method for quick access and release of buffers without locks.
  • The Queue Manager implements safe, lockless queues instead of spinlocks, allowing different software components to process packets while avoiding unnecessary wait times.
  • The Ring Manager provides a lockless implementation for single or multi producer/consumer en-queue/de-queue operations, supporting bulk operations to reduce overhead for efficient passing of events, data and packet buffers.
  • Flow Classification provides an efficient mechanism for generating a hash (based on tuple information) used to combine packets into flows, which enables faster processing and greater throughput.
  • Poll Mode Drivers for 1 GbE and 10 GbE Ethernet controllers greatly speed up the packet pipeline by receiving and transmitting packets without asynchronous, interrupt-based signaling mechanisms, which incur significant overhead.

Figure 4. Major Intel DPDK Components
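The Ring Manager idea can be illustrated with a minimal sketch. The Intel DPDK's production ring also supports multi-producer/multi-consumer modes and bulk operations; the following is only a simplified single-producer/single-consumer ring in standard C11, written by me to show how lockless head/tail indices replace spinlocks.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8              /* must be a power of two */
#define RING_MASK (RING_SIZE - 1)

/* Minimal single-producer/single-consumer lockless ring.  The
 * producer owns head and the consumer owns tail; both only ever move
 * forward, so no locks are needed. */
struct spsc_ring {
    void *slots[RING_SIZE];
    atomic_size_t head;          /* next slot to fill (producer side) */
    atomic_size_t tail;          /* next slot to drain (consumer side) */
};

static bool ring_enqueue(struct spsc_ring *r, void *obj)
{
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_SIZE)
        return false;            /* ring full */
    r->slots[head & RING_MASK] = obj;
    /* Release: publish the slot before advancing head. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

static bool ring_dequeue(struct spsc_ring *r, void **obj)
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail)
        return false;            /* ring empty */
    *obj = r->slots[tail & RING_MASK];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

Because neither side ever blocks, a full or empty ring is reported immediately and the caller can keep polling, which fits the Intel DPDK's run-to-completion, poll-mode model.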
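Flow Classification's tuple hashing can likewise be sketched in a few lines. The Intel DPDK ships its own optimized hash routines; the struct layout, field names, and FNV-1a hash below are my illustrative choices, not the Intel DPDK API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 5-tuple; the field names are illustrative only. */
struct flow_tuple {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  proto;
};

/* Fold a run of bytes into an FNV-1a hash. */
static uint32_t fnv1a(uint32_t h, const void *data, size_t len)
{
    const uint8_t *p = data;
    while (len--) {
        h ^= *p++;
        h *= 16777619u;
    }
    return h;
}

/* Hash the tuple field by field (hashing the raw struct bytes would
 * pull in indeterminate padding).  All packets of the same flow get
 * the same hash, so they can be steered to the same queue and core. */
static uint32_t flow_hash(const struct flow_tuple *t)
{
    uint32_t h = 2166136261u;
    h = fnv1a(h, &t->src_ip,   sizeof t->src_ip);
    h = fnv1a(h, &t->dst_ip,   sizeof t->dst_ip);
    h = fnv1a(h, &t->src_port, sizeof t->src_port);
    h = fnv1a(h, &t->dst_port, sizeof t->dst_port);
    h = fnv1a(h, &t->proto,    sizeof t->proto);
    return h;
}
```

Masking such a hash with the queue count (e.g., `flow_hash(&t) & (nb_queues - 1)`) is one common way to keep a flow's packets on a single core and avoid cross-core synchronization.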

The Intel DPDK library is currently provided cost-free to OEMs under a BSD licensing model. A public version of the software will be available for download in early 2013.  For more information, see www.intel.com/go/dpdk.

Once you download the Intel DPDK, here is the suggested reading order to use the kit:

  • Release Notes: Provides release-specific information, including supported features, limitations, fixed issues, and known issues. It also answers frequently asked questions.
  • Getting Started Guide: Describes how to install and configure the Intel DPDK; designed to get users up and running quickly with the software.
  • Programmer's Guide: Describes:
      — The software architecture and how to use it (through examples), specifically in a Linux* application (linuxapp) environment.
      — The content of the Intel DPDK, the build system (including the commands that can be used in the root Intel® DPDK Makefile to build the development kit and an application), and guidelines for porting an application.
      — Optimizations used in the software and those that should be considered for new development.
  • API Reference: Provides detailed information about Intel DPDK functions, data structures, and other programming constructs.
  • Sample Application User Guides: A set of guides, each describing a sample application that showcases specific functionality, with instructions on how to compile, run, and use the sample application.


The growing demand for more connected devices and data access over the network has pushed the vertical network model to its limit.  To save cost and reduce the power consumption of the network infrastructure, you may consider decreasing the number of physical assets by consolidating their functions through network virtualization on a common platform.  By using the Intel DPDK library on a common platform, you can:

  • experience faster network packet processing,
  • potentially reduce cost by simplifying the hardware to industry standard server architectures,
  • conserve energy by using power-optimized Intel platforms,
  • and increase efficiency by maximizing the utilization of your existing environment.


Product and Performance Information
Product and Performance Information


Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804