The Next Leap in Performance for Many of Today’s Most Critical Applications

On March 31, 2016, Intel Corporation announced the release of the next-generation Intel® Xeon® processor E5-2600 v4 product family. These new processors are designed to help organizations respond more quickly and intelligently as the business world continues to speed up. In almost every industry, critical applications have to process larger volumes of data at higher speeds to enable faster research and innovation and to support real-time, analytics-driven business models.

With higher core counts, enhanced virtualization capabilities, and increased memory bandwidth, this new processor family provides balanced resources for ramping up performance across a broad range of workloads. Based on benchmarks from leading independent software vendors (ISVs), it enables substantial performance gains for workloads as diverse as engineering simulations, financial trading, multimedia processing, core business applications, and big data analytics.

Performance is just one focus of improvement. Businesses also need more agile and efficient infrastructure solutions, so they can respond faster to their customers and keep rising costs under control. The Intel® Xeon® processor E5 v4 family includes integrated technologies to help businesses orchestrate and automate resource usage more intelligently in software-defined data centers. IT organizations and cloud operators can use these capabilities to improve workload balancing and consolidation ratios, and to prioritize performance for critical workloads running on shared infrastructure. 

Proven Performance Gains for Key Applications

Even the most powerful server platform is only as valuable as the applications it runs. Dozens of ISVs have already optimized and benchmarked their applications on servers powered by the Intel Xeon processor E5-2600 v4 product family. In many cases, they have combined these new processors with other Intel platform ingredients to enable even higher performance gains. 

Optimizing the Large-Scale Atomic/Molecular Massively Parallel Simulator (LAMMPS)* software code for the Intel® Xeon® processor E5-2600 v4 product family and Intel® Advanced Vector Extensions 2.0 (Intel® AVX 2) has enabled up to 5.5X faster performance for the open-source application. This dramatic speedup will help scientists look deeper into the behavior of complex molecular systems and explore more scenarios in less time.

INTES* optimized their PERMAS* finite element analysis software for the new processor family and added the Intel® Solid-State Drive Data Center Family for faster access to high volume data. Results show up to 2.8X faster simulation performance,² which will help INTES customers reduce manufacturing runtimes.

AppFormix* is taking advantage of the new Intel® Xeon® processor E5 v4 family to help improve performance and reliability for applications running on the AppFormix cloud platform. In addition to the increased execution resources, the new processor family includes Intel® Resource Director Technology (Intel® RDT), which can be used to monitor and control shared platform resources, such as cache and system memory. Based on performance testing, the new processors running optimized AppFormix cloud software enable up to 2.2X improvements in worst-case web server response times, with up to 1.5X faster average response times.

These are just a few examples of the benefits leading ISVs are delivering to their customers by optimizing their software for the new Intel® Xeon® processor E5 v4 product family. Substantial performance gains have been realized across a wide range of software categories, including technical computing, telecommunications, cloud and digital media, financial services and security, core business applications, enterprise database, and big data analytics.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. 

Performance tests, such as SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/performance.

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. Check with your system manufacturer or retailer or learn more at Intel.com.

Optimization Notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. 

Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.

Intel, the Intel logo, and Xeon are trademarks of Intel Corporation in the U.S. and/or other countries.


1 LAMMPS*: LAMMPS Stillinger-Weber Silicon Benchmark – 512K atoms workload. Testing by Intel, 3/21/2016

BASELINE: Intel® Xeon® Processor E5-2697 v3 on Grantley-EP (Wellsburg), with 64 GB Total Memory, 8 slots / 8 GB / 2133 MT/s / DDR4 RDIMM, HT on, on Red Hat Enterprise Linux* 6.5 kernel 2.6.32-431, LAMMPS: (25 Sep 2015+ LOCAL), compilers_and_libraries_2016.0.109, mpi5.1.2.RC1, Request Number: 2196

NEW: Intel® Xeon® Processor E5-2697 v4 on Grantley-EP (Wellsburg), with 64 GB Total Memory, 8 slots / 8 GB / 2400 MT/s / DDR4 RDIMM, HT on, on Red Hat Enterprise Linux* 6.5 kernel 2.6.32-431, LAMMPS: (25 Sep 2015+ LOCAL), compilers_and_libraries_2016.0.109, mpi5.1.2.RC1, Request Number: 2196

2 INTES PERMAS*: Total processing time workload. Testing by Intel, 2/16/2016

BASELINE: 1-Node, 2 x Intel® Xeon® Processor E5-2697 v3 on Grantley-EP (Wellsburg) with 256 GB Total Memory, 16 slots / 16 GB / 1866 MT/s / DDR4 RDIMM, 4x Seagate R10K 1TB, turbo on, on Red Hat Enterprise Linux* 7.1 kernel 3.10.0-229. Data Source: Request Number: 1939

NEW: 1-Node, 2 x Intel® Xeon® Processor E5-2699 v4 on Grantley-EP (Wellsburg) with 256 GB Total Memory, 16 slots / 16 GB / 2133 MT/s / DDR4 RDIMM, 4x Intel SSD P3600 1.6TB, turbo on, on Red Hat Enterprise Linux* 7.1 kernel 3.10.0-229. Data Source: Request Number: 1939

3 AppFormix: NGINX-based web server workload, 100KB Request Size. Testing by AppFormix, 3/3/2016.

BASELINE: NGINX web server (88 threads across both sockets) serving requests of 10MB, 1MB, 100KB, 10KB to external load generation system (below) on Ubuntu* 14.04, kernel v4.4 + Intel’s CAT v16 patch + MBM latest patch, 2 x Intel® Xeon® processor E5-2699 v4, 2.2GHz, 22 cores, 64GB DDR4-2133, standard RDIMMs, generic mass-market 7200RPM HDD, 10Gb network links via dual-port Intel X540-AT2 NICs (Model X540T2G1P5), BIOS Grantley 0271 with production microcode 0xE, C1E disabled and turbo disabled for test repeatability.

NEW: Intel’s Cache Allocation Technology (CAT) enabled via the AppFormix software suite, Linux cgroups patches (CAT v16 patch mentioned above: https://github.com/fyu1/linux/tree/cat16.1 ) and set to restrict the “noisy neighbor” applications to 10 percent of the L3 cache (effective CAT mask 0x00003 on a 20-way LLC).
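As a rough check of the figures in the NEW configuration, the share of the L3 cache that a CAT capacity bitmask grants can be estimated by counting its set bits against the number of cache ways. The sketch below assumes each mask bit corresponds to one way of the 20-way LLC described above; the function names are hypothetical and are not part of the AppFormix software or the Linux patches.

```python
def cat_mask_fraction(mask: int, total_ways: int) -> float:
    """Return the fraction of LLC ways granted by a CAT capacity bitmask.

    Assumption: one mask bit maps to one cache way.
    """
    if mask <= 0 or (mask >> total_ways) != 0:
        raise ValueError("mask must be non-zero and fit within total_ways bits")
    return bin(mask).count("1") / total_ways


def is_contiguous(mask: int) -> bool:
    """Check that the set bits form a single unbroken run.

    Intel CAT capacity bitmasks are expected to be contiguous.
    """
    if mask <= 0:
        return False
    # Drop trailing zeros so the run of ones starts at bit 0.
    shifted = mask >> ((mask & -mask).bit_length() - 1)
    # A solid run of ones plus one is a power of two, so the AND is zero.
    return shifted & (shifted + 1) == 0


if __name__ == "__main__":
    share = cat_mask_fraction(0x00003, 20)
    print(f"mask 0x00003 on a 20-way LLC: {share:.0%} of the cache")
```

Under that assumption, the two set bits in 0x00003 cover 2 of 20 ways, which matches the 10 percent limit quoted for the "noisy neighbor" applications.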

“Noisy neighbor” applications: 11 processes per socket of the industry-standard “Stream” benchmark, publicly available at https://www.cs.virginia.edu/stream/ref.html. One parameter changed to increase array size: stream.c: #define STREAM_ARRAY_SIZE    100000000
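To give a sense of scale for that array size, the following sketch estimates the memory footprint of the Stream processes. It assumes the benchmark's default element type (an 8-byte double) and its three working arrays, neither of which is stated in the configuration above; the helper name is hypothetical.

```python
STREAM_ARRAY_SIZE = 100_000_000   # from the #define quoted above
BYTES_PER_ELEMENT = 8             # assumption: default STREAM_TYPE is double
NUM_ARRAYS = 3                    # assumption: Stream's three working arrays
PROCESSES = 11 * 2                # 11 processes per socket, 2 sockets


def stream_footprint_bytes(array_size: int,
                           bytes_per_element: int = BYTES_PER_ELEMENT,
                           num_arrays: int = NUM_ARRAYS) -> int:
    """Estimate the bytes held in Stream's arrays for one process."""
    return array_size * bytes_per_element * num_arrays


if __name__ == "__main__":
    per_process = stream_footprint_bytes(STREAM_ARRAY_SIZE)
    total = per_process * PROCESSES
    print(f"per process: {per_process / 1e9:.1f} GB")
    print(f"all {PROCESSES} processes: {total / 1e9:.1f} GB")
```

With these assumptions each process works on arrays far larger than any L3 cache, so it streams continuously from main memory, which is presumably what makes it an effective "noisy neighbor" for the shared cache.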

External Load generation system: wg/WRK benchmarking tool running 22 threads on Ubuntu Linux 14.04, based on 2 x Intel® Xeon® L5520 @ 2.27GHz CPUs, 24GB DDR3-1067 with 10Gb networking (Intel® X540-AT2 NICs) over CAT7 copper.

 

For more complete information about compiler optimizations, see our Optimization Notice.