An overview of the 6th generation Intel® Core™ processor (code-named Skylake)

Introduction

The 6th generation Intel® Core™ processor (code-named Skylake) was launched in 2015. Based on improvements in the core, system-on-a-chip, and platform levels and new capabilities over the previous-generation 14nm processor (code-named Broadwell), Skylake is the processor-of-choice for productivity, creativity, and gaming applications across various form factors. This article provides an overview of the key capabilities and improvements in Skylake, along with exciting new usages like wake on voice and biometric login using Windows* 10.

Skylake architecture

The 6th generation Intel Core microarchitecture is built on 14nm technology. Its design goals were reduced processor and platform size for use in multiple form factors, Intel® architecture and graphics performance improvements, power reduction, and enhanced security features. Figure 1 illustrates these new capabilities and improvements. Actual configuration in OEM devices may vary.

Figure 1: Skylake architecture and improvement summary [1].

Core processor vectors

Performance

Performance improvement is a direct result of providing more instructions to the execution units, so that more instructions are executed per clock. This was accomplished through four categories of improvements [1]:

  • Improved front-end. Smarter, higher-capacity branch prediction, wider instruction decoding, and faster, more efficient prefetch.
  • Enhanced instruction parallelism. With more instructions per clock, the parallelism of instruction execution is improved through deeper out-of-order buffers.
  • Improved execution units (EUs). The EUs are enhanced compared to the previous generations through:
    • Shortened latencies
    • Increased number of EUs
    • Improved power efficiency by turning off units not in use
    • Increased execution speed for security algorithms
  • Improved memory subsystem. To keep up with the improved front-end, instruction parallelism, and EUs, the memory subsystem has been improved to meet their bandwidth and performance requirements. This has been accomplished through:
    • Higher load/store bandwidth
    • Prefetcher improvements
    • Deeper storage
    • Fill and write-back buffers
    • Improved page miss handling
    • Improvements to L2 cache miss bandwidth
    • New instructions for cache management

Figure 2: Skylake core uArchitecture at a glance.

Figure 3 shows the resulting increase in parallelism in Skylake compared to previous generations of processors (Sandy Bridge is the 2nd generation and Haswell the 4th generation of Intel® Core™ processors).

Figure 3: Increased parallelism over past generations of processors.

The improvements shown in Figure 3 and more resulted in up to a 60-percent increase in performance compared to a five-year-old PC, with up to 6 times faster video transcoding and up to 11 times the graphics performance.

Figure 4: Performance of 6th generation Intel® Core™ processor compared to a five-year-old PC.

  1. Source: Intel Corporation. Based on estimated SYSmark* 2014 scores comparing Intel® Core™ i5-6500 and Intel® Core™ i5-650 processors.
  2. Source: Intel Corporation. Based on estimated Handbrake w/ QSV scores comparing Intel® Core™ i5-6500 and Intel® Core™ i5-650 processors.
  3. Source: Intel Corporation. Based on estimated 3DMark* “Cloud Gate” scores comparing Intel® Core™ i5-6500 and Intel® Core™ i5-650 processors.

Detailed performance benchmarks for desktop and laptop can be found at the following links:

Desktop performance benchmark: http://www.intel.com/content/www/us/en/benchmarks/desktop/6th-gen-core-i5-6500.html

Laptop performance benchmark: http://www.intel.com/content/www/us/en/benchmarks/laptop/6th-gen-core-i5-6200u.html

Energy efficiency

Resource configuration based on dynamic consumption:

Legacy systems use the Intel® SpeedStep® technology for balancing performance with energy efficiency through a demand-based algorithm controlled by the OS. While this works well for steady workloads, it is not optimal for bursty workloads. In Skylake, Intel® Speed Shift Technology shifts control from the OS to the hardware and allows the processor to go to a maximum clock speed in ~1 ms, providing for finer-grained power management [3].

Figure 5: Comparison of Intel® Speed Shift Technology with Intel® SpeedStep® technology.
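
To see why a faster frequency ramp matters for bursty workloads, consider the following minimal sketch. It is a toy model, not a description of how either technology is implemented: the core is assumed to run at a low frequency until the ramp time has elapsed and then to switch to its maximum frequency. Only the ~1 ms figure comes from the text above; the burst size, the frequencies, and the 30 ms OS-controlled ramp time are hypothetical values chosen for easy arithmetic.

```python
def completion_time(work_cycles, f_low_hz, f_max_hz, ramp_s):
    """Seconds needed to finish a burst of `work_cycles` cycles.

    Toy model: the core runs at f_low_hz until ramp_s seconds have passed,
    then switches instantly to f_max_hz.
    """
    cycles_before_ramp = f_low_hz * ramp_s
    if work_cycles <= cycles_before_ramp:
        # The burst ends before the frequency is ever raised.
        return work_cycles / f_low_hz
    return ramp_s + (work_cycles - cycles_before_ramp) / f_max_hz


# Hypothetical numbers chosen for easy arithmetic, not measured values.
BURST_CYCLES = 40e6   # size of one short burst of work
F_LOW_HZ = 1.0e9      # frequency before the ramp
F_MAX_HZ = 2.0e9      # frequency after the ramp
HW_RAMP_S = 0.001     # ~1 ms, the figure quoted in the text
OS_RAMP_S = 0.030     # assumed OS-controlled ramp time, for contrast only

hw_ms = completion_time(BURST_CYCLES, F_LOW_HZ, F_MAX_HZ, HW_RAMP_S) * 1e3
os_ms = completion_time(BURST_CYCLES, F_LOW_HZ, F_MAX_HZ, OS_RAMP_S) * 1e3
print(f"hardware-controlled ramp: {hw_ms:.1f} ms")
print(f"OS-controlled ramp:       {os_ms:.1f} ms")
```

With these assumed values the burst finishes in 20.5 ms with the fast ramp and 35.0 ms with the slow one. The gap grows as bursts get shorter relative to the ramp time, which is why a steady workload sees little difference.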

On the Intel® Core™ i5-6200U processor, Intel Speed Shift Technology improves response time over Intel SpeedStep technology as follows:

  • Up to 45-percent improved responsiveness
  • Photo enhancement up to 45 percent
  • Sales graphs up to 31 percent
  • Local notes up to 22 percent
  • Overall responsiveness up to 20 percent

[Measured by WebXPRT* 2015, a benchmark from Principled Technologies* that measures the performance of web applications using overall and subtests for photo enhancements, local notes, and sales graphs. Find out more at www.principledtechnologies.com.]

Additional power optimization is also achieved by configuring resources based on dynamic consumption, be it through downscaling of resources that are underutilized or power gating of Intel® Advanced Vector Extensions 2 when not in use, as well as through idle power reduction.

Media and graphics

Intel® HD Graphics capabilities have come a long way in terms of 3D graphics, media and display capabilities, performance, power envelopes and configurability/scalability since processor graphics (the core processor and graphics on the same die) was first introduced in the 2nd generation Intel® Core™ processors. Figure 6 compares some of these improvements that provide a >100X improvement in graphics performance [2].

[Peak Shader FLOPS @ 1 GHz]

Figure 6: Generational features in processor graphics.

Figure 7: Generational improvement in graphics and media.

Gen9 uArchitecture

The Generation 9 (Gen9) graphics architecture is similar to the Gen8 microarchitecture in the 5th generation Intel® Core™ processor (code-named Broadwell) but has been enhanced for performance and scalability. Figure 8 shows a block diagram of the Gen9 uArchitecture [8], which has three main components.

  • Display. On the far left side.
  • Unslice. The L-shaped piece in the center; handles the command streamer, global thread dispatcher, and the Graphics Technology Interface (GTI).
  • Slice. Comprises the EUs.

Compared to Gen8, the Gen9 uArchitecture improves performance per watt and throughput, and gives the unslice component a separate power/clock domain. This separate domain makes power management more intelligent for uses like media playback. The slice component is configurable. For example, while GT3 can support up to 2 slices (each slice with 24 EUs), GT4 (Halo) can scale up to 3 slices. The GTx designation reflects the number of EUs: GT1 supports 12, GT2 supports 24, GT3 supports 48, and GT4 supports 72. The architecture is configurable enough to scale down the number of EUs for low-power scenarios, allowing for usages that range from less than 4 W to more than 65 W. API support in Gen9 is available for DirectX* 12, OpenCL™ 2.x, OpenGL* 4.x, and Vulkan*.

Figure 8: Gen9 processor graphics architecture.

You can read more about these components in The Compute Architecture of Intel® Processor Graphics Gen9 [8]: https://software.intel.com/sites/default/files/managed/c5/9a/The-Compute-Architecture-of-Intel-Processor-Graphics-Gen9-v1d0.pdf
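
The EU counts quoted above translate directly into theoretical peak shader throughput. The sketch below works through that arithmetic. The EU counts and the 24-EU slice size come from the text; the figure of 16 single-precision operations per EU per clock is an assumption taken from the Gen9 compute architecture paper [8], not from this article, and the function names are illustrative only.

```python
EUS_PER_SLICE = 24  # from the text: each slice has 24 EUs

# EU counts per configuration, as listed in the text.
EU_COUNT = {"GT1": 12, "GT2": 24, "GT3": 48, "GT4": 72}

# Assumption taken from reference [8], not from this article:
# 2 FPUs x SIMD-4 x 2 ops (multiply + add) = 16 FP32 operations per clock.
FLOPS_PER_EU_PER_CLOCK = 16


def slices(config):
    """EU count expressed in units of a full 24-EU slice (GT1 gives 0.5)."""
    return EU_COUNT[config] / EUS_PER_SLICE


def peak_gflops(config, clock_ghz=1.0):
    """Theoretical peak single-precision GFLOPS at the given clock."""
    return EU_COUNT[config] * FLOPS_PER_EU_PER_CLOCK * clock_ghz


for name in EU_COUNT:
    print(f"{name}: {EU_COUNT[name]} EUs, {slices(name):g} slice(s), "
          f"{peak_gflops(name):g} GFLOPS peak at 1 GHz")
```

Under that assumption, GT2 peaks at 384 GFLOPS and GT4 at 1152 GFLOPS at 1 GHz. These are upper bounds from the arithmetic alone; sustained throughput depends on the workload, clocks, and memory bandwidth.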

Some of the capabilities and improvements for media include the following [2]:

  • < 1 W consumption and 1 W videoconferencing
  • Camera RAW acceleration with new VQE functions to enable 4K60 RAW video on mobile platforms
  • New Intel® Quick Sync Video Fixed-Function (FF) Mode
  • Rich codec support with fixed function and GPU accelerated decode

Figure 9 gives a snapshot of Gen9 codecs.

Note: Media codec and processing support may not be available on all operating systems and applications.

Figure 9: Codec support in Skylake.

Some of the capabilities and improvements on the display include the following:

  • Panel Blend, Scale, Rotate, Compress
  • High PPI support (4K+)
  • Wireless support up to 4K30
  • Self Refresh (PSR2)
  • CUI X.X – New capabilities, improved performance

For gaming enthusiasts, the Intel® Core™ i7-6700K processor comes with all these features and improvements (see Figure 10). It also includes Intel® Turbo Boost Technology 2.0, Intel® Hyper-Threading Technology, and overclocking support. Performance is up to 80 percent better than a five-year-old PC. Additional information can be obtained here: http://www.intel.com/content/www/us/en/processors/core/core-i7ee-processor.html

  1. Source: Intel Corporation. Based on estimated SPECint*_rate_base2006 (8 copy rate) scores comparing Intel® Core™ i7-6700K and Intel® Core™ i7-875K processors.
  2. Source: Intel Corporation. Based on estimated SPECint*_rate_base2006 (8 copy rate) scores comparing Intel® Core™ i7-6700K and Intel® Core™ i7-3770K processors.
  3. Features are present with select chipsets and processor combinations. Warning: Altering clock frequency and/or voltage may (i) reduce system stability and useful life of the system and processor; (ii) cause the processor and other system components to fail; (iii) cause reductions in system performance; (iv) cause additional heat or other damage; and (v) affect system data integrity. Intel has not tested, and does not warranty, the operation of the processor beyond its specification.

Figure 10: Features in the Intel® Core™ i7-6700K processor.

Scalability

Skylake microarchitecture provides for a configurable core—a single design with two derivatives, one for the client space and another for servers—without compromising the power and performance requirements of each segment. Figure 11 shows the various SKUs and their power efficiencies for use in form factors that range from a compute stick on the low end to Intel® Xeon® processor-based workstations on the high end.

Figure 11: Intel® Core™ processor availability across various form factors.

Enhanced security features

Intel® Software Guard Extensions (Intel® SGX): Intel SGX is a set of new instructions provided in Skylake that allows application developers to protect sensitive data from unauthorized modification and access from rogue software running at higher privilege levels. This allows applications to preserve the confidentiality and integrity of sensitive information [1],[3]. Skylake provides instructions and flows to create secure enclaves and enables usage of trusted memory regions. More information about Intel SGX can be obtained here: https://software.intel.com/en-us/blogs/2013/09/26/protecting-application-secrets-with-intel-sgx

Intel® Memory Protection Extensions (Intel® MPX): Intel MPX is a new set of instructions that enables runtime buffer overflow checks. These instructions allow both stack and heap buffer boundaries to be tested before a memory access, ensuring that the calling process only accesses memory that is allocated to it. Intel MPX support is enabled in Windows* 10, with support for Intel MPX intrinsics in Microsoft Visual Studio* 2015. Most C/C++ applications can use Intel MPX simply by being recompiled, without source code changes, and can still interoperate with legacy libraries. Running an Intel MPX-enabled library on a legacy system without Intel MPX support (5th generation Intel® Core™ processors and earlier) provides no benefit and has no impact. It is also possible to enable or disable Intel MPX support dynamically [1], [3].
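
Because these features are present only on some processors, software typically checks for them before relying on them. The following is a minimal sketch of one way to do that, assuming a Linux system where the kernel lists CPU feature flags in /proc/cpuinfo under the names sgx, mpx, avx2, and hwp (the last being the hardware-controlled performance states behind Intel Speed Shift Technology). The function names and the feature table are illustrative, not part of any Intel or Microsoft API, and this approach does not apply to Windows.

```python
from pathlib import Path

# Feature name -> flag name as reported by the Linux kernel (assumed here).
FEATURE_FLAGS = {
    "Intel SGX": "sgx",
    "Intel MPX": "mpx",
    "Intel AVX2": "avx2",
    "Intel Speed Shift (HWP)": "hwp",
}


def parse_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def supported_features(cpuinfo_text):
    """Map each feature name to True if its flag is present."""
    flags = parse_flags(cpuinfo_text)
    return {name: flag in flags for name, flag in FEATURE_FLAGS.items()}


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    text = path.read_text() if path.exists() else ""
    for name, present in supported_features(text).items():
        print(f"{name}: {'yes' if present else 'no'}")
```

A missing flag does not prove that the hardware lacks the feature: firmware settings or the kernel version can hide a flag, so treat the result as a hint.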

We’ve covered the architectural improvements and advancements in Skylake. In the next section, we’ll look at some of the Windows 10 features that are optimized to take advantage of Intel® Core™ processor architecture.

New experiences with Windows 10

The capabilities in the 6th generation Intel Core processor are accentuated by the capabilities within Windows 10, creating an optimal experience. Below are some of the key hardware capabilities from Intel and Windows 10 capabilities that make the Intel® platforms running on Windows 10 more energy efficient, secure, responsive, and scalable [3].

Ϯ Intel and Microsoft active collaboration under way for future Windows support.

Figure 12: Skylake and Windows* 10 capabilities.

Cortana

The Microsoft Cortana* Voice Assistant available with the Windows* 10 RTM allows for a hands-free experience using the Hey Cortana keyword spotter. While the wake-on-voice capability uses the CPU audio processing pipeline to achieve a high Correct Accept rate and a low False Accept rate, the capability can also be offloaded to a hardware audio DSP, for which Windows 10 has built-in support [3].

Windows Hello*

Using biometric hardware and Microsoft Passport*, Windows Hello supports various types of logins using the face, fingerprint, and iris for a password-free, out-of-the-box login experience. The user-facing Intel® RealSense™ camera (F200/SR300) supports biometric authentication using facial login [3].

Figure 13: Windows* Hello with Intel® RealSense™ Technology.

The photos in Figure 13 show how the facial landmarks provided by the F200 camera are used for the enrollment and login scenarios. The 78 landmark points on the face are used to create a facial template the first time a user tries to log in with face recognition. The next time the user tries to log in, the landmarks from the camera are verified against the template to obtain a match. Together with the Microsoft Passport security features and the camera features, the login capability achieves a False Acceptance Rate of 1 in 100,000 with a False Rejection Rate of 2 to 4 percent.
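
To put these two rates in concrete terms, the sketch below works through the arithmetic. The rates are the ones quoted above. The assumption that every login attempt is statistically independent is a simplification made here for illustration, and the function names are hypothetical.

```python
FAR = 1 / 100_000                 # False Acceptance Rate quoted in the text
FRR_LOW, FRR_HIGH = 0.02, 0.04    # False Rejection Rate range quoted in the text


def expected_false_accepts(impostor_attempts, far=FAR):
    """Average number of impostor attempts that would be wrongly accepted."""
    return impostor_attempts * far


def prob_any_false_accept(impostor_attempts, far=FAR):
    """Probability that at least one impostor attempt is wrongly accepted."""
    return 1 - (1 - far) ** impostor_attempts


def prob_login_within(tries, frr):
    """Probability that the genuine user is accepted within `tries` attempts."""
    return 1 - frr ** tries


print(f"Expected false accepts in 100,000 impostor attempts: "
      f"{expected_false_accepts(100_000):.2f}")
print(f"Chance of any false accept in 1,000 impostor attempts: "
      f"{prob_any_false_accept(1_000):.4%}")
for frr in (FRR_LOW, FRR_HIGH):
    print(f"FRR {frr:.0%}: genuine user accepted within 2 tries "
          f"with probability {prob_login_within(2, frr):.4f}")
```

Under the independence assumption, even at the 4 percent end of the rejection range a genuine user who is rejected once is very likely to be accepted on the second attempt.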

References

  1. Intel’s next generation microarchitecture, code-named Skylake, by Julius Mandelblat: http://intelstudios.edgesuite.net/idf/2015/sf/ti/150818_spcs001/index.html
  2. Next-generation Intel® processor graphics architecture, code-named Skylake, by David Blythe: http://intelstudios.edgesuite.net/idf/2015/sf/ti/150818_spcs003/index.html
  3. Intel® architecture code-named Skylake and Windows* 10 better together, by Shiv Koushik: http://intelstudios.edgesuite.net/idf/2015/sf/ti/150819_spcs009/index.html
  4. Skylake for gamers: http://www.intel.com/content/www/us/en/processors/core/core-i7ee-processor.html
  5. Intel’s best processor ever: http://www.intel.com/content/www/us/en/processors/core/core-processor-family.html
  6. Skylake Desktop Performance Benchmark: http://www.intel.com/content/www/us/en/benchmarks/desktop/6th-gen-core-i5-6500.html
  7. Skylake Laptop Performance Benchmark: http://www.intel.com/content/www/us/en/benchmarks/laptop/6th-gen-core-i5-6200u.html
  8. The compute architecture of Intel® processor graphics Gen9: https://software.intel.com/sites/default/files/managed/c5/9a/The-Compute-Architecture-of-Intel-Processor-Graphics-Gen9-v1d0.pdf