Code and Data Prioritization - Introduction and Usage Models in the Intel® Xeon® Processor E5 v4 Family

Introduction

Code and Data Prioritization (CDP), introduced on the Intel® Xeon® processor E5 v4 family, is a specialized extension of Cache Allocation Technology (CAT) that enables software control over code and data placement in the last-level cache (LLC).

While this article focuses on the architecture and use models, the next article in this series describes software support for CDP and example proof points.

High-Level Architecture

The architecture of CDP is based on an extension to the existing Cache Allocation Technology (CAT) feature set, with many components reused.

Specific details are provided in the Intel Software Developer Manuals, and some key points are highlighted below.

With CDP, a new CPUID feature flag is added within the CAT sub-leaves, at CPUID.0x10.[ResID=1]:ECX[bit 2], to indicate support.
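The support check described above comes down to testing a single bit. The following is a minimal sketch, assuming the caller has already obtained the raw ECX value returned by CPUID leaf 0x10, sub-leaf (ResID) 1. Executing CPUID is platform-specific and is not shown here. The function name and the sample register values are hypothetical.

```python
# Bit position of the CDP flag in ECX of CPUID leaf 0x10, sub-leaf (ResID) 1,
# as stated in the text above.
CDP_ECX_BIT = 2


def cdp_supported(ecx: int) -> bool:
    """Return True if the CDP feature bit is set in the given ECX value.

    `ecx` is assumed to be the raw 32-bit ECX result of CPUID with
    EAX=0x10 and ECX=1 (hypothetical input; how it is read is not shown).
    """
    return bool((ecx >> CDP_ECX_BIT) & 1)


# Hypothetical register values, for illustration only.
print(cdp_supported(0b100))  # bit 2 set
print(cdp_supported(0b011))  # bit 2 clear
```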

As shown in Figure 1, an enable MSR is added (IA32_L3_QOS_CFG, at address 0xC81), and setting bit[0] enables CDP:

                                                        

Figure 1: The CDP Enable bit in the IA32_L3_QOS_CFG MSR.
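Setting the enable bit is a read-modify-write of IA32_L3_QOS_CFG. The sketch below keeps the bit manipulation in pure functions and adds one optional helper that assumes a Linux system with the `msr` kernel driver loaded and root privileges (the `/dev/cpu/<n>/msr` device). That access path is an assumption of this example, not something the article specifies, and the helper names are hypothetical.

```python
import os
import struct

IA32_L3_QOS_CFG = 0xC81  # MSR address given in the text
CDP_ENABLE_BIT = 0       # bit[0] enables CDP


def with_cdp_enabled(cfg: int) -> int:
    """Return the MSR value with the CDP enable bit set, other bits unchanged."""
    return cfg | (1 << CDP_ENABLE_BIT)


def is_cdp_enabled(cfg: int) -> bool:
    """Return True if the CDP enable bit is set in the MSR value."""
    return bool(cfg & (1 << CDP_ENABLE_BIT))


def enable_cdp_on_cpu(cpu: int) -> None:
    """Read-modify-write IA32_L3_QOS_CFG through the Linux msr device.

    Assumption: /dev/cpu/<cpu>/msr exists and the caller is root.
    Which logical CPUs need the write is not covered by this sketch.
    """
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDWR)
    try:
        # The msr device uses the MSR address as the file offset.
        (cfg,) = struct.unpack("<Q", os.pread(fd, 8, IA32_L3_QOS_CFG))
        os.pwrite(fd, struct.pack("<Q", with_cdp_enabled(cfg)), IA32_L3_QOS_CFG)
    finally:
        os.close(fd)
```

Keeping the bit logic separate from the device access makes it possible to check the former without privileged hardware access.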

 

Before enabling CDP, software should program all threads into the lowest Class of Service (CLOS[0]) and set all capacity bitmasks (CBMs) to full length, to prevent unintended side effects of cache capacity mask re-indexing.
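The preparation step can be sketched as computing the values to write before the enable bit is set. This example relies on two details taken from the Intel Software Developer Manuals rather than from this article: the per-thread association MSR IA32_PQR_ASSOC at address 0xC8F, and the CLOS field occupying its upper 32 bits. The number of mask MSRs and the CBM length are hypothetical inputs here; on real hardware they would be enumerated through CPUID.

```python
IA32_PQR_ASSOC = 0xC8F     # per-thread association MSR (from the SDM, not this article)
IA32_L3_MASK_BASE = 0xC90  # first mask MSR, IA32_L3_QOS_Mask_0


def full_cbm(cbm_len: int) -> int:
    """Full-length capacity bitmask: `cbm_len` contiguous one bits."""
    if cbm_len <= 0:
        raise ValueError("cbm_len must be positive")
    return (1 << cbm_len) - 1


def pqr_assoc_clos0(current: int) -> int:
    """Clear the CLOS field (upper 32 bits), keeping the lower 32 bits as-is."""
    return current & 0xFFFFFFFF


def pre_enable_mask_writes(num_masks: int, cbm_len: int) -> list:
    """(msr_address, value) pairs that reset every mask MSR to full length."""
    mask = full_cbm(cbm_len)
    return [(IA32_L3_MASK_BASE + i, mask) for i in range(num_masks)]


# Hypothetical platform: 4 mask MSRs, 20-bit CBMs.
for msr, value in pre_enable_mask_writes(num_masks=4, cbm_len=20):
    print(f"write MSR 0x{msr:X} <- 0x{value:X}")
```

Each thread would additionally have its IA32_PQR_ASSOC value rewritten with `pqr_assoc_clos0` so that it runs in CLOS[0] before CDP is switched on.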

When CDP is enabled, the existing mask space is re-indexed to provide separate control over code and data, as shown below:

Figure 2: CDP mask details. When CDP is enabled, each CLOS is provided one mask for code and one for data.

As shown in Figure 2, CDP gives each CLOS separate masks for code and data. With traditional CAT, each CLOS maps 1:1 to a CBM. With CDP enabled, the mapping becomes 1:2 (each CLOS maps to two CBMs, one for code and one for data).

Note that when CDP is enabled, the number of available CLOS is effectively halved, because the same mask MSR space is reused, as shown in the table below (the table abbreviates CLOS as COS):

| Mask MSR | MSR Address | CAT-only Operation | CDP Operation |
|---|---|---|---|
| IA32_L3_QOS_Mask_0 | 0xC90 | COS0 | COS0.Data |
| IA32_L3_QOS_Mask_1 | 0xC91 | COS1 | COS0.Code |
| IA32_L3_QOS_Mask_2 | 0xC92 | COS2 | COS1.Data |
| IA32_L3_QOS_Mask_3 | 0xC93 | COS3 | COS1.Code |
| IA32_L3_QOS_Mask_4 | 0xC94 | COS4 | COS2.Data |
| IA32_L3_QOS_Mask_5 | 0xC95 | COS5 | COS2.Code |
| .... | .... | .... | .... |
| IA32_L3_QOS_Mask_'2n' | 0xC90+2n | COS'2n' | COS'n'.Data |
| IA32_L3_QOS_Mask_'2n+1' | 0xC90+2n+1 | COS'2n+1' | COS'n'.Code |
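The re-indexing in the table follows a simple rule: under CDP, class n uses the mask MSR at 0xC90+2n for data and the one at 0xC90+2n+1 for code. The sketch below expresses that rule, along with the halving of the usable class count. The function names are hypothetical and chosen only for this example.

```python
MASK_BASE = 0xC90  # address of IA32_L3_QOS_Mask_0


def cat_mask_msr(clos: int) -> int:
    """CAT-only operation: class n uses mask MSR n (1:1 mapping)."""
    return MASK_BASE + clos


def cdp_mask_msrs(clos: int) -> dict:
    """CDP operation: class n uses mask 2n for data and mask 2n+1 for code."""
    return {"data": MASK_BASE + 2 * clos, "code": MASK_BASE + 2 * clos + 1}


def cdp_role(msr: int) -> tuple:
    """Inverse mapping: which class and role a mask MSR serves under CDP."""
    index = msr - MASK_BASE
    if index < 0:
        raise ValueError("address is below the mask MSR range")
    return (index // 2, "code" if index % 2 else "data")


def cdp_clos_count(cat_clos_count: int) -> int:
    """Classes usable with CDP enabled, given the CAT-only class count."""
    return cat_clos_count // 2


for clos in range(3):
    msrs = cdp_mask_msrs(clos)
    print(f"COS{clos}: data=0x{msrs['data']:X} code=0x{msrs['code']:X}")
```

For example, a platform that enumerates 16 classes in CAT-only operation would have 8 usable classes with CDP enabled under this rule.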

 

 

As with other QoS features, software may update the CDP settings at any time.

Key Usage Models

Key usage models include protecting the code of certain applications in the L3 cache, including applications with large code footprints and large data footprints, which may otherwise contend for LLC space. Additionally, certain latency-sensitive applications, such as communications apps, may benefit because code is more likely to be in the L3 cache when needed (rather than having to be fetched from memory).
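As one illustration of the code-protection usage model, a latency-sensitive class could be given a code mask that no other mask overlaps. The sketch below assumes a hypothetical 20-bit CBM and a hypothetical split of 4 bits for protected code and 16 bits for everything else. Real values depend on the platform and the workload. It also assumes the CAT rule, from the Intel Software Developer Manuals rather than this article, that the set bits of a CBM must be contiguous.

```python
def contiguous_mask(first_bit: int, num_bits: int) -> int:
    """Build a CBM with `num_bits` contiguous ones starting at `first_bit`."""
    if first_bit < 0 or num_bits <= 0:
        raise ValueError("first_bit must be >= 0 and num_bits must be > 0")
    return ((1 << num_bits) - 1) << first_bit


def is_contiguous(mask: int) -> bool:
    """Return True if the set bits of `mask` form one unbroken run."""
    if mask <= 0:
        return False
    lowest = mask & -mask    # isolate the lowest set bit
    carried = mask + lowest  # a single run collapses into one higher bit
    return carried & (carried - 1) == 0


def overlaps(a: int, b: int) -> bool:
    """Return True if two CBMs share any cache capacity."""
    return (a & b) != 0


# Hypothetical plan: COS1 holds the latency-sensitive application.
shared = contiguous_mask(0, 16)           # lower 16 bits, shared by default
protected_code = contiguous_mask(16, 4)   # upper 4 bits, reserved for COS1 code
plan = {
    "COS0.Data": shared,
    "COS0.Code": shared,
    "COS1.Data": shared,
    "COS1.Code": protected_code,
}

for name, mask in plan.items():
    print(f"{name}: 0x{mask:05X} contiguous={is_contiguous(mask)}")
```

Because `protected_code` does not overlap any other mask in this plan, other classes and COS1's own data cannot allocate into that portion of the cache.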

General software support may be enabled by an OS, hypervisor, or container management framework to control apps, containers, and guests running on a platform. Optionally, the capabilities could also be exposed northward to orchestration software to enable centralized automation or tracking of isolation against goals.

Note that in many cases CDP, as a specialized feature, may not necessarily provide a throughput increase for the platform, but may instead trade throughput for increased determinism and performance consistency. The results will depend on the workload characteristics involved.

Conclusion

CDP enables software control over code and data placement in the LLC, which may benefit code-sensitive applications including communications workloads. The CDP feature is based on an extension of existing CAT controls, which enables efficient reuse of infrastructure while minimizing software-enabling overhead.

The next article in this series describes software support for CDP and example proof points.

 

Notices

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.

This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps.

The products and services described may contain defects or errors known as errata which may cause deviations from published specifications. Current characterized errata are available on request.

Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548-4725 or by visiting www.intel.com/design/literature.htm.

This sample source code is released under the Intel Sample Source Code License Agreement.

 

 Intel, the Intel logo, and Xeon are trademarks of Intel Corporation in the U.S. and/or other countries.

*Other names and brands may be claimed as the property of others.

© 2016 Intel Corporation.
