Intel's Cache Monitoring Technology (CMT) feature was introduced with the Intel® Xeon® E5 2600 v3 product family in 2014.
CMT is part of a larger series of technologies called Intel® Resource Director Technology (RDT). More information on the Intel RDT feature set can be found here, and an animation illustrating the key principles behind Intel RDT is posted here.
CMT enables fine-grained tracking of L3 cache occupancy, which allows detailed profiling of threads, applications, or VMs.
Previous blog posts referenced below provide an overview of various aspects of the feature:
This blog, the fourth in the series, discusses details of available Operating System (OS) support, and software packages which can be used to test the feature.
Key ingredients discussed in this installment include the Linux* operating system, the perf profiling suite, and a software package available from Intel that can be used on POSIX operating systems to monitor L3 cache usage on a per-application or per-VM basis by pinning applications or VMs to cores.
Using the CMT capabilities is straightforward from a code development perspective, since model-specific registers (MSRs) provide the interface to set up and query this capability. All modern operating systems provide application programming interfaces (APIs) or tools that enable users with the appropriate privilege to read and write MSRs. Linux* provides the msr-tools package, which includes the rdmsr and wrmsr commands. Microsoft Windows* provides a similar interface. There are two high-level approaches to Cache Monitoring: standalone monitoring and scheduler-based monitoring.
Several software development initiatives are in progress to enable standalone and scheduler-based monitoring; they are described in the following sections.
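To make the MSR interface mentioned above more concrete, here is a minimal Python sketch of how the values written to and read from the monitoring registers are laid out. The register addresses and bit positions are taken from the Intel Software Developer's Manual description of the monitoring MSRs; the helper names are hypothetical, and the upscaling factor is a placeholder for the value that CPUID leaf 0xF reports on the target system. The sketch only computes values and does not touch hardware.

```python
# Illustrative only: helper names are hypothetical, no MSR is accessed here.
IA32_QM_EVTSEL = 0xC8D  # selects the (RMID, event) pair to read
IA32_QM_CTR = 0xC8E     # returns the counter for that selection
IA32_PQR_ASSOC = 0xC8F  # associates the current logical CPU with an RMID

L3_OCCUPANCY_EVENT = 0x01
RMID_LIMIT = 1 << 10    # the RMID field is 10 bits wide


def pqr_assoc_value(rmid, current=0):
    """Return 'current' with the RMID field (bits 9:0) replaced by 'rmid'."""
    if not 0 <= rmid < RMID_LIMIT:
        raise ValueError("RMID does not fit in bits 9:0")
    return (current & ~(RMID_LIMIT - 1)) | rmid


def qm_evtsel_value(rmid, event_id=L3_OCCUPANCY_EVENT):
    """Event ID goes in bits 7:0, RMID in bits 41:32."""
    if not 0 <= rmid < RMID_LIMIT:
        raise ValueError("RMID does not fit in bits 41:32")
    if not 0 <= event_id < (1 << 8):
        raise ValueError("event ID does not fit in bits 7:0")
    return (rmid << 32) | event_id


def decode_qm_ctr(raw, upscaling_factor):
    """Return occupancy in bytes, or None if the Error (bit 63) or
    Unavailable (bit 62) flag is set."""
    if raw & (1 << 63) or raw & (1 << 62):
        return None
    return (raw & ((1 << 62) - 1)) * upscaling_factor
```

With msr-tools, the value from `qm_evtsel_value` would be written with `wrmsr` to register 0xC8D on the CPU of interest, and the result read back with `rdmsr` from register 0xC8E before being passed to `decode_qm_ctr`.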
Scheduler-based Cache Monitoring makes sure that the application of interest is tracked with the appropriate core and RMID association. Under Linux* this is achieved by integrating CMT into perf and its kernel support, which is tightly bound to the Linux* scheduler's functionality.
On supported platforms (where both the processor and the OS support CMT), perf is used to specify which process or thread is to be monitored, and an RMID is assigned to it. All threads that are not being monitored are assigned a default RMID, which captures the occupancy associated with those threads. Once perf has configured the system for monitoring, context switches for the monitored threads result in a callback into the perf_events subsystem. When the CMT callback from the scheduler occurs (during the ‘context_switch’ kernel function), the perf_events subsystem selects the RMID associated with the thread being scheduled and assigns it to the CPU. This RMID is either one allocated for explicit monitoring or the default RMID, if the scheduled thread has not been configured for monitoring. From this point until the next context switch, memory read requests from this logical processor, and the cache loads that result from them, are attributed to the RMID just set up.
When a process or thread that is tracked for cache occupancy terminates, or when the sched_out function call occurs, the perf CMT callback selects a new RMID. In this case the default RMID is selected so that cache loads are not counted towards any explicitly monitored thread. After the monitored process terminates, its RMID is returned to a pool of unused RMIDs and is recycled for new monitoring requests. Mainstream support for these capabilities is targeted at Linux* kernel version 3.19.
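The RMID life cycle described above (assignment on request, selection at context switch, fallback to the default RMID, and return to the pool on exit) can be summarized in a toy model. This is a sketch for illustration, not kernel code: the class and method names are hypothetical, and using 0 as the default RMID is an assumption of the model.

```python
class RmidScheduler:
    """Toy model of scheduler-based RMID handling (not kernel code)."""

    DEFAULT_RMID = 0  # assumption: RMID 0 collects all unmonitored threads

    def __init__(self, max_rmid):
        self.free = list(range(1, max_rmid + 1))  # pool of unused RMIDs
        self.task_rmid = {}  # monitored thread ID -> RMID
        self.cpu_rmid = {}   # logical CPU -> RMID currently in effect

    def monitor(self, tid):
        """Assign an RMID to a thread that should be monitored."""
        if tid not in self.task_rmid:
            if not self.free:
                raise RuntimeError("no free RMID available")
            self.task_rmid[tid] = self.free.pop(0)
        return self.task_rmid[tid]

    def sched_in(self, cpu, tid):
        """Context switch: the CPU takes the incoming thread's RMID."""
        rmid = self.task_rmid.get(tid, self.DEFAULT_RMID)
        self.cpu_rmid[cpu] = rmid
        return rmid

    def sched_out(self, cpu):
        """The CPU falls back to the default RMID."""
        self.cpu_rmid[cpu] = self.DEFAULT_RMID

    def task_exit(self, tid):
        """Return the thread's RMID to the pool for later reuse."""
        rmid = self.task_rmid.pop(tid, None)
        if rmid is not None:
            self.free.append(rmid)


sched = RmidScheduler(max_rmid=2)
sched.monitor(100)      # thread 100 is explicitly monitored
sched.sched_in(0, 100)  # CPU 0 now runs under thread 100's RMID
sched.sched_in(1, 200)  # thread 200 is unmonitored: default RMID
sched.sched_out(0)
sched.task_exit(100)
```

The model deliberately ignores one practical complication: cache lines loaded under an RMID stay tagged with it after the RMID is returned to the pool, so a real implementation has to decide when a recycled RMID is clean enough to hand out again.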
CMT (Cache Monitoring Technology) Perf Implementation: The Linux* perf application provides an interface into kernel based performance counters. An extension has been developed to support the Cache Monitoring Technology feature. This allows users to monitor last level cache occupancy on a per-process or per-thread basis. The name of the new event is intel_cqm/llc_occupancy/. This new event returns the occupancy in bytes. The patches to perf and Linux* kernel are available from the following:
The perf driver module checks for CMT hardware availability using the CPUID instruction. If CMT is detected, a number of function calls are registered with perf. Below are some of the registered events and their functionality:
To make sure that the occupancy associated with a CPU is accurate, the perf kernel component associates the RMID with the specific application thread only while it is running on the CPU. As explained in the previous section, when the Linux* scheduler switches the process out, the RMID is no longer associated with the core. In addition to RMID tracking, perf also supports process or thread inheritance (any child process inherits the RMID of its parent).
Basic operation of perf with CMT:
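As one hedged illustration, the sketch below builds a `perf stat` command line for the intel_cqm/llc_occupancy/ event named earlier. It assumes a kernel and perf build that expose this event; the helper name is hypothetical, and only the standard `perf stat` options `-e` (event), `-p` (process ID) and `-I` (print interval in milliseconds) are used.

```python
import shlex

CMT_EVENT = "intel_cqm/llc_occupancy/"


def perf_cmt_command(pid, interval_ms=None):
    """Build the argument list for monitoring one process with perf stat."""
    argv = ["perf", "stat", "-e", CMT_EVENT]
    if interval_ms is not None:
        argv += ["-I", str(interval_ms)]  # periodic output
    argv += ["-p", str(pid)]              # attach to an existing process
    return argv


# Print the command for a hypothetical PID; pass the list to
# subprocess.run() to actually execute it on a CMT-capable system.
print(shlex.join(perf_cmt_command(pid=1234, interval_ms=1000)))
```

Returning an argument list rather than a single string keeps the helper safe to pass to `subprocess.run` without involving a shell.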
The motivation for proposing a limited set of user space CMT APIs is to make CMT easier to use and to integrate into applications. Developers could then use a small set of API calls to retrieve cache occupancy information from within their applications. Such a unified access API would also provide better management of shared platform-level resources such as RMIDs and MSR access.
Below are proposed functions which would wrap around the perf_event system calls. They would help track cache occupancy for tasks/PIDs.
Research is ongoing to provide a user space library that would allow developers or system administrators to take advantage of CMT without the need to consider how many RMIDs are available, or to handle other RMID management tasks, while tracking applications.
The proposed design and placement for the API implementation is depicted in this diagram:
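Any user space wrapper around the perf_event system calls first needs the PMU type and event encoding that the kernel publishes for dynamic PMUs under /sys/bus/event_source/devices. The sketch below shows that lookup step only. It is not the proposed API: the function name is hypothetical, and the presence of an `intel_cqm` PMU with an `llc_occupancy` event is an assumption about the running kernel.

```python
from pathlib import Path

SYSFS_PMU_ROOT = "/sys/bus/event_source/devices"


def read_pmu_event(pmu="intel_cqm", event="llc_occupancy",
                   sysfs_root=SYSFS_PMU_ROOT):
    """Return (pmu_type, terms) for a dynamic PMU event.

    pmu_type is the value for perf_event_attr.type; terms is a dict
    parsed from the event description, e.g. {"event": 1}.
    """
    base = Path(sysfs_root) / pmu
    pmu_type = int((base / "type").read_text().strip())
    terms = {}
    description = (base / "events" / event).read_text().strip()
    for term in description.split(","):
        key, _, value = term.partition("=")
        # A term without a value is a flag; record it as 1.
        terms[key] = int(value, 0) if value else 1
    return pmu_type, terms
```

The `sysfs_root` parameter exists so the function can be exercised against a temporary directory on machines without CMT support.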
Since KVM is a type 2 hypervisor, it inherits the scheduler enhancements discussed in the previous section. Administrators or developers can use perf to track the last level cache occupancy of a virtual machine. The process or thread IDs of the virtual machines can be retrieved from the operating system through top or the QEMU monitor.
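Besides top and the monitor, the thread IDs of a virtual machine's process can also be read from the Linux* /proc file system, as in the following sketch. The function name is hypothetical, and how the threads of a given VM are named depends on the hypervisor version and configuration, so the names are returned as-is rather than interpreted.

```python
from pathlib import Path


def list_threads(pid, proc_root="/proc"):
    """Return [(tid, name), ...] for every thread of process 'pid'."""
    task_dir = Path(proc_root) / str(pid) / "task"
    threads = []
    for entry in sorted(task_dir.iterdir(), key=lambda p: int(p.name)):
        name = (entry / "comm").read_text().strip()
        threads.append((int(entry.name), name))
    return threads
```

Each returned thread ID can then be handed to perf for per-thread monitoring, or the process ID can be monitored as a whole.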
Since Xen is a type 1 hypervisor, enhancements to its own scheduler are needed to track the last level cache occupancy. Xen 4.5 will be the first version to include CMT support. The hypervisor implementation associates an RMID with each domain (DomU or guest VM). Domains that have been specified for monitoring are associated with their own RMID, while those not specified are associated with the default RMID, which collects all non-monitored occupancy data. As the hypervisor schedules each domain onto a CPU and performs the context switch, it also writes the RMID to the CPU-specific MSR, thus associating this CPU with the RMID and its domain. As long as the domain continues to run on the CPU, the last level cache (LLC) occupancy resulting from the domain's memory reads on that CPU is tracked via the RMID specified for the domain. When the next domain is scheduled on this CPU and the currently monitored domain is switched out, its RMID is replaced on the CPU so that no further association exists.
Xen’s xl command tool includes a few additions to support CMT. The additions allow users to attach monitoring to a domain, detach monitoring, and show the LLC occupancy information. The command line tool additions have the following form:
$ xl psr-cmt-attach domid
$ xl psr-cmt-detach domid
$ xl psr-cmt-show cache_occupancy
In the above example commands, domid is the ID number of the domain (guest VM) of interest.
This standalone library (available soon at: https://01.org/packet-processing/cache-monitoring-allocation-technology) enables developers to monitor the last level cache occupancy on a per-CPU basis without the need for OS enabling support (via the Standalone Cache Monitoring technique discussed earlier). When the library or application first starts, it checks for Cache Monitoring support. Once initialization is complete, the monitoring functionality provides a “top”-like interface listing the last level cache occupancy on a per-CPU basis. The library implements a number of APIs that enable developers to take advantage of CMT without having to set up the MSRs that configure RMID assignment or retrieve the last level cache occupancy data. Developers may also use the library from within a virtual machine; however, either a paravirtualization (PV) technique or MSR bitmaps may be required to gain access to the CMT model-specific registers, and in general using the library from the host OS is preferred.
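The support check performed at start-up can be illustrated with a small decoder for the relevant CPUID fields. The sketch assumes the register values have already been obtained from the CPUID instruction (Python cannot execute it directly), and the function and parameter names are hypothetical rather than part of the library. The bit positions follow the CPUID description in the Intel Software Developer's Manual: leaf 0x7 (subleaf 0) EBX bit 12 advertises the monitoring capability, and leaf 0xF reports the L3 monitoring details.

```python
def decode_cmt_cpuid(leaf7_ebx, leaf_f0_edx,
                     leaf_f1_ebx, leaf_f1_ecx, leaf_f1_edx):
    """Decode CMT capability from raw CPUID register values.

    leaf7_ebx   : CPUID.(EAX=0x7, ECX=0):EBX
    leaf_f0_edx : CPUID.(EAX=0xF, ECX=0):EDX
    leaf_f1_*   : CPUID.(EAX=0xF, ECX=1) registers
    """
    has_monitoring = bool(leaf7_ebx & (1 << 12))     # monitoring capability
    has_l3_monitoring = bool(leaf_f0_edx & (1 << 1)) # L3 is a monitored resource
    has_l3_occupancy = bool(leaf_f1_edx & (1 << 0))  # L3 occupancy event
    supported = has_monitoring and has_l3_monitoring and has_l3_occupancy
    return {
        "supported": supported,
        # Highest RMID usable for L3 monitoring (zero-based).
        "max_rmid": leaf_f1_ecx if supported else None,
        # Multiply raw counter values by this to get bytes.
        "upscaling_factor": leaf_f1_ebx if supported else None,
    }
```

The upscaling factor returned here is the multiplier that converts a raw occupancy counter reading into bytes.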
Additional OSes and VMMs will be enabled over time. Check the documentation or feature list for your preferred OS/VMM to determine if CMT is supported on a particular version.
If your preferred OS/VMM doesn’t yet support CMT, its vendor's customer support organization may be able to track the feature request and provide an estimate of when support will be ready.
Several mainstream OSes and VMMs now include support for Intel's Cache Monitoring Technology (CMT). For OSes that are not yet enabled, a software library will be available via 01.org to enable experimentation, prototyping of resource management heuristics, and deployment of the features.
* Other names and brands may be claimed as the property of others