Developer Guide

  • 2021.2
  • 06/11/2021
  • Public

Partition the Cache Manually

You can partition the cache manually among various caching agents.
Intel recommends doing this after you have tried the presets and after you have reserved software SRAM, if you need to further refine the configuration.

Before You Begin

When creating cache partitioning schemes, it is important to understand when it is necessary to isolate a portion of the cache (i.e., only a single caching agent can allocate into it), and when it is appropriate to share a portion of the cache (i.e., many caching agents can allocate into it). Sharing is often required as cache resources are limited, but this results in cache contention which can increase jitter and make it more difficult to bound worst-case execution times. Isolating (or dedicating) portions of the cache to a single caching agent can reduce jitter and improve performance for a particular workload, but at the cost of performance for everything else. The optimal cache partitioning scheme will vary for everyone and be use-case dependent.
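The distinction between isolated and shared cache can be illustrated with a minimal sketch. It assumes each caching agent's allocation is modeled as an integer bitmask with one bit per cache way; the agent names, the mask values, and the `is_isolated` helper are hypothetical and not part of the cache configurator.

```python
def is_isolated(agent: str, masks: dict[str, int]) -> bool:
    """Return True if no other agent can allocate into this agent's ways."""
    own = masks[agent]
    return all((own & other) == 0
               for name, other in masks.items() if name != agent)


# Hypothetical 12-way cache: bit i set means the agent may allocate into way i.
masks = {
    "rt_core":     0b0000_0000_0110,  # ways 2:1, used by nobody else
    "best_effort": 0b0111_1110_0000,  # ways 10:5
    "gpu":         0b0111_1110_0000,  # ways 10:5, shared with best_effort
}

for name in masks:
    print(name, "isolated" if is_isolated(name, masks) else "shared")
```

In this sketch only `rt_core` is isolated; the other two agents contend for the same ways.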
Before you begin, here are some considerations for creating your own cache partitioning scheme:
  • Software SRAM buffers are, by design, not configurable from this screen. Cache space for these regions must be reserved first, and the remaining cache can be partitioned here.
  • Unless your design is graphics intensive, start with a small amount of space for the GPU and increase as needed. This leaves more of the cache undisturbed for applications running on CPU cores.
  • If you do not know your I/O working set size, consider keeping only 1 cache way reserved for I/O. Increase as needed depending on design.
  • Remember to configure the various CPU masks using Class of Service (COS). After creating and applying your cache partitioning scheme, follow the instructions described in Assign Classes of Service to Cores to assign a COS to a CPU core.

Steps

To partition the cache manually:
  1. If you have not done so, launch the tool as described in Launch the Cache Configurator.
  2. At this prompt, select P:
    Would you like to add (A) or delete (D) a cache allocation? Or would you like to change the way the configuration is partitioned? (P) P
  3. Select 6 as shown in How the Interface Works.
    6. Edit the Current Configuration / do not use a preset template
  4. After using manual cache partitioning via option 6, enable RTCM to allow the cache allocation library to work properly. See Real-Time Configuration Manager (RTCM) for instructions.
  5. If any cache is reserved for I/O, confirm that WRC Feature is enabled in BIOS. Intel® TCC Mode enables the setting by default. If the setting is disabled, you may see reduced performance.

How the Interface Works

An interactive terminal user interface (TUI) application appears as follows:
Legend:
  1. Help
  2. Editor for each caching agent. This control enables you to view and edit the cache ways (waymasks) assigned to a caching agent.
  3. Summary that shows which caching agents can use each cache way. This view is useful to determine which ways are shared among caching agents. The numbers show which Classes of Service (COS) are assigned to a caching agent.
Labels:
  Label  Name           Description
  S      Software SRAM  Cache ways dedicated for software SRAM buffer
  G      GPU            Cache ways assigned to GPU use
  I      I/O            Cache ways assigned to I/O use
  C      CPU            Cache ways assigned to CPU use
The screen presents the layout of L3 cache on the target system and how different caching agents access the cache. At the bottom of the screen, the ALL table represents the entire cache. The cache is divided into equal segments called cache ways, which you can allocate to caching agents.
Above the ALL table is an editor that shows a view of the same cache, separated by caching agents. Each caching agent has a table that shows the available cache for that agent. The agent can access only the cache ways that are marked. The set of cache ways assigned to an agent is called a region. You can use the controls to increase, decrease, or move the region assigned to a caching agent.
The tool guides you to create valid configurations based on the underlying platform architecture. For example, it is only possible to assign consecutive cache ways; there can be no gaps. This is why it is only possible to change the position and size of the region as a whole and not possible to assign each cache way separately. Also, each agent must have access to at least one cache way. This is why the tool allows you to remove all but one cache way from each agent.
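The two rules above (consecutive ways only, at least one way per agent) can be expressed as a check on a way bitmask. This is a minimal sketch, assuming bit i of the mask corresponds to cache way i; `is_valid_region` is a hypothetical helper, not a function provided by the tool.

```python
def is_valid_region(mask: int, num_ways: int) -> bool:
    """Check that a mask is non-empty, within bounds, and has no gaps."""
    if mask <= 0 or (mask >> num_ways) != 0:
        return False                 # empty, negative, or beyond the last way
    lowest_bit = mask & -mask        # isolates the lowest set bit
    run = mask // lowest_bit         # shifts the region down to bit 0
    return (run & (run + 1)) == 0    # true only if run is a solid block of 1s


print(is_valid_region(0b0111_1110_0000, 12))  # ways 10:5, contiguous
print(is_valid_region(0b0000_0000_0101, 12))  # ways 2 and 0, gap at way 1
print(is_valid_region(0, 12))                 # no ways at all
```

The first call returns True; the other two return False because they break the no-gaps and at-least-one-way rules respectively.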
Multiple caching agents can share a cache way, but none can share the ways that are dedicated for software SRAM buffers. These cache ways cannot be edited from this screen. To edit software SRAM buffers, see Reserve Software SRAM.
When the CPU has multiple Classes of Service (COS), each core is assigned a single class of service. It is a common practice to configure CPU COSes to dedicate some cache ways to a real-time workload (that is, making sure the cache ways do not overlap with other COSes). The number of COSes varies by platform.
As you make your edits, you can check the result in the ALL table. It shows whether a particular cache way is dedicated to one caching agent or shared among multiple agents. For CPU, the notation indicates the COS assignment. For example, C0-2 means COS 0, 1, and 2 are assigned to CPU.
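The idea behind the ALL table can be sketched as a per-way lookup over the agents' masks. The example below uses a hypothetical 4-way cache and hypothetical labels; the output format is illustrative only and does not reproduce the tool's actual notation.

```python
def agents_per_way(masks: dict[str, int], num_ways: int) -> dict[int, list[str]]:
    """Map each way index (highest first) to the agents allowed to use it."""
    table = {}
    for way in range(num_ways - 1, -1, -1):
        table[way] = sorted(name for name, mask in masks.items()
                            if (mask >> way) & 1)
    return table


# Hypothetical configuration: bit i set means the agent may use way i.
masks = {"G": 0b1100, "C0": 0b0110, "I": 0b0001}

for way, agents in agents_per_way(masks, 4).items():
    if not agents:
        status = "unused"
    elif len(agents) == 1:
        status = "dedicated"
    else:
        status = "shared"
    print(f"way {way}: {','.join(agents):6} {status}")
```

Here way 2 is shared by the GPU and CPU COS 0, while ways 3, 1, and 0 are each dedicated to a single agent.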
For example, the image above shows the following configuration (from left to right):
  • 1 cache way [11:11] is used for software SRAM buffer
  • 6 cache ways [10:5] are assigned to GPU use
  • 6 cache ways [10:5] are assigned to CPU use and are associated with COS 0 and 3
  • 2 cache ways [4:3] are assigned to CPU use and are associated with COS 2
  • 2 cache ways [2:1] are assigned to CPU use and are associated with COS 1
  • 1 cache way [0:0] is assigned to I/O use
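The [high:low] ranges in this example can be converted to waymask values with a short sketch. It assumes the ranges are inclusive (consistent with "6 cache ways [10:5]") and that bit i of a mask corresponds to cache way i; `range_to_mask` is a hypothetical helper.

```python
def range_to_mask(high: int, low: int) -> int:
    """Build a contiguous bitmask covering ways low..high inclusive."""
    if low < 0 or high < low:
        raise ValueError("expected 0 <= low <= high")
    width = high - low + 1
    return ((1 << width) - 1) << low


# The configuration listed above, as (label, high, low).
example = [
    ("Software SRAM", 11, 11),
    ("GPU",           10, 5),
    ("CPU COS 0, 3",  10, 5),
    ("CPU COS 2",      4, 3),
    ("CPU COS 1",      2, 1),
    ("I/O",            0, 0),
]

for label, high, low in example:
    mask = range_to_mask(high, low)
    print(f"{label:14} [{high}:{low}] -> {mask:#05x} ({bin(mask).count('1')} ways)")
```

Under these assumptions the GPU and CPU COS 0 and 3 produce the same mask, 0x7e0, which is what makes those six ways shared.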
When the GPU has multiple COSes, the cache configurator displays the status of the first COS (COS 0). When you use the cache configurator to edit the GPU cache region, it applies the configuration to all COSes, overwriting the previous COS configuration. Making all GPU COSes match is the most robust solution for virtualized environments where the GPU is assigned to a virtual machine. Software running in the virtual machine can change which GT COS is active (0 through 3), but if the masks are all the same, the change has no impact. If the masks differ, it may be undesirable for software in a virtual machine to be able to alter which cache ways the GPU uses. The following image shows an example. In the example: (1) the cache configurator detects a GPU allocation in COS 0 and (2) displays it in the interface. Then (3) the user moves the region, and (4) the cache configurator applies the change to all COSes, making all COSes match.
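The display-then-overwrite behavior described here can be modeled in a few lines. This is a sketch only: the list-of-masks representation, the mask values, and all function names are hypothetical and do not reflect how the cache configurator stores GPU COS settings.

```python
def displayed_gpu_mask(cos_masks: list[int]) -> int:
    """The interface shows only the first GPU COS (COS 0)."""
    return cos_masks[0]


def apply_gpu_edit(cos_masks: list[int], new_mask: int) -> list[int]:
    """An edit overwrites every GPU COS with the same mask."""
    return [new_mask for _ in cos_masks]


def masks_match(cos_masks: list[int]) -> bool:
    """True when switching the active COS cannot change the ways used."""
    return len(set(cos_masks)) == 1


before = [0x7E0, 0x600, 0x7E0, 0x060]    # hypothetical masks for GT COS 0..3
print(hex(displayed_gpu_mask(before)), masks_match(before))

after = apply_gpu_edit(before, 0x3F0)    # user moves the region one way down
print([hex(m) for m in after], masks_match(after))
```

After the edit, every COS carries the same mask, so a guest that switches the active GT COS still uses the same cache ways.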
Usage:
  • Use the keyboard’s “UP” and “DOWN” arrow keys to select a caching agent. The selected caching agent is marked with the “> <” signs.
  • Use the keyboard’s “LEFT” and “RIGHT” arrow keys to move the region of the currently selected caching agent. The region cannot overlap a software SRAM buffer or exceed cache bounds.
  • Use the “+” and “-” keys to increase or decrease the size of the region assigned to the currently selected caching agent. The region cannot grow beyond the available contiguous free space and cannot shrink below one cache way.
  • Use the “PgUp” key to switch to the next COS and the “PgDown” key to switch to the previous COS.
  • Use the U key to quit the manual partitioning mode without saving any changes and go back to the main menu.
  • Use the S key to quit the manual partitioning mode, save the changes, and go back to the main menu.
  • Use Ctrl+C to exit the application.
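The move and resize constraints in the list above can be modeled as edits that are accepted only when the resulting region stays within the cache, keeps at least one way, and avoids the software SRAM ways. The sketch below is an assumption-laden model: the `Region` type, the `try_edit` helper, and the choice to express edits as changes to the lowest way index and the size are all hypothetical, and the mapping of arrow keys to directions is not taken from the tool.

```python
from dataclasses import dataclass


@dataclass
class Region:
    low: int    # index of the lowest cache way in the region
    size: int   # number of consecutive ways

    def mask(self) -> int:
        return ((1 << self.size) - 1) << self.low


def fits(region: Region, num_ways: int, sram_mask: int) -> bool:
    """A region must be in bounds, non-empty, and clear of software SRAM."""
    in_bounds = (region.size >= 1 and region.low >= 0
                 and region.low + region.size <= num_ways)
    return in_bounds and (region.mask() & sram_mask) == 0


def try_edit(region: Region, num_ways: int, sram_mask: int,
             d_low: int = 0, d_size: int = 0) -> Region:
    """Return the edited region, or the original one if the edit is invalid."""
    candidate = Region(region.low + d_low, region.size + d_size)
    return candidate if fits(candidate, num_ways, sram_mask) else region


NUM_WAYS = 12
SRAM = 1 << 11                  # way 11 reserved for software SRAM
gpu = Region(low=5, size=6)     # ways 10:5

print(try_edit(gpu, NUM_WAYS, SRAM, d_low=+1))   # rejected: would reach way 11
print(try_edit(gpu, NUM_WAYS, SRAM, d_low=-1))   # accepted: ways 9:4
print(try_edit(Region(0, 1), NUM_WAYS, SRAM, d_size=-1))  # rejected: below one way
```

Rejected edits return the region unchanged, which mirrors the way the interface ignores a key press that would produce an invalid configuration.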
If the partitioning scheme includes any isolated cache regions for real-time workloads, you will need to take additional steps outside the cache configurator to allow your real-time application to use the dedicated cache. For details, see Assign Classes of Service to Cores.

Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.