Developer Guide

  • 2021.2
  • 06/11/2021
  • Public

Step 3: Preproduction: Generate a Tuning Configuration

In this step, you will walk through the data streams optimizer preproduction workflow to generate a tuning configuration that meets the MRL latency requirement while limiting power consumption. The demo focuses on tuning the core-from-PCIe stream (MMIO reads) and on power consumption.
This demo does not measure power consumption, but you can measure it with tools outside of Intel® TCC Tools, such as Intel® SoC Watch. For 11th Gen Intel® Core™ processors, you can find a compatible version of Intel® SoC Watch in Intel® System Studio 2020 Update 3. For Intel Atom® x6000E Series processors, ask your Intel representative for access to the appropriate version of Intel® SoC Watch.
The data streams optimizer requires the following input files: an environment file, a requirements file, and a workload validation script. For this demo, you will use the provided samples of these files.
The output examples shown here are for illustration only. Your output may vary.

Preproduction Workflow

These steps assume a host-target environment.
  1. On the target system, confirm that the Data Streams Optimizer setting in system firmware is enabled. For details, see Data Streams Optimizer Setting.
  2. On the host system, source the setvars.sh script to set up environment variables:
    source ~/intel/oneapi/setvars.sh
  3. Go to the tools directory:
    cd ${TCC_TOOLS_PATH}
  4. Review the sample environment file:
    1. Open the sample environment file. This command example uses nano, but you can use any text editor.
      nano ./demo/environment/sample_environment_uefi.json
      Note: If you copy the environment file to another location on your host system, be sure to update the relative paths in the environment file or replace them with full paths.
    2. Modify the following fields (see example file below):
      Field Name: Description
      "hostname": Replace ... with the IP address or hostname of the target system.
      "username": Replace ... with root. This field contains the username to connect to the target board via SSH. The field should look like this:
        "username": "root"
      "password": Replace ... with the password to connect to the target board via SSH. If the SSH connection does not need a password, or you have created an SSH key and installed it on the target system, leave the password field empty as shown below:
        "password": ""
        Note: The sample demo uses plain text for simplicity. You can use a different method when you set up your own input files. For instructions to set up an SSH key, see Target Setup: Yocto Project* BSP.
      All other fields: Leave all other fields as is. These fields enable you to further customize the behavior of the tool, but customization is beyond the scope of this demo.
      Environment file example (“targets” section only):
      "targets": {
        "target_name_1": {
          "target_info": "...",
          "connections_settings": {
            "hostname": "...",
            "username": "root",
            "password": "",
            "port": 22,
            "connection_timeout": 5,
            "reconnection_timeout": 10,
            "reconnection_attempts": 10
          },
    3. Save and close the file.
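The field edits in this step can also be scripted. The following is a minimal sketch, not part of Intel® TCC Tools, that assumes the environment file follows the "targets" layout shown in the example above. The function name set_connection and the address 192.0.2.10 are hypothetical, and the embedded JSON is a trimmed stand-in for the real sample file, which contains more sections.

```python
import json

# Trimmed stand-in for the "targets" section shown above. The "..." values
# are the placeholders from the sample file, not real settings.
SAMPLE_ENVIRONMENT = """
{
  "targets": {
    "target_name_1": {
      "target_info": "...",
      "connections_settings": {
        "hostname": "...",
        "username": "...",
        "password": "...",
        "port": 22,
        "connection_timeout": 5,
        "reconnection_timeout": 10,
        "reconnection_attempts": 10
      }
    }
  }
}
"""


def set_connection(environment, hostname, username="root", password=""):
    """Return a copy of the environment with the SSH fields set for every target."""
    updated = json.loads(json.dumps(environment))  # deep copy via a JSON round trip
    for target in updated["targets"].values():
        settings = target["connections_settings"]
        settings["hostname"] = hostname
        settings["username"] = username
        settings["password"] = password  # empty string when an SSH key is installed
    return updated


environment = json.loads(SAMPLE_ENVIRONMENT)
# 192.0.2.10 is a documentation-only address standing in for your target.
updated = set_connection(environment, hostname="192.0.2.10")
print(json.dumps(updated["targets"]["target_name_1"]["connections_settings"], indent=2))
```

To apply this to the real file, you would load it with json.load, pass the result to set_connection, and write the result back with json.dump. All fields other than the three SSH fields are left untouched.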
  5. Review the sample requirements file:
    1. Open the sample requirements file. This command example uses nano, but you can use any text editor.
      nano ./demo/requirements/single_corepcierd_1.json
    2. For your reference, note the "command" field. This is the same script you used in the previous step, Step 2: Run MRL on Untuned System. The tool will run the script to validate whether the tuning configuration meets the MRL latency requirement.
    3. Verify that the value in the --device field matches the name of the PCIe device, for example I225 or TSN. You can use the same value you used in Step 2: Run MRL on Untuned System. To check which device you have, run:
      lspci | grep -E 'Ethernet controller: Intel Corporation'
      The following example output shows that the PCIe device is I225 (15f2).
      aa:00.0 Ethernet controller: Intel Corporation Device 15f2 (rev 03)
      Note: If your PCIe device is integrated TSN, use the single_corepcierd_0.json file instead in the following steps.
    4. Verify that the producer field matches your PCI device address. The example output in the previous substep shows that the address is aa:00.0.
    5. Save and close the file.
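The device check in this step can be sketched in code. The example below is an illustration under stated assumptions: it parses a single lspci line in the short format shown above (no PCI domain prefix), and the KNOWN_DEVICE_IDS table contains only the 15f2 to I225 pairing that this guide mentions. The function name parse_lspci_line is hypothetical.

```python
import re

# Only the pairing named in this guide. Extend the table yourself for
# other devices (illustrative, not an official list).
KNOWN_DEVICE_IDS = {"15f2": "I225"}

# Matches the short lspci form "bus:device.function", e.g. "aa:00.0".
LSPCI_LINE = re.compile(
    r"^(?P<address>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+Ethernet controller: "
    r"Intel Corporation\s+(?P<description>.*)$"
)


def parse_lspci_line(line):
    """Extract the PCI address and device ID from one lspci output line."""
    match = LSPCI_LINE.match(line.strip())
    if match is None:
        return None
    id_match = re.search(r"Device ([0-9a-f]{4})", match.group("description"))
    device_id = id_match.group(1) if id_match else None
    return {
        "address": match.group("address"),        # compare with the producer field
        "device_id": device_id,
        "device_name": KNOWN_DEVICE_IDS.get(device_id),  # compare with --device
    }


sample = "aa:00.0 Ethernet controller: Intel Corporation Device 15f2 (rev 03)"
info = parse_lspci_line(sample)
print(info)
```

The address value is what the producer field should contain, and the device name is what the --device field should contain. A line that lspci prints with a resolved device name instead of "Device 15f2" yields a device_id of None in this sketch.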
  6. On the host system, run the preproduction tool to search for a tuning configuration.
    python3 tcc_data_streams_optimizer_preprod.py search --environment ./demo/environment/sample_environment_uefi.json --requirements ./demo/requirements/single_corepcierd_1.json
    For your reference, the following table contains a description of each argument.
    Option           Description
    --environment    Path to the sample environment file.
    --requirements   Path to the sample requirements file.
  7. Confirm that you see output similar to the example below. The output shows that the tool first checks for dependencies, such as input files and ability to connect to the target via SSH.
    Processing environment file: ./demo/environment/sample_environment_uefi.json ...
    Environment file parsed.
    Processing requirement file: ./demo/requirements/single_corepcierd_1.json ...
    Requirement file parsed.
    Creating output folder: /home/<user>/intel/oneapi/tcc_tools/latest/tools/<target_hostname>/single_corepcierd_1_<date>
    Connecting to <target_hostname>...
    Connected.
    The tool finds the first suitable tuning configuration.
    The tool prints a list of messages to the log file. The messages describe the affected settings. The level of detail in these messages balances the need to provide useful information vs. the need to protect Intel proprietary information. For more information about each message, see Tuning Configurations.
    The tool generates a capsule of the configuration. The capsule is used to apply the configuration to the target. The target reboots.
    Connection to database tuning_tgl-u.db - successful.
    Searching for suitable tuning configuration...
    Stream 0:1c:0:0 -> Core3: configuration 1 out of 2
    Generating the capsule(s) of the configuration to tune the system - for the settings of this configuration see /home/<user>/intel/oneapi/tcc_tools/latest/tools/<target_hostname>/single_corepcierd_1_<date>/log
    Capsule(s) were generated.
    Copying capsule(s) from host to target <target_hostname>
    Capsule(s) were copied
    Applying capsule(s) for <target_hostname>...
    Capsule(s) were applied for <target_hostname>.
  8. Wait for the target board to reboot. The reboot may take a minute or more. While the target is rebooting, the tool repeatedly attempts to reconnect to the target, based on the "reconnection_attempts" field in the environment file. Output example:
    Rebooting <target_hostname>
    Attempt 1. Reconnecting to <target_hostname>...
    Connection attempt failed. Trying again...
    Attempt 2. Reconnecting to <target_hostname>...
    Connection attempt failed. Trying again...
    Attempt 3. Reconnecting to <target_hostname>...
    Connection attempt failed. Trying again...
    Attempt 4. Reconnecting to <target_hostname>...
    Connection attempt failed. Trying again...
    Attempt 5. Reconnecting to <target_hostname>...
    Connection attempt failed. Trying again...
    Attempt 6. Reconnecting to <target_hostname>...
    Connected.
    After reconnecting to the target, the tool runs the workload validation script.
    Validating configuration for target <target_hostname>...
    Starting validation script: python3 /usr/share/tcc_tools/tools/demo/workloads/bin/mrl_validation_script.py.
    Validation script output:
    ======================================================
    Return code: 1
    Found CPU affinity for core 3
    Running test ... Done.
    Test is complete! Results saved in data_mmio_read_latency_ns.csv data_mmio_read_latency_ticks.csv data_avg_inst_count.csv
    Enabling userspace access to performance counters
    Removing igc
    Start validation
    Validation stopped
    Restoring igc
    Validation is finished. Please wait for results processing.
    Validation information:
    device: I225
    address: 0x88200000
    core: 3
    iterations: 10000000
    processor: TGL-U
    Latency must be less than 90.0 us.
    Statistics:
                |Min    |Max     |Avg    |Median
    ----------------------------------------------------------------
    Microseconds|1.198  |92.347  |1.567  |1.546
    ================================================================
    Deadline  |Iterations  |Passed   |Failed
    ---------------------------------------------------
    80.0 us   |10000000    |9999993  |7
    ===================================================
    Failed: at least one iteration failed
    ======================================================
    For details see the log file at: /home/test/intel/oneapi/tcc_tools/latest/tools/<target_hostname>/single_corepcierd_1_<date>/log
    Validation script FAILED
    If you see VALIDATION ERROR with a path to a log, that log is located on your target board.
    If the validation script fails, the tool repeats the tuning flow. It finds another suitable tuning configuration or exits if none are found. In this case, the tool finds another configuration:
    Searching for suitable tuning configuration...
    Stream 0:1c:0:0 -> Core3: configuration 2 out of 6
    Generating the capsule(s) of the configuration to tune the system - for the settings of this configuration see /home/test/intel/oneapi/tcc_tools/latest/tools/<target_hostname>/single_corepcierd_1_<date>/log
    Capsule(s) were generated.
    Copying capsule(s) from host to target <target_hostname>
    Capsule(s) were copied
    Applying capsule(s) for <target_hostname>...
    Capsule(s) were applied for <target_hostname>.
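The reconnection behavior described in this step can be pictured as a bounded retry loop. The sketch below is not the tool's implementation. It assumes that "reconnection_attempts" caps the number of tries and that "reconnection_timeout" is a wait in seconds between tries; the source does not state the unit or the exact meaning, so treat both as assumptions. The function name reconnect and its parameters are hypothetical.

```python
def reconnect(try_connect, attempts, wait_seconds, sleep):
    """Call try_connect up to `attempts` times.

    Returns the 1-based attempt number that succeeded, or None if every
    attempt failed. `sleep` is injected so the loop can be exercised
    without actually waiting.
    """
    for attempt in range(1, attempts + 1):
        if try_connect():
            return attempt
        if attempt < attempts:
            sleep(wait_seconds)  # assumed pause between attempts
    return None


# Simulate a target that answers on the sixth attempt, as in the first
# output example of this step. The values 10 and 10 come from the sample
# environment file.
responses = iter([False] * 5 + [True])
waited = []
succeeded_on = reconnect(
    lambda: next(responses), attempts=10, wait_seconds=10, sleep=waited.append
)
print(succeeded_on, sum(waited))
```

Under these assumptions, a target that needs longer to reboot than the attempts allow would require larger values in the environment file.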
  9. Wait for the target board to reboot. The reboot may take a minute or more. While the target is rebooting, the tool repeatedly attempts to reconnect to the target, based on the "reconnection_attempts" field in the environment file. Output example:
    Rebooting <target_hostname>
    Attempt 1. Reconnecting to <target_hostname>...
    Connection attempt failed. Trying again...
    Attempt 2. Reconnecting to <target_hostname>...
    Connection attempt failed. Trying again...
    Attempt 3. Reconnecting to <target_hostname>...
    Connected.
    After reconnecting to the target, the tool runs the workload validation script. Now the script shows that the maximum latency measurement meets the deadline.
    Validating configuration for target <target_hostname>...
    Starting validation script: python3 /usr/share/tcc_tools/tools/demo/workloads/bin/mrl_validation_script.py.
    Validation script output:
    ======================================================
    Return code: 0
    Found CPU affinity for core 3
    Running test ... Done.
    Test is complete! Results saved in data_mmio_read_latency_ns.csv data_mmio_read_latency_ticks.csv data_avg_inst_count.csv
    Enabling userspace access to performance counters
    Removing igc
    Start validation
    Validation stopped
    Restoring igc
    Validation is finished. Please wait for results processing.
    Validation information:
    device: I225
    address: 0x88200000
    core: 3
    iterations: 10000000
    processor: TGL-U
    Latency must be less than 90.0 us.
    Statistics:
                |Min    |Max     |Avg    |Median
    ----------------------------------------------------------------
    Microseconds|1.192  |13.521  |1.562  |1.539
    ================================================================
    Deadline  |Iterations  |Passed    |Failed
    ---------------------------------------------------
    80.0 us   |10000000    |10000000  |0
    ===================================================
    Success: all iterations passed.
    ======================================================
    Validation script PASSED
    After the validation passes, the tool shows a brief description of the tuning configuration, generates a tuning configuration file, and exits. Output example:
    Configuration for Target for session: <user>@<target_hostname> found.
    Tuning configuration applied a combination of Intel® TCC Mode BIOS options using BIOS default settings and real-time settings. Your latency requirements can be achieved with settings between the out-of-the-box configuration and Intel® TCC Mode enabled. Some power management may be disabled to tune for real-time latency with minimal impact to power or best-effort performance.
    Creating tuning configuration...
    See /home/<user>/intel/oneapi/tcc_tools/latest/tools/<target_hostname>/single_corepcierd_1_<date>/tuning_configuration.json for configuration details.
    Tuning configuration was created.
    Path to output file: /home/<user>/intel/oneapi/tcc_tools/latest/tools/<target_hostname>/single_corepcierd_1_<date>/tuning_configuration.json
    For more information, see the log file: /home/<user>/intel/oneapi/tcc_tools/latest/tools/<target_hostname>/single_corepcierd_1_<date>/log
    Application exit
    [Diagram: a tuned state in which latency is low and power consumption is limited.]
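The pass/fail tables in the two validation outputs above follow a simple rule: every iteration must finish below the deadline. The sketch below illustrates that rule and is not the code of mrl_validation_script.py. It assumes a strict "less than" comparison, taken from the wording "Latency must be less than" in the output, and uses the 80.0 us value from the Deadline row. The three-sample lists reuse only the minimum, median, and maximum values printed above and stand in for the 10000000 real measurements.

```python
def summarize(latencies_us, deadline_us):
    """Count how many iterations finished strictly below the deadline."""
    failed = sum(1 for latency in latencies_us if latency >= deadline_us)
    return {
        "iterations": len(latencies_us),
        "passed": len(latencies_us) - failed,
        "failed": failed,
        "max_us": max(latencies_us),
        "ok": failed == 0,  # a single late iteration fails the whole run
    }


DEADLINE_US = 80.0  # value from the Deadline row of the example output

# Min, median, and max from the first (failed) and second (passed) runs.
first_run = summarize([1.198, 1.546, 92.347], DEADLINE_US)
second_run = summarize([1.192, 1.539, 13.521], DEADLINE_US)
print(first_run)
print(second_run)
```

This is why the first configuration is rejected even though its average and median are almost identical to the second: the decision depends on the worst iteration, not on the typical one.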
  10. After the application exits, confirm that the tool generated the tuning configuration file, tuning_configuration.json, in the following directory:
    cd <target_hostname>/single_corepcierd_1_<date>
    ls -la
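If you run the search several times, each run gets its own output folder, so it can help to locate the generated files with a short script. The sketch below assumes the folder layout shown in the output examples (<target_hostname>/<requirements file name>_<date>/tuning_configuration.json). The function name find_tuning_configurations, the hostname my-target, and the date string are hypothetical; the example builds a throwaway directory tree so that it runs anywhere.

```python
import tempfile
from pathlib import Path


def find_tuning_configurations(tools_dir, target_hostname, requirements_stem):
    """List tuning_configuration.json files under the tool's output folders."""
    pattern = f"{target_hostname}/{requirements_stem}_*/tuning_configuration.json"
    return sorted(Path(tools_dir).glob(pattern))


with tempfile.TemporaryDirectory() as tools_dir:
    # Stand-in for one output folder. The real <date> format may differ.
    run_dir = Path(tools_dir) / "my-target" / "single_corepcierd_1_2021-06-11"
    run_dir.mkdir(parents=True)
    (run_dir / "tuning_configuration.json").write_text("{}")

    found = find_tuning_configurations(tools_dir, "my-target", "single_corepcierd_1")
    relative = [path.relative_to(tools_dir).as_posix() for path in found]

print(relative)
```

On a real host system, you would pass the tools directory (the location that ${TCC_TOOLS_PATH} points to) instead of the temporary directory.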
  11. Observe the differences between the untuned system in Step 2: Run MRL on Untuned System and the tuned system in this step. Both systems met the latency requirement, but the untuned system in Step 2 does not restrict power, whereas tuning the system applies power limits.
In this demo, both the untuned system (Step 2) and the tuned system (Step 3) met the deadline. The data streams optimizer selected a configuration that balances latency and power requirements. This “power-friendly” configuration did not turn off as many power management settings as the untuned system, which has only Intel® TCC Mode enabled. Compared to Intel® TCC Mode enabled, the data streams optimizer configuration trades higher latency for lower power consumption. In a real-world use case, you can perform additional analysis outside of Intel® TCC Tools to determine whether your system requirements, such as power consumption, are met.
