Developer Guide

  • 2021.2
  • 06/11/2021
  • Public

Analyze Measurements in Your Workload

You can add measurement library APIs to your real-time application to analyze measurement results in various ways:
  • Measure and print minimum, maximum, and average latencies.
  • Set a deadline and run a custom callback function every time an iteration exceeds the deadline.
  • Convert measurement results in CPU clock cycles to time units.

Access Measurement Results

The following example shows how to instrument your code to measure and print the minimum, maximum, and average latencies of a workload. This example can be useful at the beginning of development to help you gather first impressions about your application’s performance. You can use the workflow to quickly compare the average latency to the maximum latency. If the average is significantly lower than the maximum, your workload has some rare outliers, which you can investigate further with more advanced techniques such as histograms.
The example contains the following steps:
  1. Add the Instrumentation and Tracing Technology API (ITT API) to your application as described in Instrument the Code.
  2. Get the pointer to the measurement structure and run additional operations on the measurement results. This example prints the results to the console. Check that tcc_measurement_get() succeeds before doing anything with the measurement structure. If a collector other than the measurement library collector is loaded at runtime, or no collector is loaded at all, tcc_measurement_get() returns the error code TCC_E_NOT_AVAILABLE.
    tcc_measurement_get() call parameters:
    • domain: Pointer to the __itt_domain structure.
    • measurement: Pointer to the __itt_string_handle structure.
    • &tcc_measurement_ptr: Pointer to the pointer to the tcc_measurement structure.
    tcc_measurement_print() call parameters:
    • tcc_measurement_ptr: Pointer to the tcc_measurement structure.
    • TCC_TU_NS: Time unit for printing the measurements: CPU clock cycles, nanoseconds, or microseconds. The example specifies nanoseconds. For more information about time units, see Time Unit Types.
    struct tcc_measurement* tcc_measurement_ptr;
    int tcc_sts;
    tcc_sts = tcc_measurement_get(domain, measurement, &tcc_measurement_ptr);
    if (tcc_sts == TCC_E_SUCCESS) {
        tcc_measurement_print(tcc_measurement_ptr, TCC_TU_NS);
    }
    Do not access the tcc_measurement structure after the main() function finishes (for example, in destructors of global variables).
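To illustrate the comparison of average and maximum latency described at the start of this section, the following C sketch summarizes an array of latency samples and flags a large gap between the two values. It does not use the measurement library: latency_summary, summarize_latencies(), and has_outliers() are hypothetical names introduced here, and the threshold factor is an arbitrary choice left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary type for this sketch; not the library's tcc_measurement. */
struct latency_summary {
    uint64_t min;
    uint64_t max;
    double avg;
};

/* Compute the minimum, maximum, and average of `count` latency samples. */
struct latency_summary summarize_latencies(const uint64_t *samples, size_t count)
{
    assert(samples != NULL && count > 0);
    struct latency_summary s = { samples[0], samples[0], 0.0 };
    double sum = 0.0;
    for (size_t i = 0; i < count; i++) {
        if (samples[i] < s.min) s.min = samples[i];
        if (samples[i] > s.max) s.max = samples[i];
        sum += (double)samples[i];
    }
    s.avg = sum / (double)count;
    return s;
}

/* Return 1 when the maximum exceeds the average by more than `factor`,
 * which hints at rare outliers; the factor is an assumption of the caller. */
int has_outliers(const struct latency_summary *s, double factor)
{
    assert(s != NULL);
    return (double)s->max > s->avg * factor;
}
```

For samples of 100, 102, 98, 100, and 600 cycles, the average is 200 and the maximum is 600, so has_outliers() with a factor of 2.0 reports an outlier.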
The following diagram demonstrates the flow for this scenario:
Starting on the left side, the diagram shows that the real-time application is instrumented with ITT APIs and is linked against the ITT Notify static library (libittnotify.a). At runtime, the static library reads the environment variable INTEL_LIBITTNOTIFY64 and loads the measurement library collector (libtcc_collector.so), a dynamic library. The measurement library collector initializes the structures for data collection and stores the latency measurements there.
In addition, from the right side of the diagram, the real-time application uses measurement library functions to access the data structures. In this case, the application is linked against the measurement library (libtcc_static.a), a static library. The measurement library reads the environment variable INTEL_LIBITTNOTIFY64 and loads the measurement library collector (libtcc_collector.so). As a result, the application can access the data structures created in the measurement library collector and can call the function to output the results (tcc_measurement_print()).
The libtcc.so shared library is linked by the measurement library collector and the real-time application (through libtcc_static.a) to handle internal function calls.
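Because both libraries locate the collector through INTEL_LIBITTNOTIFY64, it can help to check the variable early and report a clear message when it is missing. The sketch below is one possible way to do that, not part of the measurement library: value_names_library() and collector_is_configured() are hypothetical helpers, and matching on the file name libtcc_collector.so is a simplification that does not prove the library can actually be loaded.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 if `env_value` is non-empty and contains `library_name`. */
int value_names_library(const char *env_value, const char *library_name)
{
    assert(library_name != NULL);
    if (env_value == NULL || env_value[0] == '\0') {
        return 0;
    }
    return strstr(env_value, library_name) != NULL;
}

/* Check whether INTEL_LIBITTNOTIFY64 appears to point at the collector. */
int collector_is_configured(void)
{
    return value_names_library(getenv("INTEL_LIBITTNOTIFY64"),
                               "libtcc_collector.so");
}
```

A check like this only complements the status returned by tcc_measurement_get(), which remains the authoritative signal that the collector is available.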

Set a Measurement Deadline

The following example shows how to enable deadline monitoring in your real-time application. With deadline monitoring, you can set a deadline and run a custom callback function every time an iteration exceeds the deadline. This example can be helpful in a later stage of development when you want to run the application for a long period and determine how many 9’s of deadline compliance to expect (for example, 99.999% of iterations finishing on time). If you run your application for a week, for example, how many deadline violations occur?
The example contains the following steps:
  1. Write a function that the application will call every time an iteration exceeds the deadline. The purpose of your function is to get additional information needed for debugging. For example, you can write a function that takes a snapshot of the system, calls an interrupt, or prints a message. The function signature is shown below. When the function is called, the first argument is the pointer to the measurement instance that detected the deadline violation and the second argument contains the violating value. In this example, the function notify_deadline prints a message when a deadline violation occurs.
    /* Callback function for the deadline monitoring. Called when iteration
     * latency exceeds the deadline. The measured latency value is printed in CPU
     * cycles. */
    void notify_deadline(struct tcc_measurement* measurement, uint64_t latency)
    {
        printf("Latency exceeding deadline: %lu\n", latency);
    }
  2. Create the domain and the measurement structure handle.
    __itt_domain* domain;
    __itt_string_handle* measurement;
  3. Initialize the measurement structure.
    domain = __itt_domain_create("DOMAIN");
    measurement = __itt_string_handle_create(measurement_name);
  4. Get the pointer to the measurement structure and set the deadline. If a collector other than the measurement library collector is loaded at runtime, or no collector is loaded at all, tcc_measurement_get() returns the error code TCC_E_NOT_AVAILABLE. In this case, it is not possible to set the callback and receive deadline violation events.
    tcc_measurement_get() call parameters:
    • domain: Pointer to the __itt_domain structure.
    • measurement: Pointer to the __itt_string_handle structure.
    • &tcc_measurement_ptr: Pointer to the pointer to the tcc_measurement structure.
    tcc_measurement_set_deadline() call parameters:
    • tcc_measurement_ptr: Pointer to the tcc_measurement structure.
    • deadline: Deadline in CPU clock cycles. You can set one deadline that applies to all workload iterations.
    • notify_deadline: Function to be called when iteration latency exceeds the deadline.
    struct tcc_measurement* tcc_measurement_ptr;
    int tcc_sts;
    tcc_sts = tcc_measurement_get(domain, measurement, &tcc_measurement_ptr);
    if (tcc_sts == TCC_E_SUCCESS) {
        tcc_measurement_set_deadline(tcc_measurement_ptr, deadline, notify_deadline);
    }
    Do not access the tcc_measurement structure after the main() function finishes (for example, in destructors of global variables).
  5. Start the measurement.
    __itt_task_begin(domain, __itt_null, __itt_null, measurement);
  6. End the measurement.
    __itt_task_end(domain);
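The callback pattern in the steps above can be pictured with a small stand-alone model. The sketch below is not the measurement library: deadline_monitor, count_violation(), and record_iteration() are hypothetical names that only mimic the idea of invoking a user function when a latency exceeds the deadline. Here the callback counts violations and remembers the worst one instead of printing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a measurement instance; not struct tcc_measurement. */
struct deadline_monitor {
    uint64_t deadline;                                   /* in CPU clock cycles */
    void (*notify)(struct deadline_monitor *, uint64_t); /* called on violation */
    uint64_t violations;                                 /* updated by callback */
    uint64_t worst;                                      /* largest violation   */
};

/* Example callback: count the violation and track the worst latency seen. */
void count_violation(struct deadline_monitor *m, uint64_t latency)
{
    m->violations++;
    if (latency > m->worst) {
        m->worst = latency;
    }
}

/* Model of one finished iteration: call the callback only when the
 * latency is strictly greater than the deadline. */
void record_iteration(struct deadline_monitor *m, uint64_t latency)
{
    assert(m != NULL);
    if (m->notify != NULL && latency > m->deadline) {
        m->notify(m, latency);
    }
}
```

With a deadline of 1000 cycles, latencies of 900, 1000, 1500, and 1200 produce two violations and a worst value of 1500. Keeping the callback this small is a design choice for the sketch; a callback that prints or takes a system snapshot adds its own latency to the iteration that triggered it.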
The following diagram demonstrates the flow for this scenario:
Starting on the left side, the diagram shows that the real-time application is instrumented with ITT APIs and is linked against the ITT Notify static library (libittnotify.a). At runtime, the static library reads the environment variable INTEL_LIBITTNOTIFY64 and loads the measurement library collector (libtcc_collector.so), a dynamic library. The measurement library collector initializes the structures for data collection and stores the latency measurements there.
In addition, from the right side of the diagram, the real-time application uses measurement library functions to access the data structures. In this case, the application is linked against the measurement library (libtcc_static.a), a static library. The measurement library reads the environment variable INTEL_LIBITTNOTIFY64 and loads the measurement library collector (libtcc_collector.so). As a result, the application can access the data structures created in the measurement library collector and can call the function to set the deadline (tcc_measurement_set_deadline()).
When the measured iteration latency exceeds the deadline, the measurement library collector initiates a callback that was defined in the real-time application.
The libtcc.so shared library is linked by the measurement library collector and the real-time application (through libtcc_static.a) to handle internal function calls.
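Once a long run has produced a violation count, the "how many 9’s" question from the start of this section reduces to simple arithmetic. The following sketch is an illustration only: count_nines() is a hypothetical helper, and it assumes the application keeps its own totals of iterations and violations (for example, in the deadline callback). It returns the largest n for which the violations are at most one in 10^n iterations, capped at max_nines because a run with zero violations would otherwise have no upper bound.

```c
#include <assert.h>
#include <stdint.h>

/* Return the number of nines of deadline compliance, that is, the largest n
 * (up to max_nines) such that violations * 10^n <= iterations. */
unsigned count_nines(uint64_t iterations, uint64_t violations, unsigned max_nines)
{
    assert(iterations > 0 && violations <= iterations);
    unsigned nines = 0;
    uint64_t scaled = violations;
    while (nines < max_nines) {
        /* Stop when scaled * 10 would exceed iterations (overflow-safe test). */
        if (scaled > iterations / 10) {
            break;
        }
        scaled *= 10;
        nines++;
    }
    return nines;
}
```

For 1,000,000 iterations, 10 violations correspond to five nines (99.999% on time), while 11 violations drop the result to four.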

Convert Measurement Units

The processor measures time in CPU clock cycles. The measurement library provides functions for converting CPU clock cycles to other time units and vice versa.
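The arithmetic behind such conversions can be sketched as follows. This is not the library’s implementation: cycles_to_ns() and ns_to_cycles() are hypothetical helpers, and the counter frequency (cycles_per_sec) is passed in as an assumption, whereas the library functions described below take no frequency argument. Results are truncated toward zero.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_SEC 1000000000ULL

/* Convert clock cycles to nanoseconds for an assumed counter frequency.
 * Whole seconds and the remainder are handled separately to avoid overflow
 * (safe for frequencies below roughly 18 GHz). */
uint64_t cycles_to_ns(uint64_t cycles, uint64_t cycles_per_sec)
{
    assert(cycles_per_sec > 0);
    uint64_t whole_sec = cycles / cycles_per_sec;
    uint64_t rem = cycles % cycles_per_sec;
    return whole_sec * NS_PER_SEC + (rem * NS_PER_SEC) / cycles_per_sec;
}

/* Convert nanoseconds to clock cycles for an assumed counter frequency. */
uint64_t ns_to_cycles(uint64_t ns, uint64_t cycles_per_sec)
{
    assert(cycles_per_sec > 0);
    uint64_t whole_sec = ns / NS_PER_SEC;
    uint64_t rem = ns % NS_PER_SEC;
    return whole_sec * cycles_per_sec + (rem * cycles_per_sec) / NS_PER_SEC;
}
```

At an assumed 2 GHz, 2000 cycles correspond to 1000 ns, and a 50 µs deadline (50,000 ns) corresponds to 100,000 cycles.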
  • Convert CPU clock cycles to the standard Linux struct timespec:
    Call tcc_measurement_convert_clock_to_timespec() and specify the input value in CPU clock cycles. The value can be one of the measurements from struct tcc_measurement such as clk_min (latency of the fastest measured sequence iteration), clk_max (latency of the slowest measured sequence iteration), or clk_result (average latency). The following example converts the maximum measured time in clock cycles (clk_max) to struct timespec and prints the tv_nsec field from struct timespec.
    printf("Maximum measured latency: %.0f CPU cycles (%ld nsec)\n",
           measurement.clk_max,
           tcc_measurement_convert_clock_to_timespec(measurement.clk_max).tv_nsec);
  • Convert struct timespec to CPU clock cycles:
    Call tcc_measurement_convert_timespec_to_clock(). The following example uses the function to enable application users to specify a deadline in time units, which must be converted to clock cycles for internal tracking.
    if (time_unit == TCC_TU_NS || time_unit == TCC_TU_US) {
        struct timespec ts_deadline = (struct timespec) {
            .tv_sec = 0,
            .tv_nsec = deadline * ((time_unit == TCC_TU_US) ? 1000 : 1)
        };
        deadline = tcc_measurement_convert_timespec_to_clock(ts_deadline);
    }
  • Convert CPU clock cycles to a specified time unit:
    Call tcc_measurement_convert_clock_to_time_units(). The following example converts CPU clock cycles to nanoseconds.
    printf("Maximum measured latency: %.0f CPU cycles (%ld nsec)\n",
           measurement.clk_max,
           tcc_measurement_convert_clock_to_time_units(measurement.clk_max, TCC_TU_NS));
  • Convert a time unit to CPU clock cycles:
    Call tcc_measurement_convert_time_units_to_clock(). The following example converts microseconds to CPU clock cycles.
    uint64_t deadline_us = 50; // 50 µs
    deadline_clk = tcc_measurement_convert_time_units_to_clock(deadline_us, TCC_TU_US);
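The struct timespec examples above set or print only the tv_nsec field, which covers durations shorter than one second. For longer values, the seconds part matters as well. The sketch below shows one way to split and rejoin a nanosecond count using only the standard struct timespec; ns_to_timespec() and timespec_to_ns() are hypothetical helpers, not library functions, and how the library treats a tv_nsec value of one second or more is not stated in this guide.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Split a nanosecond count into whole seconds and remaining nanoseconds,
 * so that tv_nsec always stays below one second. */
struct timespec ns_to_timespec(uint64_t ns)
{
    struct timespec ts;
    ts.tv_sec = (time_t)(ns / 1000000000ULL);
    ts.tv_nsec = (long)(ns % 1000000000ULL);
    return ts;
}

/* Combine both fields of a normalized, non-negative timespec into nanoseconds. */
uint64_t timespec_to_ns(struct timespec ts)
{
    assert(ts.tv_sec >= 0 && ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}
```

A value of 2,500,000,000 ns becomes tv_sec = 2 and tv_nsec = 500,000,000, and converting it back yields the original count.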

Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.