Evaluation Feature: New Threading Error Analysis Types and Settings

| Product Changes Under Evaluation | Benefits |
|---|---|
| Increased number of preset analysis types | Focus analyses on deadlocks, or data races, or both. |
| New configuration settings | Speed up data race and cross-thread stack access detection. View creation information on synchronization objects involved in deadlocks, lock hierarchy violations, and data races. |
| Split configuration setting | Focus analyses on deadlocks, or lock hierarchy violations, or both. |
| Renamed configuration settings | Make configuration options more intuitive. |
| Increased number of configurable settings | Fine-tune preset analysis types without creating custom analysis types. |

Note

  • Old preset threading error analysis types (ti1, ti2, and ti3) still work, but only when using the inspxe-cl command.

  • Results produced using old preset analysis types are still accessible.

Configuration Settings for Each Preset Analysis Type

The following table shows the configuration settings (in alphabetical order) for each new preset threading error analysis type.

  • N/A (setting value) means not applicable for the preset analysis type (despite the setting value).

  • Configurable means you can change the configuration setting in the preset analysis type without creating a custom analysis type.

  • New configuration settings are identified.

  • Renamed, split, or combined configuration settings show their previous names.

| Setting / Analysis Type | Detect Deadlocks | Detect Data Races | Detect Deadlocks and Data Races | Detect Data Races (Deep Dive) | Detect Deadlocks and Data Races (Deep Dive) |
|---|---|---|---|---|---|
| Cross-thread stack access detection | N/A (Hide problems/Show warnings) | Hide problems/Show warnings | Hide problems/Show warnings | Hide problems/Show warnings | Hide problems/Show warnings |
| Detect data races | N/A (No) | Yes | Yes | Yes | Yes (configurable) |
| Detect data races on stack (previously Detect data races on stack accesses) | N/A (No) | No | No | No (configurable) | No |
| Detect deadlocks (split from Detect lock hierarchy violations and deadlocks) | Yes | N/A (No) | Yes | N/A (No) | Yes |
| Detect lock hierarchy violations (split from Detect lock hierarchy violations and deadlocks) | Yes (configurable) | N/A (No) | Yes (configurable) | N/A (No) | Yes (configurable) |
| Race analysis byte granularity (previously Memory access byte granularity) | N/A (4 bytes) | 4 bytes | 4 bytes | 4 bytes (configurable) | 4 bytes (configurable) |
| Remove duplicates | Yes | Yes | Yes | Yes (configurable) | Yes (configurable) |
| Save stack on first access | N/A (No) | No (configurable) | No (configurable) | Yes (configurable) | Yes (configurable) |
| Save stack on lock creation (new) | No (configurable) | No | No (configurable) | No | Yes (configurable) |
| Save stack on memory allocation (previously Save stack on allocation) | N/A (No) | No (configurable) | No (configurable) | Yes (configurable) | Yes (configurable) |
| Stack frame depth | 1 (configurable) | 1 (configurable) | 1 (configurable) | 8 (configurable) | 8 (configurable) |
| Terminate on deadlock | No (configurable) | N/A (No) | No (configurable) | N/A (No) | No (configurable) |
| Use maximum resources (new) | N/A (No) | No | No | Yes (configurable) | Yes (configurable) |

Configuration Setting Descriptions

The following table describes all configuration settings available for configuring threading error analyses. The settings are listed in alphabetical order.

Setting

Purpose, Usefulness, Cost, and Recommendation

Cross-thread stack access detection

Set the alert mechanism for when a thread accesses stack memory of another thread.

The alert mechanism helps you decide if this is an issue that requires handling.

All options are low cost if Detect data races is selected.

Recommendation:

  • Use Hide problems/Hide warnings if using an OpenMP*, Intel® Threading Building Blocks, or Intel® Cilk™ Plus programming model; or if cross-thread stack accesses are anticipated. Also select Detect data races on stack.

  • Use Hide problems/Show warnings if cross-thread stack accesses are not anticipated. Also deselect Detect data races on stack.

  • Use Show problems/Hide warnings if cross-thread stack accesses are not anticipated but a previous analysis indicated they exist and you are not using an OpenMP*, Intel Threading Building Blocks, or Intel Cilk Plus programming model. Also deselect Detect data races on stack.

Detect data races

Select to detect problems where multiple threads access the same memory location without proper synchronization and at least one access is a write.

Selecting is useful when you suspect data races that are not yet evident.

High cost.

Recommendation: Select. Consider also deselecting Use maximum resources to reduce cost.
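The access pattern this setting targets can be sketched in a few lines of Python. This is only an illustration of the problem class, not of how the Intel Inspector works; the names Counter and run_workers are hypothetical. Several threads perform a read-modify-write on a shared value: without a lock, updates can be lost; with a lock, the total is always exact.

```python
import threading

class Counter:
    """Shared counter; `use_lock` selects the synchronized variant."""

    def __init__(self, use_lock):
        self.value = 0
        self._lock = threading.Lock() if use_lock else None

    def increment(self):
        if self._lock is None:
            # Unsynchronized read-modify-write: two threads may read the
            # same old value, so one update can be lost (a data race).
            self.value += 1
        else:
            with self._lock:
                self.value += 1

def run_workers(counter, n_threads=4, n_increments=10_000):
    """Increment `counter` from several threads and return the final value."""
    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value

safe_total = run_workers(Counter(use_lock=True))      # always 4 * 10_000
racy_total = run_workers(Counter(use_lock=False))     # may be lower
```

The unsynchronized total is not predictable from run to run, which is why such problems may not yet be evident when you start looking for them.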

Detect data races on stack

Available only if Detect data races is selected.

Select to detect data races for variables allocated on the stack.

Selecting is useful when threads in an application share variables from the stack and you suspect data races on the variables.

High cost.

Recommendation: Deselect. If you select, consider also deselecting Use maximum resources to reduce cost.

Detect deadlocks

Select to detect problems where two or more threads are each waiting for another to release resources, but none of the threads releases them, so no thread can proceed.

Selecting is useful when you want to troubleshoot the location of a deadlock.

Low cost.

Detect lock hierarchy violations

Select to detect problems where the acquisition hierarchy order of multiple synchronization objects in one thread differs from the acquisition hierarchy order in another thread, and could cause a deadlock under certain conditions.

Selecting is useful when an application has complicated synchronization and it is hard to verify correctness.

Low cost unless an application has a significant number of locks.
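As a rough illustration of what a lock hierarchy violation looks like, the sketch below compares the order in which each thread acquires nested locks and reports pairs taken in opposite orders. It is a simplified model under stated assumptions (every lock in a list is held until the end of that list); the function names and the observed data are hypothetical, and this is not a description of the Intel Inspector's algorithm.

```python
from itertools import combinations

def lock_order_pairs(acquisitions):
    """Return the (earlier, later) pairs for one thread's nested acquisitions."""
    return set(combinations(acquisitions, 2))

def find_hierarchy_violations(per_thread_acquisitions):
    """Report lock pairs that different threads acquire in opposite order."""
    seen = {}           # (first, second) -> thread that first used this order
    violations = set()
    for thread, acquisitions in per_thread_acquisitions.items():
        for first, second in lock_order_pairs(acquisitions):
            other = seen.get((second, first))
            if other is not None and other != thread:
                violations.add(frozenset((first, second)))
            seen.setdefault((first, second), thread)
    return violations

observed = {
    "thread-1": ["lock_a", "lock_b"],   # takes lock_a, then lock_b
    "thread-2": ["lock_b", "lock_a"],   # takes lock_b, then lock_a
}
violations = find_hierarchy_violations(observed)
```

Neither thread has to block for the pair to be reported: the inconsistent order alone means a deadlock could occur under some interleaving.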

Race analysis byte granularity

Available only if Detect data races is selected.

Set the size of the smallest memory block the Intel Inspector considers a single block of memory when determining if non-synchronized accesses to a memory block constitute a data race.

Adjusting this value is useful to control memory consumption during analysis for some applications.

High cost when set to 1 byte.

Recommendation: Set to 4 unless you continually see data races based on safe access to smaller memory blocks. If so, reset to 1.
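To see why a 4-byte granularity can report races on accesses that are actually safe, consider a simplified model in which the analysis tracks memory in fixed-size blocks. The addresses and function names below are hypothetical, and the real tool may track memory differently.

```python
def block_id(address, granularity):
    """Map a byte address to the block a block-based analysis would track."""
    return address // granularity

def share_block(addr_a, addr_b, granularity):
    """True if both addresses fall into the same tracked block."""
    return block_id(addr_a, granularity) == block_id(addr_b, granularity)

# Two threads each write their own one-byte flag at neighbouring addresses.
flag_a, flag_b = 0x1000, 0x1001

coarse = share_block(flag_a, flag_b, granularity=4)   # same 4-byte block
fine = share_block(flag_a, flag_b, granularity=1)     # separate 1-byte blocks
```

In this model the two writes look like unsynchronized accesses to one block at 4 bytes but to different blocks at 1 byte, at the price of tracking four times as many blocks.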

Remove duplicates

Deselect to show all occurrences of a detected problem in the Code Locations pane.

Deselecting is:

  • Useful when you need to fully visualize all threads and problem occurrences in relation to time

  • Low cost in terms of time; however, duplicate errors could crowd out unique errors.

Recommendation: Select.

Save stack on first access

Available only if Detect data races is selected.

Select to show as much information as possible on all threads involved in a data race.

Selecting is useful when investigating complex data race problems.

High cost.

Recommendation: Deselect on initial analysis runs. Select only when you need the maximum information and context about all threads involved in a data race to solve the problem.

Save stack on lock creation

Select to show creation information on synchronization objects involved in deadlocks, lock hierarchy violations, and data races.

Selecting is useful when acquisition stacks are not sufficient to understand the problem.

Low cost.

Save stack on memory allocation

Available only if Detect data races is selected.

Select to identify the allocation site of dynamically allocated memory objects involved in data races.

Medium cost.

Recommendation: Select when you need to identify the object hierarchy of low-level objects involved in data races. For example: If object R is involved in a data race and is instantiated within objects O1, O2, and O3, the allocation call stack can help you identify which encapsulating object is not properly protecting access to object R.

Stack frame depth

Provide more or less call stack context for detected errors.

A high setting is useful when analyzing highly object-oriented applications.

A higher number does not significantly impact cost with one exception: Choosing a higher number plus selecting Save stack on first access increases cost.

Recommendation: Use only as large a value as an application requires to display complete call paths.

Terminate on deadlock

Available only if Detect deadlocks is selected.

Select to stop analysis and application execution if the Intel Inspector detects a deadlock.

Selecting is useful when running your application as part of a kernel or unit testing suite.

Low cost.

Recommendation: Deselect. Instead, use the corresponding knob in the command line interface to perform kernel or unit testing in a nightly scenario. If the Intel Inspector identifies a deadlock, decide if it is appropriate to continue analysis.

Use maximum resources

Select to potentially find more problems.

High cost.

Recommendation: Deselect to run a quicker analysis that should find most of your data race and cross-thread stack access problems. Once you have found and fixed these problems, select to get more complete analysis coverage of possible data race and cross-thread stack access problems.

Corresponding Command Line Interface Changes

The following table identifies the appropriate <analysis_type> argument for the inspxe-cl collect option.

| New Preset Analysis Type Name | Corresponding inspxe-cl <analysis_type> Argument |
|---|---|
| Detect Deadlocks | deadlock |
| Detect Data Races | datarace |
| Detect Deadlocks and Data Races | deadlock-datarace |
| Detect Data Races (Deep Dive) | datarace-deep |
| Detect Deadlocks and Data Races (Deep Dive) | deadlock-datarace-deep |
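If you drive analyses from a script, the mapping above can be kept in one place. The following Python sketch only builds an argument list; it assumes the command takes the form inspxe-cl -collect <analysis_type> -- <application>, which you should confirm against the command's own help output. The helper name collect_command and the application path are hypothetical.

```python
# Preset names and arguments are taken from the table above.
ANALYSIS_TYPE_ARGS = {
    "Detect Deadlocks": "deadlock",
    "Detect Data Races": "datarace",
    "Detect Deadlocks and Data Races": "deadlock-datarace",
    "Detect Data Races (Deep Dive)": "datarace-deep",
    "Detect Deadlocks and Data Races (Deep Dive)": "deadlock-datarace-deep",
}

def collect_command(preset_name, app_argv):
    """Build an argv list for a collect run (assumed command-line form)."""
    analysis_type = ANALYSIS_TYPE_ARGS[preset_name]   # KeyError if unknown
    return ["inspxe-cl", "-collect", analysis_type, "--", *app_argv]

cmd = collect_command("Detect Data Races (Deep Dive)", ["./my_app"])
```

The resulting list can be passed to subprocess.run on a machine where inspxe-cl is installed.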

Various knob names have also changed to better correspond to renamed configuration settings. Use the following inspxe-cl commands to identify available knobs:

  • For the collect action: $ inspxe-cl -knob-list <analysis_type>

  • For the collect-with action: $ inspxe-cl -knob-list <collector>

For example:

$ inspxe-cl -knob-list datarace
$ inspxe-cl -knob-list runtc

Result and Result Directory Name Templates

The following table identifies the default result and result directory name for each preset analysis type.

| New Preset Analysis Type Name | Preset Analysis Type Identifier |
|---|---|
| Detect Deadlocks | r@@@td |
| Detect Data Races | r@@@tr |
| Detect Deadlocks and Data Races | r@@@tdr |
| Detect Data Races (Deep Dive) | r@@@trx |
| Detect Deadlocks and Data Races (Deep Dive) | r@@@tdrx |

Parsing Key

  • t = threading analysis type

  • d = deadlock

  • r = data race

  • x = deep dive
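The parsing key can be applied mechanically. The sketch below assumes that @@@ in the template is replaced by a three-digit sequence number in actual result names (the text above does not state this, so treat it as an assumption); parse_result_name is a hypothetical helper.

```python
import re

# r + three digits (assumed) + t + optional d, r, x suffix letters in order.
_RESULT_NAME = re.compile(r"^r(?P<seq>\d{3})t(?P<d>d?)(?P<r>r?)(?P<x>x?)$")

def parse_result_name(name):
    """Decode a threading result name using the parsing key."""
    match = _RESULT_NAME.match(name)
    if match is None or not (match["d"] or match["r"]):
        raise ValueError(f"not a recognized threading result name: {name!r}")
    return {
        "sequence": int(match["seq"]),
        "deadlock": bool(match["d"]),
        "data_race": bool(match["r"]),
        "deep_dive": bool(match["x"]),
    }

info = parse_result_name("r002tdrx")
```

Here info describes a deep-dive result that covers both deadlocks and data races.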

State Propagation

When you run an analysis to create a new result, the Intel Inspector automatically propagates state information from a baseline (previous) result to the newer result. By default, the Intel Inspector establishes a baseline result from the immediately previous result of the same analysis type; however, you can establish any baseline result you want (using the Options-State Management dialog box in the GUI or the baseline-result action-option in the inspxe-cl command tool).

To maintain your state assignments when you switch from old preset threading error analysis types to new, choose a baseline result you created from an old preset threading error analysis type (ti1, ti2, or ti3). If you have used multiple preset threading error analysis types, choose the widest (ti1 is the narrowest threading error analysis type; ti3 is the widest).
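The "choose the widest" rule can be expressed as a small helper. This Python sketch only encodes the ordering stated above (ti1 narrowest, ti3 widest); the function name and the input list are hypothetical.

```python
# Relative width of the old preset analysis types, as stated above.
OLD_TYPE_WIDTH = {"ti1": 1, "ti2": 2, "ti3": 3}

def widest_old_type(used_types):
    """Return the widest old preset type among `used_types`, or None."""
    old = [t for t in used_types if t in OLD_TYPE_WIDTH]
    if not old:
        return None
    return max(old, key=OLD_TYPE_WIDTH.__getitem__)

baseline_type = widest_old_type(["ti1", "ti3", "ti2"])
```

A result created with the returned analysis type is the one to select as the baseline.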


Supplemental documentation specific to a particular Intel Studio may be available at <install-dir>\<studio>\documentation\.

For more complete information about compiler optimizations, see our Optimization Notice.