Anomaly Detection View
- Code regions of interest
- Information about where simulations executed faster or slower than normal
- Switch to the Bottom-up window.
- Group results by Code Region of Interest / Duration Type.
- To further examine the outliers in the Slow region, right-click this field and select Load Intel Processor Data by Selection.
Intel Processor Trace Details Window
Instructions Retired, Call Count, Total Iteration Count
Control flow metrics.
Instructions Retired refers to the number of entries into a kernel.
CPU Time (Kernel and User)
Active time on the CPU
Wait Time, Inactive Time
Duration for which a thread was idle because of synchronization or preemption
Latency (Wall-clock time of the code region execution)
Context Switch Anomaly
- In the Intel Processor Trace Details window, check the Inactive Time and Wait Time metrics. Wait Time indicates the duration for which a thread was idle due to synchronization issues.
- If the metrics are zero, the application had no context switches. Proceed to check for a different type of anomaly.
- If the metrics are non-zero, continue with this procedure to check for context switches.
- Sort the Wait Time column.
- For the instances that had significant Wait Time, compare the Wait Time with the Elapsed Time. If the thread was idle for a significant portion of the elapsed time, the cause was a context switch or synchronization issue. In this example, thread 25883 was idle for 1.269 out of 1.318 milliseconds, which is about 96% of the elapsed time.
- Expand the instance to drill down to a function or stack. Identify the stack(s) that brought the thread to idle state.
- In the Intel Processor Trace Details window, sort the data in the Kernel Time column. Where the proportion of kernel time to elapsed time is high, a significant amount of time was spent in the kernel. In this example, 566 out of 997 microseconds were spent in the kernel for the highlighted thread.
- Expand the thread to see contributing stacks that could be responsible for long kernel times.
- Bottom-up window: Shows frequency information for the entire application.
- Intel Processor Trace Details window: Shows frequency information only for the loaded region.
- There are Intel® Advanced Vector Extensions (Intel® AVX) instructions used inside or outside a loaded code region.
- There are underlying hardware issues like cooling.
- Apart from your application, low activity on the core and OS can also cause frequency drops. Look for high values of Inactive Time or Wait Time.
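The ratio checks in the context switch and kernel time steps above can be reproduced with simple arithmetic. The following is a minimal sketch, not part of the tool: the helper names and the 0.5 cut-off for "significant" are assumptions chosen for illustration, and the only real data are the two pairs of values quoted in the text.

```python
def fraction(part, whole):
    """Return part / whole, or 0.0 when whole is zero."""
    return part / whole if whole else 0.0


# Assumed cut-off for calling a share "significant"; this is an
# illustrative value, not a setting of the profiler.
THRESHOLD = 0.5


def is_significant(share, threshold=THRESHOLD):
    """True when the share of elapsed time reaches the assumed cut-off."""
    return share >= threshold


# Values quoted in the text above.
idle_share = fraction(1.269, 1.318)  # thread 25883: Wait Time / Elapsed Time (ms)
kernel_share = fraction(566, 997)    # highlighted thread: Kernel Time / Elapsed Time (us)

print(f"idle share:   {idle_share:.1%}")    # about 96.3%
print(f"kernel share: {kernel_share:.1%}")  # about 56.8%
```

Both shares exceed the assumed cut-off, which matches the reading in the text that the thread was mostly idle in the first example and spent more than half of its elapsed time in the kernel in the second.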
Control Flow Deviation Anomaly
- Select a node in the grid where you see a high value for Instructions Retired.
- Right-click and select Filter In by Selection from the context menu.
- Switch to the Caller/Callee window. In the flat profile view, you can see functions annotated with Self and Total CPU Times. The caller view shows the callers of the selected function in a bottom-up representation. The callee view shows a call tree from the selected function in a top-down representation.
- In this example, the call to the _slab_evict_one function from _slab_evict_rand causes significant delay, as evidenced by the Self CPU Time.
- Compare the number of loop iterations between a fast and a slow iteration by checking the Total Iteration Count.
- If the slower iteration has a higher iteration count, switch to the Source Assembly view and examine the source code of the function.
- Check to see if the slower iteration passed the validation of the cached element.
- Increase the cache size.
- Update cache data and repeat the analysis.
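The comparison of Total Iteration Count between a fast and a slow iteration can also be done outside the GUI once the values are written down. The sketch below assumes you have copied the counts by hand into two dictionaries keyed by function name; the function `compare_iterations` and the counts in the usage example are hypothetical and serve only to show the shape of the comparison.

```python
def compare_iterations(fast, slow):
    """List functions whose iteration count is higher in the slow run.

    fast, slow: dicts mapping function name -> Total Iteration Count.
    Returns (name, fast_count, slow_count, difference) tuples,
    sorted by the largest difference first.
    """
    rows = []
    for name in sorted(set(fast) | set(slow)):
        fast_count = fast.get(name, 0)
        slow_count = slow.get(name, 0)
        if slow_count > fast_count:
            rows.append((name, fast_count, slow_count, slow_count - fast_count))
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows


# Hypothetical counts, for illustration only.
fast_run = {"_slab_evict_rand": 4, "_slab_evict_one": 4}
slow_run = {"_slab_evict_rand": 4, "_slab_evict_one": 120}

for name, fast_count, slow_count, diff in compare_iterations(fast_run, slow_run):
    print(f"{name}: {fast_count} -> {slow_count} (+{diff})")
```

Functions that appear at the top of this list are the first candidates to open in the source view.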