Cookbook

  • 2020
  • 06/18/2020

Profiling a .NET* Core Application

This recipe uses the .NET Core dynamic-code profiling support in Intel® VTune™ Amplifier to locate performance hotspots in managed code and reduce the application's run time.

Ingredients

This section lists the hardware and software tools used for the performance analysis scenario.
  • Application:
    a sample C# application that adds all the elements of an integer List. The application is used as a demo and is not available for download.
  • Tools:
    • Intel® VTune™ Amplifier 2018
      • See https://software.intel.com/en-us/vtune for VTune Profiler downloads and product support.
      • All the Cookbook recipes are scalable and can be applied to Intel VTune Amplifier 2018 and higher. Slight version-specific configuration changes are possible.
      • Intel® VTune™ Amplifier has been renamed to Intel® VTune™ Profiler starting with its version for Intel® oneAPI Base Toolkit (Beta). You can still use a standalone version of the VTune Profiler, or its versions integrated into Intel Parallel Studio XE or Intel System Studio.
  • Operating system:
    Microsoft* Windows* 10
  • CPU:
    Intel microarchitecture code name Skylake

Prepare Your Application for Analysis

  1. Open a new command window for the .NET environment variables to take effect. Make sure that .NET Core 2.0 is successfully installed:
    dotnet --version
  2. Create a new listadd directory for the application and change to it:
    mkdir C:\listadd
    cd C:\listadd
  3. Enter dotnet new console to create a new skeleton project. The command generates the listadd.csproj and Program.cs files in the listadd folder.
  4. Replace the contents of Program.cs in the listadd folder with C# code that adds the elements of an integer List:
    using System;
    using System.Linq;
    using System.Collections.Generic;

    namespace listadd
    {
        class Program
        {
            static void Main(string[] args)
            {
                Console.WriteLine("Starting calculation...");
                List<int> numbers = Enumerable.Range(1, 10000).ToList();
                for (int i = 0; i < 100000; i++)
                {
                    ListAdd(numbers);
                }
                Console.WriteLine("Calculation complete");
            }

            static int ListAdd(List<int> candidateList)
            {
                int result = 0;
                foreach (int item in candidateList)
                {
                    result += item;
                }
                return result;
            }
        }
    }
  5. Add the following property to the PropertyGroup section of the listadd.csproj file to enable source code analysis in VTune Amplifier:
    <DebugType>pdbonly</DebugType>
  6. Build the application to create listadd.dll in the C:\listadd\bin\Release\netcoreapp2.0 folder:
    dotnet build -c Release
  7. Run the sample application:
    dotnet C:\listadd\bin\Release\netcoreapp2.0\listadd.dll

Run Advanced Hotspots Analysis

  1. Launch the VTune Amplifier with administrator privileges.
  2. Click the New Project button on the toolbar and specify a name for the new project, for example: dotnet.
  3. In the Analysis Target window, select the local host target system and the Launch Application target type from the left pane.
  4. On the Launch Application pane, specify the application to analyze:
    • Application: C:\Program Files\dotnet\dotnet.exe
    • Application parameters: C:\listadd\bin\Release\netcoreapp2.0\listadd.dll
    The location of dotnet.exe depends on your environment and can be identified with the command:
    where dotnet
  5. Click the Choose Analysis button on the right and select the Advanced Hotspots analysis from the left pane.
    Advanced Hotspots analysis was integrated into the generic Hotspots analysis starting with Intel VTune Amplifier 2019, and is available via the Hardware Event-Based Sampling collection mode.
  6. Click Start to run the analysis.

Identify Hotspots in the Managed Code

When the collected analysis result opens, switch to the Bottom-up tab and set the data grouping level to Process/Module/Function/Thread/Call Stack.
Expanding dotnet.exe > listadd.dll reveals the managed listadd::Program::ListAdd function, which took the most CPU Time.
Double-click this hotspot function to open the source view. To view the source and disassembly code side by side, click the Assembly toggle button on the toolbar.
Use the statistics per source line/assembly instruction to identify the most time-consuming code (source line 24 in this example) and work on optimizations.

Optimize the Code by Replacing foreach with for

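VTune Amplifier highlights the following code line as performance-critical:
    foreach (int item in candidateList)
Iterating a List<int> with foreach goes through an enumerator, whose MoveNext() call also checks on every step that the list has not been modified. An indexed for loop avoids this overhead. Replace the contents of Program.cs with this C# code: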
    using System;
    using System.Linq;
    using System.Collections.Generic;

    namespace listadd
    {
        class Program
        {
            static void Main(string[] args)
            {
                Console.WriteLine("Starting calculation...");
                List<int> numbers = Enumerable.Range(1, 10000).ToList();
                for (int i = 0; i < 100000; i++)
                {
                    ListAdd(numbers);
                }
                Console.WriteLine("Calculation complete");
            }

            static int ListAdd(List<int> candidateList)
            {
                int result = 0;
                for (int i = 0; i < candidateList.Count; i++)
                {
                    result += candidateList[i];
                }
                return result;
            }
        }
    }
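
To make the difference between the two loop forms concrete, the sketch below writes out by hand roughly what a foreach over a List<int> does, next to the indexed form used in the updated Program.cs. It is an illustration under stated assumptions, not the compiler's exact output: the LoopForms and LoopFormsDemo classes and their method names are hypothetical and are not part of the recipe's sample.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, for illustration only.
static class LoopForms
{
    // Approximately what `foreach (int item in list)` does:
    // it obtains an enumerator and calls MoveNext()/Current per element.
    public static int SumWithEnumerator(List<int> list)
    {
        int result = 0;
        List<int>.Enumerator e = list.GetEnumerator();
        try
        {
            while (e.MoveNext())
            {
                result += e.Current;
            }
        }
        finally
        {
            e.Dispose();
        }
        return result;
    }

    // The indexed form: one bounds-checked element access per iteration.
    public static int SumWithIndex(List<int> list)
    {
        int result = 0;
        for (int i = 0; i < list.Count; i++)
        {
            result += list[i];
        }
        return result;
    }
}

static class LoopFormsDemo
{
    static void Main()
    {
        List<int> numbers = Enumerable.Range(1, 10000).ToList();
        // Both forms should print the same sum, 10000 * 10001 / 2.
        Console.WriteLine(LoopForms.SumWithEnumerator(numbers));
        Console.WriteLine(LoopForms.SumWithIndex(numbers));
    }
}
```

Checking that both methods return the same value is a cheap way to confirm that the rewrite changes only the cost of the loop, not its result. How much the indexed form actually saves depends on the .NET runtime version and its JIT compiler, so the profiler remains the authority on that.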

Verify the Optimization

To verify the optimization, re-run the Advanced Hotspots analysis on the updated code.
Before the optimization, the sample application took 2.636 seconds of CPU time.
After the optimization, the application ran for 0.945 seconds, a 64% reduction relative to the original.
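
The 64% figure follows from the two reported times: (2.636 - 0.945) / 2.636 is approximately 0.64. For a rough cross-check outside the profiler, a Stopwatch harness along the following lines could compare the two loop forms. This is a minimal sketch, not part of the recipe: the VerifyOptimization class and the Reduction, SumForeach, SumFor, and TimeIt names are hypothetical, and the harness measures elapsed wall-clock time rather than the CPU time that VTune Amplifier reports.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical harness, for illustration only.
static class VerifyOptimization
{
    // Relative reduction: before = 2.636 and after = 0.945 give about 0.64.
    public static double Reduction(double before, double after)
    {
        return (before - after) / before;
    }

    // Original loop form from the recipe.
    public static int SumForeach(List<int> list)
    {
        int result = 0;
        foreach (int item in list) { result += item; }
        return result;
    }

    // Optimized loop form from the recipe.
    public static int SumFor(List<int> list)
    {
        int result = 0;
        for (int i = 0; i < list.Count; i++) { result += list[i]; }
        return result;
    }

    // Calls `sum` the given number of times and returns elapsed seconds.
    public static double TimeIt(Func<List<int>, int> sum, List<int> data, int iterations)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) { sum(data); }
        sw.Stop();
        return sw.Elapsed.TotalSeconds;
    }

    static void Main()
    {
        // Times reported in this recipe.
        Console.WriteLine("Recipe reduction: {0:P0}", Reduction(2.636, 0.945));

        // Same data size and iteration count as the sample application.
        List<int> numbers = Enumerable.Range(1, 10000).ToList();
        double before = TimeIt(SumForeach, numbers, 100000);
        double after = TimeIt(SumFor, numbers, 100000);
        Console.WriteLine("foreach: {0:F3} s, for: {1:F3} s", before, after);
    }
}
```

Build the harness in the Release configuration, as in the recipe, so that the timings reflect optimized code. The measured values will differ from the recipe's numbers depending on hardware and runtime version.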
