Configure your target for the KVM guest OS profiling.
While Concurrency analysis helps identify where your application is not parallel, Locks and Waits analysis helps identify the cause of ineffective processor utilization. One of the most common problems is threads waiting too long on synchronization objects (locks). Performance suffers when waits occur while cores are under-utilized.
Use the Memory Access analysis to identify memory-related issues, like NUMA problems and bandwidth-limited accesses, and attribute performance events to memory objects (data structures), which is provided due to instrumentation of memory allocations/de-allocations and getting static/global variables from symbol information.