Disable NMI Watchdog when using PMU event-based sampling

Problem (affects versions earlier than Intel(R) VTune(TM) Amplifier XE 2013 Update 17):

PMU event-based sampling data collection fails to run, for example:

 

# amplxe-cl -collect lightweight-hotspots -duration 10

2> NMI watchdog timer is enabled. 

2> Turn off the nmi_watchdog timer

2> and restart.

 

Using result path `/home/peter/problem_report/r000lh'

Executing actions 50 % done                                                   

Error: Error 0x4000001e (Cannot load raw collector data)

Root cause:

The Non-Maskable Interrupt (NMI) watchdog is a Linux kernel facility that periodically checks whether a CPU has locked up. When a lockup is detected, the NMI watchdog prints debug information and, in some cases, reboots the system. However, the NMI watchdog uses a hardware performance counter, so other performance tools, including VTune™ Amplifier XE 2013, cannot use PMU event-based sampling data collection while it is enabled.

Two solutions:

1.     Add the kernel boot parameter “nmi_watchdog=0” in /boot/grub/grub.conf, then reboot the system

2.     Disable the NMI watchdog at run time (as root):

# echo 0 > /proc/sys/kernel/nmi_watchdog

 

Use “cat /proc/sys/kernel/nmi_watchdog” to verify that the NMI watchdog is disabled (zero indicates "disabled").

A simple test can then confirm that collection works, for example:
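The check above can also be scripted. The following is a minimal sketch, not part of VTune Amplifier XE: the function name nmi_watchdog_status and its optional file argument are hypothetical, and the argument exists only so the check can be tried against a file other than /proc/sys/kernel/nmi_watchdog.

```shell
#!/bin/sh
# Hypothetical helper (not part of VTune Amplifier XE):
# report the NMI watchdog state read from the kernel control file.
# The optional first argument overrides the file path; it is an
# assumption added here so the function can be tried on a test file.
nmi_watchdog_status() {
    file="${1:-/proc/sys/kernel/nmi_watchdog}"

    # The file is missing or unreadable: state cannot be determined.
    if [ ! -r "$file" ]; then
        echo "unknown"
        return 2
    fi

    value=$(cat "$file")

    # Zero indicates that the watchdog is disabled.
    if [ "$value" = "0" ]; then
        echo "disabled"
        return 0
    fi

    echo "enabled"
    return 1
}
```

If the function prints "enabled", run the echo command from solution 2 as root and call the function again before starting a collection.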

1. amplxe-cl -collect lightweight-hotspots -duration 10

2. amplxe-cl -collect-with runsa -knob event-config=CPU_CLK_UNHALTED.THREAD -duration 10

For more complete information about compiler optimizations, see our Optimization Notice.