Problem:
PMU event-based sampling data collection fails, for example:
# amplxe-cl -collect lightweight-hotspots -duration 10
2> NMI watchdog timer is enabled.
2> Turn off the nmi_watchdog timer
2> and restart.
Using result path `/home/peter/problem_report/r000lh'
Executing actions 50 % done
Error: Error 0x4000001e (Cannot load raw collector data)
Root cause:
The Non-Maskable Interrupt (NMI) watchdog is a Linux kernel feature that periodically checks whether a CPU is locked up. When a lockup is detected, the NMI watchdog 1) prints debug information and 2) in some cases, reboots the system. However, the NMI watchdog uses a hardware performance counter, so other performance tools, including VTune™ Amplifier XE 2013, cannot use PMU event-based sampling data collection while it is enabled.
Two solutions:
1. Add the kernel argument “nmi_watchdog=0” in /boot/grub/grub.conf, then reboot the system.
2. Disable the NMI watchdog at run time:
# echo 0 > /proc/sys/kernel/nmi_watchdog
Use “cat /proc/sys/kernel/nmi_watchdog” to verify that the NMI watchdog is disabled (zero indicates "disabled").
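The check in solution 2 can be wrapped in a small helper so it can be run before each collection. This is only a minimal sketch, assuming a POSIX shell; the function name nmi_watchdog_state and its optional path argument are hypothetical and introduced here for illustration. They are not part of VTune Amplifier XE or the kernel.

```shell
#!/bin/sh
# Hypothetical helper (not part of VTune): report the NMI watchdog state.
# The optional first argument is the file to read; it defaults to the
# sysctl file named in this article.
nmi_watchdog_state() {
    file="${1:-/proc/sys/kernel/nmi_watchdog}"

    # If the file cannot be read, the state cannot be determined.
    if [ ! -r "$file" ]; then
        echo "unknown"
        return 2
    fi

    value=$(cat "$file")

    # Zero indicates "disabled"; any other value is treated as enabled.
    if [ "$value" = "0" ]; then
        echo "disabled"
        return 0
    fi

    echo "enabled"
    return 1
}
```

Calling nmi_watchdog_state with no argument reads /proc/sys/kernel/nmi_watchdog. If it reports "enabled", run the echo command from solution 2 as root before starting the collection.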
To verify the fix, run a simple test, for example:
1. amplxe-cl -collect lightweight-hotspots -duration 10
2. amplxe-cl -collect-with runsa -knob event-config=CPU_CLK_UNHALTED.THREAD -duration 10
