Cannot monitor MICs with micsmc

Hello,

I can't monitor core utilization on the coprocessors using micsmc.

Here is part of the corresponding log file, CPL_GUI.log:

19 2013 14:53:26: Warning: RAS Event Notification is disabled: Failure to connect, scif_connect failed.
19 2013 14:53:33: Warning: mic1: Turbo Mode is not supported by this device.
19 2013 14:53:33: Warning: mic0: Turbo Mode is not supported by this device.
19 2013 14:53:45: Warning: mic0: Device connection lost!
19 2013 14:53:45: Warning: mic1: Device connection lost!
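To see at a glance which devices keep dropping out, the "Device connection lost!" warnings in a log like the one above can be counted per device. This is only a minimal sketch: the function name count_connection_lost is hypothetical, and it assumes the log keeps the "Warning: micN: Device connection lost!" line layout shown here.

```shell
# Hypothetical helper: count "Device connection lost" warnings per device.
# Assumes the line layout seen in the CPL_GUI.log excerpt above.
count_connection_lost() {
    log="${1:-CPL_GUI.log}"   # path to the log file; the default name is an assumption
    awk '/Device connection lost/ {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^mic[0-9]+:$/) {
                dev = $i
                sub(/:$/, "", dev)   # strip the trailing colon from "mic0:"
                lost[dev]++
            }
        }
    }
    END { for (dev in lost) print dev, lost[dev] }' "$log" | sort
}
```

Calling count_connection_lost CPL_GUI.log prints one line per device with its warning count.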

Also, the micinfo utility shows no information in its "Version" block.

I've attached the log files, as requested in http://software.intel.com/en-us/forums/topic/381735#comment-1730518


I also tried to get the firmware version with the mpssflash command:

# micctrl -rw
mic0: resetting
mic1: resetting
mic0: ready
mic1: ready
# service mpss stop
Shutting down MPSS Stack:                                  [  OK  ]
# /opt/intel/mic/bin/mpssflash --device all device
mic1: Flash: Winbond W25Q16
mic0: Flash: Winbond W25Q16
mic0: Resetting
mic1: Resetting
# /opt/intel/mic/bin/mpssflash --device all version

The last command prints nothing.

I can get the flash version only by reading it from sysfs, because micinfo shows "Flash Version : NotAvailable":

# cat /sys/class/mic/mic0/flashversion
386
# cat /sys/class/mic/mic1/flashversion
386
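With more than a couple of cards, the same sysfs read can be done in a loop. The sketch below is an illustration only: print_flash_versions is a hypothetical helper name, and it assumes every device directory under /sys/class/mic exposes a flashversion attribute like the two shown above.

```shell
# Hypothetical helper: print the flash version of every micN device.
# The default directory is the sysfs path used in the commands above.
print_flash_versions() {
    base="${1:-/sys/class/mic}"
    for dev in "$base"/mic*; do
        # Skip entries without a readable flashversion attribute
        # (this also covers the case where the glob matches nothing).
        [ -r "$dev/flashversion" ] || continue
        printf '%s: %s\n' "$(basename "$dev")" "$(cat "$dev/flashversion")"
    done
}
```

Running print_flash_versions with no argument reads /sys/class/mic; passing another directory is mainly useful for trying the function out on a machine without the cards.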

# cat /sys/class/mic/mic0/uevent
MAJOR=246
MINOR=2
DEVNAME=mic0

# cat /sys/class/mic/mic1/uevent
MAJOR=246
MINOR=3
DEVNAME=mic1

I have just found the solution to my problem.

Our 5110P cards did not work normally because they had old versions of the SMC firmware and the SMC bootloader. The update operation never completed; every time I ran it, I saw these messages:

micflash: mic0: Flash operation timed out
mic0: Resetting

The solution is as follows:

1. Disable the mpss service with "chkconfig mpss off" and reboot the host system.
2. Run the first stage of the firmware update:

# /opt/intel/mic/bin/micflash -update -device all

I saw the following output:
No image path specified - Searching: /opt/intel/mic/flash
mic0: Flash image: /opt/intel/mic/flash/EXT_HP2_B1_0386-03.rom.smc
mic0: Flash update started
mic0: Flash update done
mic0: SMC update started
micflash: mic0: Flash operation timed out

mic0: Resetting

Please restart host for flash changes to take effect

3. Power off the host system and unplug the AC cables for about 10 seconds.
4. Power on the host system and run the second stage of the firmware update:

# /opt/intel/mic/bin/micflash -update -smcbootloader -device all

This time I saw different messages, reporting a successful firmware and bootloader update:
No image path specified - Searching: /opt/intel/mic/flash
mic0: Flash image: /opt/intel/mic/flash/EXT_HP2_B1_0386-03.rom.smc
mic0: SMC boot-loader image: /opt/intel/mic/flash/EXT_HP2_SMC_Bootloader_1_8_4326.css_ab
mic0: SMC boot-loader update started
mic0: SMC boot-loader update done
mic0: Resetting
mic0: Flash update started
mic0: Flash update done
mic0: SMC update started
mic0: SMC update done
mic0: Resetting

Please restart host for flash changes to take effect
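Since the only visible difference between the failed and the successful run is the "Flash operation timed out" line, it can help to save the micflash output to a file and check it before rebooting. The following is a rough sketch under that assumption: timed_out_devices is a hypothetical helper, and it relies on the message format shown in the output above.

```shell
# Hypothetical helper: list the devices for which micflash reported a timeout.
# $1 is a file holding the captured micflash output.
timed_out_devices() {
    sed -n 's/^micflash: \(mic[0-9][0-9]*\): Flash operation timed out.*$/\1/p' "$1" | sort -u
}
```

For example, after saving the output of the update command to micflash.log, timed_out_devices micflash.log prints nothing if every card was updated cleanly, and otherwise prints the names of the cards that still need the power cycle and the second stage.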

Now I can monitor our cards, and micinfo shows full information about them.

Thanks for keeping us in the loop about the steps you took to fix your problem! I had the exact same issue with the Phi constantly reconnecting. According to the installation notes on Intel's website, my system didn't need the smcbootloader part:

http://registrationcenter.intel.com/irc_nas/4110/readme.txt

Running "micflash -update -device all" and rebooting did the job!

Taylor Kidd (Intel):

Thank you for adding your update to the community.

 
