Flash version too old?

Hi,

The flash version on our Intel Xeon Phi appears to be 2.1.01.0372. According to the MPSS readme, one of the prerequisites for updating the flash is "Starting version of Flash must be >= 375, if not, contact your Intel support representative". I have two questions regarding this:

1. Is it safe to continue flashing the Xeon Phi in spite of the older flash version currently on the device? If not, what should be done?

2. If I do not want to update the flash, what problems would I face? Can I continue using the 372 version without issues?

I have pasted the output of miccheck and micinfo below for your reference. Please let me know if you need any more info from me.

Thanks,

Prasanna.

miccheck 2.1.6720-15, created 11:31:20 Jun 21 2013
Copyright 2011-2013 Intel Corporation All rights reserved

Test 1 Ensure installation matches manifest : FAILED
Test 2 Ensure host driver is loaded : OK
Test 3 Ensure driver matches manifest : OK
Test 4 Detect all listed devices : OK
MIC 0 Test 1 Find the device : OK
MIC 0 Test 2 Check the POST code via PCI : OK
MIC 0 Test 3 Connect to the device : OK
MIC 0 Test 4 Check for normal mode : OK
MIC 0 Test 5 Check the POST code via SCIF : OK
MIC 0 Test 6 Send data to the device : OK
MIC 0 Test 7 Compare the PCI configuration : OK
MIC 0 Test 8 Ensure Flash version matches manifest : FAILED
MIC 0 Test 8> Flash version mismatch. Manifest: 2.1.03.0386, Running: 2.1.01.0372
Status: Test failed
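
The Test 8 line above carries both the manifest and the running flash version. As a rough sketch (the helper names `flash_build`, `flash_meets_minimum` and `running_version` are made up for illustration and are not part of MPSS), the trailing build number can be pulled out and compared against the 375 minimum quoted from the readme:

```shell
#!/bin/sh
# Hypothetical helper: print the trailing build number of a flash version
# string such as "2.1.01.0372".
flash_build() {
    build=${1##*.}                                 # text after the last dot, e.g. "0372"
    build=$(printf '%s' "$build" | sed 's/^0*//')  # drop leading zeros for a clean decimal value
    printf '%s\n' "${build:-0}"
}

# Hypothetical helper: succeed if the build meets the minimum
# (default 375, the figure quoted from the readme).
flash_meets_minimum() {
    [ "$(flash_build "$1")" -ge "${2:-375}" ]
}

# Hypothetical helper: extract the "Running:" version from a miccheck
# mismatch line.
running_version() {
    printf '%s\n' "$1" | sed -n 's/.*Running: *\([0-9.]*\).*/\1/p'
}

line='MIC 0 Test 8> Flash version mismatch. Manifest: 2.1.03.0386, Running: 2.1.01.0372'
if flash_meets_minimum "$(running_version "$line")"; then
    echo "flash build meets the readme minimum"
else
    echo "flash build is below the readme minimum"
fi
```

This only compares numbers; it says nothing about whether flashing from an older build is safe.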

micinfo output:

MicInfo Utility Log

Created Fri Jul 19 16:34:19 2013

System Info
HOST OS : Linux
OS Version : 3.2.0-4-amd64
Driver Version : 6720-15
MPSS Version : NotAvailable
Host Physical Memory : 32988 MB

Device No: 0, Device Name: mic0

Version
Flash Version : NotAvailable
SMC Firmware Version : NotAvailable
SMC Boot Loader Version : NotAvailable
uOS Version : NotAvailable
Device Serial Number : NotAvailable

Board
Vendor ID : 0x8086
Device ID : 0x2250
Subsystem ID : 0x2500
Coprocessor Stepping ID : 3
PCIe Width : x16
PCIe Speed : 5 GT/s
PCIe Max payload size : 256 bytes
PCIe Max read req size : 512 bytes
Coprocessor Model : 0x01
Coprocessor Model Ext : 0x00
Coprocessor Type : 0x00
Coprocessor Family : 0x0b
Coprocessor Family Ext : 0x00
Coprocessor Stepping : B1
Board SKU : NotAvailable
ECC Mode : NotAvailable
SMC HW Revision : NotAvailable

Cores
Total No of Active Cores : 60
Voltage : 0 uV
Frequency : 1052631 kHz

Thermal
Fan Speed Control : NotAvailable
Fan RPM : NotAvailable
Fan PWM : NotAvailable
Die Temp : NotAvailable

GDDR
GDDR Vendor : Elpida
GDDR Version : 0x1
GDDR Density : 2048 Mb
GDDR Size : 7936 MB
GDDR Technology : GDDR5
GDDR Speed : 5.000000 GT/s
GDDR Frequency : 2500000 kHz
GDDR Voltage : 0 uV

You really do need to update the flash and SMC. There is an archive page of earlier releases at: http://software.intel.com/en-us/articles/intel-manycore-platform-software-stack-mpss-archive. Download the N-2 version from that page and go through the install steps at least through the flash update (sections 7.2 and 7.3 in the readme). Then try installing the latest version of the MPSS, including updating the flash. Let me know how it goes.

Hi Frances,

Thanks for the reply. I have installed the N-2 version of MPSS and am trying to go through the steps in section 7.2. However, the readme file says "This configuration is required for "SMC Firmware Version 1.7" or earlier. Execute /opt/intel/mic/bin/micinfo to identify the SMC firmware version installed on the card." But when I execute micinfo, it says "NotAvailable" for SMC Firmware Version. How do I find out which SMC firmware version is currently running? Is it safe to continue with the SMC Bootloader update and flash update even if I don't know which versions are currently installed? I don't want to end up bricking the device.

Thanks,

Prasanna.

PS: I have pasted the micinfo output below:

MicInfo Utility Log

Created Sun Jul 21 23:26:48 2013

System Info
HOST OS : Linux
OS Version : 3.2.0-4-amd64
Driver Version : 5889-16
MPSS Version : NotAvailable
Host Physical Memory : 32988 MB

Device No: 0, Device Name: mic0

Version
Flash Version : NotAvailable
SMC Boot Loader Version : NotAvailable
uOS Version : NotAvailable
Device Serial Number : NotAvailable

Board
Vendor ID : 8086
Device ID : 2250
Subsystem ID : 2500
Coprocessor Stepping ID : 3
PCIe Width : x16
PCIe Speed : 5 GT/s
PCIe Max payload size : 256 bytes
PCIe Max read req size : 512 bytes
Coprocessor Model : 0x01
Coprocessor Model Ext : 0x00
Coprocessor Type : 0x00
Coprocessor Family : 0x0b
Coprocessor Family Ext : 0x00
Coprocessor Stepping : B1
Board SKU : NotAvailable
ECC Mode : NotAvailable
SMC HW Revision : NotAvailable

Cores
Total No of Active Cores : 60
Voltage : 0 uV
Frequency : 1052631 KHz

Thermal
Fan Speed Control : NotAvailable
SMC Firmware Version : NotAvailable
FSC Strap : NotAvailable
Fan RPM : NotAvailable
Fan PWM : NotAvailable
Die Temp : NotAvailable

GDDR
GDDR Vendor : Elpida
GDDR Version : 0x1
GDDR Density : 2048 Mb
GDDR Size : 7936 MB
GDDR Technology : GDDR5
GDDR Speed : 5.000000 GT/s
GDDR Frequency : 2500000 KHz
GDDR Voltage : 0 uV

The SMC and flash version numbers are only available from micinfo when the coprocessor is booted. I strongly suspect that you have version 1.7 or earlier because the current N-2 version comes with 1.8.
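
To make that check concrete, here is a minimal sketch that reads micinfo output and reports whether a given field carries a real value. The function name `micinfo_field_available` is hypothetical, and it assumes the "Name : Value" layout seen in the logs pasted in this thread:

```shell
#!/bin/sh
# Hypothetical helper: read micinfo output on stdin and succeed only if the
# named field is present and is not reported as "NotAvailable".
# Assumption: fields look like "Name : Value", as in the logs in this thread.
micinfo_field_available() {
    awk -v name="$1" -F ' : ' '
        {
            key = $1
            gsub(/^[ \t]+|[ \t]+$/, "", key)      # trim padding around the field name
        }
        key == name {
            found = 1
            val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", val)      # trim padding around the value
            if (val != "NotAvailable" && val != "") ok = 1
        }
        END { exit (found && ok) ? 0 : 1 }
    '
}

# Example use (not run here):
#   /opt/intel/mic/bin/micinfo | micinfo_field_available "SMC Firmware Version" \
#       || echo "SMC version not reported; is the coprocessor booted?"
```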

Did you finish the install of the N-2 version of the MPSS?  

Yes, I did. The micinfo output I pasted in my last reply is with the N-2 version. (The current version of intel-mic* packages is 2.1.5889-17.) I have not proceeded to flash the coprocessor yet, because I am using it to run some OpenCL programs and am still slightly apprehensive about bricking the device.

Thanks,

Prasanna.

Hi Frances,

I have run into the same problem as Prasanna. The flash version on my MIC is too old; the output of `miccheck` indicates that:

MIC 0 Test 8> Flash version mismatch. Manifest: 2.1.03.0386, Running: 2.1.01.0372

After seeing your suggestion, I went to "http://software.intel.com/en-us/articles/intel-manycore-platform-softwar..." to check for the MPSS N-2.

But just as Prasanna pointed out in the earlier posts, the readme-en.txt released with MPSS N-2 (http://registrationcenter.intel.com/irc_nas/3156/readme-en.txt) says in Section 7.2:

o PREREQUISITES:
  - Uninstall prior version of MPSS and install new version
  - Starting version of Flash must be >= 375, if not, contact your Intel support representative

Well, the flash version of my MIC is only 372, so I do not meet the requirement for updating my flash via MPSS N-2.

I guess I may need to install an even earlier version of MPSS to update my flash to 375 first, and then install the current version of MPSS (2.1.6720-13) to finally bring the flash up to date.

Any suggestions?

 My System Info
                HOST OS                 : Linux
                OS Version              : 2.6.32-358.el6.x86_64
                Driver Version          : 6720-16
                MPSS Version            : 2.1.6720-16
                Host Physical Memory    : 32777 MB

I have run into exactly the same problem as Prasanna.

I got the MIC card in January 2013, but I only set up a server that supports it a few days ago. When I tried to install the MPSS on the host server, I ran into the same problem of the MIC's flash version being too old. The current version of MPSS requires the flash version to be >= 375 in order to upgrade the flash to 382 (released with mpss_gold_update_3-2.1.6720-16). If the flash version is older than that, it cannot upgrade the flash of the MIC. The output of miccheck indicates that the flash version doesn't meet the requirement of MPSS.

I followed the earlier suggestion and tried the N-2 version of MPSS. Disappointingly, I found that the N-2 version still requires the flash version to be >= 375 before the flash can be updated (please see the section on updating flash & SMC in the readme-en.txt of the N-2 version, mpss_gold_update_3-2.1.6720-13).

It seems that I need an even earlier version of MPSS to upgrade the flash of my MIC card to version 375 (or higher), and can then update the card with the current version of MPSS.
It would be a two-step MPSS upgrade procedure.

I don't know whether my approach is feasible. Dear Frances, do you have any suggestions? Thank you very much.

Some people may have a MIC card with a very old flash version (like mine, which was produced in January 2013). Installing the current version of MPSS seems to be a problem for them.

I have been asked to have you try the following - it is very important to ensure the MPSS does not attempt to start before you try to update the flash. Note that you are asked not just to reboot the host but to disconnect the power, wait a few seconds, then power it up again. Please use the latest version of the MPSS and use the RASMM.elf file that came with that software, not the one I gave you earlier. The person who gave me these instructions was able to successfully update flash that was earlier than version 375 doing this.

1) Disable MPSS from starting on host boot
$ sudo chkconfig mpss off
2) Power-cycle the host (i.e. remove power completely for a few seconds, then re-apply power)
3) Boot host, do not start MPSS
4) Reset card:
$ sudo micctrl -rw
$ sudo micctrl -s
(should show 'ready'; if it shows 'failed', or the host kernel log shows a hang at some POST code, go back to step 2)
5) Flash card
$ sudo /opt/intel/mic/bin/micflash -update -smcbootloader -device all
6) If the above fails, immediately execute micdebug.sh and send back the archive this script creates
$ sudo /opt/intel/mic/bin/micdebug.sh
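
Steps 4 to 6 above could be gated in a small script so the flash is only attempted when the card reports ready. This is a sketch, not an Intel-provided tool: `card_ready` and `reset_and_flash` are hypothetical names, and the check assumes only that the `micctrl -s` text contains "ready" or "failed" as step 4 describes, since the exact output format is not shown in this thread. Steps 1 to 3 (chkconfig, power cycle, boot) still have to be done by hand.

```shell
#!/bin/sh
# Hypothetical helper: decide from the text printed by `micctrl -s` whether
# the card looks ready for the flash step.
# Assumption: the text contains "ready" on success and "failed" otherwise.
card_ready() {
    case "$1" in
        *failed*) return 1 ;;
        *ready*)  return 0 ;;
        *)        return 1 ;;   # anything unrecognised is treated as not ready
    esac
}

# Hypothetical wrapper for steps 4-6. Defined only; it is not invoked here.
reset_and_flash() {
    sudo micctrl -rw || return 1
    status=$(sudo micctrl -s) || return 1
    if ! card_ready "$status"; then
        echo "Card not ready ($status): power-cycle the host and retry from step 2" >&2
        return 1
    fi
    # Step 5; on failure fall through to step 6 straight away.
    sudo /opt/intel/mic/bin/micflash -update -smcbootloader -device all \
        || sudo /opt/intel/mic/bin/micdebug.sh
}
```

Calling `reset_and_flash` by hand after booting the host without MPSS would run the sequence; nothing happens just by loading the file.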

Hi~ Frances,

I carefully followed the instructions you gave, but unfortunately the update failed.

When I ran "sudo /opt/intel/mic/bin/micflash -update -smcbootloader -device all", it failed with:

[root@michost0 ~]# /opt/intel/mic/bin/micflash -update -smcbootloader -device all
No image path specified - Searching: /opt/intel/mic/flash
mic0: Flash image: /opt/intel/mic/flash/EXT_HP2_B1_0386-03.rom.smc
mic0: SMC boot-loader image: /opt/intel/mic/flash/EXT_HP2_SMC_Bootloader_1_8_4326.css_ab
mic0: SMC boot-loader update started
micflash: mic0: SMC update failed: SMC buffer size exceeded (0x1)

mic0: Resetting

Then, as the instructions required, I immediately executed micdebug.sh. However, I got the following result:

[root@michost0 ~]# /opt/intel/mic/bin/micdebug.sh
'micdebug.sh' Version 1.3.0
micdebug.sh: Using folder '/tmp' for temporary data storage...
Saving the host dmesg output...
Copying messages log, mpssd log, '/etc/modprobe.d/mic.conf'...
Saving shell environment...
Copying distribution release information...
Copying '/etc/selinux/config' if present...
Gathering host network data (ifconfig and /etc/sysconfig/network*/ifcgf*.conf)...
Saving 'uname -a' informations...
Found ICC, recording version information (icc -V)...
Saving RPM package list (rpm -qa)...
Saving list of running processes (ps -aef)...
Saving 'lsmod' output for the host...
Saving BIOS information (dmidecode)...
'iptables' firewall service found. Recording current status...
Detected NetworkManager, getting its status...
/opt/intel/mic/bin/micdebug.sh: line 342: [: try2: integer expression expected
Gathering debug data...
Running 'service mpss status'...
The 'mpss' service is stopped.
Running 'service micras status'...
The 'micras' service is stopped.
Running an undetailed 'lspci'...
Running 'micctrl -s mic0'...
Gathering software configuration data...
Copying /etc/sysconfig/mic/default.conf...
Copying /etc/sysconfig/mic/conf.d/\*.conf files...
For 'mic0'...
Listing recursively the coprocessor filesystem folder (ls -lARh)...
Copying filelist file (using cp)...
Copying all passwd, group and hosts files...
Copying /etc/sysconfig/mic/mic0.conf file (using cp)...
Checking and validating ssh keys (using ls and diff)...
Crash dumps were not requested (use the '--include_crash_dumps' option)...
Capturing micN to PCIe addresses...
Getting PCIe addresses for all coprocessors (using lspci)...
Saving detailed (lspci -vvvv) information for only coprocessors...
Saving full lspci -t tree information...
Saving full 'lspci' output for the entire PCIe tree...
===================================================================
Packaging TAR file to send to your system support representative...
 * If crash dumps were enabled and present then this will take a long while...
 * This will list the contents of the archive file...

/opt/intel/mic/bin/micdebug.sh: line 801: [: micdebug_20130910_: binary operator expected
FATAL ERROR: micdebug.sh: The compressed tar file 'micdebug_20130910_ 84625utc.tgz' was NOT created!
     Are you out of disk space in '/tmp'?
     Have permissions to '/tmp' changed for 'root'?
     For security reasons, you will have to re-run this script!
Cleaning up...

To check whether the MIC had been updated, I ran miccheck and micinfo; the output is included in the attachments. It seems that the update failed :(

Any suggestions? Thank you very much.

Is there a file in /tmp called micdebug_20130910_84625utc*? If so, can you compress it and upload it to this forum thread? Can you also check whether you are out of disk space in /tmp (or nearly so), verify that the permissions are correct, and make sure there are no disk issues (I/O errors) causing problems with writing data there (check your /var/log/messages)? It seems that there were errors in the execution of micdebug itself, but also potential issues with space or permissions on /tmp.

Belinda Liviero
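
The /tmp checks requested above can be run in one go. A minimal sketch, with `check_tmp_dir` being a made-up helper name and `mktemp` assumed to be installed (it is on typical Linux hosts):

```shell
#!/bin/sh
# Hypothetical helper: check that a directory exists and accepts a new file,
# then report the free space and permissions seen for it.
check_tmp_dir() {
    dir=${1:-/tmp}
    if [ ! -d "$dir" ]; then
        echo "$dir: not a directory" >&2
        return 1
    fi
    # Creating and removing a probe file exercises permissions and basic I/O.
    probe=$(mktemp "$dir/probe.XXXXXX") || {
        echo "$dir: cannot create files" >&2
        return 1
    }
    rm -f "$probe"
    # With -P -k, the fourth column of the second line is the available
    # space in 1024-byte blocks.
    df -Pk "$dir" | awk 'NR == 2 { print "available KiB: " $4 }'
    ls -ld "$dir"
}

# Example use (not run here):
#   check_tmp_dir /tmp
```

I/O errors would still need a look at /var/log/messages; this only covers space and permissions.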

Thank you BELINDA,

I checked the output; it seemed that the "date" program on my CentOS 6.2 had a bug that resulted in an invalid filename in micdebug.sh. I modified micdebug.sh to work around that bug. I got the tar file and attached it below.

Please help me to check what is going wrong when I updated the Flash of my MIC card. Thank you very much!

Thank you Belinda,

It seems that the "date" program on CentOS 6.2 has a bug that generates invalid date information. I specified a valid filename in micdebug.sh to get the tar.gz file. The file is attached below; could you help me see what is going wrong when I try to update the flash? Thank you very much!

P.S.: the version of my MIC flash is 372 (2.1.01.0372).
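
For reference, the failing archive name in the log ('micdebug_20130910_ 84625utc.tgz') contains a space where the hour should be. One possible explanation, and it is only an assumption since the relevant part of micdebug.sh is not shown in this thread, is a space-padded hour field in the timestamp. A sketch of a timestamp that cannot contain spaces (`archive_name` is a hypothetical helper):

```shell
#!/bin/sh
# Hypothetical helper: build an archive name from a UTC timestamp in which
# every field is zero-padded, so the name never contains a space.
archive_name() {
    stamp=$(date -u +%Y%m%d_%H%M%S) || return 1
    printf 'micdebug_%sutc.tgz\n' "$stamp"
}

name=$(archive_name)
# Quoting "$name" keeps the test valid even if a space did slip in; an
# unquoted variable with a space is what produces errors like
# "[: binary operator expected".
if [ -n "$name" ]; then
    echo "would write: /tmp/$name"
fi
```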

Hi Dale,

The recommendation we are getting is to attempt to flash again immediately after you see the buffer error (micflash: mic0: SMC update failed: SMC buffer size exceeded (0x1)), followed by a reboot and then a 'micdebug' run.

Belinda Liviero

Hi Dale,

Did this process work for you?

Belinda Liviero

Hi all. Sorry to reawaken an old thread, but has anyone found a good solution for flashing out-of-date MIC firmware yet?

We’ve (Allinea) just removed a relatively up-to-date A0 revision Phi from one of our servers, and replaced it with a spare C0 that’s been waiting for a few months.

I tried the commands that Frances Roth posted, but get:

VERSION: Copyright 2011-2012 Intel Corporation All Rights Reserved.
VERSION: 4982-15

Intel(R) Xeon Phi(TM) Coprocessor - 0
Flash update : Failed; Reason: Intel(R) Xeon Phi(TM) Coprocessor stack initialization failed
Device status : HW ready

micdebug.sh doesn’t exist. I’ve attached micinfo and miccheck output.

Thanks.

Attachment: mic0.txt (text/plain, 3.76 KB)

Hi Gareth, MPSS 2.1.4982-15 is quite old, and was released even before there was a C0 stepping, so this may be the cause of the failure you are experiencing. I recommend upgrading to one of the latest releases posted here: https://software.intel.com/en-us/articles/intel-manycore-platform-softwa...

Also, when you install the newer MPSS, you might want to do a clean configuration. From the error messages in your micinfo output, there is at least an IP address conflict.

Thanks for the suggestions; I hadn’t realized that the MPSS was so old. I shall have to hold off on further changes to the machine for now as it is needed for a demo this week, for which it apparently works well enough. Once it goes back into service as a MIC testing box we’ll get back onto it.
