mpss-3.2.3 installation problem

I have a problem installing mpss-3.2.3 on CentOS.

Per the 3.2.3 readme (after stopping and unloading the older version of mpss), I ran uninstall.sh from 3.2.3. The readme showed using the 3.2.3 uninstall script to remove the earlier version.

The uninstall seemed to work properly.

I get an error when installing mpss-3.2.3:

Transaction Check Error:
  file /etc/modprobe.d/mic.conf from install of mpss-modules-2.6.32-358.el6.x86_64-3.2.3-1.el6.x86_64 conflicts with file from package mpss-modules-2.6.32-358.23.2.el6.x86_64-3.1-0.1.build0.el6.x86_64

Any help would be appreciated as I am dead in the water now.

Jim Dempsey

www.quickthreadprogramming.com

My hunch would be that something didn't get uninstalled from the old MPSS release.  With MPSS 3 uninstall scripts I typically have run the uninstall.sh from a repository of the old release before running the installer from the new release.  Generally that script has been a laundry-list of wild-card uninstalls and I wouldn't expect the problem you describe.  Do you have the old MPSS install tree still present?  Have you tried running the uninstall script over there?

I have a machine with MPSS 3.2 that I'm in the process of upgrading to 3.2.3.  On the far side of that, I may have more insights to share.

Robert,

Thanks for the quick response.

In running the mpss 3.1 uninstall.sh (still on system), it deleted an additional 2 files: mpss-metadata*.*

Re-running the 3.2.3 installation still shows same error.

This leads me to believe there is an installation script error.

Note, I am upgrading from 3.1 to 3.2.3 using

sudo yum install *.rpm

from the mpss-3.2.3 directory

Jim Dempsey

www.quickthreadprogramming.com

Hmmmm.  It took me a little longer to follow through because I was using some arcane means (and an ancient machine) to pull the MPSS 3.2.3 tar file off a Windows server and onto my designated Linux host.  This machine did have MPSS 3.2 installed and I proceeded as I described, going to the 3.2 install archive to run the uninstall script before going to the 3.2.3 install tree to run, as you did, a sudo yum install *.rpm.  Everything proceeded without any apparent errors, and I was able to log into my user account on one of the installed coprocessor cards and do an "ls -F" successfully.  No errors.

Of course, I'm running RHEL 6.2 on this machine, versus CentOS, but that shouldn't make a difference.  And I was NOT able to confirm any issues with the installation (or uninstallation) scripts.

Are you able to reinstall MPSS 3.1?  Because the uninstall script generates a yum job, perhaps using the wrong (un)install tree the first time led to an imperfect uninstall that your subsequent attempts didn't rectify?  This is in the realm of pure hunch at this point.  Perhaps if you could successfully reinstall MPSS 3.1 and then properly and completely uninstall it, that might remove any additional cruft that persisted through your previous attempts, and give you a clean shot at installing 3.2.3?  I don't know if this will work, but it's the only thing I can think of, given what I know so far.

The recent install scripts introduced such failures due to remaining rpms from much earlier releases, which weren't removed by their uninstall scripts.  I found it necessary to search for these, e.g. rpm -qa|grep mpss, and remove them specifically by rpm -e.
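The search-and-remove step described above can be sketched as a dry run. This is only an illustration: the function names `filter_mpss_packages` and `print_removal_commands` are made up for this example, and the sketch prints the `rpm -e` commands rather than running them so the list can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; neither is part of MPSS or rpm.

# Keep only package names containing "mpss" (one name per line on stdin).
# Reading stdin keeps the filter separate from rpm itself.
filter_mpss_packages() {
    grep -i 'mpss' || true   # grep exits 1 when nothing matches; treat that as "no leftovers"
}

# Print, but do not run, one removal command per leftover package.
print_removal_commands() {
    while IFS= read -r pkg; do
        if [ -n "$pkg" ]; then
            printf 'sudo rpm -e %s\n' "$pkg"
        fi
    done
    return 0
}
```

On a real host the pipeline would be `rpm -qa | filter_mpss_packages | print_removal_commands`. To see which installed package owns the conflicting file from the original error, `rpm -qf /etc/modprobe.d/mic.conf` should name it.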

Some customers adopted the habit of making a clean linux installation.  I don't know whether this may be necessary to overcome inability to set up permanent password-less ssh in the recent mpss installations.

Tim,

Thanks!!!!!

You too Robert.

I went with Tim's suggestion first; it seemed more straightforward.

rpm -qa|grep mpss

found two mpss files

sudo rpm -e FileNameHere

for each file. Then

sudo yum install *.rpm

Worked (at least to the point of performing the install). Now I can continue with installation (new flash), install newer Parallel Studio, etc...

My system was running real fine on 3.1 and older Parallel Studio and I did not want to tip the canoe (if it ain't broke, don't fix it).

Jim Dempsey

 

www.quickthreadprogramming.com

I was looking around at some old copies of the 3.1.2 release that we have lying around and none of them have build0 in the version name. I don't have a complete set of the 3.1.2 builds, so it might be that I just missed that version, but still, it is curious. Do any of the other rpms (either installed or in your 3.1.2 directory) say build0?

As I say, this is mostly a curiosity, especially since you are off and running now. 

Build0 was the default minor version number for the first few MPSS 3 releases.  I think they had a field without any update mechanism for a while after just having turned the crank to Yocto.  It persisted over several releases but I haven't seen build0 in a while.

Yes, Jim, I agree that rpm -qa|grep is the best way to find errant install modules.  I've used that technique to remove lots of old compiler versions that had not been properly uninstalled.  Hope the rest of your reinstall went as well as the middle.

There were a dozen or so files with build0 in the mpss install folder for 3.1.

I have to install the newer version of Parallel Studio and build and test a project before I can declare "lift off".

Thanks for being there for me.

Jim Dempsey

www.quickthreadprogramming.com

Next problem:

The mic.ko driver is not set to load on boot. I must load it by hand:

sudo insmod -f /lib/modules/2.6.32-358.el6.x86_64/extra/mic.ko

After which

sudo modprobe mic

FATAL: Module mic not found.

However

sudo micctrl -s

mic0: ready

mic1: ready

and I can start and stop the mpss service.

I am now having issues with making SSH connections. Something has changed here too. (....grumble....)

Jim Dempsey

www.quickthreadprogramming.com
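One possible explanation for the modprobe failure above, not confirmed anywhere in this thread: modprobe only searches the module tree of the running kernel, /lib/modules/$(uname -r). The insmod path above is under 2.6.32-358.el6.x86_64, while the old package in the original error was built for 2.6.32-358.23.2.el6.x86_64, so the two may not match on this host. A minimal sketch of a check, with a hypothetical function name `module_present_for_kernel`:

```shell
#!/bin/sh
# Hypothetical helper for illustration; not part of MPSS.
# Succeeds if a module file exists anywhere under the tree for a kernel release.
#   $1: module file name, e.g. mic.ko
#   $2: kernel release, e.g. the output of `uname -r`
#   $3: modules root (optional, defaults to /lib/modules)
module_present_for_kernel() {
    name=$1
    release=$2
    root=${3:-/lib/modules}
    # find prints matching paths; non-empty output means the module is there.
    # A missing release directory yields no output, so the check fails cleanly.
    [ -n "$(find "$root/$release" -name "$name" 2>/dev/null)" ]
}
```

Usage would be something like `module_present_for_kernel mic.ko "$(uname -r)" || echo "mic.ko is not under the running kernel's tree"`. If the file is present and modprobe still cannot find it, running `sudo depmod -a` to rebuild the module dependency list may be worth trying.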

How strange.  I did notice with MPSS 3.2.3 that the entries stuck in the host /etc/hosts file for the coprocessor(s) are now fully qualified domain names rather than just a -micX suffix on the host name.  That might affect SSH but your implication is of something worse.  Do you have more details?

I don't think I've EVER had to manually install a specific Intel MIC kernel object, and I was under the impression that all that happens under the hood with starting service mpss.  I've used chkconfig to enable mpss automatically on boot for the appropriate run levels, and the drivers seem to take care of themselves, even now with 3.2.3 (only one installation of it to go on so far, but it appears to be a good one).  I get the following:

$ micctrl -s
mic0: online (mode: linux image: /usr/share/mpss/boot/bzImage-knightscorner)
mic1: online (mode: linux image: /usr/share/mpss/boot/bzImage-knightscorner)

without any of the hoops you appear to be forced through (-s can be run in micctrl without sudo, though -r and other options that modify configuration still require it).  Are you still having trouble?

Robert,

I get the same report as you do.

I have successfully run a native mode OpenMP app on mic0.

At boot I still have to manually load mic.ko before starting mpss. I will have to figure out where that goes in some startup script.

Also, ssh works now; I had to delete a line entry in both the root and ~ ssh known_hosts files.

While this permits me to get into the mic and ssh to mic, I still have to enter passwords. I would like to eliminate this if possible (I am the only one using the system).

Also, the new installation removed libiomp5.so from the image file, so for now I have to copy it to /tmp (once) before running the application. I will add it to the image file once I find where this is covered in the documentation. Note that the documentation .pdf files have no index, and the table of contents is of little use for this topic. I seem to recall seeing this in the C++ document (but it should be in the installation section too).

As stated in an earlier post, my SOP here is to set up the mic's and just use them for months on end. I don't do (or infrequently do) Linux system admin chores, so I tend to forget system configuration procedures and rely on the readme and quick installation files along with the installation scripts. Installing MPSS 3.1 seemed to have fewer problems than 3.2.3.

Before I install Parallel Studio XE 2013 SP1 Update 2 (replacing Parallel Studio XE 2013 SP1 Update 1) I would like to know if there is anything substantially fixed and/or broken. 2013 SP1 Update 1 has been working fine for now. (I also have 2015 Beta, but I do not know if that will introduce additional problems).

Jim Dempsey

www.quickthreadprogramming.com

Shared objects distributed with the compilers have never been present on a basic MPSS installation.  I think the usual suggested remedy would be to mount all of /opt/intel (or wherever you install the compilers), continually clean out unneeded versions, and avoid installing any 32-bit components.  By not mounting, I am left with the need to copy libiomp5 and libcilkrts (and MPI libraries) to mic0:/lib64/ after each mpss restart and deletion of /root/.ssh/known_hosts.  The compiler shared objects may vary with each update, so it may not be safe to pick a specific version as part of MPSS.
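The copy step described above could be scripted roughly as follows. This is a dry-run sketch under assumptions: the function name `print_runtime_copy_commands` is invented for this example, the source directory depends on where and which compiler version is installed, and the account used must be able to write to /lib64 on the card.

```shell
#!/bin/sh
# Hypothetical helper for illustration; not part of MPSS or the compiler.
# Prints the scp commands that would copy runtime libraries to a card.
#   $1: card host name, e.g. mic0
#   $2: host directory holding the MIC builds of the libraries (install-specific)
#   remaining arguments: library file names, e.g. libiomp5.so
print_runtime_copy_commands() {
    card=$1
    libdir=$2
    shift 2
    for lib in "$@"; do
        # Printed only; review the list, then run the commands by hand.
        printf 'scp %s/%s %s:/lib64/\n' "$libdir" "$lib" "$card"
    done
}
```

For example, `print_runtime_copy_commands mic0 /path/to/mic/libs libiomp5.so` prints a single scp line, where /path/to/mic/libs is a placeholder for the real library directory. Since the copy has to be repeated after each mpss restart, keeping the library list in one place like this makes it easy to rerun.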

The beta compilers have been better than past history would have indicated, and are needed for reasonable support for OpenMP 4.

Thanks Tim,

RE  /root/.ssh/known_hosts 

Which root? Host or MIC?

Jim Dempsey

www.quickthreadprogramming.com
