Hi,
I am trying to run the GAMESS code on MIC.
I successfully compiled the code for MIC only, but when I run it on the MICs it runs for a few seconds and then shows the error below ...
I'm trying to get MPICH (3.0.3) and SCIF working.
I'm using the tests from osu_benchmarks (from the mvapich2 tarball) as a set of sanity checks, and I'm running into some unexpected errors.
One example: running osu_mbw_mr works sometimes and then fails on the next try. The printout from two successive runs as well as the hosts file are below.
The compiler is the latest icc (13.1.1), with the latest MPSS (2-2.1.5889-14) on CentOS 6.4.
For some reason the SCIF interface in my compute nodes is refusing connections. Any ideas on what's wrong or where to start investigating:
The node has a Mellanox ConnectX-3 HCA with the latest Gold Update 2 MPSS and everything else set up "by the book". All the IB services and modules load nicely and seem to work and I can ssh into the MIC and run natively.
However, if I try to run an offload (LEO or OpenCL) application it hangs. Doing an strace reveals the following:
After an upgrade of a node from MPSS Gold Update 1 to Update 2 I have had issues with the frontend node in our cluster crashing on boot. I tried to downgrade back to Update 1 but it still keeps happening.
We have upgraded the compute nodes successfully. They have identical hardware and a bridged network configuration. The frontend has the default configuration in /etc/sysconfig/mic.
The host OS is CentOS 6.3 and the card model is 5110P (B1)
On the host side we get the following error during boot:
Hello,
I wrote this test code:
#include <stdio.h>
#include <unistd.h>   /* declares getcwd() */
#include "offload.h"
int main()
{
char cdir[128];
int ndevices, devnum;
getcwd(cdir,sizeof(cdir));
ndevices = _Offload_number_of_devices();
devnum = _Offload_get_device_number();
printf("\n Hello...%s %d %d \n",cdir,ndevices,devnum);
return 0;
}
Compiling it with
icc -o hello hello.c -loffload
succeeds.
However, when I compile it as
icc -o hello hello.c -loffload -mmic
To get a better idea of MIC's single core, single threaded performance, I tried the following simple experiment:
The following is a simple, unvectorized code in which I take two vectors "arr1" and "arr2" of length LENGTH and multiply their corresponding elements with each other, LOOP times. I have kept LENGTH short enough that both vectors fit in the L1 cache, so this shouldn't be memory bound. For example: LOOP = 1000000 and LENGTH < 256 (should fit within the L1 cache).
I compiled without using any optimization flags.
Dear All,
I am having a problem using the _Cilk_shared keyword. It does not work for bigger arrays. The same program works well with a small array size/data.
I get either the following error or an error suggesting that I increase the memory map area.
CARD--ERROR:1 myoiPageFaultHandler: 0x7fffff22a788 Out of Range!
CARD--ERROR:1 _myoiPageFaultHandler: 0x7fffff22a788 switch to default signal handle
CARD--ERROR:1 Segment Fault!
What can we do so that we can use _Cilk_shared with big data (larger arrays)?
Thanks,
Jesmin
As the title says: for example, in section "2.1.12 Host and Intel® MIC Architecture Physical Memory Map" the figure is unreadable, and the hierarchy of the list below the figure is also broken. By the way, the overall quality of this document is not as good as Intel's 3-volume software developer's manual. Do you have plans to fix this document? Thanks!
#include <stdlib.h>
#include <malloc.h>
#pragma offload_attribute(push, target(mic))
#include <stdio.h>
float *h;// *t;
int bytes, x, y, z;
#pragma offload_attribute(pop)
__attribute__((target (mic))) float *t;
__declspec(target (mic)) void memTest();
__declspec(target (mic)) void memTest() {
int j;
for(j=0; j<bytes; j++)
t[j] = h[j] + 1.0;
}
int main()
{
int i;
x = y = z = 2;
bytes = x*y*z;
I started to play with Automatic Offload (AO), using the example code dgemm_with_timing.F (attached). With MKL_MIC_ENABLE=1, OFFLOAD_REPORT=2, and a matrix size of M/N=4000, which should be large enough for AO, the code should offload automatically and print the offload info, but I didn't see the report. Isn't OFFLOAD_REPORT=2 supposed to set the offload profiling report level for any offload, including Intel MKL AO? Or is it possible that the code is not offloaded at all? The timing does not vary much with the different MIC_OMP_NUM_THREADS values I specified, so that could be the case. What did I miss?
I compiled with