Parallel Computing

VTune amplxe-gui freezes, asks to send report or terminate

Hello,

I am trying to profile a Fortran program with VTune Amplifier. I created a report using

amplxe-cl -collect general-exploration -result-dir vtune_results_smaller exec_name exec_001.rc

where 'exec_001.rc' is a config file read by exec_name. The report collection seemed to go fine, but when I tried to view the results using 

Multithreading when called by C++, not when called by R

Hi everyone, I have been struggling with a problem for quite some time, and I would greatly appreciate your input.

I have a program that calls Intel MKL's dgemm function many times. In fact, to demonstrate my problem, I used exactly the same code as in this dgemm tutorial: https://software.intel.com/en-us/node/429920.

Varying Intel MPI results using different topologies

Hello,

I am compiling and running a massive electronic structure program on an NSF supercomputer. I am compiling with the intel/15.0.2 Fortran compiler and impi/5.0.2, the latest installed version of the Intel MPI library.

The program has hybrid parallelization (MPI and OpenMP). When I run the program on a molecule using 4 MPI tasks on a single node (no OpenMP threading anywhere here), I obtain the correct result.

However, when I spread the 4 tasks across 2 nodes (still 4 tasks in total, just 2 on each node), I get what seem to be numerical or precision-related errors.

Xeon Phi and offload from MATLAB MEX file

Hello,

I am having a really hard time figuring out how to use the Xeon Phi offload mode from within MATLAB MEX files under Linux. I have managed to force MATLAB to use icc for compilation and verified that the MEX files run fine. The problems start when I use the offload pragma. As far as I can tell, nobody has tried that yet, and I suspect this is some (fixable?) issue with libraries. Can someone here help me with this?

Consider the following simple code

Const data declared globally in C is allocated in the DATA region instead of the RODATA region. Any solutions?

Hi Team,

I have written two test cases, example1.c and example2.c

*****************************************************************************************

example1.c::::

#include <stdio.h>

const char buf[11] = "HelloWorld";

int foo(int value) {
    return value + 1;
}
