Vectorization

"-collect-with runsa -knob event-config" only works with Basic Performance Tuning Events

For example

event-config=CPU_CLK_UNHALTED.THREAD

works fine. But for many other events, such as MEM_UNCORE_RETIRED.REMOTE_DRAM,

amplxe-cl gives an error like:

amplxe: Error: Cannot configure sampling event groups. The collection is terminated.

Could anyone help? Thanks

New Jim Dempsey article: Elusive Algorithms – Parallel Scan

 

I haven't seen a notification of this elsewhere, so: the ever-knowledgeable Jim Dempsey (QuickThreadProgramming.com) just published one of his great technical articles, entitled "Elusive Algorithms – Parallel Scan".

I believe this was an outgrowth of another discussion on the forums, "how to perform inclusive scan in C cilk".

--
Taylor
 

Error getting OFED to compile when mic is selected.

Compiling OFED with "--with-xeon-phi" and "--all" fails when compiling compat-rdma.
Compiling without the "--with-xeon-phi" option works.


./install.pl --with-xeon-phi --all

# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 7.1 (Maipo)

# uname -r
3.10.0-229.1.2.el7.x86_64

rpm -qi mpss-sdk-k1om-3.5-1.x86_64
Name        : mpss-sdk-k1om
Version     : 3.5
Release     : 1
Architecture: x86_64
Install Date: Thu 09 Apr 2015 10:00:28 PM CDT
Group       : base
Size        : 484359036
License     : various
Signature   : DSA/SHA1, Thu 02 Apr 2015 06:57:59 AM CDT, Key ID 718a1696ef328191

Elusive Algorithms - Parallel Scan

Last month there was a query on the IDZ MIC forum, "how to perform inclusive scan in C cilk", to which my initial reply was:

Parallelizing this is problematic due to the next result being dependent upon the prior result. While this is not impossible, it is rather difficult and it introduces some redundant additions.
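To make the dependency and the "redundant additions" concrete, here is a minimal sketch of one common way to split an inclusive scan into blocks. It is not necessarily the method used in the article; the function name blocked_inclusive_scan, the double element type, and the use of OpenMP pragmas (instead of Cilk) are assumptions made for illustration. Each block is scanned independently, the block totals are then scanned serially, and a final pass adds each block's offset to its elements. That last pass is the extra work a serial scan does not need.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: inclusive prefix sum of in[0..n) into out[0..n),
 * processed as nblocks contiguous chunks. block_sum is caller-provided
 * scratch space with room for nblocks values.
 * The OpenMP pragmas are simply ignored if OpenMP is not enabled. */
void blocked_inclusive_scan(const double *in, double *out, size_t n,
                            double *block_sum, size_t nblocks)
{
    assert(nblocks > 0);
    size_t chunk = (n + nblocks - 1) / nblocks;   /* ceil(n / nblocks) */

    /* Pass 1: each block computes its own local scan; blocks are independent. */
    #pragma omp parallel for
    for (long b = 0; b < (long)nblocks; b++) {
        size_t lo = (size_t)b * chunk;
        size_t hi = (lo + chunk < n) ? lo + chunk : n;
        double acc = 0.0;
        for (size_t i = lo; i < hi; i++) {
            acc += in[i];
            out[i] = acc;
        }
        block_sum[b] = acc;                       /* total of this block */
    }

    /* Pass 2 (serial): turn block totals into offsets, i.e. the sum of
     * all earlier blocks. This is where the dependency is resolved. */
    double carry = 0.0;
    for (size_t b = 0; b < nblocks; b++) {
        double total = block_sum[b];
        block_sum[b] = carry;
        carry += total;
    }

    /* Pass 3: add each block's offset to its elements. These are the
     * redundant additions; block 0 has offset 0 and is skipped. */
    #pragma omp parallel for
    for (long b = 1; b < (long)nblocks; b++) {
        size_t lo = (size_t)b * chunk;
        size_t hi = (lo + chunk < n) ? lo + chunk : n;
        for (size_t i = lo; i < hi; i++)
            out[i] += block_sum[b];
    }
}
```

With n elements this performs roughly 2n additions instead of n, so it only pays off when enough blocks run concurrently to make up for the doubled work.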

Parallelization of dyadic product

Hi,

I have two vectors (they can address the same vector) and I need to perform the product x[i]*y[j] with i,j=1..n.

What is the best way to perform this operation in parallel? I've tried

cilk_for(h=0;h<n*n;h++)r[h]=x[h/n]*y[h%n];

but I guess it is only a naive attempt. Indeed, vec-report says it is inefficient.

Thanks.

Fabio
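One possible restructuring, as a sketch under stated assumptions rather than a definitive answer: the flat loop above needs an integer divide and a modulo per element, which likely makes it hard for the compiler to vectorize. Splitting it into a row loop and a unit-stride inner loop removes both. The function name outer_product, the double element type, and the OpenMP pragma (used here in place of cilk_for so the example builds with common compilers) are assumptions; r is assumed not to overlap x or y, while x and y may be the same vector.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: r is an n-by-n row-major array with
 * r[i*n + j] = x[i] * y[j], matching r[h] = x[h/n] * y[h%n]. */
void outer_product(const double *x, const double *y,
                   double *restrict r, size_t n)
{
    assert(n == 0 || (x != NULL && y != NULL && r != NULL));

    /* Rows are independent, so the outer loop can be distributed across
     * threads. The pragma is ignored if OpenMP is not enabled. */
    #pragma omp parallel for
    for (long i = 0; i < (long)n; i++) {
        const double xi = x[i];            /* constant over the whole row */
        double *row = r + (size_t)i * n;

        /* Unit-stride loop with no divide or modulo: a good candidate
         * for compiler auto-vectorization. */
        for (size_t j = 0; j < n; j++)
            row[j] = xi * y[j];
    }
}
```

Calling outer_product(x, x, r, n) covers the case where both inputs address the same vector. For large n the loop is probably limited by the n*n stores to r rather than by the multiplications.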

 

PARDISO consistent crash for INCORE RUN

Hi, I am trying to solve several big 3D solid FE models with PARDISO 11.2.

Although the out-of-core run is successful, I am consistently getting segmentation fault errors for the in-core runs.

This also happens with pardiso_64, and with cpardiso when only one MPI process is used.

With more than one MPI process, the run is successful.

The error is reproducible and occurs for almost all big models which I have tried.

Thanks

Kostas
