Professor

New Jim Dempsey article: Elusive Algorithms – Parallel Scan

 

I haven't seen this announced elsewhere, so here it is: the ever-knowledgeable Jim Dempsey (QuickThreadProgramming.com) has just published one of his great technical articles, "Elusive Algorithms – Parallel Scan".

I believe this was an outgrowth of another discussion on the forums, "how to perform inclusive scan in C cilk".

--
Taylor
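
For readers unfamiliar with the term, an inclusive scan replaces each element with the sum of all elements up to and including it. The sketch below is a plain C illustration of a common blocked approach (scan each block, turn the block totals into offsets, then add the offsets). It is not taken from Jim Dempsey's article or from the forum thread; the names `inclusive_scan_serial`, `inclusive_scan_blocked`, and `MAX_BLOCKS` are hypothetical. The per-block loops are written serially here; they are the parts a parallel version would presumably hand to separate threads.

```c
#include <stddef.h>

#define MAX_BLOCKS 64  /* hypothetical upper bound on the number of blocks */

/* Reference serial inclusive scan: out[i] = in[0] + ... + in[i]. */
static void inclusive_scan_serial(const int *in, int *out, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; ++i) {
        sum += in[i];
        out[i] = sum;
    }
}

/* Blocked inclusive scan over nblocks contiguous blocks.
 * Passes 1 and 3 touch each block independently, so they could be
 * distributed across threads; pass 2 is a short serial step. */
static void inclusive_scan_blocked(const int *in, int *out,
                                   size_t n, size_t nblocks)
{
    int block_total[MAX_BLOCKS];

    if (n == 0) return;
    if (nblocks == 0) nblocks = 1;
    if (nblocks > MAX_BLOCKS) nblocks = MAX_BLOCKS;

    size_t chunk = (n + nblocks - 1) / nblocks;  /* ceiling division */

    /* Pass 1: scan each block on its own and record its total. */
    for (size_t b = 0; b < nblocks; ++b) {
        size_t lo = b * chunk;
        block_total[b] = 0;
        if (lo >= n) continue;                   /* empty trailing block */
        size_t hi = (lo + chunk < n) ? lo + chunk : n;
        inclusive_scan_serial(in + lo, out + lo, hi - lo);
        block_total[b] = out[hi - 1];
    }

    /* Pass 2: exclusive scan of the block totals gives each block's offset. */
    int offset = 0;
    for (size_t b = 0; b < nblocks; ++b) {
        int total = block_total[b];
        block_total[b] = offset;
        offset += total;
    }

    /* Pass 3: add each block's offset to its elements (block 0 needs none). */
    for (size_t b = 1; b < nblocks; ++b) {
        size_t lo = b * chunk;
        if (lo >= n) break;
        size_t hi = (lo + chunk < n) ? lo + chunk : n;
        for (size_t i = lo; i < hi; ++i)
            out[i] += block_total[b];
    }
}
```

Calling `inclusive_scan_blocked` on the input 1, 2, ..., 8 with 3 blocks yields 1, 3, 6, 10, 15, 21, 28, 36, the same result as `inclusive_scan_serial`.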
 

Error getting OFED to compile when mic is selected.

Compiling OFED with the "--with-xeon-phi" and "--all" options fails when compiling compat-rdma.
Compiling without the "--with-xeon-phi" option works.


./install.pl --with-xeon-phi --all

# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 7.1 (Maipo)

# uname -r
3.10.0-229.1.2.el7.x86_64

rpm -qi mpss-sdk-k1om-3.5-1.x86_64
Name        : mpss-sdk-k1om
Version     : 3.5
Release     : 1
Architecture: x86_64
Install Date: Thu 09 Apr 2015 10:00:28 PM CDT
Group       : base
Size        : 484359036
License     : various
Signature   : DSA/SHA1, Thu 02 Apr 2015 06:57:59 AM CDT, Key ID 718a1696ef328191

Join the Intel IoT Roadshow 2015 - Sao Paulo

On June 19 and 20, we will hold the Brazilian edition of the Intel IoT Roadshow 2015 at Insper in São Paulo. The roadshow is a series of 20 events taking place in several countries to promote Intel's IoT developer kit.

Run as a hackathon, the event will use the Intel Edison board and the Grove Starter Kit together with Intel's IoT developer kit, a set of open source software and libraries for developing solutions with the Arduino IDE, JavaScript (Node.js), C/C++, Python, and Scratch (via Wyliodrin).

Using MKL to generate random data on Xeon Phi

Hi,

I am using MKL to repeatedly generate large amounts of random data on the Xeon Phi, but the performance is very poor compared with the Xeon CPU (E5620).

The attachment is the original code, and the compile options for the Xeon Phi are -O3 -mkl -mmic. It takes about 115 seconds there, whereas on the Xeon CPU it takes only 3.5 seconds. I do not know why the difference is so large. Am I using the Xeon Phi incorrectly, or is its real performance this poor?

Thank you!

Qiang

 

not vectorizing for no reason

Hi all,

I have isolated a small section of a loop in my code so that I can vectorize it and also test other kinds of optimization (such as alignment).

Here is the actual code.

WORK1(:,:,kk) =  KAPPA_THIC(:,:,kbt,k,bid)  * SLX(:,:,kk,kbt,k,bid) * dz(k)

The optimization report says this:

LOOP BEGIN at loop.F90(91,13)
   remark #15541: outer loop was not auto-vectorized: consider using SIMD directive
   remark #25436: completely unrolled by 8

OpenCL Code Builder and runtime vs MPSS version

Hi,

We would like to provide OpenCL support for the Intel Core and Xeon processors and the Intel Xeon Phi coprocessors on our cluster. In the online documentation I read that "For Intel® Xeon Phi™ coprocessor support, you must install the OpenCL runtime 14.2 here, and the Intel® Manycore Platform Software Stack (Intel® MPSS) 3.3 here".
