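Videos - Parallel Programming with Intel Xeon Phi Coprocessors

Colfax International recently released the following set of videos about Intel(R) Xeon Phi(TM) coprocessors.

This video covers the software tools that are required and recommended for developing applications for Intel Xeon Phi coprocessors. We begin with the software needed to boot the coprocessor and run precompiled executables.

Will My Application Benefit from the MIC Architecture?
In this video, we discuss the types of applications that can run efficiently on Intel Xeon Phi coprocessors. I hope this overview helps you answer the question "Will my application benefit from the MIC architecture?"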

Videos - Parallel Programming and Optimization with Intel Xeon Phi Coprocessors

Here is a set of introductory videos from Colfax International on Parallel Programming and Optimization with Intel(R) Xeon Phi(TM) Coprocessors.

In this video episode we will introduce Intel Xeon Phi coprocessors based on the Intel Many Integrated Core, or MIC, architecture and will cover some of the specifics of hardware implementation.

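Videos - Parallel Programming and Optimization with Intel Xeon Phi Coprocessors

Here is a set of videos from Colfax International on parallel programming and optimization with Intel(R) Xeon Phi(TM) coprocessors.

Episode 2.1 - Purpose of the MIC Architecture
In this video, we introduce Intel Xeon Phi coprocessors based on the Intel Many Integrated Core (MIC) architecture and cover some specifics of the hardware implementation.

Episode 2.2 - Intel MIC Architecture Details
This video covers the general properties of the Intel MIC architecture in detail and then focuses on vector instruction support.

Episode 2.3 - Vector Instruction Support in Intel Architectures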

CentOS 7 + MPSS 3.4.x + OFED 3.1x: Bug in ibp_server?


I'm currently in the process of setting up the OS for a diskless cluster with two Xeon Phi Cards per host.

Currently working with CentOS 7.0, MPSS 3.4.3, OFED 3.12-1 and Lustre 2.7.0.

Installing and booting the host and both Xeon Phis works fine so far, except that as soon as I try to load Lustre (using o2ib) on the second Xeon Phi, the complete system crashes due to an error within the ibp_server module (logs can be found a. With only one Xeon Phi, Lustre works fine, including mounting over InfiniBand.

Regarding Intel MIC offload error: buffer write failed

I am trying to explore the code offloading construct. In the following program, the offloaded region fetches the architecture of the MIC card.
void main()
  FILE *fp, *fp1;
  char data[100], data1[100], final[100];
#pragma offload target(mic: 0) inout(data, fp)
  fp = popen("uname -m", "r");
  fread(data, sizeof(char), 100, fp);
Here are three sample runs of this program:
  • The first run succeeds,
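
For comparison, here is a minimal sketch of the same idea that transfers only the character buffer and keeps the FILE pointer local to the offloaded block. It rests on an assumption, not a confirmed diagnosis: that passing `fp` in the `inout` clause is related to the "buffer write failed" error, since a FILE pointer obtained on one side is not meaningful on the other. The helper name `fetch_arch` and the constant `ARCH_LEN` are hypothetical. The pragma is guarded by `__INTEL_OFFLOAD`, so with a compiler that has no offload support the block simply runs on the host.

```c
#define _POSIX_C_SOURCE 200809L  /* for popen/pclose under strict C modes */
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ARCH_LEN 100  /* hypothetical buffer size, matching data[100] above */

/* Hypothetical helper: fills `arch` with the output of `uname -m`.
   Returns the number of bytes read, or -1 if popen failed. */
int fetch_arch(char *arch)
{
    int nread = -1;

    assert(arch != NULL);
    memset(arch, 0, ARCH_LEN);  /* zero-fill so the result stays terminated */

#ifdef __INTEL_OFFLOAD
#pragma offload target(mic : 0) inout(arch : length(ARCH_LEN)) inout(nread)
#endif
    {
        /* The stream is opened, read, and closed on the same side. */
        FILE *fp = popen("uname -m", "r");
        if (fp != NULL) {
            nread = (int)fread(arch, sizeof(char), ARCH_LEN - 1, fp);
            pclose(fp);
        }
    }

    /* Strip the trailing newline printed by uname. */
    arch[strcspn(arch, "\n")] = '\0';
    return nread;
}
```

Calling `fetch_arch(data)` with a `char data[ARCH_LEN]` buffer and printing `data` afterwards shows which side executed the block.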

Can AVX instructions be executed in parallel?


Can two AVX instructions be executed in parallel?

For example,


            a1= _mm256_load_ps((Rin +offset)); 
            a2= _mm256_load_ps((Gin +offset));  
            a3= _mm256_load_ps((Bin +offset));

            ac0 = _mm256_mul_ps(a1, in2outAvx_11); 
            ac1 = _mm256_mul_ps(a2, in2outAvx_12);
            ac2 = _mm256_mul_ps(a3, in2outAvx_13);
            z0 = _mm256_add_ps(ac0,ac1);
            z1 = _mm256_add_ps(z0, ac2);
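
To make the data dependencies in the snippet above explicit, here is a scalar C sketch of the same arithmetic, one element at a time instead of eight. The function name `mix_channels` and the parameters `c11`, `c12`, `c13` (standing in for `in2outAvx_11` to `in2outAvx_13`) are hypothetical. The three products do not depend on each other, while the two additions form a chain: the second needs the result of the first. Whether a particular CPU actually issues the independent multiplies in the same cycle depends on its microarchitecture, which this sketch does not address.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar reference for out[i] = r[i]*c11 + g[i]*c12 + b[i]*c13. */
void mix_channels(const float *r, const float *g, const float *b,
                  float *out, size_t n,
                  float c11, float c12, float c13)
{
    assert(r != NULL && g != NULL && b != NULL && out != NULL);

    for (size_t i = 0; i < n; ++i) {
        /* These three products have no data dependence on one another. */
        float p0 = r[i] * c11;
        float p1 = g[i] * c12;
        float p2 = b[i] * c13;

        /* z0 needs p0 and p1; the final sum needs z0 and p2. */
        float z0 = p0 + p1;
        out[i] = z0 + p2;
    }
}
```

A scalar version like this can also serve as a reference when checking the output of the AVX code.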

Parallel Image Processing in OpenMP - Image Blocks

I'm doing my first steps in the OpenMP world.

I have an image I want to apply a filter on.
Since the image is large, I wanted to break it into non-overlapping parts and apply the filter to each part independently in parallel.
Namely, I'm creating 4 sub-images that I want to process in different threads.

I'm using Intel IPP for the handling of the images and the function to apply on each sub image.
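
As a rough illustration of the block decomposition described above, the following sketch splits an 8-bit grayscale image into 4 non-overlapping bands of rows and processes each band in its own loop iteration, which OpenMP may assign to different threads. It does not use Intel IPP: the per-block filter `invert_rows` is a hypothetical stand-in (a simple pixel inversion), and `filter_in_blocks` and `NUM_BLOCKS` are illustrative names as well. Without OpenMP enabled at compile time, the pragma is ignored and the blocks run sequentially with the same result.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_BLOCKS 4  /* number of non-overlapping row bands */

/* Hypothetical per-block filter: inverts pixels in rows [row_begin, row_end). */
static void invert_rows(unsigned char *img, size_t width,
                        size_t row_begin, size_t row_end)
{
    for (size_t y = row_begin; y < row_end; ++y)
        for (size_t x = 0; x < width; ++x)
            img[y * width + x] = (unsigned char)(255 - img[y * width + x]);
}

/* Applies the filter to each band; bands share no pixels, so no locking is needed. */
void filter_in_blocks(unsigned char *img, size_t width, size_t height)
{
    assert(img != NULL);

    #pragma omp parallel for
    for (int b = 0; b < NUM_BLOCKS; ++b) {
        /* Integer arithmetic yields contiguous bands that cover every row once. */
        size_t row_begin = height * (size_t)b / NUM_BLOCKS;
        size_t row_end = height * (size_t)(b + 1) / NUM_BLOCKS;
        invert_rows(img, width, row_begin, row_end);
    }
}
```

This works as written because the stand-in filter touches each pixel in isolation. A neighborhood filter would also need to read rows just outside its band, which the sketch does not handle.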

I described the code here:

Abaqus/Standard Performance Case Study on Intel® Xeon® E5-2600 v3 Product Family


The whole point of simulation is to model how a design, and potential changes to it, behaves under various conditions, to determine whether we are getting the expected response. Simulation in software is far cheaper than building hardware, performing a physical simulation, and modifying the hardware model each time.
