Intel® Cilk™ Plus

Parallel Building Blocks - Slick Array Notations Syntax Inside a Cilk ‘for’

In the previous blog, I explained the rationale behind vectorizing with Array Notations inside of the Cilk ‘for’. There are lots of ways to vectorize inside a for loop once you’ve used the Cilk ‘for’, but this colon/bracket syntax has been the easiest way for me to get vectorization without having to think about AVX128 vs. AVX256 at all. Choosing vector widths by hand is simply much harder to wrap your head around than changing your data structures to use a different, compiler-friendly style of expressing code: "Hi compiler, please vectorize this stuff inside my for loop. Thanks!"
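As a rough illustration of the colon/bracket syntax inside a Cilk ‘for’, here is a minimal sketch. The function name `add_rows`, the `PAR_FOR` macro, and the row-major layout are assumptions made for this example, not something taken from the original post. The array section `ar[0:cols]` means "start at index 0, length `cols`". Because Cilk Plus needs a compiler that supports it, the sketch falls back to plain serial loops when the `__cilk` macro is not defined.

```cpp
#include <cassert>

#if defined(__cilk)
#include <cilk/cilk.h>
#define PAR_FOR cilk_for   // rows are distributed across cores
#else
#define PAR_FOR for        // serial fallback for compilers without Cilk Plus
#endif

// Hypothetical helper: a = b + c for a row-major rows x cols matrix.
// The outer loop is the Cilk 'for'; each row is written as one array-section
// statement that the compiler is free to vectorize.
void add_rows(float* a, const float* b, const float* c, int rows, int cols) {
    assert(rows >= 0 && cols >= 0);
    PAR_FOR (int r = 0; r < rows; ++r) {
        float* ar = a + r * cols;
        const float* br = b + r * cols;
        const float* cr = c + r * cols;
#if defined(__cilk)
        ar[0:cols] = br[0:cols] + cr[0:cols];   // array notation: [start:length]
#else
        for (int j = 0; j < cols; ++j) {
            ar[j] = br[j] + cr[j];
        }
#endif
    }
}
```

Note that nothing in the array-section line mentions a vector width; that choice is left to the compiler and its target options.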

Parallel Building Blocks - Vectorize Inside the Cilk 'for' with Array Notations

In the previous blog, I gave a very fast crash course on the Cilk ‘for’. There are a few other ways you can manipulate that for loop, but I will go over those another time. The key is to understand what is possible first, get the basics running and parallelized, and use the more expert features later. I think people new to parallel programming get too caught up in speedups and start using the full gamut of Cilk features. I really feel the best course of action with any of the PBB models is the following:

Parallel Building Blocks – Cilk ‘for’ Primer

So you’ve made the decision to go with a language extension and finish the parallel coding quickly. You don’t have to spend a lot of time thinking about your algorithm and problem with Cilk, and it's not meant for people who do want to spend a lot of time thinking about them. For this PBB model you can simply learn the syntax and usage of the Cilk ‘for’ and quickly convert your serial ‘for’ to get task parallelism across cores. After that you can protect against data races and add vectorization later (I will talk about those things in subsequent blogs/videos).
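A minimal sketch of that conversion, assuming a loop whose iterations are fully independent (so there is no data race to protect against yet). The function name `square_all` and the `PAR_FOR` macro are made up for this example. With a Cilk Plus compiler the macro expands to `cilk_for`; elsewhere it expands to an ordinary `for`, so the same source still builds and runs serially.

```cpp
#include <cassert>
#include <vector>

#if defined(__cilk)
#include <cilk/cilk.h>
#define PAR_FOR cilk_for   // iterations may run on different cores
#else
#define PAR_FOR for        // serial fallback for compilers without Cilk Plus
#endif

// Hypothetical example: squares every element in place.
// Each iteration touches only v[i], so iterations do not interfere.
void square_all(std::vector<double>& v) {
    const int n = static_cast<int>(v.size());
    PAR_FOR (int i = 0; i < n; ++i) {
        v[i] = v[i] * v[i];
    }
}
```

The only change from the serial version is the loop keyword. A loop that accumulates into a shared variable would not be safe to convert this way without further work.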

Parallel Building Blocks – Time vs. Freedom vs. THINK

So you're looking to parallelize that pesky for loop that's taking a lot of time in your code. You're probably thinking, when are we going to actually start talking about the different 'for' loops offered within PBB? Not yet... there are still some other things to consider. Working through them first will help you find what you need from the three different models more quickly.

Parallel Building Blocks - 'for' Loop Considerations

So you've used Intel Parallel Amplifier to discover that a sizable percentage of your processing time is being spent inside of a very hairy for loop. Task-parallel or data-parallel algorithm considerations aside, your first reaction is: how can I get this for loop, at the very least, parallelized across some or all of the cores? (Note: these considerations vary depending on data size. There are cases where the gain from parallelizing across all cores does not justify the overhead of distributing the work across them.)

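Cilk_for and #pragma SIMD: Differences and Usage Notes (2)

Let's start with vectorization. The vectorizer is a component of the Intel compiler that uses the SIMD instructions in MMX™ and in the Intel® SSE, SSE2, SSE3, SSE4, and SSSE3 instruction sets. During compilation it determines whether operations can be executed in parallel, then translates them, according to their data type, into SIMD instructions that can process up to 16 elements at once.

From the compiler's point of view, there are two ways to achieve this: set compiler switches, or add the corresponding compiler directives to the program.

The following summarizes the relevant compiler switches (these options are supported on both IA-32 and Intel® 64 architectures).
Linux* OS and Mac OS* X    Windows* OS    Description
-x                         /Qx            Generates code that uses processor-specific instruction sets.
-ax                        /Qax           Generates both a processor-specific code path and a generic code path in a single executable. The generic code usually performs worse.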

  • Intel® Cilk™ Plus
  • Parallel Computing
  • Visualize this! Intel Cilk Plus SDK

    Welcome to Visualize this!, the show where we talk about game development. I have an interesting lineup of guests for this year, and if you have guests you would like to hear from, send me a note.

    My guest today is Barry Tannenbaum, Software Engineer and Cilk Developer at Intel. Barry and his team recently released the Cilk Plus SDK, and he spoke with me about its value proposition for developers.


    Elemental functions: Writing data parallel code in C/C++ using Intel® Cilk™ Plus

    Intel® Cilk™ Plus provides simple-to-use language extensions for expressing data and task parallelism in the C and C++ languages. This article describes one of these programming constructs: “elemental functions”.
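To give a flavor of the construct, here is a minimal sketch of an elemental function: a scalar function that the compiler is asked to also compile in a vector form, so it can be applied across array elements. The names `saxpy_elem`, `saxpy`, and the `ELEMENTAL` macro are assumptions for this example and are not taken from the article. The sketch uses `__declspec(vector)` on Windows and, as an assumption about Linux and Mac OS X builds, the `__attribute__((vector))` spelling; without Cilk Plus the annotation is dropped and a plain loop is used.

```cpp
#include <cassert>

#if defined(__cilk) && defined(_WIN32)
#define ELEMENTAL __declspec(vector)
#elif defined(__cilk)
#define ELEMENTAL __attribute__((vector))
#else
#define ELEMENTAL          // no Cilk Plus: an ordinary scalar function
#endif

// Written as if it handled one element at a time.
ELEMENTAL float saxpy_elem(float a, float x, float y) {
    return a * x + y;
}

// Hypothetical driver: out[i] = a * x[i] + y[i] for i in [0, n).
void saxpy(float a, const float* x, const float* y, float* out, int n) {
    assert(n >= 0);
#if defined(__cilk)
    // Array sections feed the elemental function; the scalar a is
    // applied to every element.
    out[0:n] = saxpy_elem(a, x[0:n], y[0:n]);
#else
    for (int i = 0; i < n; ++i) {
        out[i] = saxpy_elem(a, x[i], y[i]);
    }
#endif
}
```

The function body stays scalar and readable; the data-parallel behavior comes from the annotation and from the way the function is called.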
  • Linux*
  • Microsoft Windows* (XP, Vista, 7)
  • Apple Mac OS X*
  • C/C++
  • Intel® C++ Compiler
  • Intel® Cilk™ Plus
  • Intel® Parallel Composer
  • elemental function
  • __declspec(vector)
  • Optimization