Intel® Cilk™ Plus

gcc cilkplus non-support of reducers (other than int type ?)

problem confirmed

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71473

I don't know the answer to the question about int vs. size_t for the min|max_ind reducers. Note that Intel® Cilk™ Plus uses size_t in several contexts where the value must be cast to (int) to get satisfactory performance. The reducers in question don't perform well, so they may be using size_t internally.

Splitting array notation work across threads

Hi,

I have an operation on a large array written in array notation. Since the array is large, what I really want is the work to be split up across many cores, and each core to use SIMD units to perform its work. Is there an easy way to specify that the work should be divided up among however many threads there are in the machine?
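
One common way to approach this is to split the index range into chunks with cilk_for and apply the array-notation statement to one array section per chunk, so the runtime distributes chunks across workers while each chunk is vectorized. The following is a minimal sketch, not a definitive answer: the function name scaled_add, the example operation, and the chunk size of 4096 are hypothetical, and it assumes the compiler predefines __cilk when Cilk Plus is enabled (otherwise it falls back to plain serial loops so the sketch still builds).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

#if defined(__cilk)
#include <cilk/cilk.h>
#else
// Assumption: no Cilk Plus support available. cilk_for degrades to an
// ordinary serial for loop so the sketch still compiles and runs.
#define cilk_for for
#endif

// Hypothetical example operation: y[i] = a * x[i] + y[i] for i in [0, n).
// The range is cut into chunks; cilk_for spreads the chunks across workers,
// and each chunk is processed as one array section (SIMD within the chunk).
void scaled_add(float a, const float* x, float* y, std::size_t n,
                std::size_t chunk = 4096) {
    if (chunk == 0) chunk = 1;  // avoid division by zero below
    const std::size_t nchunks = (n + chunk - 1) / chunk;

    cilk_for (std::size_t c = 0; c < nchunks; ++c) {
        const std::size_t lo = c * chunk;
        const std::size_t len = std::min(chunk, n - lo);  // last chunk may be short
#if defined(__cilk)
        // Array notation: section starts at lo and covers len elements.
        y[lo:len] = a * x[lo:len] + y[lo:len];
#else
        // Serial fallback with the same semantics as the section above.
        for (std::size_t i = 0; i < len; ++i)
            y[lo + i] = a * x[lo + i] + y[lo + i];
#endif
    }
}
```

A call such as scaled_add(3.0f, x.data(), y.data(), x.size()) on two std::vector<float> of equal length would update y in place. The chunk size is a tuning knob: chunks that are too small add scheduling overhead, while chunks that are too large leave workers idle.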

Vector of reducers that are not cache aligned

I am using Cilk and a custom reducer as described here: https://software.intel.com/en-us/node/522608. In that example, the reducer is used for the append operation on a linked list.

Now, I want to create a vector of reducers (using std::vector); however, I get the following runtime error: 

Reducer should be cache aligned. Please see comments following this assertion for explanation and fixes.

Hybrid Parallelism: A MiniFE* Case Study

This case study examines the situation where the problem decomposition is the same for threading as it is for Message Passing Interface* (MPI); that is, the threading parallelism is elevated to the same level as MPI parallelism.

Putting Your Data and Code in Order: Data and Layout - Part 2

Apply the concepts of parallelism and distributed memory computing to your code to improve software performance. This paper expands on the concepts discussed in Part 1 to consider parallelism, both vectorization (single instruction, multiple data, or SIMD) and shared memory parallelism (threading), as well as distributed memory computing.

CILK PLUS w/ VxWorks 7

Hi,

I'm trying to load the Cilk Plus test code, as documented in "GETTING STARTED WITH INTEL CILK PLUS WITH VXWORKS 7", as a downloadable kernel module (DKM), but I am getting the following undefined symbols:

__cilkrts_hyper_destroy
__cilkrts_hyper_create
__cilkrts_cilk_for_32
__cilkrts_hyper_lookup

I have built the VxWorks 7 kernel with Cilk support and was able to execute the test code successfully when it is linked directly into my VIP project. Any ideas why I'm seeing issues when the test program is built as a DKM? Thanks.

Improve Performance with Vectorization

This article focuses on the steps to improve software performance with vectorization. Included are examples of full applications along with some simpler cases to illustrate the steps to vectorization.

tracking clang updates

I recently downloaded the Cilk Plus/LLVM sources from the git repo and built them.

Running the generated Clang with -v, it claims to be version 3.9. That is surprising, since 3.9 isn't really available yet, and the notes about Cilk Plus/LLVM suggest it was made from a branch in February 2016. The git repo doesn't show any updates since February either.

What's going on?

Thanks

Cilk algorithm slower than the scalar counterpart

Hello everyone,

I am writing a greyscale conversion in both Cilk and scalar code as a university project, to compare Cilk with "normal" code. Each pixel is represented as RGB (floats). As the title already states, my issue is that the Cilk version is slower than the scalar one, which I don't really understand. Maybe someone could take a look and tell me whether I am doing something wrong and what the correct approach would be to implement such an algorithm in Cilk.
