Intel® C++ Composer XE

How do I use cilkview?

I have a C search application on a CentOS 6.x 64-bit Linux server, where I just installed the Cilk Plus compiler to take advantage of more CPUs/cores. I've added the cilk_spawn keyword to some recursive scanning functions in my program. After recompiling the search application with the Cilk Plus GCC compiler, the search program works as intended, without any segfaults or other errors.

My question is: how do I use the cilkview analyzer? I want to know whether Cilk Plus spawning is helping my search application and, if so, by how much.





The Chronicles of Phi - part 4 - Hyper-Thread Phalanx – tiled_HT2

The prior part (3) of this blog showed the effects of the first-level implementation of the Hyper-Thread Phalanx. The change in programming yielded a 9.7% improvement in performance for the small model, and little to no improvement for the large model. This left part 3 of this blog with the questions:

What is non-optimal about this strategy?
And: What can be improved?

There are two things: one is obvious, and the other is not so obvious.

Data alignment

Explicit Vector Programming – Best Known Methods


Why do we care about vectorizing applications? The simple answer: Vectorizing improves performance, and achieving high performance can save power. The faster an application can compute CPU-intensive regions, the faster the CPU can be set to a lower power state.

    The Chronicles of Phi - part 3 - Hyper-Thread Phalanx – tiled_HT1 continued

    The prior part (2) of this blog provided a header and a set of functions that can be used to determine the logical core and the logical Hyper-Thread number within the core. This determination is to be used in an optimization strategy called the Hyper-Thread Phalanx.

    developer documents for Cilk Plus


    First, I would like to thank you all for the awesome Cilk Plus tools you have open-sourced in GCC and LLVM.

    I am trying to study the runtime library and finding it a bit difficult to follow the execution in a sample application.

    Are there any developer documents available? A wiki perhaps.

    Specifically, I am trying to trace the execution path for cilk_spawn, which is a keyword. Any helpful links to get me started would be really great!



    Question about steal-continuation semantics in Cilk Plus, Global counter slowing down computation, return value of functions

    What I understood about steal-continuation is that an idle thread does not steal a work item itself, but rather the continuation, which then generates a new work item.
    Does that mean that the execution time between spawns is crucial? If two threads are idle at the same time, then as I understand it only one of them can steal the continuation and create its work unit, while the other thread stays idle during that time?

    As a debugging artefact, I had a global counter that was incremented on every call of a function used within every work item.

    How Intel® AVX Improves Performance on Server Application

    The latest Intel® Xeon® processor E7 v2 family includes a feature called Intel® Advanced Vector Extensions (Intel® AVX), which can potentially improve application performance. Here we will explain the context, and provide an example of how using Intel® AVX improved performance for a commonly known benchmark.

    For existing vectorized code that uses floating point operations, you can gain a potential performance boost when running on newer platforms such as the Intel® Xeon® processor E7 v2 family, by doing one of the following:
