Author's blogs

Generic Parallel Algorithms for Intel® TBB - "They're Already in There" Part 2
Author: Noah Clemons (Intel) Published on 03/06/11
A high-level overview of general algorithms included in Intel® TBB to let you know what's possible: parallel_reduce, parallel_do, parallel_for_each, parallel_invoke, parallel_pipeline, parallel_sort and parallel_scan
Generic Parallel Algorithms for Intel® TBB - "They're Already in There" Part 1
Author: Noah Clemons (Intel) Published on 03/06/11
A high-level overview of some generic parallel algorithms included in Intel® TBB to let you know what's possible: parallel_reduce, parallel_do, parallel_for_each
Easier Intel® TBB parallel_for with C++0x Lambda Expressions
Author: Noah Clemons (Intel) Published on 27/05/11
In the last blog, I explained how to “build” a parallelized for out of templatized components. Today I’m going to show you an easier way to implement the Intel® Threading Building Blocks (Intel® TBB) parallel_for. The Final Draft International Standard (FDIS) for C++11, also known as C++0x, came o...
Intel® TBB parallel_for and its Sandbox of Options
Author: Noah Clemons (Intel) Published on 24/05/11
In the last 6 blog posts, I've explained the cilk_for and getting vectorization inside, reductions to prevent data races on the shared data, and creating your own custom vectorized functions. What's great about those is that they are "quick and dirty" - you can get a whole lot of parallelism that you previ...
Parallel Building Blocks - Eliminate Data Races, THEN Add Vector + Thread Parallelism with Cilk
Author: Noah Clemons (Intel) Published on 04/05/11
In the last 5 blogs, I’ve explained various ways you can have the compiler generate vectorized code for you. If you understand and master each of the different ways Array Notations can vectorize your code, there is one last* recommended step before you start optimizing the “meat” of your ‘for’ lo...
Parallel Building Blocks - Have Cilk Vectorize your own Scalar Functions
Author: Noah Clemons (Intel) Published on 02/05/11
In the previous blog, I explained two mini-kernels, the scatter and gather, which can be written up quickly and still have the benefits of compiler vectorization with Array Notations. There’s also a variety of run-of-the-mill functions, primarily in the “raw plug and chug math” category that you ...
Parallel Building Blocks - Vectorized Parallel Patterns inside a Cilk ‘for’
Author: Noah Clemons (Intel) Published on 29/04/11
100 blogs and 100 videos in 100 days about PBB #6: Vectorized Parallel Patterns inside a Cilk ‘for’ “Educate the compiler a little on what you’re trying to do, and it will vectorize a ton for you.” In the previous blog, I explained the syntax and behavior of basic Array Notations vectorization...
Parallel Building Blocks - Slick Array Notations Syntax Inside a Cilk ‘for’
Author: Noah Clemons (Intel) Published on 28/04/11
In the previous blog, I explained the rationale behind vectorizing with Array Notations inside of the Cilk ‘for’. There are lots of ways to vectorize inside a for loop once you’ve used the Cilk ‘for’, but using this colon/bracket style syntax has been the easiest way for me to get vectorization w...
Parallel Building Blocks - Vectorize Inside the Cilk 'for' with Array Notations
Author: Noah Clemons (Intel) Published on 27/04/11
In the previous blog, I gave a very fast crash course on the Cilk ‘for’. There are a few other ways you can manipulate that for loop, but I will go over that another time. The key is to understand what is possible first, get the basics running and parallelized, and use the more expert features la...
Parallel Building Blocks – Cilk ‘for’ Primer
Author: Noah Clemons (Intel) Published on 20/04/11
So you’ve made the decision to go with a language extension and finish the parallel coding quickly. You don’t have to spend a lot of time thinking about your algorithm and problem with Cilk, and it's not meant for people that do want to spend a lot of time thinking about it. For this PBB model yo...