Okay, so it's not something that I would promote as "New & Improved" with big splashy TV commercials, like some laundry detergent or kids' sugary drink. I would be more likely to skip the trumpets and play a fanfare on a kazoo instead. Even so, I'm excited to announce that the Intel Guide for Developing Multithreaded Applications has been updated.
Many of the articles featured in the Guide remain as relevant to parallel programmers today as they were when the Guide was first put together. Good parallel programming practice will always be good parallel programming practice. Even so, technology tools are changing at a steady pace, and we know that the Guide needs to keep up with those changes. To that end, we've added three new articles and revised two others.
The revised articles illustrate some new features of Intel software tools. The article "Getting Code Ready for Parallel Execution with Intel® Parallel Composer" now explains new features and new programming libraries supported by the Intel® Parallel Composer compiler. "Using Intel® Inspector XE 2011 to Find Data Races in Multithreaded Code" has been updated to show how to use the latest version of the thread debugging tool.
One of the new articles also deals with Intel software tools. "Optimize Data Structures and Memory Access Patterns to Improve Data Locality" poses a tricky parallel performance issue. Intel® VTune™ Amplifier XE is used not only to detect the problem but also to diagnose its root cause with some of the tool's low-level analysis options. The article illustrates a methodology for delving deeper into parallel code to identify and correct bottlenecks to scalability and performance.
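To give a flavor of the kind of data-layout change that can affect locality, here is a minimal sketch in C. It is not taken from the article; the particle structures and function names are hypothetical, chosen only for illustration. It contrasts an array-of-structures layout with a structure-of-arrays layout for a loop that reads a single field.

```c
#include <stddef.h>

/* Array-of-structures (hypothetical example type): the four fields of
 * each particle sit next to each other in memory. */
struct particle_aos {
    float x, y, z, mass;
};

/* Reads only `mass`, so just one float out of every four that are
 * pulled into cache is actually used. */
float total_mass_aos(const struct particle_aos *p, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        sum += p[i].mass;
    }
    return sum;
}

/* Structure-of-arrays (hypothetical example type): each field lives in
 * its own contiguous array. */
struct particles_soa {
    float *x;
    float *y;
    float *z;
    float *mass;
};

/* Reads `mass` values that are contiguous in memory, so every float
 * in a fetched cache line contributes to the sum. */
float total_mass_soa(const struct particles_soa *p, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        sum += p->mass[i];
    }
    return sum;
}
```

Both functions compute the same result. Which layout performs better depends on the access pattern of the whole application, which is exactly the sort of question a profiler can help answer.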
Vector computation has been a staple of parallelism for many decades. The second new article, "Using AVX Without Writing AVX Code," shows how programmers can take advantage of the new vector hardware units and AVX instructions without needing to go down to the assembly language level of coding. For viewers of Parallel Programming Talk, this might sound familiar. Our guest for show #114 was Richard Hubbard, who talked about this exact topic. We felt it was something parallel programmers would want to know about, so Richard and Eric Palmer (Intel) wrote it up for the Guide.
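As a rough illustration of the general idea (not an excerpt from the article), one common route is to write a plain loop that a vectorizing compiler can turn into vector instructions on its own. The sketch below is an assumption about what such code might look like; the function name `saxpy` is just a conventional label for this operation, and whether AVX instructions are actually generated depends on the compiler and the options used (for example, `-O3 -mavx` with GCC or `-xAVX` with the Intel compiler).

```c
#include <stddef.h>

/* Computes y[i] = a * x[i] + y[i] for i in [0, n).
 *
 * The `restrict` qualifiers (C99) promise the compiler that x and y do
 * not overlap, which removes one common obstacle to automatic
 * vectorization. No intrinsics or assembly are used here. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The source stays portable C. The same function can be rebuilt for a different vector instruction set by changing compiler options rather than the code.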
The topic of the third new article, "Optimizing Applications for NUMA," was also featured on Parallel Programming Talk. David Ott appeared in show #113 to tell us about NUMA architecture and how best to program for NUMA. If you have never heard of NUMA, or if you have but are unsure what you need to do to get the best performance from your applications, then David's article is an excellent place to start.
Even if you've read articles from the Guide before this, you might find something you can apply to your own work in the new articles. This isn't the last revision we'll make to the Guide, of course. I've got some ideas for topics that I'm sure will find their way into the collection. Future articles may be inserted without much hoopla, so check back periodically to see if we've added something new.
If you have a topic that you think should be featured in the Guide, let me know. We are always interested in delivering what readers and parallel programming practitioners want to know more about.