Right up front, I am going to tell you that P-states are irrelevant, meaning they will not impact the performance of your HPC application. Nevertheless, they are important to your application in a more roundabout way. Since most of you belong to a group of untrusting and always questioning skeptics (i.e. engineers and scientists), I am going to go through the unnecessary exercise of justifying my claim.
Hi, my name is Taylor Kidd. You may know me from such notables as “The Beginning Intel® Xeon Phi™ Coprocessor Workshop” and “The Advanced Intel® Xeon Phi™ Coprocessor Workshop,” where I mesmerized audiences with over 10 hours of highly technical information.
PART 0: “Introduction”
(This work was done by Vivek Lingegowda during his internship at Intel.)
If you’ve ever heard about parallel programming, it probably sounded like a painful endeavor. Those who have experience with it know that this is mostly true of “traditional” approaches, which incorporate parallel constructs into mainstream programming languages like C++ or FORTRAN. In light of decades of research in parallel computing this is an irritating situation - and more so now that multi-core systems have been mainstream for years.
But why does parallelism hurt? And does it really have to?
Today, tuning software isn’t just about making an application run faster; it is also about making sure it runs efficiently.
One of the most useful aspects of Intel® Advisor XE is its ability to model parallelism in my application without actually running the code in parallel. Through this modeling alone, it can tell me about potential race conditions and correctness issues while still running everything serially. At first glance I ask myself, “Why do I need this modeling? Shouldn’t I just try the parallel code and let another tool, like Intel® Inspector XE, find the race conditions?”
Intel® Advisor XE, along with the other Intel® Parallel Studio XE tools, lays out a multi-step process to aid developers in transitioning their serial code to efficient and correct parallel code. This blog will focus on the first step of the process: how to determine where to add parallelism in an application.
Has this ever happened to you? You work tirelessly to add threads to your serial code, all your correctness tests are passing, and your application is zooming along almost twice as fast as the serial version on your two-core machine. Now your friend sees your results and would love to run your program on his machine, which is fully loaded with four cores that are all equipped with Intel® Hyper-Threading Technology (that’s 8 "logical" processors).