I have switched to Intel Cluster Studio XE tools 2019 Update 4, but now both VTune Amplifier and Thread Advisor refuse to run, claiming that an X11 library is missing.
This will be the final post in my planned short vectorization series, although I reserve the right to post more on vectorization in the future!
One of my performance focus areas for this year is vectorization.
As part of my focus on software performance, I also support and consult on implementing scalable parallelism in applications.
Last week I posted a blog explaining the front-end of the pipeline on Intel® Micro
As I'm sure you know, modern processors employ a technique called pipelining to increase instruction throughput.
The dust of SC’11 is starting to settle, and several announcements about OpenMP were made in Seattle.
I've known this day was coming, but when I saw Knights Corner clearly sustaining a teraflop per second (DGEMM, wide range of block sizes), I was surprised by my own emotional reaction.