Intel® Developer Zone:
Performance

Highlights

Just published! Intel® Xeon Phi™ Coprocessor High Performance Programming 
Learn the essentials of programming for this new architecture and new products. New!
Intel® System Studio
Intel® System Studio is a comprehensive, integrated software development tool suite that can accelerate time to market, strengthen system reliability, and boost power efficiency and performance. New!
In case you missed it - 2-day Live Webinar Playback
Introduction to High Performance Application Development for Intel® Xeon & Intel® Xeon Phi™ Coprocessors.
Structured Parallel Programming
Authors Michael McCool, Arch D. Robison, and James Reinders use an approach based on structured patterns, which should make the subject accessible to every software developer.

Deliver your best application performance for your customers through parallel programming with the help of Intel’s innovative resources.

Development Resources


Development Tools

 

Intel® Parallel Studio XE

Bringing simplified, end-to-end parallelism to Microsoft Visual Studio* C/C++ developers, Intel® Parallel Studio XE provides advanced tools to optimize client applications for multi-core and manycore.

Intel® Software Development Products

Explore all the tools that help you optimize for Intel architecture. Select tools are available for a free 30-day evaluation period.

Tools Knowledge Base

Find guides and support information for Intel tools.

Courseware - Algorithmic Strategies
By admin Posted 02/27/2015 0
Brute-force algorithms; Greedy algorithms; Divide-and-conquer; Backtracking; Branch-and-bound; Heuristics; Pattern matching and string/text algorithms; Numerical approximation algorithms. Parallel Solution to Cat-and-Mouse strategy game problem (Vyukov). Material Type: Codi...
Courseware - Software Processes
By admin Posted 02/27/2015 0
Software life-cycle and process models; Software process capability maturity models; Approaches to process improvement; Process assessment models; Software process measurements. CSE445/598 Project on Multithreading and Multi-Core Processing (ASU). Material Type: Problem set...
Courseware - Data Structures
By admin Posted 02/27/2015 0
Representation of numeric data; Range, precision, and rounding errors; Arrays; Representation of character data; Strings and string processing; Runtime storage management; Pointers and references; Linked structures; Implementation strategies for stacks, queues, and hash tables; Implementation str...
Intel® Parallel Studio XE 2016 Beta
By Gergana Slavova (Intel) Posted 02/27/2015 0
Contents What's New Overview What's New in Intel® Compilers 16.0 What's New for the rest of the tools License changes in 2016 product Check out the full What's New Technical Document Details Frequently Asked Questions Beta duration and schedule Support How to enroll in th...
Power Management Policy: Summary and Future Policies
By Taylor Kidd (Intel) Posted on 06/17/14 0
How about the future? Have we reached the pinnacle of power management? Hardware and software are still evolving to be even more energy efficient. An example is the “tickless” OS. In the old days, OSs had to periodically wake up the processor (i.e., perform an interrupt) around a hundred times a...
Optimizing Big Data processing with Haswell 256-bit Integer SIMD instructions
By gaston-hillar Posted on 06/11/14 0
Big Data requires processing huge amounts of data. Intel Advanced Vector Extensions 2 (aka AVX2) promoted most Intel AVX 128-bit integer SIMD instructions to 256 bits. Intel AVX brought 256-bit floating-point SIMD instructions, but it didn't include 256-bit integer SIMD instructions. Intel...
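As a rough illustration of what a 256-bit integer SIMD instruction does, the following C++ sketch adds two arrays of 32-bit unsigned integers eight lanes at a time with the AVX2 intrinsic _mm256_add_epi32. It is not taken from the article: the function names add_u32, add_u32_avx2, and add_u32_scalar are hypothetical, and the runtime dispatch assumes a GCC- or Clang-compatible compiler on x86. On any other platform the sketch simply uses the scalar loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if (defined(__x86_64__) || defined(__i386__)) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define SKETCH_HAS_AVX2_PATH 1
#else
#define SKETCH_HAS_AVX2_PATH 0
#endif

// Scalar reference: one 32-bit addition per iteration (unsigned, so wraparound is well defined).
static void add_u32_scalar(const std::uint32_t* a, const std::uint32_t* b,
                           std::uint32_t* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}

#if SKETCH_HAS_AVX2_PATH
// AVX2 path: each _mm256_add_epi32 adds eight 32-bit lanes in one 256-bit register.
// The target attribute lets this one function use AVX2 without compiling the whole file with -mavx2.
__attribute__((target("avx2")))
static void add_u32_avx2(const std::uint32_t* a, const std::uint32_t* b,
                         std::uint32_t* out, std::size_t n) {
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        // Unaligned loads and stores, so the caller does not need 32-byte aligned buffers.
        __m256i va = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(a + i));
        __m256i vb = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(b + i));
        _mm256_storeu_si256(reinterpret_cast<__m256i*>(out + i), _mm256_add_epi32(va, vb));
    }
    // Handle the remaining (fewer than eight) elements with scalar code.
    for (; i < n; ++i) out[i] = a[i] + b[i];
}
#endif

// Hypothetical entry point: picks the AVX2 path only when the running CPU reports support.
void add_u32(const std::uint32_t* a, const std::uint32_t* b,
             std::uint32_t* out, std::size_t n) {
    assert(n == 0 || (a != nullptr && b != nullptr && out != nullptr));
#if SKETCH_HAS_AVX2_PATH
    if (__builtin_cpu_supports("avx2")) {
        add_u32_avx2(a, b, out, n);
        return;
    }
#endif
    add_u32_scalar(a, b, out, n);
}
```

Calling add_u32(a, b, out, n) gives the same result on either path; only the number of elements processed per instruction differs.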
Submissions open: High Performance Parallelism Gems
By Taylor Kidd (Intel) Posted on 05/19/14 0
We have all had our little discoveries and triumphs in identifying new and innovative approaches that increased the performance of our applications. Occasionally we find something more, something that could also help others, an innovative gem. You now have an opportunity to broadcast your successes more widely to the benefit of our community. You are invited to submit a proposal to a contribution-based book, working title, “High Performance Parallelism Gems – Successful Approaches for Multicore and Many-core Programming” that will focus on practical techniques for Intel® Xeon® processor and Intel® Xeon Phi™ coprocessor parallel computing. Submissions are due by May 29, 2014.
Debugging performance issues in Go programs
By Dmitry Vyukov Posted on 05/10/14 2
A comprehensive guide on performance debugging tools for the Go language.
How to track down OpenMP segfault caused by the addition of ORDERED?
By Alastair M. 4
Dear all, I hope this is the right place to ask this question. I am working on adding OpenMP support to some existing Fortran code, using ifort version 15. I noticed that the addition of the c$OMP ORDERED clause to my outer parallel do loop causes the program to segfault in the second loop iteration, when attempting to access a FIRSTPRIVATE variable.  This occurs with OMP_NUM_THREADS=1.  The same error also occurs with ifort 14.0.2. On further inspection I realised that at some point during the 2nd loop iteration the stack becomes corrupted.  That is, "info locals" in gdb complains about not being able to read certain variables, when it previously could, and then the segfault follows shortly afterwards.  I also noticed that the location of the segfault is repeatable but changes when the list of FIRSTPRIVATE variables is changed. With the ORDERED construct removed from the loop, the program executes correctly and tests with valgrind and inspxe indicate zero problems.  I have ulimit -...
Where can I download Intel MPI Benchmarks?
By Bo W. 1
Hello everyone, where can I download the Intel MPI Benchmarks? Cheers, Bo
'Wildhoney' - the 512bit superfast textual decompressor - some thoughts
By Georgi M. 19
Hi to all. Glad I am that I finally joined the Intel forum, long overdue. Here I want to share my amateurish vision on the topic of superfast textual decompression. For 4 months now I have been playing with my file-to-file decompressor named Nakamichi. I am on a quest to write the fastest possible variant of my approach: branchlessness combined with only one native (highest order) register on the latest machines. This translates to mixed 64bit/512bit code. A few hours ago I wrote the 'Wildhoney' variant using just that configuration. And two important things: - Nakamichi is 100% FREE - no restrictions at all on modifying it, as with the original Lempel-Ziv; - Speed is religion, the fastestness is the ultimate goal. So far, I have written two OpenMP console tools, each enforcing 16 threads - MokujIN and Kazahana, and I hope Nakamichi 'Wildhoney' will be the third. I would appreciate any help in developing it; there are still many basic things I don't know. The ZMM executable with the C source is here:http://www.san...
need something like a sorted tbb::parallel_do
By foelsche@sbcglobal.net 1
From what I see, there is tbb::concurrent_priority_queue, but with this I would have to deal with thread pools myself. Is this really true?
TBB: Using task_scheduler_observer to set worker thread's OS scheduling priority
By Tim Day 5
I'm looking at TBB's task_arena and task_scheduler_observer. The documentation for task_scheduler_observer sketches out a nice example of it being used to set thread affinity on worker threads to lock an arena's threads onto a subset of cores. I'm curious to know whether this class and a similar pattern could practically be used to set OS scheduling priority for an arena.  What I'm interested in doing is, on my N core HW, creating an arena with N normal worker threads, and another arena with N threads on a lower OS scheduling priority.  However, the issue with scheduler priority is that generally you only get to lower it (unless running as root, but assume not), and it's not clear to me to what extent TBB worker threads move around between arenas (which would defeat the object of keeping all the low priority threads in one arena); the task_scheduler_observer docs mention returning false from on_scheduler_leaving() to keep a thread in an arena... but also mentions the possibility of ...
API for Haswell's TSX
By roberto c. 2
Hello, I have just begun my research on HTM, primarily focusing on RTM (Restricted Transactional Memory). Are there any APIs for RTM? I have looked on the internet, but only the basic instructions exist for RTM, such as xbegin, xend, xabort, xtest. I want to be able to access shared memory with HTM, but I cannot find any library files for it. Can you please point me in the right direction? Thanks for your support.
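For readers with the same question, one possible starting point is the set of compiler intrinsics _xbegin, _xend, _xabort, and _xtest declared in immintrin.h, which wrap the instructions named in the post. The C++ sketch below is only an illustration under stated assumptions, not an answer from the thread: the names add_to_counter, try_add_transactionally, cpu_has_rtm, and the spin-lock fallback are hypothetical, and the code assumes a GCC- or Clang-compatible compiler on x86. A transaction may always abort, so the sketch retries a few times and then falls back to an ordinary lock. On other platforms only the lock path is compiled.

```cpp
#include <atomic>
#include <cassert>

#if (defined(__x86_64__) || defined(__i386__)) && (defined(__GNUC__) || defined(__clang__))
#include <cpuid.h>
#include <immintrin.h>
#define SKETCH_HAS_RTM_PATH 1
#else
#define SKETCH_HAS_RTM_PATH 0
#endif

// Hypothetical fallback: a simple spin lock used when a transaction cannot commit.
static std::atomic<bool> fallback_locked{false};

static void lock_fallback() {
    while (fallback_locked.exchange(true, std::memory_order_acquire)) {
        // Spin until the previous holder releases the lock.
    }
}

static void unlock_fallback() {
    assert(fallback_locked.load(std::memory_order_relaxed));  // must be held by the caller
    fallback_locked.store(false, std::memory_order_release);
}

#if SKETCH_HAS_RTM_PATH
// RTM support is reported in CPUID leaf 7, subleaf 0, EBX bit 11.
static bool cpu_has_rtm() {
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) return false;
    return (ebx & (1u << 11)) != 0;
}

// Tries the update inside a hardware transaction; returns false if every attempt aborted.
__attribute__((target("rtm")))
static bool try_add_transactionally(long& counter, long delta) {
    for (int attempt = 0; attempt < 3; ++attempt) {
        unsigned status = _xbegin();
        if (status == _XBEGIN_STARTED) {
            // Reading the lock puts it in the transaction's read set, so a thread
            // that takes the fallback lock later will abort this transaction.
            if (fallback_locked.load(std::memory_order_relaxed)) _xabort(0xff);
            counter += delta;
            _xend();  // commit
            return true;
        }
        // On abort, execution resumes here with an abort code in status; retry.
    }
    return false;
}
#endif

// Hypothetical entry point: transactional fast path with a lock-based fallback.
void add_to_counter(long& counter, long delta) {
#if SKETCH_HAS_RTM_PATH
    static const bool has_rtm = cpu_has_rtm();
    if (has_rtm && try_add_transactionally(counter, delta)) return;
#endif
    lock_fallback();
    counter += delta;
    unlock_fallback();
}
```

The fallback path is what makes the sketch safe to run everywhere: the result is the same whether or not the processor supports RTM or the transaction ever commits.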
CL_DEVICE_TYPE_CPU not working in Windows 8.1
By Yaknan G. 1
Hi, I recently tried to run my OpenCL program on a new Windows 8.1 computer, but the program returns an error when the device type is CL_DEVICE_TYPE_CPU. When I change the device type to CL_DEVICE_TYPE_GPU or CL_DEVICE_TYPE_ALL, it runs the program on the GPU. Here is the system specification of the new computer: OS: Windows 8.1; Processor: Intel Core i7-4700MQ clocked at 2.40GHz; Display Adapter: Intel HD Graphics 4600 and NVIDIA GeForce GT 740M. How can I resolve this problem, and is OpenCL having issues with Windows 8.1? Please help! Yaknan
