Intel® Developer Zone: Performance

Highlights

Just published! Intel® Xeon Phi™ Coprocessor High Performance Programming
Learn the essentials of programming for this new architecture and new products.
Intel® System Studio
Intel® System Studio is a comprehensive, integrated software development tool suite that can accelerate time to market, strengthen system reliability, and boost power efficiency and performance.
In case you missed it - 2-day Live Webinar Playback
Introduction to High Performance Application Development for Intel® Xeon & Intel® Xeon Phi™ Coprocessors.
Structured Parallel Programming
Authors Michael McCool, Arch D. Robison, and James Reinders use an approach based on structured patterns, which should make the subject accessible to every software developer.

Deliver your best application performance for your customers through parallel programming with the help of Intel’s innovative resources.

Development Resources

Development Tools

Intel® Parallel Studio XE

Bringing simplified, end-to-end parallelism to Microsoft Visual Studio* C/C++ developers, Intel® Parallel Studio XE provides advanced tools to optimize client applications for multi-core and manycore.

Intel® Software Development Products

Explore all the tools that help you optimize for Intel architecture. Select tools are available for a free 30-day evaluation period.

Tools Knowledge Base

Find guides and support information for Intel tools.

Diagnostic 15038: remark: loop was not vectorized: conditional assignment to a scalar (Fortran)
By Ronald W Green (Intel), posted 03/05/2014
Causes: 1. A loop contains a conditional statement 2. The conditional statement is controlling the assignment of a scalar value. 3. The logic of the assignment is such that the value of the scalar at the end of execution of the loop depends on the loop executing iterations strictly in-order  A...
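As a rough illustration of the pattern this diagnostic describes, the sketch below shows a loop that assigns to a scalar only when a condition holds. It is written in C rather than the article's Fortran and is not taken from the article; the function name last_positive and its parameters are hypothetical. Whether a particular compiler vectorizes such a loop depends on its version and options.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the last positive element of a[0..n-1], or `fallback` if there is none.
   Hypothetical example, not from the article. */
double last_positive(const double *a, size_t n, double fallback) {
    assert(a != NULL || n == 0);
    double last = fallback;
    for (size_t i = 0; i < n; i++) {
        if (a[i] > 0.0) {
            /* Conditional assignment to a scalar: the value left in `last`
               after the loop is the one written by the highest qualifying i,
               so the result depends on iterations taking effect in order. */
            last = a[i];
        }
    }
    return last;
}
```

For the input {1.0, -2.0, 3.0, -4.0} the function returns 3.0, the value written by the last iteration whose condition was true.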
Selective Use of gatherhint/scatterhint Instructions
By Rakesh Krishnaiyer (Intel), posted 02/20/2014
Compiler Methodology for Intel® MIC Architecture: Selective Use of gatherhint/scatterhint Instructions. Overview: The -opt-gather-scatter-unroll=<N> compiler option can be used to generate gatherhint/scatterhint instructions supported by the coprocessor. This is useful if your code is doin...
Intel® Xeon® Processor E7 v2 Family
By BELINDA L. (Intel), posted 02/18/2014
  Based on Intel® Core™ microarchitecture (formerly codenamed Ivy Bridge) and manufactured on 22-nanometer process technology, the Intel® Xeon® Processor E7 V2 Family processors provide significant performance, memory and cache bandwidth, and memory capacity over the previous-generation Intel®...
Intel® Xeon® Processor E7 V2 Family Technical Overview
By Sreelekshmy Syamalakumari (Intel), posted 02/18/2014
Contents: 1. Executive Summary 2. Introduction 3. Intel® Xeon® processor E7 V2 family enhancements 3.1 Intel® C104/102 Scalable Memory Buffer 3.2 Intel® Secure Key (DRNG) 3.3 Intel® OS Guard (SMEP) 3.4 Intel® Advanced Vector Extensions (Intel® AVX) 3.5 Advanced Pro...

Introduction to Embree 2.1 - Part 1
By louis-feng (Intel), posted 01/24/2014
This is part of a series of blogs on Embree, a collection of high performance ray tracing kernels. Embree has been open source since version 1.0. Version 2.0 was released during SIGGRAPH 2013, and Embree 2.1 was published on GitHub just before Christmas 2013. The official web site has an ...
Intel® Xeon Phi™ coprocessor Power Management Configuration: Using the micsmc GUI Interface
By Taylor Kidd (Intel), posted 01/17/2014
Previous blogs on power management and a host of other power management resources can be found in, “List of Useful Power and Power Management Articles, Blogs and References” at http://software.intel.com/en-us/articles/list-of-useful-power-and-power-management-articles-blogs-and-references. See [L...
Bubble, Bubble, Toil and Trouble; Mutex Lock and Buffer Double
By Clay Breshears (Intel), posted 12/31/2013
Macbeth may have 99 problems, but parallel programming ain’t one of them.
Intel® Xeon Phi™ coprocessor Power Management Configuration: Why should I worry about configuring anything?
By Taylor Kidd (Intel), posted 12/30/2013
Previous blogs on power management and a host of other power management resources can be found in List of Useful Power and Power Management Articles, Blogs and References. WHAT AND WHY DO WE WANT TO CONFIGURE IT There are several reasons why you might want to configure your power management in ...

Pointers defined in modules and OpenMP
By Jerome B.
I am working with a program (which I did not write) that has a pointer to a derived type in a module:

    module X
      type mytype
        integer x, y, z
      end type mytype
      type (mytype), pointer :: p_mt
    end module X

This module is used in a subroutine:

    subroutine Loop
      use X
      p_mt => GoGetOne()
      p_mt % x = 7.0
      ...

So far, so good. However, subroutine Loop is called from within a parallel loop in another subroutine:

    subroutine CallLoop()
      integer i
      !$OMP parallel do
      do i = 1, 10000
        call Loop(i)
      enddo

It is my understanding that p_mt is global in scope, and therefore should not be accessed from within a parallel loop. If I declare Loop as pure:

    pure subroutine Loop()

the compiler flags the assignment of a value to p_mt as an error. Am I missing something? Or is this a potential bug?
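One way to picture the concern raised in this post: a variable with global scope is shared by all OpenMP threads unless it is declared threadprivate (in Fortran, the `!$OMP THREADPRIVATE` directive). The sketch below is in C rather than the poster's Fortran and is only an assumption-laden illustration, not the poster's program: loop_body stands in for Loop, a local struct stands in for whatever GoGetOne() returns, and count_consistent is a hypothetical driver. Built without OpenMP support, the pragmas are ignored and the loop simply runs serially.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int x, y, z; } mytype;

/* File-scope pointer, analogous to the module-level p_mt in the post. */
static mytype *p_mt = NULL;
/* Give every OpenMP thread its own copy of the pointer. Without this line
   the pointer would be shared and concurrent iterations could overwrite it. */
#pragma omp threadprivate(p_mt)

/* Hypothetical stand-in for Loop(i): point p_mt at an object, write through
   it, and report whether the value read back is the one just written. */
static int loop_body(int i) {
    mytype item = {0, 0, 0};   /* stand-in for the result of GoGetOne() */
    p_mt = &item;
    p_mt->x = i;
    int ok = (p_mt->x == i);
    p_mt = NULL;               /* do not keep a pointer to a dead local */
    return ok;
}

/* Runs n iterations in parallel and counts those that saw their own value. */
int count_consistent(int n) {
    assert(n >= 0);
    int ok = 0;
    #pragma omp parallel for reduction(+:ok)
    for (int i = 0; i < n; i++) {
        ok += loop_body(i);
    }
    return ok;
}
```

With the threadprivate pointer, count_consistent(10000) is expected to return 10000, because no thread can redirect another thread's pointer between the write and the read.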
2 CPUs vs num_threads
By Leos P.
I have 2 Xeon CPUs in the PC, each with 4 cores. However, I can only set num_threads to 4. If I set it to a number > 4, I get this message: OMP: Error #136: Cannot create thread. OMP: System error #8: Not enough storage is available to process this command. OMP: Error #178: Function GetExitCodeThread() failed: OMP: System error #6: The handle is invalid. Is it not possible to use all the cores in the system because they are distributed across 2 CPUs, or is something else causing this? (Compiler: Intel C++ 13.0; OS: Windows Server 2008 R2)
RTM abort due to RTM_RETIRED.ABORTED_MISC5
By le g.
Hi there, I dropped a piece of CPU-bound code into the Linux kernel with local interrupts disabled. The code is surrounded by RTM instructions. On average, the code commits successfully within around 100 tries. On abort, the reason reported by the PMU is RTM_RETIRED.ABORTED_MISC5. I wonder what the reason could be, given that local interrupts have been disabled? PS. The description of RTM_RETIRED.ABORTED_MISC5: none of the previous 4 categories (e.g. interrupt). Thanks in advance. BR, Le Guan
MultiThreading with MKL library nonlinear least square solver
By Nikolay P.
Hello everybody, I am using the Intel solution for Nonlinear Least Squares Problem with Linear (Bound) Constraints: http://software.intel.com/sites/products/documentation/hpc/mkl/mklman/GUID-B6BADF1C-F90C-4D30-8B84-CF9A5F970E08.htm#GUID-B6BADF1C-F90C-4D30-8B84-CF9A5F970E08

Question: what do I need to do to run the optimizer in parallel?

A. Let me consider the Intel example ex_nlsqp_bc_c.c. Let's say I just call omp_set_num_threads(n) before starting the minimization loop:

    omp_set_num_threads(n); //no pragmas!!! Just want to make sure I don't have to put any pragmas in the cycle.
    while(not_converged) {
      dtrnlspbc_solve(OPTION);  //intel mkl function minimizer;
      if(OPTION-1) {my_function();}  // user-supplied function
      else if (OPTION-2) {djacobi(my_function);} //intel mkl function (numerical gradient); Does it call my_function from different threads?
    }

In the multithreading mode what is done in parallel? Jacobian construction or just manipulations with Jacobian? I hope that  call...
threading
By Divyesh
I have an Intel i3 processor in my laptop. Though it has 2 cores, it can run 4 threads at a time. When I look at Task Manager, I see programs with 11 threads, 40 threads. How are these threads scheduled? Is scheduling implemented in hardware or managed by the host OS?
I have come to an interesting subject
By aminer10
Hello... I have come to an interesting subject. As you may have noticed, I have designed and implemented parallel programs that you can find on my website: http://pages.videotron.com/aminer/ But I have been thinking more and more about my parallel programs and asking myself some questions. If you look carefully at my compression library or my parallel archiver, you will notice that these libraries are built from high-level objects that let you do your compression EASILY. It's like robotics and automation: you are not required to write compression algorithms or the high-level objects that ease the compression process; you are only required to instantiate the high-level objects that do the compression and call their methods, which is much easier. But since it's like robotic automatizat...
Are my Parallel Studio packages updating or not?
By dnesteruk
I've fired up the Intel Software Manager, pressed the download buttons and it all looks like this: So instead of pause buttons I get resume buttons. I've tried pressing them, they briefly turn into pause buttons. So my question: is anything being downloaded or is this thing broken? Thanks. P.S.: registration on this forum is atrocious. Finding this forum was next to impossible. The media upload thing is so far below I didn't notice it and uploaded elsewhere. Usability hint-hint!
Capacity planning
By aminer10
Hello, I have come to an interesting subject. If we have a distributed database, a web server, and HTML files, and you want to do capacity planning for your web server, things get complicated, because the database server must be modeled with a hyperexponential distribution, that is, as an M/G/1 queuing system. But since the database server, in our network, comes before the internet connection, which is modeled as an M/M/1 queuing system, you have to use a queuing network simulation to solve this problem. However, in capacity planning we also have to calculate the worst-case response time, and this eases the job for us: in the worst-case scenario, since the M/G/1 queuing system of the database server has three exponential distributions for the read, write, and delete transactions, we have to choose the worst service time, which is exponentially distributed, so I think we have to choose only the...
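For readers unfamiliar with the two queue models named in this post, the following C sketch evaluates their standard mean response-time formulas. It is not the poster's method: the function names and the sample rates are hypothetical, and it treats each queue in isolation rather than as the queuing network the post describes.

```c
#include <assert.h>

/* Mean response time of an M/M/1 queue with arrival rate lambda and
   service rate mu (same time unit): T = 1 / (mu - lambda). */
double mm1_response_time(double lambda, double mu) {
    assert(lambda >= 0.0 && mu > lambda);
    return 1.0 / (mu - lambda);
}

/* Mean response time of an M/G/1 queue (Pollaczek-Khinchine formula):
   T = E[S] + lambda * E[S^2] / (2 * (1 - rho)), with rho = lambda * E[S].
   mean_s is E[S]; second_moment_s is E[S^2]. */
double mg1_response_time(double lambda, double mean_s, double second_moment_s) {
    double rho = lambda * mean_s;
    assert(lambda >= 0.0 && mean_s > 0.0 && rho < 1.0);
    return mean_s + lambda * second_moment_s / (2.0 * (1.0 - rho));
}
```

As a consistency check, an exponential service time with rate mu has E[S] = 1/mu and E[S^2] = 2/mu^2, so with lambda = 8 and mu = 10 both functions give a mean response time of about 0.5. For a hyperexponential mix with branch probabilities p_i and rates mu_i, the moments are E[S] = sum of p_i/mu_i and E[S^2] = sum of 2*p_i/mu_i^2.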