Intel® Developer Zone:
Performance

Featured

Just published! Intel® Xeon Phi™ Coprocessor High Performance Programming
Learn the fundamentals of programming for this new architecture and new products. New!
Intel® System Studio
Intel® System Studio is a comprehensive suite of embedded software development tools that can shorten time to market, strengthen system reliability, and boost power efficiency and performance. New!
In case you missed it: replay of the 2-day live webinar
Introduction to High-Performance Application Development for Intel® Xeon & Intel® Xeon Phi™ Coprocessors.
Structured Parallel Programming
Authors Michael McCool, Arch D. Robison, and James Reinders use an approach based on structured patterns that should make the subject accessible to every software developer.

Deliver your application's best performance to your customers through parallel programming, with the help of Intel's innovative resources.

Development Resources

Development Tools

Intel® Parallel Studio

Offering simplified, end-to-end parallelism to Microsoft Visual Studio* C/C++ developers, Intel® Parallel Studio provides advanced tools to optimize client applications for multi-core and manycore processors.

Intel® Software Development Products

Explore all the tools that help you optimize for Intel architecture. Select tools are available for a free 45-day evaluation.

Tools Knowledge Base

Find guides and support information for Intel tools.

Intel® Xeon® E5-2600 v3 Product Family
By BELINDA L. (Intel), published on 09/08/2014
Based on Intel® Core™ microarchitecture (formerly codenamed Haswell) and manufactured on 22-nanometer process technology, these processors provide significant performance over the previous-generation Intel® Xeon® processor E5-2600 v2 product family. This is the first Intel® Xeon® processor fami...
How Intel® AVX2 Improves Performance on Server Applications
By Thai Le (Intel), published on 09/05/2014
The latest Intel® Xeon® processor E5 v3 family includes a feature called Intel® Advanced Vector Extensions 2 (Intel® AVX2), which can potentially improve application performance related to high performance computing, databases, and video processing. Here we will explain the context, and provide ...
What’s New in the Intel Compiler
By AmandaS (Intel), published on 08/25/2014
The list below summarizes new features in the Intel® C++ Compiler 15.0 and the Intel® Fortran Compiler 15.0. For more details about changes in the Intel compilers since the previous release, including a list of new options, please refer to the ‘What’s New’ section in the release notes (C++, Fortr...
OpenMP* 4.0 combined offload constructs support for the Intel® Xeon Phi™ coprocessor
By Kevin Davis (Intel), published on 08/22/2014
The Intel® Parallel Studio XE 2015 Composer Editions for Windows* and Linux* have feature enhancements that provide near full support of the OpenMP* 4.0 API (July 2013) specification. Extensions to the reduction clause and the new declare reduction construct added to support user defined reductio...
HTM/STM and Scheduling
By Simone A.
Hi, I have a question about hardware and software transactional memory. Given the types of versioning (eager and lazy) and conflict detection (optimistic and pessimistic), say that two or more threads are performing transactions that read and write the same memory location. Could the scheduling of the threads affect the ability to detect a conflict? Which combination of versioning and conflict detection would be best for always catching conflicts? I hope my question is clear. Thanks. Best regards, Simone
Locking CPU cache lines for a thread (L1)
By Younis A.
Hi, I'm working on securing access to the L1 cache by locking it line by line. Is there any way to do it? For example, when two threads access the L1 cache, the lines each thread has accessed would be locked to that thread for a certain time. Regards, Younis
Responsive OpenMP Threads in Hybrid Parallel Environment
By Don K.
I have a Fortran code that uses both MPI and OpenMP. I have profiled it on an 8-core Windows laptop, varying the number of MPI tasks vs. OpenMP threads, and have some understanding of where performance bottlenecks for each parallel method might surface. The problem arises when I port it to a Linux cluster with several 8-core nodes: my OpenMP thread performance is very poor. Running 8 MPI tasks per node is significantly faster than 8 OpenMP threads per node (1 MPI task), and even runs with 2 OpenMP threads + 4 MPI tasks were very slow, more so than I could attribute solely to thread starvation. I saw a few related posts in this area and am hoping for further insight and recommendations on this issue. What I have tried so far ...
1. setenv OMP_WAIT_POLICY active ## seems to make sense
2. setenv KMP_BLOCKTIME 1 ## this is counter to what I have read but when I set this to a large number (2500...
Optimizing cilk with ternary conditional
By Fabio G.
What is the best way to optimize the loop cilk_for(i=0;i<n;i++){ x[i]=x[i]<0?0:x[i]; } or something like that? Thanks, Fabio
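For reference, the ternary in the loop above clamps negative entries to zero. A minimal sketch of that operation in plain C++ follows. It is not Cilk code and not an answer from the thread; the function name `clamp_negatives_to_zero` and the `float` element type are assumptions for illustration. Writing the ternary as `std::max` gives a branch-free body that compilers can often vectorize, though whether that helps depends on the compiler and target.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the post): replaces every negative entry
// of x[0..n) with zero, equivalent to x[i] = x[i] < 0 ? 0 : x[i].
void clamp_negatives_to_zero(float* x, std::size_t n) {
    assert(x != nullptr || n == 0);  // an empty range may pass a null pointer
    for (std::size_t i = 0; i < n; ++i) {
        // std::max keeps x[i] when it is non-negative and yields 0 otherwise.
        x[i] = std::max(x[i], 0.0f);
    }
}
```

Because each iteration touches only its own element, the same body could be placed inside a parallel loop construct unchanged.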
Optimizing reduce_by_key implementation using TBB
By Shruti R.
Hello everyone, I'm quite new to TBB and have been trying to optimize a reduce_by_key implementation using TBB constructs. However, the serial STL code always outperforms the TBB code! It would be helpful to get an idea of how reduce_by_key can be improved using tbb::parallel_scan. Any help at the earliest would be much appreciated. Thanks.
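The post does not define reduce_by_key, so the sketch below assumes the common convention in which each run of consecutive equal keys is reduced to a single (key, sum) pair. It is a serial reference only, written with the standard library rather than TBB, and the name `reduce_by_key_serial` is hypothetical. A baseline like this can serve to check the output of a parallel version.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical serial reference (assumed semantics, not from the post):
// sums the values over each run of consecutive equal keys.
std::vector<std::pair<int, double>> reduce_by_key_serial(
        const std::vector<int>& keys, const std::vector<double>& values) {
    assert(keys.size() == values.size());
    std::vector<std::pair<int, double>> out;
    for (std::size_t i = 0; i < keys.size(); ++i) {
        if (!out.empty() && out.back().first == keys[i]) {
            out.back().second += values[i];        // same run: accumulate
        } else {
            out.emplace_back(keys[i], values[i]);  // new run: start a new pair
        }
    }
    return out;
}
```

Under this convention, a key that reappears after a different key starts a new output entry rather than merging with the earlier one.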
reading a shared variable
By VIKRANT G.
Hello everyone, I am relatively new to parallel programming and have the following question: is reading a shared variable (one that is not modified by any thread) without using locks good practice? Thanks in advance for the help.
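The situation described can be sketched in standard C++ as follows. This is an illustration under stated assumptions, not a reply from the thread: the function name `sum_shared_readonly` is hypothetical, the shared data is fully initialized before any thread starts, and no thread writes to it while the readers run. Under those conditions, concurrent reads without a lock are not a data race in the C++ memory model.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical example: several threads read the same vector with no lock.
// Each thread writes only to its own slot in `partial`.
long sum_shared_readonly(const std::vector<int>& shared, std::size_t num_threads) {
    assert(num_threads > 0);
    std::vector<long> partial(num_threads, 0);
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < num_threads; ++t) {
        workers.emplace_back([&shared, &partial, t, num_threads] {
            long local = 0;
            // Strided read-only pass over the shared data.
            for (std::size_t i = t; i < shared.size(); i += num_threads) {
                local += shared[i];
            }
            partial[t] = local;  // slot owned by this thread only
        });
    }
    for (auto& w : workers) {
        w.join();  // join before reading the partial sums
    }
    return std::accumulate(partial.begin(), partial.end(), 0L);
}
```

If any thread could modify the shared data during the reads, this would no longer hold and some form of synchronization (a lock or atomics) would be needed.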
Weird OpenMP bug
By Cheng C.
Dear all, I want to combine OpenMP with the RSA_public_encrypt and RSA_private_decrypt routines, but I have been confused by a weird bug for a few days. In the attached program, if I create 2 threads for parallel encryption and decryption, everything works well. If I create 3 or more threads, the RSA_public_encrypt routine still works fine: all strings are successfully encrypted (encrypt_len=256). However, the RSA_private_decrypt routine goes wrong: only one thread works properly, and all the other threads fail to decrypt some of the strings (decrypt_len=-1, rsa_eay_private_decrypt padding check failed). With 1000 strings and 4 threads, the total number of strings that fail to decrypt is around 710 (sometimes as low as around 200). As expected, if I use 4 threads for parallel RSA_public_encrypt and one thread for RSA_private_decrypt, nothing goes wrong. It would be great if you could give me some ideas. Thanks very much. #include <openssl/rsa.h> #include <...

Featured

Make performance thrive: using open source innovation powered by Intel tools


Get THE GUIDE and get started right away! Application threading, memory management, programming tools, and synchronization.
Intel Guide for Developing Multithreaded Applications


Fast, easy, and free!
Intel® Concurrency Checker


Imagine the future now.
Intel® AVX


Intel® Parallel Studio XE
Get free evaluation software