Intel® Developer Zone:
Performance

Highlights

Just published! Intel® Xeon Phi™ Coprocessor High Performance Programming
Learn the fundamentals of programming for this new architecture and these new products.
New! Intel® System Studio
Intel® System Studio is a comprehensive suite of integrated software development tools that can accelerate time to market, strengthen system reliability, and improve power efficiency and performance.
New! In case you missed it: replay of the two-day live webinar
Introduction to high-performance application development for Intel® Xeon® processors and Intel® Xeon Phi™ coprocessors.
Structured Parallel Programming
Authors Michael McCool, Arch D. Robison, and James Reinders use an approach based on structured patterns that should make the subject accessible to every software developer.

Optimize the performance of your applications through parallel programming and with the help of Intel's innovative resources.

Development Resources


Development Tools

 

Intel® Parallel Studio

Intel® Parallel Studio brings simplified end-to-end parallelism to Microsoft Visual Studio* C/C++ developers and provides advanced tools for optimizing client applications for multicore and manycore processing.

Intel® Software Development Products

Explore all the tools that help you optimize your applications for Intel architecture. Some tools are available for a free 45-day evaluation period.

Tools Knowledge Base

Find guides and support information for Intel tools.

Using SPIR for fun and profit with Intel® OpenCL™ Code Builder
By Robert Ioffe (Intel), published 02/23/2015 (0 comments)
This short tutorial provides a brief introduction to Khronos SPIR. It touches on the differences between a SPIR binary and an Intel proprietary Intermediate Binary, demonstrates ways to create SPIR binaries using tools shipped with Intel® INDE, and explains how to use SPIR binaries in your OpenCL...
Sharing Surfaces between OpenCL™ and DirectX* 11 on Intel® Processor Graphics
By Adam Lake (Intel), published 02/23/2015 (0 comments)
Contents: Introduction; Intel® Processor Graphics with Shared Physical Memory; Synchronization between OpenCL and DirectX 11; Overview of Surface Sharing between OpenCL and DirectX 11; Initialization; Writing to the shared surface; The Render Loop; Shutdo...
Bitonic Sorting
By Vadim Kartoshkin (Intel), published 02/12/2015 (0 comments)
Demonstrates how to implement an efficient sorting routine with the OpenCL™ technology that operates on an arbitrary input array of integer values. The sample uses properties of bitonic sequences and principles of sorting networks, and enables efficient SIMD-style parallelism through OpenCL vector dat...
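The sorting-network idea mentioned in this teaser can be illustrated with a minimal sketch. This is not the Intel OpenCL sample: it is a plain, sequential Python version, and the function name bitonic_sort and the power-of-two length restriction are assumptions made for illustration only.

```python
def bitonic_sort(values):
    """Return a sorted copy of `values` using a bitonic sorting network.

    Illustrative sketch only (hypothetical helper, not the Intel sample).
    Assumes len(values) is a power of two.
    """
    data = list(values)
    n = len(data)
    if n & (n - 1):
        raise ValueError("length must be a power of two")

    k = 2  # size of the bitonic sequences being merged in this stage
    while k <= n:
        j = k // 2  # distance between the two elements of each comparison
        while j >= 1:
            # Every compare-exchange in this pass touches a distinct pair,
            # so the whole pass could run in parallel (one work-item per pair).
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0  # direction of this sub-sequence
                    if (data[i] > data[partner]) == ascending:
                        data[i], data[partner] = data[partner], data[i]
            j //= 2
        k *= 2
    return data
```

Because the sequence of comparisons depends only on the array length and never on the data, the inner loop has no data-dependent control flow between pairs, which is the property that makes this kind of network a natural fit for SIMD-style execution.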
PinPlay: FAQ
By admin, published 02/10/2015 (0 comments)
I. How long does record/replay take? Record/replay overhead is a function of the number of memory accesses and the amount of sharing in the test program. 1. Time for recording/replaying a 'region'. Source: CGO2014 paper on DrDebug. 2. Slow-down for whole-program recording. Source: Measured wi...
Slowdown with OpenMP
By Matt S. (11 replies)
I'm getting some pretty unusual results from using OpenMP on a fractional differential equations code written in Fortran. No matter where I use OpenMP in the code, whether on an initialization loop or on a computational loop, I get a slowdown across the entire code. I can put OpenMP in one loop and it will slow down an unrelated one (timed separately)! The code is a bit unusual, as it initializes arrays starting at 0 (and some even at negative indices). For example:
    real*8 :: gx(0:Nx)
    real*8 :: AxLh(1-Nx:Nx-1), AxRh(1-Nx:Nx-1), AxL0(1-Nx:Nx-1), AxR0(1-Nx:Nx-1)
where Nx is, let's say, 512. Would that possibly have anything to do with the ubiquitous slowdown with OpenMP? Also, any ideas on reducing "pow" overhead in the following snippet would be greatly appreciated:
    do k = 1, 5
      hgck = foo_c(k)
      hgpk = foo_p(k)
      do j = 1, 100
        vx = vx + hgck * ux(x, t, foo(j) + hgpk)
      end do
    end do
where ux is a function defined by function ux(x,t,xi) impl...
Web crawling with "Intel Xeon Phi Coprocessors"
By Sunil K. (1 reply)
I am new to this forum. I want to implement parallel crawling on "Intel Xeon Phi Coprocessors" for my project. Before buying equipment, installing software, and starting to learn about this platform, I want to know whether it is possible to connect to the network and fetch web URLs in parallel using this technology. (I don't want to create a cluster of CPUs to do it. I want to do it using a single card.)
Intel MPI for Phi tuning tips?
By Ronald W Green (Intel) (3 replies)
Does setting I_MPI_MIC=enable change other MPI environment variables, particularly any that would tune MPI for the MIC system architecture? As a side question, has anyone written a Tuning and Tweaking guide for IMPI for Phi? For example, what I_MPI variables could one use to help tune an app targeting 480 ranks across 8 Phis? Thanks, Ron
Lock-free Java, or better scaling on multi-core systems
By William L. (0 replies)
Everyone these days has to address multi-core issues, or vertical scaling, at least on the server side of things. There does not seem to be a general approach, so we end up re-architecting our applications every time we add cores. At the same time, the availability of many-core processors seems to be constrained by the lack of a reasonable software technology to make good use of them. Actors seem like a good approach and allow you to write fast, lock-free code. But large actor-based systems are not robust. Most actor implementations require applications to implement a state machine per actor for determining what messages are to be processed, and maintaining a large number of interacting state machines is well beyond the abilities of most developers. This is very sad, as the throughput of actor-based applications typically scales with the number of cores. I've worked on this problem for a number of years now and have developed a simple variation on actors which supports non-blockin...
igzip for VS10 C++?
By David L. (6 replies)
I was searching for a faster zlib-compatible compressor and came across the paper describing igzip, "High Performance DEFLATE Compression on Intel Architecture Processors". igzip looks like exactly (!) what I am looking for: compatible with zlib, but faster. However, the downloadable source was for Linux, and I need it for a VS10 C++ project. I have successfully (I think) compiled and assembled the desired modules (common, crc, crc_utils, hufftables, hufftables_c.cpp, igzip0c_body, igzip0c_finish, init_stream) into a .lib. But when I attempt to link the library into my project, I get error LNK2019: unresolved external symbol fast_lz (and init_stream) from where they are called. I also have a "C" lz4 compression library linked into the project, and it works fine. I have spent 3 days playing with it, looking for the clue that will unlock the symbols, but no luck so far. I get no other warnings or errors during the compiling/assembling of the library or project. Any help (esp...
OpenCL vs Intel Cilk Plus Issues, Differences and Capabilities
By Yaknan G. (0 replies)
I am curious about the differences between OpenCL and Intel Cilk Plus. They are both parallel programming paradigms that are receiving wide recognition, but technically speaking, is one better than the other, or are they simply different? Also, what yardstick should I use when choosing between the two for an embarrassingly parallel problem? Please, I need answers. Thanks! Yaknan
Thread complexion (Multi-threading)
By Masood Ali M. (4 replies)
Hello everyone, the other day I was trying to create a thread that could capture the working of an already existing (working) thread and copy its working. I was also setting thread priorities so that threads can capture the working of other threads at the same priority level, and dynamically increasing the thread capacity to handle similar kinds of work. I would appreciate it if anybody could help with this. Thanks. -Ali
The list of out-of-order CPUs
By bp (1 reply)
Hi, I would like to know the list of commercial products (CPUs / SoCs) made by Intel that support out-of-order execution. I noticed that the new Baytrail architecture apparently supports this kind of execution, but I have no information about other architectures: Xeon, iCore, previous Atoms, Celerons, and Pentiums. At this point I also have no specific information about the subsets of a given family. For example, Baytrail is usually split into Baytrail-M and Baytrail-T, and I can only speculate that this new out-of-order support applies to both. It would also be really nice if you could spend some time describing the support for this kind of memory model provided by open source compilers such as gcc and clang. Thanks.
