Intel® Developer Zone:
Performance

Highlights

Recently published! Intel® Xeon Phi™ Coprocessor High Performance Programming
Learn the basics of programming for this new architecture and these new products. New!
Intel® System Studio
Intel® System Studio is a comprehensive suite of integrated software development tools that can shorten time to market, increase system reliability, and improve energy efficiency and performance. New!
In case you missed it - Replay of the two-day webinar
Introduction to high-performance application development for Intel® Xeon processors and Intel® Xeon Phi™ coprocessors.
Structured Parallel Programming
Authors Michael McCool, Arch D. Robison, and James Reinders use an approach based on structured patterns that can make the subject accessible to any software developer.

Give your customers the best application performance with parallel programming and the help of Intel's innovative resources.

Development resources


Development tools

 

Intel® Parallel Studio

Bringing simplified, end-to-end parallelism to Microsoft Visual Studio* C/C++ developers, Intel® Parallel Studio provides advanced tools for optimizing client applications for multicore and manycore.

Intel® software development products

Explore all the tools that help you optimize for Intel architecture. Selected tools are available for a free 45-day evaluation period.

Tools knowledge base

Guides and support information for Intel tools.

Easy SIMD through Wrappers
By admin Published on 03/27/2015 0
SIMD operations are widely used for 3D graphics applications. This tutorial provides new insights into SIMD by comparing SIMD lanes and CPU threads, and steps you through the process of creating a simple, straightforward SIMD implementation in your own code.
Abaqus/Standard Performance Case Study on Intel® Xeon® E5-2600 v3 Product Family
By Khang Nguyen (Intel) Published on 03/27/2015 2
Background: The whole point of simulation is to model the behavior of a design and potential changes against various conditions to determine whether we are getting an expected response; and simulation in software is far cheaper than building hardware and performing a physical simulation and modif...
Avoid frequency drop in GPU cores when executing applications in Heterogeneous mode
By Anoop Madhusoodhanan Prabha (Intel) Published on 03/23/2015 0
Introduction: Intel® C++ Compiler 15.0 provides a feature that enables offloading general-purpose compute kernels to processor graphics. This feature makes the processor graphics silicon area available for general-purpose computing. The key idea is to utilize the compute power of both CPU cores and GP...
Intel Cluster Ready FAQ: Software vendors (ISVs)
By Werner Krotz-vogel (Intel) Published on 03/23/2015 0
Q: Why should we join the Intel Cluster Ready program? A: By offering registered Intel Cluster Ready applications, you can provide the confidence that applications will run as they should, right away, on certified clusters. Participating in the program will help you increase application adoption, e...
Subscribe to Intel Developer Zone Articles
The Developer's Conference 2012: Intel Software makes its presence felt
Author: rajani bhargava Published on 13/07/12 0
The Developer’s Conference. That is the name of the major event that took place last week, bringing together developers of many technologies in São Paulo. One of the event's hallmarks is its coverage of many programming languages. The developer has an extraordinary volume of ...
First chat between developers and Intel
Author: George H. Silva (Intel) Published on 29/06/12 0
On 26/06/2012 we held the first round table of The AppDate São Paulo, with a chat between developers and Intel. I will recount a little of what happened, partly as a warm-up for our next meeting, which will take place in July. Intel brought Luciano Palma, Community Manager for Servidore...
Intel Brasil increases investment in its Software and Services group
Author: rajani bhargava Published on 01/06/12 0
Intel Brasil announces increased investment in software and services in order to support the growth of the country's entire software ecosystem, which includes independent software vendors, developers, universities, technology parks, and agencies of ...
Intel Software Network – now in Brazil!
Author: rajani bhargava Published on 31/05/12 0
Launch of the Intel Software Network in Portuguese at IDF 2012. Those who attended the Intel Developer Forum 2012 in São Paulo could see the importance Intel is placing on Brazil, announcing a significant increase in investment in the country. One of these investments is the expansion of the Group ...
Subscribe to Intel® Developer Zone Blogs
Memory to CPU (mov) bandwidth limitations
By albus d. 3
(Sorry for my weak English, I am not a native speaker. I am not sure this is the right forum; it is my first time here. This is a general question about some hardware limits whose technical reason I do not understand and would very much like to know.) We now have parallelised SIMD arithmetic (like 8 float muls or divisions in one step), so the theoretical (but also nearly practical) arithmetic bandwidth per core is something like 4 GHz * 8 floats = about 30 GFLOPS per core. But AFAIK we still have quite low RAM-to-CPU bandwidth, at the level of a read or write of 1 or 2 ints or floats per nanosecond; when I test it, this RAM-to-CPU bandwidth is only about 2 GLOP per second per core or something like that. (Both values are rough, but this difference seems to be a physical truth, at least in my experience.) I mean arithmetic can be parallelised (like 8-vectorised) but load/store movs are not, thus SIMD parallelisation has only a fraction of its potential power. It is extremely crucial to increase this memory bandwidth (muc...
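The rough arithmetic in the post above can be written out as a small sketch. It takes the post's own figures (4 GHz, 8 float lanes, about 2 billion floats per second from memory) and assumes, as a simplification, one vector operation per cycle. The function names `peak_gflops` and `flops_per_loaded_float` are hypothetical and only serve this illustration; the numbers are the poster's estimates, not measurements.

```cpp
// Rough peak arithmetic rate in GFLOPS: clock (GHz) times SIMD lanes.
// Assumption: one vector operation retires per cycle.
double peak_gflops(double clock_ghz, int simd_lanes) {
    return clock_ghz * simd_lanes;
}

// How many arithmetic operations the core could perform in the time it
// takes to move one float to or from memory. mem_gfloats_per_s is the
// memory rate in billions of floats per second (hypothetical parameter).
double flops_per_loaded_float(double gflops, double mem_gfloats_per_s) {
    return gflops / mem_gfloats_per_s;
}

// Usage with the figures from the post:
//   peak_gflops(4.0, 8)               gives 32 (the post rounds to "about 30")
//   flops_per_loaded_float(32.0, 2.0) gives 16
```

Under these assumptions a kernel would need on the order of 16 arithmetic operations per float moved before it stops being limited by memory traffic, which is the gap the poster is describing.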
speedup problem using openMP in intel fortran
By bohluly 2
Dear all, I have developed a program and unfortunately it has a speedup problem. My program is very big, so I have written a sample similar to it; fortunately this simple program has the same problem as my program. I would appreciate your experience and help if possible. Thanks. I am using VS2010 and Intel FORTRAN XE 2011. Program:
      TYPE var
         REAL(8),POINTER :: A, B, C
      END TYPE var
      REAL(8),POINTER :: A(:), B(:), C(:)
      TYPE(var),POINTER  :: vars(:)
      TYPE(var),POINTER :: varOMP
      REAL*8  t1,t2 ,ai,bi,ci,di,ei,fi
      INTEGER(4) c1,c2
      INTEGER N, CHUNKSIZE, I, id, f , l
      PARAMETER (N=200)
      PARAMETER (CHUNKSIZE=10)
      Allocate (A(N), B(N), C(N),vars(N))
!     initializations
      DO I = 1, N
         A(N)      =   I * 1.0
         B(N)      =   A(N)
         vars(I)%A =>  A(N)
         vars(I)%B =>  B(N)
         vars(I)%C =>  C(N)
         vars(I)%A = 0.51
         vars(I)%B...
How can I verify license key?
By Aleksandr S. 1
I have bought a few Xeon Phi units. The reseller provided keys for Intel Parallel Studio. I think they are 6-month demo keys. However, I'd like to know for sure. Is there a way I can check the terms of these keys directly with Intel, without activating them?
Doubts before buy Intel Studio
By Marcelo C. 2
Hi all, I have some questions about the Intel software studio for parallel architectures, and the Brazilian reseller is not able to answer them. I need to resolve them before buying the Studio for my company. Can somebody help me? 1- We are currently using OpenMPI. What advantages does Intel MPI provide over OpenMPI? 2- OpenMPI error handling is not good. Is the Intel MPI library better at error handling and recovery? For example, if one rank in my MPI comm world dies, how can I handle this using the Intel library? 3- We currently use GCC. Is the Intel compiler better? We are running on a cluster with several nodes, with MPI doing the communication between the nodes. Any other recommendations? We host our application on Amazon. Thank you all in advance!
Openmp task and parallel construct
By Patrice l. 1
Hi, I am trying to understand the behavior of the OpenMP implementation when a parallel do is enclosed in a task. When nested parallelism is used, the parallel do uses multiple threads. The first question: is it possible to restrict the number of threads to the original thread pool (hardware threads), so that they work on the parallel construct as they become available after completing other tasks? (see code below) From reading the forum, I suspect the answer will be no; then what is the best way to combine task and parallel do, inside a task and outside a task? Is it worth closing the master or single region to do a parallel one, and reopening it right after? Last question: is there any benchmark of using tasks for a loop instead of a classic parallel do, in both cases, fixed workload and variable workload per iteration? Thanks
program omptest
use omp_lib
implicit none
integer :: i
!$omp parallel
!$omp master
print *,'omp get max threads',omp_get_max_thr...
Draining store buffer on other core
By Boris D. 10
Hello, I have a weird question: as I understand it, the mfence instruction drains the store buffer of the core on which it is executed. Is there some way for a thread on core A to cause draining of the store buffer of core B, without running on core B? Maybe some dirty tricks like simulating IO or exception interrupts? Thanks!
TBB error : atomic is undefined
By Aleksandr S. 1
I have C++ code in VS2013 using Intel Compiler XE 15. I write #include "tbb/atomic.h" ...atomic<int> x; and I get: identifier 'atomic' is undefined. What did I do wrong?
Thread heap allocation in NUMA architecture lead to decrease performance
By hamed i. 4
Hi, I have a server with 80 logical cores (model: dl580g7). I am running a single thread per core. Each thread does MKL FFTs, convolutions, and many heap allocations and deallocations with malloc. I previously had a server with 16 logical cores, where there was no problem and each thread ran on its core at 100% CPU usage. When I moved my application from that 16-core server to this 80-core server with a NUMA architecture, the first thread I created ran at 100% (kernel time 0%), but with each additional thread the performance of the other threads decreased, so that with 80 threads CPU usage dropped to 40% (39% kernel time). Because kernel time increased, I think the cause is the sequential heap mechanism and the heap lock: as demand for memory allocation grows, the waiting time for each request increases. I used createheap() on each thread to eliminate waiting for the heap lock, but heapalloc can allocate memory only up to 512KB. that Insuffic...
Subscribe to Forums
