Article

Putting Your Data and Code in Order: Data and layout - Part 2

This pair of articles on performance and memory covers basic concepts to give guidance to developers seeking to improve software performance. This article expands on the concepts discussed in Part 1 to consider parallelism: vectorization (single instruction, multiple data, or SIMD), shared-memory parallelism (threading), and distributed-memory computing.
Authored by David M. Last updated on 02/05/2016 - 16:49
Article

Putting Your Data and Code in Order: Optimization and Memory – Part 1

This series of two articles discusses how data and memory layout affect performance and suggests specific steps to improve software performance. The basic steps shown in these two articles can yield significant performance gains. The articles are written at an intermediate level. It is assumed the reader wants to optimize software performance using common C, C++ and Fortran* programming...
Authored by David M. Last updated on 02/05/2016 - 16:43
Forum topic

Highest valid sub-leaf index of CPUID(EAX = 0DH)

Authored by Jeremy W. Last updated on 02/05/2016 - 11:09
Article

Intel® Compiler Options for Intel® SSE and Intel® AVX generation (SSE2, SSE3, SSSE3, ATOM_SSSE3, SSE4.1, SSE4.2, ATOM_SSE4.2, AVX, AVX2, AVX-512) and processor-specific optimizations

Explains which Intel® Compiler switches to use to target and optimize for a specific platform, microarchitecture, CPU or processor.
Authored by Martyn Corden (Intel) Last updated on 02/04/2016 - 13:25
Blog post

Compiling for the Intel® Xeon Phi™ processor x200 and the Intel® AVX-512 ISA

Authored by Loc N. (Intel) Last updated on 02/03/2016 - 14:10
Forum topic

SGX - Self-modifying Code

Is self-modifying code allowed in SGX enclaves?  I created a simple example that just calls a function stored in a data buffer.  I changed the properties for the enclave DLL so that data is also ex

Authored by John C. Last updated on 02/01/2016 - 22:16
Article

Talk: How to Optimize Your Code Without Being a Parallel Computing "Ninja"

Don't miss the Intel talk "How to Optimize Your Code Without Being a Parallel Computing 'Ninja'", which will be given during the Week on Massively Parallel Programming in Petrópolis, RJ, at the Laboratório Nacional de Computação Científica (LNCC, the National Laboratory for Scientific Computing). Date: 02/02/2016, 11:30 a.m. Location: LNCC, Av. Getúlio Vargas, 333, Quitandinha, Petrópolis/RJ
Authored by IGOR F. (Intel) Last updated on 01/28/2016 - 18:51
Article

Software Occlusion Culling

This article details an algorithm and associated sample code for software occlusion culling which is available for download. The technique divides scene objects into occluders and occludees and culls occludees based on a depth comparison with the occluders that are software rasterized to the depth buffer. The sample code uses frustum culling and is optimized with Streaming SIMD Extensions (SSE)...
Authored by Kiefer Kuah (Intel) Last updated on 01/28/2016 - 15:09
Article

Reference Implementations for Intel® Architecture Approximation Instructions VRCP14, VRSQRT14, VRCP28, VRSQRT28, and VEXP2

We are providing source files containing reference implementations for the scalar versions of 10 approximation instructions introduced in the "Intel® Architecture Instruction Set Extensions Programming Reference" document.
Authored by admin Last updated on 01/26/2016 - 14:51
Forum topic

PIN Failure to initialize DLL file python27.dll

Hi.

I have a PIN tool that uses Python. The problem I'm having is that there is an error when PIN tries to load it.

Authored by Albert g. Last updated on 01/25/2016 - 15:02