Forum topic

CPI rate blows up


Created by Alexander L. Last updated on 20/01/2017 - 16:09
Forum topic

How to speed up this code?

Hello everyone,

Many thanks to all who contributed to my previous question.

Created by Alexander L. Last updated on 19/01/2017 - 02:12

Software Occlusion Culling

This article details an algorithm and associated sample code for software occlusion culling which is available for download. The technique divides scene objects into occluders and occludees and culls occludees based on a depth comparison with the occluders that are software rasterized to the depth buffer. The sample code uses frustum culling and is optimized with Streaming SIMD Extensions (SSE)...
Created by Kiefer Kuah (Intel) Last updated on 17/01/2017 - 11:59
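
The depth comparison described in the occlusion culling summary above can be illustrated with a minimal scalar sketch. It is not the article's sample code: the `OccludeeRect` type, the `is_occluded` function, the row-major float depth buffer, and the "smaller depth is closer" convention are all assumptions made for illustration, and the SSE optimization the article mentions is omitted.

```c
#include <stddef.h>

/* Hypothetical screen-space bounds of an occludee (inclusive pixel
   coordinates) plus the depth of its point nearest the camera.
   Assumed convention: smaller depth means closer to the camera. */
typedef struct {
    int x0, y0, x1, y1;
    float nearest_depth;
} OccludeeRect;

/* Returns 1 if every depth-buffer pixel covered by the rectangle holds
   an occluder strictly closer than the occludee's nearest point,
   otherwise 0. The depth buffer is assumed to be row-major and to have
   been filled by rasterizing the occluders, with uncovered pixels left
   at the far value. */
static int is_occluded(const float *depth, int width, int height,
                       OccludeeRect r)
{
    /* Clamp the rectangle to the buffer. */
    if (r.x0 < 0) r.x0 = 0;
    if (r.y0 < 0) r.y0 = 0;
    if (r.x1 > width - 1) r.x1 = width - 1;
    if (r.y1 > height - 1) r.y1 = height - 1;

    /* Nothing on screen to test: conservatively report "not occluded"
       and leave the decision to frustum culling. */
    if (r.x0 > r.x1 || r.y0 > r.y1) return 0;

    for (int y = r.y0; y <= r.y1; ++y) {
        for (int x = r.x0; x <= r.x1; ++x) {
            /* One pixel where the occludee could be in front is enough
               to keep it. */
            if (r.nearest_depth <= depth[(size_t)y * (size_t)width + (size_t)x])
                return 0;
        }
    }
    return 1;
}
```

The test is conservative by design: an occludee is culled only when no covered pixel could show it, so a wrong answer costs some rendering time rather than a missing object.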
Forum topic

mitigating permute costs in AVX 256?

Hello, I'm investigating conversion of a number of compute kernels from AVX 128 to AVX 256 and would appreciate any guidance which might be available on getting a small number of operations on port

Created by Todd West Last updated on 15/01/2017 - 09:21
Forum topic

_mm_prefetch usage



Created by Ioan H. Last updated on 15/01/2017 - 06:01
Forum topic

Is xend treated as a full memory barrier?

I've started attempting to learn RTM extensions. The most common examples I can find online are using them to implement a mutex or concurrent lock. Often they are similar to:

Created by william laeder Last updated on 13/01/2017 - 08:05

Part 3: Expressing Parallelism with Vectors

Episode 3 of the “Hands-On Workshop (HOW) series on parallel programming and optimization with Intel® architectures” introduces data parallelism and automatic vectorization.

Last updated on 12/01/2017 - 14:45
Forum topic

Code scales poorly with AVX

This code scales poorly with AVX on my Sandy Bridge, how can I make it more vectorizer friendly:

Created by CommanderLake Last updated on 11/01/2017 - 18:32
Blog post

Resetting the lowest n set bits

A couple of years ago, the Bit Manipulation Instruction Set 1 (BMI1) introduced the BLSR instruction, which resets the lowest set bit.

Created by Thomas Willhalm (Intel) Last updated on 10/01/2017 - 00:54
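
For readers unfamiliar with BLSR, its effect matches the well-known identity `x & (x - 1)`. The sketch below shows that identity in portable C, followed by a naive loop that repeats it n times. Both function names are hypothetical, and the loop is only a baseline for illustration; the teaser does not say which technique the blog post itself uses.

```c
#include <stdint.h>

/* Portable equivalent of BLSR: clears the lowest set bit of x.
   Subtracting 1 flips the lowest set bit and every bit below it,
   so ANDing with x keeps only the bits above it. Returns 0 for x == 0. */
static uint32_t reset_lowest_set_bit(uint32_t x)
{
    return x & (x - 1u);
}

/* Hypothetical baseline: clear the lowest n set bits by repeating the
   single-bit step. Stops early once no set bits remain. */
static uint32_t reset_lowest_n_set_bits(uint32_t x, unsigned n)
{
    while (n > 0 && x != 0) {
        x = reset_lowest_set_bit(x);
        --n;
    }
    return x;
}
```

For example, with `x = 0x2C` (binary `101100`), one step gives `0x28` (`101000`) and two steps give `0x20` (`100000`). Compilers targeting BMI1 can typically turn the single-bit step into one BLSR instruction.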
Forum topic

Parallelization + Vectorization using OpenMP in Sandy Bridge


I would like to ask a question about parallelization + vectorization:

Created by Claudia W. Last updated on 09/01/2017 - 00:05
For more complete information about compiler optimizations, see our Optimization Notice.