Intel® Advanced Vector Extensions

Links to instruction documentation

Putting Your Data and Code in Order: Data and layout - Part 2

This pair of articles on performance and memory covers basic concepts to guide developers seeking to improve software performance. This paper expands on the concepts discussed in Part 1 to consider parallelism: vectorization (single instruction, multiple data, or SIMD), shared-memory parallelism (threading), and distributed-memory computing.
  • Developers
  • Students
  • Server
  • Windows*
  • C/C++
  • Fortran
  • Intermediate
  • Intel® Advisor
  • Intel® Cilk™ Plus
  • Intel® Threading Building Blocks
  • Intel® Advanced Vector Extensions
  • OpenMP*
  • Code Modernization
  • Intel® Many Integrated Core Architecture
  • Optimization
  • Parallel Computing
  • Threading
  • Vectorization
  • Highest valid sub-leaf index of CPUID(EAX = 0DH)

    I am referring to the ISA extensions document at <https://software.intel.com/sites/default/files/managed/07/b7/319433-023..... (page 2-18)

    The highest valid sub-leaf index, n, is

    (POPCNT(CPUID.(EAX=0D, ECX=0):EAX) + POPCNT(CPUID.(EAX=0D, ECX=0):EDX) - 1)

    How is this formula for the highest valid sub-leaf index of CPUID.0DH obtained?
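One way to explore the question is to evaluate the quoted formula next to a plain "highest set bit" computation. The sketch below assumes that each sub-leaf index of CPUID leaf 0DH corresponds to a bit position in the 64-bit mask EDX:EAX returned by sub-leaf 0. Under that assumption the formula matches the highest set bit only when the set bits are contiguous starting at bit 0. The function names and the sample mask values are hypothetical and chosen for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Count set bits in a 32-bit value (portable stand-in for POPCNT). */
static unsigned popcount32(uint32_t v)
{
    unsigned n = 0;
    while (v) {
        v &= v - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* The quoted formula: POPCNT(EAX) + POPCNT(EDX) - 1. */
static int subleaf_by_formula(uint32_t eax, uint32_t edx)
{
    return (int)(popcount32(eax) + popcount32(edx)) - 1;
}

/* Index of the highest set bit in the 64-bit mask EDX:EAX, or -1 if empty. */
static int subleaf_by_highest_bit(uint32_t eax, uint32_t edx)
{
    uint64_t mask = ((uint64_t)edx << 32) | eax;
    int idx = -1;
    while (mask) {
        mask >>= 1;
        idx++;
    }
    return idx;
}

/* Hypothetical mask values, not read from real hardware. */
static void check_examples(void)
{
    /* Bits 0..2 set, no gaps: both computations agree. */
    assert(subleaf_by_formula(0x07u, 0u) == 2);
    assert(subleaf_by_highest_bit(0x07u, 0u) == 2);

    /* Bits 0..2 and 5..7 set, gap at bits 3..4: they differ. */
    assert(subleaf_by_formula(0xE7u, 0u) == 5);
    assert(subleaf_by_highest_bit(0xE7u, 0u) == 7);
}
```

On GCC or Clang, the two inputs could be read with `__get_cpuid_count(0x0D, 0, &eax, &ebx, &ecx, &edx)` from `<cpuid.h>`; on MSVC, `__cpuidex` from `<intrin.h>` plays the same role.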

    Putting Your Data and Code in Order: Optimization and Memory – Part 1

    This series of two articles discusses how data and memory layout affect performance and suggests specific steps to improve software performance. The basic steps shown in these two articles can yield significant performance gains. These two articles are designed at an intermediate level. It is assumed the reader desires to optimize software performance using common C, C++ and Fortran* programming options.
  • Developers
  • Professors
  • Students
  • C/C++
  • Beginner
  • Intermediate
  • Intel® Math Kernel Library
  • MPI
  • Intel® Advanced Vector Extensions
  • Code Modernization
  • Parallel Computing
  • Vectorization
  • SGX - Self-modifying Code

    Is self-modifying code allowed in SGX enclaves?  I created a simple example that just calls a function stored in a data buffer.  I changed the properties for the enclave DLL so that data is also executable.  It worked when I compiled the project in simulation mode, but it crashes in hardware mode.

    Software Occlusion Culling

    This article details an algorithm and associated sample code for software occlusion culling which is available for download. The technique divides scene objects into occluders and occludees and culls occludees based on a depth comparison with the occluders that are software rasterized to the depth buffer. The sample code uses frustum culling and is optimized with Streaming SIMD Extensions (SSE) instruction set and multi-threading to achieve up to 8X performance speedup compared to a non-culled display of the sample scene.
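The depth comparison described above can be sketched in scalar C. This is not the sample's code, which is SSE-optimized and multi-threaded; it is a minimal illustration that assumes a row-major float depth buffer already filled by rasterizing the occluders, a convention where smaller depth means closer to the camera, and a hypothetical `ScreenRect` type holding the occludee's screen-space bounding rectangle.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical screen-space bounding rectangle of an occludee, in pixels. */
typedef struct {
    int x0, y0;   /* inclusive top-left pixel */
    int x1, y1;   /* exclusive bottom-right pixel */
} ScreenRect;

/*
 * Returns 1 if the occludee can be culled, 0 if it may be visible.
 * Assumption: smaller depth values are closer to the camera, so the
 * occludee is hidden only if every covered depth-buffer pixel is
 * strictly closer than the occludee's nearest depth.
 */
static int is_occluded(const float *depth, int width, int height,
                       ScreenRect r, float occludee_nearest_depth)
{
    assert(depth != NULL && width > 0 && height > 0);

    /* Clamp the rectangle to the buffer bounds. */
    if (r.x0 < 0) r.x0 = 0;
    if (r.y0 < 0) r.y0 = 0;
    if (r.x1 > width) r.x1 = width;
    if (r.y1 > height) r.y1 = height;

    /* Covers no pixels: nothing to draw, so report it as culled. */
    if (r.x0 >= r.x1 || r.y0 >= r.y1)
        return 1;

    for (int y = r.y0; y < r.y1; y++) {
        for (int x = r.x0; x < r.x1; x++) {
            size_t i = (size_t)y * (size_t)width + (size_t)x;
            /* The occludee is at least as close as the occluder here. */
            if (occludee_nearest_depth <= depth[i])
                return 0;
        }
    }
    return 1;
}
```

Testing the bounding rectangle with the occludee's nearest depth is conservative: it may keep an object that is actually hidden, but under the stated assumptions it does not cull one that is visible.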
  • Developers
  • Microsoft Windows* 10
  • Microsoft Windows* 8.x
  • Game Development
  • Windows*
  • C/C++
  • Intermediate
  • Intel® Threading Building Blocks
  • GameCodeSample
  • GameDev
  • simd
  • AVX2
  • Software Occlusion Culling
  • Intel® Advanced Vector Extensions
  • Intel® Streaming SIMD Extensions
  • Game Development
  • Graphics
  • PIN Failure to initialize DLL file python27.dll

    Hi.

    I have a PIN tool that uses Python. The problem I'm having is that there is an error when PIN tries to load it.

    I have read something about PIN no longer loading external libraries. Is that correct? Will I need to use LoadLibrary and GetProcAddress to get it working? Is there any way to avoid that? (There are tons of functions used, so it would be very painful.)

    I'm using the latest VC12 version:

    PIN unresolved external symbol xed_operand_values_set_rep

    Hi.

    I'm including:

    • xed.lib
    • pin.lib
    • pinvm.lib
    • ntdll-64.lib

    but I'm still getting this error: 

    Error 1 error LNK2019: unresolved external symbol xed_operand_values_set_rep referenced in function "void __cdecl LEVEL_CORE::INS_AddRep(class LEVEL_CORE::INDEX<6>)" (?INS_AddRep@LEVEL_CORE@@YAXV?$INDEX@$05@1@@Z) C:\pin-2.14-71313-msvc12-windows\source\tools\Tool\build\pin.lib(ins_api_xed_ia32.obj) pintool

    Why is the symbol xed_operand_values_set_rep not exported?

    intel xe composer compiler with AVX code

    Are there any cases where the Intel compiler gives worse performance than other compilers, especially the Visual Studio compiler v120? When I compiled a matrix multiplication sample that uses inline assembly AVX code with the Intel Composer XE 2013 SP1 compiler and with Visual C++ 2013 Ultimate, the Intel compiler's performance was worse than that of the Visual Studio v120 compiler. Can anyone help me understand the reasons for that?

    My machine has an Intel Core i5.

    Thank you
