Intel® Advanced Vector Extensions

Links to instruction documentation

small typo in Intel® 64 and IA-32 Architectures Software Developer’s Manual

Hi,

There seems to be a small typo in the Intel® 64 and IA-32 Architectures Software Developer’s Manual (Order Number: 253665-054US, April 2015), page 3-149 (CMPSS instruction):

128-bit Legacy SSE version: The first source and destination operand (first operand) is an XMM register. The second source operand (second operand) can be an XMM register or 64-bit memory location.

It should be a 32-bit memory location.

Regards,

BeatriX
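To illustrate the operand size in question, here is a minimal sketch in C using SSE intrinsics. The helper name `cmplt_ss_mask` is made up for this example. It assumes a compiler that maps `_mm_load_ss` and `_mm_cmplt_ss` to the scalar single-precision forms, so the memory operand is a single 4-byte float and only element 0 takes part in the comparison.

```c
#include <emmintrin.h>  /* SSE2; also pulls in the SSE header for _mm_load_ss and _mm_cmplt_ss */

/* Hypothetical helper: compare the low float of `a` with one float in memory (a[0] < *p).
   Returns the 32-bit mask written to element 0 of the result:
   0xFFFFFFFF if the comparison is true, 0 otherwise. */
static unsigned cmplt_ss_mask(__m128 a, const float *p)
{
    __m128 b = _mm_load_ss(p);      /* reads exactly one 32-bit float from memory */
    __m128 r = _mm_cmplt_ss(a, b);  /* compares element 0 only; elements 1..3 are copied from a */
    return (unsigned)_mm_cvtsi128_si32(_mm_castps_si128(r));
}
```

For example, with `a = _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f)` and a float `x = 2.0f`, calling `cmplt_ss_mask(a, &x)` compares 1.0 with 2.0 and touches only the four bytes of `x`.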

 

APIC dropping MSI-X interrupts

Hello, I have a difficult problem. The scenario is as follows:

The hardware environment is an Intel(R) Xeon(R) CPU E5-2609 v2 @ 2.50GHz and an Altera FPGA board.

The OS is Linux debian-rss 3.16.7-ckt7.

The FPGA issues 32 DMA transfers to the CPU and generates one interrupt per transfer.

These 32 interrupts are distributed across 8 different MSI-X IRQs.

According to the APIC spec, for each interrupt vector one interrupt may be held in the ISR and one in the IRR; a third may be dropped.

But now I distribute 2 interrupts to each IRQ, so why might interrupts still be dropped?
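The one-in-ISR, one-in-IRR behaviour described above can be sketched as a small state model in C. This is a deliberately simplified illustration, not the real hardware: it tracks a single vector, ignores priorities and interrupt masking, and all names (`vector_state`, `raise_irq`, `dispatch`, `eoi`) are made up for this example.

```c
#include <stdbool.h>

/* Simplified model of ONE interrupt vector in a local APIC.
   Assumption: one IRR bit (request pending) and one ISR bit (in service). */
struct vector_state {
    bool irr;        /* request received, not yet dispatched to the CPU */
    bool isr;        /* handler running, EOI not yet written */
    int  delivered;  /* number of handler invocations started */
    int  merged;     /* requests that arrived while the IRR bit was already set */
};

/* A new interrupt message for this vector arrives. */
static void raise_irq(struct vector_state *v)
{
    if (v->irr)
        v->merged++;    /* bit already set: collapses into the earlier request */
    else
        v->irr = true;
}

/* The CPU accepts the pending request, if nothing is in service for this vector. */
static void dispatch(struct vector_state *v)
{
    if (v->irr && !v->isr) {
        v->irr = false;
        v->isr = true;
        v->delivered++;
    }
}

/* The handler signals end of interrupt. */
static void eoi(struct vector_state *v)
{
    v->isr = false;
}
```

In this model, the sequence `raise_irq`, `dispatch`, `raise_irq`, `raise_irq` merges the third request, which matches the statement above. The sequence `raise_irq`, `raise_irq` with no `dispatch` in between already merges the second request. So, under these assumptions, two interrupts on one vector can still collapse into one if the first has not moved from the IRR to the ISR by the time the second arrives, for example while the CPU has interrupts disabled or is servicing a higher-priority vector.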

Dynamic Shift

Hello,

I am trying to achieve a dynamic shift. Let me explain the task. I process data with SSE and AVX. Data gets loaded and processed, and the results are stored later. To support arbitrary lengths, I need some kind of maskload, but for SSE as well.

Suppose my length is 9 elements, and I work with int32 and SSE. The first and second loads are fine. The third load is also fine with respect to memory bounds; that is not a problem. But only element 0 in the vector register is valid; the others need to be zero. How do I best achieve this?
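One possible approach, sketched here in C with SSE2 intrinsics, avoids the shift entirely: load a mask from a sliding window over a small constant table and AND it with the loaded data. This is only a sketch under the assumption stated in the post, namely that reading the full 16 bytes of the last vector is safe. The names `tail_mask_table`, `tail_mask`, `load_tail_epi32` and `sum_int32` are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <emmintrin.h>  /* SSE2 integer intrinsics */

/* Sliding-window table: four all-ones lanes followed by four zero lanes. */
static const int32_t tail_mask_table[8] = { -1, -1, -1, -1, 0, 0, 0, 0 };

/* Mask whose first n_valid lanes (0..4) are all-ones and whose remaining lanes are zero. */
static __m128i tail_mask(int n_valid)
{
    assert(n_valid >= 0 && n_valid <= 4);
    return _mm_loadu_si128((const __m128i *)(tail_mask_table + 4 - n_valid));
}

/* Load four int32 values from p and zero every lane with index >= n_valid.
   Assumes that reading all 16 bytes at p is safe. */
static __m128i load_tail_epi32(const int32_t *p, int n_valid)
{
    __m128i v = _mm_loadu_si128((const __m128i *)p);
    return _mm_and_si128(v, tail_mask(n_valid));
}

/* Example use: sum n int32 values, with the last partial vector masked. */
static int32_t sum_int32(const int32_t *data, int n)
{
    __m128i acc = _mm_setzero_si128();
    int i = 0;
    for (; i + 4 <= n; i += 4)          /* full vectors */
        acc = _mm_add_epi32(acc, _mm_loadu_si128((const __m128i *)(data + i)));
    if (i < n)                          /* 1..3 leftover elements */
        acc = _mm_add_epi32(acc, load_tail_epi32(data + i, n - i));
    int32_t lanes[4];
    _mm_storeu_si128((__m128i *)lanes, acc);
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}
```

With a length of 9, the third load uses `tail_mask(1)`, which reads the table at offset 3 and yields all-ones in lane 0 and zeros elsewhere. The unaligned mask load costs one extra memory read per tail, but it replaces any variable shift and needs no branches on the tail length.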

Java* Application Performance Improvement with Intel® Xeon® Processor E7 v3

Background

Java [1, 2] is a programming language used for developing applications that can run on any operating system (OS). To do that, Java applications need to be compiled to bytecode [3]. This bytecode can then be run on any Java Virtual Machine (JVM) [4] without recompiling. To run Java applications on OSs like Windows* and Linux*, a Java Runtime Environment (JRE) [7] must be installed.

  • Linux*
  • Server
  • Java*
  • JVM
  • Intel® Xeon® Processor
  • TYDIC*
  • Intel® QPI
  • Intel® TSX
  • Intel® AVX2
  • Intel® Advanced Vector Extensions
  • Cloud computing
  • Data center
  • Enterprise
  • Financial services industry
Why is my AVX slower than SSE?

    According to the article "IIR Gaussian Blur Filter Implementation using Intel® Advanced Vector Extensions",

    the AVX version should be faster than the SSE version. But my performance measurements are as follows:

     The computer supports AVX
    number CPU in the system = 4

     IIR Gaussian Filter Coefficients are:
    a0 = 0.021175, a1 = -0.017807, a2 = 0.021103, a3 = -0.017875, b1 = -1.837578, b2
     = 0.844174, cprev = 0.510583, cnext = 0.489409

    image width = 1024, height = 1024

    Running multi threaded SSE code

    Running multi threaded AVX code
