masked instructions in inline assembler


For testing purposes, I am writing short assembly snippets for Intel's Xeon Phi with the icc inline assembler. I now want to use masked vector instructions, but I cannot get the inline assembler to accept them.

For code like this:

vmovapd  -64(%%r14, %%r10), %%zmm0{%%k1} 

I get the error message

/tmp/icpc5115IWas_.s: Assembler messages:
/tmp/icpc5115IWas_.s:563: Error: junk `%k1' after register

I tried many different combinations, but none of them worked. The compiler version is intel64/13.1up03.


The compiler accepts a similar form of inline asm so it seems what you are trying should work.

#include <immintrin.h>
void foo()
{
    asm ("vmovapd -64(%r14, %r10), %zmm0{%k1}");
}

$ icpc -mmic t.cpp

Do you have a small complete asm source file I could use to reproduce that error?

I am also checking with others about this.

Thank you for the reply. I was able to verify that your code sample works. The main difference from the code I am using is that mine uses extended inline assembler. In the reproducer below, the first statement assembles and the second does not:

#include <immintrin.h>
int main(int argc, char**argv) {

    double * src = (double*) _mm_malloc( 64, 128 );

    __asm__("vaddpd -64(%r14, %r10), %zmm0, %zmm0{%k1}");       // basic asm: assembles

    __asm__("vaddpd -64(%[src], %%r10), %%zmm0, %%zmm0{%%k1}"   // extended asm: rejected

            :: [src]"r"(src) : );

    _mm_free(src);
    return 0;
}


icpc -mmic asmtest.cpp

>/tmp/icpcUyuKn1as_.s:42: Error: junk `%k1' after register

This is probably just a syntax error on my side. The extended assembler necessitates the use of %% instead of %.

Thank you for the reproducer. I will pass it along. I'm thinking this is an assembler defect but I'm still awaiting word from others and I'll let you know what I hear.

Development confirmed that both the braces and the percent signs must be doubled in the GNU-style extended syntax, so this is just a syntax error. The mask register needs double braces, so lines 8-10 in the above reproducer (the second __asm__ statement) should read:

   __asm__("vaddpd -64(%[src], %%r10), %%zmm0, %%zmm0 {{%%k1}}"
           :: [src]"r"(src) : );
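
To make the doubling rule concrete, here is a small sketch of a string helper that turns a basic-asm template into the form described above by doubling every %, { and }. The function name escape_for_extended_asm is hypothetical and not part of icc or any library; it only illustrates the rule as reported in this thread for icc's GNU-style extended syntax, and other compilers may escape braces differently.

```cpp
#include <string>

// Hypothetical helper, for illustration only: doubles every '%', '{' and '}'
// in a basic-asm template, which is the escaping this thread reports for
// icc's GNU-style extended inline assembler.
std::string escape_for_extended_asm(const std::string& basic_template) {
    std::string out;
    out.reserve(basic_template.size() * 2);
    for (char c : basic_template) {
        out.push_back(c);
        if (c == '%' || c == '{' || c == '}') {
            out.push_back(c);  // emit the special character a second time
        }
    }
    return out;
}
```

Applied to "vaddpd -64(%r14, %r10), %zmm0, %zmm0{%k1}", the helper returns "vaddpd -64(%%r14, %%r10), %%zmm0, %%zmm0{{%%k1}}". Operand placeholders such as %[src] keep their single percent sign, so they would have to be substituted in after escaping, not before.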

Thank you very much, I can confirm that that works.
