Jumping to Labels in Inline Assembly

Can I use labels in inline assembly such as this:

int main()
{
        __asm__(
                "jmp label_1;"
        );
label_1:
        return 0;
}

Mine doesn't seem to compile somehow...

EjO

Use Intel form of assembler:

        __asm {
                      jmp label_1;
  }

        label_1:

Jim Dempsey

www.quickthreadprogramming.com

I assume this is AT&T-style inline assembler. Did you get an error like this one?
...
..\Tests>g++ Test.cpp
..\Temp/ccqWaaaa.o(.text+0x2b):Test.cpp: undefined reference to `label_1'
collect2: ld returned 1 exit status
...

Thanks, Jim. It's something like that, but I want to write it in AT&T format.

Yes Sergey, I can't seem to find the right format/syntax using AT&T (@.@)

EjO

Please take a look at another thread with a similar problem:

Forum Topic: Jump labels in inline assembler
Web-link: http://software.intel.com/en-us/forums/topic/386259

Sergey,

Elbert was trying to jump from within an assembler block to a label declared in C/C++ code.

Numeric labels appear to work. This works:

int main()
{
        __asm__(
                "jmp 1;"
        );
label_1:
        __asm__("1:");
        return 0;
}

Awkward, but works.

Jim Dempsey

www.quickthreadprogramming.com

Thanks! But when I tried to check it with printf:

#include <stdio.h>

int main()
{
        __asm__(
                "jmp 1;"
        );
        printf("This must not appear!");
        __asm__("1:");
        return 0;
}

It seems that it doesn't jump.... (T.T)

EjO

Elbert,

Try using a label that starts with a letter and see if that works.  I tried code like yours and Jim's, which icc and gcc would compile but which would segfault at the jmp.  Swapping 1: for label1: fixed that for me.  For reference, here is the code I compiled:

#include <stdio.h>

int main()
{
    printf("Before jmp\n");
    __asm__("jmp label1;");
    printf("This must not appear\n");
    __asm__("label1:");
    printf("After jmp\n");
    return 0;
}

This is the relevant assembler produced with the -S flag:

 ..B1.2: # Preds ..B1.7
 # Begin ASM
 # Begin ASM
 jmp label1;
 # End ASM #6.0
 # End ASM
 # LOE rbx r12 r13 r14 r15
 ..B1.8: # Preds ..B1.2
 movl $.L_2__STRING.1, %edi #7.4
 xorl %eax, %eax #7.4
 ..___tag_value_main.10: #7.4
 call printf #7.4
 ..___tag_value_main.11: #
 # LOE rbx r12 r13 r14 r15
 ..B1.3: # Preds ..B1.8
 # Begin ASM
 # Begin ASM
 label1:
# End ASM #8.0
# End ASM

The compiler command line:

$ icc --version
icc (ICC) 13.1.3 20130607
Copyright (C) 1985-2013 Intel Corporation. All rights reserved.
$ icc -o asmtest asmtest.c

and the output:

$ ./asmtest
Before jmp
After jmp

and finally the relevant disassembly in idb (from a build with -g):

asmtest.c:6 __asm("jmp label1;");
0x00000000004005C1 eb 15 jmp 0x4005d8 <main+0x34>
asmtest.c:7 printf("This must not appear\n");
0x00000000004005C3 b8 f8 06 40 00 mov eax, 0x4006f8
0x00000000004005C8 48 89 c7 mov rdi, rax
0x00000000004005CB b8 00 00 00 00 mov eax, 0x0
0x00000000004005D0 e8 cb fe ff ff call 0x4004a0 <_init+0x28>
0x00000000004005D5 89 45 f4 mov dword ptr [rbp-0xc], eax
asmtest.c:9 printf("After jmp\n");
0x00000000004005D8 b8 10 07 40 00 mov eax, 0x400710

If I use the purely numeric labels you and Jim used, the jmp is translated into

0x00000000004005B9 e8 e2 fe ff ff call 0x4004a0 <_init+0x28>

which results in the segfault I mentioned at the top of my post.  For reference, the target address in that call instruction is in no man's land.

>>...Elbert was trying to jump from within an assembler block to a lable declared in C/C++ code.

What about a compromise (splitting the assembler blocks, if that is possible), as follows:

// Test034.cpp
// http://software.intel.com/en-us/forums/topic/404730
// http://software.intel.com/en-us/forums/topic/386259

#include "stdio.h"

int main( void )
{
// Test-case 1
/*
__asm__
(
"jmp label_1;"
);
label_1:
*/

// Test-case 2
///*
printf( "Test started\n" );

void *pvAddress = NULL;

__asm__
(
"prefetcht0 %0;" : : "m" ( pvAddress )
);
goto label_2;

printf( "That message should never be shown - 1\n" );

label_2:
__asm__
(
"prefetcht0 %0;" : : "m" ( pvAddress )
);
goto label_3;

printf( "That message should never be shown - 2\n" );

label_3:
//*/

printf( "Test completed\n" );

return ( int )0;
}

Of course, some restrictions may apply, depending on the complexity of the code, but at least Test-case 2 works.

Source code for the test is attached and output is as follows:

..\Tests>g++ Test034.cpp
...
..\Tests>a.exe
Test started
Test completed

Attachment: test034.cpp (text/x-c++src, 719 bytes)

Casey,

On the version of ICC that I use (2011.9.300), I can use 1:, but not label1:.

I am compiling under Windows with __asm__ (AT&T).

** When running, the jump target address is incorrect (a bug in ICC).

Jim Dempsey

www.quickthreadprogramming.com

Jim, 

Interesting differences between platforms.  As far as I know, on Windows inline asm is assembled by icc itself, while on Linux it is assembled by "as" (the GNU assembler).

After some testing, I can use purely numeric labels on Linux if I write the jump as "jmp 1f;", which jumps forward to the next 1: label.
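
A note on why "jmp 1f;" behaves differently from "jmp 1;": in GNU assembler syntax a numeric label such as 1: is a local label, referenced as 1f (nearest definition forward) or 1b (nearest definition backward). As far as I know, a bare 1 is read as a plain number rather than a label, which would fit the bogus target seen in the disassembly earlier in the thread. Below is a minimal sketch, assuming GCC or Clang on x86 or x86-64 with the GNU assembler. The function names are made up for the example. It keeps each jump and its label inside a single asm statement, because the compiler makes no promises about jumps between separate asm statements.

```c
/* Forward jump with a local numeric label.
   Returns 1 if the jump skipped the store, 2 if it did not. */
int skip_store_with_local_label(void)
{
    int value = 1;
    __asm__ volatile(
        "jmp 1f\n\t"        /* 1f: nearest "1:" after this point */
        "movl $2, %0\n\t"   /* skipped by the jump */
        "1:\n\t"
        : "+r"(value)
        :
        : "cc");
    return value;
}

/* Backward jump with a local numeric label.
   Counts loop iterations; assumes n >= 1. */
int count_iterations(int n)
{
    int count = 0;
    __asm__ volatile(
        "1:\n\t"
        "addl $1, %0\n\t"   /* count++ */
        "subl $1, %1\n\t"   /* n--, sets ZF when n reaches 0 */
        "jnz 1b\n\t"        /* 1b: nearest "1:" before this point */
        : "+r"(count), "+r"(n)
        :
        : "cc");
    return count;
}
```

Because 1: can be defined many times in the same file, this form also survives inlining of the enclosing function without duplicate-symbol errors.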

Thanks, guys, for the replies... By the way, I'm using Intel C++ Compiler 2013 on Windows with Visual Studio 2012.

The attached code by Sergey doesn't seem to compile .... 

Casey's code doesn't seem to compile with word labels ......... (T.T)

EjO

After trying out the suggestions and playing with the code, this seems to work:

#include <stdio.h>
#include <stdlib.h>

int main()
{
        printf("let's try this!\n");
        __asm__("mov $0x01, %al;"
                "cmp $0x00, %al;");
        __asm{jne wee};
        printf("This must not appear!\n");
wee:
        printf("Done!\n");
        system("PAUSE");
}

But it uses a combination of AT&T and Microsoft-style syntax .... I wish I could have it in pure AT&T/GNU-style assembly .... (@.@)

@Jim: Is it really a bug in Intel C++ Compiler?

EjO
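
For the pure AT&T/GNU-style version asked about here, GCC has the asm goto extension, which lets an inline asm statement jump to a C label listed in a fourth operand section. The sketch below assumes GCC 4.5 or later (or a recent Clang) targeting x86 or x86-64. I have not checked whether the Intel compiler on Windows accepts it, so treat that as an open question. The function names are hypothetical. The conditional variant keeps the compare and the jump inside one asm statement, because the flags are not guaranteed to survive between two separate asm statements.

```c
/* Unconditional jump from inline asm to a C label.
   Returns 0 if the jump was taken, 1 if execution fell through. */
int jump_to_c_label(void)
{
    __asm__ goto(
        "jmp %l[label_1]"
        : /* no outputs */
        : /* no inputs */
        : /* no clobbers */
        : label_1);
    return 1;               /* skipped by the jump */
label_1:
    return 0;
}

/* Conditional jump from inline asm to a C label.
   Returns 1 if x is nonzero, 0 otherwise. */
int is_nonzero(int x)
{
    __asm__ goto(
        "testl %0, %0\n\t"  /* sets ZF when x == 0 */
        "jnz %l[nonzero]"
        : /* no outputs */
        : "r"(x)
        : "cc"
        : nonzero);
    return 0;
nonzero:
    return 1;
}
```

Listing the label in the last section is what tells the compiler that control may leave the asm statement there, so it keeps the label and the code after it intact.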

You've posted AT&T style inline assembler in the initial post and since Intel C++ compiler is compatible with Microsoft C++ compiler the test case can not be compiled.

Since you've finally informed us that you're using Intel C++ compiler I could inform you that jumping functionality from inline assembler to C/C++ codes is supported and I verified it with three C++ compilers:

...
// icl.exe Test034.cpp - Compiled ( Intel )
// cl.exe Test034.cpp - Compiled ( Microsoft )
// bcc32.exe Test034.cpp - Compiled ( Borland )
...

For example, a file with the assembler code generated by the Borland C++ compiler is attached.

Attachment: test034.asm.txt (text/plain, 1.36 KB)

Quote:

Sergey Kostrov wrote:

You've posted AT&T style inline assembler in the initial post and since Intel C++ compiler is compatible with Microsoft C++ compiler the test case can not be compiled.

Since you've finally informed us that you're using Intel C++ compiler I could inform you that jumping functionality from inline assembler to C/C++ codes is supported and I verified it with three C++ compilers:

The assembler I posted was produced by icc (I was not calling the C++ compiler) with the -S flag, and it does assemble on Linux.  As noted in my earlier post, differences in how inline asm is assembled on each platform are causing confusion, with examples working for some but not all.  It has already been established (by Jim) that the code I posted does not work on Windows, but the Intel compiler on Linux will compile it and it does work there.

Elbert already explained that Intel C++ compiler version 2013 integrated with VS 2012 under Windows was actually used. Even if the initial test case created some confusion, I don't think that Elbert is interested in Linux, GCC and the AT&T style of inline assembler. Elbert, am I correct?

Sadly, ICC does not implement jumps in inline assembly (the subject of this topic). Consider the following snippet:

--- snip ---

  __asm__ __volatile__
    ("MULT_M_sse3_single:     \n\t"
     "mov    %0, %4           \n\t" // save %0(oL)
     // float tL0 = oL[0] + iL[0] * fL[0];
     // float tL1 = oL[1] + iL[1] * fL[1];
     "movaps (%1), %%xmm1     \n\t"
     "mulps  (%2), %%xmm1     \n\t"
     "addps  (%0), %%xmm1     \n\t"
     // TODO: Storing float[2] to 64bit MMX register is not a good idea.
     "movdq2q %%xmm1, %%mm0   \n\t" // save float[2] to mm0
     // loop
     "MULT_M_sse3_single_loop:\n\t"
     "prefetchnta 0x80(%1)    \n\t"
     "movaps     (%1), %%xmm0 \n\t" // xmm0,1,3,2
     "movaps 0x10(%1), %%xmm4 \n\t" // xmm4,5,7,6
     "add    $0x20, %1        \n\t"
     "prefetchnta 0x80(%2)    \n\t"
     "movsldup (%2), %%xmm1   \n\t"
     "movshdup (%2), %%xmm3   \n\t"
     "movsldup 0x10(%2), %%xmm5 \n\t"
     "movshdup 0x10(%2), %%xmm7 \n\t"
     "add    $0x20, %2        \n\t"
     "add    $0x20, %0        \n\t"
     "pshufd $0xb1, %%xmm0, %%xmm2 \n\t"
     "pshufd $0xb1, %%xmm4, %%xmm6 \n\t"
     "mulps  %%xmm1, %%xmm0   \n\t"
     "mulps  %%xmm5, %%xmm4   \n\t"
     "prefetcht2 0x80(%0)     \n\t"
     "addps  -0x20(%0), %%xmm0 \n\t"
     "addps  -0x10(%0), %%xmm4 \n\t"
     "mulps  %%xmm3, %%xmm2   \n\t"
     "mulps  %%xmm7, %%xmm6   \n\t"
     "addsubps %%xmm2, %%xmm0 \n\t"
     "addsubps %%xmm6, %%xmm4 \n\t"
     "dec    %3               \n\t"
     "movaps %%xmm0, -0x20(%0)\n\t"
     "movaps %%xmm4, -0x10(%0)\n\t"
     "jne    MULT_M_sse3_single_loop \n\t"
     "MULT_M_sse3_single_save: \n\t"
     // oL[0] = tL0;
     // oL[1] = tL1;
     "movq2dq %%mm0, %%xmm1   \n\t" // restore float[2] from mm0
     // EMMS takes approximately 58 clocks extra.
     "emms                    \n\t"
     "movlps %%xmm1, (%4)     \n\t" // restore float[2]
     :
     : "q"(oL), "q"(iL), "q"(fL), "r"(n/4), "r"(soL)
     : "memory");
  return;

--- snip ---

ICC chokes with an error saying that jumps in inline assembly are not implemented. Premier support told me it will not be implemented (at least in the near future). G++ does not have any problem with this snippet.

--
With best regards,
VooDooMan
-
If you find my post helpful, please rate it and/or select it as a best answer where applies. Thank you.
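
One thing that may be worth trying with a snippet like the one above: a named label such as MULT_M_sse3_single_loop becomes a real assembler symbol, so it can clash if the compiler emits the asm body more than once (for example after inlining). With GCC and Clang, the %= token expands to a number that is unique per asm statement instance, which gives readable labels without that risk. This is only a sketch under those assumptions. The function and label names are invented, the .L prefix assumes an ELF target, and I do not know whether it changes ICC's "not implemented" error.

```c
/* Sums n + (n-1) + ... + 1 with a named loop label.
   %= makes the label unique for each instance of this asm statement.
   Assumes n >= 1. */
int sum_to_n(int n)
{
    int sum = 0;
    __asm__ volatile(
        ".Lsum_loop%=:\n\t"
        "addl %1, %0\n\t"       /* sum += n */
        "subl $1, %1\n\t"       /* n--, sets ZF when n reaches 0 */
        "jnz .Lsum_loop%=\n\t"  /* loop while n != 0 */
        : "+r"(sum), "+r"(n)    /* both are modified, so both are in/out */
        :
        : "cc");
    return sum;
}
```

Declaring the modified registers as "+r" operands, rather than as plain inputs, is what lets the compiler know their values change inside the loop.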

IIRC it chokes at line #02 (the jump label)

MSVC 2012, ICC 13 (the latest update), Windows 8 x64, x64 build target

--
With best regards,
VooDooMan

Quote:

Sergey Kostrov wrote:

It was already explained by Elbert that Intel C++ compiler version 2013 integrated with VS 2012 under Windows was actually used. Even if initial test-case created some confusion I don't think that Elbert is interested in Linux, GCC and AT&T style of inline assembler. Elbert, am I correct?

Sergey,

I realized from the beginning that the Windows compiler was being used and had no intention of having someone switch platforms.  I posted my test case (which works) purely to contribute to the thread.  It was not until Jim's follow-up post that I did some digging and learned that inline assembly is handled differently on Windows and Linux, which makes the test case platform-specific; I contributed that knowledge to this thread as well.  There aren't many issues where the platform matters, but this is one of them.

By the way, I never mentioned GCC in my post. My test case compiles with Intel's icc (as you can see in my post); the __asm__ is assembled behind the scenes by the GNU assembler, but that is not something you interact with directly.  I also acknowledge that it was *my* test case that caused the confusion, since I posted something that met the OP's requirements but (unknown to me) did not work on his platform.  I'll also reiterate that the AT&T-style assembler was produced by icc and was posted only to show that correct assembly was generated from the test case; the format it is in does not matter for that confirmation.

Thank you, everyone, for the posts. They are a great help to me. I've been doing assembly for quite some time using LLVM on OS X and have become accustomed to the AT&T style of assembly. I have recently been studying inline assembly on Windows through Visual Studio for a project and saw that the Intel compiler supports the AT&T style, which I believe will make things quite a bit easier for me.

EjO

Some issues related to the subject of this thread date back to 2004. This is what I found in the Release Notes for Intel C++ compiler version 7.1 Update 29:

...
In an inline asm block, the conditional jumps, jcxz and jecxz, should
not be used to jump to another function. For instance, jcxz and jecxz
will not have the correct target location in the following code:

int main(void);
void DoneThat(void);
void BeenThere(void);
int main()
{
BeenThere();
return 0;
}
void BeenThere()
{
__asm jcxz DoneThat
__asm jecxz DoneThat
}
void DoneThat()
{
exit(0);
}
...

Note: This is only for everyone's knowledge. Thanks.

Quote:

Sergey Kostrov wrote:

Some issues related to the subject of this thread date back to 2004. This is what I found in the Release Notes for Intel C++ compiler version 7.1 Update 29:

...
In an inline asm block, the conditional jumps, jcxz and jecxz, should
not be used to jump to another function. For instance, jcxz and jecxz
will not have the correct target location in the following code:

This makes sense to me. If it were implemented, it could only work for C code. In C++, the compiler generates code to call the destructors of non-POD class/struct instances allocated on the stack, and even in C the stack needs to be unwound. The only exception is C++ exceptions, which work across function calls, but the compiler takes care of that and knows about this "nearly non-standard" event (the throw), unlike a jump into another function, about which the compiler knows nothing. This technique could corrupt the stack and make the program unstable. It is possible to do this without corrupting the stack, but it is what I call a big no-no and bad coding practice.

--
With best regards,
VooDooMan
