popfd compile error on win64


Here is what I am trying to achieve:
I read the EFLAGS value into a context structure "ctx" using the Microsoft GetThreadContext API.
Now I want to load the EFLAGS value from this structure into the current thread.
This is how I am doing it:

/* I learned that the EFLAGS register cannot be modified directly, so:
1. I load the value from the ctx structure into EAX
2. I push EAX onto the stack
3. I pop the value just pushed into the EFLAGS register */

    mov eax, dword ptr [ctx+0x044];   // as eflags is 32-bit I am using eax instead of rax
    push eax;
    popfd;

When I try to compile this, I see the following compile errors:

C:\1\stacktracing.c(81): (col. 5) error #13252: Unsupported instruction form in asm instruction push.
C:\1\stacktracing.c(82): (col. 5) error #13250: Opcode POPFD unsupported by architecture in asm instruction popfd.

1. Can't we push the 32-bit EAX onto the stack on a 64-bit machine? If not, is there an alternative?
2. Can't we use popfd on a 64-bit machine?

Thanks & Regards
Naveen


Do you use Intel or Microsoft C++ compiler on a 64-bit Windows platform?

I use icl, version shown below:

[C:\1]icl
Intel(R) C++ Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 12.0.5.229 Build 20120731
Copyright (C) 1985-2011 Intel Corporation. All rights reserved.

icl: command line error: no files specified; for help type "icl /help"

Let me know whether I am on the right path for what I want to achieve, or whether there is any drawback in the way I am doing it.

Do you use inline assembler in a C/C++ source file?

>>...Intel(R) C++ Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 12.0.5.229 Build 20120731...

I will verify push eax and popfd with version 13 of the Intel C++ compiler ( 64-bit ).

Quote:

Can't we push 32-bit eax  to a stack in 64-bit machine

No. In 64-bit mode PUSH and POP default to a 64-bit operand, and the 32-bit register form cannot be encoded, so use push rax / pop rax instead. Keeping every push 8 bytes wide also keeps the stack aligned on 8-byte boundaries.

Secondly, why are you bothering with eflags in 64-bit mode, instead of using rflags?

For definitive answers on these and related questions, refer to the Intel instruction set reference manuals:

http://download.intel.com/products/processor/manual/253666.pdf

http://download.intel.com/products/processor/manual/253667.pdf

@Sergey Kostrov

I didn't understand what you mean by "Do you use inline assembler in a C/C++ source file?"

I am writing the assembly as shown below:

_asm
{

//First restore eflags
//mov eflags, dword ptr [ctx+0x044];
mov eax, dword ptr [ctx+0x044];
push Rax;
popfq;

mov Rax,qword ptr [ctx+0x078] ;
mov Rcx,qword ptr [ctx+0x080];
mov Rdx,qword ptr [ctx+0x088] ;
mov Rbx,qword ptr [ctx+0x090];
mov Rsp,qword ptr [ctx+0x098] ;
mov Rbp,qword ptr [ctx+0x0a0];
mov Rsi,qword ptr [ctx+0x0a8] ;
mov Rdi,qword ptr [ctx+0x0b0];
mov R8,qword ptr [ctx+0x0b8] ;
mov R9,qword ptr [ctx+0x0c0];
mov R10,qword ptr [ctx+0x0c8] ;
mov R11,qword ptr [ctx+0x0d0];
mov R12,qword ptr [ctx+0x0d8] ;
mov R13,qword ptr [ctx+0x0e0];
mov R14,qword ptr [ctx+0x0e8] ;
mov R15,qword ptr [ctx+0x0f0];

mov cs, WORD ptr [ctx+0x038];
mov ds, WORD ptr [ctx+0x03a];
mov es, WORD ptr [ctx+0x03c];
mov fs, WORD ptr [ctx+0x03e];
mov gs, WORD ptr [ctx+0x040];
mov ss, WORD ptr [ctx+0x042];

//mov Rip,qword ptr [ctx+0x0f8];
jmp qword ptr [ctx+0x0f8];
done:
}

All general-purpose registers are updated properly, but I am seeing issues with EFLAGS and the segment registers.

@mecej4

I generated a .i file and saw that the CONTEXT structure on Win64 contains an EFLAGS field. That is why I am trying to restore EFLAGS instead of RFLAGS.
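The offsets used in the assembly in this thread (0x038 to 0x042 for the segment selectors, 0x044 for EFlags, 0x078 to 0x0f0 for Rax through R15, 0x0f8 for Rip) can be written down as a structure and checked at compile time. The sketch below is a hypothetical partial mirror built only from those offsets; the name CTX_MIRROR and the opaque head and gap arrays are assumptions for illustration. Real code would normally use CONTEXT from <windows.h> and its named fields rather than a mirror or raw offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the part of the Win64 CONTEXT layout that the
   assembly in this thread touches. Regions the thread never names are
   kept as opaque byte arrays. */
typedef struct CTX_MIRROR {
    uint8_t  head[0x38];   /* 0x00..0x37: not referenced in this thread */
    uint16_t SegCs;        /* 0x38 */
    uint16_t SegDs;        /* 0x3a */
    uint16_t SegEs;        /* 0x3c */
    uint16_t SegFs;        /* 0x3e */
    uint16_t SegGs;        /* 0x40 */
    uint16_t SegSs;        /* 0x42 */
    uint32_t EFlags;       /* 0x44: a 32-bit field, as noted above */
    uint8_t  gap[0x30];    /* 0x48..0x77: not referenced in this thread */
    uint64_t Rax, Rcx, Rdx, Rbx, Rsp, Rbp, Rsi, Rdi;        /* 0x78..0xb0 */
    uint64_t R8, R9, R10, R11, R12, R13, R14, R15;          /* 0xb8..0xf0 */
    uint64_t Rip;          /* 0xf8 */
} CTX_MIRROR;

/* Compile-time checks that the mirror matches the offsets in the posts. */
static_assert(offsetof(CTX_MIRROR, SegCs)  == 0x38, "SegCs offset");
static_assert(offsetof(CTX_MIRROR, SegSs)  == 0x42, "SegSs offset");
static_assert(offsetof(CTX_MIRROR, EFlags) == 0x44, "EFlags offset");
static_assert(offsetof(CTX_MIRROR, Rax)    == 0x78, "Rax offset");
static_assert(offsetof(CTX_MIRROR, R8)     == 0xb8, "R8 offset");
static_assert(offsetof(CTX_MIRROR, R15)    == 0xf0, "R15 offset");
static_assert(offsetof(CTX_MIRROR, Rip)    == 0xf8, "Rip offset");

/* Read the saved flags by name instead of by raw offset. */
uint32_t ctx_saved_eflags(const CTX_MIRROR *ctx)
{
    return ctx->EFlags;
}
```

With named fields, a wrong offset shows up as a compile error instead of a register silently loaded from the wrong place.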

>>...I couldn't get what you mean by "Do you use inline assembler in a C/C++ source file?"

In general, C/C++ source files have the extension *.c or *.cpp. Source files for pure assembler have the extension *.asm.

OK. I am using inline assembly in a "*.c" file.

In my first post you can see the file name in the error messages I posted.


>>OK. I am using inline assembly in a "*.c" file
>>
>>In my first post....you can see the file name in the error messages i posted.

I will be able to verify your code with Intel C++ Compiler XE 13.1.0.149 [ IA-32 & X64 ] ( Update 2 ). Unfortunately I don't use a 64-bit version of Intel C++ compiler version 12.x.


I didn't have any problems with a 64-bit Intel C++ compiler version 13.1.0.149 Build 20130118 ( Update 2 ). Here is my test case:

typedef struct tagCTX
{
__int64 reg[24];
} CTX;

int main( void )
{
CTX ctx = { 0x0 };

_asm
{
// First restore eflags
// mov eflags, dword ptr [ ctx+0x044 ]
mov eax, dword ptr [ ctx+0x044 ]

push Rax
popfq

mov Rax, QWORD PTR [ ctx+0x078 ]
mov Rcx, QWORD PTR [ ctx+0x080 ]
mov Rdx, QWORD PTR [ ctx+0x088 ]
mov Rbx, QWORD PTR [ ctx+0x090 ]
mov Rsp, QWORD PTR [ ctx+0x098 ]
mov Rbp, QWORD PTR [ ctx+0x0a0 ]
mov Rsi, QWORD PTR [ ctx+0x0a8 ]
mov Rdi, QWORD PTR [ ctx+0x0b0 ]
mov R8, QWORD PTR [ ctx+0x0b8 ]
mov R9, QWORD PTR [ ctx+0x0c0 ]
mov R10, QWORD PTR [ ctx+0x0c8 ]
mov R11, QWORD PTR [ ctx+0x0d0 ]
mov R12, QWORD PTR [ ctx+0x0d8 ]
mov R13, QWORD PTR [ ctx+0x0e0 ]
mov R14, QWORD PTR [ ctx+0x0e8 ]
mov R15, QWORD PTR [ ctx+0x0f0 ]

mov cs, WORD PTR [ ctx+0x038 ]
mov ds, WORD PTR [ ctx+0x03a ]
mov es, WORD PTR [ ctx+0x03c ]
mov fs, WORD PTR [ ctx+0x03e ]
mov gs, WORD PTR [ ctx+0x040 ]
mov ss, WORD PTR [ ctx+0x042 ]

mov Rip, QWORD PTR [ ctx+0x0f8 ]
// jmp QWORD PTR [ ctx+0x0f8 ]
Done:
}

return ( int )0;
}

For correctness, it should be

Quote:

Sergey Kostrov wrote:

...

_asm
{
// First restore eflags
// mov rflags, qword ptr [ ctx+0x044 ]
mov rax, qword ptr [ ctx+0x044 ]

push Rax
popfq

...

...since upper half of rflags (reserved - do not use) is filled with leftover trash in upper half of RAX.

Indeed, he or she should use the pushfq/popfq pair, since long mode has no pushfd or popfd instruction. If such instructions existed, they would misalign the stack pointer.

-- With best regards, VooDooMan - If you find my post helpful, please rate it and/or select it as a best answer where applies. Thank you.

>>...For correctness, it should be...

Marian,

Take a look at these two error messages:

>>...
>>C:\1\stacktracing.c(81): (col. 5) error #13252: Unsupported instruction form in asm instruction push.
>>C:\1\stacktracing.c(82): (col. 5) error #13250: Opcode POPFD unsupported by architecture in asm instruction popfd.
>>...

1. These two messages are very confusing, and Naveen thinks that these two instructions are not supported on a 64-bit platform. Try the test case I've posted with the Microsoft 64-bit C++ compiler and you will see similar error messages.

2. Then, we're trying to understand what is wrong with compiling Naveen's code with inline assembler, because something else is really wrong. Microsoft's 64-bit C++ compiler doesn't support inline assembly at all (!) for 64-bit platforms. The Intel 64-bit C++ compiler supports inline assembler for 64-bit platforms, and I verified it with version 13. But Naveen is using version 12 and I have no way to verify that. It is possible that version 12 doesn't support it. So, if you have Intel 64-bit C++ compiler version 12, simply try to compile the test case I've posted. Everything else, I mean execution at run time, is Naveen's assignment.

Naveen,

Take a look at Intel Instruction Reference Guide because it has complete descriptions for these two instructions.

Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 2 (2A, 2B & 2C): Instruction Set Reference, A-Z

Order Number: 325383-044US
August 2012

Page 794
...
Description

Pops a doubleword (POPFD) from the top of the stack (if the current operand-size attribute is 32) and stores the
value in the EFLAGS register, or pops a word from the top of the stack (if the operand-size attribute is 16) and
stores it in the lower 16 bits of the EFLAGS register (that is, the FLAGS register). These instructions reverse the
operation of the PUSHF/PUSHFD instructions.
The POPF (pop flags) and POPFD (pop flags double) mnemonics reference the same opcode. The POPF instruction
is intended for use when the operand-size attribute is 16; the POPFD instruction is intended for use when the
operand-size attribute is 32. Some assemblers may force the operand size to 16 for POPF and to 32 for POPFD.
Others may treat the mnemonics as synonyms (POPF/POPFD) and use the setting of the operand-size attribute to
determine the size of values to pop from the stack.
The effect of POPF/POPFD on the EFLAGS register changes, depending on the mode of operation. When the
processor is operating in protected mode at privilege level 0 (or in real-address mode, the equivalent to privilege
level 0), all non-reserved flags in the EFLAGS register except RF, VIP, VIF, and VM may be modified. VIP, VIF and
VM remain unaffected.
When operating in protected mode with a privilege level greater than 0, but less than or equal to IOPL, all flags can
be modified except the IOPL field and VIP, VIF, and VM. Here, the IOPL flags are unaffected, the VIP and VIF flags
are cleared, and the VM flag is unaffected. The interrupt flag (IF) is altered only when executing at a level at least
as privileged as the IOPL. If a POPF/POPFD instruction is executed with insufficient privilege, an exception does not
occur but privileged bits do not change.
When operating in virtual-8086 mode, the IOPL must be equal to 3 to use POPF/POPFD instructions; VM, RF, IOPL,
VIP, and VIF are unaffected. If the IOPL is less than 3, POPF/POPFD causes a general-protection exception (#GP).
In 64-bit mode, use REX.W to pop the top of stack to RFLAGS. The mnemonic assigned is POPFQ (note that the 32-bit
operand is not encodable). POPFQ pops 64 bits from the stack, loads the lower 32 bits into RFLAGS, and zero
extends the upper bits of RFLAGS.
See Chapter 3 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1, for more information
about the EFLAGS registers.
...
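The description quoted above explains why a popped value does not land in the flags register unchanged. Below is a minimal sketch of that behaviour as a pure function, covering only the "privilege level greater than 0" case. The function name popf_model is made up for illustration. Two things are assumptions and not taken from the quoted text: that RF ends up cleared, and that a Windows user-mode thread runs at privilege level 3 with IOPL 0. Treat it as a simplified model, not as a statement of what the CPU does in every mode.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural EFLAGS bit positions. */
#define FL_CF     (1u << 0)
#define FL_FIXED1 (1u << 1)   /* reserved, always reads as 1 */
#define FL_PF     (1u << 2)
#define FL_AF     (1u << 4)
#define FL_ZF     (1u << 6)
#define FL_SF     (1u << 7)
#define FL_TF     (1u << 8)
#define FL_IF     (1u << 9)
#define FL_DF     (1u << 10)
#define FL_OF     (1u << 11)
#define FL_IOPL   (3u << 12)
#define FL_NT     (1u << 14)
#define FL_VM     (1u << 17)
#define FL_AC     (1u << 18)
#define FL_ID     (1u << 21)

/* Simplified model of POPF/POPFQ for privilege levels 1..3.
   current: flags before the pop; popped: value taken from the stack.
   Assumption: RF, VIF and VIP end up cleared and other reserved bits are 0. */
uint32_t popf_model(uint32_t current, uint32_t popped,
                    unsigned cpl, unsigned iopl)
{
    assert(cpl >= 1 && cpl <= 3 && iopl <= 3);

    uint32_t writable = FL_CF | FL_PF | FL_AF | FL_ZF | FL_SF | FL_TF |
                        FL_DF | FL_OF | FL_NT | FL_AC | FL_ID;
    uint32_t kept = FL_IOPL | FL_VM;      /* never changed at these levels */

    if (cpl <= iopl)
        writable |= FL_IF;                /* privileged enough to change IF */
    else
        kept |= FL_IF;                    /* IF silently keeps its old value */

    return (popped & writable) | (current & kept) | FL_FIXED1;
}
```

Under these assumptions, popf_model(0x202, 0x14, 3, 0) gives 0x216 and popf_model(0x202, 0x64, 3, 0) gives 0x246: the requested status bits are applied, while bit 1 and the interrupt flag stay set.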

Quote:

Sergey Kostrov wrote:

Take a look at these two error messages:

>>...
>>C:\1\stacktracing.c(81): (col. 5) error #13252: Unsupported instruction form in asm instruction push.
>>C:\1\stacktracing.c(82): (col. 5) error #13250: Opcode POPFD unsupported by architecture in asm instruction popfd.
>>...

1. These two messages are very confusing, and Naveen thinks that these two instructions are not supported on a 64-bit platform. Try the test case I've posted with the Microsoft 64-bit C++ compiler and you will see similar error messages.

IMO, they are not confusing at all. The message clearly states that "popfd" is not supported on the amd64/Intel 64 architecture running in 64-bit mode, which is indeed true (see the reference manual from Intel or AMD). But there is popfq for this purpose, as others have said in this topic.

Quote:

Sergey Kostrov wrote:

2. Then, we're trying to understand what is wrong with compiling Naveen's code with inline assembler, because something else is really wrong. Microsoft's 64-bit C++ compiler doesn't support inline assembly at all (!) for 64-bit platforms.

Yes, the MS compiler does not implement 64-bit inline assembler. I know this fact, though I am unable to compile the test case right now.

Quote:

Sergey Kostrov wrote:

Intel 64-bit C++ compiler supports inline assembler for 64-bit platforms and I verified it with version 13. But, Naveen is using version 12 and I don't have any chance to verify it. It is possible that version 12 doesn't support it. So, If you have Intel 64-bit C++ compiler version 12 simply try to compile a test

Sorry, I missed the fact that Naveen is using ICC 12. I was thinking about ICC 13; sorry for the noise. I don't have this version at hand either, and I will not have it in the near future :-( .

Quote:

Sergey Kostrov wrote:

case I've posted. All the rest things, I mean execution at run-time, is Naveen's assignment.

Of course.

Good luck, Naveen. I hope ICC 12 implements pushfq/popfq correctly in 64-bit mode.

Once again, I am sorry for the confusion.

-- VooDooMan

To be more precise, in 64-bit mode the amd64 instruction set has no POPFD or PUSHFD instructions at all.

-- VooDooMan

Initially, when I posted this, I used popfd, and with it I saw the compile error.

Later I changed it to popfq myself. With this, compilation goes fine, but:

1. The EFLAGS value is not getting updated.

2. When I execute the segment-register assembly, I see this error message:

(ccc.1464): Illegal instruction - code c000001d (!!! second chance !!!)
stacktracing!PrintStackTrc+0x2e7:
00000001`3fd4132d 668e0dc47c0000  mov     cs,word ptr [stacktracing!ctx+0x38 (00000001`3fd48ff8)] ds:00000001`3fd48ff8=0033

Quote:

Naveen Tulabandula wrote:

Initially, when I posted this, I used popfd, and with it I saw the compile error.

Later I changed it to popfq myself. With this, compilation goes fine, but:

1. The EFLAGS value is not getting updated.

2. When I execute the segment-register assembly, I see this error message:

(ccc.1464): Illegal instruction - code c000001d (!!! second chance !!!)
stacktracing!PrintStackTrc+0x2e7:
00000001`3fd4132d 668e0dc47c0000  mov     cs,word ptr [stacktracing!ctx+0x38 (00000001`3fd48ff8)] ds:00000001`3fd48ff8=0033

Regarding [1.]: are you sure you are restoring the CORRECT value that was stored before? Check it in the debugger.

Regarding [2.]: in user mode (that is, your application runs in ring 3, not as a kernel driver in ring 0, supervisor), as far as I know you are not able to modify segment registers. Moreover, you do not need to restore them at all in this case, since they stay constant for a given program instance throughout its life, and they are not volatile; even the OS will not change them. And *IF* the OS wanted to change them (but no OS does such hacks), it would use its own mechanisms to do so, by modifying the TSS (task state segment).

-- VooDooMan

After the call to GetThreadContext, I see the value in the ctx structure as 0x100202 or something like that. Later I changed the structure value to 0x64 and executed the above assembly. I expect 0x64 in EFLAGS, but that is not the case. I am using the "r" command in WinDbg to see the values of EFLAGS and the other registers.

Oh, and regarding [2.], I forgot to mention that Windows DOES change the FS segment register: each thread has its own, to implement TLS (thread local storage). But as I wrote, Windows can change it, and does so before the thread is started, via the TSS structure change mentioned above.

So you can simply skip restoring the segment registers. Then the crash in [2.] will disappear.

But I guess you want to implement lightweight threads, since you are storing the context to a structure and restoring it from there. If my guess is right, then you should use "Fibers" to implement this (on Windows). I hope "pthreads" has its own implementation of fibers for other OSes.

Could you please tell us your approach: what do you want to implement with the code above? Maybe you are on the wrong path, and maybe there is another approach to the problem you are trying to solve, possibly a much simpler one.

-- VooDooMan

Quote:

Naveen Tulabandula wrote:

After the call to GetThreadContext, I see the value in the ctx structure as 0x100202 or something like that. Later I changed the structure value to 0x64 and executed the above assembly. I expect 0x64 in EFLAGS, but that is not the case. I am using the "r" command in WinDbg to see the values of EFLAGS and the other registers.

When manipulating FLAGS/EFLAGS/RFLAGS, there are a few bits you simply can't modify, and if you do, you get a crash. See http://en.wikipedia.org/wiki/FLAGS_register for the layout of the bits and then check the reference manual for amd64/Intel 64 to see which bits you CAN'T modify. For example, the 2 IOPL bits: if you modify them, the OS will kill the program... and there are a few other bits you CAN'T modify.

-- VooDooMan
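To make values such as 0x64 or 0x100202 easier to compare against the bit layout mentioned above, a small decoder helps. This is only a sketch: the function name eflags_describe and the table k_flags are invented for illustration, and the flag names follow the standard EFLAGS layout. Reserved bits and the IOPL field are deliberately not printed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Named single-bit flags in EFLAGS; reserved bits and IOPL are left out. */
static const struct { uint32_t mask; const char *name; } k_flags[] = {
    { 1u << 0,  "CF"  }, { 1u << 2,  "PF"  }, { 1u << 4,  "AF"  },
    { 1u << 6,  "ZF"  }, { 1u << 7,  "SF"  }, { 1u << 8,  "TF"  },
    { 1u << 9,  "IF"  }, { 1u << 10, "DF"  }, { 1u << 11, "OF"  },
    { 1u << 14, "NT"  }, { 1u << 16, "RF"  }, { 1u << 17, "VM"  },
    { 1u << 18, "AC"  }, { 1u << 19, "VIF" }, { 1u << 20, "VIP" },
    { 1u << 21, "ID"  },
};

/* Write the names of all set flags into out, separated by spaces.
   Returns the number of characters written (excluding the terminator). */
size_t eflags_describe(uint32_t value, char *out, size_t cap)
{
    size_t used = 0;
    if (cap == 0)
        return 0;
    out[0] = '\0';
    for (size_t i = 0; i < sizeof k_flags / sizeof k_flags[0]; i++) {
        if (!(value & k_flags[i].mask))
            continue;
        int n = snprintf(out + used, cap - used, "%s%s",
                         used ? " " : "", k_flags[i].name);
        if (n < 0 || (size_t)n >= cap - used) {
            out[used] = '\0';   /* drop a partially written name */
            break;
        }
        used += (size_t)n;
    }
    return used;
}

/* The two-bit I/O privilege level field (bits 12 and 13). */
unsigned eflags_iopl(uint32_t value)
{
    return (value >> 12) & 3u;
}
```

For example, 0x64 decodes to "PF ZF" (its bit 5 is reserved and has no name), and 0x100202 decodes to "IF VIP".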

I am trying to trace the call stack when an AV or something similar happens. Initially I used the StackWalk64 API to get the call stack, but this dbghelp function is not thread safe. To make it thread safe, I use synchronization mechanisms such as a mutex. Because of these overheads I feel my stack dumping mechanism is slow, and I want to improve it. Recently I learned that the RtlCaptureStackBackTrace function is available to get the stack, and it is thread safe.

That is what I am trying to do.

This is how I am trying to achieve it:

My sample application has multiple threads. In each iteration I suspend one thread, get its context, and change its Rip to execute a new function that calls RtlCaptureStackBackTrace. Once I have the stack, I plan to resume the thread's execution at the exact point where it was suspended (using the context I got earlier).

With the above logic, I am successful up to getting the stack. After that, when trying to resume at the exact point where it was suspended, I see some crashes. I am restoring the context using SetThreadContext, but I was wondering whether it fails to restore all flags properly; hence I am writing my own restore using the assembly above.

Quote:

Naveen Tulabandula wrote:

I am trying to trace the call stack when an AV or something similar happens. Initially I used the StackWalk64 API to get the call stack, but this dbghelp function is not thread safe. To make it thread safe, I use synchronization mechanisms such as a mutex. Because of these overheads I feel my stack dumping mechanism is slow, and I want to improve it. Recently I learned that the RtlCaptureStackBackTrace function is available to get the stack, and it is thread safe.

That is what I am trying to do.

This is how I am trying to achieve it:

My sample application has multiple threads. In each iteration I suspend one thread, get its context, and change its Rip to execute a new function that calls RtlCaptureStackBackTrace. Once I have the stack, I plan to resume the thread's execution at the exact point where it was suspended (using the context I got earlier).

Okay, I don't have time ATM to think more about your problem you are trying to find solution for.

Quote:

Naveen Tulabandula wrote:

With the above logic, I am successful up to getting the stack. After that, when trying to resume at the exact point where it was suspended, I see some crashes. I am restoring the context using SetThreadContext, but I was wondering whether it fails to restore all flags properly; hence I am writing my own restore using the assembly above.

As I have said, do not modify the flags; just restore them as they were saved, and use the 64-bit version of the flags for that, that is, the RFLAGS register. It is the same as EFLAGS, but per the reference documentation you SHOULD NOT modify the upper bits (the documentation says "these bits are reserved, do NOT change them"). If you change a reserved bit, one of the following happens: 1. you get a crash, because the bits are not physically present in the CPU; 2. you change a (possibly) undocumented flag and later behaviour is unpredictable; 3. the CPU ignores your binary 1 bits and stores them as though they were zeroes.

Maybe these bits are all zeroes today, but to retain future compatibility, a next version of the CPU may use some bit of the 64-bit RFLAGS that was previously "reserved - do not modify". I used to assume that the reserved bits are all zeroes, but some of them could default to 1! PAY attention to undocumented bits, and restore them (via POPFQ) exactly as they were when saved via PUSHFQ or the context save. NB: there are a few reserved bits in FLAGS and EFLAGS (the 16- and 32-bit versions) as well!

---

Regarding the thread (non-)safety of the "dbghelp" DLL: you can always add a mutex or critical section in front of the code that writes the crash mini dump, to make sure there is no re-entrancy.

Please paste some (maybe pseudo-)code showing how you do the crash (mini) dump. I have been using dbghelp for minidump creation on crash for 5 years and I have never seen a threading problem with it, even though my projects are extremely threaded (like 200 threads simultaneously).

Good luck. I can give you advice regarding "dbghelp" and its API; just append a new post to this topic. (I don't want to move this to private contact, since then other users would not profit from our mistakes and fixes. So I prefer this forum to a private e-mail conversation.)

-- VooDooMan

Footnote: I have fixed some ambiguous things in my last post via EDIT. Please hit refresh in your browser.

-- VooDooMan

I did additional verification and here is a summary of different tests for different formats of PUSH-POP instructions with a 64-bit Intel C++ compiler version 13.1.0.149 Build 20130118 ( Update 2 ):

// 32-bit - OK
// 64-bit - OK
_asm
{
pushf
popf
}

// 32-bit - OK
// 64-bit - Error: Opcode PUSHFD unsupported by architecture in asm instruction...
// 64-bit - Error: Opcode POPFD unsupported by architecture in asm instruction...
_asm
{
pushfd
popfd
}

// 32-bit - Error: Unknown opcode PUSHFQ in asm instruction
// 32-bit - Error: Unknown opcode POPFQ in asm instruction
// 64-bit - OK
_asm
{
pushfq
popfq
}

// 32-bit - OK
// 64-bit - Error: Operand size mismatches its default size in asm instruction...
_asm
{
push ax
pop ax
}

// 32-bit - OK
// 64-bit - Error: Unsupported instruction form in asm instruction...
_asm
{
push eax
pop eax
}

// 32-bit - Error: label "rax" was referenced but not defined
// 64-bit - OK
_asm
{
push rax
pop rax
}

Hi Sergey,

Thanks for verifying this. I think you verified that the code compiles, but did you check the functionality?

#include <stdio.h>
int main(void)
{
printf("start\n");
__asm
{
mov rax,20;
push rax;
popfq;
}
printf("Done\n");
return 0;
}

Set a breakpoint on "start" and observe the EFLAGS value.

Set a breakpoint on "Done" and observe the EFLAGS value.

By the time execution reaches "Done", I expect 0x14 (that is, 20) in EFLAGS, which is not the case.

Change 20 to 21 and observe the values again.
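One way to run this kind of check without a debugger is to read the flags back in the same function, right after writing them, so that no library call runs in between. The sketch below uses the __readeflags and __writeeflags compiler intrinsics instead of inline assembly. That choice is an assumption on my side: GCC, Clang and MSVC provide them on x64 as far as I know, but nothing in this thread confirms that Intel C++ 12 does. The helper name write_then_read_flags and the STATUS_FLAGS mask are invented for illustration. Only the arithmetic status flags are taken from the requested value, so that TF, DF and similar control bits are never disturbed, and the original flags are put back before returning.

```c
#include <assert.h>
#include <stdint.h>

#if defined(_MSC_VER) && defined(_M_X64)
#include <intrin.h>
#define HAVE_FLAGS_INTRINSICS 1
#elif defined(__GNUC__) && defined(__x86_64__)
#include <x86intrin.h>
#define HAVE_FLAGS_INTRINSICS 1
#else
#define HAVE_FLAGS_INTRINSICS 0   /* not an x64 build: nothing to read */
#endif

/* CF, PF, AF, ZF, SF and OF: the only bits this sketch lets the caller set. */
#define STATUS_FLAGS 0x8D5u

/* Write the status-flag bits of 'requested' into RFLAGS, read RFLAGS back
   at once, then restore the saved value. Returns what was read back, or 0
   when the intrinsics are unavailable. */
uint64_t write_then_read_flags(uint64_t requested)
{
#if HAVE_FLAGS_INTRINSICS
    uint64_t saved  = __readeflags();
    uint64_t merged = (saved & ~(uint64_t)STATUS_FLAGS) |
                      (requested & STATUS_FLAGS);
    __writeeflags(merged);
    uint64_t seen = __readeflags();
    __writeeflags(saved);             /* leave the flags as we found them */
    return seen;
#else
    (void)requested;
    return 0;
#endif
}
```

Calling write_then_read_flags(20) on an x64 build returns a value that always has bit 1 set, so it can never be exactly 0x14. The status-flag bits in the result should be treated as indicative, since the compiler does not promise to keep other instructions out from between the two intrinsic calls.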

>>...did you check the functionality...

No. Is there something wrong?

>>>(ccc.1464): Illegal instruction - code c000001d (!!! second chance !!!)
stacktracing!PrintStackTrc+0x2e7:
00000001`3fd4132d 668e0dc47c0000  mov     cs,word ptr [stacktracing!ctx+0x38 (00000001`3fd48ff8)] ds:00000001`3fd48ff8=0033>>>

Why are you trying to write to the cs register? Your CPU raised an illegal-instruction exception, as the debugger output shows, because mov cannot load cs. Those selectors are set by the operating system and stay constant across context switches. The only way for user mode to modify the cs register is through a far control transfer such as a far call or iret.

It is unfortunate that MS Windows does not permit setting a segment register to reference a base address within the linear address space of the process's VM (TLS is implemented this way). This would be handy even at the expense of an O/S call to modify the descriptor table. An app could request n descriptors for manipulation and specify the base VM address for each descriptor, and subsequently have a fast path to set ?S: to any of the set-up descriptors (with a GP fault in the app when outside the range). This is a topic for an MS forum.

Jim Dempsey

www.quickthreadprogramming.com

>>>it will use its own mechanisms to do it, via modifying TSS (task state segment).>>>

IIRC Linux uses software task switching (it does not save the current task context in the TSS).

Naveen,

As far as I understand, you have a run-time problem, and this is no longer a problem with the Intel C++ compiler. So at this stage you need to be as specific as possible, and I think a request to Intel Premier Support could or should be made.

Did you take into account that bits 1, 3, 5, 15, and 22 through 31 of the EFLAGS register are reserved?

"...Software should not use or depend on the states of any of these bits..."

Intel(R) 64 and IA-32 Architectures Software Developer's Manual Volume 1: Basic Architecture
Order Number: 253665-044US
August 2012

...
CHAPTER 3
...
3.4.3 EFLAGS Register ( Page 68 )
...
3.4.3.4 RFLAGS Register in 64-Bit Mode ( Page 71 )
...
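The reserved-bit list above translates directly into a mask. A minimal sketch follows; the macro and function names are invented for illustration. It shows how to check that an edited flags value has not touched the reserved bits, and how to build a value that takes the reserved bits from the saved copy and everything else from the value you want.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 1, 3, 5, 15 and 22..31, as listed in the post above. */
#define EFLAGS_RESERVED_MASK 0xFFC0802Au

/* Nonzero when 'candidate' differs from 'saved' in any reserved bit. */
int eflags_reserved_changed(uint32_t saved, uint32_t candidate)
{
    return ((saved ^ candidate) & EFLAGS_RESERVED_MASK) != 0;
}

/* Reserved bits come from 'saved'; all other bits come from 'wanted'. */
uint32_t eflags_merge(uint32_t saved, uint32_t wanted)
{
    return (saved & EFLAGS_RESERVED_MASK) | (wanted & ~EFLAGS_RESERVED_MASK);
}
```

With the values from this thread, eflags_reserved_changed(0x202, 0x64) is true, because 0x64 sets reserved bit 5 and clears reserved bit 1, and eflags_merge(0x202, 0x64) yields 0x46.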
