Problem with certain use of VPID

Hello,

I'm writing code that uses Intel VT for a particular purpose
(the purpose is to run Intel VT underneath the regular OS).
In this code, only one VMCS is used per CPU, and its VPID is changed frequently (on every VM exit caused by MOV to CR3) to improve performance.
However, using the VPID this way makes the OS blue screen.
When the part of the code that may cause the problem is removed, everything works well.
Is my use of VPID (rewriting it in place) illegal according to the specification?
Or is it a CPU erratum?

I used a Core i7 920 for verification.

The pseudocode is as follows.

__declspec(naked) void VMExitHandler() // executed whenever a VM exit occurs
{
    ...
    if (VMexitReason == EXIT_REASON_CR_ACCESS) {
        if (ExitQualification == MOV_to_CR3) {
            __vmwrite(GUEST_CR3, getMovRegValue(ExitQualification));
            __vmwrite(HOST_CR3, getMovRegValue(ExitQualification));
            // This line is meant to improve performance. If it is removed, the code works well.
            __vmwrite(VIRTUAL_PROCESSOR_ID, (unsigned short)(__vmread(GUEST_CR3) >> 12));
            __vmresume();
        }
    }
    ...
}

Thanks.

Thanks for your patience. I am awaiting a response from our experts.

David Ott

Comment from one of our engineers:

"According to the Intel VT spec, the VPID value should be non-zero.

The expression (unsigned short)(__vmread(GUEST_CR3) >> 12) may evaluate to zero, since it is truncated to a 16-bit unsigned short. For example, if bits 27:12 of __vmread(GUEST_CR3) are zero (e.g. 0xF0000FFF), then the VPID will be zero, and a GP fault and blue screen will follow.

So he should probably check whether this condition occurs."
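
To make the truncation concrete, here is a minimal sketch in plain C of the arithmetic the handler performs. The function names are hypothetical, and the remapping of 0 to 0xFFFF is only an assumption for illustration, not something the original code does.

```c
#include <assert.h>
#include <stdint.h>

/* Derive a VPID the same way the posted handler does: drop the low
   12 bits of CR3 and keep the next 16 bits (bits 27:12). */
uint16_t vpid_from_cr3(uint32_t cr3)
{
    return (uint16_t)(cr3 >> 12);
}

/* Hypothetical variant that never returns the reserved value 0.
   Mapping 0 to 0xFFFF is an arbitrary choice for illustration; it can
   itself collide with a CR3 whose bits 27:12 are all ones. */
uint16_t nonzero_vpid_from_cr3(uint32_t cr3)
{
    uint16_t vpid = vpid_from_cr3(cr3);
    if (vpid == 0)
        vpid = 0xFFFF;
    assert(vpid != 0);
    return vpid;
}
```

With the engineer's example value 0xF0000FFF, the shift gives 0xF0000 and the cast keeps only the low 16 bits, which are all zero.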

Thanks for your reply.

To confirm whether writing 0 to the VPID is what makes Windows blue screen, I inserted an infinite loop before the line that writes the VPID.
With this addition, the OS should hang just before any VMWRITE that would set the VPID to 0 (the interrupt flag is clear in the VM exit handler). So, at the very least, the OS cannot blue screen because of the illegal VMWRITE.
The code is as follows.

...
if (!((unsigned short)(__vmread(GUEST_CR3) >> 12))) for (;1;){} // added
__vmwrite(VIRTUAL_PROCESSOR_ID, (unsigned short)(__vmread(GUEST_CR3) >> 12));
__vmresume();
...

Nevertheless, the OS blue screened. (The reported error varies; one example is shown at the bottom of this post.)
Therefore, writing 0 to the VPID is probably not the cause of this behavior.

As another attempt, I inserted INVVPID before VMRESUME.
When the INVVPID type was 1 and the target VPID was 0, the OS blue screened.
But when the INVVPID type was 2, the OS worked well.
So it seems certain that the contents of the TLB affect this problem.

Sample error message:

SYSTEM_THREAD_EXCEPTION_NOT_HANDLED_M (1000007e)

Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: 805535a1, The address that the exception occurred at
Arg3: f794e898, Exception Record Address
Arg4: f794e594, Context Record Address

Debugging Details:
------------------

EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - "0x%08lx"

FAULTING_IP:
nt!ExAllocatePoolWithTag+66a
805535a1 894804 mov dword ptr [eax+4],ecx

EXCEPTION_RECORD: f794e898 -- (.exr 0xfffffffff794e898)
ExceptionAddress: 805535a1 (nt!ExAllocatePoolWithTag+0x0000066a)
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 00000001
Parameter[1]: 00000004
Attempt to write to address 00000004

CONTEXT: f794e594 -- (.cxr 0xfffffffff794e594)
eax=00000000 ebx=8a2120a0 ecx=8a2128b0 edx=00000081 esi=e1a7a818 edi=000001ff
eip=805535a1 esp=f794e960 ebp=f794e9b4 iopl=0 nv up ei pl nz na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010202
nt!ExAllocatePoolWithTag+0x66a:
805535a1 894804 mov dword ptr [eax+4],ecx ds:0023:00000004=????????
Resetting default scope

CUSTOMER_CRASH_COUNT: 1

PROCESS_NAME: System

Once again awaiting a response from our experts. Thanks for your patience.

David Ott

Here is a summary of the response I received:

VPID 0 is reserved.

According to the Software Developer's Manual, Vol. 3B, INVVPID does not work with target VPID 0. Combining INVVPID type 1 with VPID 0 would cause the VM to fail. INVVPID type 2 works well because it invalidates all translations of all non-zero VPID contexts.

See section 24.3: http://www.intel.com/Assets/PDF/manual/253669.pdf

Thanks.
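
The rule summarized above can be written down as a small pre-check. This is only a sketch in plain C: the type numbering and descriptor layout follow the INVVPID instruction reference, but the struct and function names are hypothetical, nothing here executes INVVPID, and the check of which types the CPU supports (reported in a capability MSR) is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* INVVPID types (the register operand of the instruction). */
enum invvpid_type {
    INVVPID_INDIVIDUAL_ADDRESS = 0,
    INVVPID_SINGLE_CONTEXT = 1,
    INVVPID_ALL_CONTEXT = 2,
    INVVPID_SINGLE_CONTEXT_RETAINING_GLOBALS = 3
};

/* 128-bit descriptor (the memory operand): VPID in bits 15:0,
   bits 63:16 reserved and zero, linear address in bits 127:64. */
struct invvpid_desc {
    uint64_t vpid_and_reserved;
    uint64_t linear_address;
};

/* Hypothetical helper: build a descriptor with the reserved bits clear. */
struct invvpid_desc make_invvpid_desc(uint16_t vpid, uint64_t linear_address)
{
    struct invvpid_desc d = { vpid, linear_address };
    assert((d.vpid_and_reserved >> 16) == 0);
    return d;
}

/* Hypothetical pre-check: does this type accept this VPID?
   Only the all-context type does not look at the descriptor's VPID;
   the other three types fail when the VPID is 0. */
bool invvpid_vpid_ok(enum invvpid_type type, uint16_t vpid)
{
    if (type == INVVPID_ALL_CONTEXT)
        return true;
    return vpid != 0;
}
```

Under this rule, type 1 with VPID 0 is rejected while type 2 is accepted regardless of the VPID, which matches the behavior reported earlier in the thread.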

Let me restate the issue.
Logically this program ought to work, but it doesn't. (I slightly suspect the CPU.)
What I want to ask about is the problem that occurs when I use a single VMCS and rewrite its VPID frequently.
(Perhaps TLB entries tagged with an old VPID value are still being treated as valid.)

Could such a problem occur?

Granted that my use of INVVPID type 1 with VPID 0 is wrong, it still seems certain that TLB entries other than those tagged with VPID 0 are the source of this problem, doesn't it?
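
The hypothesis about stale tagged entries can be illustrated with a toy model. This is a sketch under strong assumptions: a tiny direct-mapped table stands in for the TLB, all names are hypothetical, and it says nothing about how the real hardware organizes its translation caches. It only shows that if a VPID tag is reused for a different address space without invalidation, a lookup can still hit the old translation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_SLOTS 8

struct tlb_entry {
    bool valid;
    uint16_t vpid;  /* tag: which VPID filled this entry */
    uint32_t vpn;   /* virtual page number */
    uint32_t pfn;   /* physical frame number */
};

struct toy_tlb {
    struct tlb_entry slot[TLB_SLOTS];
};

/* Cache a translation, tagged with the VPID that was current at the time. */
void tlb_fill(struct toy_tlb *t, uint16_t vpid, uint32_t vpn, uint32_t pfn)
{
    assert(t != NULL);
    struct tlb_entry *e = &t->slot[vpn % TLB_SLOTS];
    e->valid = true;
    e->vpid = vpid;
    e->vpn = vpn;
    e->pfn = pfn;
}

/* A lookup hits only when both the page number and the VPID tag match. */
bool tlb_lookup(const struct toy_tlb *t, uint16_t vpid, uint32_t vpn,
                uint32_t *pfn)
{
    assert(t != NULL && pfn != NULL);
    const struct tlb_entry *e = &t->slot[vpn % TLB_SLOTS];
    if (e->valid && e->vpid == vpid && e->vpn == vpn) {
        *pfn = e->pfn;
        return true;
    }
    return false;
}

/* Model of a single-context invalidation: drop every entry with this tag. */
void tlb_invalidate_vpid(struct toy_tlb *t, uint16_t vpid)
{
    assert(t != NULL);
    for (size_t i = 0; i < TLB_SLOTS; i++) {
        if (t->slot[i].vpid == vpid)
            t->slot[i].valid = false;
    }
}
```

In this model, filling an entry under VPID 0xBCDE and later looking up the same page under 0xBCDE on behalf of a different address space returns the old frame, until tlb_invalidate_vpid is called for that tag.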

Once again awaiting a response from our experts. Thanks for your patience.

David Ott

A couple questions:

o How do you guarantee VPID to be 16 bits in width?

o How do you flush previous EPT TLB?

David Ott

o How do you guarantee VPID to be 16 bits in width?

This code runs as a Windows driver, and a VPID is assigned per process.
Bits 27:12 of the new process's CR3 are written to the VPID when a context switch occurs.
For example, CR3: ABCDE000, VPID: BCDE.

Bits 31:28 of CR3 are ignored when forming the VPID.
Therefore, different CR3 values such as ABCDE000 and FBCDE000 might conflict.
However, the problem occurs even when there appear to be no conflicting values.
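
The conflict described above can be checked mechanically. A minimal sketch in plain C, with hypothetical function names, that reports when two different page-directory bases map to the same 16-bit VPID:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* VPID as formed in the post: bits 27:12 of CR3. */
uint16_t vpid_of(uint32_t cr3)
{
    return (uint16_t)(cr3 >> 12);
}

/* True when two different page-directory bases would share one VPID. */
bool vpid_collides(uint32_t cr3_a, uint32_t cr3_b)
{
    bool different_base = (cr3_a >> 12) != (cr3_b >> 12);
    return different_base && vpid_of(cr3_a) == vpid_of(cr3_b);
}

/* Usage: the two CR3 values from the post both map to VPID 0xBCDE. */
void collision_example(void)
{
    assert(vpid_of(0xABCDE000u) == 0xBCDE);
    assert(vpid_collides(0xABCDE000u, 0xFBCDE000u));
}
```

A check like this only detects collisions between CR3 values that are live at the same time. It does not cover the case where one CR3 value is reused later for a different process.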

To investigate further, I rewrote the code.
In this attempt, instead of using CR3, the VPID value is incremented every time a context switch occurs.
In this case, it is essentially the same as a TLB flush.
But the OS still stopped, this time hanging rather than showing a blue screen.
The code is as follows.

// mov to CR3 handler
...
static unsigned short i=0;
i = ((i == 0) ? 1 : (i + 1));
__vmwrite(VIRTUAL_PROCESSOR_ID, i);
...

o How do you flush previous EPT TLB?

The EPT feature is not used.
So I have not done anything for EPT.

More comments from a team member on this subject:

In the updated code, I'm not sure whether the VMM detects VPID overflow. In case of overflow (which happens very easily with 16 bits), the system may well crash without INVVPID. Perhaps you handled this, but just did not include it in the sample code.

Thanks.
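
For reference, the overflow case can be sketched as a small allocator in plain C. Note that with a 16-bit counter, an expression of the form (i == 0) ? 1 : (i + 1) yields 0 on the step after 0xFFFF, so the reserved value is written once per wrap. The names below are hypothetical, and what the caller does on wrap (for example an all-context INVVPID before reusing values) is an assumption left outside the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct vpid_allocator {
    uint16_t last;  /* last VPID handed out; 0 means none yet */
};

/* Return the next VPID in the range 1..0xFFFF, never 0.
   *wrapped is set when the counter wraps around, meaning that values
   handed out from now on may still have cached translations and the
   caller should invalidate them before resuming the guest. */
uint16_t vpid_next(struct vpid_allocator *a, bool *wrapped)
{
    assert(a != NULL && wrapped != NULL);
    *wrapped = (a->last == 0xFFFF);
    a->last = *wrapped ? 1 : (uint16_t)(a->last + 1);
    return a->last;
}
```

Starting from a zero-initialized allocator, the first 65535 calls return 1 through 0xFFFF without setting the flag, and the following call returns 1 again with the flag set.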

Sorry, I was wrong.
I fixed the test source that increments the VPID so that it no longer overflows.
It works fine now, so the CPU does not seem to be at fault.

When I use the VPID as a process ID, the OS still blue screens.
But I now feel this is not caused by a CPU erratum.
I will try to find the cause.

Thanks for your patience.
