Paravirtualization is a technique for increasing the performance of virtualized systems by reducing the proportion of hardware resources that the virtual machine monitor (VMM) must dynamically emulate in software, relative to full virtualization scenarios. Traditional emulation typically involves binary translation, in which a software-based process within the VMM traps hardware calls from the guest OSs and translates them to make them compatible with the host OS. That translation requires computation that can introduce substantial processing overhead and decrease the overall performance and scalability of the environment.
Paravirtualization removes the need for binary translation by building a software interface into the VMM that presents the virtual machines (VMs) with appropriate drivers and other elements that take the place of the dynamically emulated hardware. While paravirtualization typically requires modification of guest operating systems (OSs), Intel VT enables Xen, VMware, and other virtualization environments to run many unmodified guest OSs.
Hardware Page Table Virtualization provides a hardware assist to memory virtualization, which includes the partitioning and allocation of physical memory among VMs. Memory virtualization causes VMs to see a contiguous address space, which is not actually contiguous within the underlying physical memory. The guest OS stores the mapping between virtual and physical memory addresses in page tables.
Because the guest OSs do not have native direct access to physical system memory, the VMM must perform another level of memory virtualization in order to accommodate multiple VMs simultaneously. That is, mapping must be performed within the VMM between the physical memory and the page tables in the guest OSs. In order to accelerate this additional layer of memory virtualization, both Intel and AMD have announced technologies to provide a hardware assist. Intel's is called Extended Page Tables (EPT), and AMD's is called Nested Page Tables (NPT). These two technologies are very similar at a conceptual level.
Intel Virtualization Technology for Directed I/O (Intel VT-d) supports remapping of direct memory access (DMA) transfers and device-generated interrupts, which helps to improve isolation of I/O devices. VMMs can use VT-d to directly assign an I/O resource to a specific VM. Thus, an unmodified guest OS can obtain direct access to that resource, without requiring the VMM to provide emulated device drivers for it. Moreover, if an I/O device has been assigned to particular VMs, that device is not accessible by other VMs, nor are the other VMs accessible by the device.
This virtualization of interrupts and DMA transfers prevents a device under the control of one VM from accessing the memory space controlled by another VM. Under VT-d, the I/O memory management unit (IOMMU) maintains a record of which physical memory regions are mapped to which I/O devices, allowing it to control access to those memory locations on the basis of which I/O device requests the access.
For a more in-depth discussion of VT-d and its potential benefits to software products, see the article "Intel® Virtualization Technology for Directed I/O (VT-d): Enhancing Intel platforms for efficient virtualization of I/O devices."
Intel VT for x86-based Intel® Architecture (VT-x) provides a hardware assist for virtualizing the CPU and the memory subsystem in systems based on 32-bit Intel® processors or Intel® 64 architecture (formerly Intel® EM64T) that support Intel VT. For a more complete discussion of the architecture and processes that underlie this hardware assist, see the Intel VT Platform Technology Site.
For the most part, it is not necessary for application software to change in order to take direct advantage of hardware page table virtualization and VT-d. The following benefits are immediately available:
Because these benefits are relatively simple to achieve, software vendors should consider advising their customers that Intel® architecture-based servers that support the latest versions of Intel VT, when used as enterprise virtualization platforms, automatically deliver benefits in terms of performance, scalability, reliability, and security.
Intel VT-d DMA remapping allows for reduction in VM exits for assigned devices. DMA requests specify a requester-ID and address, and remap hardware transforms the request to a physical memory access, using a software-programmed table structure in memory.
DMA remapping hardware enforces isolation by validating that the requester-ID is allowed to access the address and translates the device-provided address into a physical memory address. The hardware can cache frequently used translations in a TLB-like structure, and software may dynamically update remapping tables for efficient re-direction. DMA remapping is applicable to all DMA sources, and it works with existing device hardware.
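The validate-then-translate flow described above can be modeled in a few lines of Python. This is only a minimal sketch under stated assumptions: the class `DmaRemapper`, the exception `DmaFault`, the flat per-requester dictionary, and the 4 KiB page size are all hypothetical illustrations, not the actual VT-d table layout or any real API.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed 4 KiB pages, for illustration only


class DmaFault(Exception):
    """Raised when a requester touches an address it has no mapping for."""


@dataclass
class DmaRemapper:
    # Hypothetical software-programmed table:
    # requester-ID -> {device-visible page -> host-physical page}
    tables: dict = field(default_factory=dict)
    # Stand-in for the TLB-like cache of frequently used translations
    cache: dict = field(default_factory=dict)

    def map_page(self, requester_id, device_page, host_page):
        """Software grants a requester access to one page."""
        self.tables.setdefault(requester_id, {})[device_page] = host_page
        self.cache.pop((requester_id, device_page), None)  # drop stale entry

    def unmap_page(self, requester_id, device_page):
        """Software revokes access; the cached translation is invalidated."""
        self.tables.get(requester_id, {}).pop(device_page, None)
        self.cache.pop((requester_id, device_page), None)

    def translate(self, requester_id, device_addr):
        """Validate the requester and return the host-physical address."""
        page, offset = divmod(device_addr, PAGE_SIZE)
        key = (requester_id, page)
        if key not in self.cache:
            try:
                self.cache[key] = self.tables[requester_id][page]
            except KeyError:
                raise DmaFault(
                    f"requester {requester_id:#x} blocked at {device_addr:#x}"
                ) from None
        return self.cache[key] * PAGE_SIZE + offset
```

In this model, a device assigned to one VM can reach only the pages that software has mapped for its requester-ID; any other access raises `DmaFault`, which corresponds to the isolation property described above.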
Intel VT-d interrupt remapping allows for reductions in interrupt virtualization overhead for assigned devices. Interrupt requests specify a requester-ID and interrupt-ID, and remap hardware transforms these requests to a physical interrupt, using a software-programmed Interrupt Remap Table structure in memory.
Interrupt remapping hardware enforces isolation by validating that the interrupt-ID is from an allowed requester-ID and generates interrupts with attributes from the remap structure. The hardware can cache frequently used interrupt-remap structures, and software may dynamically update remap entries for efficient interrupt re-direction. Interrupt remapping is applicable to all interrupt sources, including legacy interrupts delivered through I/O APICs and message-signaled interrupts such as MSI, MSI-X, and MSI-v. This process works with existing device hardware.
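As a rough illustration of the interrupt path, the sketch below models a remap table keyed by interrupt-ID, where each entry records which requester may use it and which attributes the delivered interrupt receives. The names `RemapEntry`, `InterruptRemapper`, and `InterruptBlocked`, and the choice of fields, are assumptions made for this example; they do not reflect the real Interrupt Remap Table entry format.

```python
from dataclasses import dataclass


class InterruptBlocked(Exception):
    """Raised when a request fails validation against the remap table."""


@dataclass(frozen=True)
class RemapEntry:
    allowed_requester: int  # requester-ID permitted to use this entry
    vector: int             # physical interrupt vector to deliver
    dest_cpu: int           # destination processor for the interrupt


class InterruptRemapper:
    def __init__(self):
        # Hypothetical software-programmed table: interrupt-ID -> RemapEntry
        self.table = {}

    def program(self, interrupt_id, entry):
        """Software installs or updates an entry (for example, to re-direct)."""
        self.table[interrupt_id] = entry

    def remap(self, requester_id, interrupt_id):
        """Validate the requester, then return attributes from the entry."""
        entry = self.table.get(interrupt_id)
        if entry is None or entry.allowed_requester != requester_id:
            raise InterruptBlocked(
                f"requester {requester_id:#x} may not raise interrupt {interrupt_id}"
            )
        return entry.vector, entry.dest_cpu
```

Note that the delivered vector and destination come from the table entry, not from the device's request, so re-directing an interrupt to another processor in this model is a single call to `program`.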
In order to compare EPT with predecessor Intel VT-x (using shadow-page tables), it is first necessary to consider some terminology:
Guest-linear address: produced by guest software
Guest-physical address: translation of guest-linear address produced by page tables maintained (or desired) by guest operating system
Host-physical address: actual address used to access memory
Addressing with Shadow Page Tables:
Guest maintains guest page tables that map guest-linear to guest-physical
VMM maintains active page tables that map guest-linear to host-physical
CPU uses only active page tables
Addressing with EPT:
Guest maintains page tables that map guest-linear to guest-physical
VMM maintains extended page tables that map guest-physical to host-physical
CPU uses both sets of tables
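The difference between the two addressing schemes can be sketched with single-level dictionaries standing in for page tables. This is a deliberately simplified model: real page tables are multi-level structures, and the names `build_shadow`, `translate_shadow`, `translate_ept`, and `vmm_map`, as well as the 4 KiB page size, are assumptions introduced for illustration.

```python
PAGE = 4096  # assumed 4 KiB pages


def build_shadow(guest_pt, vmm_map):
    """Shadow scheme: the VMM composes both mappings into one active table
    (guest-linear page -> host-physical page)."""
    return {gl: vmm_map[gp] for gl, gp in guest_pt.items() if gp in vmm_map}


def translate_shadow(shadow, guest_linear):
    """The CPU consults only the VMM-maintained active (shadow) table."""
    page, offset = divmod(guest_linear, PAGE)
    return shadow[page] * PAGE + offset


def translate_ept(guest_pt, ept, guest_linear):
    """The CPU consults both tables: guest table first, then the EPT."""
    page, offset = divmod(guest_linear, PAGE)
    guest_physical_page = guest_pt[page]
    return ept[guest_physical_page] * PAGE + offset


# Example tables (arbitrary values chosen for the sketch)
guest_pt = {0: 5, 1: 7}   # guest-linear page -> guest-physical page
vmm_map = {5: 40, 7: 12}  # guest-physical page -> host-physical page

shadow = build_shadow(guest_pt, vmm_map)
addr = 1 * PAGE + 0x20
assert translate_shadow(shadow, addr) == translate_ept(guest_pt, vmm_map, addr)

# When the guest edits its own table, the EPT path sees the change at once,
# whereas the shadow table stays stale until the VMM rebuilds it.
guest_pt[1] = 5
assert translate_ept(guest_pt, vmm_map, addr) == 40 * PAGE + 0x20
assert translate_shadow(shadow, addr) == 12 * PAGE + 0x20
```

In the shadow scheme, keeping the active table in step with guest edits is the VMM's job, which in this model corresponds to calling `build_shadow` again after every guest change.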
Hardware page table virtualization eliminates exit overhead from guest and monitor MMU (Memory Management Unit) page faults, CR3 (Control Register 3) changes, and INVLPG (Invalidate Translation Look-Aside Buffer Entry), but adds overhead to the page walk and TLB (Translation Look-Aside Buffer) fill processes.
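To give a feel for the added page-walk cost, the following sketch counts worst-case memory references for a TLB miss. It assumes that no TLB or paging-structure cache hits occur and that every guest page table pointer is a guest-physical address that must itself be translated through the extended page tables. The function names and the four-level depth used in the example are assumptions; the article does not state the table depth.

```python
def native_walk_references(levels):
    """One memory reference per page table level on a native TLB miss."""
    return levels


def nested_walk_references(guest_levels, ept_levels):
    """Worst-case references for a two-dimensional (guest + EPT) walk.

    For each guest level, the pointer to that level's table is a
    guest-physical address: translating it costs one full EPT walk,
    and reading the guest entry costs one more reference. The final
    guest-physical address then needs one last EPT walk.
    """
    per_guest_level = ept_levels + 1
    return guest_levels * per_guest_level + ept_levels


# With four levels on each side (an assumption for this example):
assert native_walk_references(4) == 4
assert nested_walk_references(4, 4) == 24
```

Under these assumptions, a miss that costs 4 references natively can cost up to 24 with two sets of four-level tables. In practice, caching of translations reduces this figure.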
Because hardware page table virtualization removes exit overhead but lengthens page walks and TLB fills, its net performance effect varies by workload, and some applications benefit more than others. Specifically, applications with high levels of process creation and memory allocation see the most benefit.
As mentioned above, performance benefits from hardware page table virtualization are tied to the prevalence of VT-x exit transitions. Specifically, a higher rate of these events in an application workload suggests a higher likelihood of benefit from hardware page table virtualization.
VT-x exit transitions from the guest to monitor occur when guest code accesses or modifies privileged virtualized state, executes privileged instructions, or handles certain external events (such as external interrupts). Frequent exits are caused by page faults, external interrupts, control register reads/writes, and I/O instructions. Page fault exits include guest page table faults, MMU faults, APIC (Advanced Programmable Interrupt Controller) reads and writes, and device MMIO (memory-mapped I/O) reads and writes.
Using an instrumented VMM, it is possible to detect exit transitions that occur during code execution. By comparing this data between different builds of applications during development, it is possible to gain additional insight into the contributors to performance in virtual environments under Intel VT from hardware page table virtualization.
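One way to compare builds is to tally exit reasons from each run and look at the differences. The sketch below assumes the instrumented VMM can produce a sequence of exit-reason labels for each run. That log format, the label strings, and the function `compare_builds` are hypothetical and do not describe the output of any particular VMM.

```python
from collections import Counter


def compare_builds(baseline_log, candidate_log):
    """Return {exit reason: change in count} between two runs.

    Each log is assumed to be an iterable of exit-reason labels,
    one per recorded exit transition.
    """
    base = Counter(baseline_log)
    cand = Counter(candidate_log)
    reasons = sorted(set(base) | set(cand))
    # Counter returns 0 for missing keys, so new or vanished reasons work too
    return {reason: cand[reason] - base[reason] for reason in reasons}


# Illustrative labels only; a real VMM would define its own exit reasons.
baseline = ["page_fault", "page_fault", "cr3_write", "io_instruction"]
candidate = ["page_fault", "io_instruction", "io_instruction"]

delta = compare_builds(baseline, candidate)
assert delta == {"cr3_write": -1, "io_instruction": 1, "page_fault": -1}
```

A negative change for page-fault or control-register exits between builds would suggest that the newer build is less dependent on the exits that hardware page table virtualization removes.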
Enterprise administration often requires that security and management agents be placed on user machines, and it is desirable to make those agents inaccessible both to users and to unauthorized code. For example, restricting access to an intrusion-detection agent in this way could prevent it from being removed by the user or compromised by malicious code.
DMA mapping under VT-d makes it possible for agents to be placed in a dedicated service-partition VM, the memory pages of which are accessible only by specific DMA devices (such as NICs specified by IT). Thus, access to the service partition can be controlled by system administrators, effectively isolating the security and management agents from the user.
Some I/O devices have limited DMA addressability that prevents access to high memory. In order to copy I/O buffers into high memory, software may use bounce buffer techniques; a bounce buffer is a memory area used for the temporary storage of data that is copied between the device and a device-inaccessible memory area. Using this copying technique introduces significant overhead. System software using VT-d can use DMA remapping to overcome the device's addressability limitations, redirecting the data to high memory without resorting to bounce buffer techniques.
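The contrast between the two approaches can be sketched by counting how many bytes the CPU must copy in each case. The model below assumes a device limited to the low 4 GiB and 4 KiB pages. The constant `DEVICE_LIMIT` and the functions `receive_with_bounce` and `receive_with_remap` are hypothetical and exist only for this illustration.

```python
PAGE = 4096
DEVICE_LIMIT = 1 << 32  # assumed: the device can address only the low 4 GiB


def device_can_reach(addr, length):
    return addr + length <= DEVICE_LIMIT


def receive_with_bounce(target_addr, length, bounce_addr):
    """Return (address the device writes to, bytes the CPU copies afterwards)."""
    if device_can_reach(target_addr, length):
        return target_addr, 0
    if not device_can_reach(bounce_addr, length):
        raise ValueError("bounce buffer must lie in device-addressable memory")
    # The device fills the low bounce buffer; the CPU then copies to the target.
    return bounce_addr, length


def receive_with_remap(target_addr, length, device_window):
    """Return (device-visible address, remap table, bytes the CPU copies).

    The remap table maps device-visible pages inside the addressable
    window onto the high-memory pages that hold the real buffer.
    """
    first_page = target_addr // PAGE
    last_page = (target_addr + length - 1) // PAGE
    window_page = device_window // PAGE
    table = {
        window_page + i: first_page + i
        for i in range(last_page - first_page + 1)
    }
    return device_window + target_addr % PAGE, table, 0
```

For a buffer above 4 GiB, `receive_with_bounce` reports a CPU copy of the full transfer length, whereas `receive_with_remap` reports zero because the device writes through the remapped window directly into high memory.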
This article is part of a series of guides that identify best practices for the use of common products and technologies with Intel VT to support virtualized enterprise workloads. The entire series is introduced in the companion article, "Intel® Virtualization Technology: Best Practices for Software Vendors," which provides a general introduction to virtualization best practices, as well as links to each guide in the series.