From: "Nadav Har'El"
Subject: [PATCH 0/31] nVMX: Nested VMX, v10
Date: Mon, 16 May 2011 22:43:54 +0300
Message-ID: <1305575004-nyh@il.ibm.com>
Cc: gleb@redhat.com, avi@redhat.com
To: kvm@vger.kernel.org

Hi,

This is the tenth iteration of the nested VMX patch set. Improvements in this
version over the previous one include:

* Fix the code which did not fully maintain a list of all VMCSs loaded on
  each CPU. (Avi, this was the big thing that bothered you in the previous
  version).
* Add nested-entry-time (L1->L2) verification of control fields of vmcs12 -
  procbased, pinbased, entry, exit and secondary controls - compared to the
  capability MSRs which we advertise to L1.
  The values we advertise (and verify during entry) are stored in variables,
  and theoretically can be modified to reduce the capabilities given to L1
  (although there's no API for that yet).
* Explain the external-interrupt injection (patch 23) more accurately.
  Also got rid of the mysterious "is_interrupt" flag to nested_vmx_vmexit().
* Fix incorrect VMCS_LINK_POINTER merging; now we always set it to -1 (as the
  spec suggests), and fail nested entry if vmcs12's isn't -1 (with exit
  qualification 4 - see section 23.7).
* Store idt_vectoring_info and related fields in vmcs12, instead of new
  vmx->nested fields, between exit and entry. I still *haven't* done the
  complete rewrite of the idt_vectoring_info handling that Gleb requested.

And fixed two bugs reported by real users (hooray!) from this mailing list:

* Fix bug where sometimes NMIs headed for L0 were also injected into L1.
  Thanks to Abel Gordon for investigating this bug.
* Removed incorrect test of guest mov-SS block during entry, which prevented
  L2 from running for one tester. I removed this test (rather than correcting
  it), as the processor will do exactly the same test anyway when L0 runs L2,
  and entry failure at that time will be returned to L1 as its entry failure.

This version doesn't yet include a fix for the missing VMPTRLD check that
Marcello sent to the list just a few minutes ago.

This new set of patches applies to the current KVM trunk (I checked with
6f1bd0daae731ff07f4755b4f56730a6e4a3c1cb).
If you wish, you can also check out an already-patched version of KVM from
branch "nvmx10" of the repository:

	git://github.com/nyh/kvm-nested-vmx.git

About nested VMX:
-----------------

The following 31 patches implement nested VMX support. This feature enables a
guest to use the VMX APIs in order to run its own nested guests. In other
words, it allows running hypervisors (that use VMX) under KVM.
Multiple guest hypervisors can be run concurrently, and each of those can in
turn host multiple guests.

The theory behind this work, our implementation, and its performance
characteristics were presented in OSDI 2010 (the USENIX Symposium on
Operating Systems Design and Implementation). Our paper was titled
"The Turtles Project: Design and Implementation of Nested Virtualization",
and was awarded "Jay Lepreau Best Paper".
The paper is available online, at:

	http://www.usenix.org/events/osdi10/tech/full_papers/Ben-Yehuda.pdf

This patch set does not include all the features described in the paper.
In particular, this patch set is missing nested EPT (L1 can't use EPT and
must use shadow page tables). It is also missing some features required to
run VMware hypervisors as a guest. These missing features will be sent as
follow-on patches.

Running nested VMX:
-------------------

The nested VMX feature is currently disabled by default. It must be
explicitly enabled with the "nested=1" option to the kvm-intel module.

No modifications are required to user space (qemu). However, qemu's default
emulated CPU type (qemu64) does not list the "VMX" CPU feature, so it must be
explicitly enabled, by giving qemu one of the following options:

     -cpu host              (emulated CPU has all features of the real CPU)

     -cpu qemu64,+vmx       (add just the vmx feature to a named CPU type)

This version was only tested with KVM (64-bit) as a guest hypervisor, and
Linux as a nested guest.

Patch statistics:
-----------------

 Documentation/kvm/nested-vmx.txt |  243 ++
 arch/x86/include/asm/kvm_host.h  |    2
 arch/x86/include/asm/msr-index.h |   12
 arch/x86/include/asm/vmx.h       |   39
 arch/x86/kvm/svm.c               |    6
 arch/x86/kvm/vmx.c               | 2658 ++++++++++++++++++++++++++++-
 arch/x86/kvm/x86.c               |   11
 arch/x86/kvm/x86.h               |    8
 8 files changed, 2884 insertions(+), 95 deletions(-)

--
Nadav Har'El
IBM Haifa Research Lab