* [PATCH 00 of 20] NestedVMX support
@ 2011-06-02  8:57 Eddie Dong
  2011-06-02  8:57 ` [PATCH 01 of 20] pre-cleanup1: Extend nhvm_vmcx_guest_intercepts_trap to include errcode to Eddie Dong
                   ` (20 more replies)
  0 siblings, 21 replies; 45+ messages in thread
From: Eddie Dong @ 2011-06-02  8:57 UTC (permalink / raw)
  To: Tim.Deegan; +Cc: xen-devel


This patch series enables nested VMX support.

Patches 1-2 are pre-cleanups.
Patch 3 adds the nested VMX data structure.
Patch 4 adds APIs for the nestedhvm ops.
Patches 5, 7, 8, 9, 10 and 11 cover VMX instruction emulation.
Patch 6 defines the virtual VMCS data structure and its access APIs.
Patch 12 adds VMCS switching.
Patch 13 emulates VMRESUME/VMLAUNCH.
Patch 14 extends the shadow control VMCS fields for the n2 guest.
Patch 15 performs the n1/n2 guest VMCS switch.
Patch 16 handles interrupts and exceptions.
Patch 17 implements nested VM exit.
Patch 18 handles lazy FPU, and patch 19 the VMXE bit in CR4.
Patch 20 adds MSR handling and capability exposure.


Thanks, Eddie


* [PATCH 01 of 20] pre-cleanup1: Extend nhvm_vmcx_guest_intercepts_trap to include errcode to
  2011-06-02  8:57 [PATCH 00 of 20] NestedVMX support Eddie Dong
@ 2011-06-02  8:57 ` Eddie Dong
  2011-06-02  8:57 ` [PATCH 02 of 20] pre-cleanup2: Move IDT_VECTORING processing code out of intr_assist Eddie Dong
                   ` (19 subsequent siblings)
  20 siblings, 0 replies; 45+ messages in thread
From: Eddie Dong @ 2011-06-02  8:57 UTC (permalink / raw)
  To: Tim.Deegan; +Cc: xen-devel

# HG changeset patch
# User Eddie Dong <eddie.dong@intel.com>
# Date 1307003600 -28800
# Node ID 332616c43f52f85893f41537e9e6324a3fa01a57
# Parent  0c446850d85e654dfde039a0a1a5acd4e6b3c278
pre-cleanup1: Extend nhvm_vmcx_guest_intercepts_trap to include errcode to
assist the TRAP_page_fault decision in VMX.

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>

diff -r 0c446850d85e -r 332616c43f52 xen/arch/x86/hvm/hvm.c
--- a/xen/arch/x86/hvm/hvm.c	Wed May 11 12:58:04 2011 +0100
+++ b/xen/arch/x86/hvm/hvm.c	Thu Jun 02 16:33:20 2011 +0800
@@ -1152,7 +1152,7 @@ void hvm_inject_exception(unsigned int t
         return;
     }
 
-    if ( nhvm_vmcx_guest_intercepts_trap(v, trapnr) )
+    if ( nhvm_vmcx_guest_intercepts_trap(v, trapnr, errcode) )
     {
         enum nestedhvm_vmexits nsret;
 
@@ -4175,10 +4175,10 @@ uint32_t nhvm_vcpu_asid(struct vcpu *v)
     return -EOPNOTSUPP;
 }
 
-int nhvm_vmcx_guest_intercepts_trap(struct vcpu *v, unsigned int trap)
+int nhvm_vmcx_guest_intercepts_trap(struct vcpu *v, unsigned int trap, int errcode)
 {
     if (hvm_funcs.nhvm_vmcx_guest_intercepts_trap)
-        return hvm_funcs.nhvm_vmcx_guest_intercepts_trap(v, trap);
+        return hvm_funcs.nhvm_vmcx_guest_intercepts_trap(v, trap, errcode);
     return -EOPNOTSUPP;
 }
 
diff -r 0c446850d85e -r 332616c43f52 xen/arch/x86/hvm/svm/nestedsvm.c
--- a/xen/arch/x86/hvm/svm/nestedsvm.c	Wed May 11 12:58:04 2011 +0100
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c	Thu Jun 02 16:33:20 2011 +0800
@@ -895,7 +895,7 @@ nsvm_vmcb_guest_intercepts_exitcode(stru
 }
 
 int
-nsvm_vmcb_guest_intercepts_trap(struct vcpu *v, unsigned int trapnr)
+nsvm_vmcb_guest_intercepts_trap(struct vcpu *v, unsigned int trapnr, int errcode)
 {
     return nsvm_vmcb_guest_intercepts_exitcode(v,
         guest_cpu_user_regs(), VMEXIT_EXCEPTION_DE + trapnr);
diff -r 0c446850d85e -r 332616c43f52 xen/include/asm-x86/hvm/hvm.h
--- a/xen/include/asm-x86/hvm/hvm.h	Wed May 11 12:58:04 2011 +0100
+++ b/xen/include/asm-x86/hvm/hvm.h	Thu Jun 02 16:33:20 2011 +0800
@@ -164,7 +164,8 @@ struct hvm_function_table {
     uint64_t (*nhvm_vcpu_guestcr3)(struct vcpu *v);
     uint64_t (*nhvm_vcpu_hostcr3)(struct vcpu *v);
     uint32_t (*nhvm_vcpu_asid)(struct vcpu *v);
-    int (*nhvm_vmcx_guest_intercepts_trap)(struct vcpu *v, unsigned int trapnr);
+    int (*nhvm_vmcx_guest_intercepts_trap)(struct vcpu *v, 
+                               unsigned int trapnr, int errcode);
 
     bool_t (*nhvm_vmcx_hap_enabled)(struct vcpu *v);
 
@@ -443,7 +444,8 @@ uint64_t nhvm_vcpu_hostcr3(struct vcpu *
 uint32_t nhvm_vcpu_asid(struct vcpu *v);
 
 /* returns true, when l1 guest intercepts the specified trap */
-int nhvm_vmcx_guest_intercepts_trap(struct vcpu *v, unsigned int trapnr);
+int nhvm_vmcx_guest_intercepts_trap(struct vcpu *v, 
+                                    unsigned int trapnr, int errcode);
 
 /* returns true when l1 guest wants to use hap to run l2 guest */
 bool_t nhvm_vmcx_hap_enabled(struct vcpu *v);
diff -r 0c446850d85e -r 332616c43f52 xen/include/asm-x86/hvm/svm/nestedsvm.h
--- a/xen/include/asm-x86/hvm/svm/nestedsvm.h	Wed May 11 12:58:04 2011 +0100
+++ b/xen/include/asm-x86/hvm/svm/nestedsvm.h	Thu Jun 02 16:33:20 2011 +0800
@@ -114,7 +114,8 @@ uint64_t nsvm_vcpu_hostcr3(struct vcpu *
 uint32_t nsvm_vcpu_asid(struct vcpu *v);
 int nsvm_vmcb_guest_intercepts_exitcode(struct vcpu *v,
     struct cpu_user_regs *regs, uint64_t exitcode);
-int nsvm_vmcb_guest_intercepts_trap(struct vcpu *v, unsigned int trapnr);
+int nsvm_vmcb_guest_intercepts_trap(struct vcpu *v, unsigned int trapnr,
+                                    int errcode);
 bool_t nsvm_vmcb_hap_enabled(struct vcpu *v);
 enum hvm_intblk nsvm_intr_blocked(struct vcpu *v);
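
For reference, the reason the error code matters on the VMX side: per the SDM,
whether a page fault (vector 14) taken by the L2 guest is intercepted by the L1
guest depends not only on bit 14 of L1's exception bitmap but also on the
virtual VMCS's PAGE_FAULT_ERROR_CODE_MASK/MATCH controls, and that check needs
the error code. A minimal sketch of the decision this parameter enables (a
hypothetical helper for illustration, not code from this patch; the SVM hunk
above simply ignores the new argument):

    /*
     * Sketch only: how a nested-VMX implementation can use the new errcode
     * argument.  A #PF is intercepted iff
     *   ((errcode & pfec_mask) == pfec_match)  ==  exception_bitmap[14];
     * every other vector consults the exception bitmap alone.
     */
    static int guest_intercepts_trap(uint32_t exception_bitmap,
                                     uint32_t pfec_mask, uint32_t pfec_match,
                                     unsigned int trapnr, int errcode)
    {
        int bit = !!(exception_bitmap & (1u << trapnr));

        if ( trapnr != 14 /* TRAP_page_fault */ )
            return bit;

        return ((errcode & pfec_mask) == pfec_match) == bit;
    }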


* [PATCH 02 of 20] pre-cleanup2: Move IDT_VECTORING processing code out of intr_assist
  2011-06-02  8:57 [PATCH 00 of 20] NestedVMX support Eddie Dong
  2011-06-02  8:57 ` [PATCH 01 of 20] pre-cleanup1: Extend nhvm_vmcx_guest_intercepts_trap to include errcode to Eddie Dong
@ 2011-06-02  8:57 ` Eddie Dong
  2011-06-02  8:57 ` [PATCH 03 of 20] Add data structure for nestedvmx Eddie Dong
                   ` (18 subsequent siblings)
  20 siblings, 0 replies; 45+ messages in thread
From: Eddie Dong @ 2011-06-02  8:57 UTC (permalink / raw)
  To: Tim.Deegan; +Cc: xen-devel

# HG changeset patch
# User Eddie Dong <eddie.dong@intel.com>
# Date 1307003600 -28800
# Node ID ce6ed8ca4ebd2f2fb96627e61f7d2ef737e7193d
# Parent  332616c43f52f85893f41537e9e6324a3fa01a57
pre-cleanup2: Move IDT_VECTORING processing code out of intr_assist.

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>

diff -r 332616c43f52 -r ce6ed8ca4ebd xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:20 2011 +0800
@@ -2098,6 +2098,33 @@ static int vmx_handle_eoi_write(void)
     return 0;
 }
 
+static void vmx_idtv_reinject(unsigned long idtv_info)
+{
+
+    /* Event delivery caused this intercept? Queue for redelivery. */
+    if ( unlikely(idtv_info & INTR_INFO_VALID_MASK) )
+    {
+        if ( hvm_event_needs_reinjection((idtv_info>>8)&7, idtv_info&0xff) )
+        {
+            /* See SDM 3B 25.7.1.1 and .2 for info about masking resvd bits. */
+            __vmwrite(VM_ENTRY_INTR_INFO,
+                      idtv_info & ~INTR_INFO_RESVD_BITS_MASK);
+            if ( idtv_info & INTR_INFO_DELIVER_CODE_MASK )
+                __vmwrite(VM_ENTRY_EXCEPTION_ERROR_CODE,
+                          __vmread(IDT_VECTORING_ERROR_CODE));
+        }
+
+        /*
+         * Clear NMI-blocking interruptibility info if an NMI delivery faulted.
+         * Re-delivery will re-set it (see SDM 3B 25.7.1.2).
+         */
+        if ( (idtv_info & INTR_INFO_INTR_TYPE_MASK) == (X86_EVENTTYPE_NMI<<8) )
+            __vmwrite(GUEST_INTERRUPTIBILITY_INFO,
+                      __vmread(GUEST_INTERRUPTIBILITY_INFO) &
+                      ~VMX_INTR_SHADOW_NMI);
+    }
+}
+
 asmlinkage void vmx_vmexit_handler(struct cpu_user_regs *regs)
 {
     unsigned int exit_reason, idtv_info, intr_info = 0, vector = 0;
@@ -2187,30 +2214,9 @@ asmlinkage void vmx_vmexit_handler(struc
 
     hvm_maybe_deassert_evtchn_irq();
 
-    /* Event delivery caused this intercept? Queue for redelivery. */
     idtv_info = __vmread(IDT_VECTORING_INFO);
-    if ( unlikely(idtv_info & INTR_INFO_VALID_MASK) &&
-         (exit_reason != EXIT_REASON_TASK_SWITCH) )
-    {
-        if ( hvm_event_needs_reinjection((idtv_info>>8)&7, idtv_info&0xff) )
-        {
-            /* See SDM 3B 25.7.1.1 and .2 for info about masking resvd bits. */
-            __vmwrite(VM_ENTRY_INTR_INFO,
-                      idtv_info & ~INTR_INFO_RESVD_BITS_MASK);
-            if ( idtv_info & INTR_INFO_DELIVER_CODE_MASK )
-                __vmwrite(VM_ENTRY_EXCEPTION_ERROR_CODE,
-                          __vmread(IDT_VECTORING_ERROR_CODE));
-        }
-
-        /*
-         * Clear NMI-blocking interruptibility info if an NMI delivery faulted.
-         * Re-delivery will re-set it (see SDM 3B 25.7.1.2).
-         */
-        if ( (idtv_info & INTR_INFO_INTR_TYPE_MASK) == (X86_EVENTTYPE_NMI<<8) )
-            __vmwrite(GUEST_INTERRUPTIBILITY_INFO,
-                      __vmread(GUEST_INTERRUPTIBILITY_INFO) &
-                      ~VMX_INTR_SHADOW_NMI);
-    }
+    if ( exit_reason != EXIT_REASON_TASK_SWITCH )
+        vmx_idtv_reinject(idtv_info);
 
     switch ( exit_reason )
     {


* [PATCH 03 of 20] Add data structure for nestedvmx
  2011-06-02  8:57 [PATCH 00 of 20] NestedVMX support Eddie Dong
  2011-06-02  8:57 ` [PATCH 01 of 20] pre-cleanup1: Extend nhvm_vmcx_guest_intercepts_trap to include errcode to Eddie Dong
  2011-06-02  8:57 ` [PATCH 02 of 20] pre-cleanup2: Move IDT_VECTORING processing code out of intr_assist Eddie Dong
@ 2011-06-02  8:57 ` Eddie Dong
  2011-06-02  8:57 ` [PATCH 04 of 20] Add APIs for nestedhvm_ops Eddie Dong
                   ` (17 subsequent siblings)
  20 siblings, 0 replies; 45+ messages in thread
From: Eddie Dong @ 2011-06-02  8:57 UTC (permalink / raw)
  To: Tim.Deegan; +Cc: xen-devel

# HG changeset patch
# User Eddie Dong <eddie.dong@intel.com>
# Date 1307003600 -28800
# Node ID 4bbf0eaec85c764c7872d1cfc1c59c419dfabe0a
# Parent  ce6ed8ca4ebd2f2fb96627e61f7d2ef737e7193d
Add data structure for nestedvmx

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>

diff -r ce6ed8ca4ebd -r 4bbf0eaec85c xen/include/asm-x86/hvm/vcpu.h
--- a/xen/include/asm-x86/hvm/vcpu.h	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/include/asm-x86/hvm/vcpu.h	Thu Jun 02 16:33:20 2011 +0800
@@ -24,6 +24,7 @@
 #include <asm/hvm/io.h>
 #include <asm/hvm/vlapic.h>
 #include <asm/hvm/vmx/vmcs.h>
+#include <asm/hvm/vmx/vvmx.h>
 #include <asm/hvm/svm/vmcb.h>
 #include <asm/hvm/svm/nestedsvm.h>
 #include <asm/mtrr.h>
@@ -57,6 +58,7 @@ struct nestedvcpu {
     /* SVM/VMX arch specific */
     union {
         struct nestedsvm nsvm;
+        struct nestedvmx nvmx;
     } u;
 
     bool_t nv_flushp2m; /* True, when p2m table must be flushed */
diff -r ce6ed8ca4ebd -r 4bbf0eaec85c xen/include/asm-x86/hvm/vmx/vvmx.h
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:20 2011 +0800
@@ -0,0 +1,38 @@
+
+/*
+ * vvmx.h: Support virtual VMX for nested virtualization.
+ *
+ * Copyright (c) 2010, Intel Corporation.
+ * Author: Qing He <qing.he@intel.com>
+ *         Eddie Dong <eddie.dong@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ *
+ */
+#ifndef __ASM_X86_HVM_VVMX_H__
+#define __ASM_X86_HVM_VVMX_H__
+
+struct nestedvmx {
+    paddr_t    vmxon_region_pa;
+    void       *iobitmap[2];		/* map (va) of L1 guest I/O bitmap */
+    /* deferred nested interrupt */
+    struct {
+        unsigned long intr_info;
+        u32           error_code;
+    } intr;
+};
+
+#define vcpu_2_nvmx(v)	(vcpu_nestedhvm(v).u.nvmx)
+#endif /* __ASM_X86_HVM_VVMX_H__ */
+


* [PATCH 04 of 20] Add APIs for nestedhvm_ops
  2011-06-02  8:57 [PATCH 00 of 20] NestedVMX support Eddie Dong
                   ` (2 preceding siblings ...)
  2011-06-02  8:57 ` [PATCH 03 of 20] Add data structure for nestedvmx Eddie Dong
@ 2011-06-02  8:57 ` Eddie Dong
  2011-06-02  8:57 ` [PATCH 05 of 20] Emulation of guest VMXON/OFF instruction Eddie Dong
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 45+ messages in thread
From: Eddie Dong @ 2011-06-02  8:57 UTC (permalink / raw)
  To: Tim.Deegan; +Cc: xen-devel

# HG changeset patch
# User Eddie Dong <eddie.dong@intel.com>
# Date 1307003600 -28800
# Node ID 4e094881883f10f94575a6f69194a2393e16b7d1
# Parent  4bbf0eaec85c764c7872d1cfc1c59c419dfabe0a
Add APIs for nestedhvm_ops.

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>

diff -r 4bbf0eaec85c -r 4e094881883f xen/arch/x86/hvm/hvm.c
--- a/xen/arch/x86/hvm/hvm.c	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/arch/x86/hvm/hvm.c	Thu Jun 02 16:33:20 2011 +0800
@@ -3502,7 +3502,7 @@ long do_hvm_op(unsigned long op, XEN_GUE
                 /* Remove the check below once we have
                  * shadow-on-shadow.
                  */
-                if ( !paging_mode_hap(d) && a.value )
+                if ( cpu_has_svm && !paging_mode_hap(d) && a.value )
                     rc = -EINVAL;
                 /* Set up NHVM state for any vcpus that are already up */
                 if ( !d->arch.hvm_domain.params[HVM_PARAM_NESTEDHVM] )
diff -r 4bbf0eaec85c -r 4e094881883f xen/arch/x86/hvm/vmx/Makefile
--- a/xen/arch/x86/hvm/vmx/Makefile	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/Makefile	Thu Jun 02 16:33:20 2011 +0800
@@ -4,3 +4,4 @@ obj-y += realmode.o
 obj-y += vmcs.o
 obj-y += vmx.o
 obj-y += vpmu_core2.o
+obj-y += vvmx.o
diff -r 4bbf0eaec85c -r 4e094881883f xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:20 2011 +0800
@@ -1407,7 +1407,13 @@ static struct hvm_function_table __read_
     .invlpg_intercept     = vmx_invlpg_intercept,
     .set_uc_mode          = vmx_set_uc_mode,
     .set_info_guest       = vmx_set_info_guest,
-    .set_rdtsc_exiting    = vmx_set_rdtsc_exiting
+    .set_rdtsc_exiting    = vmx_set_rdtsc_exiting,
+    .nhvm_vcpu_initialise = nvmx_vcpu_initialise,
+    .nhvm_vcpu_destroy    = nvmx_vcpu_destroy,
+    .nhvm_vcpu_reset      = nvmx_vcpu_reset,
+    .nhvm_vcpu_guestcr3   = nvmx_vcpu_guestcr3,
+    .nhvm_vcpu_hostcr3    = nvmx_vcpu_hostcr3,
+    .nhvm_vcpu_asid       = nvmx_vcpu_asid
 };
 
 struct hvm_function_table * __init start_vmx(void)
diff -r 4bbf0eaec85c -r 4e094881883f xen/arch/x86/hvm/vmx/vvmx.c
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:20 2011 +0800
@@ -0,0 +1,93 @@
+/*
+ * vvmx.c: Support virtual VMX for nested virtualization.
+ *
+ * Copyright (c) 2010, Intel Corporation.
+ * Author: Qing He <qing.he@intel.com>
+ *         Eddie Dong <eddie.dong@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ *
+ */
+
+#include <xen/config.h>
+#include <asm/types.h>
+#include <asm/p2m.h>
+#include <asm/hvm/vmx/vmx.h>
+#include <asm/hvm/vmx/vvmx.h>
+
+int nvmx_vcpu_initialise(struct vcpu *v)
+{
+    struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+
+    nvcpu->nv_n2vmcx = alloc_xenheap_page();
+    if ( !nvcpu->nv_n2vmcx )
+    {
+        gdprintk(XENLOG_ERR, "nest: allocation for shadow vmcs failed\n");
+        goto out;
+    }
+    nvmx->vmxon_region_pa = 0;
+    nvcpu->nv_vvmcx = NULL;
+    nvcpu->nv_vvmcxaddr = VMCX_EADDR;
+    nvmx->intr.intr_info = 0;
+    nvmx->intr.error_code = 0;
+    nvmx->iobitmap[0] = NULL;
+    nvmx->iobitmap[1] = NULL;
+    return 0;
+out:
+    return -ENOMEM;
+}
+ 
+void nvmx_vcpu_destroy(struct vcpu *v)
+{
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+
+    if ( nvcpu->nv_n2vmcx ) {
+        __vmpclear(virt_to_maddr(nvcpu->nv_n2vmcx));
+        free_xenheap_page(nvcpu->nv_n2vmcx);
+        nvcpu->nv_n2vmcx = NULL;
+    }
+    if ( nvcpu->nv_vvmcx ) {
+        unmap_domain_page_global(nvcpu->nv_vvmcx);
+        nvcpu->nv_vvmcx = NULL;
+    }
+    nvcpu->nv_vvmcxaddr = VMCX_EADDR;
+}
+ 
+int nvmx_vcpu_reset(struct vcpu *v)
+{
+    return 0;
+}
+
+uint64_t nvmx_vcpu_guestcr3(struct vcpu *v)
+{
+    /* TODO */
+    ASSERT(0);
+    return 0;
+}
+
+uint64_t nvmx_vcpu_hostcr3(struct vcpu *v)
+{
+    /* TODO */
+    ASSERT(0);
+    return 0;
+}
+
+uint32_t nvmx_vcpu_asid(struct vcpu *v)
+{
+    /* TODO */
+    ASSERT(0);
+    return 0;
+}
+
diff -r 4bbf0eaec85c -r 4e094881883f xen/include/asm-x86/hvm/vmx/vvmx.h
--- a/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:20 2011 +0800
@@ -34,5 +34,13 @@ struct nestedvmx {
 };
 
 #define vcpu_2_nvmx(v)	(vcpu_nestedhvm(v).u.nvmx)
+
+int nvmx_vcpu_initialise(struct vcpu *v);
+void nvmx_vcpu_destroy(struct vcpu *v);
+int nvmx_vcpu_reset(struct vcpu *v);
+uint64_t nvmx_vcpu_guestcr3(struct vcpu *v);
+uint64_t nvmx_vcpu_hostcr3(struct vcpu *v);
+uint32_t nvmx_vcpu_asid(struct vcpu *v);
+
 #endif /* __ASM_X86_HVM_VVMX_H__ */
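
These ops are reached through the usual hvm_funcs indirection; the generic
wrappers in hvm.c follow the same shape as the nhvm_vmcx_guest_intercepts_trap
wrapper touched in patch 1. A sketch of that dispatch shape (illustrative, not
a literal copy of the hvm.c code):

    /* Generic dispatch: lands in nvmx_vcpu_initialise() on VMX hardware,
     * and keeps using the nested-SVM implementation on AMD. */
    int nhvm_vcpu_initialise(struct vcpu *v)
    {
        if ( hvm_funcs.nhvm_vcpu_initialise )
            return hvm_funcs.nhvm_vcpu_initialise(v);
        return -EOPNOTSUPP;
    }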


* [PATCH 05 of 20] Emulation of guest VMXON/OFF instruction
  2011-06-02  8:57 [PATCH 00 of 20] NestedVMX support Eddie Dong
                   ` (3 preceding siblings ...)
  2011-06-02  8:57 ` [PATCH 04 of 20] Add APIs for nestedhvm_ops Eddie Dong
@ 2011-06-02  8:57 ` Eddie Dong
  2011-06-02 14:36   ` Tim Deegan
  2011-06-02  8:57 ` [PATCH 06 of 20] Define structure and access APIs for virtual VMCS Eddie Dong
                   ` (15 subsequent siblings)
  20 siblings, 1 reply; 45+ messages in thread
From: Eddie Dong @ 2011-06-02  8:57 UTC (permalink / raw)
  To: Tim.Deegan; +Cc: xen-devel

# HG changeset patch
# User Eddie Dong <eddie.dong@intel.com>
# Date 1307003600 -28800
# Node ID c8812151acfd6d9468f3407bc6a1a278cd764567
# Parent  4e094881883f10f94575a6f69194a2393e16b7d1
Emulation of guest VMXON/OFF instruction.

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>

diff -r 4e094881883f -r c8812151acfd xen/arch/x86/hvm/vmx/Makefile
--- a/xen/arch/x86/hvm/vmx/Makefile	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/Makefile	Thu Jun 02 16:33:20 2011 +0800
@@ -5,3 +5,4 @@ obj-y += vmcs.o
 obj-y += vmx.o
 obj-y += vpmu_core2.o
 obj-y += vvmx.o
+obj-y += vvmx.o
diff -r 4e094881883f -r c8812151acfd xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:20 2011 +0800
@@ -2434,6 +2434,16 @@ asmlinkage void vmx_vmexit_handler(struc
         break;
     }
 
+    case EXIT_REASON_VMXOFF:
+        if ( nvmx_handle_vmxoff(regs) == X86EMUL_OKAY )
+            update_guest_eip();
+        break;
+
+    case EXIT_REASON_VMXON:
+        if ( nvmx_handle_vmxon(regs) == X86EMUL_OKAY )
+            update_guest_eip();
+        break;
+
     case EXIT_REASON_MWAIT_INSTRUCTION:
     case EXIT_REASON_MONITOR_INSTRUCTION:
     case EXIT_REASON_VMCLEAR:
@@ -2443,8 +2453,6 @@ asmlinkage void vmx_vmexit_handler(struc
     case EXIT_REASON_VMREAD:
     case EXIT_REASON_VMRESUME:
     case EXIT_REASON_VMWRITE:
-    case EXIT_REASON_VMXOFF:
-    case EXIT_REASON_VMXON:
     case EXIT_REASON_GETSEC:
     case EXIT_REASON_INVEPT:
     case EXIT_REASON_INVVPID:
diff -r 4e094881883f -r c8812151acfd xen/arch/x86/hvm/vmx/vvmx.c
--- a/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:20 2011 +0800
@@ -91,3 +91,228 @@ uint32_t nvmx_vcpu_asid(struct vcpu *v)
     return 0;
 }
 
+enum x86_segment sreg_to_index[] = {
+    [VMX_SREG_ES] = x86_seg_es,
+    [VMX_SREG_CS] = x86_seg_cs,
+    [VMX_SREG_SS] = x86_seg_ss,
+    [VMX_SREG_DS] = x86_seg_ds,
+    [VMX_SREG_FS] = x86_seg_fs,
+    [VMX_SREG_GS] = x86_seg_gs,
+};
+
+struct vmx_inst_decoded {
+#define VMX_INST_MEMREG_TYPE_MEMORY 0
+#define VMX_INST_MEMREG_TYPE_REG    1
+    int type;
+    union {
+        struct {
+            unsigned long mem;
+            unsigned int  len;
+        };
+        enum vmx_regs_enc reg1;
+    };
+
+    enum vmx_regs_enc reg2;
+};
+
+enum vmx_ops_result {
+    VMSUCCEED,
+    VMFAIL_VALID,
+    VMFAIL_INVALID,
+};
+
+#define CASE_GET_REG(REG, reg)      \
+    case VMX_REG_ ## REG: value = regs->reg; break
+
+static unsigned long reg_read(struct cpu_user_regs *regs,
+                              enum vmx_regs_enc index)
+{
+    unsigned long value = 0;
+
+    switch ( index ) {
+    CASE_GET_REG(RAX, eax);
+    CASE_GET_REG(RCX, ecx);
+    CASE_GET_REG(RDX, edx);
+    CASE_GET_REG(RBX, ebx);
+    CASE_GET_REG(RBP, ebp);
+    CASE_GET_REG(RSI, esi);
+    CASE_GET_REG(RDI, edi);
+    CASE_GET_REG(RSP, esp);
+#ifdef CONFIG_X86_64
+    CASE_GET_REG(R8, r8);
+    CASE_GET_REG(R9, r9);
+    CASE_GET_REG(R10, r10);
+    CASE_GET_REG(R11, r11);
+    CASE_GET_REG(R12, r12);
+    CASE_GET_REG(R13, r13);
+    CASE_GET_REG(R14, r14);
+    CASE_GET_REG(R15, r15);
+#endif
+    default:
+        break;
+    }
+
+    return value;
+}
+
+static int vmx_inst_check_privilege(struct cpu_user_regs *regs, int vmxop_check)
+{
+    struct vcpu *v = current;
+    struct segment_register cs;
+
+    hvm_get_segment_register(v, x86_seg_cs, &cs);
+
+    if ( vmxop_check )
+    {
+        if ( !(v->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE) ||
+             !(v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_VMXE) )
+            goto invalid_op;
+    }
+    else if ( !vcpu_2_nvmx(v).vmxon_region_pa )
+        goto invalid_op;
+
+    if ( (regs->eflags & X86_EFLAGS_VM) ||
+         (hvm_long_mode_enabled(v) && cs.attr.fields.l == 0) )
+        goto invalid_op;
+    /* TODO: check vmx operation mode */
+
+    if ( (cs.sel & 3) > 0 )
+        goto gp_fault;
+
+    return X86EMUL_OKAY;
+
+invalid_op:
+    gdprintk(XENLOG_ERR, "vmx_inst_check_privilege: invalid_op\n");
+    hvm_inject_exception(TRAP_invalid_op, 0, 0);
+    return X86EMUL_EXCEPTION;
+
+gp_fault:
+    gdprintk(XENLOG_ERR, "vmx_inst_check_privilege: gp_fault\n");
+    hvm_inject_exception(TRAP_gp_fault, 0, 0);
+    return X86EMUL_EXCEPTION;
+}
+
+static int decode_vmx_inst(struct cpu_user_regs *regs,
+                           struct vmx_inst_decoded *decode,
+                           unsigned long *poperandS, int vmxon_check)
+{
+    struct vcpu *v = current;
+    union vmx_inst_info info;
+    struct segment_register seg;
+    unsigned long base, index, seg_base, disp, offset;
+    int scale, size;
+
+    if ( vmx_inst_check_privilege(regs, vmxon_check) != X86EMUL_OKAY )
+        return X86EMUL_EXCEPTION;
+
+    info.word = __vmread(VMX_INSTRUCTION_INFO);
+
+    if ( info.fields.memreg ) {
+        decode->type = VMX_INST_MEMREG_TYPE_REG;
+        decode->reg1 = info.fields.reg1;
+        if ( poperandS != NULL )
+            *poperandS = reg_read(regs, decode->reg1);
+    }
+    else
+    {
+        decode->type = VMX_INST_MEMREG_TYPE_MEMORY;
+        hvm_get_segment_register(v, sreg_to_index[info.fields.segment], &seg);
+        /* TODO: segment type check */
+        seg_base = seg.base;
+
+        base = info.fields.base_reg_invalid ? 0 :
+            reg_read(regs, info.fields.base_reg);
+
+        index = info.fields.index_reg_invalid ? 0 :
+            reg_read(regs, info.fields.index_reg);
+
+        scale = 1 << info.fields.scaling;
+
+        disp = __vmread(EXIT_QUALIFICATION);
+
+        size = 1 << (info.fields.addr_size + 1);
+
+        offset = base + index * scale + disp;
+        if ( (offset > seg.limit || offset + size > seg.limit) &&
+            (!hvm_long_mode_enabled(v) || info.fields.segment == VMX_SREG_GS) )
+            goto gp_fault;
+
+        if ( poperandS != NULL &&
+             hvm_copy_from_guest_virt(poperandS, seg_base + offset, size, 0)
+                  != HVMCOPY_okay )
+            return X86EMUL_EXCEPTION;
+        decode->mem = seg_base + offset;
+        decode->len = size;
+    }
+
+    decode->reg2 = info.fields.reg2;
+
+    return X86EMUL_OKAY;
+
+gp_fault:
+    hvm_inject_exception(TRAP_gp_fault, 0, 0);
+    return X86EMUL_EXCEPTION;
+}
+
+static void vmreturn(struct cpu_user_regs *regs, enum vmx_ops_result ops_res)
+{
+    unsigned long eflags = regs->eflags;
+    unsigned long mask = X86_EFLAGS_CF | X86_EFLAGS_PF | X86_EFLAGS_AF |
+                         X86_EFLAGS_ZF | X86_EFLAGS_SF | X86_EFLAGS_OF;
+
+    eflags &= ~mask;
+
+    switch ( ops_res ) {
+    case VMSUCCEED:
+        break;
+    case VMFAIL_VALID:
+        /* TODO: error number, useful for guest VMM debugging */
+        eflags |= X86_EFLAGS_ZF;
+        break;
+    case VMFAIL_INVALID:
+    default:
+        eflags |= X86_EFLAGS_CF;
+        break;
+    }
+
+    regs->eflags = eflags;
+}
+
+/*
+ * VMX instructions handling
+ */
+
+int nvmx_handle_vmxon(struct cpu_user_regs *regs)
+{
+    struct vcpu *v=current;
+    struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
+    struct vmx_inst_decoded decode;
+    unsigned long gpa = 0;
+    int rc;
+
+    rc = decode_vmx_inst(regs, &decode, &gpa, 1);
+    if ( rc != X86EMUL_OKAY )
+        return rc;
+
+    nvmx->vmxon_region_pa = gpa;
+    vmreturn(regs, VMSUCCEED);
+
+    return X86EMUL_OKAY;
+}
+
+int nvmx_handle_vmxoff(struct cpu_user_regs *regs)
+{
+    struct vcpu *v=current;
+    struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
+    int rc;
+
+    rc = vmx_inst_check_privilege(regs, 0);
+    if ( rc != X86EMUL_OKAY )
+        return rc;
+
+    nvmx->vmxon_region_pa = 0;
+
+    vmreturn(regs, VMSUCCEED);
+    return X86EMUL_OKAY;
+}
+
diff -r 4e094881883f -r c8812151acfd xen/include/asm-x86/hvm/vmx/vvmx.h
--- a/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:20 2011 +0800
@@ -35,6 +35,58 @@ struct nestedvmx {
 
 #define vcpu_2_nvmx(v)	(vcpu_nestedhvm(v).u.nvmx)
 
+/*
+ * Encoding of VMX instructions, based on Tables 24-11 & 24-12 of SDM 3B
+ */
+
+enum vmx_regs_enc {
+    VMX_REG_RAX,
+    VMX_REG_RCX,
+    VMX_REG_RDX,
+    VMX_REG_RBX,
+    VMX_REG_RSP,
+    VMX_REG_RBP,
+    VMX_REG_RSI,
+    VMX_REG_RDI,
+#ifdef CONFIG_X86_64
+    VMX_REG_R8,
+    VMX_REG_R9,
+    VMX_REG_R10,
+    VMX_REG_R11,
+    VMX_REG_R12,
+    VMX_REG_R13,
+    VMX_REG_R14,
+    VMX_REG_R15,
+#endif
+};
+
+enum vmx_sregs_enc {
+    VMX_SREG_ES,
+    VMX_SREG_CS,
+    VMX_SREG_SS,
+    VMX_SREG_DS,
+    VMX_SREG_FS,
+    VMX_SREG_GS,
+};
+
+union vmx_inst_info {
+    struct {
+        unsigned int scaling           :2; /* bit 0-1 */
+        unsigned int __rsvd0           :1; /* bit 2 */
+        unsigned int reg1              :4; /* bit 3-6 */
+        unsigned int addr_size         :3; /* bit 7-9 */
+        unsigned int memreg            :1; /* bit 10 */
+        unsigned int __rsvd1           :4; /* bit 11-14 */
+        unsigned int segment           :3; /* bit 15-17 */
+        unsigned int index_reg         :4; /* bit 18-21 */
+        unsigned int index_reg_invalid :1; /* bit 22 */
+        unsigned int base_reg          :4; /* bit 23-26 */
+        unsigned int base_reg_invalid  :1; /* bit 27 */
+        unsigned int reg2              :4; /* bit 28-31 */
+    } fields;
+    u32 word;
+};
+
 int nvmx_vcpu_initialise(struct vcpu *v);
 void nvmx_vcpu_destroy(struct vcpu *v);
 int nvmx_vcpu_reset(struct vcpu *v);
@@ -42,5 +94,7 @@ uint64_t nvmx_vcpu_guestcr3(struct vcpu 
 uint64_t nvmx_vcpu_hostcr3(struct vcpu *v);
 uint32_t nvmx_vcpu_asid(struct vcpu *v);
 
+int nvmx_handle_vmxon(struct cpu_user_regs *regs);
+int nvmx_handle_vmxoff(struct cpu_user_regs *regs);
 #endif /* __ASM_X86_HVM_VVMX_H__ */
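
The flag convention that vmreturn() emulates is what the L1 guest tests after
each VMX instruction: VMsucceed clears all the arithmetic flags, VMfailInvalid
sets CF, and VMfailValid sets ZF (with an error number in the current VMCS,
still a TODO above). An illustrative decoder of that convention (an assumed
helper, not part of the patch):

    #include <stdint.h>

    /* Decode the VMX-instruction status that vmreturn() writes back into
     * the L1 guest's rflags. */
    static int vmx_inst_status(uint64_t rflags)
    {
        if ( rflags & 0x0001 )      /* CF set: VMfailInvalid */
            return -1;
        if ( rflags & 0x0040 )      /* ZF set: VMfailValid */
            return -2;
        return 0;                   /* VMsucceed */
    }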


* [PATCH 06 of 20] Define structure and access APIs for virtual VMCS
  2011-06-02  8:57 [PATCH 00 of 20] NestedVMX support Eddie Dong
                   ` (4 preceding siblings ...)
  2011-06-02  8:57 ` [PATCH 05 of 20] Emulation of guest VMXON/OFF instruction Eddie Dong
@ 2011-06-02  8:57 ` Eddie Dong
  2011-06-02  8:57 ` [PATCH 07 of 20] Emulation of guest vmptrld Eddie Dong
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 45+ messages in thread
From: Eddie Dong @ 2011-06-02  8:57 UTC (permalink / raw)
  To: Tim.Deegan; +Cc: xen-devel

# HG changeset patch
# User Eddie Dong <eddie.dong@intel.com>
# Date 1307003600 -28800
# Node ID 8264b01b476b1b695727f78d92ab0ce553aa7516
# Parent  c8812151acfd6d9468f3407bc6a1a278cd764567
Define structure and access APIs for virtual VMCS.


Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>

diff -r c8812151acfd -r 8264b01b476b xen/arch/x86/hvm/vmx/vvmx.c
--- a/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:20 2011 +0800
@@ -124,6 +124,84 @@ enum vmx_ops_result {
 #define CASE_GET_REG(REG, reg)      \
     case VMX_REG_ ## REG: value = regs->reg; break
 
+static int vvmcs_offset(u32 width, u32 type, u32 index)
+{
+    int offset;
+
+    offset = (index & 0x1f) | type << 5 | width << 7;
+
+    if ( offset == 0 )    /* vpid */
+        offset = 0x3f;
+
+    return offset;
+}
+
+u64 __get_vvmcs(void *vvmcs, u32 vmcs_encoding)
+{
+    union vmcs_encoding enc;
+    u64 *content = (u64 *) vvmcs;
+    int offset;
+    u64 res;
+
+    enc.word = vmcs_encoding;
+    offset = vvmcs_offset(enc.width, enc.type, enc.index);
+    res = content[offset];
+
+    switch ( enc.width ) {
+    case VVMCS_WIDTH_16:
+        res &= 0xffff;
+        break;
+    case VVMCS_WIDTH_64:
+        if ( enc.access_type )
+            res >>= 32;
+        break;
+    case VVMCS_WIDTH_32:
+        res &= 0xffffffff;
+        break;
+    case VVMCS_WIDTH_NATURAL:
+    default:
+        break;
+    }
+
+    return res;
+}
+
+void __set_vvmcs(void *vvmcs, u32 vmcs_encoding, u64 val)
+{
+    union vmcs_encoding enc;
+    u64 *content = (u64 *) vvmcs;
+    int offset;
+    u64 res;
+
+    enc.word = vmcs_encoding;
+    offset = vvmcs_offset(enc.width, enc.type, enc.index);
+    res = content[offset];
+
+    switch ( enc.width ) {
+    case VVMCS_WIDTH_16:
+        res = val & 0xffff;
+        break;
+    case VVMCS_WIDTH_64:
+        if ( enc.access_type )
+        {
+            res &= 0xffffffff;
+            res |= val << 32;
+        }
+        else
+            res = val;
+        break;
+    case VVMCS_WIDTH_32:
+        res = val & 0xffffffff;
+        break;
+    case VVMCS_WIDTH_NATURAL:
+    default:
+        res = val;
+        break;
+    }
+
+    content[offset] = res;
+}
+
 static unsigned long reg_read(struct cpu_user_regs *regs,
                               enum vmx_regs_enc index)
 {
diff -r c8812151acfd -r 8264b01b476b xen/include/asm-x86/hvm/vmx/vvmx.h
--- a/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:20 2011 +0800
@@ -96,5 +96,61 @@ uint32_t nvmx_vcpu_asid(struct vcpu *v);
 
 int nvmx_handle_vmxon(struct cpu_user_regs *regs);
 int nvmx_handle_vmxoff(struct cpu_user_regs *regs);
+/*
+ * Virtual VMCS layout
+ *
+ * Since the physical VMCS layout is unknown, a custom layout is used
+ * for the virtual VMCS seen by the guest. It occupies a 4k page, and
+ * each field is located by a 9-bit offset into u64[]. The offset is
+ * laid out as follows, which means every <width, type> pair has at
+ * most 32 fields available.
+ *
+ *             9       7      5               0
+ *             --------------------------------
+ *     offset: | width | type |     index     |
+ *             --------------------------------
+ *
+ * Also, since the lower range <width=0, type={0,1}> has only one
+ * field (VPID), that field is moved to a higher offset (63), leaving
+ * the lower range for non-indexed fields like the VMCS revision.
+ *
+ */
+
+#define VVMCS_REVISION 0x40000001u
+
+struct vvmcs_header {
+    u32 revision;
+    u32 abort;
+};
+
+union vmcs_encoding {
+    struct {
+        u32 access_type : 1;
+        u32 index : 9;
+        u32 type : 2;
+        u32 rsv1 : 1;
+        u32 width : 2;
+        u32 rsv2 : 17;
+    };
+    u32 word;
+};
+
+enum vvmcs_encoding_width {
+    VVMCS_WIDTH_16 = 0,
+    VVMCS_WIDTH_64,
+    VVMCS_WIDTH_32,
+    VVMCS_WIDTH_NATURAL,
+};
+
+enum vvmcs_encoding_type {
+    VVMCS_TYPE_CONTROL = 0,
+    VVMCS_TYPE_RO,
+    VVMCS_TYPE_GSTATE,
+    VVMCS_TYPE_HSTATE,
+};
+
+u64 __get_vvmcs(void *vvmcs, u32 vmcs_encoding);
+void __set_vvmcs(void *vvmcs, u32 vmcs_encoding, u64 val);
+
 #endif /* __ASM_X86_HVM_VVMX_H__ */
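
As a worked example of the layout above: the 32-bit read-only field
VM_EXIT_REASON has encoding 0x4402, i.e. width=2, type=1, index=1, so
vvmcs_offset() places it at u64 slot (1 | 1<<5 | 2<<7) = 289 inside the 4k
virtual VMCS page. A standalone sketch that reproduces the calculation:

    #include <stdio.h>
    #include <stdint.h>

    union vmcs_encoding {
        struct {
            uint32_t access_type : 1;
            uint32_t index       : 9;
            uint32_t type        : 2;
            uint32_t rsv1        : 1;
            uint32_t width       : 2;
            uint32_t rsv2        : 17;
        };
        uint32_t word;
    };

    static int vvmcs_offset(uint32_t width, uint32_t type, uint32_t index)
    {
        int offset = (index & 0x1f) | type << 5 | width << 7;

        /* VPID is the only field that would land at offset 0; it is moved
         * to 0x3f so the low slots stay free for the revision/abort header. */
        return offset ? offset : 0x3f;
    }

    int main(void)
    {
        union vmcs_encoding enc = { .word = 0x4402 };   /* VM_EXIT_REASON */

        printf("offset = %d\n",
               vvmcs_offset(enc.width, enc.type, enc.index));   /* 289 */
        return 0;
    }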


* [PATCH 07 of 20] Emulation of guest vmptrld
  2011-06-02  8:57 [PATCH 00 of 20] NestedVMX support Eddie Dong
                   ` (5 preceding siblings ...)
  2011-06-02  8:57 ` [PATCH 06 of 20] Define structure and access APIs for virtual VMCS Eddie Dong
@ 2011-06-02  8:57 ` Eddie Dong
  2011-06-02 14:45   ` Tim Deegan
  2011-06-02  8:57 ` [PATCH 08 of 20] Emulation of guest VMPTRST Eddie Dong
                   ` (13 subsequent siblings)
  20 siblings, 1 reply; 45+ messages in thread
From: Eddie Dong @ 2011-06-02  8:57 UTC (permalink / raw)
  To: Tim.Deegan; +Cc: xen-devel

# HG changeset patch
# User Eddie Dong <eddie.dong@intel.com>
# Date 1307003600 -28800
# Node ID 4dad232d7fc3bd62979a1b442d989fe0ca4baafe
# Parent  8264b01b476b1b695727f78d92ab0ce553aa7516
Emulation of guest vmptrld

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>

diff -r 8264b01b476b -r 4dad232d7fc3 xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:20 2011 +0800
@@ -2444,11 +2444,15 @@ asmlinkage void vmx_vmexit_handler(struc
             update_guest_eip();
         break;
 
+    case EXIT_REASON_VMPTRLD:
+        if ( nvmx_handle_vmptrld(regs) == X86EMUL_OKAY )
+            update_guest_eip();
+        break;
+
     case EXIT_REASON_MWAIT_INSTRUCTION:
     case EXIT_REASON_MONITOR_INSTRUCTION:
     case EXIT_REASON_VMCLEAR:
     case EXIT_REASON_VMLAUNCH:
-    case EXIT_REASON_VMPTRLD:
     case EXIT_REASON_VMPTRST:
     case EXIT_REASON_VMREAD:
     case EXIT_REASON_VMRESUME:
diff -r 8264b01b476b -r 4dad232d7fc3 xen/arch/x86/hvm/vmx/vvmx.c
--- a/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:20 2011 +0800
@@ -356,6 +356,41 @@ static void vmreturn(struct cpu_user_reg
     regs->eflags = eflags;
 }
 
+static void __map_io_bitmap(struct vcpu *v, u64 vmcs_reg)
+{
+    struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
+    unsigned long gpa;
+    unsigned long mfn;
+    p2m_type_t p2mt;
+
+    if ( vmcs_reg == IO_BITMAP_A )
+    {
+        if (nvmx->iobitmap[0]) {
+            unmap_domain_page_global(nvmx->iobitmap[0]);
+        }
+        gpa = __get_vvmcs(vcpu_nestedhvm(v).nv_vvmcx, IO_BITMAP_A);
+        mfn = mfn_x(gfn_to_mfn(p2m_get_hostp2m(v->domain),
+                              gpa >> PAGE_SHIFT, &p2mt));
+        nvmx->iobitmap[0] = map_domain_page_global(mfn);
+    }
+    else if ( vmcs_reg == IO_BITMAP_B )
+    {
+        if (nvmx->iobitmap[1]) {
+            unmap_domain_page_global(nvmx->iobitmap[1]);
+        }
+        gpa = __get_vvmcs(vcpu_nestedhvm(v).nv_vvmcx, IO_BITMAP_B);
+        mfn = mfn_x(gfn_to_mfn(p2m_get_hostp2m(v->domain),
+                               gpa >> PAGE_SHIFT, &p2mt));
+        nvmx->iobitmap[1] = map_domain_page_global(mfn);
+    }
+}
+
+static inline void map_io_bitmap_all(struct vcpu *v)
+{
+   __map_io_bitmap (v, IO_BITMAP_A);
+   __map_io_bitmap (v, IO_BITMAP_B);
+}
+
 /*
  * VMX instructions handling
  */
@@ -364,6 +399,7 @@ int nvmx_handle_vmxon(struct cpu_user_re
 {
     struct vcpu *v=current;
     struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
     struct vmx_inst_decoded decode;
     unsigned long gpa = 0;
     int rc;
@@ -372,7 +408,22 @@ int nvmx_handle_vmxon(struct cpu_user_re
     if ( rc != X86EMUL_OKAY )
         return rc;
 
+    if ( nvmx->vmxon_region_pa )
+        gdprintk(XENLOG_WARNING, 
+                 "vmxon again: orig %lx new %lx\n",
+                 nvmx->vmxon_region_pa, gpa);
+
     nvmx->vmxon_region_pa = gpa;
+
+    /*
+     * `fork' the host vmcs to shadow_vmcs
+     * vmcs_lock is not needed since we are on current
+     */
+    nvcpu->nv_n1vmcx = v->arch.hvm_vmx.vmcs;
+    __vmpclear(virt_to_maddr(v->arch.hvm_vmx.vmcs));
+    memcpy(nvcpu->nv_n2vmcx, v->arch.hvm_vmx.vmcs, PAGE_SIZE);
+    __vmptrld(virt_to_maddr(v->arch.hvm_vmx.vmcs));
+    v->arch.hvm_vmx.launched = 0;
     vmreturn(regs, VMSUCCEED);
 
     return X86EMUL_OKAY;
@@ -394,3 +445,38 @@ int nvmx_handle_vmxoff(struct cpu_user_r
     return X86EMUL_OKAY;
 }
 
+int nvmx_handle_vmptrld(struct cpu_user_regs *regs)
+{
+    struct vcpu *v = current;
+    struct vmx_inst_decoded decode;
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+    unsigned long gpa = 0;
+    unsigned long mfn;
+    p2m_type_t p2mt;
+    int rc;
+
+    rc = decode_vmx_inst(regs, &decode, &gpa, 0);
+    if ( rc != X86EMUL_OKAY )
+        return rc;
+
+    if ( gpa == vcpu_2_nvmx(v).vmxon_region_pa || gpa & 0xfff )
+    {
+        vmreturn(regs, VMFAIL_INVALID);
+        goto out;
+    }
+
+    if ( nvcpu->nv_vvmcxaddr == VMCX_EADDR )
+    {
+        mfn = mfn_x(gfn_to_mfn(p2m_get_hostp2m(v->domain),
+                               gpa >> PAGE_SHIFT, &p2mt));
+        nvcpu->nv_vvmcx = map_domain_page_global(mfn);
+        nvcpu->nv_vvmcxaddr = gpa;
+        map_io_bitmap_all (v);
+    }
+
+    vmreturn(regs, VMSUCCEED);
+
+out:
+    return X86EMUL_OKAY;
+}
+
diff -r 8264b01b476b -r 4dad232d7fc3 xen/include/asm-x86/hvm/vmx/vvmx.h
--- a/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:20 2011 +0800
@@ -152,5 +152,8 @@ enum vvmcs_encoding_type {
 u64 __get_vvmcs(void *vvmcs, u32 vmcs_encoding);
 void __set_vvmcs(void *vvmcs, u32 vmcs_encoding, u64 val);
 
+void nvmx_destroy_vmcs(struct vcpu *v);
+int nvmx_handle_vmptrld(struct cpu_user_regs *regs);
+
 #endif /* __ASM_X86_HVM_VVMX_H__ */


* [PATCH 08 of 20] Emulation of guest VMPTRST
  2011-06-02  8:57 [PATCH 00 of 20] NestedVMX support Eddie Dong
                   ` (6 preceding siblings ...)
  2011-06-02  8:57 ` [PATCH 07 of 20] Emulation of guest vmptrld Eddie Dong
@ 2011-06-02  8:57 ` Eddie Dong
  2011-06-02  8:57 ` [PATCH 09 of 20] Emulation of guest VMCLEAR Eddie Dong
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 45+ messages in thread
From: Eddie Dong @ 2011-06-02  8:57 UTC (permalink / raw)
  To: Tim.Deegan; +Cc: xen-devel

# HG changeset patch
# User Eddie Dong <eddie.dong@intel.com>
# Date 1307003600 -28800
# Node ID 54332433d873777e57e6ac47ee841a2a96c2f543
# Parent  4dad232d7fc3bd62979a1b442d989fe0ca4baafe
Emulation of guest VMPTRST

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>

diff -r 4dad232d7fc3 -r 54332433d873 xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:20 2011 +0800
@@ -2449,11 +2449,15 @@ asmlinkage void vmx_vmexit_handler(struc
             update_guest_eip();
         break;
 
+    case EXIT_REASON_VMPTRST:
+        if ( nvmx_handle_vmptrst(regs) == X86EMUL_OKAY )
+            update_guest_eip();
+        break;
+
     case EXIT_REASON_MWAIT_INSTRUCTION:
     case EXIT_REASON_MONITOR_INSTRUCTION:
     case EXIT_REASON_VMCLEAR:
     case EXIT_REASON_VMLAUNCH:
-    case EXIT_REASON_VMPTRST:
     case EXIT_REASON_VMREAD:
     case EXIT_REASON_VMRESUME:
     case EXIT_REASON_VMWRITE:
diff -r 4dad232d7fc3 -r 54332433d873 xen/arch/x86/hvm/vmx/vvmx.c
--- a/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:20 2011 +0800
@@ -480,3 +480,25 @@ out:
     return X86EMUL_OKAY;
 }
 
+int nvmx_handle_vmptrst(struct cpu_user_regs *regs)
+{
+    struct vcpu *v = current;
+    struct vmx_inst_decoded decode;
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+    unsigned long gpa = 0;
+    int rc;
+
+    rc = decode_vmx_inst(regs, &decode, &gpa, 0);
+    if ( rc != X86EMUL_OKAY )
+        return rc;
+
+    gpa = nvcpu->nv_vvmcxaddr;
+
+    rc = hvm_copy_to_guest_virt(decode.mem, &gpa, decode.len, 0);
+    if ( rc != HVMCOPY_okay )
+        return X86EMUL_EXCEPTION;
+
+    vmreturn(regs, VMSUCCEED);
+    return X86EMUL_OKAY;
+}
+
diff -r 4dad232d7fc3 -r 54332433d873 xen/include/asm-x86/hvm/vmx/vvmx.h
--- a/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:20 2011 +0800
@@ -154,6 +154,7 @@ void __set_vvmcs(void *vvmcs, u32 vmcs_e
 
 void nvmx_destroy_vmcs(struct vcpu *v);
 int nvmx_handle_vmptrld(struct cpu_user_regs *regs);
+int nvmx_handle_vmptrst(struct cpu_user_regs *regs);
 
 #endif /* __ASM_X86_HVM_VVMX_H__ */


* [PATCH 09 of 20] Emulation of guest VMCLEAR
  2011-06-02  8:57 [PATCH 00 of 20] NestedVMX support Eddie Dong
                   ` (7 preceding siblings ...)
  2011-06-02  8:57 ` [PATCH 08 of 20] Emulation of guest VMPTRST Eddie Dong
@ 2011-06-02  8:57 ` Eddie Dong
  2011-06-02  8:57 ` [PATCH 10 of 20] Emulation of guest VMWRITE Eddie Dong
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 45+ messages in thread
From: Eddie Dong @ 2011-06-02  8:57 UTC (permalink / raw)
  To: Tim.Deegan; +Cc: xen-devel

# HG changeset patch
# User Eddie Dong <eddie.dong@intel.com>
# Date 1307003600 -28800
# Node ID 35cc736e8a75a0a349790871232f8761ceae41be
# Parent  54332433d873777e57e6ac47ee841a2a96c2f543
Emulation of guest VMCLEAR

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>

diff -r 54332433d873 -r 35cc736e8a75 xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:20 2011 +0800
@@ -2444,6 +2444,11 @@ asmlinkage void vmx_vmexit_handler(struc
             update_guest_eip();
         break;
 
+    case EXIT_REASON_VMCLEAR:
+        if ( nvmx_handle_vmclear(regs) == X86EMUL_OKAY )
+            update_guest_eip();
+        break;
+ 
     case EXIT_REASON_VMPTRLD:
         if ( nvmx_handle_vmptrld(regs) == X86EMUL_OKAY )
             update_guest_eip();
@@ -2456,7 +2461,6 @@ asmlinkage void vmx_vmexit_handler(struc
 
     case EXIT_REASON_MWAIT_INSTRUCTION:
     case EXIT_REASON_MONITOR_INSTRUCTION:
-    case EXIT_REASON_VMCLEAR:
     case EXIT_REASON_VMLAUNCH:
     case EXIT_REASON_VMREAD:
     case EXIT_REASON_VMRESUME:
diff -r 54332433d873 -r 35cc736e8a75 xen/arch/x86/hvm/vmx/vvmx.c
--- a/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:20 2011 +0800
@@ -356,6 +356,14 @@ static void vmreturn(struct cpu_user_reg
     regs->eflags = eflags;
 }
 
+static void __clear_current_vvmcs(struct vcpu *v)
+{
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+    
+    if ( nvcpu->nv_n2vmcx )
+        __vmpclear(virt_to_maddr(nvcpu->nv_n2vmcx));
+}
+
 static void __map_io_bitmap(struct vcpu *v, u64 vmcs_reg)
 {
     struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
@@ -391,6 +399,26 @@ static inline void map_io_bitmap_all(str
    __map_io_bitmap (v, IO_BITMAP_B);
 }
 
+static void nvmx_purge_vvmcs(struct vcpu *v)
+{
+    struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+
+    __clear_current_vvmcs(v);
+    if ( nvcpu->nv_vvmcxaddr != VMCX_EADDR )
+        unmap_domain_page_global(nvcpu->nv_vvmcx);
+    nvcpu->nv_vvmcx = NULL;
+    nvcpu->nv_vvmcxaddr = VMCX_EADDR;
+    if ( nvmx->iobitmap[0] ) {
+        unmap_domain_page_global(nvmx->iobitmap[0]);
+        nvmx->iobitmap[0] = NULL;
+    }
+    if ( nvmx->iobitmap[1] ) {
+        unmap_domain_page_global(nvmx->iobitmap[1]);
+        nvmx->iobitmap[1] = NULL;
+    }
+}
+
 /*
  * VMX instructions handling
  */
@@ -439,6 +467,7 @@ int nvmx_handle_vmxoff(struct cpu_user_r
     if ( rc != X86EMUL_OKAY )
         return rc;
 
+    nvmx_purge_vvmcs(v);
     nvmx->vmxon_region_pa = 0;
 
     vmreturn(regs, VMSUCCEED);
@@ -465,6 +494,9 @@ int nvmx_handle_vmptrld(struct cpu_user_
         goto out;
     }
 
+    if ( nvcpu->nv_vvmcxaddr != gpa )
+        nvmx_purge_vvmcs(v);
+
     if ( nvcpu->nv_vvmcxaddr == VMCX_EADDR )
     {
         mfn = mfn_x(gfn_to_mfn(p2m_get_hostp2m(v->domain),
@@ -502,3 +534,37 @@ int nvmx_handle_vmptrst(struct cpu_user_
     return X86EMUL_OKAY;
 }
 
+int nvmx_handle_vmclear(struct cpu_user_regs *regs)
+{
+    struct vcpu *v = current;
+    struct vmx_inst_decoded decode;
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+    unsigned long gpa = 0;
+    int rc;
+
+    rc = decode_vmx_inst(regs, &decode, &gpa, 0);
+    if ( rc != X86EMUL_OKAY )
+        return rc;
+
+    if ( gpa & 0xfff )
+    {
+        vmreturn(regs, VMFAIL_INVALID);
+        goto out;
+    }
+
+    if ( gpa != nvcpu->nv_vvmcxaddr && nvcpu->nv_vvmcxaddr != VMCX_EADDR )
+    {
+        gdprintk(XENLOG_WARNING, 
+                 "vmclear gpa %lx is not the same as the current vmcs %lx\n",
+                 gpa, nvcpu->nv_vvmcxaddr);
+        vmreturn(regs, VMSUCCEED);
+        goto out;
+    }
+    nvmx_purge_vvmcs(v);
+
+    vmreturn(regs, VMSUCCEED);
+
+out:
+    return X86EMUL_OKAY;
+}
+
diff -r 54332433d873 -r 35cc736e8a75 xen/include/asm-x86/hvm/vmx/vvmx.h
--- a/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:20 2011 +0800
@@ -155,6 +155,7 @@ void __set_vvmcs(void *vvmcs, u32 vmcs_e
 void nvmx_destroy_vmcs(struct vcpu *v);
 int nvmx_handle_vmptrld(struct cpu_user_regs *regs);
 int nvmx_handle_vmptrst(struct cpu_user_regs *regs);
+int nvmx_handle_vmclear(struct cpu_user_regs *regs);
 
 #endif /* __ASM_X86_HVM_VVMX_H__ */


* [PATCH 10 of 20] Emulation of guest VMWRITE
  2011-06-02  8:57 [PATCH 00 of 20] NestedVMX support Eddie Dong
                   ` (8 preceding siblings ...)
  2011-06-02  8:57 ` [PATCH 09 of 20] Emulation of guest VMCLEAR Eddie Dong
@ 2011-06-02  8:57 ` Eddie Dong
  2011-06-02  8:57 ` [PATCH 11 of 20] Emulation of guest VMREAD Eddie Dong
                   ` (10 subsequent siblings)
  20 siblings, 0 replies; 45+ messages in thread
From: Eddie Dong @ 2011-06-02  8:57 UTC (permalink / raw)
  To: Tim.Deegan; +Cc: xen-devel

# HG changeset patch
# User Eddie Dong <eddie.dong@intel.com>
# Date 1307003600 -28800
# Node ID 16e0e95f457e9b3f8ff0528c8f2b0f88b1c41109
# Parent  35cc736e8a75a0a349790871232f8761ceae41be
Emulation of guest VMWRITE

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>

diff -r 35cc736e8a75 -r 16e0e95f457e xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:20 2011 +0800
@@ -2459,12 +2459,16 @@ asmlinkage void vmx_vmexit_handler(struc
             update_guest_eip();
         break;
 
+    case EXIT_REASON_VMWRITE:
+        if ( nvmx_handle_vmwrite(regs) == X86EMUL_OKAY )
+            update_guest_eip();
+        break;
+
     case EXIT_REASON_MWAIT_INSTRUCTION:
     case EXIT_REASON_MONITOR_INSTRUCTION:
     case EXIT_REASON_VMLAUNCH:
     case EXIT_REASON_VMREAD:
     case EXIT_REASON_VMRESUME:
-    case EXIT_REASON_VMWRITE:
     case EXIT_REASON_GETSEC:
     case EXIT_REASON_INVEPT:
     case EXIT_REASON_INVVPID:
diff -r 35cc736e8a75 -r 16e0e95f457e xen/arch/x86/hvm/vmx/vvmx.c
--- a/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:20 2011 +0800
@@ -568,3 +568,27 @@ out:
     return X86EMUL_OKAY;
 }
 
+int nvmx_handle_vmwrite(struct cpu_user_regs *regs)
+{
+    struct vcpu *v = current;
+    struct vmx_inst_decoded decode;
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+    u64 operandS, vmcs_encoding;
+
+    if ( decode_vmx_inst(regs, &decode, &operandS, 0)
+             != X86EMUL_OKAY )
+        return X86EMUL_EXCEPTION;
+
+    vmcs_encoding = reg_read(regs, decode.reg2);
+    __set_vvmcs(nvcpu->nv_vvmcx, vmcs_encoding, operandS);
+
+    if ( vmcs_encoding == IO_BITMAP_A || vmcs_encoding == IO_BITMAP_A_HIGH )
+        __map_io_bitmap (v, IO_BITMAP_A);
+    else if ( vmcs_encoding == IO_BITMAP_B || 
+              vmcs_encoding == IO_BITMAP_B_HIGH )
+        __map_io_bitmap (v, IO_BITMAP_B);
+
+    vmreturn(regs, VMSUCCEED);
+    return X86EMUL_OKAY;
+}
+
diff -r 35cc736e8a75 -r 16e0e95f457e xen/include/asm-x86/hvm/vmx/vvmx.h
--- a/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:20 2011 +0800
@@ -156,6 +156,7 @@ void nvmx_destroy_vmcs(struct vcpu *v);
 int nvmx_handle_vmptrld(struct cpu_user_regs *regs);
 int nvmx_handle_vmptrst(struct cpu_user_regs *regs);
 int nvmx_handle_vmclear(struct cpu_user_regs *regs);
+int nvmx_handle_vmwrite(struct cpu_user_regs *regs);
 
 #endif /* __ASM_X86_HVM_VVMX_H__ */


* [PATCH 11 of 20] Emulation of guest VMREAD
  2011-06-02  8:57 [PATCH 00 of 20] NestedVMX support Eddie Dong
                   ` (9 preceding siblings ...)
  2011-06-02  8:57 ` [PATCH 10 of 20] Emulation of guest VMWRITE Eddie Dong
@ 2011-06-02  8:57 ` Eddie Dong
  2011-06-02  8:57 ` [PATCH 12 of 20] Add APIs to switch n1/n2 VMCS Eddie Dong
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 45+ messages in thread
From: Eddie Dong @ 2011-06-02  8:57 UTC (permalink / raw)
  To: Tim.Deegan; +Cc: xen-devel

# HG changeset patch
# User Eddie Dong <eddie.dong@intel.com>
# Date 1307003600 -28800
# Node ID 4631a951120093ade781c4f4542741266b615576
# Parent  16e0e95f457e9b3f8ff0528c8f2b0f88b1c41109
Emulation of guest VMREAD

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>

diff -r 16e0e95f457e -r 4631a9511200 xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:20 2011 +0800
@@ -2459,6 +2459,11 @@ asmlinkage void vmx_vmexit_handler(struc
             update_guest_eip();
         break;
 
+    case EXIT_REASON_VMREAD:
+        if ( nvmx_handle_vmread(regs) == X86EMUL_OKAY )
+            update_guest_eip();
+        break;
+ 
     case EXIT_REASON_VMWRITE:
         if ( nvmx_handle_vmwrite(regs) == X86EMUL_OKAY )
             update_guest_eip();
@@ -2467,7 +2472,6 @@ asmlinkage void vmx_vmexit_handler(struc
     case EXIT_REASON_MWAIT_INSTRUCTION:
     case EXIT_REASON_MONITOR_INSTRUCTION:
     case EXIT_REASON_VMLAUNCH:
-    case EXIT_REASON_VMREAD:
     case EXIT_REASON_VMRESUME:
     case EXIT_REASON_GETSEC:
     case EXIT_REASON_INVEPT:
diff -r 16e0e95f457e -r 4631a9511200 xen/arch/x86/hvm/vmx/vvmx.c
--- a/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:20 2011 +0800
@@ -121,6 +121,8 @@ enum vmx_ops_result {
     VMFAIL_INVALID,
 };
 
+#define CASE_SET_REG(REG, reg)      \
+    case VMX_REG_ ## REG: regs->reg = value; break
 #define CASE_GET_REG(REG, reg)      \
     case VMX_REG_ ## REG: value = regs->reg; break
 
@@ -233,6 +235,32 @@ static unsigned long reg_read(struct cpu
     return value;
 }
 
+static void reg_write(struct cpu_user_regs *regs,
+                      enum vmx_regs_enc index,
+                      unsigned long value)
+{
+    switch ( index ) {
+    CASE_SET_REG(RAX, eax);
+    CASE_SET_REG(RCX, ecx);
+    CASE_SET_REG(RDX, edx);
+    CASE_SET_REG(RBX, ebx);
+    CASE_SET_REG(RBP, ebp);
+    CASE_SET_REG(RSI, esi);
+    CASE_SET_REG(RDI, edi);
+    CASE_SET_REG(RSP, esp);
+    CASE_SET_REG(R8, r8);
+    CASE_SET_REG(R9, r9);
+    CASE_SET_REG(R10, r10);
+    CASE_SET_REG(R11, r11);
+    CASE_SET_REG(R12, r12);
+    CASE_SET_REG(R13, r13);
+    CASE_SET_REG(R14, r14);
+    CASE_SET_REG(R15, r15);
+    default:
+        break;
+    }
+}
+
 static int vmx_inst_check_privilege(struct cpu_user_regs *regs, int vmxop_check)
 {
     struct vcpu *v = current;
@@ -568,6 +596,35 @@ out:
     return X86EMUL_OKAY;
 }
 
+int nvmx_handle_vmread(struct cpu_user_regs *regs)
+{
+    struct vcpu *v = current;
+    struct vmx_inst_decoded decode;
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+    u64 value = 0;
+    int rc;
+
+    rc = decode_vmx_inst(regs, &decode, NULL, 0);
+    if ( rc != X86EMUL_OKAY )
+        return rc;
+
+    value = __get_vvmcs(nvcpu->nv_vvmcx, reg_read(regs, decode.reg2));
+
+    switch ( decode.type ) {
+    case VMX_INST_MEMREG_TYPE_MEMORY:
+        rc = hvm_copy_to_guest_virt(decode.mem, &value, decode.len, 0);
+        if ( rc != HVMCOPY_okay )
+            return X86EMUL_EXCEPTION;
+        break;
+    case VMX_INST_MEMREG_TYPE_REG:
+        reg_write(regs, decode.reg1, value);
+        break;
+    }
+
+    vmreturn(regs, VMSUCCEED);
+    return X86EMUL_OKAY;
+}
+
 int nvmx_handle_vmwrite(struct cpu_user_regs *regs)
 {
     struct vcpu *v = current;
diff -r 16e0e95f457e -r 4631a9511200 xen/include/asm-x86/hvm/vmx/vvmx.h
--- a/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:20 2011 +0800
@@ -156,6 +156,7 @@ void nvmx_destroy_vmcs(struct vcpu *v);
 int nvmx_handle_vmptrld(struct cpu_user_regs *regs);
 int nvmx_handle_vmptrst(struct cpu_user_regs *regs);
 int nvmx_handle_vmclear(struct cpu_user_regs *regs);
+int nvmx_handle_vmread(struct cpu_user_regs *regs);
 int nvmx_handle_vmwrite(struct cpu_user_regs *regs);
 
 #endif /* __ASM_X86_HVM_VVMX_H__ */
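
For context, the two decode shapes that decode_vmx_inst() distinguishes via
VMX_INSTRUCTION_INFO correspond to how the L1 guest issues the instructions;
the sketch below shows one of each: a VMWRITE with both operands in registers
and a VMREAD that stores to memory. This is a hypothetical L1-side fragment
(GCC inline asm), for illustration only:

    /* Register-form VMWRITE: handled via VMX_INST_MEMREG_TYPE_REG. */
    static inline void l1_vmwrite(unsigned long field, unsigned long value)
    {
        asm volatile ( "vmwrite %1, %0" :: "r" (field), "r" (value) : "cc" );
    }

    /* VMREAD with a memory destination: handled via
     * VMX_INST_MEMREG_TYPE_MEMORY and hvm_copy_to_guest_virt(). */
    static inline unsigned long l1_vmread(unsigned long field)
    {
        unsigned long value;

        asm volatile ( "vmread %1, %0" : "=m" (value) : "r" (field) : "cc" );
        return value;
    }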


* [PATCH 12 of 20] Add APIs to switch n1/n2 VMCS
  2011-06-02  8:57 [PATCH 00 of 20] NestedVMX support Eddie Dong
                   ` (10 preceding siblings ...)
  2011-06-02  8:57 ` [PATCH 11 of 20] Emulation of guest VMREAD Eddie Dong
@ 2011-06-02  8:57 ` Eddie Dong
  2011-06-02 14:50   ` Tim Deegan
  2011-06-02  8:57 ` [PATCH 13 of 20] Emulation of VMRESUME/VMLAUNCH Eddie Dong
                   ` (8 subsequent siblings)
  20 siblings, 1 reply; 45+ messages in thread
From: Eddie Dong @ 2011-06-02  8:57 UTC (permalink / raw)
  To: Tim.Deegan; +Cc: xen-devel

# HG changeset patch
# User Eddie Dong <eddie.dong@intel.com>
# Date 1307003601 -28800
# Node ID 62cc6c7516e010ef673c75bba83f901785b063d5
# Parent  4631a951120093ade781c4f4542741266b615576
Add APIs to switch n1/n2 VMCS.

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>

diff -r 4631a9511200 -r 62cc6c7516e0 xen/arch/x86/hvm/vmx/vmcs.c
--- a/xen/arch/x86/hvm/vmx/vmcs.c	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vmcs.c	Thu Jun 02 16:33:21 2011 +0800
@@ -669,6 +669,38 @@ void vmx_disable_intercept_for_msr(struc
     }
 }
 
+/*
+ * Switch VMCS between layer 1 & 2 guest
+ */
+void vmx_vmcs_switch(struct vcpu *v,
+                             struct vmcs_struct *from,
+                             struct vmcs_struct *to)
+{
+    /* no foreign access */
+    if ( unlikely(v != current) )
+        return;
+
+    if ( unlikely(current->arch.hvm_vmx.vmcs != from) )
+        return;
+
+    spin_lock(&v->arch.hvm_vmx.vmcs_lock);
+
+    __vmpclear(virt_to_maddr(from));
+    __vmptrld(virt_to_maddr(to));
+
+    v->arch.hvm_vmx.vmcs = to;
+    v->arch.hvm_vmx.launched = 0;
+    this_cpu(current_vmcs) = to;
+
+    if ( v->arch.hvm_vmx.hostenv_migrated )
+    {
+        v->arch.hvm_vmx.hostenv_migrated = 0;
+        vmx_set_host_env(v);
+    }
+
+    spin_unlock(&v->arch.hvm_vmx.vmcs_lock);
+}
+
 static int construct_vmcs(struct vcpu *v)
 {
     struct domain *d = v->domain;
@@ -1078,6 +1110,13 @@ void vmx_do_resume(struct vcpu *v)
         hvm_migrate_timers(v);
         hvm_migrate_pirqs(v);
         vmx_set_host_env(v);
+        /*
+         * Both the n1 and n2 VMCS need their host environment updated after
+         * VCPU migration. The environment of the current VMCS is updated in
+         * place; the update of the other VMCS is deferred until it is switched in.
+         */
+        v->arch.hvm_vmx.hostenv_migrated = 1;
+
         hvm_asid_flush_vcpu(v);
     }
 
diff -r 4631a9511200 -r 62cc6c7516e0 xen/include/asm-x86/hvm/vmx/vmcs.h
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h	Thu Jun 02 16:33:20 2011 +0800
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h	Thu Jun 02 16:33:21 2011 +0800
@@ -123,6 +123,7 @@ struct arch_vmx_struct {
     struct segment_register vm86_saved_seg[x86_seg_tr + 1];
     /* Remember EFLAGS while in virtual 8086 mode */
     uint32_t             vm86_saved_eflags;
+    int                  hostenv_migrated;
 };
 
 int vmx_create_vmcs(struct vcpu *v);
@@ -390,6 +391,9 @@ int vmx_read_guest_msr(u32 msr, u64 *val
 int vmx_write_guest_msr(u32 msr, u64 val);
 int vmx_add_guest_msr(u32 msr);
 int vmx_add_host_load_msr(u32 msr);
+void vmx_vmcs_switch(struct vcpu *v,
+                      struct vmcs_struct *from,
+                      struct vmcs_struct *to);
 
 #endif /* ASM_X86_HVM_VMX_VMCS_H__ */
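
For illustration only: the deferred host-environment reload above can be
modelled as a small state machine, where migration only marks a flag and
whichever VMCS is loaded next refreshes its host state. The types below are
hypothetical stand-ins for the real VMCS pointer and vmx_set_host_env(); no
VMX instructions are issued:

#include <stdio.h>

struct vmcs { const char *name; int host_env_current; };
struct vcpu_state {
    struct vmcs *current;     /* v->arch.hvm_vmx.vmcs analogue */
    int hostenv_migrated;     /* set on pCPU migration */
};

static void set_host_env(struct vmcs *vmcs)
{
    vmcs->host_env_current = 1;          /* vmx_set_host_env() analogue */
}

static void vmcs_switch(struct vcpu_state *v, struct vmcs *to)
{
    /* the real code does __vmpclear(from); __vmptrld(to); here */
    v->current = to;
    if ( v->hostenv_migrated )
    {
        v->hostenv_migrated = 0;
        set_host_env(to);                /* refresh the deferred host state */
    }
}

int main(void)
{
    struct vmcs n1 = { "n1", 1 }, n2 = { "n2", 0 };
    struct vcpu_state v = { &n1, 0 };

    v.hostenv_migrated = 1;              /* vCPU migrated to another pCPU */
    set_host_env(v.current);             /* current VMCS is updated in place */
    vmcs_switch(&v, &n2);                /* the other one is fixed up on switch */
    printf("now on %s, host env %s\n", v.current->name,
           v.current->host_env_current ? "fresh" : "stale");
    return 0;
}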

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 13 of 20] Emulation of VMRESUME/VMLAUNCH
  2011-06-02  8:57 [PATCH 00 of 20] NestedVMX support Eddie Dong
                   ` (11 preceding siblings ...)
  2011-06-02  8:57 ` [PATCH 12 of 20] Add APIs to switch n1/n2 VMCS Eddie Dong
@ 2011-06-02  8:57 ` Eddie Dong
  2011-06-02  8:57 ` [PATCH 14 of 20] Extend VMCS control fields for n2 guest Eddie Dong
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 45+ messages in thread
From: Eddie Dong @ 2011-06-02  8:57 UTC (permalink / raw)
  To: Tim.Deegan; +Cc: xen-devel

# HG changeset patch
# User Eddie Dong <eddie.dong@intel.com>
# Date 1307003601 -28800
# Node ID 279a27a3b1a90380c8fa579e87835cb58a8f4aac
# Parent  62cc6c7516e010ef673c75bba83f901785b063d5
Emulation of VMRESUME/VMLAUNCH

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>

diff -r 62cc6c7516e0 -r 279a27a3b1a9 xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:21 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:21 2011 +0800
@@ -2175,6 +2175,11 @@ asmlinkage void vmx_vmexit_handler(struc
     /* Now enable interrupts so it's safe to take locks. */
     local_irq_enable();
 
+    /* XXX: This looks ugly, but we need a mechanism to ensure
+     * any pending vmresume has really happened
+     */
+    vcpu_nestedhvm(v).nv_vmswitch_in_progress = 0;
+
     if ( unlikely(exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY) )
         return vmx_failed_vmentry(exit_reason, regs);
 
@@ -2469,10 +2474,18 @@ asmlinkage void vmx_vmexit_handler(struc
             update_guest_eip();
         break;
 
+    case EXIT_REASON_VMLAUNCH:
+        if ( nvmx_handle_vmlaunch(regs) == X86EMUL_OKAY )
+            update_guest_eip();
+        break;
+
+    case EXIT_REASON_VMRESUME:
+        if ( nvmx_handle_vmresume(regs) == X86EMUL_OKAY )
+            update_guest_eip();
+        break;
+
     case EXIT_REASON_MWAIT_INSTRUCTION:
     case EXIT_REASON_MONITOR_INSTRUCTION:
-    case EXIT_REASON_VMLAUNCH:
-    case EXIT_REASON_VMRESUME:
     case EXIT_REASON_GETSEC:
     case EXIT_REASON_INVEPT:
     case EXIT_REASON_INVVPID:
diff -r 62cc6c7516e0 -r 279a27a3b1a9 xen/arch/x86/hvm/vmx/vvmx.c
--- a/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:21 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:21 2011 +0800
@@ -261,6 +261,13 @@ static void reg_write(struct cpu_user_re
     }
 }
 
+static inline u32 __n2_exec_control(struct vcpu *v)
+{
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+
+    return __get_vvmcs(nvcpu->nv_vvmcx, CPU_BASED_VM_EXEC_CONTROL);
+}
+
 static int vmx_inst_check_privilege(struct cpu_user_regs *regs, int vmxop_check)
 {
     struct vcpu *v = current;
@@ -502,6 +509,34 @@ int nvmx_handle_vmxoff(struct cpu_user_r
     return X86EMUL_OKAY;
 }
 
+int nvmx_handle_vmresume(struct cpu_user_regs *regs)
+{
+    struct vcpu *v = current;
+    struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+    int rc;
+
+    rc = vmx_inst_check_privilege(regs, 0);
+    if ( rc != X86EMUL_OKAY )
+        return rc;
+
+    /* check that the VMCS is valid and, if used, the I/O bitmaps are mapped */
+    if ( (nvcpu->nv_vvmcxaddr != VMCX_EADDR) &&
+            ((nvmx->iobitmap[0] && nvmx->iobitmap[1]) ||
+            !(__n2_exec_control(v) & CPU_BASED_ACTIVATE_IO_BITMAP) ) )
+        nvcpu->nv_vmentry_pending = 1;
+    else
+        vmreturn(regs, VMFAIL_INVALID);
+
+    return X86EMUL_OKAY;
+}
+
+int nvmx_handle_vmlaunch(struct cpu_user_regs *regs)
+{
+    /* TODO: check for initial launch/resume */
+    return nvmx_handle_vmresume(regs);
+}
+
 int nvmx_handle_vmptrld(struct cpu_user_regs *regs)
 {
     struct vcpu *v = current;
diff -r 62cc6c7516e0 -r 279a27a3b1a9 xen/include/asm-x86/hvm/vmx/vvmx.h
--- a/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:21 2011 +0800
+++ b/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:21 2011 +0800
@@ -158,6 +158,8 @@ int nvmx_handle_vmptrst(struct cpu_user_
 int nvmx_handle_vmclear(struct cpu_user_regs *regs);
 int nvmx_handle_vmread(struct cpu_user_regs *regs);
 int nvmx_handle_vmwrite(struct cpu_user_regs *regs);
+int nvmx_handle_vmresume(struct cpu_user_regs *regs);
+int nvmx_handle_vmlaunch(struct cpu_user_regs *regs);
 
 #endif /* __ASM_X86_HVM_VVMX_H__ */
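
For illustration only: the VMRESUME validity check above (virtual VMCS loaded,
and the I/O bitmaps mapped whenever the I/O-bitmap control is enabled),
restated as a stand-alone predicate with hypothetical fields.
CPU_BASED_ACTIVATE_IO_BITMAP is bit 25 of the primary controls per the SDM:

#include <stdio.h>
#include <stdint.h>

#define CPU_BASED_ACTIVATE_IO_BITMAP 0x02000000u
#define VMCX_EADDR (~(uint64_t)0)

struct nstate {
    uint64_t vvmcx_addr;      /* gpa of the virtual VMCS, VMCX_EADDR if none */
    void *iobitmap[2];        /* mapped L1 I/O bitmaps */
    uint32_t exec_control;    /* L1's CPU_BASED_VM_EXEC_CONTROL */
};

static int vmentry_allowed(const struct nstate *n)
{
    if ( n->vvmcx_addr == VMCX_EADDR )
        return 0;                                  /* no current VMCS: VMfailInvalid */
    if ( !(n->exec_control & CPU_BASED_ACTIVATE_IO_BITMAP) )
        return 1;                                  /* I/O bitmap not in use */
    return n->iobitmap[0] && n->iobitmap[1];       /* both bitmaps must be mapped */
}

int main(void)
{
    int dummy;
    struct nstate n = { 0x1000, { &dummy, &dummy }, 0 };

    printf("vmentry pending: %d\n", vmentry_allowed(&n));
    return 0;
}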

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 14 of 20] Extend VMCS control fields for n2 guest
  2011-06-02  8:57 [PATCH 00 of 20] NestedVMX support Eddie Dong
                   ` (12 preceding siblings ...)
  2011-06-02  8:57 ` [PATCH 13 of 20] Emulation of VMRESUME/VMLAUNCH Eddie Dong
@ 2011-06-02  8:57 ` Eddie Dong
  2011-06-02  8:57 ` [PATCH 15 of 20] Switch shadow/virtual VMCS between n1/n2 guests Eddie Dong
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 45+ messages in thread
From: Eddie Dong @ 2011-06-02  8:57 UTC (permalink / raw)
  To: Tim.Deegan; +Cc: xen-devel

# HG changeset patch
# User Eddie Dong <eddie.dong@intel.com>
# Date 1307003601 -28800
# Node ID aacbe98da103be572c9f96d6c85788f74f574117
# Parent  279a27a3b1a90380c8fa579e87835cb58a8f4aac
Extend VMCS control fields for n2 guest

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>

diff -r 279a27a3b1a9 -r aacbe98da103 xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:21 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:21 2011 +0800
@@ -54,6 +54,7 @@
 #include <asm/xenoprof.h>
 #include <asm/debugger.h>
 #include <asm/apic.h>
+#include <asm/hvm/nestedhvm.h>
 
 enum handler_return { HNDL_done, HNDL_unhandled, HNDL_exception_raised };
 
@@ -361,18 +362,28 @@ long_mode_do_msr_write(unsigned int msr,
 
 void vmx_update_cpu_exec_control(struct vcpu *v)
 {
-    __vmwrite(CPU_BASED_VM_EXEC_CONTROL, v->arch.hvm_vmx.exec_control);
+    if ( nestedhvm_vcpu_in_guestmode(v) )
+        nvmx_update_exec_control(v, v->arch.hvm_vmx.exec_control);
+    else
+        __vmwrite(CPU_BASED_VM_EXEC_CONTROL, v->arch.hvm_vmx.exec_control);
 }
 
 static void vmx_update_secondary_exec_control(struct vcpu *v)
 {
-    __vmwrite(SECONDARY_VM_EXEC_CONTROL,
-              v->arch.hvm_vmx.secondary_exec_control);
+    if ( nestedhvm_vcpu_in_guestmode(v) )
+        nvmx_update_secondary_exec_control(v,
+            v->arch.hvm_vmx.secondary_exec_control);
+    else
+        __vmwrite(SECONDARY_VM_EXEC_CONTROL,
+                  v->arch.hvm_vmx.secondary_exec_control);
 }
 
 void vmx_update_exception_bitmap(struct vcpu *v)
 {
-    __vmwrite(EXCEPTION_BITMAP, v->arch.hvm_vmx.exception_bitmap);
+    if ( nestedhvm_vcpu_in_guestmode(v) )
+        nvmx_update_exception_bitmap(v, v->arch.hvm_vmx.exception_bitmap);
+    else
+        __vmwrite(EXCEPTION_BITMAP, v->arch.hvm_vmx.exception_bitmap);
 }
 
 static int vmx_guest_x86_mode(struct vcpu *v)
diff -r 279a27a3b1a9 -r aacbe98da103 xen/arch/x86/hvm/vmx/vvmx.c
--- a/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:21 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:21 2011 +0800
@@ -25,6 +25,7 @@
 #include <asm/p2m.h>
 #include <asm/hvm/vmx/vmx.h>
 #include <asm/hvm/vmx/vvmx.h>
+#include <asm/hvm/nestedhvm.h>
 
 int nvmx_vcpu_initialise(struct vcpu *v)
 {
@@ -391,6 +392,93 @@ static void vmreturn(struct cpu_user_reg
     regs->eflags = eflags;
 }
 
+/*
+ * Nested VMX uses the "strict" condition to exit from the
+ * L2 guest if either the L1 VMM or the L0 VMM expects to exit.
+ */
+static inline u32 __shadow_control(struct vcpu *v,
+                                 unsigned int field,
+                                 u32 host_value)
+{
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+
+    return (u32) __get_vvmcs(nvcpu->nv_vvmcx, field) | host_value;
+}
+
+static void set_shadow_control(struct vcpu *v,
+                               unsigned int field,
+                               u32 host_value)
+{
+    __vmwrite(field, __shadow_control(v, field, host_value));
+}
+
+unsigned long *_shadow_io_bitmap(struct vcpu *v)
+{
+    struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
+    int port80, portED;
+    u8 *bitmap;
+
+    bitmap = nvmx->iobitmap[0];
+    port80 = bitmap[0x80 >> 3] & (1 << (0x80 & 0x7)) ? 1 : 0;
+    portED = bitmap[0xed >> 3] & (1 << (0xed & 0x7)) ? 1 : 0;
+
+    return nestedhvm_vcpu_iomap_get(port80, portED);
+}
+
+void nvmx_update_exec_control(struct vcpu *v, unsigned long host_cntrl)
+{
+#define PIO_CNTRL_BITS    ( CPU_BASED_ACTIVATE_IO_BITMAP         \
+             | CPU_BASED_UNCOND_IO_EXITING)
+    u32 pio_cntrl = PIO_CNTRL_BITS;
+    unsigned long *bitmap; 
+    u32 shadow_cntrl;
+ 
+    shadow_cntrl = __n2_exec_control(v);
+    pio_cntrl &= shadow_cntrl;
+    /* Enforce the removed features */
+#define REMOVED_EXEC_CONTROL_BITS (CPU_BASED_TPR_SHADOW          \
+             | CPU_BASED_ACTIVATE_MSR_BITMAP                     \
+             | CPU_BASED_ACTIVATE_SECONDARY_CONTROLS             \
+             | CPU_BASED_ACTIVATE_IO_BITMAP                      \
+             | CPU_BASED_UNCOND_IO_EXITING)
+    shadow_cntrl &= ~REMOVED_EXEC_CONTROL_BITS;
+    shadow_cntrl |= host_cntrl;
+    if ( pio_cntrl == CPU_BASED_UNCOND_IO_EXITING ) {
+        /* L1 VMM intercepts all I/O instructions */
+        shadow_cntrl |= CPU_BASED_UNCOND_IO_EXITING;
+        shadow_cntrl &= ~CPU_BASED_ACTIVATE_IO_BITMAP;
+    }
+    else {
+        /* Use IO_BITMAP in shadow */
+        if ( pio_cntrl == 0 ) {
+            /*
+             * L1 VMM doesn't intercept I/O instructions.
+             * Use the host configuration and reset IO_BITMAP.
+             */
+            bitmap = hvm_io_bitmap;
+        }
+        else {
+            /* use IO bitmap */
+            bitmap = _shadow_io_bitmap(v);
+        }
+        __vmwrite(IO_BITMAP_A, virt_to_maddr(bitmap));
+        __vmwrite(IO_BITMAP_B, virt_to_maddr(bitmap) + PAGE_SIZE);
+    }
+
+    __vmwrite(CPU_BASED_VM_EXEC_CONTROL, shadow_cntrl);
+}
+
+void nvmx_update_secondary_exec_control(struct vcpu *v,
+                                            unsigned long value)
+{
+    set_shadow_control(v, SECONDARY_VM_EXEC_CONTROL, value);
+}
+
+void nvmx_update_exception_bitmap(struct vcpu *v, unsigned long value)
+{
+    set_shadow_control(v, EXCEPTION_BITMAP, value);
+}
+
 static void __clear_current_vvmcs(struct vcpu *v)
 {
     struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
diff -r 279a27a3b1a9 -r aacbe98da103 xen/include/asm-x86/hvm/vmx/vvmx.h
--- a/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:21 2011 +0800
+++ b/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:21 2011 +0800
@@ -161,5 +161,10 @@ int nvmx_handle_vmwrite(struct cpu_user_
 int nvmx_handle_vmresume(struct cpu_user_regs *regs);
 int nvmx_handle_vmlaunch(struct cpu_user_regs *regs);
 
+void nvmx_update_exec_control(struct vcpu *v, unsigned long value);
+void nvmx_update_secondary_exec_control(struct vcpu *v,
+                                        unsigned long value);
+void nvmx_update_exception_bitmap(struct vcpu *v, unsigned long value);
+
 #endif /* __ASM_X86_HVM_VVMX_H__ */
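
For illustration only: a minimal sketch of the "strict" control merge above,
where the shadow value is the L1 value with the unsupported (removed) bits
cleared, OR-ed with whatever L0 requires. The bit values below are
hypothetical, not the real VMX control bits:

#include <stdio.h>
#include <stdint.h>

/* Hypothetical bit values, for demonstration only. */
#define REMOVED_BITS  0x00400000u      /* features L0 refuses to shadow */
#define L0_REQUIRED   0x00000080u      /* bits L0 always wants set */

static uint32_t shadow_control(uint32_t l1_value)
{
    uint32_t v = l1_value & ~REMOVED_BITS;   /* enforce the removed features */
    return v | L0_REQUIRED;                  /* exit whenever either VMM wants to */
}

int main(void)
{
    uint32_t l1 = 0x00400010u;

    printf("l1=%#x -> shadow=%#x\n", l1, shadow_control(l1));
    return 0;
}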

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 15 of 20] Switch shadow/virtual VMCS between n1/n2 guests
  2011-06-02  8:57 [PATCH 00 of 20] NestedVMX support Eddie Dong
                   ` (13 preceding siblings ...)
  2011-06-02  8:57 ` [PATCH 14 of 20] Extend VMCS control fields for n2 guest Eddie Dong
@ 2011-06-02  8:57 ` Eddie Dong
  2011-06-02 14:56   ` Tim Deegan
  2011-06-02 14:58   ` Tim Deegan
  2011-06-02  8:57 ` [PATCH 16 of 20] interrupt/exception handling for n2 guest Eddie Dong
                   ` (5 subsequent siblings)
  20 siblings, 2 replies; 45+ messages in thread
From: Eddie Dong @ 2011-06-02  8:57 UTC (permalink / raw)
  To: Tim.Deegan; +Cc: xen-devel

# HG changeset patch
# User Eddie Dong <eddie.dong@intel.com>
# Date 1307003601 -28800
# Node ID bd15acfc9b822ccf27b5c7603e600e5e11733907
# Parent  aacbe98da103be572c9f96d6c85788f74f574117
Switch shadow/virtual VMCS between n1/n2 guests.

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>

diff -r aacbe98da103 -r bd15acfc9b82 xen/arch/x86/hvm/vmx/entry.S
--- a/xen/arch/x86/hvm/vmx/entry.S	Thu Jun 02 16:33:21 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/entry.S	Thu Jun 02 16:33:21 2011 +0800
@@ -119,6 +119,7 @@ vmx_asm_vmexit_handler:
 .globl vmx_asm_do_vmentry
 vmx_asm_do_vmentry:
         call vmx_intr_assist
+        call nvmx_switch_guest
 
         get_current(bx)
         cli
diff -r aacbe98da103 -r bd15acfc9b82 xen/arch/x86/hvm/vmx/vvmx.c
--- a/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:21 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:21 2011 +0800
@@ -474,6 +474,48 @@ void nvmx_update_secondary_exec_control(
     set_shadow_control(v, SECONDARY_VM_EXEC_CONTROL, value);
 }
 
+static void nvmx_update_pin_control(struct vcpu *v,
+                                    unsigned long host_cntrl)
+{
+    u32 shadow_cntrl;
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+
+#define REMOVED_PIN_CONTROL_BITS (PIN_BASED_PREEMPT_TIMER)
+    shadow_cntrl = __get_vvmcs(nvcpu->nv_vvmcx, PIN_BASED_VM_EXEC_CONTROL);
+    shadow_cntrl &= ~REMOVED_PIN_CONTROL_BITS;
+    shadow_cntrl |= host_cntrl;
+    __vmwrite(PIN_BASED_VM_EXEC_CONTROL, shadow_cntrl);
+}
+
+static void nvmx_update_exit_control(struct vcpu *v,
+                                     unsigned long host_cntrl)
+{
+    u32 shadow_cntrl;
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+
+#define REMOVED_EXIT_CONTROL_BITS    ((1<<2) |           \
+                (VM_EXIT_SAVE_GUEST_PAT) |               \
+                (VM_EXIT_SAVE_GUEST_EFER) |              \
+                (VM_EXIT_SAVE_PREEMPT_TIMER))
+    shadow_cntrl = __get_vvmcs(nvcpu->nv_vvmcx, VM_EXIT_CONTROLS);
+    shadow_cntrl &= ~REMOVED_EXIT_CONTROL_BITS;
+    shadow_cntrl |= host_cntrl;
+    __vmwrite(VM_EXIT_CONTROLS, shadow_cntrl);
+}
+
+static void nvmx_update_entry_control(struct vcpu *v)
+{
+    u32 shadow_cntrl;
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+
+    /* VM_ENTRY_CONTROLS: enforce removed features */
+#define REMOVED_ENTRY_CONTROL_BITS (VM_ENTRY_LOAD_GUEST_PAT \
+            | VM_ENTRY_LOAD_GUEST_EFER)
+    shadow_cntrl = __get_vvmcs(nvcpu->nv_vvmcx, VM_ENTRY_CONTROLS);
+    shadow_cntrl &= ~REMOVED_ENTRY_CONTROL_BITS;
+    __vmwrite(VM_ENTRY_CONTROLS, shadow_cntrl);
+}
+
 void nvmx_update_exception_bitmap(struct vcpu *v, unsigned long value)
 {
     set_shadow_control(v, EXCEPTION_BITMAP, value);
@@ -543,6 +585,361 @@ static void nvmx_purge_vvmcs(struct vcpu
 }
 
 /*
+ * Context synchronized between shadow and virtual VMCS.
+ */
+static unsigned long vmcs_gstate_field[] = {
+    /* 16 BITS */
+    GUEST_ES_SELECTOR,
+    GUEST_CS_SELECTOR,
+    GUEST_SS_SELECTOR,
+    GUEST_DS_SELECTOR,
+    GUEST_FS_SELECTOR,
+    GUEST_GS_SELECTOR,
+    GUEST_LDTR_SELECTOR,
+    GUEST_TR_SELECTOR,
+    /* 64 BITS */
+    VMCS_LINK_POINTER,
+    GUEST_IA32_DEBUGCTL,
+#ifndef CONFIG_X86_64
+    VMCS_LINK_POINTER_HIGH,
+    GUEST_IA32_DEBUGCTL_HIGH,
+#endif
+    /* 32 BITS */
+    GUEST_ES_LIMIT,
+    GUEST_CS_LIMIT,
+    GUEST_SS_LIMIT,
+    GUEST_DS_LIMIT,
+    GUEST_FS_LIMIT,
+    GUEST_GS_LIMIT,
+    GUEST_LDTR_LIMIT,
+    GUEST_TR_LIMIT,
+    GUEST_GDTR_LIMIT,
+    GUEST_IDTR_LIMIT,
+    GUEST_ES_AR_BYTES,
+    GUEST_CS_AR_BYTES,
+    GUEST_SS_AR_BYTES,
+    GUEST_DS_AR_BYTES,
+    GUEST_FS_AR_BYTES,
+    GUEST_GS_AR_BYTES,
+    GUEST_LDTR_AR_BYTES,
+    GUEST_TR_AR_BYTES,
+    GUEST_INTERRUPTIBILITY_INFO,
+    GUEST_ACTIVITY_STATE,
+    GUEST_SYSENTER_CS,
+    /* natural */
+    GUEST_ES_BASE,
+    GUEST_CS_BASE,
+    GUEST_SS_BASE,
+    GUEST_DS_BASE,
+    GUEST_FS_BASE,
+    GUEST_GS_BASE,
+    GUEST_LDTR_BASE,
+    GUEST_TR_BASE,
+    GUEST_GDTR_BASE,
+    GUEST_IDTR_BASE,
+    GUEST_DR7,
+    /*
+     * Following guest states are in local cache (cpu_user_regs)
+     GUEST_RSP,
+     GUEST_RIP,
+     */
+    GUEST_RFLAGS,
+    GUEST_PENDING_DBG_EXCEPTIONS,
+    GUEST_SYSENTER_ESP,
+    GUEST_SYSENTER_EIP,
+};
+
+/*
+ * Context: shadow -> virtual VMCS
+ */
+static unsigned long vmcs_ro_field[] = {
+    GUEST_PHYSICAL_ADDRESS,
+    VM_INSTRUCTION_ERROR,
+    VM_EXIT_REASON,
+    VM_EXIT_INTR_INFO,
+    VM_EXIT_INTR_ERROR_CODE,
+    IDT_VECTORING_INFO,
+    IDT_VECTORING_ERROR_CODE,
+    VM_EXIT_INSTRUCTION_LEN,
+    VMX_INSTRUCTION_INFO,
+    EXIT_QUALIFICATION,
+    GUEST_LINEAR_ADDRESS
+};
+
+static struct vmcs_host_to_guest {
+    unsigned long host_field;
+    unsigned long guest_field;
+} vmcs_h2g_field[] = {
+    {HOST_ES_SELECTOR, GUEST_ES_SELECTOR},
+    {HOST_CS_SELECTOR, GUEST_CS_SELECTOR},
+    {HOST_SS_SELECTOR, GUEST_SS_SELECTOR},
+    {HOST_DS_SELECTOR, GUEST_DS_SELECTOR},
+    {HOST_FS_SELECTOR, GUEST_FS_SELECTOR},
+    {HOST_GS_SELECTOR, GUEST_GS_SELECTOR},
+    {HOST_TR_SELECTOR, GUEST_TR_SELECTOR},
+    {HOST_SYSENTER_CS, GUEST_SYSENTER_CS},
+    {HOST_FS_BASE, GUEST_FS_BASE},
+    {HOST_GS_BASE, GUEST_GS_BASE},
+    {HOST_TR_BASE, GUEST_TR_BASE},
+    {HOST_GDTR_BASE, GUEST_GDTR_BASE},
+    {HOST_IDTR_BASE, GUEST_IDTR_BASE},
+    {HOST_SYSENTER_ESP, GUEST_SYSENTER_ESP},
+    {HOST_SYSENTER_EIP, GUEST_SYSENTER_EIP},
+};
+
+static void vvmcs_to_shadow(void *vvmcs, unsigned int field)
+{
+    u64 value;
+
+    value = __get_vvmcs(vvmcs, field);
+    __vmwrite(field, value);
+}
+
+static void shadow_to_vvmcs(void *vvmcs, unsigned int field)
+{
+    u64 value;
+    int rc;
+
+    value = __vmread_safe(field, &rc);
+    if ( !rc )
+        __set_vvmcs(vvmcs, field, value);
+}
+
+static void load_shadow_control(struct vcpu *v)
+{
+    /* TODO: Make sure the shadow control doesn't set the bits 
+     * L0 VMM doesn't handle.
+     */
+
+    /*
+     * Set shadow controls:  PIN_BASED, CPU_BASED, EXIT, ENTRY
+     * and EXCEPTION
+     * Enforce the removed features
+     */
+    nvmx_update_pin_control(v, vmx_pin_based_exec_control);
+    vmx_update_cpu_exec_control(v);
+    nvmx_update_exit_control(v, vmx_vmexit_control);
+    nvmx_update_entry_control(v);
+    vmx_update_exception_bitmap(v);
+}
+
+static void load_shadow_guest_state(struct vcpu *v)
+{
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+    void *vvmcs = nvcpu->nv_vvmcx;
+    int i;
+
+    /* vvmcs.gstate to shadow vmcs.gstate */
+    for ( i = 0; i < ARRAY_SIZE(vmcs_gstate_field); i++ )
+        vvmcs_to_shadow(vvmcs, vmcs_gstate_field[i]);
+
+    hvm_set_cr0(__get_vvmcs(vvmcs, GUEST_CR0));
+    hvm_set_cr4(__get_vvmcs(vvmcs, GUEST_CR4));
+    hvm_set_cr3(__get_vvmcs(vvmcs, GUEST_CR3));
+
+    vvmcs_to_shadow(vvmcs, VM_ENTRY_INTR_INFO);
+    vvmcs_to_shadow(vvmcs, VM_ENTRY_EXCEPTION_ERROR_CODE);
+    vvmcs_to_shadow(vvmcs, VM_ENTRY_INSTRUCTION_LEN);
+
+    /* XXX: should refer to GUEST_HOST_MASK of both L0 and L1 */
+    vvmcs_to_shadow(vvmcs, CR0_READ_SHADOW);
+    vvmcs_to_shadow(vvmcs, CR4_READ_SHADOW);
+    vvmcs_to_shadow(vvmcs, CR0_GUEST_HOST_MASK);
+    vvmcs_to_shadow(vvmcs, CR4_GUEST_HOST_MASK);
+
+    /* TODO: PDPTRs for nested ept */
+    /* TODO: CR3 target control */
+}
+
+static void virtual_vmentry(struct cpu_user_regs *regs)
+{
+    struct vcpu *v = current;
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+    void *vvmcs = nvcpu->nv_vvmcx;
+#ifdef __x86_64__
+    unsigned long lm_l1, lm_l2;
+#endif
+
+    vmx_vmcs_switch(v, v->arch.hvm_vmx.vmcs, nvcpu->nv_n2vmcx);
+
+    nestedhvm_vcpu_enter_guestmode(v);
+    nvcpu->nv_vmentry_pending = 0;
+    nvcpu->nv_vmswitch_in_progress = 1;
+
+#ifdef __x86_64__
+    /*
+     * EFER handling:
+     * hvm_set_efer won't work if CR0.PG = 1, so we change the value
+     * directly to make hvm_long_mode_enabled(v) work in L2.
+     * An additional update_paging_modes is also needed if there is a
+     * 32/64-bit switch. v->arch.hvm_vcpu.guest_efer doesn't need to be
+     * saved, since its value on vmexit is determined by the L1
+     * exit_controls.
+     */
+    lm_l1 = !!hvm_long_mode_enabled(v);
+    lm_l2 = !!(__get_vvmcs(vvmcs, VM_ENTRY_CONTROLS) &
+                           VM_ENTRY_IA32E_MODE);
+
+    if ( lm_l2 )
+        v->arch.hvm_vcpu.guest_efer |= EFER_LMA | EFER_LME;
+    else
+        v->arch.hvm_vcpu.guest_efer &= ~(EFER_LMA | EFER_LME);
+#endif
+
+    load_shadow_control(v);
+    load_shadow_guest_state(v);
+
+#ifdef __x86_64__
+    if ( lm_l1 != lm_l2 )
+    {
+        paging_update_paging_modes(v);
+    }
+#endif
+
+    regs->rip = __get_vvmcs(vvmcs, GUEST_RIP);
+    regs->rsp = __get_vvmcs(vvmcs, GUEST_RSP);
+    regs->rflags = __get_vvmcs(vvmcs, GUEST_RFLAGS);
+
+    /* TODO: EPT_POINTER */
+}
+
+static void sync_vvmcs_guest_state(struct vcpu *v, struct cpu_user_regs *regs)
+{
+    int i;
+    unsigned long mask;
+    unsigned long cr;
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+    void *vvmcs = nvcpu->nv_vvmcx;
+
+    /* copy shadow vmcs.gstate back to vvmcs.gstate */
+    for ( i = 0; i < ARRAY_SIZE(vmcs_gstate_field); i++ )
+        shadow_to_vvmcs(vvmcs, vmcs_gstate_field[i]);
+    /* RIP, RSP are in user regs */
+    __set_vvmcs(vvmcs, GUEST_RIP, regs->rip);
+    __set_vvmcs(vvmcs, GUEST_RSP, regs->rsp);
+
+    /* SDM 20.6.6: L2 guest execution may change GUEST CR0/CR4 */
+    mask = __get_vvmcs(vvmcs, CR0_GUEST_HOST_MASK);
+    if ( ~mask )
+    {
+        cr = __get_vvmcs(vvmcs, GUEST_CR0);
+        cr = (cr & mask) | (__vmread(GUEST_CR0) & ~mask);
+        __set_vvmcs(vvmcs, GUEST_CR0, cr);
+    }
+
+    mask = __get_vvmcs(vvmcs, CR4_GUEST_HOST_MASK);
+    if ( ~mask )
+    {
+        cr = __get_vvmcs(vvmcs, GUEST_CR4);
+        cr = (cr & mask) | (__vmread(GUEST_CR4) & ~mask);
+        __set_vvmcs(vvmcs, GUEST_CR4, cr);
+    }
+
+    /* CR3 sync if exec doesn't want cr3 load exiting: i.e. nested EPT */
+    if ( !(__n2_exec_control(v) & CPU_BASED_CR3_LOAD_EXITING) )
+        shadow_to_vvmcs(vvmcs, GUEST_CR3);
+}
+
+static void sync_vvmcs_ro(struct vcpu *v)
+{
+    int i;
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+
+    for ( i = 0; i < ARRAY_SIZE(vmcs_ro_field); i++ )
+        shadow_to_vvmcs(nvcpu->nv_vvmcx, vmcs_ro_field[i]);
+}
+
+static void load_vvmcs_host_state(struct vcpu *v)
+{
+    int i;
+    u64 r;
+    void *vvmcs = vcpu_nestedhvm(v).nv_vvmcx;
+
+    for ( i = 0; i < ARRAY_SIZE(vmcs_h2g_field); i++ )
+    {
+        r = __get_vvmcs(vvmcs, vmcs_h2g_field[i].host_field);
+        __vmwrite(vmcs_h2g_field[i].guest_field, r);
+    }
+
+    hvm_set_cr0(__get_vvmcs(vvmcs, HOST_CR0));
+    hvm_set_cr4(__get_vvmcs(vvmcs, HOST_CR4));
+    hvm_set_cr3(__get_vvmcs(vvmcs, HOST_CR3));
+
+    __set_vvmcs(vvmcs, VM_ENTRY_INTR_INFO, 0);
+}
+
+static void virtual_vmexit(struct cpu_user_regs *regs)
+{
+    struct vcpu *v = current;
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+#ifdef __x86_64__
+    unsigned long lm_l1, lm_l2;
+#endif
+
+    sync_vvmcs_ro(v);
+    sync_vvmcs_guest_state(v, regs);
+
+    vmx_vmcs_switch(v, v->arch.hvm_vmx.vmcs, nvcpu->nv_n1vmcx);
+
+    nestedhvm_vcpu_exit_guestmode(v);
+    nvcpu->nv_vmexit_pending = 0;
+
+#ifdef __x86_64__
+    lm_l2 = !!hvm_long_mode_enabled(v);
+    lm_l1 = !!(__get_vvmcs(nvcpu->nv_vvmcx, VM_EXIT_CONTROLS) &
+                           VM_EXIT_IA32E_MODE);
+
+    if ( lm_l1 )
+        v->arch.hvm_vcpu.guest_efer |= EFER_LMA | EFER_LME;
+    else
+        v->arch.hvm_vcpu.guest_efer &= ~(EFER_LMA | EFER_LME);
+#endif
+
+    vmx_update_cpu_exec_control(v);
+    vmx_update_exception_bitmap(v);
+
+    load_vvmcs_host_state(v);
+
+#ifdef __x86_64__
+    if ( lm_l1 != lm_l2 )
+        paging_update_paging_modes(v);
+#endif
+
+    regs->rip = __get_vvmcs(nvcpu->nv_vvmcx, HOST_RIP);
+    regs->rsp = __get_vvmcs(nvcpu->nv_vvmcx, HOST_RSP);
+    regs->rflags = __vmread(GUEST_RFLAGS);
+
+    vmreturn(regs, VMSUCCEED);
+}
+
+asmlinkage void nvmx_switch_guest(void)
+{
+    struct vcpu *v = current;
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+    struct cpu_user_regs *regs = guest_cpu_user_regs();
+
+    /*
+     * A softirq may interrupt us between the handling of a virtual
+     * vmentry and the real vmentry. If, during this window, an L1
+     * virtual interrupt causes another virtual vmexit, we cannot let
+     * that happen or VM_ENTRY_INTR_INFO would be lost.
+     */
+    if ( unlikely(nvcpu->nv_vmswitch_in_progress) )
+        return;
+
+    if ( nestedhvm_vcpu_in_guestmode(v) && nvcpu->nv_vmexit_pending )
+    {
+        local_irq_enable();
+        virtual_vmexit(regs);
+    }
+    else if ( !nestedhvm_vcpu_in_guestmode(v) && nvcpu->nv_vmentry_pending )
+    {
+        local_irq_enable();
+        virtual_vmentry(regs);
+    }
+}
+
+/*
  * VMX instructions handling
  */
 
diff -r aacbe98da103 -r bd15acfc9b82 xen/include/asm-x86/hvm/vmx/vvmx.h
--- a/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:21 2011 +0800
+++ b/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:21 2011 +0800
@@ -165,6 +165,7 @@ void nvmx_update_exec_control(struct vcp
 void nvmx_update_secondary_exec_control(struct vcpu *v,
                                         unsigned long value);
 void nvmx_update_exception_bitmap(struct vcpu *v, unsigned long value);
+asmlinkage void nvmx_switch_guest(void);
 
 #endif /* __ASM_X86_HVM_VVMX_H__ */
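
For illustration only: the CR0/CR4 write-back rule used in
sync_vvmcs_guest_state() above, as a stand-alone helper. Bits set in the
guest/host mask are owned by L1 and keep their virtual-VMCS value; all other
bits take the value L2 actually ran with:

#include <stdio.h>
#include <stdint.h>

static uint64_t merge_cr(uint64_t vvmcs_cr, uint64_t hw_cr, uint64_t gh_mask)
{
    /* L1-owned bits (mask set) keep the virtual VMCS value,
     * the rest reflect what L2 actually ran with. */
    return (vvmcs_cr & gh_mask) | (hw_cr & ~gh_mask);
}

int main(void)
{
    uint64_t vcr = 0x80000011, hw = 0x80000001, mask = 0x10;

    printf("synced CR = %#llx\n",
           (unsigned long long)merge_cr(vcr, hw, mask));
    return 0;
}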

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 16 of 20] interrupt/exception handling for n2 guest
  2011-06-02  8:57 [PATCH 00 of 20] NestedVMX support Eddie Dong
                   ` (14 preceding siblings ...)
  2011-06-02  8:57 ` [PATCH 15 of 20] Switch shadow/virtual VMCS between n1/n2 guests Eddie Dong
@ 2011-06-02  8:57 ` Eddie Dong
  2011-06-02  8:57 ` [PATCH 17 of 20] VM exit handler of n2-guest Eddie Dong
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 45+ messages in thread
From: Eddie Dong @ 2011-06-02  8:57 UTC (permalink / raw)
  To: Tim.Deegan; +Cc: xen-devel

# HG changeset patch
# User Eddie Dong <eddie.dong@intel.com>
# Date 1307003601 -28800
# Node ID f14f451a780e60e920c057e44fa1bc3ee40495a7
# Parent  bd15acfc9b822ccf27b5c7603e600e5e11733907
interrupt/exception handling for n2 guest

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>

diff -r bd15acfc9b82 -r f14f451a780e xen/arch/x86/hvm/vmx/intr.c
--- a/xen/arch/x86/hvm/vmx/intr.c	Thu Jun 02 16:33:21 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/intr.c	Thu Jun 02 16:33:21 2011 +0800
@@ -35,6 +35,7 @@
 #include <asm/hvm/vmx/vmcs.h>
 #include <asm/hvm/vpic.h>
 #include <asm/hvm/vlapic.h>
+#include <asm/hvm/nestedhvm.h>
 #include <public/hvm/ioreq.h>
 #include <asm/hvm/trace.h>
 
@@ -109,6 +110,102 @@ static void enable_intr_window(struct vc
     }
 }
 
+/*
+ * Injecting interrupts for nested virtualization
+ *
+ *  When injecting virtual interrupts (originating from L0), there are
+ *  two major cases: within L1 context and within L2 context.
+ *   1. L1 context (in_nesting == 0)
+ *     Everything is the same as without nesting: check RFLAGS.IF to
+ *     see whether the injection can be done, and use the VMCS to
+ *     inject the interrupt.
+ *
+ *   2. L2 context (in_nesting == 1)
+ *     Causes a virtual VMExit; RFLAGS.IF is ignored and the irq is
+ *     acked or not according to intr_ack_on_exit. Injection normally
+ *     shouldn't be blocked, except for:
+ *    a. context transition
+ *     the interrupt needs to be blocked at virtual VMEntry time
+ *    b. L2 idtv reinjection
+ *     if an L2 IDT-vectoring event is handled within L0 (e.g. an L0
+ *     shadow page fault), it needs to be reinjected without exiting
+ *     to L1; interrupt injection must be blocked at this point as well.
+ *
+ *  Unfortunately, interrupt blocking in L2 won't work with the simple
+ *  intr_window_open check (which depends on L2's IF). To solve this,
+ *  the following algorithm can be used:
+ *   v->arch.hvm_vmx.exec_control.VIRTUAL_INTR_PENDING now denotes
+ *   only the L0 control; the physical control may differ from it.
+ *       - if in L1, it behaves normally: the intr window is written
+ *         to the physical control as it is
+ *       - if in L2, replace it with MTF (or NMI window) if possible
+ *       - if MTF/NMI window is not used, the intr window can still be
+ *         used, but may have a negative impact on interrupt performance.
+ */
+
+enum hvm_intblk nvmx_intr_blocked(struct vcpu *v)
+{
+    int r = hvm_intblk_none;
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+
+    if ( nestedhvm_vcpu_in_guestmode(v) )
+    {
+        if ( nvcpu->nv_vmexit_pending ||
+             nvcpu->nv_vmswitch_in_progress ||
+             (__vmread(VM_ENTRY_INTR_INFO) & INTR_INFO_VALID_MASK) )
+            r = hvm_intblk_rflags_ie;
+    }
+    else if ( nvcpu->nv_vmentry_pending )
+        r = hvm_intblk_rflags_ie;
+
+    return r;
+}
+
+static int nvmx_intr_intercept(struct vcpu *v, struct hvm_intack intack)
+{
+    u32 exit_ctrl;
+
+    /*
+     * TODO:
+     *   - if L1 intr-window exiting == 0
+     *   - vNMI
+     */
+
+    if ( nvmx_intr_blocked(v) != hvm_intblk_none )
+    {
+        enable_intr_window(v, intack);
+        return 1;
+    }
+
+    if ( nestedhvm_vcpu_in_guestmode(v) )
+    {
+        if ( intack.source == hvm_intsrc_pic ||
+                 intack.source == hvm_intsrc_lapic )
+        {
+            vmx_inject_extint(intack.vector);
+
+            exit_ctrl = __get_vvmcs(vcpu_nestedhvm(v).nv_vvmcx,
+                            VM_EXIT_CONTROLS);
+            if ( exit_ctrl & VM_EXIT_ACK_INTR_ON_EXIT )
+            {
+                /* for now, duplicate the ack path in vmx_intr_assist */
+                hvm_vcpu_ack_pending_irq(v, intack);
+                pt_intr_post(v, intack);
+
+                intack = hvm_vcpu_has_pending_irq(v);
+                if ( unlikely(intack.source != hvm_intsrc_none) )
+                    enable_intr_window(v, intack);
+            }
+            else
+                enable_intr_window(v, intack);
+
+            return 1;
+        }
+    }
+
+    return 0;
+}
+
 asmlinkage void vmx_intr_assist(void)
 {
     struct hvm_intack intack;
@@ -132,6 +229,9 @@ asmlinkage void vmx_intr_assist(void)
         if ( likely(intack.source == hvm_intsrc_none) )
             goto out;
 
+        if ( unlikely(nvmx_intr_intercept(v, intack)) )
+            goto out;
+
         intblk = hvm_interrupt_blocked(v, intack);
         if ( intblk == hvm_intblk_tpr )
         {
diff -r bd15acfc9b82 -r f14f451a780e xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:21 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:21 2011 +0800
@@ -1243,6 +1243,31 @@ void ept_sync_domain(struct domain *d)
                      __ept_sync_domain, d, 1);
 }
 
+void nvmx_enqueue_n2_exceptions(struct vcpu *v, 
+            unsigned long intr_fields, int error_code)
+{
+    struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
+
+    if ( !(nvmx->intr.intr_info & INTR_INFO_VALID_MASK) ) {
+        /* enqueue the exception till the VMCS switch back to L1 */
+        nvmx->intr.intr_info = intr_fields;
+        nvmx->intr.error_code = error_code;
+        vcpu_nestedhvm(v).nv_vmexit_pending = 1;
+        return;
+    }
+    else
+        gdprintk(XENLOG_ERR, "Double Fault on Nested Guest: exception %lx %x "
+                 "on %lx %x\n", intr_fields, error_code,
+                 nvmx->intr.intr_info, nvmx->intr.error_code);
+}
+
+static int nvmx_vmexit_exceptions(struct vcpu *v, unsigned int trapnr,
+                      int errcode, unsigned long cr2)
+{
+    nvmx_enqueue_n2_exceptions(v, trapnr, errcode);
+    return NESTEDHVM_VMEXIT_DONE;
+}
+
 static void __vmx_inject_exception(int trap, int type, int error_code)
 {
     unsigned long intr_fields;
@@ -1272,11 +1297,16 @@ static void __vmx_inject_exception(int t
 
 void vmx_inject_hw_exception(int trap, int error_code)
 {
-    unsigned long intr_info = __vmread(VM_ENTRY_INTR_INFO);
+    unsigned long intr_info;
     struct vcpu *curr = current;
 
     int type = X86_EVENTTYPE_HW_EXCEPTION;
 
+    if ( nestedhvm_vcpu_in_guestmode(curr) )
+        intr_info = vcpu_2_nvmx(curr).intr.intr_info;
+    else
+        intr_info = __vmread(VM_ENTRY_INTR_INFO);
+
     switch ( trap )
     {
     case TRAP_debug:
@@ -1308,7 +1338,16 @@ void vmx_inject_hw_exception(int trap, i
             error_code = 0;
     }
 
-    __vmx_inject_exception(trap, type, error_code);
+    if ( nestedhvm_vcpu_in_guestmode(curr) &&
+         nvmx_intercepts_exception(curr, trap, error_code) )
+    {
+        nvmx_enqueue_n2_exceptions (curr, 
+            INTR_INFO_VALID_MASK | (type<<8) | trap,
+            error_code); 
+        return;
+    }
+    else
+        __vmx_inject_exception(trap, type, error_code);
 
     if ( trap == TRAP_page_fault )
         HVMTRACE_LONG_2D(PF_INJECT, error_code,
@@ -1319,12 +1358,38 @@ void vmx_inject_hw_exception(int trap, i
 
 void vmx_inject_extint(int trap)
 {
+    struct vcpu *v = current;
+    u32    pin_based_cntrl;
+
+    if ( nestedhvm_vcpu_in_guestmode(v) ) {
+        pin_based_cntrl = __get_vvmcs(vcpu_nestedhvm(v).nv_vvmcx, 
+                                     PIN_BASED_VM_EXEC_CONTROL);
+        if ( pin_based_cntrl & PIN_BASED_EXT_INTR_MASK ) {
+            nvmx_enqueue_n2_exceptions (v, 
+               INTR_INFO_VALID_MASK | (X86_EVENTTYPE_EXT_INTR<<8) | trap,
+               HVM_DELIVER_NO_ERROR_CODE);
+            return;
+        }
+    }
     __vmx_inject_exception(trap, X86_EVENTTYPE_EXT_INTR,
                            HVM_DELIVER_NO_ERROR_CODE);
 }
 
 void vmx_inject_nmi(void)
 {
+    struct vcpu *v = current;
+    u32    pin_based_cntrl;
+
+    if ( nestedhvm_vcpu_in_guestmode(v) ) {
+        pin_based_cntrl = __get_vvmcs(vcpu_nestedhvm(v).nv_vvmcx, 
+                                     PIN_BASED_VM_EXEC_CONTROL);
+        if ( pin_based_cntrl & PIN_BASED_NMI_EXITING ) {
+            nvmx_enqueue_n2_exceptions (v, 
+               INTR_INFO_VALID_MASK | (X86_EVENTTYPE_NMI<<8) | TRAP_nmi,
+               HVM_DELIVER_NO_ERROR_CODE);
+            return;
+        }
+    }
     __vmx_inject_exception(2, X86_EVENTTYPE_NMI,
                            HVM_DELIVER_NO_ERROR_CODE);
 }
@@ -1424,7 +1489,10 @@ static struct hvm_function_table __read_
     .nhvm_vcpu_reset      = nvmx_vcpu_reset,
     .nhvm_vcpu_guestcr3   = nvmx_vcpu_guestcr3,
     .nhvm_vcpu_hostcr3    = nvmx_vcpu_hostcr3,
-    .nhvm_vcpu_asid       = nvmx_vcpu_asid
+    .nhvm_vcpu_asid       = nvmx_vcpu_asid,
+    .nhvm_vmcx_guest_intercepts_trap = nvmx_intercepts_exception,
+    .nhvm_vcpu_vmexit_trap = nvmx_vmexit_exceptions,
+    .nhvm_intr_blocked    = nvmx_intr_blocked
 };
 
 struct hvm_function_table * __init start_vmx(void)
@@ -2237,7 +2305,8 @@ asmlinkage void vmx_vmexit_handler(struc
     hvm_maybe_deassert_evtchn_irq();
 
     idtv_info = __vmread(IDT_VECTORING_INFO);
-    if ( exit_reason != EXIT_REASON_TASK_SWITCH )
+    if ( !nestedhvm_vcpu_in_guestmode(v) && 
+         exit_reason != EXIT_REASON_TASK_SWITCH )
         vmx_idtv_reinject(idtv_info);
 
     switch ( exit_reason )
@@ -2585,6 +2654,9 @@ asmlinkage void vmx_vmexit_handler(struc
         domain_crash(v->domain);
         break;
     }
+
+    if ( nestedhvm_vcpu_in_guestmode(v) )
+        nvmx_idtv_handling();
 }
 
 asmlinkage void vmx_vmenter_helper(void)
diff -r bd15acfc9b82 -r f14f451a780e xen/arch/x86/hvm/vmx/vvmx.c
--- a/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:21 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:21 2011 +0800
@@ -392,6 +392,27 @@ static void vmreturn(struct cpu_user_reg
     regs->eflags = eflags;
 }
 
+int nvmx_intercepts_exception(struct vcpu *v, unsigned int trap,
+                               int error_code)
+{
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+    u32 exception_bitmap, pfec_match=0, pfec_mask=0;
+    int r;
+
+    ASSERT ( trap < 32 );
+
+    exception_bitmap = __get_vvmcs(nvcpu->nv_vvmcx, EXCEPTION_BITMAP);
+    r = exception_bitmap & (1 << trap) ? 1: 0;
+
+    if ( trap == TRAP_page_fault ) {
+        pfec_match = __get_vvmcs(nvcpu->nv_vvmcx, PAGE_FAULT_ERROR_CODE_MATCH);
+        pfec_mask  = __get_vvmcs(nvcpu->nv_vvmcx, PAGE_FAULT_ERROR_CODE_MASK);
+        if ( (error_code & pfec_mask) != pfec_match )
+            r = !r;
+    }
+    return r;
+}
+
 /*
  * Nested VMX uses "strict" condition to exit from 
  * L2 guest if either L1 VMM or L0 VMM expect to exit.
@@ -465,6 +486,7 @@ void nvmx_update_exec_control(struct vcp
         __vmwrite(IO_BITMAP_B, virt_to_maddr(bitmap) + PAGE_SIZE);
     }
 
+    /* TODO: change L0 intr window to MTF or NMI window */
     __vmwrite(CPU_BASED_VM_EXEC_CONTROL, shadow_cntrl);
 }
 
@@ -868,6 +890,42 @@ static void load_vvmcs_host_state(struct
     __set_vvmcs(vvmcs, VM_ENTRY_INTR_INFO, 0);
 }
 
+static void sync_exception_state(struct vcpu *v)
+{
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+    struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
+
+    if ( !(nvmx->intr.intr_info & INTR_INFO_VALID_MASK) )
+        return;
+
+    switch ( nvmx->intr.intr_info & INTR_INFO_INTR_TYPE_MASK )
+    {
+    case X86_EVENTTYPE_EXT_INTR:
+        /* rename exit_reason to EXTERNAL_INTERRUPT */
+        __set_vvmcs(nvcpu->nv_vvmcx, VM_EXIT_REASON,
+                    EXIT_REASON_EXTERNAL_INTERRUPT);
+        __set_vvmcs(nvcpu->nv_vvmcx, EXIT_QUALIFICATION, 0);
+        __set_vvmcs(nvcpu->nv_vvmcx, VM_EXIT_INTR_INFO,
+                    nvmx->intr.intr_info);
+        break;
+
+    case X86_EVENTTYPE_HW_EXCEPTION:
+    case X86_EVENTTYPE_SW_INTERRUPT:
+    case X86_EVENTTYPE_SW_EXCEPTION:
+        /* throw to L1 */
+        __set_vvmcs(nvcpu->nv_vvmcx, VM_EXIT_INTR_INFO,
+                    nvmx->intr.intr_info);
+        __set_vvmcs(nvcpu->nv_vvmcx, VM_EXIT_INTR_ERROR_CODE,
+                    nvmx->intr.error_code);
+        break;
+    case X86_EVENTTYPE_NMI:
+    default:
+        gdprintk(XENLOG_ERR, "Exception state %lx not handled\n",
+               nvmx->intr.intr_info); 
+        break;
+    }
+}
+
 static void virtual_vmexit(struct cpu_user_regs *regs)
 {
     struct vcpu *v = current;
@@ -878,6 +936,7 @@ static void virtual_vmexit(struct cpu_us
 
     sync_vvmcs_ro(v);
     sync_vvmcs_guest_state(v, regs);
+    sync_exception_state(v);
 
     vmx_vmcs_switch(v, v->arch.hvm_vmx.vmcs, nvcpu->nv_n1vmcx);
 
@@ -1169,3 +1228,40 @@ int nvmx_handle_vmwrite(struct cpu_user_
     return X86EMUL_OKAY;
 }
 
+void nvmx_idtv_handling(void)
+{
+    struct vcpu *v = current;
+    struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+    unsigned int idtv_info = __vmread(IDT_VECTORING_INFO);
+
+    if ( likely(!(idtv_info & INTR_INFO_VALID_MASK)) )
+        return;
+
+    /*
+     * If L0 can handle the fault that caused the IDT vectoring, the
+     * event is reinjected here; otherwise it is passed on to L1.
+     */
+    if ( (__vmread(VM_EXIT_REASON) != EXIT_REASON_EPT_VIOLATION &&
+          !(nvmx->intr.intr_info & INTR_INFO_VALID_MASK)) ||
+         (__vmread(VM_EXIT_REASON) == EXIT_REASON_EPT_VIOLATION &&
+          !nvcpu->nv_vmexit_pending) )
+    {
+        __vmwrite(VM_ENTRY_INTR_INFO, idtv_info & ~INTR_INFO_RESVD_BITS_MASK);
+        if ( idtv_info & INTR_INFO_DELIVER_CODE_MASK )
+           __vmwrite(VM_ENTRY_EXCEPTION_ERROR_CODE,
+                        __vmread(IDT_VECTORING_ERROR_CODE));
+        /*
+         * SDM 23.2.4, if L1 tries to inject a software interrupt
+         * and the delivery fails, VM_EXIT_INSTRUCTION_LEN receives
+         * the value of previous VM_ENTRY_INSTRUCTION_LEN.
+         *
+         * This means EXIT_INSTRUCTION_LEN is always valid here, for
+         * software interrupts both injected by L1, and generated in L2.
+         */
+        __vmwrite(VM_ENTRY_INSTRUCTION_LEN, __vmread(VM_EXIT_INSTRUCTION_LEN));
+   }
+
+    /* TODO: NMI */
+}
+
diff -r bd15acfc9b82 -r f14f451a780e xen/include/asm-x86/hvm/vmx/vvmx.h
--- a/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:21 2011 +0800
+++ b/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:21 2011 +0800
@@ -93,6 +93,9 @@ int nvmx_vcpu_reset(struct vcpu *v);
 uint64_t nvmx_vcpu_guestcr3(struct vcpu *v);
 uint64_t nvmx_vcpu_hostcr3(struct vcpu *v);
 uint32_t nvmx_vcpu_asid(struct vcpu *v);
+enum hvm_intblk nvmx_intr_blocked(struct vcpu *v);
+int nvmx_intercepts_exception(struct vcpu *v, 
+                              unsigned int trap, int error_code);
 
 int nvmx_handle_vmxon(struct cpu_user_regs *regs);
 int nvmx_handle_vmxoff(struct cpu_user_regs *regs);
@@ -166,6 +169,7 @@ void nvmx_update_secondary_exec_control(
                                         unsigned long value);
 void nvmx_update_exception_bitmap(struct vcpu *v, unsigned long value);
 asmlinkage void nvmx_switch_guest(void);
+void nvmx_idtv_handling(void);
 
 #endif /* __ASM_X86_HVM_VVMX_H__ */
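
For illustration only: how the L1 exception bitmap and the page-fault
error-code match/mask in nvmx_intercepts_exception() above decide whether a
fault is forwarded to L1. A standalone sketch with simplified parameters;
vector 14 is #PF:

#include <stdio.h>
#include <stdint.h>

#define TRAP_page_fault 14

static int l1_wants_exception(uint32_t bitmap, unsigned int trap,
                              uint32_t errcode, uint32_t pfec_match,
                              uint32_t pfec_mask)
{
    int r = (bitmap & (1u << trap)) ? 1 : 0;

    /* For #PF the decision is inverted when the error code fails the match. */
    if ( trap == TRAP_page_fault && (errcode & pfec_mask) != pfec_match )
        r = !r;
    return r;
}

int main(void)
{
    /* L1 intercepts #PF, but only write faults (PFEC bit 1). */
    uint32_t bitmap = 1u << TRAP_page_fault;

    printf("write fault -> %d, read fault -> %d\n",
           l1_wants_exception(bitmap, TRAP_page_fault, 0x2, 0x2, 0x2),
           l1_wants_exception(bitmap, TRAP_page_fault, 0x0, 0x2, 0x2));
    return 0;
}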

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 17 of 20] VM exit handler of n2-guest
  2011-06-02  8:57 [PATCH 00 of 20] NestedVMX support Eddie Dong
                   ` (15 preceding siblings ...)
  2011-06-02  8:57 ` [PATCH 16 of 20] interrupt/exception handling for n2 guest Eddie Dong
@ 2011-06-02  8:57 ` Eddie Dong
  2011-06-02 14:59   ` Tim Deegan
  2011-06-02  8:57 ` [PATCH 18 of 20] Lazy FPU for n2 guest Eddie Dong
                   ` (3 subsequent siblings)
  20 siblings, 1 reply; 45+ messages in thread
From: Eddie Dong @ 2011-06-02  8:57 UTC (permalink / raw)
  To: Tim.Deegan; +Cc: xen-devel

# HG changeset patch
# User Eddie Dong <eddie.dong@intel.com>
# Date 1307003601 -28800
# Node ID 24d4d7d3e4c44c8dc61f464bca9aae57480dfe75
# Parent  f14f451a780e60e920c057e44fa1bc3ee40495a7
VM exit handler of n2-guest

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>

diff -r f14f451a780e -r 24d4d7d3e4c4 xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:21 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:21 2011 +0800
@@ -943,6 +943,10 @@ static void vmx_set_segment_register(str
 static void vmx_set_tsc_offset(struct vcpu *v, u64 offset)
 {
     vmx_vmcs_enter(v);
+
+    if ( nestedhvm_vcpu_in_guestmode(v) )
+        offset += nvmx_get_tsc_offset(v);
+
     __vmwrite(TSC_OFFSET, offset);
 #if defined (__i386__)
     __vmwrite(TSC_OFFSET_HIGH, offset >> 32);
@@ -2258,6 +2262,11 @@ asmlinkage void vmx_vmexit_handler(struc
      * any pending vmresume has really happened
      */
     vcpu_nestedhvm(v).nv_vmswitch_in_progress = 0;
+    if ( nestedhvm_vcpu_in_guestmode(v) )
+    {
+        if ( nvmx_n2_vmexit_handler(regs, exit_reason) )
+            goto out;
+    }
 
     if ( unlikely(exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY) )
         return vmx_failed_vmentry(exit_reason, regs);
@@ -2655,6 +2664,7 @@ asmlinkage void vmx_vmexit_handler(struc
         break;
     }
 
+out:
     if ( nestedhvm_vcpu_in_guestmode(v) )
         nvmx_idtv_handling();
 }
diff -r f14f451a780e -r 24d4d7d3e4c4 xen/arch/x86/hvm/vmx/vvmx.c
--- a/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:21 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:21 2011 +0800
@@ -288,13 +288,19 @@ static int vmx_inst_check_privilege(stru
     if ( (regs->eflags & X86_EFLAGS_VM) ||
          (hvm_long_mode_enabled(v) && cs.attr.fields.l == 0) )
         goto invalid_op;
-    /* TODO: check vmx operation mode */
+    else if ( nestedhvm_vcpu_in_guestmode(v) )
+        goto vmexit;
 
     if ( (cs.sel & 3) > 0 )
         goto gp_fault;
 
     return X86EMUL_OKAY;
 
+vmexit:
+    gdprintk(XENLOG_ERR, "vmx_inst_check_privilege: vmexit\n");
+    vcpu_nestedhvm(v).nv_vmexit_pending = 1;
+    return X86EMUL_EXCEPTION;
+    
 invalid_op:
     gdprintk(XENLOG_ERR, "vmx_inst_check_privilege: invalid_op\n");
     hvm_inject_exception(TRAP_invalid_op, 0, 0);
@@ -606,6 +612,18 @@ static void nvmx_purge_vvmcs(struct vcpu
     }
 }
 
+u64 nvmx_get_tsc_offset(struct vcpu *v)
+{
+    u64 offset = 0;
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+
+    if ( __get_vvmcs(nvcpu->nv_vvmcx, CPU_BASED_VM_EXEC_CONTROL) &
+         CPU_BASED_USE_TSC_OFFSETING )
+        offset = __get_vvmcs(nvcpu->nv_vvmcx, TSC_OFFSET);
+
+    return offset;
+}
+
 /*
  * Context synchronized between shadow and virtual VMCS.
  */
@@ -759,6 +777,8 @@ static void load_shadow_guest_state(stru
     hvm_set_cr4(__get_vvmcs(vvmcs, GUEST_CR4));
     hvm_set_cr3(__get_vvmcs(vvmcs, GUEST_CR3));
 
+    hvm_funcs.set_tsc_offset(v, v->arch.hvm_vcpu.cache_tsc_offset);
+
     vvmcs_to_shadow(vvmcs, VM_ENTRY_INTR_INFO);
     vvmcs_to_shadow(vvmcs, VM_ENTRY_EXCEPTION_ERROR_CODE);
     vvmcs_to_shadow(vvmcs, VM_ENTRY_INSTRUCTION_LEN);
@@ -887,6 +907,8 @@ static void load_vvmcs_host_state(struct
     hvm_set_cr4(__get_vvmcs(vvmcs, HOST_CR4));
     hvm_set_cr3(__get_vvmcs(vvmcs, HOST_CR3));
 
+    hvm_funcs.set_tsc_offset(v, v->arch.hvm_vcpu.cache_tsc_offset);
+
     __set_vvmcs(vvmcs, VM_ENTRY_INTR_INFO, 0);
 }
 
@@ -1265,3 +1287,252 @@ void nvmx_idtv_handling(void)
     /* TODO: NMI */
 }
 
+/*
+ * L2 VMExit handling
+ *    return 1: done, or the normal layer-0 hypervisor processing is
+ *              skipped; typically the exit requires layer-1 hypervisor
+ *              processing, or it has already been handled here.
+ *           0: the normal layer-0 processing is required.
+ */
+int nvmx_n2_vmexit_handler(struct cpu_user_regs *regs,
+                               unsigned int exit_reason)
+{
+    struct vcpu *v = current;
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+    struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
+    u32 ctrl;
+    u16 port;
+    u8 *bitmap;
+
+    nvcpu->nv_vmexit_pending = 0;
+    nvmx->intr.intr_info = 0;
+    nvmx->intr.error_code = 0;
+
+    switch (exit_reason) {
+    case EXIT_REASON_EXCEPTION_NMI:
+    {
+        u32 intr_info = __vmread(VM_EXIT_INTR_INFO);
+        u32 valid_mask = (X86_EVENTTYPE_HW_EXCEPTION << 8) |
+                         INTR_INFO_VALID_MASK;
+        u64 exec_bitmap;
+        int vector = intr_info & INTR_INFO_VECTOR_MASK;
+
+        /*
+         * Decided by the L0 and L1 exception bitmaps: if the vector is
+         * set in both, L0 has priority on #PF, L1 has priority on others.
+         */
+        if ( vector == TRAP_page_fault )
+        {
+            if ( paging_mode_hap(v->domain) )
+                nvcpu->nv_vmexit_pending = 1;
+        }
+        else if ( (intr_info & valid_mask) == valid_mask )
+        {
+            exec_bitmap =__get_vvmcs(nvcpu->nv_vvmcx, EXCEPTION_BITMAP);
+
+            if ( exec_bitmap & (1 << vector) )
+                nvcpu->nv_vmexit_pending = 1;
+        }
+        break;
+    }
+
+    case EXIT_REASON_WBINVD:
+    case EXIT_REASON_EPT_VIOLATION:
+    case EXIT_REASON_EPT_MISCONFIG:
+    case EXIT_REASON_EXTERNAL_INTERRUPT:
+        /* pass to L0 handler */
+        break;
+
+    case VMX_EXIT_REASONS_FAILED_VMENTRY:
+    case EXIT_REASON_TRIPLE_FAULT:
+    case EXIT_REASON_TASK_SWITCH:
+    case EXIT_REASON_CPUID:
+    case EXIT_REASON_MSR_READ:
+    case EXIT_REASON_MSR_WRITE:
+    case EXIT_REASON_VMCALL:
+    case EXIT_REASON_VMCLEAR:
+    case EXIT_REASON_VMLAUNCH:
+    case EXIT_REASON_VMPTRLD:
+    case EXIT_REASON_VMPTRST:
+    case EXIT_REASON_VMREAD:
+    case EXIT_REASON_VMRESUME:
+    case EXIT_REASON_VMWRITE:
+    case EXIT_REASON_VMXOFF:
+    case EXIT_REASON_VMXON:
+    case EXIT_REASON_INVEPT:
+        /* inject to L1 */
+        nvcpu->nv_vmexit_pending = 1;
+        break;
+    case EXIT_REASON_IO_INSTRUCTION:
+        ctrl = __n2_exec_control(v);
+        if ( ctrl & CPU_BASED_ACTIVATE_IO_BITMAP )
+        {
+            port = __vmread(EXIT_QUALIFICATION) >> 16;
+            bitmap = nvmx->iobitmap[port >> 15];
+            if ( bitmap[(port & 0x7fff) >> 3] & (1 << (port & 0x7)) )
+                nvcpu->nv_vmexit_pending = 1;
+        }
+        else if ( ctrl & CPU_BASED_UNCOND_IO_EXITING )
+            nvcpu->nv_vmexit_pending = 1;
+        break;
+
+    case EXIT_REASON_PENDING_VIRT_INTR:
+    {
+        ctrl = v->arch.hvm_vmx.exec_control;
+
+        /*
+         * If both L0 and L1 open an intr/nmi window, L0 has priority.
+         *
+         * Note that this is not strictly correct: in L2 context,
+         * L0's intr/nmi window flag should be replaced with MTF,
+         * causing an immediate VMExit, but MTF may not be available
+         * on all hardware.
+         */
+        if ( !(ctrl & CPU_BASED_VIRTUAL_INTR_PENDING) )
+            nvcpu->nv_vmexit_pending = 1;
+
+        break;
+    }
+    case EXIT_REASON_PENDING_VIRT_NMI:
+    {
+        ctrl = v->arch.hvm_vmx.exec_control;
+
+        if ( !(ctrl & CPU_BASED_VIRTUAL_NMI_PENDING) )
+            nvcpu->nv_vmexit_pending = 1;
+
+        break;
+    }
+
+    /* L1 has priority handling several other types of exits */
+    case EXIT_REASON_HLT:
+    {
+        ctrl = __n2_exec_control(v);
+
+        if ( ctrl & CPU_BASED_HLT_EXITING )
+            nvcpu->nv_vmexit_pending = 1;
+
+        break;
+    }
+
+    case EXIT_REASON_RDTSC:
+    {
+        ctrl = __n2_exec_control(v);
+
+        if ( ctrl & CPU_BASED_RDTSC_EXITING )
+            nvcpu->nv_vmexit_pending = 1;
+        else
+        {
+            uint64_t tsc;
+
+            /*
+             * Special handling is needed if L1 doesn't intercept rdtsc,
+             * to avoid changing guest_tsc and messing up timekeeping in L1.
+             */
+            tsc = hvm_get_guest_tsc(v);
+            tsc += __get_vvmcs(nvcpu->nv_vvmcx, TSC_OFFSET);
+            regs->eax = (uint32_t)tsc;
+            regs->edx = (uint32_t)(tsc >> 32);
+
+            return 1;
+        }
+
+        break;
+    }
+
+    case EXIT_REASON_RDPMC:
+    {
+        ctrl = __n2_exec_control(v);
+
+        if ( ctrl & CPU_BASED_RDPMC_EXITING )
+            nvcpu->nv_vmexit_pending = 1;
+
+        break;
+    }
+
+    case EXIT_REASON_MWAIT_INSTRUCTION:
+    {
+        ctrl = __n2_exec_control(v);
+
+        if ( ctrl & CPU_BASED_MWAIT_EXITING )
+            nvcpu->nv_vmexit_pending = 1;
+
+        break;
+    }
+
+    case EXIT_REASON_PAUSE_INSTRUCTION:
+    {
+        ctrl = __n2_exec_control(v);
+
+        if ( ctrl & CPU_BASED_PAUSE_EXITING )
+            nvcpu->nv_vmexit_pending = 1;
+
+        break;
+    }
+
+    case EXIT_REASON_MONITOR_INSTRUCTION:
+    {
+        ctrl = __n2_exec_control(v);
+
+        if ( ctrl & CPU_BASED_MONITOR_EXITING )
+            nvcpu->nv_vmexit_pending = 1;
+
+        break;
+    }
+
+    case EXIT_REASON_DR_ACCESS:
+    {
+        ctrl = __n2_exec_control(v);
+
+        if ( ctrl & CPU_BASED_MOV_DR_EXITING )
+            nvcpu->nv_vmexit_pending = 1;
+
+        break;
+    }
+
+    case EXIT_REASON_INVLPG:
+    {
+        ctrl = __n2_exec_control(v);
+
+        if ( ctrl & CPU_BASED_INVLPG_EXITING )
+            nvcpu->nv_vmexit_pending = 1;
+
+        break;
+    }
+
+    case EXIT_REASON_CR_ACCESS:
+    {
+        u64 exit_qualification = __vmread(EXIT_QUALIFICATION);
+        int cr = exit_qualification & 15;
+        int write = (exit_qualification >> 4) & 3;
+        u32 mask = 0;
+
+        /* also according to guest exec_control */
+        ctrl = __n2_exec_control(v);
+
+        if ( cr == 3 )
+        {
+            mask = write? CPU_BASED_CR3_STORE_EXITING:
+                          CPU_BASED_CR3_LOAD_EXITING;
+            if ( ctrl & mask )
+                nvcpu->nv_vmexit_pending = 1;
+        }
+        else if ( cr == 8 )
+        {
+            mask = write? CPU_BASED_CR8_STORE_EXITING:
+                          CPU_BASED_CR8_LOAD_EXITING;
+            if ( ctrl & mask )
+                nvcpu->nv_vmexit_pending = 1;
+        }
+        else  /* CR0, CR4, CLTS, LMSW */
+            nvcpu->nv_vmexit_pending = 1;
+
+        break;
+    }
+    default:
+        gdprintk(XENLOG_WARNING, "Unknown nested vmexit reason %x.\n",
+                 exit_reason);
+    }
+
+    return ( nvcpu->nv_vmexit_pending == 1 );
+}
+
diff -r f14f451a780e -r 24d4d7d3e4c4 xen/include/asm-x86/hvm/vmx/vvmx.h
--- a/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:21 2011 +0800
+++ b/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:21 2011 +0800
@@ -170,6 +170,9 @@ void nvmx_update_secondary_exec_control(
 void nvmx_update_exception_bitmap(struct vcpu *v, unsigned long value);
 asmlinkage void nvmx_switch_guest(void);
 void nvmx_idtv_handling(void);
+u64 nvmx_get_tsc_offset(struct vcpu *v);
+int nvmx_n2_vmexit_handler(struct cpu_user_regs *regs,
+                          unsigned int exit_reason);
 
 #endif /* __ASM_X86_HVM_VVMX_H__ */
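
For illustration only: the two-page VMX I/O bitmap lookup used for
EXIT_REASON_IO_INSTRUCTION above, as a standalone helper. Bitmap A covers
ports 0x0000-0x7fff and bitmap B covers 0x8000-0xffff, with one bit per port:

#include <stdio.h>
#include <stdint.h>

static int io_port_intercepted(const uint8_t *bitmap_a,
                               const uint8_t *bitmap_b, uint16_t port)
{
    const uint8_t *page = (port & 0x8000) ? bitmap_b : bitmap_a;

    return (page[(port & 0x7fff) >> 3] >> (port & 7)) & 1;
}

int main(void)
{
    static uint8_t a[4096], b[4096];         /* zero-initialized: no intercepts */

    a[0x80 >> 3] |= 1 << (0x80 & 7);         /* L1 intercepts port 0x80 */

    printf("port 0x80 -> %d, port 0x81 -> %d\n",
           io_port_intercepted(a, b, 0x80),
           io_port_intercepted(a, b, 0x81));
    return 0;
}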

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 18 of 20] Lazy FPU for n2 guest
  2011-06-02  8:57 [PATCH 00 of 20] NestedVMX support Eddie Dong
                   ` (16 preceding siblings ...)
  2011-06-02  8:57 ` [PATCH 17 of 20] VM exit handler of n2-guest Eddie Dong
@ 2011-06-02  8:57 ` Eddie Dong
  2011-06-02  8:57 ` [PATCH 19 of 20] Add VMXE bits in virtual CR4 Eddie Dong
                   ` (2 subsequent siblings)
  20 siblings, 0 replies; 45+ messages in thread
From: Eddie Dong @ 2011-06-02  8:57 UTC (permalink / raw)
  To: Tim.Deegan; +Cc: xen-devel

# HG changeset patch
# User Eddie Dong <eddie.dong@intel.com>
# Date 1307003601 -28800
# Node ID 0cedbe9214c1632a0f1816d8b6d7442dc5f40065
# Parent  24d4d7d3e4c44c8dc61f464bca9aae57480dfe75
Lazy FPU for n2 guest

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>

diff -r 24d4d7d3e4c4 -r 0cedbe9214c1 xen/arch/x86/hvm/vmx/vvmx.c
--- a/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:21 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:21 2011 +0800
@@ -842,6 +842,9 @@ static void virtual_vmentry(struct cpu_u
     regs->rsp = __get_vvmcs(vvmcs, GUEST_RSP);
     regs->rflags = __get_vvmcs(vvmcs, GUEST_RFLAGS);
 
+    /* updating host cr0 to sync TS bit */
+    __vmwrite(HOST_CR0, v->arch.hvm_vmx.host_cr0);
+
     /* TODO: EPT_POINTER */
 }
 
@@ -990,6 +993,9 @@ static void virtual_vmexit(struct cpu_us
     regs->rsp = __get_vvmcs(nvcpu->nv_vvmcx, HOST_RSP);
     regs->rflags = __vmread(GUEST_RFLAGS);
 
+    /* updating host cr0 to sync TS bit */
+    __vmwrite(HOST_CR0, v->arch.hvm_vmx.host_cr0);
+
     vmreturn(regs, VMSUCCEED);
 }
 
@@ -1319,13 +1325,18 @@ int nvmx_n2_vmexit_handler(struct cpu_us
 
         /*
          * Decided by the L0 and L1 exception bitmaps: if the vector is
-         * set in both, L0 has priority on #PF, L1 has priority on others.
+         * set in both, L0 has priority on #PF and #NM, L1 has priority on others.
          */
         if ( vector == TRAP_page_fault )
         {
             if ( paging_mode_hap(v->domain) )
                 nvcpu->nv_vmexit_pending = 1;
         }
+        else if ( vector == TRAP_no_device )
+        {
+            if ( v->fpu_dirtied )
+                nvcpu->nv_vmexit_pending = 1;
+        }
         else if ( (intr_info & valid_mask) == valid_mask )
         {
             exec_bitmap =__get_vvmcs(nvcpu->nv_vvmcx, EXCEPTION_BITMAP);
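
For illustration only: a user-space model of lazy FPU switching driven by a
TS-style flag, which is the mechanism the HOST_CR0 sync and the #NM routing
above keep working across the n1/n2 switch. The structure is a hypothetical
stand-in, not Xen's vcpu state:

#include <stdio.h>

struct cpu { int ts; int fpu_loaded; };      /* CR0.TS analogue */

static void nm_fault(struct cpu *c)          /* what the lazy #NM handler does */
{
    c->ts = 0;                               /* clts */
    c->fpu_loaded = 1;                       /* restore the guest FPU state */
}

static void use_fpu(struct cpu *c)
{
    if ( c->ts )
        nm_fault(c);                         /* first touch traps, then proceeds */
}

int main(void)
{
    struct cpu c = { 1, 0 };                 /* TS set: FPU state not loaded yet */

    use_fpu(&c);
    printf("fpu_loaded=%d ts=%d\n", c.fpu_loaded, c.ts);
    return 0;
}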

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 19 of 20] Add VMXE bits in virtual CR4
  2011-06-02  8:57 [PATCH 00 of 20] NestedVMX support Eddie Dong
                   ` (17 preceding siblings ...)
  2011-06-02  8:57 ` [PATCH 18 of 20] Lazy FPU for n2 guest Eddie Dong
@ 2011-06-02  8:57 ` Eddie Dong
  2011-06-02 15:01   ` Tim Deegan
  2011-06-02  8:57 ` [PATCH 20 of 20] n2 MSR handling and capability exposure Eddie Dong
  2011-06-02 14:33 ` [PATCH 00 of 20] NestedVMX support Tim Deegan
  20 siblings, 1 reply; 45+ messages in thread
From: Eddie Dong @ 2011-06-02  8:57 UTC (permalink / raw)
  To: Tim.Deegan; +Cc: xen-devel

# HG changeset patch
# User Eddie Dong <eddie.dong@intel.com>
# Date 1307003601 -28800
# Node ID c046b25135205ff58c0b729c0b94cd920cdbb7e2
# Parent  0cedbe9214c1632a0f1816d8b6d7442dc5f40065
Add VMXE bits in virtual CR4

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>

diff -r 0cedbe9214c1 -r c046b2513520 xen/include/asm-x86/cpufeature.h
--- a/xen/include/asm-x86/cpufeature.h	Thu Jun 02 16:33:21 2011 +0800
+++ b/xen/include/asm-x86/cpufeature.h	Thu Jun 02 16:33:21 2011 +0800
@@ -216,6 +216,8 @@
 
 #define cpu_has_svm		boot_cpu_has(X86_FEATURE_SVM)
 
+#define cpu_has_vmx		boot_cpu_has(X86_FEATURE_VMXE)
+
 #endif /* __ASM_I386_CPUFEATURE_H */
 
 /* 
diff -r 0cedbe9214c1 -r c046b2513520 xen/include/asm-x86/hvm/hvm.h
--- a/xen/include/asm-x86/hvm/hvm.h	Thu Jun 02 16:33:21 2011 +0800
+++ b/xen/include/asm-x86/hvm/hvm.h	Thu Jun 02 16:33:21 2011 +0800
@@ -313,6 +313,8 @@ static inline int hvm_do_pmu_interrupt(s
         X86_CR4_DE  | X86_CR4_PSE | X86_CR4_PAE |       \
         X86_CR4_MCE | X86_CR4_PGE | X86_CR4_PCE |       \
         X86_CR4_OSFXSR | X86_CR4_OSXMMEXCPT |           \
+	((nestedhvm_enabled((_v)->domain) &&            \
+          cpu_has_vmx) ? X86_CR4_VMXE : 0)  |       	\
         (xsave_enabled(_v) ? X86_CR4_OSXSAVE : 0))))
 
 /* These exceptions must always be intercepted. */

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 20 of 20] n2 MSR handling and capability exposure
  2011-06-02  8:57 [PATCH 00 of 20] NestedVMX support Eddie Dong
                   ` (18 preceding siblings ...)
  2011-06-02  8:57 ` [PATCH 19 of 20] Add VMXE bits in virtual CR4 Eddie Dong
@ 2011-06-02  8:57 ` Eddie Dong
  2011-06-02 15:07   ` Tim Deegan
  2011-06-02 14:33 ` [PATCH 00 of 20] NestedVMX support Tim Deegan
  20 siblings, 1 reply; 45+ messages in thread
From: Eddie Dong @ 2011-06-02  8:57 UTC (permalink / raw)
  To: Tim.Deegan; +Cc: xen-devel

# HG changeset patch
# User Eddie Dong <eddie.dong@intel.com>
# Date 1307003601 -28800
# Node ID ee55fa0471a6b72569b567286ae264bc1dcdbb4b
# Parent  c046b25135205ff58c0b729c0b94cd920cdbb7e2
n2 MSR handling and capability exposure

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>

diff -r c046b2513520 -r ee55fa0471a6 xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:21 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 02 16:33:21 2011 +0800
@@ -1778,8 +1778,11 @@ static int vmx_msr_read_intercept(unsign
         *msr_content |= (u64)__vmread(GUEST_IA32_DEBUGCTL_HIGH) << 32;
 #endif
         break;
-    case MSR_IA32_VMX_BASIC...MSR_IA32_VMX_PROCBASED_CTLS2:
-        goto gp_fault;
+    case IA32_FEATURE_CONTROL_MSR:
+    case MSR_IA32_VMX_BASIC...MSR_IA32_VMX_TRUE_ENTRY_CTLS:
+        if ( !nvmx_msr_read_intercept(msr, msr_content) )
+            goto gp_fault;
+        break;
     case MSR_IA32_MISC_ENABLE:
         rdmsrl(MSR_IA32_MISC_ENABLE, *msr_content);
         /* Debug Trace Store is not supported. */
@@ -1940,8 +1943,11 @@ static int vmx_msr_write_intercept(unsig
 
         break;
     }
-    case MSR_IA32_VMX_BASIC...MSR_IA32_VMX_PROCBASED_CTLS2:
-        goto gp_fault;
+    case IA32_FEATURE_CONTROL_MSR:
+    case MSR_IA32_VMX_BASIC...MSR_IA32_VMX_TRUE_ENTRY_CTLS:
+        if ( !nvmx_msr_write_intercept(msr, msr_content) )
+            goto gp_fault;
+        break;
     default:
         if ( vpmu_do_wrmsr(msr, msr_content) )
             return X86EMUL_OKAY;
diff -r c046b2513520 -r ee55fa0471a6 xen/arch/x86/hvm/vmx/vvmx.c
--- a/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:21 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:21 2011 +0800
@@ -1256,6 +1256,94 @@ int nvmx_handle_vmwrite(struct cpu_user_
     return X86EMUL_OKAY;
 }
 
+/*
+ * Capability reporting
+ */
+int nvmx_msr_read_intercept(unsigned int msr, u64 *msr_content)
+{
+    u32 eax, edx;
+    u64 data = 0;
+    int r = 1;
+    u32 mask = 0;
+
+    if ( !nestedhvm_enabled(current->domain) )
+        return 0;
+
+    switch (msr) {
+    case MSR_IA32_VMX_BASIC:
+        rdmsr(msr, eax, edx);
+        data = edx;
+        data = (data & ~0x1fff) | 0x1000;     /* request 4KB for guest VMCS */
+        data &= ~(1 << 23);                   /* disable TRUE_xxx_CTLS */
+        data = (data << 32) | VVMCS_REVISION; /* VVMCS revision */
+        break;
+    case MSR_IA32_VMX_PINBASED_CTLS:
+#define REMOVED_PIN_CONTROL_CAP (PIN_BASED_PREEMPT_TIMER)
+        rdmsr(msr, eax, edx);
+        data = edx;
+        data = (data << 32) | eax;
+        break;
+    case MSR_IA32_VMX_PROCBASED_CTLS:
+        rdmsr(msr, eax, edx);
+#define REMOVED_EXEC_CONTROL_CAP (CPU_BASED_TPR_SHADOW \
+            | CPU_BASED_ACTIVATE_MSR_BITMAP            \
+            | CPU_BASED_ACTIVATE_SECONDARY_CONTROLS)
+        data = edx & ~REMOVED_EXEC_CONTROL_CAP;
+        data = (data << 32) | eax;
+        break;
+    case MSR_IA32_VMX_EXIT_CTLS:
+        rdmsr(msr, eax, edx);
+#define REMOVED_EXIT_CONTROL_CAP (VM_EXIT_SAVE_GUEST_PAT \
+            | VM_EXIT_LOAD_HOST_PAT                      \
+            | VM_EXIT_SAVE_GUEST_EFER                    \
+            | VM_EXIT_LOAD_HOST_EFER                     \
+            | VM_EXIT_SAVE_PREEMPT_TIMER)
+        data = edx & ~REMOVED_EXIT_CONTROL_CAP;
+        data = (data << 32) | eax;
+        break;
+    case MSR_IA32_VMX_ENTRY_CTLS:
+        rdmsr(msr, eax, edx);
+#define REMOVED_ENTRY_CONTROL_CAP (VM_ENTRY_LOAD_GUEST_PAT \
+            | VM_ENTRY_LOAD_GUEST_EFER)
+        data = edx & ~REMOVED_ENTRY_CONTROL_CAP;
+        data = (data << 32) | eax;
+        break;
+    case MSR_IA32_VMX_PROCBASED_CTLS2:
+        mask = 0;
+
+        rdmsr(msr, eax, edx);
+        data = edx & mask;
+        data = (data << 32) | eax;
+        break;
+
+    /* pass through MSRs */
+    case IA32_FEATURE_CONTROL_MSR:
+    case MSR_IA32_VMX_MISC:
+    case MSR_IA32_VMX_CR0_FIXED0:
+    case MSR_IA32_VMX_CR0_FIXED1:
+    case MSR_IA32_VMX_CR4_FIXED0:
+    case MSR_IA32_VMX_CR4_FIXED1:
+    case MSR_IA32_VMX_VMCS_ENUM:
+        rdmsr(msr, eax, edx);
+        data = edx;
+        data = (data << 32) | eax;
+        break;
+
+    default:
+        r = 0;
+        break;
+    }
+
+    *msr_content = data;
+    return r;
+}
+
+int nvmx_msr_write_intercept(unsigned int msr, u64 msr_content)
+{
+    /* silently ignore for now */
+    return 1;
+}
+
 void nvmx_idtv_handling(void)
 {
     struct vcpu *v = current;
diff -r c046b2513520 -r ee55fa0471a6 xen/include/asm-x86/hvm/vmx/vvmx.h
--- a/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:21 2011 +0800
+++ b/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 02 16:33:21 2011 +0800
@@ -163,6 +163,10 @@ int nvmx_handle_vmread(struct cpu_user_r
 int nvmx_handle_vmwrite(struct cpu_user_regs *regs);
 int nvmx_handle_vmresume(struct cpu_user_regs *regs);
 int nvmx_handle_vmlaunch(struct cpu_user_regs *regs);
+int nvmx_msr_read_intercept(unsigned int msr,
+                                u64 *msr_content);
+int nvmx_msr_write_intercept(unsigned int msr,
+                                 u64 msr_content);
 
 void nvmx_update_exec_control(struct vcpu *v, unsigned long value);
 void nvmx_update_secondary_exec_control(struct vcpu *v,

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 00 of 20] NestedVMX support
  2011-06-02  8:57 [PATCH 00 of 20] NestedVMX support Eddie Dong
                   ` (19 preceding siblings ...)
  2011-06-02  8:57 ` [PATCH 20 of 20] n2 MSR handling and capability exposure Eddie Dong
@ 2011-06-02 14:33 ` Tim Deegan
  2011-06-03  5:47   ` Dong, Eddie
  20 siblings, 1 reply; 45+ messages in thread
From: Tim Deegan @ 2011-06-02 14:33 UTC (permalink / raw)
  To: Eddie Dong; +Cc: xen-devel

Hi, 

Thanks for these patches.  They look pretty good; I have a few comments
on the individual patches that I'll post separately.

Overall the only worry I have is the number of TODOs left at the end of
the series.  Some of them are obviously only important when you come to
do the nested EPT work.  I'd appreciate a comment on whether you think
any of these is important:

+static int nvmx_intr_intercept(struct vcpu *v, struct hvm_intack
intack)
+{
+    u32 exit_ctrl;
+
+    /*
+     * TODO:
+     *   - if L1 intr-window exiting == 0
+     *   - vNMI
+     */


+static int decode_vmx_inst(struct cpu_user_regs *regs,
+                           struct vmx_inst_decoded *decode,
+                           unsigned long *poperandS, int vmxon_check)
+{
[...]
+        /* TODO: segment type check */

This one, at least, I think does need to be fixed!


+static void load_shadow_control(struct vcpu *v)
+{
+    /* TODO: Make sure the shadow control doesn't set the bits 
+     * L0 VMM doesn't handle.
+     */


+int nvmx_handle_vmlaunch(struct cpu_user_regs *regs)
+{
+    /* TODO: check for initial launch/resume */
+    return nvmx_handle_vmresume(regs);
+}


+void nvmx_idtv_handling(void)
+{
[...]
+    /* TODO: NMI */
+}


+static void load_shadow_guest_state(struct vcpu *v)
+{
[...]
+    /* XXX: should refer to GUEST_HOST_MASK of both L0 and L1 */


Cheers,

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 05 of 20] Emulation of guest VMXON/OFF instruction
  2011-06-02  8:57 ` [PATCH 05 of 20] Emulation of guest VMXON/OFF instruction Eddie Dong
@ 2011-06-02 14:36   ` Tim Deegan
  2011-06-03  5:54     ` Dong, Eddie
  0 siblings, 1 reply; 45+ messages in thread
From: Tim Deegan @ 2011-06-02 14:36 UTC (permalink / raw)
  To: Eddie Dong; +Cc: xen-devel

At 16:57 +0800 on 02 Jun (1307033838), Eddie Dong wrote:
> diff -r 4e094881883f -r c8812151acfd xen/arch/x86/hvm/vmx/Makefile
> --- a/xen/arch/x86/hvm/vmx/Makefile	Thu Jun 02 16:33:20 2011 +0800
> +++ b/xen/arch/x86/hvm/vmx/Makefile	Thu Jun 02 16:33:20 2011 +0800
> @@ -5,3 +5,4 @@ obj-y += vmcs.o
>  obj-y += vmx.o
>  obj-y += vpmu_core2.o
>  obj-y += vvmx.o
> +obj-y += vvmx.o

Harmless, but wrong. :)

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 07 of 20] Emulation of guest vmptrld
  2011-06-02  8:57 ` [PATCH 07 of 20] Emulation of guest vmptrld Eddie Dong
@ 2011-06-02 14:45   ` Tim Deegan
  2011-06-03  6:07     ` Dong, Eddie
  0 siblings, 1 reply; 45+ messages in thread
From: Tim Deegan @ 2011-06-02 14:45 UTC (permalink / raw)
  To: Eddie Dong; +Cc: xen-devel

At 16:57 +0800 on 02 Jun (1307033840), Eddie Dong wrote:
> diff -r 8264b01b476b -r 4dad232d7fc3 xen/arch/x86/hvm/vmx/vvmx.c
> --- a/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:20 2011 +0800
> +++ b/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 02 16:33:20 2011 +0800
> @@ -356,6 +356,41 @@ static void vmreturn(struct cpu_user_reg
>      regs->eflags = eflags;
>  }
>  
> +static void __map_io_bitmap(struct vcpu *v, u64 vmcs_reg)
> +{
> +    struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
> +    unsigned long gpa;
> +    unsigned long mfn;
> +    p2m_type_t p2mt;
> +
> +    if ( vmcs_reg == IO_BITMAP_A )
> +    {
> +        if (nvmx->iobitmap[0]) {
> +            unmap_domain_page_global(nvmx->iobitmap[0]);
> +        }
> +        gpa = __get_vvmcs(vcpu_nestedhvm(v).nv_vvmcx, IO_BITMAP_A);
> +        mfn = mfn_x(gfn_to_mfn(p2m_get_hostp2m(v->domain),
> +                              gpa >> PAGE_SHIFT, &p2mt));
> +        nvmx->iobitmap[0] = map_domain_page_global(mfn);

Why are these maps _global?  It might be OK to use 2 more global
mappings per VCPU but the reason should probably go in a comment beside
the call.

Also, I don't see where these mappings get torn down on domain
destruction. 

(While I'm looking at this code, this function is quite ugly.  Why have
a single function if you're going to duplicate its contents anyway?)

> +    }
> +    else if ( vmcs_reg == IO_BITMAP_B )
> +    {
> +        if (nvmx->iobitmap[1]) {
> +            unmap_domain_page_global(nvmx->iobitmap[1]);
> +        }
> +        gpa = __get_vvmcs(vcpu_nestedhvm(v).nv_vvmcx, IO_BITMAP_B);
> +        mfn = mfn_x(gfn_to_mfn(p2m_get_hostp2m(v->domain),
> +                               gpa >> PAGE_SHIFT, &p2mt));
> +        nvmx->iobitmap[1] = map_domain_page_global(mfn);
> +    }
> +}
> +
> +static inline void map_io_bitmap_all(struct vcpu *v)
> +{
> +   __map_io_bitmap (v, IO_BITMAP_A);
> +   __map_io_bitmap (v, IO_BITMAP_B);
> +}
> +
>  /*
>   * VMX instructions handling
>   */
> @@ -364,6 +399,7 @@ int nvmx_handle_vmxon(struct cpu_user_re
>  {
>      struct vcpu *v=current;
>      struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
> +    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
>      struct vmx_inst_decoded decode;
>      unsigned long gpa = 0;
>      int rc;
> @@ -372,7 +408,22 @@ int nvmx_handle_vmxon(struct cpu_user_re
>      if ( rc != X86EMUL_OKAY )
>          return rc;
>  
> +    if ( nvmx->vmxon_region_pa )
> +        gdprintk(XENLOG_WARNING, 
> +                 "vmxon again: orig %lx new %lx\n",
> +                 nvmx->vmxon_region_pa, gpa);
> +
>      nvmx->vmxon_region_pa = gpa;
> +
> +    /*
> +     * `fork' the host vmcs to shadow_vmcs
> +     * vmcs_lock is not needed since we are on current
> +     */
> +    nvcpu->nv_n1vmcx = v->arch.hvm_vmx.vmcs;
> +    __vmpclear(virt_to_maddr(v->arch.hvm_vmx.vmcs));
> +    memcpy(nvcpu->nv_n2vmcx, v->arch.hvm_vmx.vmcs, PAGE_SIZE);
> +    __vmptrld(virt_to_maddr(v->arch.hvm_vmx.vmcs));
> +    v->arch.hvm_vmx.launched = 0;
>      vmreturn(regs, VMSUCCEED);
>  
>      return X86EMUL_OKAY;
> @@ -394,3 +445,38 @@ int nvmx_handle_vmxoff(struct cpu_user_r
>      return X86EMUL_OKAY;
>  }
>  
> +int nvmx_handle_vmptrld(struct cpu_user_regs *regs)
> +{
> +    struct vcpu *v = current;
> +    struct vmx_inst_decoded decode;
> +    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
> +    unsigned long gpa = 0;
> +    unsigned long mfn;
> +    p2m_type_t p2mt;
> +    int rc;
> +
> +    rc = decode_vmx_inst(regs, &decode, &gpa, 0);
> +    if ( rc != X86EMUL_OKAY )
> +        return rc;
> +
> +    if ( gpa == vcpu_2_nvmx(v).vmxon_region_pa || gpa & 0xfff )
> +    {
> +        vmreturn(regs, VMFAIL_INVALID);
> +        goto out;
> +    }
> +
> +    if ( nvcpu->nv_vvmcxaddr == VMCX_EADDR )
> +    {
> +        mfn = mfn_x(gfn_to_mfn(p2m_get_hostp2m(v->domain),
> +                               gpa >> PAGE_SHIFT, &p2mt));
> +        nvcpu->nv_vvmcx = map_domain_page_global(mfn);

Again, why _global?

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 12 of 20] Add APIs to switch n1/n2 VMCS
  2011-06-02  8:57 ` [PATCH 12 of 20] Add APIs to switch n1/n2 VMCS Eddie Dong
@ 2011-06-02 14:50   ` Tim Deegan
  2011-06-03  7:30     ` Dong, Eddie
  0 siblings, 1 reply; 45+ messages in thread
From: Tim Deegan @ 2011-06-02 14:50 UTC (permalink / raw)
  To: Eddie Dong; +Cc: xen-devel

At 16:57 +0800 on 02 Jun (1307033845), Eddie Dong wrote:
> diff -r 4631a9511200 -r 62cc6c7516e0 xen/arch/x86/hvm/vmx/vmcs.c
> --- a/xen/arch/x86/hvm/vmx/vmcs.c	Thu Jun 02 16:33:20 2011 +0800
> +++ b/xen/arch/x86/hvm/vmx/vmcs.c	Thu Jun 02 16:33:21 2011 +0800
> @@ -669,6 +669,38 @@ void vmx_disable_intercept_for_msr(struc
>      }
>  }
>  
> +/*
> + * Switch VMCS between layer 1 & 2 guest
> + */
> +void vmx_vmcs_switch(struct vcpu *v,
> +                             struct vmcs_struct *from,
> +                             struct vmcs_struct *to)
> +{
> +    /* no foreign access */
> +    if ( unlikely(v != current) )
> +        return;
> +
> +    if ( unlikely(current->arch.hvm_vmx.vmcs != from) )
> +        return;

Do you really want this function to fail silently if called with v !=
current?  Use ASSERT(), or, even better, remove the first argument
entirely.
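
Something like this is what I have in mind (just a sketch, not a tested
patch), dropping the vcpu argument and making the precondition explicit:

void vmx_vmcs_switch(struct vmcs_struct *from, struct vmcs_struct *to)
{
    struct vcpu *v = current;

    /* Only the current vcpu may switch its own VMCS. */
    ASSERT(v->arch.hvm_vmx.vmcs == from);

    /* ... rest of the switching logic unchanged ... */
}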

Cheers,

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 15 of 20] Switch shadow/virtual VMCS between n1/n2 guests
  2011-06-02  8:57 ` [PATCH 15 of 20] Switch shadow/virtual VMCS between n1/n2 guests Eddie Dong
@ 2011-06-02 14:56   ` Tim Deegan
  2011-06-03  7:57     ` Dong, Eddie
  2011-06-02 14:58   ` Tim Deegan
  1 sibling, 1 reply; 45+ messages in thread
From: Tim Deegan @ 2011-06-02 14:56 UTC (permalink / raw)
  To: Eddie Dong; +Cc: xen-devel

At 16:57 +0800 on 02 Jun (1307033848), Eddie Dong wrote:
> +static void nvmx_update_exit_control(struct vcpu *v,
> +					unsigned long host_cntrl)
> +{
> +    u32 shadow_cntrl;
> +    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
> +
> +#define REMOVED_EXIT_CONTROL_BITS    ((1<<2) |           \

Define a macro for whatever 1<<2 means here, please. 

> +                (VM_EXIT_SAVE_GUEST_PAT) |               \
> +                (VM_EXIT_SAVE_GUEST_EFER) |              \
> +                (VM_EXIT_SAVE_PREEMPT_TIMER))
> +    shadow_cntrl = __get_vvmcs(nvcpu->nv_vvmcx, VM_EXIT_CONTROLS);
> +    shadow_cntrl &= ~REMOVED_EXIT_CONTROL_BITS;
> +    shadow_cntrl |= host_cntrl;
> +    __vmwrite(VM_EXIT_CONTROLS, shadow_cntrl);
> +}
[...]
> +static void sync_vvmcs_guest_state(struct vcpu *v, struct cpu_user_regs *regs)
> +{
> +    int i;
> +    unsigned long mask;
> +    unsigned long cr;
> +    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
> +    void *vvmcs = nvcpu->nv_vvmcx;
> +
> +    /* copy shadow vmcs.gstate back to vvmcs.gstate */
> +    for ( i = 0; i < ARRAY_SIZE(vmcs_gstate_field); i++ )
> +        shadow_to_vvmcs(vvmcs, vmcs_gstate_field[i]);
> +    /* RIP, RSP are in user regs */
> +    __set_vvmcs(vvmcs, GUEST_RIP, regs->rip);
> +    __set_vvmcs(vvmcs, GUEST_RSP, regs->rsp);
> +
> +    /* SDM 20.6.6: L2 guest execution may change GUEST CR0/CR4 */
> +    mask = __get_vvmcs(vvmcs, CR0_GUEST_HOST_MASK);
> +    if ( ~mask )
> +    {
> +        cr = __get_vvmcs(vvmcs, GUEST_CR0);
> +        cr = (cr & mask) | (__vmread(GUEST_CR4) & ~mask);

Cut-n-paste error?                      ^^^^^^^^^

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 15 of 20] Switch shadow/virtual VMCS between n1/n2 guests
  2011-06-02  8:57 ` [PATCH 15 of 20] Switch shadow/virtual VMCS between n1/n2 guests Eddie Dong
  2011-06-02 14:56   ` Tim Deegan
@ 2011-06-02 14:58   ` Tim Deegan
  1 sibling, 0 replies; 45+ messages in thread
From: Tim Deegan @ 2011-06-02 14:58 UTC (permalink / raw)
  To: Eddie Dong; +Cc: xen-devel

Hi,

> +asmlinkage void nvmx_switch_guest(void)
> +{
> +    struct vcpu *v = current;
> +    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
> +    struct cpu_user_regs *regs = guest_cpu_user_regs();
> +
> +    /*
> +     * a softirq may interrupt us between a virtual vmentry is
> +     * just handled and the true vmentry. If during this window,
> +     * a L1 virtual interrupt causes another virtual vmexit, we
> +     * cannot let that happen or VM_ENTRY_INTR_INFO will be lost.
> +     */
> +    if ( unlikely(nvcpu->nv_vmswitch_in_progress) )
> +        return;
> +
> +    if ( nestedhvm_vcpu_in_guestmode(v) && nvcpu->nv_vmexit_pending )
> +    {
> +        local_irq_enable();

Why?  Is this function ever called with interrupts disabled?  And if
so, will its caller deal with having them enabled when it exits?

> +        virtual_vmexit(regs);
> +    }
> +    else if ( !nestedhvm_vcpu_in_guestmode(v) && nvcpu->nv_vmentry_pending )
> +    {
> +        local_irq_enable();

ditto.

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 17 of 20] VM exit handler of n2-guest
  2011-06-02  8:57 ` [PATCH 17 of 20] VM exit handler of n2-guest Eddie Dong
@ 2011-06-02 14:59   ` Tim Deegan
  2011-06-03  8:06     ` Dong, Eddie
  0 siblings, 1 reply; 45+ messages in thread
From: Tim Deegan @ 2011-06-02 14:59 UTC (permalink / raw)
  To: Eddie Dong; +Cc: xen-devel

At 16:57 +0800 on 02 Jun (1307033850), Eddie Dong wrote:
> +    case EXIT_REASON_WBINVD:
> +    case EXIT_REASON_EPT_VIOLATION:
> +    case EXIT_REASON_EPT_MISCONFIG:
> +    case EXIT_REASON_EXTERNAL_INTERRUPT:
> +        /* pass to L0 handler */
> +        break;

If the L1 guest asked to intercept WBINVD, will it ever get the VMEXIT?
I didn't see any code in the L0 WBINVD handler to pass it on.

Cheers,

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 19 of 20] Add VMXE bits in virtual CR4
  2011-06-02  8:57 ` [PATCH 19 of 20] Add VMXE bits in virtual CR4 Eddie Dong
@ 2011-06-02 15:01   ` Tim Deegan
  2011-06-03  8:12     ` Dong, Eddie
  0 siblings, 1 reply; 45+ messages in thread
From: Tim Deegan @ 2011-06-02 15:01 UTC (permalink / raw)
  To: Eddie Dong; +Cc: xen-devel

At 16:57 +0800 on 02 Jun (1307033852), Eddie Dong wrote:
> diff -r 0cedbe9214c1 -r c046b2513520 xen/include/asm-x86/hvm/hvm.h
> --- a/xen/include/asm-x86/hvm/hvm.h	Thu Jun 02 16:33:21 2011 +0800
> +++ b/xen/include/asm-x86/hvm/hvm.h	Thu Jun 02 16:33:21 2011 +0800
> @@ -313,6 +313,8 @@ static inline int hvm_do_pmu_interrupt(s
>          X86_CR4_DE  | X86_CR4_PSE | X86_CR4_PAE |       \
>          X86_CR4_MCE | X86_CR4_PGE | X86_CR4_PCE |       \
>          X86_CR4_OSFXSR | X86_CR4_OSXMMEXCPT |           \
> +	((nestedhvm_enabled((_v)->domain) &&            \
> +          cpu_has_vmx) ? X86_CR4_VMXE : 0)  |       	\

Should we also add VMXE to this mask even if !nestedhvm_enabled()?

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 20 of 20] n2 MSR handling and capability exposure
  2011-06-02  8:57 ` [PATCH 20 of 20] n2 MSR handling and capability exposure Eddie Dong
@ 2011-06-02 15:07   ` Tim Deegan
  2011-06-02 15:11     ` Tim Deegan
  2011-06-03  8:25     ` Dong, Eddie
  0 siblings, 2 replies; 45+ messages in thread
From: Tim Deegan @ 2011-06-02 15:07 UTC (permalink / raw)
  To: Eddie Dong; +Cc: xen-devel

At 16:57 +0800 on 02 Jun (1307033853), Eddie Dong wrote:
> +    case MSR_IA32_VMX_PINBASED_CTLS:
> +#define REMOVED_PIN_CONTROL_CAP (PIN_BASED_PREEMPT_TIMER)
> +        rdmsr(msr, eax, edx);
> +        data = edx;
> +        data = (data << 32) | eax;
> +        break;

You don't actually mask the value here. 
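
Presumably you meant to apply it here too, as is done for the other
control MSRs below; i.e. something like (sketch only):

    case MSR_IA32_VMX_PINBASED_CTLS:
#define REMOVED_PIN_CONTROL_CAP (PIN_BASED_PREEMPT_TIMER)
        rdmsr(msr, eax, edx);
        data = edx & ~REMOVED_PIN_CONTROL_CAP;  /* actually apply the mask */
        data = (data << 32) | eax;
        break;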

BTW, I don't really like defining all these REMOVED_* macros, each
of which is used only once a few lines from the definition (here and
elsewhere in the series).  It just adds clutter for no benefit.

Tim.

> +    case MSR_IA32_VMX_PROCBASED_CTLS:
> +        rdmsr(msr, eax, edx);
> +#define REMOVED_EXEC_CONTROL_CAP (CPU_BASED_TPR_SHADOW \
> +            | CPU_BASED_ACTIVATE_MSR_BITMAP            \
> +            | CPU_BASED_ACTIVATE_SECONDARY_CONTROLS)
> +        data = edx & ~REMOVED_EXEC_CONTROL_CAP;
> +        data = (data << 32) | eax;
> +        break;
> +    case MSR_IA32_VMX_EXIT_CTLS:
> +        rdmsr(msr, eax, edx);
> +#define REMOVED_EXIT_CONTROL_CAP (VM_EXIT_SAVE_GUEST_PAT \
> +            | VM_EXIT_LOAD_HOST_PAT                      \
> +            | VM_EXIT_SAVE_GUEST_EFER                    \
> +            | VM_EXIT_LOAD_HOST_EFER                     \
> +            | VM_EXIT_SAVE_PREEMPT_TIMER)
> +        data = edx & ~REMOVED_EXIT_CONTROL_CAP;
> +        data = (data << 32) | eax;
> +        break;
> +    case MSR_IA32_VMX_ENTRY_CTLS:
> +        rdmsr(msr, eax, edx);
> +#define REMOVED_ENTRY_CONTROL_CAP (VM_ENTRY_LOAD_GUEST_PAT \
> +            | VM_ENTRY_LOAD_GUEST_EFER)
> +        data = edx & ~REMOVED_ENTRY_CONTROL_CAP;
> +        data = (data << 32) | eax;
> +        break;
> +    case MSR_IA32_VMX_PROCBASED_CTLS2:
> +        mask = 0;
> +
> +        rdmsr(msr, eax, edx);
> +        data = edx & mask;
> +        data = (data << 32) | eax;
> +        break;
> +

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 20 of 20] n2 MSR handling and capability exposure
  2011-06-02 15:07   ` Tim Deegan
@ 2011-06-02 15:11     ` Tim Deegan
  2011-06-02 19:20       ` Keir Fraser
  2011-06-03  8:39       ` Dong, Eddie
  2011-06-03  8:25     ` Dong, Eddie
  1 sibling, 2 replies; 45+ messages in thread
From: Tim Deegan @ 2011-06-02 15:11 UTC (permalink / raw)
  To: Eddie Dong; +Cc: xen-devel

At 16:07 +0100 on 02 Jun (1307030872), Tim Deegan wrote:
> At 16:57 +0800 on 02 Jun (1307033853), Eddie Dong wrote:
> > +    case MSR_IA32_VMX_PINBASED_CTLS:
> > +#define REMOVED_PIN_CONTROL_CAP (PIN_BASED_PREEMPT_TIMER)
> > +        rdmsr(msr, eax, edx);
> > +        data = edx;
> > +        data = (data << 32) | eax;
> > +        break;
> 
> You don't actually mask the value here. 
> 
> BTW, I don't really like defining all these REMOVED_* macros, each
> of which is used only once a few lines from the definition (here and
> elsewhere in the series).  It just adds clutter for no benefit.
> 

Oh, I forgot to say: will this feature-blacklisting work over live
migration to a machine with a different CPU?  There isn't an equivalent
of the CPUID masking feature to make all the machines in a cluster seem
to have the same VMX features. 

Elsewhere we use whitelisting for passing hardware capability flags to
HVM guests; I think we should use whitelists here too. 
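
For instance (illustrative sketch only; ALLOWED_PIN_CONTROLS is a
made-up name for whatever set of bits this series actually knows how to
virtualize), the read handler could be structured as:

    case MSR_IA32_VMX_PINBASED_CTLS:
        rdmsr(msr, eax, edx);
        /* Whitelist: expose only the bits we explicitly emulate for L1,
         * regardless of what the host CPU happens to report. */
        data = edx & ALLOWED_PIN_CONTROLS;
        data = (data << 32) | eax;
        break;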

Cheers,

Tim.

 
> > +    case MSR_IA32_VMX_PROCBASED_CTLS:
> > +        rdmsr(msr, eax, edx);
> > +#define REMOVED_EXEC_CONTROL_CAP (CPU_BASED_TPR_SHADOW \
> > +            | CPU_BASED_ACTIVATE_MSR_BITMAP            \
> > +            | CPU_BASED_ACTIVATE_SECONDARY_CONTROLS)
> > +        data = edx & ~REMOVED_EXEC_CONTROL_CAP;
> > +        data = (data << 32) | eax;
> > +        break;
> > +    case MSR_IA32_VMX_EXIT_CTLS:
> > +        rdmsr(msr, eax, edx);
> > +#define REMOVED_EXIT_CONTROL_CAP (VM_EXIT_SAVE_GUEST_PAT \
> > +            | VM_EXIT_LOAD_HOST_PAT                      \
> > +            | VM_EXIT_SAVE_GUEST_EFER                    \
> > +            | VM_EXIT_LOAD_HOST_EFER                     \
> > +            | VM_EXIT_SAVE_PREEMPT_TIMER)
> > +        data = edx & ~REMOVED_EXIT_CONTROL_CAP;
> > +        data = (data << 32) | eax;
> > +        break;
> > +    case MSR_IA32_VMX_ENTRY_CTLS:
> > +        rdmsr(msr, eax, edx);
> > +#define REMOVED_ENTRY_CONTROL_CAP (VM_ENTRY_LOAD_GUEST_PAT \
> > +            | VM_ENTRY_LOAD_GUEST_EFER)
> > +        data = edx & ~REMOVED_ENTRY_CONTROL_CAP;
> > +        data = (data << 32) | eax;
> > +        break;
> > +    case MSR_IA32_VMX_PROCBASED_CTLS2:
> > +        mask = 0;
> > +
> > +        rdmsr(msr, eax, edx);
> > +        data = edx & mask;
> > +        data = (data << 32) | eax;
> > +        break;
> > +
> 

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 20 of 20] n2 MSR handling and capability exposure
  2011-06-02 15:11     ` Tim Deegan
@ 2011-06-02 19:20       ` Keir Fraser
  2011-06-03  8:39       ` Dong, Eddie
  1 sibling, 0 replies; 45+ messages in thread
From: Keir Fraser @ 2011-06-02 19:20 UTC (permalink / raw)
  To: Tim Deegan, Eddie Dong; +Cc: xen-devel

On 02/06/2011 16:11, "Tim Deegan" <Tim.Deegan@citrix.com> wrote:

>> BTW, I don't really like defining all these REMOVED_* macros, each
>> of which is used only once a few lines from the definition (here and
>> elsewhere in the series).  It just adds clutter for no benefit.
>> 
> 
> Oh, I forgot to say: will this feature-blacklisting work over live
> migration to a machine with a different CPU?  There isn't an equivalnet
> of the CPUID masking feature to make all the machines in a cluster seem
> to have the same VMX features.
> 
> Elsewhere we use whitelisting for passsing hardware capability flags to
> HVM guests; I think we should use whitelists here too.

Blacklists create a total mess of doom. We should absolutely disallow the
creation of any new ones. I think HVM guests are currently clean in this
regard and should stay that way.

 -- Keir

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH 00 of 20] NestedVMX support
  2011-06-02 14:33 ` [PATCH 00 of 20] NestedVMX support Tim Deegan
@ 2011-06-03  5:47   ` Dong, Eddie
  0 siblings, 0 replies; 45+ messages in thread
From: Dong, Eddie @ 2011-06-03  5:47 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel, Dong, Eddie



> -----Original Message-----
> From: Tim Deegan [mailto:Tim.Deegan@citrix.com]
> Sent: Thursday, June 02, 2011 10:34 PM
> To: Dong, Eddie
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] [PATCH 00 of 20] NestedVMX support> 
> Hi,
> 
> Thanks for these patches.  They look pretty good; I have a few comments
> on the individual patches that I'll post separately.
> 
> Overall the only worry I have is the number of TODOs left at the end of
> the series.  Some of them are obvioulsy ony important when you come to
> do the nested EPT work.  I'd appreciate a comment on whether you think
> any of these is important:
> 
> +static int nvmx_intr_intercept(struct vcpu *v, struct hvm_intack
> intack)
> +{
> +    u32 exit_ctrl;
> +
> +    /*
> +     * TODO:
> +     *   - if L1 intr-window exiting == 0
> +     *   - vNMI
> +     */
> 

Deleted.

> 
> +static int decode_vmx_inst(struct cpu_user_regs *regs,
> +                           struct vmx_inst_decoded *decode,
> +                           unsigned long *poperandS, int
> vmxon_check)
> +{
> [...]
> +        /* TODO: segment type check */
> 

Fixed.

> This one, at least, I think does need to be fixed!
> 
> 
> +static void load_shadow_control(struct vcpu *v)
> +{
> +    /* TODO: Make sure the shadow control doesn't set the bits
> +     * L0 VMM doesn't handle.
> +     */
> 
deleted
> 
> +int nvmx_handle_vmlaunch(struct cpu_user_regs *regs)
> +{
> +    /* TODO: check for initial launch/resume */
> +    return nvmx_handle_vmresume(regs);
> +}
> 

Handled w/ correct launch state.
> 
> +void nvmx_idtv_handling(void)
> +{
> [...]
> +    /* TODO: NMI */
> +}
> 
deleted
> 
> +static void load_shadow_guest_state(struct vcpu *v)
> +{
> [...]
> +    /* XXX: should refer to GUEST_HOST_MASK of both L0 and L1 */
> 
Deleted and will revisit later.

> 
> Cheers,
> 
> Tim.
> 
> --
> Tim Deegan <Tim.Deegan@citrix.com>
> Principal Software Engineer, Xen Platform Team
> Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH 05 of 20] Emulation of guest VMXON/OFF instruction
  2011-06-02 14:36   ` Tim Deegan
@ 2011-06-03  5:54     ` Dong, Eddie
  0 siblings, 0 replies; 45+ messages in thread
From: Dong, Eddie @ 2011-06-03  5:54 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel, Dong, Eddie

> >  obj-y += vpmu_core2.o
> >  obj-y += vvmx.o
> > +obj-y += vvmx.o
> 
> Harmless, but wrong. :)
> 

Thanks, that was an error introduced by a patch merge :)
Fixed.
Eddie

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH 07 of 20] Emulation of guest vmptrld
  2011-06-02 14:45   ` Tim Deegan
@ 2011-06-03  6:07     ` Dong, Eddie
  2011-06-03  8:42       ` Tim Deegan
  0 siblings, 1 reply; 45+ messages in thread
From: Dong, Eddie @ 2011-06-03  6:07 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel, Dong, Eddie

> > +    if ( vmcs_reg == IO_BITMAP_A )
> > +    {
> > +        if (nvmx->iobitmap[0]) {
> > +            unmap_domain_page_global(nvmx->iobitmap[0]);
> > +        }
> > +        gpa = __get_vvmcs(vcpu_nestedhvm(v).nv_vvmcx,
> IO_BITMAP_A);
> > +        mfn = mfn_x(gfn_to_mfn(p2m_get_hostp2m(v->domain),
> > +                              gpa >> PAGE_SHIFT, &p2mt));
> > +        nvmx->iobitmap[0] = map_domain_page_global(mfn);
> 
> Why are these maps _global?  It might be OK to use 2 more global
> mappings per VCPU but the reason should probably go in a comment beside
> the call.

Do you mean to use hvm_map_guest_frame_ro? Fine to me.
> 
> Also, I don't see where these mappings get torn down on domain
> destruction.
> 
Yes. Fixed in nvmx_vcpu_destroy.
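
(Roughly along these lines; sketch only, and the unmap call would become
the corresponding hvm_unmap_guest_frame() if we switch the mapping as
discussed above:)

    /* in nvmx_vcpu_destroy(): drop any cached I/O bitmap mappings */
    for ( i = 0; i < 2; i++ )
        if ( nvmx->iobitmap[i] )
        {
            unmap_domain_page_global(nvmx->iobitmap[i]);
            nvmx->iobitmap[i] = NULL;
        }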

> (While I'm looking at this code, this function is quite ugly.  Why have
> a single function if you're going to duplicate its contents anyway?)

??? We don't know if the guest changed the bitmap, so we have to check each time.

> 
> +
> > +    if ( nvcpu->nv_vvmcxaddr == VMCX_EADDR )
> > +    {
> > +        mfn = mfn_x(gfn_to_mfn(p2m_get_hostp2m(v->domain),
> > +                               gpa >> PAGE_SHIFT, &p2mt));
> > +        nvcpu->nv_vvmcx = map_domain_page_global(mfn);
> 
> Again, why _global?

Will fix with hvm_map_guest_frame.

Thx, Eddie

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH 12 of 20] Add APIs to switch n1/n2 VMCS
  2011-06-02 14:50   ` Tim Deegan
@ 2011-06-03  7:30     ` Dong, Eddie
  0 siblings, 0 replies; 45+ messages in thread
From: Dong, Eddie @ 2011-06-03  7:30 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel, Dong, Eddie

> > +    /* no foreign access */
> > +    if ( unlikely(v != current) )
> > +        return;
> > +
> > +    if ( unlikely(current->arch.hvm_vmx.vmcs != from) )
> > +        return;
> 
> Do you really want this function to fail silently if called with v !=
> current?  Use ASSERT(), or, even better, remove the first argument
> entirely.
> 
Deleted.

Thx, Eddie

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH 15 of 20] Switch shadow/virtual VMCS between n1/n2 guests
  2011-06-02 14:56   ` Tim Deegan
@ 2011-06-03  7:57     ` Dong, Eddie
  0 siblings, 0 replies; 45+ messages in thread
From: Dong, Eddie @ 2011-06-03  7:57 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel, Dong, Eddie


> > +    u32 shadow_cntrl;
> > +    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
> > +
> > +#define REMOVED_EXIT_CONTROL_BITS    ((1<<2) |           \
> 
> Define a macro for whatever 1<<2 means here, please.
> 

Done.

> > +                (VM_EXIT_SAVE_GUEST_PAT) |               \
> > +                (VM_EXIT_SAVE_GUEST_EFER) |              \
> > +                (VM_EXIT_SAVE_PREEMPT_TIMER))
> > +    shadow_cntrl = __get_vvmcs(nvcpu->nv_vvmcx,
> VM_EXIT_CONTROLS);
> > +    shadow_cntrl &= ~REMOVED_EXIT_CONTROL_BITS;
> > +    shadow_cntrl |= host_cntrl;
> > +    __vmwrite(VM_EXIT_CONTROLS, shadow_cntrl);
> > +}



> > +    /* SDM 20.6.6: L2 guest execution may change GUEST CR0/CR4 */
> > +    mask = __get_vvmcs(vvmcs, CR0_GUEST_HOST_MASK);
> > +    if ( ~mask )
> > +    {
> > +        cr = __get_vvmcs(vvmcs, GUEST_CR0);
> > +        cr = (cr & mask) | (__vmread(GUEST_CR4) & ~mask);
> 
> Cut-n-paste error?                      ^^^^^^^^^
> 
Oh, Yes, Thanks.
Eddie

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH 17 of 20] VM exit handler of n2-guest
  2011-06-02 14:59   ` Tim Deegan
@ 2011-06-03  8:06     ` Dong, Eddie
  2011-06-03  8:43       ` Tim Deegan
  0 siblings, 1 reply; 45+ messages in thread
From: Dong, Eddie @ 2011-06-03  8:06 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel, Dong, Eddie

> At 16:57 +0800 on 02 Jun (1307033850), Eddie Dong wrote:
> > +    case EXIT_REASON_WBINVD:
> > +    case EXIT_REASON_EPT_VIOLATION:
> > +    case EXIT_REASON_EPT_MISCONFIG:
> > +    case EXIT_REASON_EXTERNAL_INTERRUPT:
> > +        /* pass to L0 handler */
> > +        break;
> 
> If the L1 guest asked to intercept WBINVD, will it ever get the VMEXIT?
> I didn't see any code in the L0 WBINVD handler to pass it on.
> 
The current patch doesn't expose the Secondary Processor-Based VM-Execution Controls, so the WBINVD exiting capability is removed from what the L1 guest sees.

Thx, Eddie

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH 19 of 20] Add VMXE bits in virtual CR4
  2011-06-02 15:01   ` Tim Deegan
@ 2011-06-03  8:12     ` Dong, Eddie
  0 siblings, 0 replies; 45+ messages in thread
From: Dong, Eddie @ 2011-06-03  8:12 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel, Dong, Eddie

> > diff -r 0cedbe9214c1 -r c046b2513520 xen/include/asm-x86/hvm/hvm.h
> > --- a/xen/include/asm-x86/hvm/hvm.h	Thu Jun 02 16:33:21 2011 +0800
> > +++ b/xen/include/asm-x86/hvm/hvm.h	Thu Jun 02 16:33:21 2011 +0800
> > @@ -313,6 +313,8 @@ static inline int hvm_do_pmu_interrupt(s
> >          X86_CR4_DE  | X86_CR4_PSE | X86_CR4_PAE |       \
> >          X86_CR4_MCE | X86_CR4_PGE | X86_CR4_PCE |       \
> >          X86_CR4_OSFXSR | X86_CR4_OSXMMEXCPT |           \
> > +	((nestedhvm_enabled((_v)->domain) &&            \
> > +          cpu_has_vmx) ? X86_CR4_VMXE : 0)  |       	\
> 
> Should we also add VMXE to this mask even if !nestedhvm_enabled()?
> 
Fine.
Eddie

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH 20 of 20] n2 MSR handling and capability exposure
  2011-06-02 15:07   ` Tim Deegan
  2011-06-02 15:11     ` Tim Deegan
@ 2011-06-03  8:25     ` Dong, Eddie
  1 sibling, 0 replies; 45+ messages in thread
From: Dong, Eddie @ 2011-06-03  8:25 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel, Dong, Eddie

> 
> At 16:57 +0800 on 02 Jun (1307033853), Eddie Dong wrote:
> > +    case MSR_IA32_VMX_PINBASED_CTLS:
> > +#define REMOVED_PIN_CONTROL_CAP (PIN_BASED_PREEMPT_TIMER)
> > +        rdmsr(msr, eax, edx);
> > +        data = edx;
> > +        data = (data << 32) | eax;
> > +        break;
> 
> You don't actually mask the value here.

Fixed. 

> 
> BTW, I don't really like defining all these REMOVED_* macros, each
> of which is used only once a few lines from the definition (here and
> elsewhere in the series).  It just adds clutter for no benefit.

OK, removed; the masks are now written inline in the code itself.

Thx, Eddie

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH 20 of 20] n2 MSR handling and capability exposure
  2011-06-02 15:11     ` Tim Deegan
  2011-06-02 19:20       ` Keir Fraser
@ 2011-06-03  8:39       ` Dong, Eddie
  1 sibling, 0 replies; 45+ messages in thread
From: Dong, Eddie @ 2011-06-03  8:39 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel, Dong, Eddie

> 
> Oh, I forgot to say: will this feature-blacklisting work over live
> migration to a machine with a different CPU?  There isn't an equivalnet
> of the CPUID masking feature to make all the machines in a cluster seem
> to have the same VMX features.

That seems to be an issue independent of nested virtualization. We should be able to migrate among identical processors, but it is difficult to migrate an L2 guest to another machine as an L1 guest. It may be OK eventually, but it is not addressed right now.

My understanding is that the same CPUID doesn't mean exactly the same capability.

> 
> Elsewhere we use whitelisting for passsing hardware capability flags to
> HVM guests; I think we should use whitelists here too.
> 

Thx, Eddie

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 07 of 20] Emulation of guest vmptrld
  2011-06-03  6:07     ` Dong, Eddie
@ 2011-06-03  8:42       ` Tim Deegan
  2011-06-07  1:48         ` Dong, Eddie
  0 siblings, 1 reply; 45+ messages in thread
From: Tim Deegan @ 2011-06-03  8:42 UTC (permalink / raw)
  To: Dong, Eddie; +Cc: xen-devel

At 14:07 +0800 on 03 Jun (1307110060), Dong, Eddie wrote:
> > > +    if ( vmcs_reg == IO_BITMAP_A )
> > > +    {
> > > +        if (nvmx->iobitmap[0]) {
> > > +            unmap_domain_page_global(nvmx->iobitmap[0]);
> > > +        }
> > > +        gpa = __get_vvmcs(vcpu_nestedhvm(v).nv_vvmcx,
> > IO_BITMAP_A);
> > > +        mfn = mfn_x(gfn_to_mfn(p2m_get_hostp2m(v->domain),
> > > +                              gpa >> PAGE_SHIFT, &p2mt));
> > > +        nvmx->iobitmap[0] = map_domain_page_global(mfn);
> > 
> > Why are these maps _global?  It might be OK to use 2 more global
> > mappings per VCPU but the reason should probably go in a comment beside
> > the call.
> 
> Do you mean to use hvm_map_guest_frame_ro? Fine to me.

Yes, I think that would be better unless you know there's a point where
the bitmaps are accessed on a vcpu other than current.  (On 64-bit it
makes no difference but on 32-bit map_domain_page_global() uses up a
global shared resource).

> > 
> > Also, I don't see where these mappings get torn down on domain
> > destruction.
> > 
> Yes. Fixed in nvmx_vcpu_destroy.
> 
> > (While I'm looking at this code, this function is quite ugly.  Why have
> > a single function if you're going to duplicate its contents anyway?)
> 
> ??? We don't know fi guest changed the bitmap, so we have to check each time.

I think I wasn't clear.  The logic is fine, I was just cavilling about
coding style.  You have some code that's basically

f1() { BUNCH_O_CODE(1) }

f2() { BUNCH_O_CODE(2) }

and places that need to call f1(), f2() or both.  Merging those into a
single function is a good idea, but the function should look like

f(x) { 
  int i = (x ? 1 : 2)
  BUNCH_O_CODE(i)
}

and what you have is

f(x) {
  if (x) 
     BUNCH_O_CODE(1)
  else
     BUNCH_O_CODE(2)
}

which keeps the duplication. 
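
Concretely, for __map_io_bitmap() I'd expect something along these
lines (sketch only, assuming the switch to hvm_map_guest_frame_ro() and
its unmap counterpart discussed earlier in the thread):

static void __map_io_bitmap(struct vcpu *v, u64 vmcs_reg)
{
    struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
    int i = (vmcs_reg == IO_BITMAP_A) ? 0 : 1;
    unsigned long gpa;

    /* Drop any stale mapping before picking up the current gpa. */
    if ( nvmx->iobitmap[i] )
        hvm_unmap_guest_frame(nvmx->iobitmap[i]);
    gpa = __get_vvmcs(vcpu_nestedhvm(v).nv_vvmcx, vmcs_reg);
    nvmx->iobitmap[i] = hvm_map_guest_frame_ro(gpa >> PAGE_SHIFT);
}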

Cheers,

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 17 of 20] VM exit handler of n2-guest
  2011-06-03  8:06     ` Dong, Eddie
@ 2011-06-03  8:43       ` Tim Deegan
  0 siblings, 0 replies; 45+ messages in thread
From: Tim Deegan @ 2011-06-03  8:43 UTC (permalink / raw)
  To: Dong, Eddie; +Cc: xen-devel

At 16:06 +0800 on 03 Jun (1307117213), Dong, Eddie wrote:
> > At 16:57 +0800 on 02 Jun (1307033850), Eddie Dong wrote:
> > > +    case EXIT_REASON_WBINVD:
> > > +    case EXIT_REASON_EPT_VIOLATION:
> > > +    case EXIT_REASON_EPT_MISCONFIG:
> > > +    case EXIT_REASON_EXTERNAL_INTERRUPT:
> > > +        /* pass to L0 handler */
> > > +        break;
> > 
> > If the L1 guest asked to intercept WBINVD, will it ever get the VMEXIT?
> > I didn't see any code in the L0 WBINVD handler to pass it on.
> > 

> Current patch doesn't expose Secondary Processor-Based VM-Execution
> Controls. So WBINVD exiting capability is removed in L1 guest.

Ah, OK, thanks. 

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 45+ messages in thread

* RE: [PATCH 07 of 20] Emulation of guest vmptrld
  2011-06-03  8:42       ` Tim Deegan
@ 2011-06-07  1:48         ` Dong, Eddie
  0 siblings, 0 replies; 45+ messages in thread
From: Dong, Eddie @ 2011-06-07  1:48 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel, Dong, Eddie

> > > (While I'm looking at this code, this function is quite ugly.  Why have
> > > a single function if you're going to duplicate its contents anyway?)
> >
> > ??? We don't know fi guest changed the bitmap, so we have to check each
> time.
> 
> I think I wasn't clear.  The logic is fine, I was just cavilling about
> coding style.  You have some code that's basically
> 
I see. Yes, that is better; fixed.

Thx, Eddie

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 17 of 20] VM exit handler of n2-guest
  2011-06-09  8:25 [PATCH 00 of 20] Rebased Nested VMX v2 Eddie Dong
@ 2011-06-09  8:25 ` Eddie Dong
  0 siblings, 0 replies; 45+ messages in thread
From: Eddie Dong @ 2011-06-09  8:25 UTC (permalink / raw)
  To: Tim.Deegan; +Cc: xen-devel

# HG changeset patch
# User Eddie Dong <eddie.dong@intel.com>
# Date 1307607849 -28800
# Node ID 5c3ab1e07ab1c1a903660f1c48a54aa67f738a7e
# Parent  4496678bbb000792aafa7e34a14ab893f5a32b8e
VM exit handler of n2-guest

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>

diff -r 4496678bbb00 -r 5c3ab1e07ab1 xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 09 16:24:09 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vmx.c	Thu Jun 09 16:24:09 2011 +0800
@@ -943,6 +943,10 @@ static void vmx_set_segment_register(str
 static void vmx_set_tsc_offset(struct vcpu *v, u64 offset)
 {
     vmx_vmcs_enter(v);
+
+    if ( nestedhvm_vcpu_in_guestmode(v) )
+        offset += nvmx_get_tsc_offset(v);
+
     __vmwrite(TSC_OFFSET, offset);
 #if defined (__i386__)
     __vmwrite(TSC_OFFSET_HIGH, offset >> 32);
@@ -2258,6 +2262,11 @@ asmlinkage void vmx_vmexit_handler(struc
      * any pending vmresume has really happened
      */
     vcpu_nestedhvm(v).nv_vmswitch_in_progress = 0;
+    if ( nestedhvm_vcpu_in_guestmode(v) )
+    {
+        if ( nvmx_n2_vmexit_handler(regs, exit_reason) )
+            goto out;
+    }
 
     if ( unlikely(exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY) )
         return vmx_failed_vmentry(exit_reason, regs);
@@ -2655,6 +2664,7 @@ asmlinkage void vmx_vmexit_handler(struc
         break;
     }
 
+out:
     if ( nestedhvm_vcpu_in_guestmode(v) )
         nvmx_idtv_handling();
 }
diff -r 4496678bbb00 -r 5c3ab1e07ab1 xen/arch/x86/hvm/vmx/vvmx.c
--- a/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 09 16:24:09 2011 +0800
+++ b/xen/arch/x86/hvm/vmx/vvmx.c	Thu Jun 09 16:24:09 2011 +0800
@@ -286,13 +286,19 @@ static int vmx_inst_check_privilege(stru
     if ( (regs->eflags & X86_EFLAGS_VM) ||
          (hvm_long_mode_enabled(v) && cs.attr.fields.l == 0) )
         goto invalid_op;
-    /* TODO: check vmx operation mode */
+    else if ( nestedhvm_vcpu_in_guestmode(v) )
+        goto vmexit;
 
     if ( (cs.sel & 3) > 0 )
         goto gp_fault;
 
     return X86EMUL_OKAY;
 
+vmexit:
+    gdprintk(XENLOG_ERR, "vmx_inst_check_privilege: vmexit\n");
+    vcpu_nestedhvm(v).nv_vmexit_pending = 1;
+    return X86EMUL_EXCEPTION;
+    
 invalid_op:
     gdprintk(XENLOG_ERR, "vmx_inst_check_privilege: invalid_op\n");
     hvm_inject_exception(TRAP_invalid_op, 0, 0);
@@ -589,6 +595,18 @@ static void nvmx_purge_vvmcs(struct vcpu
     }
 }
 
+u64 nvmx_get_tsc_offset(struct vcpu *v)
+{
+    u64 offset = 0;
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+
+    if ( __get_vvmcs(nvcpu->nv_vvmcx, CPU_BASED_VM_EXEC_CONTROL) &
+         CPU_BASED_USE_TSC_OFFSETING )
+        offset = __get_vvmcs(nvcpu->nv_vvmcx, TSC_OFFSET);
+
+    return offset;
+}
+
 /*
  * Context synchronized between shadow and virtual VMCS.
  */
@@ -738,6 +756,8 @@ static void load_shadow_guest_state(stru
     hvm_set_cr4(__get_vvmcs(vvmcs, GUEST_CR4));
     hvm_set_cr3(__get_vvmcs(vvmcs, GUEST_CR3));
 
+    hvm_funcs.set_tsc_offset(v, v->arch.hvm_vcpu.cache_tsc_offset);
+
     vvmcs_to_shadow(vvmcs, VM_ENTRY_INTR_INFO);
     vvmcs_to_shadow(vvmcs, VM_ENTRY_EXCEPTION_ERROR_CODE);
     vvmcs_to_shadow(vvmcs, VM_ENTRY_INSTRUCTION_LEN);
@@ -865,6 +885,8 @@ static void load_vvmcs_host_state(struct
     hvm_set_cr4(__get_vvmcs(vvmcs, HOST_CR4));
     hvm_set_cr3(__get_vvmcs(vvmcs, HOST_CR3));
 
+    hvm_funcs.set_tsc_offset(v, v->arch.hvm_vcpu.cache_tsc_offset);
+
     __set_vvmcs(vvmcs, VM_ENTRY_INTR_INFO, 0);
 }
 
@@ -1261,3 +1283,195 @@ void nvmx_idtv_handling(void)
    }
 }
 
+/*
+ * L2 VMExit handling
+ *    return 1: Done or skip the normal layer 0 hypervisor process.
+ *              Typically it requires layer 1 hypervisor processing
+ *              or it may be already processed here.
+ *           0: Require the normal layer 0 process.
+ */
+int nvmx_n2_vmexit_handler(struct cpu_user_regs *regs,
+                               unsigned int exit_reason)
+{
+    struct vcpu *v = current;
+    struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+    struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
+    u32 ctrl;
+    u16 port;
+    u8 *bitmap;
+
+    nvcpu->nv_vmexit_pending = 0;
+    nvmx->intr.intr_info = 0;
+    nvmx->intr.error_code = 0;
+
+    switch (exit_reason) {
+    case EXIT_REASON_EXCEPTION_NMI:
+    {
+        u32 intr_info = __vmread(VM_EXIT_INTR_INFO);
+        u32 valid_mask = (X86_EVENTTYPE_HW_EXCEPTION << 8) |
+                         INTR_INFO_VALID_MASK;
+        u64 exec_bitmap;
+        int vector = intr_info & INTR_INFO_VECTOR_MASK;
+
+        /*
+         * decided by L0 and L1 exception bitmap, if the vector is set by
+         * both, L0 has priority on #PF, L1 has priority on others
+         */
+        if ( vector == TRAP_page_fault )
+        {
+            if ( paging_mode_hap(v->domain) )
+                nvcpu->nv_vmexit_pending = 1;
+        }
+        else if ( (intr_info & valid_mask) == valid_mask )
+        {
+            exec_bitmap =__get_vvmcs(nvcpu->nv_vvmcx, EXCEPTION_BITMAP);
+
+            if ( exec_bitmap & (1 << vector) )
+                nvcpu->nv_vmexit_pending = 1;
+        }
+        break;
+    }
+    case EXIT_REASON_WBINVD:
+    case EXIT_REASON_EPT_VIOLATION:
+    case EXIT_REASON_EPT_MISCONFIG:
+    case EXIT_REASON_EXTERNAL_INTERRUPT:
+        /* pass to L0 handler */
+        break;
+    case VMX_EXIT_REASONS_FAILED_VMENTRY:
+    case EXIT_REASON_TRIPLE_FAULT:
+    case EXIT_REASON_TASK_SWITCH:
+    case EXIT_REASON_CPUID:
+    case EXIT_REASON_MSR_READ:
+    case EXIT_REASON_MSR_WRITE:
+    case EXIT_REASON_VMCALL:
+    case EXIT_REASON_VMCLEAR:
+    case EXIT_REASON_VMLAUNCH:
+    case EXIT_REASON_VMPTRLD:
+    case EXIT_REASON_VMPTRST:
+    case EXIT_REASON_VMREAD:
+    case EXIT_REASON_VMRESUME:
+    case EXIT_REASON_VMWRITE:
+    case EXIT_REASON_VMXOFF:
+    case EXIT_REASON_VMXON:
+    case EXIT_REASON_INVEPT:
+        /* inject to L1 */
+        nvcpu->nv_vmexit_pending = 1;
+        break;
+    case EXIT_REASON_IO_INSTRUCTION:
+        ctrl = __n2_exec_control(v);
+        if ( ctrl & CPU_BASED_ACTIVATE_IO_BITMAP )
+        {
+            port = __vmread(EXIT_QUALIFICATION) >> 16;
+            bitmap = nvmx->iobitmap[port >> 15];
+            if ( bitmap[(port <<1) >> 4] & (1 << (port & 0x7)) )
+                nvcpu->nv_vmexit_pending = 1;
+            if ( !nvcpu->nv_vmexit_pending )
+               gdprintk(XENLOG_WARNING, "L0 PIO %x.\n", port);
+        }
+        else if ( ctrl & CPU_BASED_UNCOND_IO_EXITING )
+            nvcpu->nv_vmexit_pending = 1;
+        break;
+
+    case EXIT_REASON_PENDING_VIRT_INTR:
+        ctrl = __n2_exec_control(v);
+        if ( ctrl & CPU_BASED_VIRTUAL_INTR_PENDING )
+            nvcpu->nv_vmexit_pending = 1;
+        break;
+    case EXIT_REASON_PENDING_VIRT_NMI:
+        ctrl = __n2_exec_control(v);
+        if ( ctrl & CPU_BASED_VIRTUAL_NMI_PENDING )
+            nvcpu->nv_vmexit_pending = 1;
+        break;
+    /* L1 has priority handling several other types of exits */
+    case EXIT_REASON_HLT:
+        ctrl = __n2_exec_control(v);
+        if ( ctrl & CPU_BASED_HLT_EXITING )
+            nvcpu->nv_vmexit_pending = 1;
+        break;
+    case EXIT_REASON_RDTSC:
+        ctrl = __n2_exec_control(v);
+        if ( ctrl & CPU_BASED_RDTSC_EXITING )
+            nvcpu->nv_vmexit_pending = 1;
+        else
+        {
+            uint64_t tsc;
+
+            /*
+             * special handler is needed if L1 doesn't intercept rdtsc,
+             * avoiding changing guest_tsc and messing up timekeeping in L1
+             */
+            tsc = hvm_get_guest_tsc(v);
+            tsc += __get_vvmcs(nvcpu->nv_vvmcx, TSC_OFFSET);
+            regs->eax = (uint32_t)tsc;
+            regs->edx = (uint32_t)(tsc >> 32);
+
+            return 1;
+        }
+        break;
+    case EXIT_REASON_RDPMC:
+        ctrl = __n2_exec_control(v);
+        if ( ctrl & CPU_BASED_RDPMC_EXITING )
+            nvcpu->nv_vmexit_pending = 1;
+        break;
+    case EXIT_REASON_MWAIT_INSTRUCTION:
+        ctrl = __n2_exec_control(v);
+        if ( ctrl & CPU_BASED_MWAIT_EXITING )
+            nvcpu->nv_vmexit_pending = 1;
+        break;
+    case EXIT_REASON_PAUSE_INSTRUCTION:
+        ctrl = __n2_exec_control(v);
+        if ( ctrl & CPU_BASED_PAUSE_EXITING )
+            nvcpu->nv_vmexit_pending = 1;
+        break;
+    case EXIT_REASON_MONITOR_INSTRUCTION:
+        ctrl = __n2_exec_control(v);
+        if ( ctrl & CPU_BASED_MONITOR_EXITING )
+            nvcpu->nv_vmexit_pending = 1;
+        break;
+    case EXIT_REASON_DR_ACCESS:
+        ctrl = __n2_exec_control(v);
+        if ( ctrl & CPU_BASED_MOV_DR_EXITING )
+            nvcpu->nv_vmexit_pending = 1;
+        break;
+    case EXIT_REASON_INVLPG:
+        ctrl = __n2_exec_control(v);
+        if ( ctrl & CPU_BASED_INVLPG_EXITING )
+            nvcpu->nv_vmexit_pending = 1;
+        break;
+    case EXIT_REASON_CR_ACCESS:
+    {
+        u64 exit_qualification = __vmread(EXIT_QUALIFICATION);
+        int cr = exit_qualification & 15;
+        int write = (exit_qualification >> 4) & 3;
+        u32 mask = 0;
+
+        /* also according to guest exec_control */
+        ctrl = __n2_exec_control(v);
+
+        if ( cr == 3 )
+        {
+            mask = write? CPU_BASED_CR3_STORE_EXITING:
+                          CPU_BASED_CR3_LOAD_EXITING;
+            if ( ctrl & mask )
+                nvcpu->nv_vmexit_pending = 1;
+        }
+        else if ( cr == 8 )
+        {
+            mask = write? CPU_BASED_CR8_STORE_EXITING:
+                          CPU_BASED_CR8_LOAD_EXITING;
+            if ( ctrl & mask )
+                nvcpu->nv_vmexit_pending = 1;
+        }
+        else  /* CR0, CR4, CLTS, LMSW */
+            nvcpu->nv_vmexit_pending = 1;
+
+        break;
+    }
+    default:
+        gdprintk(XENLOG_WARNING, "Unknown nested vmexit reason %x.\n",
+                 exit_reason);
+    }
+
+    return ( nvcpu->nv_vmexit_pending == 1 );
+}
+
diff -r 4496678bbb00 -r 5c3ab1e07ab1 xen/include/asm-x86/hvm/vmx/vvmx.h
--- a/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 09 16:24:09 2011 +0800
+++ b/xen/include/asm-x86/hvm/vmx/vvmx.h	Thu Jun 09 16:24:09 2011 +0800
@@ -170,6 +170,9 @@ void nvmx_update_secondary_exec_control(
 void nvmx_update_exception_bitmap(struct vcpu *v, unsigned long value);
 asmlinkage void nvmx_switch_guest(void);
 void nvmx_idtv_handling(void);
+u64 nvmx_get_tsc_offset(struct vcpu *v);
+int nvmx_n2_vmexit_handler(struct cpu_user_regs *regs,
+                          unsigned int exit_reason);
 
 #endif /* __ASM_X86_HVM_VVMX_H__ */

^ permalink raw reply	[flat|nested] 45+ messages in thread


Thread overview: 45+ messages
2011-06-02  8:57 [PATCH 00 of 20] NestedVMX support Eddie Dong
2011-06-02  8:57 ` [PATCH 01 of 20] pre-cleanup1: Extend nhvm_vmcx_guest_intercepts_trap to include errcode to Eddie Dong
2011-06-02  8:57 ` [PATCH 02 of 20] pre-cleanup2: Move IDT_VECTORING processing code out of intr_assist Eddie Dong
2011-06-02  8:57 ` [PATCH 03 of 20] Add data structure for nestedvmx Eddie Dong
2011-06-02  8:57 ` [PATCH 04 of 20] Add APIs for nestedhvm_ops Eddie Dong
2011-06-02  8:57 ` [PATCH 05 of 20] Emulation of guest VMXON/OFF instruction Eddie Dong
2011-06-02 14:36   ` Tim Deegan
2011-06-03  5:54     ` Dong, Eddie
2011-06-02  8:57 ` [PATCH 06 of 20] Define structure and access APIs for virtual VMCS Eddie Dong
2011-06-02  8:57 ` [PATCH 07 of 20] Emulation of guest vmptrld Eddie Dong
2011-06-02 14:45   ` Tim Deegan
2011-06-03  6:07     ` Dong, Eddie
2011-06-03  8:42       ` Tim Deegan
2011-06-07  1:48         ` Dong, Eddie
2011-06-02  8:57 ` [PATCH 08 of 20] Emulation of guest VMPTRST Eddie Dong
2011-06-02  8:57 ` [PATCH 09 of 20] Emulation of guest VMCLEAR Eddie Dong
2011-06-02  8:57 ` [PATCH 10 of 20] Emulation of guest VMWRITE Eddie Dong
2011-06-02  8:57 ` [PATCH 11 of 20] Emulation of guest VMREAD Eddie Dong
2011-06-02  8:57 ` [PATCH 12 of 20] Add APIs to switch n1/n2 VMCS Eddie Dong
2011-06-02 14:50   ` Tim Deegan
2011-06-03  7:30     ` Dong, Eddie
2011-06-02  8:57 ` [PATCH 13 of 20] Emulation of VMRESUME/VMLAUNCH Eddie Dong
2011-06-02  8:57 ` [PATCH 14 of 20] Extend VMCS control fields for n2 guest Eddie Dong
2011-06-02  8:57 ` [PATCH 15 of 20] Switch shadow/virtual VMCS between n1/n2 guests Eddie Dong
2011-06-02 14:56   ` Tim Deegan
2011-06-03  7:57     ` Dong, Eddie
2011-06-02 14:58   ` Tim Deegan
2011-06-02  8:57 ` [PATCH 16 of 20] interrupt/exception handling for n2 guest Eddie Dong
2011-06-02  8:57 ` [PATCH 17 of 20] VM exit handler of n2-guest Eddie Dong
2011-06-02 14:59   ` Tim Deegan
2011-06-03  8:06     ` Dong, Eddie
2011-06-03  8:43       ` Tim Deegan
2011-06-02  8:57 ` [PATCH 18 of 20] Lazy FPU for n2 guest Eddie Dong
2011-06-02  8:57 ` [PATCH 19 of 20] Add VMXE bits in virtual CR4 Eddie Dong
2011-06-02 15:01   ` Tim Deegan
2011-06-03  8:12     ` Dong, Eddie
2011-06-02  8:57 ` [PATCH 20 of 20] n2 MSR handling and capability exposure Eddie Dong
2011-06-02 15:07   ` Tim Deegan
2011-06-02 15:11     ` Tim Deegan
2011-06-02 19:20       ` Keir Fraser
2011-06-03  8:39       ` Dong, Eddie
2011-06-03  8:25     ` Dong, Eddie
2011-06-02 14:33 ` [PATCH 00 of 20] NestedVMX support Tim Deegan
2011-06-03  5:47   ` Dong, Eddie
2011-06-09  8:25 [PATCH 00 of 20] Rebased Nested VMX v2 Eddie Dong
2011-06-09  8:25 ` [PATCH 17 of 20] VM exit handler of n2-guest Eddie Dong
