All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jun Nakajima <jun.nakajima@intel.com>
To: kvm@vger.kernel.org
Cc: Gleb Natapov <gleb@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>
Subject: [PATCH v3 10/13] nEPT: Nested INVEPT
Date: Sat, 18 May 2013 21:52:29 -0700	[thread overview]
Message-ID: <1368939152-11406-10-git-send-email-jun.nakajima@intel.com> (raw)
In-Reply-To: <1368939152-11406-1-git-send-email-jun.nakajima@intel.com>

From: Nadav Har'El <nyh@il.ibm.com>

If we let L1 use EPT, we should probably also support the INVEPT instruction.

In our current nested EPT implementation, when L1 changes its EPT table for
L2 (i.e., EPT12), L0 modifies the shadow EPT table (EPT02), and in the course
of this modification already calls INVEPT. Therefore, when L1 calls INVEPT,
we don't really need to do anything. In particular we *don't* need to call
the real INVEPT again. All we do in our INVEPT is verify the validity of the
call, and its parameters, and then do nothing.

In KVM Forum 2010, Dong et al. presented "Nested Virtualization Friendly KVM"
and classified our current nested EPT implementation as "shadow-like virtual
EPT". He recommended instead a different approach, which he called "VTLB-like
virtual EPT". If we had taken that alternative approach, INVEPT would have had
a bigger role: L0 would only rebuild the shadow EPT table when L1 calls INVEPT.

Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Jun Nakajima <jun.nakajima@intel.com>
Signed-off-by: Xinhao Xu <xinhao.xu@intel.com>
---
 arch/x86/include/uapi/asm/vmx.h |  1 +
 arch/x86/kvm/vmx.c              | 83 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 84 insertions(+)

diff --git a/arch/x86/include/uapi/asm/vmx.h b/arch/x86/include/uapi/asm/vmx.h
index d651082..7a34e8f 100644
--- a/arch/x86/include/uapi/asm/vmx.h
+++ b/arch/x86/include/uapi/asm/vmx.h
@@ -65,6 +65,7 @@
 #define EXIT_REASON_EOI_INDUCED         45
 #define EXIT_REASON_EPT_VIOLATION       48
 #define EXIT_REASON_EPT_MISCONFIG       49
+#define EXIT_REASON_INVEPT              50
 #define EXIT_REASON_PREEMPTION_TIMER    52
 #define EXIT_REASON_WBINVD              54
 #define EXIT_REASON_XSETBV              55
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 1cf8a41..d9d991d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6251,6 +6251,87 @@ static int handle_vmptrst(struct kvm_vcpu *vcpu)
 	return 1;
 }
 
+/* Emulate the INVEPT instruction */
+static int handle_invept(struct kvm_vcpu *vcpu)
+{
+	u32 vmx_instruction_info;
+	unsigned long type;
+	gva_t gva;
+	struct x86_exception e;
+	struct {
+		u64 eptp, gpa;
+	} operand;
+
+	if (!(nested_vmx_secondary_ctls_high & SECONDARY_EXEC_ENABLE_EPT) ||
+	    !(nested_vmx_ept_caps & VMX_EPT_INVEPT_BIT)) {
+		kvm_queue_exception(vcpu, UD_VECTOR);
+		return 1;
+	}
+
+	if (!nested_vmx_check_permission(vcpu))
+		return 1;
+
+	if (!kvm_read_cr0_bits(vcpu, X86_CR0_PE)) {
+		kvm_queue_exception(vcpu, UD_VECTOR);
+		return 1;
+	}
+
+	/* According to the Intel VMX instruction reference, the memory
+	 * operand is read even if it isn't needed (e.g., for type==global)
+	 */
+	vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO);
+	if (get_vmx_mem_address(vcpu, vmcs_readl(EXIT_QUALIFICATION),
+			vmx_instruction_info, &gva))
+		return 1;
+	if (kvm_read_guest_virt(&vcpu->arch.emulate_ctxt, gva, &operand,
+				sizeof(operand), &e)) {
+		kvm_inject_page_fault(vcpu, &e);
+		return 1;
+	}
+
+	type = kvm_register_read(vcpu, (vmx_instruction_info >> 28) & 0xf);
+
+	switch (type) {
+	case VMX_EPT_EXTENT_GLOBAL:
+		if (!(nested_vmx_ept_caps & VMX_EPT_EXTENT_GLOBAL_BIT))
+			nested_vmx_failValid(vcpu,
+				VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID);
+		else {
+			/*
+			 * Do nothing: when L1 changes EPT12, we already
+			 * update EPT02 (the shadow EPT table) and call INVEPT.
+			 * So when L1 calls INVEPT, there's nothing left to do.
+			 */
+			nested_vmx_succeed(vcpu);
+		}
+		break;
+	case VMX_EPT_EXTENT_CONTEXT:
+		if (!(nested_vmx_ept_caps & VMX_EPT_EXTENT_CONTEXT_BIT))
+			nested_vmx_failValid(vcpu,
+				VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID);
+		else {
+			/* Do nothing */
+			nested_vmx_succeed(vcpu);
+		}
+		break;
+	case VMX_EPT_EXTENT_INDIVIDUAL_ADDR:
+		if (!(nested_vmx_ept_caps & VMX_EPT_EXTENT_INDIVIDUAL_BIT))
+			nested_vmx_failValid(vcpu,
+				VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID);
+		else {
+			/* Do nothing */
+			nested_vmx_succeed(vcpu);
+		}
+		break;
+	default:
+		nested_vmx_failValid(vcpu,
+			VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID);
+	}
+
+	skip_emulated_instruction(vcpu);
+	return 1;
+}
+
 /*
  * The exit handlers return 1 if the exit was handled fully and guest execution
  * may resume.  Otherwise they set the kvm_run parameter to indicate what needs
@@ -6295,6 +6376,7 @@ static int (*const kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = {
 	[EXIT_REASON_PAUSE_INSTRUCTION]       = handle_pause,
 	[EXIT_REASON_MWAIT_INSTRUCTION]	      = handle_invalid_op,
 	[EXIT_REASON_MONITOR_INSTRUCTION]     = handle_invalid_op,
+	[EXIT_REASON_INVEPT]                  = handle_invept,
 };
 
 static const int kvm_vmx_max_exit_handlers =
@@ -6521,6 +6603,7 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu)
 	case EXIT_REASON_VMPTRST: case EXIT_REASON_VMREAD:
 	case EXIT_REASON_VMRESUME: case EXIT_REASON_VMWRITE:
 	case EXIT_REASON_VMOFF: case EXIT_REASON_VMON:
+	case EXIT_REASON_INVEPT:
 		/*
 		 * VMX instructions trap unconditionally. This allows L1 to
 		 * emulate them for its L2 guest, i.e., allows 3-level nesting!
-- 
1.8.1.2


  parent reply	other threads:[~2013-05-19  4:53 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-19  4:52 [PATCH v3 01/13] nEPT: Support LOAD_IA32_EFER entry/exit controls for L1 Jun Nakajima
2013-05-19  4:52 ` [PATCH v3 02/13] nEPT: Move gpte_access() and prefetch_invalid_gpte() to paging_tmpl.h Jun Nakajima
2013-05-20 12:34   ` Paolo Bonzini
2013-05-19  4:52 ` [PATCH v3 03/13] nEPT: Add EPT tables support " Jun Nakajima
2013-05-21  7:52   ` Xiao Guangrong
2013-05-21  8:30     ` Xiao Guangrong
2013-05-21  9:01       ` Gleb Natapov
2013-05-21 11:05         ` Xiao Guangrong
2013-05-21 22:26           ` Nakajima, Jun
2013-05-22  1:10             ` Xiao Guangrong
2013-05-22  6:16             ` Gleb Natapov
2013-06-11 11:32     ` Gleb Natapov
2013-06-17 12:11       ` Xiao Guangrong
2013-06-18 10:57         ` Gleb Natapov
2013-06-18 12:51           ` Xiao Guangrong
2013-06-18 13:01             ` Gleb Natapov
2013-05-19  4:52 ` [PATCH v3 04/13] nEPT: Define EPT-specific link_shadow_page() Jun Nakajima
2013-05-20 12:43   ` Paolo Bonzini
2013-05-21  8:15   ` Xiao Guangrong
2013-05-21 21:44     ` Nakajima, Jun
2013-05-19  4:52 ` [PATCH v3 05/13] nEPT: MMU context for nested EPT Jun Nakajima
2013-05-21  8:50   ` Xiao Guangrong
2013-05-21 22:30     ` Nakajima, Jun
2013-05-19  4:52 ` [PATCH v3 06/13] nEPT: Fix cr3 handling in nested exit and entry Jun Nakajima
2013-05-20 13:19   ` Paolo Bonzini
2013-06-12 12:42   ` Gleb Natapov
2013-05-19  4:52 ` [PATCH v3 07/13] nEPT: Fix wrong test in kvm_set_cr3 Jun Nakajima
2013-05-20 13:17   ` Paolo Bonzini
2013-05-19  4:52 ` [PATCH v3 08/13] nEPT: Some additional comments Jun Nakajima
2013-05-20 13:21   ` Paolo Bonzini
2013-05-19  4:52 ` [PATCH v3 09/13] nEPT: Advertise EPT to L1 Jun Nakajima
2013-05-20 13:05   ` Paolo Bonzini
2013-05-19  4:52 ` Jun Nakajima [this message]
2013-05-20 12:46   ` [PATCH v3 10/13] nEPT: Nested INVEPT Paolo Bonzini
2013-05-21  9:16   ` Xiao Guangrong
2013-05-19  4:52 ` [PATCH v3 11/13] nEPT: Miscelleneous cleanups Jun Nakajima
2013-05-19  4:52 ` [PATCH v3 12/13] nEPT: Move is_rsvd_bits_set() to paging_tmpl.h Jun Nakajima
2013-05-19  4:52 ` [PATCH v3 13/13] nEPT: Inject EPT violation/misconfigration Jun Nakajima
2013-05-20 13:09   ` Paolo Bonzini
2013-05-21 10:56   ` Xiao Guangrong
2013-05-20 12:33 ` [PATCH v3 01/13] nEPT: Support LOAD_IA32_EFER entry/exit controls for L1 Paolo Bonzini
2013-07-02  3:01   ` Zhang, Yang Z
2013-07-02 13:59     ` Gleb Natapov
2013-07-02 14:28       ` Jan Kiszka
2013-07-02 15:15         ` Gleb Natapov
2013-07-02 15:34           ` Jan Kiszka
2013-07-02 15:43             ` Gleb Natapov
2013-07-04  8:42               ` Zhang, Yang Z
2013-07-08 12:37                 ` Gleb Natapov
2013-07-08 14:28                   ` Zhang, Yang Z
2013-07-08 16:08                     ` Gleb Natapov
  -- strict thread matches above, loose matches on Subject: below --
2013-05-09  0:53 Jun Nakajima
2013-05-09  0:53 ` [PATCH v3 02/13] nEPT: Move gpte_access() and prefetch_invalid_gpte() to paging_tmpl.h Jun Nakajima
2013-05-09  0:53   ` [PATCH v3 03/13] nEPT: Add EPT tables support " Jun Nakajima
2013-05-09  0:53     ` [PATCH v3 04/13] nEPT: Define EPT-specific link_shadow_page() Jun Nakajima
2013-05-09  0:53       ` [PATCH v3 05/13] nEPT: MMU context for nested EPT Jun Nakajima
2013-05-09  0:53         ` [PATCH v3 06/13] nEPT: Fix cr3 handling in nested exit and entry Jun Nakajima
2013-05-09  0:53           ` [PATCH v3 07/13] nEPT: Fix wrong test in kvm_set_cr3 Jun Nakajima
2013-05-09  0:53             ` [PATCH v3 08/13] nEPT: Some additional comments Jun Nakajima
2013-05-09  0:53               ` [PATCH v3 09/13] nEPT: Advertise EPT to L1 Jun Nakajima
2013-05-09  0:53                 ` [PATCH v3 10/13] nEPT: Nested INVEPT Jun Nakajima

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1368939152-11406-10-git-send-email-jun.nakajima@intel.com \
    --to=jun.nakajima@intel.com \
    --cc=gleb@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.