All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3 v2] Expose VMFUNC to the nested hypervisor
@ 2017-07-06 23:03 Bandan Das
  2017-07-06 23:03 ` [PATCH 1/3 v2] KVM: vmx: Enable VMFUNCs Bandan Das
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Bandan Das @ 2017-07-06 23:03 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, linux-kernel

v2:
 1/3: Patch to enable vmfunc on the host but cause a #UD if
      L1 tries to use it directly. (new)
 2/3: Expose vmfunc to the nested hypervisor, but no vm functions
      are exposed and L0 emulates a vmfunc vmexit to L1. 
 3/3: Force a vmfunc vmexit when L2 tries to use vmfunc and emulate
      eptp switching. Unconditionally expose EPTP switching to the
      L1 hypervisor since L0 fakes eptp switching via a mmu reload.

These patches expose eptp switching/vmfunc to the nested hypervisor.
vmfunc is enabled in the secondary controls for the host and is
exposed to the nested hypervisor. However, if the nested hypervisor
decides to use eptp switching, L0 emulates it.

v1:
 https://lkml.org/lkml/2017/6/29/958

Bandan Das (3):
  KVM: vmx: Enable VMFUNCs
  KVM: nVMX: Enable VMFUNC for the L1 hypervisor
  KVM: nVMX: Emulate EPTP switching for the L1 hypervisor

 arch/x86/include/asm/vmx.h |   9 ++++
 arch/x86/kvm/vmx.c         | 122 ++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 129 insertions(+), 2 deletions(-)

-- 
2.9.4

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/3 v2] KVM: vmx: Enable VMFUNCs
  2017-07-06 23:03 [PATCH 0/3 v2] Expose VMFUNC to the nested hypervisor Bandan Das
@ 2017-07-06 23:03 ` Bandan Das
  2017-07-10  8:54   ` David Hildenbrand
  2017-07-06 23:03 ` [PATCH 2/3 v2] KVM: nVMX: Enable VMFUNC for the L1 hypervisor Bandan Das
  2017-07-06 23:03 ` [PATCH 3/3 v2] KVM: nVMX: Emulate EPTP switching " Bandan Das
  2 siblings, 1 reply; 12+ messages in thread
From: Bandan Das @ 2017-07-06 23:03 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, linux-kernel

Enable VMFUNC in the secondary execution controls.  This simplifies the
changes necessary to expose it to nested hypervisors.  VMFUNCs still
cause #UD when invoked.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bandan Das <bsd@redhat.com>
---
 arch/x86/include/asm/vmx.h |  3 +++
 arch/x86/kvm/vmx.c         | 22 +++++++++++++++++++++-
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 35cd06f..da5375e 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -72,6 +72,7 @@
 #define SECONDARY_EXEC_PAUSE_LOOP_EXITING	0x00000400
 #define SECONDARY_EXEC_RDRAND			0x00000800
 #define SECONDARY_EXEC_ENABLE_INVPCID		0x00001000
+#define SECONDARY_EXEC_ENABLE_VMFUNC            0x00002000
 #define SECONDARY_EXEC_SHADOW_VMCS              0x00004000
 #define SECONDARY_EXEC_RDSEED			0x00010000
 #define SECONDARY_EXEC_ENABLE_PML               0x00020000
@@ -187,6 +188,8 @@ enum vmcs_field {
 	APIC_ACCESS_ADDR_HIGH		= 0x00002015,
 	POSTED_INTR_DESC_ADDR           = 0x00002016,
 	POSTED_INTR_DESC_ADDR_HIGH      = 0x00002017,
+	VM_FUNCTION_CONTROL             = 0x00002018,
+	VM_FUNCTION_CONTROL_HIGH        = 0x00002019,
 	EPT_POINTER                     = 0x0000201a,
 	EPT_POINTER_HIGH                = 0x0000201b,
 	EOI_EXIT_BITMAP0                = 0x0000201c,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index ca5d2b9..a483b49 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1314,6 +1314,12 @@ static inline bool cpu_has_vmx_tsc_scaling(void)
 		SECONDARY_EXEC_TSC_SCALING;
 }
 
+static inline bool cpu_has_vmx_vmfunc(void)
+{
+	return vmcs_config.cpu_based_2nd_exec_ctrl &
+		SECONDARY_EXEC_ENABLE_VMFUNC;
+}
+
 static inline bool report_flexpriority(void)
 {
 	return flexpriority_enabled;
@@ -3575,7 +3581,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 			SECONDARY_EXEC_SHADOW_VMCS |
 			SECONDARY_EXEC_XSAVES |
 			SECONDARY_EXEC_ENABLE_PML |
-			SECONDARY_EXEC_TSC_SCALING;
+			SECONDARY_EXEC_TSC_SCALING |
+			SECONDARY_EXEC_ENABLE_VMFUNC;
 		if (adjust_vmx_controls(min2, opt2,
 					MSR_IA32_VMX_PROCBASED_CTLS2,
 					&_cpu_based_2nd_exec_control) < 0)
@@ -5233,6 +5240,9 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
 	vmcs_writel(HOST_GS_BASE, 0); /* 22.2.4 */
 #endif
 
+	if (cpu_has_vmx_vmfunc())
+		vmcs_write64(VM_FUNCTION_CONTROL, 0);
+
 	vmcs_write32(VM_EXIT_MSR_STORE_COUNT, 0);
 	vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, 0);
 	vmcs_write64(VM_EXIT_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.host));
@@ -7740,6 +7750,12 @@ static int handle_preemption_timer(struct kvm_vcpu *vcpu)
 	return 1;
 }
 
+static int handle_vmfunc(struct kvm_vcpu *vcpu)
+{
+	kvm_queue_exception(vcpu, UD_VECTOR);
+	return 1;
+}
+
 /*
  * The exit handlers return 1 if the exit was handled fully and guest execution
  * may resume.  Otherwise they set the kvm_run parameter to indicate what needs
@@ -7790,6 +7806,7 @@ static int (*const kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = {
 	[EXIT_REASON_XSAVES]                  = handle_xsaves,
 	[EXIT_REASON_XRSTORS]                 = handle_xrstors,
 	[EXIT_REASON_PML_FULL]		      = handle_pml_full,
+	[EXIT_REASON_VMFUNC]                  = handle_vmfunc,
 	[EXIT_REASON_PREEMPTION_TIMER]	      = handle_preemption_timer,
 };
 
@@ -8111,6 +8128,9 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu)
 	case EXIT_REASON_PML_FULL:
 		/* We emulate PML support to L1. */
 		return false;
+	case EXIT_REASON_VMFUNC:
+		/* VM functions are emulated through L2->L0 vmexits. */
+		return false;
 	default:
 		return true;
 	}
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/3 v2] KVM: nVMX: Enable VMFUNC for the L1 hypervisor
  2017-07-06 23:03 [PATCH 0/3 v2] Expose VMFUNC to the nested hypervisor Bandan Das
  2017-07-06 23:03 ` [PATCH 1/3 v2] KVM: vmx: Enable VMFUNCs Bandan Das
@ 2017-07-06 23:03 ` Bandan Das
  2017-07-10  9:17   ` David Hildenbrand
  2017-07-06 23:03 ` [PATCH 3/3 v2] KVM: nVMX: Emulate EPTP switching " Bandan Das
  2 siblings, 1 reply; 12+ messages in thread
From: Bandan Das @ 2017-07-06 23:03 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, linux-kernel

Expose VMFUNC in MSRs and VMCS fields. No actual VMFUNCs are enabled.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bandan Das <bsd@redhat.com>
---
 arch/x86/kvm/vmx.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 51 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index a483b49..7364678 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -240,6 +240,7 @@ struct __packed vmcs12 {
 	u64 virtual_apic_page_addr;
 	u64 apic_access_addr;
 	u64 posted_intr_desc_addr;
+	u64 vm_function_control;
 	u64 ept_pointer;
 	u64 eoi_exit_bitmap0;
 	u64 eoi_exit_bitmap1;
@@ -481,6 +482,7 @@ struct nested_vmx {
 	u64 nested_vmx_cr4_fixed0;
 	u64 nested_vmx_cr4_fixed1;
 	u64 nested_vmx_vmcs_enum;
+	u64 nested_vmx_vmfunc_controls;
 };
 
 #define POSTED_INTR_ON  0
@@ -763,6 +765,7 @@ static const unsigned short vmcs_field_to_offset_table[] = {
 	FIELD64(VIRTUAL_APIC_PAGE_ADDR, virtual_apic_page_addr),
 	FIELD64(APIC_ACCESS_ADDR, apic_access_addr),
 	FIELD64(POSTED_INTR_DESC_ADDR, posted_intr_desc_addr),
+	FIELD64(VM_FUNCTION_CONTROL, vm_function_control),
 	FIELD64(EPT_POINTER, ept_pointer),
 	FIELD64(EOI_EXIT_BITMAP0, eoi_exit_bitmap0),
 	FIELD64(EOI_EXIT_BITMAP1, eoi_exit_bitmap1),
@@ -1394,6 +1397,11 @@ static inline bool nested_cpu_has_posted_intr(struct vmcs12 *vmcs12)
 	return vmcs12->pin_based_vm_exec_control & PIN_BASED_POSTED_INTR;
 }
 
+static inline bool nested_cpu_has_vmfunc(struct vmcs12 *vmcs12)
+{
+	return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_VMFUNC);
+}
+
 static inline bool is_nmi(u32 intr_info)
 {
 	return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK))
@@ -2780,6 +2788,12 @@ static void nested_vmx_setup_ctls_msrs(struct vcpu_vmx *vmx)
 	} else
 		vmx->nested.nested_vmx_ept_caps = 0;
 
+	if (cpu_has_vmx_vmfunc()) {
+		vmx->nested.nested_vmx_secondary_ctls_high |=
+			SECONDARY_EXEC_ENABLE_VMFUNC;
+		vmx->nested.nested_vmx_vmfunc_controls = 0;
+	}
+
 	/*
 	 * Old versions of KVM use the single-context version without
 	 * checking for support, so declare that it is supported even
@@ -3149,6 +3163,9 @@ static int vmx_get_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
 		*pdata = vmx->nested.nested_vmx_ept_caps |
 			((u64)vmx->nested.nested_vmx_vpid_caps << 32);
 		break;
+	case MSR_IA32_VMX_VMFUNC:
+		*pdata = vmx->nested.nested_vmx_vmfunc_controls;
+		break;
 	default:
 		return 1;
 	}
@@ -7752,7 +7769,29 @@ static int handle_preemption_timer(struct kvm_vcpu *vcpu)
 
 static int handle_vmfunc(struct kvm_vcpu *vcpu)
 {
-	kvm_queue_exception(vcpu, UD_VECTOR);
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	struct vmcs12 *vmcs12;
+	u32 function = vcpu->arch.regs[VCPU_REGS_RAX];
+
+	/*
+	 * VMFUNC is only supported for nested guests, but we always enable the
+	 * secondary control for simplicity; for non-nested mode, fake that we
+	 * didn't by injecting #UD.
+	 */
+	if (!is_guest_mode(vcpu)) {
+		kvm_queue_exception(vcpu, UD_VECTOR);
+		return 1;
+	}
+
+	vmcs12 = get_vmcs12(vcpu);
+	if ((vmcs12->vm_function_control & (1 << function)) == 0)
+		goto fail;
+	WARN(1, "VMCS12 VM function control should have been zero");
+
+fail:
+	nested_vmx_vmexit(vcpu, vmx->exit_reason,
+			  vmcs_read32(VM_EXIT_INTR_INFO),
+			  vmcs_readl(EXIT_QUALIFICATION));
 	return 1;
 }
 
@@ -10053,7 +10092,8 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
 		exec_control &= ~(SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
 				  SECONDARY_EXEC_RDTSCP |
 				  SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
-				  SECONDARY_EXEC_APIC_REGISTER_VIRT);
+				  SECONDARY_EXEC_APIC_REGISTER_VIRT |
+				  SECONDARY_EXEC_ENABLE_VMFUNC);
 		if (nested_cpu_has(vmcs12,
 				   CPU_BASED_ACTIVATE_SECONDARY_CONTROLS)) {
 			vmcs12_exec_ctrl = vmcs12->secondary_vm_exec_control &
@@ -10061,6 +10101,10 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
 			exec_control |= vmcs12_exec_ctrl;
 		}
 
+		/* All VMFUNCs are currently emulated through L0 vmexits.  */
+		if (exec_control & SECONDARY_EXEC_ENABLE_VMFUNC)
+			vmcs_write64(VM_FUNCTION_CONTROL, 0);
+
 		if (exec_control & SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY) {
 			vmcs_write64(EOI_EXIT_BITMAP0,
 				vmcs12->eoi_exit_bitmap0);
@@ -10310,6 +10354,11 @@ static int check_vmentry_prereqs(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 				vmx->nested.nested_vmx_entry_ctls_high))
 		return VMXERR_ENTRY_INVALID_CONTROL_FIELD;
 
+	if (nested_cpu_has_vmfunc(vmcs12) &&
+	    (vmcs12->vm_function_control &
+	     ~vmx->nested.nested_vmx_vmfunc_controls))
+		return VMXERR_ENTRY_INVALID_CONTROL_FIELD;
+
 	if (vmcs12->cr3_target_count > nested_cpu_vmx_misc_cr3_count(vcpu))
 		return VMXERR_ENTRY_INVALID_CONTROL_FIELD;
 
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/3 v2] KVM: nVMX: Emulate EPTP switching for the L1 hypervisor
  2017-07-06 23:03 [PATCH 0/3 v2] Expose VMFUNC to the nested hypervisor Bandan Das
  2017-07-06 23:03 ` [PATCH 1/3 v2] KVM: vmx: Enable VMFUNCs Bandan Das
  2017-07-06 23:03 ` [PATCH 2/3 v2] KVM: nVMX: Enable VMFUNC for the L1 hypervisor Bandan Das
@ 2017-07-06 23:03 ` Bandan Das
  2017-07-07  8:30   ` Paolo Bonzini
  2 siblings, 1 reply; 12+ messages in thread
From: Bandan Das @ 2017-07-06 23:03 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, linux-kernel

When L2 uses vmfunc, L0 utilizes the associated vmexit to
emulate a switching of the ept pointer by reloading the
guest MMU.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bandan Das <bsd@redhat.com>
---
 arch/x86/include/asm/vmx.h |  6 +++++
 arch/x86/kvm/vmx.c         | 55 +++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 58 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index da5375e..5f63a2e 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -115,6 +115,10 @@
 #define VMX_MISC_SAVE_EFER_LMA			0x00000020
 #define VMX_MISC_ACTIVITY_HLT			0x00000040
 
+/* VMFUNC functions */
+#define VMX_VMFUNC_EPTP_SWITCHING               0x00000001
+#define VMFUNC_EPTP_ENTRIES  512
+
 static inline u32 vmx_basic_vmcs_revision_id(u64 vmx_basic)
 {
 	return vmx_basic & GENMASK_ULL(30, 0);
@@ -200,6 +204,8 @@ enum vmcs_field {
 	EOI_EXIT_BITMAP2_HIGH           = 0x00002021,
 	EOI_EXIT_BITMAP3                = 0x00002022,
 	EOI_EXIT_BITMAP3_HIGH           = 0x00002023,
+	EPTP_LIST_ADDRESS               = 0x00002024,
+	EPTP_LIST_ADDRESS_HIGH          = 0x00002025,
 	VMREAD_BITMAP                   = 0x00002026,
 	VMWRITE_BITMAP                  = 0x00002028,
 	XSS_EXIT_BITMAP                 = 0x0000202C,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 7364678..3a4aa68 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -246,6 +246,7 @@ struct __packed vmcs12 {
 	u64 eoi_exit_bitmap1;
 	u64 eoi_exit_bitmap2;
 	u64 eoi_exit_bitmap3;
+	u64 eptp_list_address;
 	u64 xss_exit_bitmap;
 	u64 guest_physical_address;
 	u64 vmcs_link_pointer;
@@ -771,6 +772,7 @@ static const unsigned short vmcs_field_to_offset_table[] = {
 	FIELD64(EOI_EXIT_BITMAP1, eoi_exit_bitmap1),
 	FIELD64(EOI_EXIT_BITMAP2, eoi_exit_bitmap2),
 	FIELD64(EOI_EXIT_BITMAP3, eoi_exit_bitmap3),
+	FIELD64(EPTP_LIST_ADDRESS, eptp_list_address),
 	FIELD64(XSS_EXIT_BITMAP, xss_exit_bitmap),
 	FIELD64(GUEST_PHYSICAL_ADDRESS, guest_physical_address),
 	FIELD64(VMCS_LINK_POINTER, vmcs_link_pointer),
@@ -1402,6 +1404,13 @@ static inline bool nested_cpu_has_vmfunc(struct vmcs12 *vmcs12)
 	return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_VMFUNC);
 }
 
+static inline bool nested_cpu_has_eptp_switching(struct vmcs12 *vmcs12)
+{
+	return nested_cpu_has_vmfunc(vmcs12) &&
+		(vmcs12->vm_function_control &
+		 VMX_VMFUNC_EPTP_SWITCHING);
+}
+
 static inline bool is_nmi(u32 intr_info)
 {
 	return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK))
@@ -2791,7 +2800,12 @@ static void nested_vmx_setup_ctls_msrs(struct vcpu_vmx *vmx)
 	if (cpu_has_vmx_vmfunc()) {
 		vmx->nested.nested_vmx_secondary_ctls_high |=
 			SECONDARY_EXEC_ENABLE_VMFUNC;
-		vmx->nested.nested_vmx_vmfunc_controls = 0;
+		/*
+		 * Advertise EPTP switching unconditionally
+		 * since we emulate it
+		 */
+		vmx->nested.nested_vmx_vmfunc_controls =
+			VMX_VMFUNC_EPTP_SWITCHING;
 	}
 
 	/*
@@ -7772,6 +7786,9 @@ static int handle_vmfunc(struct kvm_vcpu *vcpu)
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	struct vmcs12 *vmcs12;
 	u32 function = vcpu->arch.regs[VCPU_REGS_RAX];
+	u32 index = vcpu->arch.regs[VCPU_REGS_RCX];
+	struct page *page = NULL;
+	u64 *l1_eptp_list;
 
 	/*
 	 * VMFUNC is only supported for nested guests, but we always enable the
@@ -7784,11 +7801,43 @@ static int handle_vmfunc(struct kvm_vcpu *vcpu)
 	}
 
 	vmcs12 = get_vmcs12(vcpu);
-	if ((vmcs12->vm_function_control & (1 << function)) == 0)
+	if (((vmcs12->vm_function_control & (1 << function)) == 0) ||
+	    WARN_ON_ONCE(function))
+		goto fail;
+
+	if (!nested_cpu_has_ept(vmcs12) ||
+	    !nested_cpu_has_eptp_switching(vmcs12))
+		goto fail;
+
+	if (!vmcs12->eptp_list_address || index >= VMFUNC_EPTP_ENTRIES)
+		goto fail;
+
+	page = nested_get_page(vcpu, vmcs12->eptp_list_address);
+	if (!page)
+		goto fail;
+
+	l1_eptp_list = kmap(page);
+	if (!l1_eptp_list[index])
 		goto fail;
-	WARN(1, "VMCS12 VM function control should have been zero");
+
+	/*
+	 * If the (L2) guest does a vmfunc to the currently
+	 * active ept pointer, we don't have to do anything else
+	 */
+	if (vmcs12->ept_pointer != l1_eptp_list[index]) {
+		kvm_mmu_unload(vcpu);
+		/*
+		 * TODO: Verify that guest ept satisfies vmentry prereqs
+		 */
+		vmcs12->ept_pointer = l1_eptp_list[index];
+		kvm_mmu_reload(vcpu);
+		kunmap(page);
+	}
+	return kvm_skip_emulated_instruction(vcpu);
 
 fail:
+	if (page)
+		kunmap(page);
 	nested_vmx_vmexit(vcpu, vmx->exit_reason,
 			  vmcs_read32(VM_EXIT_INTR_INFO),
 			  vmcs_readl(EXIT_QUALIFICATION));
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3 v2] KVM: nVMX: Emulate EPTP switching for the L1 hypervisor
  2017-07-06 23:03 ` [PATCH 3/3 v2] KVM: nVMX: Emulate EPTP switching " Bandan Das
@ 2017-07-07  8:30   ` Paolo Bonzini
  0 siblings, 0 replies; 12+ messages in thread
From: Paolo Bonzini @ 2017-07-07  8:30 UTC (permalink / raw)
  To: Bandan Das, kvm; +Cc: linux-kernel



On 07/07/2017 01:03, Bandan Das wrote:
> When L2 uses vmfunc, L0 utilizes the associated vmexit to
> emulate a switching of the ept pointer by reloading the
> guest MMU.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Bandan Das <bsd@redhat.com>
> ---
>  arch/x86/include/asm/vmx.h |  6 +++++
>  arch/x86/kvm/vmx.c         | 55 +++++++++++++++++++++++++++++++++++++++++++---
>  2 files changed, 58 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
> index da5375e..5f63a2e 100644
> --- a/arch/x86/include/asm/vmx.h
> +++ b/arch/x86/include/asm/vmx.h
> @@ -115,6 +115,10 @@
>  #define VMX_MISC_SAVE_EFER_LMA			0x00000020
>  #define VMX_MISC_ACTIVITY_HLT			0x00000040
>  
> +/* VMFUNC functions */
> +#define VMX_VMFUNC_EPTP_SWITCHING               0x00000001
> +#define VMFUNC_EPTP_ENTRIES  512
> +
>  static inline u32 vmx_basic_vmcs_revision_id(u64 vmx_basic)
>  {
>  	return vmx_basic & GENMASK_ULL(30, 0);
> @@ -200,6 +204,8 @@ enum vmcs_field {
>  	EOI_EXIT_BITMAP2_HIGH           = 0x00002021,
>  	EOI_EXIT_BITMAP3                = 0x00002022,
>  	EOI_EXIT_BITMAP3_HIGH           = 0x00002023,
> +	EPTP_LIST_ADDRESS               = 0x00002024,
> +	EPTP_LIST_ADDRESS_HIGH          = 0x00002025,
>  	VMREAD_BITMAP                   = 0x00002026,
>  	VMWRITE_BITMAP                  = 0x00002028,
>  	XSS_EXIT_BITMAP                 = 0x0000202C,
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 7364678..3a4aa68 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -246,6 +246,7 @@ struct __packed vmcs12 {
>  	u64 eoi_exit_bitmap1;
>  	u64 eoi_exit_bitmap2;
>  	u64 eoi_exit_bitmap3;
> +	u64 eptp_list_address;
>  	u64 xss_exit_bitmap;
>  	u64 guest_physical_address;
>  	u64 vmcs_link_pointer;
> @@ -771,6 +772,7 @@ static const unsigned short vmcs_field_to_offset_table[] = {
>  	FIELD64(EOI_EXIT_BITMAP1, eoi_exit_bitmap1),
>  	FIELD64(EOI_EXIT_BITMAP2, eoi_exit_bitmap2),
>  	FIELD64(EOI_EXIT_BITMAP3, eoi_exit_bitmap3),
> +	FIELD64(EPTP_LIST_ADDRESS, eptp_list_address),
>  	FIELD64(XSS_EXIT_BITMAP, xss_exit_bitmap),
>  	FIELD64(GUEST_PHYSICAL_ADDRESS, guest_physical_address),
>  	FIELD64(VMCS_LINK_POINTER, vmcs_link_pointer),
> @@ -1402,6 +1404,13 @@ static inline bool nested_cpu_has_vmfunc(struct vmcs12 *vmcs12)
>  	return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_VMFUNC);
>  }
>  
> +static inline bool nested_cpu_has_eptp_switching(struct vmcs12 *vmcs12)
> +{
> +	return nested_cpu_has_vmfunc(vmcs12) &&
> +		(vmcs12->vm_function_control &
> +		 VMX_VMFUNC_EPTP_SWITCHING);
> +}
> +
>  static inline bool is_nmi(u32 intr_info)
>  {
>  	return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK))
> @@ -2791,7 +2800,12 @@ static void nested_vmx_setup_ctls_msrs(struct vcpu_vmx *vmx)
>  	if (cpu_has_vmx_vmfunc()) {
>  		vmx->nested.nested_vmx_secondary_ctls_high |=
>  			SECONDARY_EXEC_ENABLE_VMFUNC;
> -		vmx->nested.nested_vmx_vmfunc_controls = 0;
> +		/*
> +		 * Advertise EPTP switching unconditionally
> +		 * since we emulate it
> +		 */
> +		vmx->nested.nested_vmx_vmfunc_controls =
> +			VMX_VMFUNC_EPTP_SWITCHING;
>  	}
>  
>  	/*
> @@ -7772,6 +7786,9 @@ static int handle_vmfunc(struct kvm_vcpu *vcpu)
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>  	struct vmcs12 *vmcs12;
>  	u32 function = vcpu->arch.regs[VCPU_REGS_RAX];
> +	u32 index = vcpu->arch.regs[VCPU_REGS_RCX];
> +	struct page *page = NULL;
> +	u64 *l1_eptp_list;
>  
>  	/*
>  	 * VMFUNC is only supported for nested guests, but we always enable the
> @@ -7784,11 +7801,43 @@ static int handle_vmfunc(struct kvm_vcpu *vcpu)
>  	}
>  
>  	vmcs12 = get_vmcs12(vcpu);
> -	if ((vmcs12->vm_function_control & (1 << function)) == 0)
> +	if (((vmcs12->vm_function_control & (1 << function)) == 0) ||
> +	    WARN_ON_ONCE(function))
> +		goto fail;
> +
> +	if (!nested_cpu_has_ept(vmcs12) ||
> +	    !nested_cpu_has_eptp_switching(vmcs12))
> +		goto fail;
> +
> +	if (!vmcs12->eptp_list_address || index >= VMFUNC_EPTP_ENTRIES)
> +		goto fail;
> +
> +	page = nested_get_page(vcpu, vmcs12->eptp_list_address);
> +	if (!page)
> +		goto fail;
> +
> +	l1_eptp_list = kmap(page);
> +	if (!l1_eptp_list[index])
>  		goto fail;
> -	WARN(1, "VMCS12 VM function control should have been zero");
> +
> +	/*
> +	 * If the (L2) guest does a vmfunc to the currently
> +	 * active ept pointer, we don't have to do anything else
> +	 */
> +	if (vmcs12->ept_pointer != l1_eptp_list[index]) {
> +		kvm_mmu_unload(vcpu);
> +		/*
> +		 * TODO: Verify that guest ept satisfies vmentry prereqs
> +		 */
> +		vmcs12->ept_pointer = l1_eptp_list[index];
> +		kvm_mmu_reload(vcpu);
> +		kunmap(page);
> +	}

Missing nested_release_page_clean, here and at the "fail" label.

The TODO is a symptom of a bigger problem, so I guess it's okay for now.

Paolo

> +	return kvm_skip_emulated_instruction(vcpu);
>  
>  fail:
> +	if (page)
> +		kunmap(page);
>  	nested_vmx_vmexit(vcpu, vmx->exit_reason,
>  			  vmcs_read32(VM_EXIT_INTR_INFO),
>  			  vmcs_readl(EXIT_QUALIFICATION));
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3 v2] KVM: vmx: Enable VMFUNCs
  2017-07-06 23:03 ` [PATCH 1/3 v2] KVM: vmx: Enable VMFUNCs Bandan Das
@ 2017-07-10  8:54   ` David Hildenbrand
  2017-07-10  9:17     ` Paolo Bonzini
  0 siblings, 1 reply; 12+ messages in thread
From: David Hildenbrand @ 2017-07-10  8:54 UTC (permalink / raw)
  To: Bandan Das, kvm; +Cc: pbonzini, linux-kernel


>  /*
>   * The exit handlers return 1 if the exit was handled fully and guest execution
>   * may resume.  Otherwise they set the kvm_run parameter to indicate what needs
> @@ -7790,6 +7806,7 @@ static int (*const kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = {
>  	[EXIT_REASON_XSAVES]                  = handle_xsaves,
>  	[EXIT_REASON_XRSTORS]                 = handle_xrstors,
>  	[EXIT_REASON_PML_FULL]		      = handle_pml_full,
> +	[EXIT_REASON_VMFUNC]                  = handle_vmfunc,
>  	[EXIT_REASON_PREEMPTION_TIMER]	      = handle_preemption_timer,
>  };
>  
> @@ -8111,6 +8128,9 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu)
>  	case EXIT_REASON_PML_FULL:
>  		/* We emulate PML support to L1. */
>  		return false;
> +	case EXIT_REASON_VMFUNC:
> +		/* VM functions are emulated through L2->L0 vmexits. */
> +		return false;

This would fit better into the second patch.

>  	default:
>  		return true;
>  	}
> 


Looks good to me.

-- 

Thanks,

David

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3 v2] KVM: vmx: Enable VMFUNCs
  2017-07-10  8:54   ` David Hildenbrand
@ 2017-07-10  9:17     ` Paolo Bonzini
  2017-07-10  9:20       ` David Hildenbrand
  0 siblings, 1 reply; 12+ messages in thread
From: Paolo Bonzini @ 2017-07-10  9:17 UTC (permalink / raw)
  To: David Hildenbrand, Bandan Das, kvm; +Cc: linux-kernel



On 10/07/2017 10:54, David Hildenbrand wrote:
> 
>>  /*
>>   * The exit handlers return 1 if the exit was handled fully and guest execution
>>   * may resume.  Otherwise they set the kvm_run parameter to indicate what needs
>> @@ -7790,6 +7806,7 @@ static int (*const kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = {
>>  	[EXIT_REASON_XSAVES]                  = handle_xsaves,
>>  	[EXIT_REASON_XRSTORS]                 = handle_xrstors,
>>  	[EXIT_REASON_PML_FULL]		      = handle_pml_full,
>> +	[EXIT_REASON_VMFUNC]                  = handle_vmfunc,
>>  	[EXIT_REASON_PREEMPTION_TIMER]	      = handle_preemption_timer,
>>  };
>>  
>> @@ -8111,6 +8128,9 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu)
>>  	case EXIT_REASON_PML_FULL:
>>  		/* We emulate PML support to L1. */
>>  		return false;
>> +	case EXIT_REASON_VMFUNC:
>> +		/* VM functions are emulated through L2->L0 vmexits. */
>> +		return false;
> 
> This would fit better into the second patch.

It depends on how you reason about it.  I put it here because:

- until this patch, EXIT_REASON_VMFUNC should never be generated.  We
don't even know that it exists.

- after this patch, it should still never be generated in nested
scenarios.  However, if it did because of a bug, we're in a better place
to handle it than L1 (because as far as L1 knows, it should never be
generated).

Perhaps this is an argument in favor of changing the default case of
nested_vmx_exit_handled from true to false.

Paolo

>>  	default:
>>  		return true;
>>  	}
>>
> 
> 
> Looks good to me.
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3 v2] KVM: nVMX: Enable VMFUNC for the L1 hypervisor
  2017-07-06 23:03 ` [PATCH 2/3 v2] KVM: nVMX: Enable VMFUNC for the L1 hypervisor Bandan Das
@ 2017-07-10  9:17   ` David Hildenbrand
  2017-07-10 11:03     ` Paolo Bonzini
  0 siblings, 1 reply; 12+ messages in thread
From: David Hildenbrand @ 2017-07-10  9:17 UTC (permalink / raw)
  To: Bandan Das, kvm; +Cc: pbonzini, linux-kernel


> @@ -7752,7 +7769,29 @@ static int handle_preemption_timer(struct kvm_vcpu *vcpu)
>  
>  static int handle_vmfunc(struct kvm_vcpu *vcpu)
>  {
> -	kvm_queue_exception(vcpu, UD_VECTOR);
> +	struct vcpu_vmx *vmx = to_vmx(vcpu);
> +	struct vmcs12 *vmcs12;
> +	u32 function = vcpu->arch.regs[VCPU_REGS_RAX];
> +
> +	/*
> +	 * VMFUNC is only supported for nested guests, but we always enable the
> +	 * secondary control for simplicity; for non-nested mode, fake that we
> +	 * didn't by injecting #UD.
> +	 */
> +	if (!is_guest_mode(vcpu)) {
> +		kvm_queue_exception(vcpu, UD_VECTOR);
> +		return 1;
> +	}
> +
> +	vmcs12 = get_vmcs12(vcpu);
> +	if ((vmcs12->vm_function_control & (1 << function)) == 0)

(learned that in c, shifting beyond the type size is undefined)

Should we check for function < 64 here? (as SDM mentions)

> +		goto fail;
> +	WARN(1, "VMCS12 VM function control should have been zero");
> +
> +fail:

We will never hit the case !nested_cpu_has_vmfunc(vmcs12) here, correct?

> +	nested_vmx_vmexit(vcpu, vmx->exit_reason,
> +			  vmcs_read32(VM_EXIT_INTR_INFO),
> +			  vmcs_readl(EXIT_QUALIFICATION));
>  	return 1;
>  }
>  


-- 

Thanks,

David

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3 v2] KVM: vmx: Enable VMFUNCs
  2017-07-10  9:17     ` Paolo Bonzini
@ 2017-07-10  9:20       ` David Hildenbrand
  2017-07-10  9:21         ` Paolo Bonzini
  0 siblings, 1 reply; 12+ messages in thread
From: David Hildenbrand @ 2017-07-10  9:20 UTC (permalink / raw)
  To: Paolo Bonzini, Bandan Das, kvm; +Cc: linux-kernel

On 10.07.2017 11:17, Paolo Bonzini wrote:
> 
> 
> On 10/07/2017 10:54, David Hildenbrand wrote:
>>
>>>  /*
>>>   * The exit handlers return 1 if the exit was handled fully and guest execution
>>>   * may resume.  Otherwise they set the kvm_run parameter to indicate what needs
>>> @@ -7790,6 +7806,7 @@ static int (*const kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = {
>>>  	[EXIT_REASON_XSAVES]                  = handle_xsaves,
>>>  	[EXIT_REASON_XRSTORS]                 = handle_xrstors,
>>>  	[EXIT_REASON_PML_FULL]		      = handle_pml_full,
>>> +	[EXIT_REASON_VMFUNC]                  = handle_vmfunc,
>>>  	[EXIT_REASON_PREEMPTION_TIMER]	      = handle_preemption_timer,
>>>  };
>>>  
>>> @@ -8111,6 +8128,9 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu)
>>>  	case EXIT_REASON_PML_FULL:
>>>  		/* We emulate PML support to L1. */
>>>  		return false;
>>> +	case EXIT_REASON_VMFUNC:
>>> +		/* VM functions are emulated through L2->L0 vmexits. */
>>> +		return false;
>>
>> This would fit better into the second patch.
> 
> It depends on how you reason about it.  I put it here because:
> 
> - until this patch, EXIT_REASON_VMFUNC should never be generated.  We
> don't even know that it exists.
> 
> - after this patch, it should still never be generated in nested
> scenarios.  However, if it did because of a bug, we're in a better place
> to handle it than L1 (because as far as L1 knows, it should never be
> generated).
> 
> Perhaps this is an argument in favor of changing the default case of
> nested_vmx_exit_handled from true to false.

I remember having the same discussion before :) I still think the
default should be changed (then we don't need nVMX hunks in VMX patches
;) ).

Anyhow

Reviewed-by: David Hildenbrand <david@redhat.com>

> 
> Paolo


-- 

Thanks,

David

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3 v2] KVM: vmx: Enable VMFUNCs
  2017-07-10  9:20       ` David Hildenbrand
@ 2017-07-10  9:21         ` Paolo Bonzini
  0 siblings, 0 replies; 12+ messages in thread
From: Paolo Bonzini @ 2017-07-10  9:21 UTC (permalink / raw)
  To: David Hildenbrand, Bandan Das, kvm; +Cc: linux-kernel



On 10/07/2017 11:20, David Hildenbrand wrote:
>> Perhaps this is an argument in favor of changing the default case of
>> nested_vmx_exit_handled from true to false.
>
> I remember having the same discussion before :) I still think the
> default should be changed (then we don't need nVMX hunks in VMX patches
> ;) ).

Yeah, that was my point.  Patches welcome! ;)

Paolo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3 v2] KVM: nVMX: Enable VMFUNC for the L1 hypervisor
  2017-07-10  9:17   ` David Hildenbrand
@ 2017-07-10 11:03     ` Paolo Bonzini
  2017-07-10 12:15       ` David Hildenbrand
  0 siblings, 1 reply; 12+ messages in thread
From: Paolo Bonzini @ 2017-07-10 11:03 UTC (permalink / raw)
  To: David Hildenbrand, Bandan Das, kvm; +Cc: linux-kernel



On 10/07/2017 11:17, David Hildenbrand wrote:
>> +
>> +	vmcs12 = get_vmcs12(vcpu);
>> +	if ((vmcs12->vm_function_control & (1 << function)) == 0)
> (learned that in c, shifting beyond the type size is undefined)
> 
> Should we check for function < 64 here? (as SDM mentions)

It should be already.  The manual says:

The VMFUNC instruction causes an invalid-opcode exception (#UD) if the
“enable VM functions” VM-execution controls is 0 1 or the value of EAX
is greater than 63 (only VM functions 0–63 can be enable). Otherwise,
the instruction causes a VM exit if the bit at position EAX is 0 in the
VM-function controls (the selected VM function is not enabled).

Paolo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3 v2] KVM: nVMX: Enable VMFUNC for the L1 hypervisor
  2017-07-10 11:03     ` Paolo Bonzini
@ 2017-07-10 12:15       ` David Hildenbrand
  0 siblings, 0 replies; 12+ messages in thread
From: David Hildenbrand @ 2017-07-10 12:15 UTC (permalink / raw)
  To: Paolo Bonzini, Bandan Das, kvm; +Cc: linux-kernel

On 10.07.2017 13:03, Paolo Bonzini wrote:
> 
> 
> On 10/07/2017 11:17, David Hildenbrand wrote:
>>> +
>>> +	vmcs12 = get_vmcs12(vcpu);
>>> +	if ((vmcs12->vm_function_control & (1 << function)) == 0)
>> (learned that in c, shifting beyond the type size is undefined)
>>
>> Should we check for function < 64 here? (as SDM mentions)
> 
> It should be already.  The manual says:
> 
> The VMFUNC instruction causes an invalid-opcode exception (#UD) if the
> “enable VM functions” VM-execution controls is 0 1 or the value of EAX
> is greater than 63 (only VM functions 0–63 can be enable). Otherwise,
> the instruction causes a VM exit if the bit at position EAX is 0 in the
> VM-function controls (the selected VM function is not enabled).
> 

Thanks, I missed that paragraph.

> Paolo
> 


-- 

Thanks,

David

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2017-07-10 12:15 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-06 23:03 [PATCH 0/3 v2] Expose VMFUNC to the nested hypervisor Bandan Das
2017-07-06 23:03 ` [PATCH 1/3 v2] KVM: vmx: Enable VMFUNCs Bandan Das
2017-07-10  8:54   ` David Hildenbrand
2017-07-10  9:17     ` Paolo Bonzini
2017-07-10  9:20       ` David Hildenbrand
2017-07-10  9:21         ` Paolo Bonzini
2017-07-06 23:03 ` [PATCH 2/3 v2] KVM: nVMX: Enable VMFUNC for the L1 hypervisor Bandan Das
2017-07-10  9:17   ` David Hildenbrand
2017-07-10 11:03     ` Paolo Bonzini
2017-07-10 12:15       ` David Hildenbrand
2017-07-06 23:03 ` [PATCH 3/3 v2] KVM: nVMX: Emulate EPTP switching " Bandan Das
2017-07-07  8:30   ` Paolo Bonzini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.