linux-kernel.vger.kernel.org archive mirror
* [PATCH] kvm: x86: nVMX: maintain internal copy of current VMCS
@ 2016-07-14  0:16 David Matlack
  2016-07-14  8:33 ` Paolo Bonzini
  0 siblings, 1 reply; 3+ messages in thread
From: David Matlack @ 2016-07-14  0:16 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, jmattson, pfeiner, linux-kernel, David Matlack

KVM maintains L1's current VMCS in guest memory, at the guest physical
page identified by the argument to VMPTRLD. This makes hairy
time-of-check to time-of-use bugs possible, as VCPUs can be writing
the VMCS page in memory while KVM is emulating VMLAUNCH and
VMRESUME.

The spec documents that writing to the VMCS page while it is loaded is
"undefined". Therefore it is reasonable to load the entire VMCS into
an internal cache during VMPTRLD and ignore writes to the VMCS page
-- the guest should be using VMREAD and VMWRITE to access the current
VMCS.

To adhere to the spec, KVM should flush the current VMCS during VMPTRLD,
and the target VMCS during VMCLEAR (as given by the operand to VMCLEAR).
Since this implementation of VMCS caching only maintains the current
VMCS, VMCLEAR will only do a flush if the operand to VMCLEAR is the
current VMCS pointer.

KVM will also flush during VMXOFF, which is not mandated by the spec,
but also not in conflict with the spec.

Signed-off-by: David Matlack <dmatlack@google.com>
---
 arch/x86/kvm/vmx.c | 31 ++++++++++++++++++++++++++++---
 1 file changed, 28 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 64a79f2..640ad91 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -398,6 +398,12 @@ struct nested_vmx {
 	/* The host-usable pointer to the above */
 	struct page *current_vmcs12_page;
 	struct vmcs12 *current_vmcs12;
+	/*
+	 * Cache of the guest's VMCS, existing outside of guest memory.
+	 * Loaded from guest memory during VMPTRLD. Flushed to guest
+	 * memory during VMXOFF, VMCLEAR, VMPTRLD.
+	 */
+	struct vmcs12 *cached_vmcs12;
 	struct vmcs *current_shadow_vmcs;
 	/*
 	 * Indicates if the shadow vmcs must be updated with the
@@ -841,7 +847,7 @@ static inline short vmcs_field_to_offset(unsigned long field)
 
 static inline struct vmcs12 *get_vmcs12(struct kvm_vcpu *vcpu)
 {
-	return to_vmx(vcpu)->nested.current_vmcs12;
+	return to_vmx(vcpu)->nested.cached_vmcs12;
 }
 
 static struct page *nested_get_page(struct kvm_vcpu *vcpu, gpa_t addr)
@@ -6866,10 +6872,16 @@ static int handle_vmon(struct kvm_vcpu *vcpu)
 		return 1;
 	}
 
+	vmx->nested.cached_vmcs12 = kmalloc(VMCS12_SIZE, GFP_KERNEL);
+	if (!vmx->nested.cached_vmcs12)
+		return -ENOMEM;
+
 	if (enable_shadow_vmcs) {
 		shadow_vmcs = alloc_vmcs();
-		if (!shadow_vmcs)
+		if (!shadow_vmcs) {
+			kfree(vmx->nested.cached_vmcs12);
 			return -ENOMEM;
+		}
 		/* mark vmcs as shadow */
 		shadow_vmcs->revision_id |= (1u << 31);
 		/* init shadow vmcs */
@@ -6940,6 +6952,11 @@ static inline void nested_release_vmcs12(struct vcpu_vmx *vmx)
 		vmcs_write64(VMCS_LINK_POINTER, -1ull);
 	}
 	vmx->nested.posted_intr_nv = -1;
+
+	/* Flush VMCS12 to guest memory */
+	memcpy(vmx->nested.current_vmcs12, vmx->nested.cached_vmcs12,
+	       VMCS12_SIZE);
+
 	kunmap(vmx->nested.current_vmcs12_page);
 	nested_release_page(vmx->nested.current_vmcs12_page);
 	vmx->nested.current_vmptr = -1ull;
@@ -6960,6 +6977,7 @@ static void free_nested(struct vcpu_vmx *vmx)
 	nested_release_vmcs12(vmx);
 	if (enable_shadow_vmcs)
 		free_vmcs(vmx->nested.current_shadow_vmcs);
+	kfree(vmx->nested.cached_vmcs12);
 	/* Unpin physical memory we referred to in current vmcs02 */
 	if (vmx->nested.apic_access_page) {
 		nested_release_page(vmx->nested.apic_access_page);
@@ -7363,6 +7381,13 @@ static int handle_vmptrld(struct kvm_vcpu *vcpu)
 		vmx->nested.current_vmptr = vmptr;
 		vmx->nested.current_vmcs12 = new_vmcs12;
 		vmx->nested.current_vmcs12_page = page;
+		/*
+		 * Load VMCS12 from guest memory since it is not already
+		 * cached.
+		 */
+		memcpy(vmx->nested.cached_vmcs12,
+		       vmx->nested.current_vmcs12, VMCS12_SIZE);
+
 		if (enable_shadow_vmcs) {
 			vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL,
 				      SECONDARY_EXEC_SHADOW_VMCS);
@@ -8326,7 +8351,7 @@ static void vmx_set_apic_access_page_addr(struct kvm_vcpu *vcpu, hpa_t hpa)
 	 * the next L2->L1 exit.
 	 */
 	if (!is_guest_mode(vcpu) ||
-	    !nested_cpu_has2(vmx->nested.current_vmcs12,
+	    !nested_cpu_has2(get_vmcs12(&vmx->vcpu),
 			     SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES))
 		vmcs_write64(APIC_ACCESS_ADDR, hpa);
 }
-- 
2.8.0.rc3.226.g39d4020

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] kvm: x86: nVMX: maintain internal copy of current VMCS
  2016-07-14  0:16 [PATCH] kvm: x86: nVMX: maintain internal copy of current VMCS David Matlack
@ 2016-07-14  8:33 ` Paolo Bonzini
  2016-07-15 17:54   ` David Matlack
  0 siblings, 1 reply; 3+ messages in thread
From: Paolo Bonzini @ 2016-07-14  8:33 UTC (permalink / raw)
  To: David Matlack, kvm; +Cc: jmattson, pfeiner, linux-kernel



On 14/07/2016 02:16, David Matlack wrote:
> KVM maintains L1's current VMCS in guest memory, at the guest physical
> page identified by the argument to VMPTRLD. This makes hairy
> time-of-check to time-of-use bugs possible, as VCPUs can be writing
> the VMCS page in memory while KVM is emulating VMLAUNCH and
> VMRESUME.
> 
> The spec documents that writing to the VMCS page while it is loaded is
> "undefined". Therefore it is reasonable to load the entire VMCS into
> an internal cache during VMPTRLD and ignore writes to the VMCS page
> -- the guest should be using VMREAD and VMWRITE to access the current
> VMCS.
> 
> To adhere to the spec, KVM should flush the current VMCS during VMPTRLD,
> and the target VMCS during VMCLEAR (as given by the operand to VMCLEAR).
> Since this implementation of VMCS caching only maintains the current
> VMCS, VMCLEAR will only do a flush if the operand to VMCLEAR is the
> current VMCS pointer.
> 
> KVM will also flush during VMXOFF, which is not mandated by the spec,
> but also not in conflict with the spec.
> 
> Signed-off-by: David Matlack <dmatlack@google.com>

This is a good change.  There is another change that is possible on top:
with this change you don't need current_vmcs12/current_vmcs12_page at
all, I think.  You can just use current_vmptr and kvm_read/write_guest
to write back the VMCS12, possibly the cached variants.

Of course this would just be a small simplification, so I'm applying the
patch as is to kvm/next.

Thanks,

Paolo


* Re: [PATCH] kvm: x86: nVMX: maintain internal copy of current VMCS
  2016-07-14  8:33 ` Paolo Bonzini
@ 2016-07-15 17:54   ` David Matlack
  0 siblings, 0 replies; 3+ messages in thread
From: David Matlack @ 2016-07-15 17:54 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm list, Jim Mattson, Peter Feiner, linux-kernel

On Thu, Jul 14, 2016 at 1:33 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 14/07/2016 02:16, David Matlack wrote:
>> KVM maintains L1's current VMCS in guest memory, at the guest physical
>> page identified by the argument to VMPTRLD. This makes hairy
>> time-of-check to time-of-use bugs possible, as VCPUs can be writing
>> the VMCS page in memory while KVM is emulating VMLAUNCH and
>> VMRESUME.
>>
>> The spec documents that writing to the VMCS page while it is loaded is
>> "undefined". Therefore it is reasonable to load the entire VMCS into
>> an internal cache during VMPTRLD and ignore writes to the VMCS page
>> -- the guest should be using VMREAD and VMWRITE to access the current
>> VMCS.
>>
>> To adhere to the spec, KVM should flush the current VMCS during VMPTRLD,
>> and the target VMCS during VMCLEAR (as given by the operand to VMCLEAR).
>> Since this implementation of VMCS caching only maintains the current
>> VMCS, VMCLEAR will only do a flush if the operand to VMCLEAR is the
>> current VMCS pointer.
>>
>> KVM will also flush during VMXOFF, which is not mandated by the spec,
>> but also not in conflict with the spec.
>>
>> Signed-off-by: David Matlack <dmatlack@google.com>
>
> This is a good change.  There is another change that is possible on top:
> with this change you don't need current_vmcs12/current_vmcs12_page at
> all, I think.  You can just use current_vmptr and kvm_read/write_guest
> to write back the VMCS12, possibly the cached variants.

Good catch, I agree they can be removed.

>
> Of course this would just be a small simplification, so I'm applying the
> patch as is to kvm/next.

SGTM. Thanks for the review.

>
> Thanks,
>
> Paolo

