kvm.vger.kernel.org archive mirror
* [PATCH v2 0/4] KVM: nSVM: few random fixes
@ 2021-01-07  9:38 Maxim Levitsky
  2021-01-07  9:38 ` [PATCH v2 1/4] KVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES on nested vmexit Maxim Levitsky
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Maxim Levitsky @ 2021-01-07  9:38 UTC (permalink / raw)
  To: kvm
  Cc: Vitaly Kuznetsov, H. Peter Anvin, Sean Christopherson,
	Borislav Petkov, Paolo Bonzini, Thomas Gleixner, linux-kernel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Wanpeng Li, Joerg Roedel, Jim Mattson,
	Maxim Levitsky

This is a series of fixes to nested SVM that finally makes my KVM-on-KVM
stress test pass, and fixes various other issues/regressions.

Patch 1 fixes a recent regression in the code that delayed nested MSR bitmap
processing to the next VM entry, which started to crash L1 after my on-demand
nested state allocation patches.

The problem was that the code assumed we would still be in nested guest mode
on the next vmentry after setting the nested state, but a pending event can
cause a nested vmexit before that.

Patch 2 makes KVM restore the nested_run_pending flag on migration, which
fixes various issues, including a potentially missed L1->L2 event injection
if migration happens while a nested run is pending.

Patches 3 and 4 fix a few things I found while reviewing the nested migration
code. I don't have reproducers for them.

Thanks a lot to Sean Christopherson for the review feedback on v1 of this
series; it is fully incorporated here.

Best regards,
	Maxim Levitsky

Maxim Levitsky (4):
  KVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES on nested vmexit
  KVM: nSVM: correctly restore nested_run_pending on migration
  KVM: nSVM: always leave the nested state first on KVM_SET_NESTED_STATE
  KVM: nSVM: mark vmcb as dirty when forcibly leaving the guest mode

 arch/x86/kvm/svm/nested.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

-- 
2.26.2



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 1/4] KVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES on nested vmexit
  2021-01-07  9:38 [PATCH v2 0/4] KVM: nSVM: few random fixes Maxim Levitsky
@ 2021-01-07  9:38 ` Maxim Levitsky
  2021-01-07 17:00   ` Sean Christopherson
  2021-01-07  9:38 ` [PATCH v2 2/4] KVM: nSVM: correctly restore nested_run_pending on migration Maxim Levitsky
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Maxim Levitsky @ 2021-01-07  9:38 UTC (permalink / raw)
  To: kvm
  Cc: Vitaly Kuznetsov, H. Peter Anvin, Sean Christopherson,
	Borislav Petkov, Paolo Bonzini, Thomas Gleixner, linux-kernel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Wanpeng Li, Joerg Roedel, Jim Mattson,
	Maxim Levitsky

It is possible to exit nested guest mode (entered via svm_set_nested_state)
prior to the first VM entry into it, e.g. due to a pending event, if the
nested run was not pending during migration.

In this case we must not switch to the nested MSR permission bitmap.
Also add a warning to catch similar cases in the future.
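
The interaction is easier to see in a tiny userspace model (a sketch only:
the struct, function names, and plain bitmap below are illustrative stand-ins
for KVM's atomic vcpu->requests bookkeeping, not kernel code):

```c
#include <assert.h>

/* Illustrative stand-in for KVM_REQ_GET_NESTED_STATE_PAGES. */
#define MODEL_REQ_GET_NESTED_STATE_PAGES (1UL << 0)

struct vcpu_model {
	unsigned long requests;	/* models the per-vCPU request bitmap */
	int guest_mode;		/* models is_guest_mode() */
};

/* KVM_SET_NESTED_STATE enters guest mode and defers loading the
 * nested state pages (e.g. the MSR bitmap) to the next KVM_RUN. */
void model_set_nested_state(struct vcpu_model *v)
{
	v->guest_mode = 1;
	v->requests |= MODEL_REQ_GET_NESTED_STATE_PAGES;
}

/* The fix: a nested vmexit that happens before the next KVM_RUN
 * (e.g. due to a pending event) must cancel the deferred request,
 * because the vCPU is back in L1 mode. */
void model_nested_vmexit(struct vcpu_model *v)
{
	v->guest_mode = 0;
	v->requests &= ~MODEL_REQ_GET_NESTED_STATE_PAGES;
}

int model_request_pending(const struct vcpu_model *v)
{
	return !!(v->requests & MODEL_REQ_GET_NESTED_STATE_PAGES);
}
```

With the kvm_clear_request() call added by this patch, the deferred request
cannot survive a nested vmexit that occurs before the next KVM_RUN.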

Fixes: a7d5c7ce41ac1 ("KVM: nSVM: delay MSR permission processing to first nested VM run")

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/svm/nested.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index b0b667456b2e7..ee4f2082ad1bd 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -199,6 +199,10 @@ static bool nested_svm_vmrun_msrpm(struct vcpu_svm *svm)
 static bool svm_get_nested_state_pages(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
+
+	if (WARN_ON_ONCE(!is_guest_mode(&svm->vcpu)))
+		return false;
+
 	if (!nested_svm_vmrun_msrpm(svm)) {
 		vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
 		vcpu->run->internal.suberror =
@@ -595,6 +599,8 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
 	svm->nested.vmcb12_gpa = 0;
 	WARN_ON_ONCE(svm->nested.nested_run_pending);
 
+	kvm_clear_request(KVM_REQ_GET_NESTED_STATE_PAGES, &svm->vcpu);
+
 	/* in case we halted in L2 */
 	svm->vcpu.arch.mp_state = KVM_MP_STATE_RUNNABLE;
 
-- 
2.26.2



* [PATCH v2 2/4] KVM: nSVM: correctly restore nested_run_pending on migration
  2021-01-07  9:38 [PATCH v2 0/4] KVM: nSVM: few random fixes Maxim Levitsky
  2021-01-07  9:38 ` [PATCH v2 1/4] KVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES on nested vmexit Maxim Levitsky
@ 2021-01-07  9:38 ` Maxim Levitsky
  2021-01-07 18:03   ` Paolo Bonzini
  2021-01-07  9:38 ` [PATCH v2 3/4] KVM: nSVM: always leave the nested state first on KVM_SET_NESTED_STATE Maxim Levitsky
  2021-01-07  9:38 ` [PATCH v2 4/4] KVM: nSVM: mark vmcb as dirty when forcibly leaving the guest mode Maxim Levitsky
  3 siblings, 1 reply; 13+ messages in thread
From: Maxim Levitsky @ 2021-01-07  9:38 UTC (permalink / raw)
  To: kvm
  Cc: Vitaly Kuznetsov, H. Peter Anvin, Sean Christopherson,
	Borislav Petkov, Paolo Bonzini, Thomas Gleixner, linux-kernel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Wanpeng Li, Joerg Roedel, Jim Mattson,
	Maxim Levitsky

The code to store nested_run_pending on migration exists, but no code was
restoring it.

One of the side effects of fixing this is that L1->L2 injected events
are no longer lost when migration happens with a nested run pending.
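
The save/restore asymmetry being fixed can be sketched as a round trip
(hypothetical model types; the real flag is KVM_STATE_NESTED_RUN_PENDING in
struct kvm_nested_state, and the bit value below is only illustrative):

```c
#include <assert.h>

/* Illustrative stand-in for KVM_STATE_NESTED_RUN_PENDING. */
#define MODEL_NESTED_RUN_PENDING (1u << 1)

struct nested_model {
	int nested_run_pending;
};

/* GET side already existed: fold the flag into the migration blob. */
unsigned int model_get_state_flags(const struct nested_model *n)
{
	return n->nested_run_pending ? MODEL_NESTED_RUN_PENDING : 0;
}

/* SET side is what this patch adds: restore the flag on the target,
 * mirroring svm->nested.nested_run_pending = !!(flags & ...). */
void model_set_state(struct nested_model *n, unsigned int flags)
{
	n->nested_run_pending = !!(flags & MODEL_NESTED_RUN_PENDING);
}
```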

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/svm/nested.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index ee4f2082ad1bd..cc3130ab612e5 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1200,6 +1200,10 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
 	 * in the registers, the save area of the nested state instead
 	 * contains saved L1 state.
 	 */
+
+	svm->nested.nested_run_pending =
+		!!(kvm_state->flags & KVM_STATE_NESTED_RUN_PENDING);
+
 	copy_vmcb_control_area(&hsave->control, &svm->vmcb->control);
 	hsave->save = *save;
 
-- 
2.26.2



* [PATCH v2 3/4] KVM: nSVM: always leave the nested state first on KVM_SET_NESTED_STATE
  2021-01-07  9:38 [PATCH v2 0/4] KVM: nSVM: few random fixes Maxim Levitsky
  2021-01-07  9:38 ` [PATCH v2 1/4] KVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES on nested vmexit Maxim Levitsky
  2021-01-07  9:38 ` [PATCH v2 2/4] KVM: nSVM: correctly restore nested_run_pending on migration Maxim Levitsky
@ 2021-01-07  9:38 ` Maxim Levitsky
  2021-01-07  9:38 ` [PATCH v2 4/4] KVM: nSVM: mark vmcb as dirty when forcibly leaving the guest mode Maxim Levitsky
  3 siblings, 0 replies; 13+ messages in thread
From: Maxim Levitsky @ 2021-01-07  9:38 UTC (permalink / raw)
  To: kvm
  Cc: Vitaly Kuznetsov, H. Peter Anvin, Sean Christopherson,
	Borislav Petkov, Paolo Bonzini, Thomas Gleixner, linux-kernel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Wanpeng Li, Joerg Roedel, Jim Mattson,
	Maxim Levitsky

This should prevent bad things from happening if userspace calls
KVM_SET_NESTED_STATE twice.
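
A minimal model of why leaving first makes the ioctl safe to repeat (names
are illustrative; svm_leave_nested() in the real code also restores L1 state
and clears pending requests):

```c
#include <assert.h>

struct nested_model {
	int guest_mode;
	int loaded;	/* stand-in for state installed while entering L2 */
};

/* Idempotent: a no-op unless we are currently in nested guest mode. */
void model_leave_nested(struct nested_model *n)
{
	if (n->guest_mode) {
		n->guest_mode = 0;
		n->loaded = 0;
	}
}

/* The patch: always drop any previously loaded nested state before
 * installing the new one, so a second call cannot stack state on
 * top of the first. */
void model_set_nested_state(struct nested_model *n)
{
	model_leave_nested(n);
	n->guest_mode = 1;
	n->loaded++;
}
```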

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/svm/nested.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index cc3130ab612e5..e91d40c8d8c91 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1200,6 +1200,7 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
 	 * in the registers, the save area of the nested state instead
 	 * contains saved L1 state.
 	 */
+	svm_leave_nested(svm);
 
 	svm->nested.nested_run_pending =
 		!!(kvm_state->flags & KVM_STATE_NESTED_RUN_PENDING);
-- 
2.26.2



* [PATCH v2 4/4] KVM: nSVM: mark vmcb as dirty when forcibly leaving the guest mode
  2021-01-07  9:38 [PATCH v2 0/4] KVM: nSVM: few random fixes Maxim Levitsky
                   ` (2 preceding siblings ...)
  2021-01-07  9:38 ` [PATCH v2 3/4] KVM: nSVM: always leave the nested state first on KVM_SET_NESTED_STATE Maxim Levitsky
@ 2021-01-07  9:38 ` Maxim Levitsky
  3 siblings, 0 replies; 13+ messages in thread
From: Maxim Levitsky @ 2021-01-07  9:38 UTC (permalink / raw)
  To: kvm
  Cc: Vitaly Kuznetsov, H. Peter Anvin, Sean Christopherson,
	Borislav Petkov, Paolo Bonzini, Thomas Gleixner, linux-kernel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Wanpeng Li, Joerg Roedel, Jim Mattson,
	Maxim Levitsky

We overwrite most of the vmcb fields when forcibly leaving guest mode,
so we must mark the vmcb as dirty.
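
The rationale can be modeled with the VMCB clean-bits mechanism in mind (a
sketch; on real hardware each clean bit covers a group of VMCB fields that
the CPU may cache across VMRUNs, so a wholesale rewrite must clear them all):

```c
#include <assert.h>
#include <stdint.h>

struct vmcb_model {
	uint32_t clean;	/* set bit => hardware may use a cached copy
			 * of that VMCB field group */
};

/* Models vmcb_mark_all_dirty(): no field group may be assumed cached. */
void model_mark_all_dirty(struct vmcb_model *v)
{
	v->clean = 0;
}

/* Models svm_leave_nested(): copy_vmcb_control_area() rewrites most
 * of the VMCB behind the CPU's back, so nothing may be treated as
 * clean afterwards. */
void model_leave_guest_mode(struct vmcb_model *v)
{
	model_mark_all_dirty(v);
}
```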

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/svm/nested.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index e91d40c8d8c91..c340fbad88566 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -760,6 +760,7 @@ void svm_leave_nested(struct vcpu_svm *svm)
 		leave_guest_mode(&svm->vcpu);
 		copy_vmcb_control_area(&vmcb->control, &hsave->control);
 		nested_svm_uninit_mmu_context(&svm->vcpu);
+		vmcb_mark_all_dirty(svm->vmcb);
 	}
 
 	kvm_clear_request(KVM_REQ_GET_NESTED_STATE_PAGES, &svm->vcpu);
-- 
2.26.2



* Re: [PATCH v2 1/4] KVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES on nested vmexit
  2021-01-07  9:38 ` [PATCH v2 1/4] KVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES on nested vmexit Maxim Levitsky
@ 2021-01-07 17:00   ` Sean Christopherson
  2021-01-07 17:51     ` Paolo Bonzini
  0 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2021-01-07 17:00 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: kvm, Vitaly Kuznetsov, H. Peter Anvin, Sean Christopherson,
	Borislav Petkov, Paolo Bonzini, Thomas Gleixner, linux-kernel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Wanpeng Li, Joerg Roedel, Jim Mattson

On Thu, Jan 07, 2021, Maxim Levitsky wrote:
> It is possible to exit nested guest mode (entered via svm_set_nested_state)
> prior to the first VM entry into it, e.g. due to a pending event, if the
> nested run was not pending during migration.

Ugh, I assume this is due to one of the "premature" nested_ops->check_events()
calls that are necessitated by the event mess?  I'm guessing kvm_vcpu_running()
is the culprit?

If my assumption is correct, this bug affects nVMX as well.  Rather than clear
the request blindly on any nested VM-Exit, what about something like the
following?  IMO, KVM really shouldn't be synthesizing a nested VM-Exit before it
processes pending requests, but unfortunately the nested_ops->check_events()
mess isn't easily fixed.  This will at least limit the mess, e.g. with this we'd
get a WARN if KVM_REQ_GET_NESTED_STATE_PAGES is set after some other VM-Exit.

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 3136e05831cf..f44e6f7a0c9b 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2857,17 +2857,15 @@ void kvm_apic_accept_events(struct kvm_vcpu *vcpu)
        if (!pe)
                return;

-       if (is_guest_mode(vcpu)) {
-               r = kvm_x86_ops.nested_ops->check_events(vcpu);
-               if (r < 0)
-                       return;
-               /*
-                * If an event has happened and caused a vmexit,
-                * we know INITs are latched and therefore
-                * we will not incorrectly deliver an APIC
-                * event instead of a vmexit.
-                */
-       }
+       r = kvm_nested_check_events(vcpu);
+       if (r < 0)
+               return;
+
+       /*
+        * If an event has happened and caused a vmexit, we know INITs are
+        * latched and therefore we will not incorrectly deliver an APIC
+        * event instead of a vmexit.
+        */

        /*
         * INITs are latched while CPU is in specific states
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3f7c1fc7a3ce..b0f172d13cab 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8223,6 +8223,25 @@ static void update_cr8_intercept(struct kvm_vcpu *vcpu)
        kvm_x86_ops.update_cr8_intercept(vcpu, tpr, max_irr);
 }

+int kvm_nested_check_events(struct kvm_vcpu *vcpu)
+{
+       int r;
+
+       if (!is_guest_mode(vcpu))
+               return 0;
+
+       r = kvm_x86_ops.nested_ops->check_events(vcpu);
+
+       /*
+        * Clear nested-specific requests if checking nested events triggered a
+        * VM-Exit, they'll be re-requested on nested VM-Enter (if necessary).
+        */
+       if (!is_guest_mode(vcpu))
+               kvm_clear_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
+
+       return r;
+}
+
 static void inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit)
 {
        int r;
@@ -8267,11 +8286,9 @@ static void inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit
         * from L2 to L1 due to pending L1 events which require exit
         * from L2 to L1.
         */
-       if (is_guest_mode(vcpu)) {
-               r = kvm_x86_ops.nested_ops->check_events(vcpu);
-               if (r < 0)
-                       goto busy;
-       }
+       r = kvm_nested_check_events(vcpu);
+       if (r < 0)
+               goto busy;

        /* try to inject new event if pending */
        if (vcpu->arch.exception.pending) {
@@ -8789,7 +8806,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)

        if (kvm_request_pending(vcpu)) {
                if (kvm_check_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu)) {
-                       if (unlikely(!kvm_x86_ops.nested_ops->get_nested_state_pages(vcpu))) {
+                       if (!WARN_ON(!is_guest_mode(vcpu)) &&
+                           unlikely(!kvm_x86_ops.nested_ops->get_nested_state_pages(vcpu))) {
                                r = 0;
                                goto out;
                        }
@@ -9111,8 +9129,7 @@ static inline int vcpu_block(struct kvm *kvm, struct kvm_vcpu *vcpu)

 static inline bool kvm_vcpu_running(struct kvm_vcpu *vcpu)
 {
-       if (is_guest_mode(vcpu))
-               kvm_x86_ops.nested_ops->check_events(vcpu);
+       (void)kvm_nested_check_events(vcpu);

        return (vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE &&
                !vcpu->arch.apf.halted);
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index c5ee0f5ce0f1..dce61fda9c5e 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -247,6 +247,8 @@ static inline bool kvm_vcpu_latch_init(struct kvm_vcpu *vcpu)
        return is_smm(vcpu) || kvm_x86_ops.apic_init_signal_blocked(vcpu);
 }

+int kvm_nested_check_events(struct kvm_vcpu *vcpu);
+
 void kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip);

 void kvm_write_tsc(struct kvm_vcpu *vcpu, struct msr_data *msr);


> In this case we must not switch to the nested MSR permission bitmap.
> Also add a warning to catch similar cases in the future.
> 
> Fixes: a7d5c7ce41ac1 ("KVM: nSVM: delay MSR permission processing to first nested VM run")
> 
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  arch/x86/kvm/svm/nested.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index b0b667456b2e7..ee4f2082ad1bd 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -199,6 +199,10 @@ static bool nested_svm_vmrun_msrpm(struct vcpu_svm *svm)
>  static bool svm_get_nested_state_pages(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_svm *svm = to_svm(vcpu);
> +
> +	if (WARN_ON_ONCE(!is_guest_mode(&svm->vcpu)))
> +		return false;
> +
>  	if (!nested_svm_vmrun_msrpm(svm)) {
>  		vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
>  		vcpu->run->internal.suberror =
> @@ -595,6 +599,8 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
>  	svm->nested.vmcb12_gpa = 0;
>  	WARN_ON_ONCE(svm->nested.nested_run_pending);
>  
> +	kvm_clear_request(KVM_REQ_GET_NESTED_STATE_PAGES, &svm->vcpu);
> +
>  	/* in case we halted in L2 */
>  	svm->vcpu.arch.mp_state = KVM_MP_STATE_RUNNABLE;
>  
> -- 
> 2.26.2
> 


* Re: [PATCH v2 1/4] KVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES on nested vmexit
  2021-01-07 17:00   ` Sean Christopherson
@ 2021-01-07 17:51     ` Paolo Bonzini
  2021-01-07 17:59       ` Paolo Bonzini
                         ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Paolo Bonzini @ 2021-01-07 17:51 UTC (permalink / raw)
  To: Sean Christopherson, Maxim Levitsky
  Cc: kvm, Vitaly Kuznetsov, H. Peter Anvin, Sean Christopherson,
	Borislav Petkov, Thomas Gleixner, linux-kernel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Wanpeng Li, Joerg Roedel, Jim Mattson

On 07/01/21 18:00, Sean Christopherson wrote:
> Ugh, I assume this is due to one of the "premature" nested_ops->check_events()
> calls that are necessitated by the event mess?  I'm guessing kvm_vcpu_running()
> is the culprit?
> 
> If my assumption is correct, this bug affects nVMX as well.

Yes, though it may be latent.  For SVM, it was latent until we started 
allocating svm->nested on demand.

> Rather than clear the request blindly on any nested VM-Exit, what
> about something like the following?

I think your patch is overkill, KVM_REQ_GET_NESTED_STATE_PAGES is only 
set from KVM_SET_NESTED_STATE so it cannot happen while the VM runs.

Something like this is small enough and works well.

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index a622e63739b4..cb4c6ee10029 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -595,6 +596,8 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
  	svm->nested.vmcb12_gpa = 0;
  	WARN_ON_ONCE(svm->nested.nested_run_pending);

+	kvm_clear_request(KVM_REQ_GET_NESTED_STATE_PAGES, &svm->vcpu);
+
  	/* in case we halted in L2 */
  	svm->vcpu.arch.mp_state = KVM_MP_STATE_RUNNABLE;

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index e2f26564a12d..0fbb46990dfc 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4442,6 +4442,8 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
  	/* trying to cancel vmlaunch/vmresume is a bug */
  	WARN_ON_ONCE(vmx->nested.nested_run_pending);

+	kvm_clear_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
+
  	/* Service the TLB flush request for L2 before switching to L1. */
  	if (kvm_check_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu))
  		kvm_vcpu_flush_tlb_current(vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3f7c1fc7a3ce..b7e784b5489c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8789,7 +8789,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)

  	if (kvm_request_pending(vcpu)) {
  		if (kvm_check_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu)) {
-			if (unlikely(!kvm_x86_ops.nested_ops->get_nested_state_pages(vcpu))) {
+			if (WARN_ON_ONCE(!is_guest_mode(vcpu)))
+				;
+			else if (unlikely(!kvm_x86_ops.nested_ops->get_nested_state_pages(vcpu))) {
  				r = 0;
  				goto out;
  			}



* Re: [PATCH v2 1/4] KVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES on nested vmexit
  2021-01-07 17:51     ` Paolo Bonzini
@ 2021-01-07 17:59       ` Paolo Bonzini
  2021-01-07 18:03       ` Maxim Levitsky
  2021-01-07 19:12       ` Sean Christopherson
  2 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2021-01-07 17:59 UTC (permalink / raw)
  To: Sean Christopherson, Maxim Levitsky
  Cc: kvm, Vitaly Kuznetsov, H. Peter Anvin, Sean Christopherson,
	Borislav Petkov, Thomas Gleixner, linux-kernel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Wanpeng Li, Joerg Roedel, Jim Mattson

On 07/01/21 18:51, Paolo Bonzini wrote:
> On 07/01/21 18:00, Sean Christopherson wrote:
>> Ugh, I assume this is due to one of the "premature" 
>> nested_ops->check_events()
>> calls that are necessitated by the event mess?  I'm guessing 
>> kvm_vcpu_running()
>> is the culprit?
>>
>> If my assumption is correct, this bug affects nVMX as well.
> 
> Yes, though it may be latent.  For SVM it was until we started 
> allocating svm->nested on demand.
> 
>> Rather than clear the request blindly on any nested VM-Exit, what
>> about something like the following?
> 
> I think your patch is overkill, KVM_REQ_GET_NESTED_STATE_PAGES is only 
> set from KVM_SET_NESTED_STATE so it cannot happen while the VM runs.

... and when leaving SMM.  But in either case, there cannot be something 
else causing a nested vmexit before the request is set, because SMM does 
not support VMX operation.  So I still don't think that it justifies the 
extra code and indirection.

Paolo


* Re: [PATCH v2 1/4] KVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES on nested vmexit
  2021-01-07 17:51     ` Paolo Bonzini
  2021-01-07 17:59       ` Paolo Bonzini
@ 2021-01-07 18:03       ` Maxim Levitsky
  2021-01-07 19:12       ` Sean Christopherson
  2 siblings, 0 replies; 13+ messages in thread
From: Maxim Levitsky @ 2021-01-07 18:03 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: kvm, Vitaly Kuznetsov, H. Peter Anvin, Sean Christopherson,
	Borislav Petkov, Thomas Gleixner, linux-kernel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Wanpeng Li, Joerg Roedel, Jim Mattson

On Thu, 2021-01-07 at 18:51 +0100, Paolo Bonzini wrote:
> On 07/01/21 18:00, Sean Christopherson wrote:
> > Ugh, I assume this is due to one of the "premature" nested_ops->check_events()
> > calls that are necessitated by the event mess?  I'm guessing kvm_vcpu_running()
> > is the culprit?
> > 
> > If my assumption is correct, this bug affects nVMX as well.
> 
> Yes, though it may be latent.  For SVM it was until we started 
> allocating svm->nested on demand.
> 
> > Rather than clear the request blindly on any nested VM-Exit, what
> > about something like the following?
> 
> I think your patch is overkill, KVM_REQ_GET_NESTED_STATE_PAGES is only 
> set from KVM_SET_NESTED_STATE so it cannot happen while the VM runs.

Note that I didn't include the same fix for VMX, because it uses a separate
vmcs for the guest, which has its own MSR bitmap, so in theory this shouldn't
be needed; but it won't hurt.

I'll test whether canceling KVM_REQ_GET_NESTED_STATE_PAGES makes any
difference with regard to the nested migration crashes I am seeing on VMX.

Best regards,
	Maxim Levitsky

> 
> Something like this is small enough and works well.
> 
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index a622e63739b4..cb4c6ee10029 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -595,6 +596,8 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
>   	svm->nested.vmcb12_gpa = 0;
>   	WARN_ON_ONCE(svm->nested.nested_run_pending);
> 
> +	kvm_clear_request(KVM_REQ_GET_NESTED_STATE_PAGES, &svm->vcpu);
> +
>   	/* in case we halted in L2 */
>   	svm->vcpu.arch.mp_state = KVM_MP_STATE_RUNNABLE;
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index e2f26564a12d..0fbb46990dfc 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -4442,6 +4442,8 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
>   	/* trying to cancel vmlaunch/vmresume is a bug */
>   	WARN_ON_ONCE(vmx->nested.nested_run_pending);
> 
> +	kvm_clear_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
> +
>   	/* Service the TLB flush request for L2 before switching to L1. */
>   	if (kvm_check_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu))
>   		kvm_vcpu_flush_tlb_current(vcpu);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 3f7c1fc7a3ce..b7e784b5489c 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -8789,7 +8789,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> 
>   	if (kvm_request_pending(vcpu)) {
>   		if (kvm_check_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu)) {
> -			if (unlikely(!kvm_x86_ops.nested_ops->get_nested_state_pages(vcpu))) {
> +			if (WARN_ON_ONCE(!is_guest_mode(vcpu)))
> +				;
> +			else if (unlikely(!kvm_x86_ops.nested_ops->get_nested_state_pages(vcpu))) {
>   				r = 0;
>   				goto out;
>   			}
> 




* Re: [PATCH v2 2/4] KVM: nSVM: correctly restore nested_run_pending on migration
  2021-01-07  9:38 ` [PATCH v2 2/4] KVM: nSVM: correctly restore nested_run_pending on migration Maxim Levitsky
@ 2021-01-07 18:03   ` Paolo Bonzini
  2021-01-07 20:19     ` Sean Christopherson
  0 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2021-01-07 18:03 UTC (permalink / raw)
  To: Maxim Levitsky, kvm
  Cc: Vitaly Kuznetsov, H. Peter Anvin, Borislav Petkov,
	Thomas Gleixner, linux-kernel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Wanpeng Li, Joerg Roedel, Jim Mattson,
	Sean Christopherson

On 07/01/21 10:38, Maxim Levitsky wrote:
> The code to store nested_run_pending on migration exists, but no code was
> restoring it.
> 
> One of the side effects of fixing this is that L1->L2 injected events
> are no longer lost when migration happens with a nested run pending.
> 
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>   arch/x86/kvm/svm/nested.c | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index ee4f2082ad1bd..cc3130ab612e5 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -1200,6 +1200,10 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
>   	 * in the registers, the save area of the nested state instead
>   	 * contains saved L1 state.
>   	 */
> +
> +	svm->nested.nested_run_pending =
> +		!!(kvm_state->flags & KVM_STATE_NESTED_RUN_PENDING);
> +
>   	copy_vmcb_control_area(&hsave->control, &svm->vmcb->control);
>   	hsave->save = *save;
>   
> 

Nice fix and we need to do it anyway.

That said, the v1 change had some appeal to it.  In the VMX case (if 
properly implemented) it would allow removing the weird 
nested_run_pending case from prepare_vmcs02_early.  I think it's a 
valuable invariant that there are no events in the VMCS after each 
KVM_RUN iteration, and this special case is breaking the invariant.

Paolo


* Re: [PATCH v2 1/4] KVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES on nested vmexit
  2021-01-07 17:51     ` Paolo Bonzini
  2021-01-07 17:59       ` Paolo Bonzini
  2021-01-07 18:03       ` Maxim Levitsky
@ 2021-01-07 19:12       ` Sean Christopherson
  2 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2021-01-07 19:12 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Maxim Levitsky, kvm, Vitaly Kuznetsov, H. Peter Anvin,
	Sean Christopherson, Borislav Petkov, Thomas Gleixner,
	linux-kernel, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Wanpeng Li, Joerg Roedel, Jim Mattson

On Thu, Jan 07, 2021, Paolo Bonzini wrote:
> On 07/01/21 18:00, Sean Christopherson wrote:
> > Ugh, I assume this is due to one of the "premature" nested_ops->check_events()
> > calls that are necessitated by the event mess?  I'm guessing kvm_vcpu_running()
> > is the culprit?
> > 
> > If my assumption is correct, this bug affects nVMX as well.
> 
> Yes, though it may be latent.  For SVM it was until we started allocating
> svm->nested on demand.
> 
> > Rather than clear the request blindly on any nested VM-Exit, what
> > about something like the following?
> 
> I think your patch is overkill, KVM_REQ_GET_NESTED_STATE_PAGES is only set
> from KVM_SET_NESTED_STATE so it cannot happen while the VM runs.

Yeah, which is why I was hoping we could avoid clearing the request on every
nested exit.

> Something like this is small enough and works well.

I've no argument against it working, rather that I dislike clearing the request
on every exit.  Except for the ->check_events() case, hitting the scenario where
there's a pending request at the time of nested VM-Exit would ideally be treated
as a KVM bug.

On the other hand, clearing nested-specific request on nested VM-Exit is
logically sound, so I guess I'm ok with the minimal patch.


* Re: [PATCH v2 2/4] KVM: nSVM: correctly restore nested_run_pending on migration
  2021-01-07 18:03   ` Paolo Bonzini
@ 2021-01-07 20:19     ` Sean Christopherson
  2021-01-07 21:05       ` Paolo Bonzini
  0 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2021-01-07 20:19 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Maxim Levitsky, kvm, Vitaly Kuznetsov, H. Peter Anvin,
	Borislav Petkov, Thomas Gleixner, linux-kernel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Wanpeng Li, Joerg Roedel, Jim Mattson

On Thu, Jan 07, 2021, Paolo Bonzini wrote:
> On 07/01/21 10:38, Maxim Levitsky wrote:
> > The code to store nested_run_pending on migration exists, but no code was
> > restoring it.
> > 
> > One of the side effects of fixing this is that L1->L2 injected events
> > are no longer lost when migration happens with a nested run pending.
> > 
> > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > ---
> >   arch/x86/kvm/svm/nested.c | 4 ++++
> >   1 file changed, 4 insertions(+)
> > 
> > diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> > index ee4f2082ad1bd..cc3130ab612e5 100644
> > --- a/arch/x86/kvm/svm/nested.c
> > +++ b/arch/x86/kvm/svm/nested.c
> > @@ -1200,6 +1200,10 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
> >   	 * in the registers, the save area of the nested state instead
> >   	 * contains saved L1 state.
> >   	 */
> > +
> > +	svm->nested.nested_run_pending =
> > +		!!(kvm_state->flags & KVM_STATE_NESTED_RUN_PENDING);
> > +
> >   	copy_vmcb_control_area(&hsave->control, &svm->vmcb->control);
> >   	hsave->save = *save;
> > 
> 
> Nice fix and we need to do it anyway.
> 
> That said, the v1 change had some appeal to it.

Which v1 change are you referring to?

> In the VMX case (if properly implemented) it would allow removing the weird
> nested_run_pending case from prepare_vmcs02_early.  I think it's a valuable
> invariant that there are no events in the VMCS after each KVM_RUN iteration,
> and this special case is breaking the invariant.

Hmm, as weird as that code is, I think it's actually the most architecturally
correct behavior.  Technically, the clearing of VM_ENTRY_INTR_INFO.VALID
shouldn't be visible in vmcs12 until a nested VM-Exit occurs, e.g. copying the
vmcs02 value to vmcs12 in vmx_get_nested_state() would work, but it's wrong at
the same time.  Ditto for L1 (or L2) writing vmcs12.VM_ENTRY_INTR_INFO while L2
is running (ignoring the SDM's very clear warning that doing so is bad); from
L1/L2's perspective, there is no VM-Entry so writing vmcs12.VM_ENTRY_INTR_INFO
should never generate an event in L2, even on CPUs without VMCS caching.


* Re: [PATCH v2 2/4] KVM: nSVM: correctly restore nested_run_pending on migration
  2021-01-07 20:19     ` Sean Christopherson
@ 2021-01-07 21:05       ` Paolo Bonzini
  0 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2021-01-07 21:05 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Maxim Levitsky, kvm, Vitaly Kuznetsov, H. Peter Anvin,
	Borislav Petkov, Thomas Gleixner, linux-kernel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Wanpeng Li, Joerg Roedel, Jim Mattson

On 07/01/21 21:19, Sean Christopherson wrote:
>> That said, the v1 change had some appeal to it.
> 
> Which v1 change are you referring to?

Moving the to-be-injected event from eventinj to vcpu->arch, and from 
there to vmcb02 on the next vmentry's inject_pending_event.

>> In the VMX case (if properly implemented) it would allow removing the weird
>> nested_run_pending case from prepare_vmcs02_early.  I think it's a valuable
>> invariant that there are no events in the VMCS after each KVM_RUN iteration,
>> and this special case is breaking the invariant.
>
> Hmm, as weird as that code is, I think it's actually the most architecturally
> correct behavior.

I was referring to the "then" branch therein. :)

Paolo


