linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] KVM: nVMX: Fix pending events injection
@ 2017-02-26  8:46 Wanpeng Li
  2017-02-27 10:19 ` Paolo Bonzini
  0 siblings, 1 reply; 4+ messages in thread
From: Wanpeng Li @ 2017-02-26  8:46 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Radim Krčmář, Wanpeng Li, Jan Kiszka

From: Wanpeng Li <wanpeng.li@hotmail.com>

L2 fails to boot on a non-APICv box dues to 'commit 0ad3bed6c5ec 
("kvm: nVMX: move nested events check to kvm_vcpu_running")'

KVM internal error. Suberror: 3
extra data[0]: 800000ef
extra data[1]: 1
RAX=0000000000000000 RBX=ffffffff81f36140 RCX=0000000000000000 RDX=0000000000000000
RSI=0000000000000000 RDI=0000000000000000 RBP=ffff88007c92fe90 RSP=ffff88007c92fe90
R8 =ffff88007fccdca0 R9 =0000000000000000 R10=00000000fffedb3d R11=0000000000000000
R12=0000000000000003 R13=0000000000000000 R14=0000000000000000 R15=ffff88007c92c000
RIP=ffffffff810645e6 RFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 0000000000000000 ffffffff 00c00000
CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0000 0000000000000000 ffffffff 00c00000
DS =0000 0000000000000000 ffffffff 00c00000
FS =0000 0000000000000000 ffffffff 00c00000
GS =0000 ffff88007fcc0000 ffffffff 00c00000
LDT=0000 0000000000000000 ffffffff 00c00000
TR =0040 ffff88007fcd4200 00002087 00008b00 DPL=0 TSS64-busy
GDT=     ffff88007fcc9000 0000007f
IDT=     ffffffffff578000 00000fff
CR0=80050033 CR2=00000000ffffffff CR3=0000000001e0a000 CR4=003406e0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000fffe0ff0 DR7=0000000000000400
EFER=0000000000000d01

We should try to reinject previous events if any before trying to inject 
new event if pending. If vmexit is triggered by L2 guest and L0 interested
in, we should reinject IDT-vectoring info to L2 through vmcs02 if any,
otherwise, we can consider new IRQs/NMIs which can be injected and call 
nested events callback to switch from L2 to L1 if needed and inject the 
proper vmexit events. However, 'commit 0ad3bed6c5ec ("kvm: nVMX: move 
nested events check to kvm_vcpu_running")' results in the handle events 
order reversely on non-APICv box. This patch fixes it by checking nested 
events if there is no KVM_REQ_EVENT since APICv interrupt injection doesn't 
use KVM_REQ_EVENT any more.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
 arch/x86/kvm/x86.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b2a4b11..74fc47b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6843,7 +6843,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 			update_cr8_intercept(vcpu);
 			kvm_lapic_sync_to_vapic(vcpu);
 		}
-	}
+	} else if (is_guest_mode(vcpu) && kvm_x86_ops->check_nested_events)
+		kvm_x86_ops->check_nested_events(vcpu, false);
 
 	r = kvm_mmu_reload(vcpu);
 	if (unlikely(r)) {
@@ -7025,9 +7026,6 @@ static inline int vcpu_block(struct kvm *kvm, struct kvm_vcpu *vcpu)
 
 static inline bool kvm_vcpu_running(struct kvm_vcpu *vcpu)
 {
-	if (is_guest_mode(vcpu) && kvm_x86_ops->check_nested_events)
-		kvm_x86_ops->check_nested_events(vcpu, false);
-
 	return (vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE &&
 		!vcpu->arch.apf.halted);
 }
@@ -8397,6 +8395,9 @@ static inline bool kvm_vcpu_has_events(struct kvm_vcpu *vcpu)
 
 int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
 {
+	if (is_guest_mode(vcpu) && kvm_x86_ops->check_nested_events)
+		kvm_x86_ops->check_nested_events(vcpu, false);
+
 	return kvm_vcpu_running(vcpu) || kvm_vcpu_has_events(vcpu);
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] KVM: nVMX: Fix pending events injection
  2017-02-26  8:46 [PATCH] KVM: nVMX: Fix pending events injection Wanpeng Li
@ 2017-02-27 10:19 ` Paolo Bonzini
  2017-02-27 11:35   ` Wanpeng Li
  0 siblings, 1 reply; 4+ messages in thread
From: Paolo Bonzini @ 2017-02-27 10:19 UTC (permalink / raw)
  To: Wanpeng Li, linux-kernel, kvm
  Cc: Radim Krčmář, Wanpeng Li, Jan Kiszka



On 26/02/2017 09:46, Wanpeng Li wrote:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
> 
> L2 fails to boot on a non-APICv box dues to 'commit 0ad3bed6c5ec 
> ("kvm: nVMX: move nested events check to kvm_vcpu_running")'
> 
> KVM internal error. Suberror: 3
> extra data[0]: 800000ef
> extra data[1]: 1
> RAX=0000000000000000 RBX=ffffffff81f36140 RCX=0000000000000000 RDX=0000000000000000
> RSI=0000000000000000 RDI=0000000000000000 RBP=ffff88007c92fe90 RSP=ffff88007c92fe90
> R8 =ffff88007fccdca0 R9 =0000000000000000 R10=00000000fffedb3d R11=0000000000000000
> R12=0000000000000003 R13=0000000000000000 R14=0000000000000000 R15=ffff88007c92c000
> RIP=ffffffff810645e6 RFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
> ES =0000 0000000000000000 ffffffff 00c00000
> CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
> SS =0000 0000000000000000 ffffffff 00c00000
> DS =0000 0000000000000000 ffffffff 00c00000
> FS =0000 0000000000000000 ffffffff 00c00000
> GS =0000 ffff88007fcc0000 ffffffff 00c00000
> LDT=0000 0000000000000000 ffffffff 00c00000
> TR =0040 ffff88007fcd4200 00002087 00008b00 DPL=0 TSS64-busy
> GDT=     ffff88007fcc9000 0000007f
> IDT=     ffffffffff578000 00000fff
> CR0=80050033 CR2=00000000ffffffff CR3=0000000001e0a000 CR4=003406e0
> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
> DR6=00000000fffe0ff0 DR7=0000000000000400
> EFER=0000000000000d01
> 
> We should try to reinject previous events if any before trying to inject 
> new event if pending. If vmexit is triggered by L2 guest and L0 interested
> in, we should reinject IDT-vectoring info to L2 through vmcs02 if any,
> otherwise, we can consider new IRQs/NMIs which can be injected and call 
> nested events callback to switch from L2 to L1 if needed and inject the 
> proper vmexit events. However, 'commit 0ad3bed6c5ec ("kvm: nVMX: move 
> nested events check to kvm_vcpu_running")' results in the handle events 
> order reversely on non-APICv box. This patch fixes it by checking nested 
> events if there is no KVM_REQ_EVENT since APICv interrupt injection doesn't 
> use KVM_REQ_EVENT any more.
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Jan Kiszka <jan.kiszka@siemens.com>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>

I need to understand this better.  I would hope that something like

@@ -10668,7 +10598,8 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
 
 	if ((kvm_cpu_has_interrupt(vcpu) || external_intr) &&
 	    nested_exit_on_intr(vcpu)) {
-		if (vmx->nested.nested_run_pending)
+		if (vmx->nested.nested_run_pending ||
+		    vcpu->arch.interrupt.pending)
 			return -EBUSY;
 		nested_vmx_vmexit(vcpu, EXIT_REASON_EXTERNAL_INTERRUPT, 0, 0);
 		return 0;

does it instead.

Paolo

> ---
>  arch/x86/kvm/x86.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index b2a4b11..74fc47b 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6843,7 +6843,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  			update_cr8_intercept(vcpu);
>  			kvm_lapic_sync_to_vapic(vcpu);
>  		}
> -	}
> +	} else if (is_guest_mode(vcpu) && kvm_x86_ops->check_nested_events)
> +		kvm_x86_ops->check_nested_events(vcpu, false);
>  
>  	r = kvm_mmu_reload(vcpu);
>  	if (unlikely(r)) {
> @@ -7025,9 +7026,6 @@ static inline int vcpu_block(struct kvm *kvm, struct kvm_vcpu *vcpu)
>  
>  static inline bool kvm_vcpu_running(struct kvm_vcpu *vcpu)
>  {
> -	if (is_guest_mode(vcpu) && kvm_x86_ops->check_nested_events)
> -		kvm_x86_ops->check_nested_events(vcpu, false);
> -
>  	return (vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE &&
>  		!vcpu->arch.apf.halted);
>  }
> @@ -8397,6 +8395,9 @@ static inline bool kvm_vcpu_has_events(struct kvm_vcpu *vcpu)
>  
>  int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
>  {
> +	if (is_guest_mode(vcpu) && kvm_x86_ops->check_nested_events)
> +		kvm_x86_ops->check_nested_events(vcpu, false);
> +
>  	return kvm_vcpu_running(vcpu) || kvm_vcpu_has_events(vcpu);
>  }
>  
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] KVM: nVMX: Fix pending events injection
  2017-02-27 10:19 ` Paolo Bonzini
@ 2017-02-27 11:35   ` Wanpeng Li
  2017-02-27 11:52     ` Paolo Bonzini
  0 siblings, 1 reply; 4+ messages in thread
From: Wanpeng Li @ 2017-02-27 11:35 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, Radim Krčmář, Wanpeng Li, Jan Kiszka

2017-02-27 18:19 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
>
>
> On 26/02/2017 09:46, Wanpeng Li wrote:
>> From: Wanpeng Li <wanpeng.li@hotmail.com>
>>
>> L2 fails to boot on a non-APICv box dues to 'commit 0ad3bed6c5ec
>> ("kvm: nVMX: move nested events check to kvm_vcpu_running")'
>>
>> KVM internal error. Suberror: 3
>> extra data[0]: 800000ef
>> extra data[1]: 1
>> RAX=0000000000000000 RBX=ffffffff81f36140 RCX=0000000000000000 RDX=0000000000000000
>> RSI=0000000000000000 RDI=0000000000000000 RBP=ffff88007c92fe90 RSP=ffff88007c92fe90
>> R8 =ffff88007fccdca0 R9 =0000000000000000 R10=00000000fffedb3d R11=0000000000000000
>> R12=0000000000000003 R13=0000000000000000 R14=0000000000000000 R15=ffff88007c92c000
>> RIP=ffffffff810645e6 RFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
>> ES =0000 0000000000000000 ffffffff 00c00000
>> CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
>> SS =0000 0000000000000000 ffffffff 00c00000
>> DS =0000 0000000000000000 ffffffff 00c00000
>> FS =0000 0000000000000000 ffffffff 00c00000
>> GS =0000 ffff88007fcc0000 ffffffff 00c00000
>> LDT=0000 0000000000000000 ffffffff 00c00000
>> TR =0040 ffff88007fcd4200 00002087 00008b00 DPL=0 TSS64-busy
>> GDT=     ffff88007fcc9000 0000007f
>> IDT=     ffffffffff578000 00000fff
>> CR0=80050033 CR2=00000000ffffffff CR3=0000000001e0a000 CR4=003406e0
>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
>> DR6=00000000fffe0ff0 DR7=0000000000000400
>> EFER=0000000000000d01
>>
>> We should try to reinject previous events if any before trying to inject
>> new event if pending. If vmexit is triggered by L2 guest and L0 interested
>> in, we should reinject IDT-vectoring info to L2 through vmcs02 if any,
>> otherwise, we can consider new IRQs/NMIs which can be injected and call
>> nested events callback to switch from L2 to L1 if needed and inject the
>> proper vmexit events. However, 'commit 0ad3bed6c5ec ("kvm: nVMX: move
>> nested events check to kvm_vcpu_running")' results in the handle events
>> order reversely on non-APICv box. This patch fixes it by checking nested
>> events if there is no KVM_REQ_EVENT since APICv interrupt injection doesn't
>> use KVM_REQ_EVENT any more.
>>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Radim Krčmář <rkrcmar@redhat.com>
>> Cc: Jan Kiszka <jan.kiszka@siemens.com>
>> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
>
> I need to understand this better.  I would hope that something like
>
> @@ -10668,7 +10598,8 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
>
>         if ((kvm_cpu_has_interrupt(vcpu) || external_intr) &&
>             nested_exit_on_intr(vcpu)) {
> -               if (vmx->nested.nested_run_pending)
> +               if (vmx->nested.nested_run_pending ||
> +                   vcpu->arch.interrupt.pending)
>                         return -EBUSY;
>                 nested_vmx_vmexit(vcpu, EXIT_REASON_EXTERNAL_INTERRUPT, 0, 0);
>                 return 0;
>

This is insufficient, L2 guest boot with some mitigation, but very
slowly and finally stuck will the same crash as above. How about
something like below:

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index ef4ba71..d46af65 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -10642,6 +10642,11 @@ static int vmx_check_nested_events(struct
kvm_vcpu *vcpu, bool external_intr)
 {
     struct vcpu_vmx *vmx = to_vmx(vcpu);

+    if (vcpu->arch.exception.pending ||
+        vcpu->arch.nmi_injected ||
+        vcpu->arch.interrupt.pending)
+        return -EBUSY;
+
     if (nested_cpu_has_preemption_timer(get_vmcs12(vcpu)) &&
         vmx->nested.preemption_timer_expired) {
         if (vmx->nested.nested_run_pending)

Regards,
Wanpeng Li

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] KVM: nVMX: Fix pending events injection
  2017-02-27 11:35   ` Wanpeng Li
@ 2017-02-27 11:52     ` Paolo Bonzini
  0 siblings, 0 replies; 4+ messages in thread
From: Paolo Bonzini @ 2017-02-27 11:52 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: linux-kernel, kvm, Radim Krčmář, Wanpeng Li, Jan Kiszka



On 27/02/2017 12:35, Wanpeng Li wrote:
> 2017-02-27 18:19 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
>>
>>
>> On 26/02/2017 09:46, Wanpeng Li wrote:
>>> From: Wanpeng Li <wanpeng.li@hotmail.com>
>>>
>>> L2 fails to boot on a non-APICv box dues to 'commit 0ad3bed6c5ec
>>> ("kvm: nVMX: move nested events check to kvm_vcpu_running")'
>>>
>>> KVM internal error. Suberror: 3
>>> extra data[0]: 800000ef
>>> extra data[1]: 1
>>> RAX=0000000000000000 RBX=ffffffff81f36140 RCX=0000000000000000 RDX=0000000000000000
>>> RSI=0000000000000000 RDI=0000000000000000 RBP=ffff88007c92fe90 RSP=ffff88007c92fe90
>>> R8 =ffff88007fccdca0 R9 =0000000000000000 R10=00000000fffedb3d R11=0000000000000000
>>> R12=0000000000000003 R13=0000000000000000 R14=0000000000000000 R15=ffff88007c92c000
>>> RIP=ffffffff810645e6 RFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
>>> ES =0000 0000000000000000 ffffffff 00c00000
>>> CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
>>> SS =0000 0000000000000000 ffffffff 00c00000
>>> DS =0000 0000000000000000 ffffffff 00c00000
>>> FS =0000 0000000000000000 ffffffff 00c00000
>>> GS =0000 ffff88007fcc0000 ffffffff 00c00000
>>> LDT=0000 0000000000000000 ffffffff 00c00000
>>> TR =0040 ffff88007fcd4200 00002087 00008b00 DPL=0 TSS64-busy
>>> GDT=     ffff88007fcc9000 0000007f
>>> IDT=     ffffffffff578000 00000fff
>>> CR0=80050033 CR2=00000000ffffffff CR3=0000000001e0a000 CR4=003406e0
>>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
>>> DR6=00000000fffe0ff0 DR7=0000000000000400
>>> EFER=0000000000000d01
>>>
>>> We should try to reinject previous events if any before trying to inject
>>> new event if pending. If vmexit is triggered by L2 guest and L0 interested
>>> in, we should reinject IDT-vectoring info to L2 through vmcs02 if any,
>>> otherwise, we can consider new IRQs/NMIs which can be injected and call
>>> nested events callback to switch from L2 to L1 if needed and inject the
>>> proper vmexit events. However, 'commit 0ad3bed6c5ec ("kvm: nVMX: move
>>> nested events check to kvm_vcpu_running")' results in the handle events
>>> order reversely on non-APICv box. This patch fixes it by checking nested
>>> events if there is no KVM_REQ_EVENT since APICv interrupt injection doesn't
>>> use KVM_REQ_EVENT any more.
>>>
>>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>>> Cc: Radim Krčmář <rkrcmar@redhat.com>
>>> Cc: Jan Kiszka <jan.kiszka@siemens.com>
>>> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
>>
>> I need to understand this better.  I would hope that something like
>>
>> @@ -10668,7 +10598,8 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
>>
>>         if ((kvm_cpu_has_interrupt(vcpu) || external_intr) &&
>>             nested_exit_on_intr(vcpu)) {
>> -               if (vmx->nested.nested_run_pending)
>> +               if (vmx->nested.nested_run_pending ||
>> +                   vcpu->arch.interrupt.pending)
>>                         return -EBUSY;
>>                 nested_vmx_vmexit(vcpu, EXIT_REASON_EXTERNAL_INTERRUPT, 0, 0);
>>                 return 0;
>>
> 
> This is insufficient, L2 guest boot with some mitigation, but very
> slowly and finally stuck will the same crash as above. How about
> something like below:
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index ef4ba71..d46af65 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -10642,6 +10642,11 @@ static int vmx_check_nested_events(struct
> kvm_vcpu *vcpu, bool external_intr)
>  {
>      struct vcpu_vmx *vmx = to_vmx(vcpu);
> 
> +    if (vcpu->arch.exception.pending ||
> +        vcpu->arch.nmi_injected ||
> +        vcpu->arch.interrupt.pending)
> +        return -EBUSY;
> +
>      if (nested_cpu_has_preemption_timer(get_vmcs12(vcpu)) &&
>          vmx->nested.preemption_timer_expired) {
>          if (vmx->nested.nested_run_pending)

I think this would be okay for kvm_vcpu_running, and also for the first
call in inject_pending_event (you would never exit here, because all
three conditions are handled earlier in inject_pending_event).

It would also let us the vcpu->arch.interrupt.pending condition here:

                if (vmx->nested.nested_run_pending ||
                    vcpu->arch.interrupt.pending)
                        return -EBUSY;

which isn't bad either.

I think I did test vmx.flat with apicv=0.  If it passes, we really
should add more testcases related to this bug!  And also one for the
second call in inject_pending_event.

Paolo

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-02-27 13:11 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-26  8:46 [PATCH] KVM: nVMX: Fix pending events injection Wanpeng Li
2017-02-27 10:19 ` Paolo Bonzini
2017-02-27 11:35   ` Wanpeng Li
2017-02-27 11:52     ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).