* [PATCH] KVM: x86: Take srcu lock in post_kvm_run_save()
@ 2021-10-26  3:12 David Woodhouse
  2021-10-26 14:42 ` Sean Christopherson
  2021-10-28 14:11 ` Paolo Bonzini
  0 siblings, 2 replies; 7+ messages in thread
From: David Woodhouse @ 2021-10-26  3:12 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, mtosatti

From: David Woodhouse <dwmw@amazon.co.uk>

The Xen interrupt injection for event channels relies on accessing the
guest's vcpu_info structure in __kvm_xen_has_interrupt(), through a
gfn_to_hva_cache.

This requires the srcu lock to be held, which is mostly the case except
for this code path:

[   11.822877] WARNING: suspicious RCU usage
[   11.822965] -----------------------------
[   11.823013] include/linux/kvm_host.h:664 suspicious rcu_dereference_check() usage!
[   11.823131]
[   11.823131] other info that might help us debug this:
[   11.823131]
[   11.823196]
[   11.823196] rcu_scheduler_active = 2, debug_locks = 1
[   11.823253] 1 lock held by dom:0/90:
[   11.823292]  #0: ffff998956ec8118 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x85/0x680
[   11.823379]
[   11.823379] stack backtrace:
[   11.823428] CPU: 2 PID: 90 Comm: dom:0 Kdump: loaded Not tainted 5.4.34+ #5
[   11.823496] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[   11.823612] Call Trace:
[   11.823645]  dump_stack+0x7a/0xa5
[   11.823681]  lockdep_rcu_suspicious+0xc5/0x100
[   11.823726]  __kvm_xen_has_interrupt+0x179/0x190
[   11.823773]  kvm_cpu_has_extint+0x6d/0x90
[   11.823813]  kvm_cpu_accept_dm_intr+0xd/0x40
[   11.823853]  kvm_vcpu_ready_for_interrupt_injection+0x20/0x30
              < post_kvm_run_save() inlined here >
[   11.823906]  kvm_arch_vcpu_ioctl_run+0x135/0x6a0
[   11.823947]  kvm_vcpu_ioctl+0x263/0x680

Fixes: 40da8ccd724f ("KVM: x86/xen: Add event channel interrupt vector upcall")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---

There are potentially other ways of doing this, by shuffling the tail
of kvm_arch_vcpu_ioctl_run() around a little and holding the lock once
there instead of taking it within vcpu_run(). But the call to
post_kvm_run_save() occurs even on the error paths, and it gets complex
to untangle. This is the simple approach.

This doesn't show up when I enable event channel delivery in
xen_shinfo_test because we have pic_in_kernel() there. I'll fix the
tests not to hardcode that, as I expand the event channel support and
add more testing of it.
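
To spell out the access that needs protecting, here's a minimal sketch of
the pattern (this is not the actual __kvm_xen_has_interrupt() code, which
uses a faster path; the helper name below is made up purely for
illustration):

static bool xen_upcall_pending_sketch(struct kvm_vcpu *vcpu)
{
	struct gfn_to_hva_cache *ghc = &vcpu->arch.xen.vcpu_info_cache;
	u8 pending = 0;
	int idx;

	/*
	 * kvm->srcu protects the memslots that the gfn_to_hva_cache is
	 * (re)validated against, hence the rcu_dereference_check() splat
	 * above when this runs outside an srcu read-side critical section.
	 */
	idx = srcu_read_lock(&vcpu->kvm->srcu);
	if (kvm_read_guest_offset_cached(vcpu->kvm, ghc, &pending,
					 offsetof(struct vcpu_info,
						  evtchn_upcall_pending),
					 sizeof(pending)))
		pending = 0;
	srcu_read_unlock(&vcpu->kvm->srcu, idx);

	return pending;
}

The patch below simply takes the lock in the caller, post_kvm_run_save(),
rather than pushing it down, since the other paths into this code already
hold it.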

 arch/x86/kvm/x86.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f0874c3d12ce..cd42d58008f7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8783,9 +8783,17 @@ static void post_kvm_run_save(struct kvm_vcpu *vcpu)
 
 	kvm_run->cr8 = kvm_get_cr8(vcpu);
 	kvm_run->apic_base = kvm_get_apic_base(vcpu);
+
+	/*
+	 * The call to kvm_vcpu_ready_for_interrupt_injection() may end up
+	 * in kvm_xen_has_interrupt(), which may require the srcu lock to
+	 * be held to protect against changes in the vcpu_info address.
+	 */
+	vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
 	kvm_run->ready_for_interrupt_injection =
 		pic_in_kernel(vcpu->kvm) ||
 		kvm_vcpu_ready_for_interrupt_injection(vcpu);
+	srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
 
 	if (is_smm(vcpu))
 		kvm_run->flags |= KVM_RUN_X86_SMM;
-- 
2.25.1


* Re: [PATCH] KVM: x86: Take srcu lock in post_kvm_run_save()
  2021-10-26  3:12 [PATCH] KVM: x86: Take srcu lock in post_kvm_run_save() David Woodhouse
@ 2021-10-26 14:42 ` Sean Christopherson
  2021-10-26 16:00   ` David Woodhouse
  2021-10-28 14:11 ` Paolo Bonzini
  1 sibling, 1 reply; 7+ messages in thread
From: Sean Christopherson @ 2021-10-26 14:42 UTC (permalink / raw)
  To: David Woodhouse
  Cc: kvm, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, mtosatti

On Tue, Oct 26, 2021, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> The Xen interrupt injection for event channels relies on accessing the
> guest's vcpu_info structure in __kvm_xen_has_interrupt(), through a
> gfn_to_hva_cache.
> 
> This requires the srcu lock to be held, which is mostly the case except
> for this code path:
> 
> [   11.822877] WARNING: suspicious RCU usage
> [   11.822965] -----------------------------
> [   11.823013] include/linux/kvm_host.h:664 suspicious rcu_dereference_check() usage!
> [   11.823131]
> [   11.823131] other info that might help us debug this:
> [   11.823131]
> [   11.823196]
> [   11.823196] rcu_scheduler_active = 2, debug_locks = 1
> [   11.823253] 1 lock held by dom:0/90:
> [   11.823292]  #0: ffff998956ec8118 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x85/0x680
> [   11.823379]
> [   11.823379] stack backtrace:
> [   11.823428] CPU: 2 PID: 90 Comm: dom:0 Kdump: loaded Not tainted 5.4.34+ #5
> [   11.823496] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> [   11.823612] Call Trace:
> [   11.823645]  dump_stack+0x7a/0xa5
> [   11.823681]  lockdep_rcu_suspicious+0xc5/0x100
> [   11.823726]  __kvm_xen_has_interrupt+0x179/0x190
> [   11.823773]  kvm_cpu_has_extint+0x6d/0x90
> [   11.823813]  kvm_cpu_accept_dm_intr+0xd/0x40
> [   11.823853]  kvm_vcpu_ready_for_interrupt_injection+0x20/0x30
>               < post_kvm_run_save() inlined here >
> [   11.823906]  kvm_arch_vcpu_ioctl_run+0x135/0x6a0
> [   11.823947]  kvm_vcpu_ioctl+0x263/0x680
> 
> Fixes: 40da8ccd724f ("KVM: x86/xen: Add event channel interrupt vector upcall")
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
> 
> There are potentially other ways of doing this, by shuffling the tail
> of kvm_arch_vcpu_ioctl_run() around a little and holding the lock once
> there instead of taking it within vcpu_run(). But the call to
> post_kvm_run_save() occurs even on the error paths, and it gets complex
> to untangle. This is the simple approach.

What about taking the lock well early on so that the tail doesn't need to juggle
errors?  Dropping the lock for the KVM_MP_STATE_UNINITIALIZED case is a little
unfortunate, but that at least pairs with similar logic in x86's other call to
kvm_vcpu_block().  Relocking if xfer_to_guest_mode_handle_work() triggers an exit
to userspace is also unfortunate but it's not the end of the world.

On the plus side, the complete_userspace_io() callback doesn't need to worry
about taking the lock.

---
 arch/x86/kvm/x86.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ac83d873d65b..90751a080447 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10039,7 +10039,6 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
 	int r;
 	struct kvm *kvm = vcpu->kvm;

-	vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
 	vcpu->arch.l1tf_flush_l1d = true;

 	for (;;) {
@@ -10067,25 +10066,18 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
 		if (__xfer_to_guest_mode_work_pending()) {
 			srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
 			r = xfer_to_guest_mode_handle_work(vcpu);
+			vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
 			if (r)
 				return r;
-			vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
 		}
 	}

-	srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
-
 	return r;
 }

 static inline int complete_emulated_io(struct kvm_vcpu *vcpu)
 {
-	int r;
-
-	vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
-	r = kvm_emulate_instruction(vcpu, EMULTYPE_NO_DECODE);
-	srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
-	return r;
+	return kvm_emulate_instruction(vcpu, EMULTYPE_NO_DECODE);
 }

 static int complete_emulated_pio(struct kvm_vcpu *vcpu)
@@ -10224,12 +10216,16 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 	kvm_run->flags = 0;
 	kvm_load_guest_fpu(vcpu);

+	vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
 	if (unlikely(vcpu->arch.mp_state == KVM_MP_STATE_UNINITIALIZED)) {
 		if (kvm_run->immediate_exit) {
 			r = -EINTR;
 			goto out;
 		}
+		srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
 		kvm_vcpu_block(vcpu);
+		vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
+
 		if (kvm_apic_accept_events(vcpu) < 0) {
 			r = 0;
 			goto out;
@@ -10279,10 +10275,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 		r = vcpu_run(vcpu);

 out:
-	kvm_put_guest_fpu(vcpu);
 	if (kvm_run->kvm_valid_regs)
 		store_regs(vcpu);
 	post_kvm_run_save(vcpu);
+
+	srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
+
+	kvm_put_guest_fpu(vcpu);
 	kvm_sigset_deactivate(vcpu);

 	vcpu_put(vcpu);
--

* Re: [PATCH] KVM: x86: Take srcu lock in post_kvm_run_save()
  2021-10-26 14:42 ` Sean Christopherson
@ 2021-10-26 16:00   ` David Woodhouse
  2021-10-26 16:05     ` Sean Christopherson
  0 siblings, 1 reply; 7+ messages in thread
From: David Woodhouse @ 2021-10-26 16:00 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, mtosatti

On Tue, 2021-10-26 at 14:42 +0000, Sean Christopherson wrote:
> What about taking the lock well early on so that the tail doesn't need to juggle
> errors?  Dropping the lock for the KVM_MP_STATE_UNINITIALIZED case is a little
> unfortunate, but that at least pairs with similar logic in x86's other call to
> kvm_vcpu_block().  Relocking if xfer_to_guest_mode_handle_work() triggers an exit
> to userspace is also unfortunate but it's not the end of the world.
> 
> On the plus side, the complete_userspace_io() callback doesn't need to worry
> about taking the lock.

Yeah, that seems sensible for master, but I suspect I'd err on the side
of caution for backporting to stable first?


* Re: [PATCH] KVM: x86: Take srcu lock in post_kvm_run_save()
  2021-10-26 16:00   ` David Woodhouse
@ 2021-10-26 16:05     ` Sean Christopherson
  0 siblings, 0 replies; 7+ messages in thread
From: Sean Christopherson @ 2021-10-26 16:05 UTC (permalink / raw)
  To: David Woodhouse
  Cc: kvm, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, mtosatti

On Tue, Oct 26, 2021, David Woodhouse wrote:
> On Tue, 2021-10-26 at 14:42 +0000, Sean Christopherson wrote:
> > What about taking the lock well early on so that the tail doesn't need to juggle
> > errors?  Dropping the lock for the KVM_MP_STATE_UNINITIALIZED case is a little
> > unfortunate, but that at least pairs with similar logic in x86's other call to
> > kvm_vcpu_block().  Relocking if xfer_to_guest_mode_handle_work() triggers an exit
> > to userspace is also unfortunate but it's not the end of the world.
> > 
> > On the plus side, the complete_userspace_io() callback doesn't need to worry
> > about taking the lock.
> 
> Yeah, that seems sensible for master, but I suspect I'd err on the side
> of caution for backporting to stable first?

Agreed, dirty-but-simple for stable.

* Re: [PATCH] KVM: x86: Take srcu lock in post_kvm_run_save()
  2021-10-26  3:12 [PATCH] KVM: x86: Take srcu lock in post_kvm_run_save() David Woodhouse
  2021-10-26 14:42 ` Sean Christopherson
@ 2021-10-28 14:11 ` Paolo Bonzini
  2021-10-28 14:23   ` David Woodhouse
  1 sibling, 1 reply; 7+ messages in thread
From: Paolo Bonzini @ 2021-10-28 14:11 UTC (permalink / raw)
  To: David Woodhouse, kvm
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, mtosatti

On 26/10/21 05:12, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> The Xen interrupt injection for event channels relies on accessing the
> guest's vcpu_info structure in __kvm_xen_has_interrupt(), through a
> gfn_to_hva_cache.
> 
> This requires the srcu lock to be held, which is mostly the case except
> for this code path:
> 
> [   11.822877] WARNING: suspicious RCU usage
> [   11.822965] -----------------------------
> [   11.823013] include/linux/kvm_host.h:664 suspicious rcu_dereference_check() usage!
> [   11.823131]
> [   11.823131] other info that might help us debug this:
> [   11.823131]
> [   11.823196]
> [   11.823196] rcu_scheduler_active = 2, debug_locks = 1
> [   11.823253] 1 lock held by dom:0/90:
> [   11.823292]  #0: ffff998956ec8118 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x85/0x680
> [   11.823379]
> [   11.823379] stack backtrace:
> [   11.823428] CPU: 2 PID: 90 Comm: dom:0 Kdump: loaded Not tainted 5.4.34+ #5
> [   11.823496] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> [   11.823612] Call Trace:
> [   11.823645]  dump_stack+0x7a/0xa5
> [   11.823681]  lockdep_rcu_suspicious+0xc5/0x100
> [   11.823726]  __kvm_xen_has_interrupt+0x179/0x190
> [   11.823773]  kvm_cpu_has_extint+0x6d/0x90
> [   11.823813]  kvm_cpu_accept_dm_intr+0xd/0x40
> [   11.823853]  kvm_vcpu_ready_for_interrupt_injection+0x20/0x30
>                < post_kvm_run_save() inlined here >
> [   11.823906]  kvm_arch_vcpu_ioctl_run+0x135/0x6a0
> [   11.823947]  kvm_vcpu_ioctl+0x263/0x680
> 
> Fixes: 40da8ccd724f ("KVM: x86/xen: Add event channel interrupt vector upcall")
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
> 
> There are potentially other ways of doing this, by shuffling the tail
> of kvm_arch_vcpu_ioctl_run() around a little and holding the lock once
> there instead of taking it within vcpu_run(). But the call to
> post_kvm_run_save() occurs even on the error paths, and it gets complex
> to untangle. This is the simple approach.
> 
> This doesn't show up when I enable event channel delivery in
> xen_shinfo_test because we have pic_in_kernel() there. I'll fix the
> tests not to hardcode that, as I expand the event channel support and
> add more testing of it.

Queued for 5.15, thanks.

Paolo

>   arch/x86/kvm/x86.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index f0874c3d12ce..cd42d58008f7 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -8783,9 +8783,17 @@ static void post_kvm_run_save(struct kvm_vcpu *vcpu)
>   
>   	kvm_run->cr8 = kvm_get_cr8(vcpu);
>   	kvm_run->apic_base = kvm_get_apic_base(vcpu);
> +
> +	/*
> +	 * The call to kvm_vcpu_ready_for_interrupt_injection() may end up
> +	 * in kvm_xen_has_interrupt(), which may require the srcu lock to
> +	 * be held to protect against changes in the vcpu_info address.
> +	 */
> +	vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
>   	kvm_run->ready_for_interrupt_injection =
>   		pic_in_kernel(vcpu->kvm) ||
>   		kvm_vcpu_ready_for_interrupt_injection(vcpu);
> +	srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
>   
>   	if (is_smm(vcpu))
>   		kvm_run->flags |= KVM_RUN_X86_SMM;
> 


* Re: [PATCH] KVM: x86: Take srcu lock in post_kvm_run_save()
  2021-10-28 14:11 ` Paolo Bonzini
@ 2021-10-28 14:23   ` David Woodhouse
  2021-10-28 16:15     ` Paolo Bonzini
  0 siblings, 1 reply; 7+ messages in thread
From: David Woodhouse @ 2021-10-28 14:23 UTC (permalink / raw)
  To: Paolo Bonzini, kvm
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, mtosatti

On Thu, 2021-10-28 at 16:11 +0200, Paolo Bonzini wrote:
> 
> > There are potentially other ways of doing this, by shuffling the tail
> > of kvm_arch_vcpu_ioctl_run() around a little and holding the lock once
> > there instead of taking it within vcpu_run(). But the call to
> > post_kvm_run_save() occurs even on the error paths, and it gets complex
> > to untangle. This is the simple approach.
> > 
> > This doesn't show up when I enable event channel delivery in
> > xen_shinfo_test because we have pic_in_kernel() there. I'll fix the
> > tests not to hardcode that, as I expand the event channel support and
> > add more testing of it.
> 
> Queued for 5.15, thanks.

Thanks.

Sean, since you actually went ahead and implemented the alternative
approach along the lines I describe above, I'll let you submit that as
a subsequent cleanup while keeping the simple version for stable as
discussed?


* Re: [PATCH] KVM: x86: Take srcu lock in post_kvm_run_save()
  2021-10-28 14:23   ` David Woodhouse
@ 2021-10-28 16:15     ` Paolo Bonzini
  0 siblings, 0 replies; 7+ messages in thread
From: Paolo Bonzini @ 2021-10-28 16:15 UTC (permalink / raw)
  To: David Woodhouse, kvm
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, mtosatti

On 28/10/21 16:23, David Woodhouse wrote:
>> Queued for 5.15, thanks.
> Thanks.
> 
> Sean, since you actually went ahead and implemented the alternative
> approach along the lines I describe above, I'll let you submit that as
> a subsequent cleanup while keeping the simple version for stable as
> discussed?

Yes, I can also take care of sending it out for review.

Paolo

