* [PATCH] KVM: x86: Take srcu lock in post_kvm_run_save()
@ 2021-10-26  3:12 David Woodhouse
  2021-10-26 14:42 ` Sean Christopherson
  2021-10-28 14:11 ` Paolo Bonzini
  0 siblings, 2 replies; 7+ messages in thread

From: David Woodhouse @ 2021-10-26 3:12 UTC (permalink / raw)
To: kvm
Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, mtosatti

From: David Woodhouse <dwmw@amazon.co.uk>

The Xen interrupt injection for event channels relies on accessing the
guest's vcpu_info structure in __kvm_xen_has_interrupt(), through a
gfn_to_hva_cache. This requires the srcu lock to be held, which is
mostly the case except for this code path:

[   11.822877] WARNING: suspicious RCU usage
[   11.822965] -----------------------------
[   11.823013] include/linux/kvm_host.h:664 suspicious rcu_dereference_check() usage!
[   11.823131]
[   11.823131] other info that might help us debug this:
[   11.823131]
[   11.823196]
[   11.823196] rcu_scheduler_active = 2, debug_locks = 1
[   11.823253] 1 lock held by dom:0/90:
[   11.823292]  #0: ffff998956ec8118 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x85/0x680
[   11.823379]
[   11.823379] stack backtrace:
[   11.823428] CPU: 2 PID: 90 Comm: dom:0 Kdump: loaded Not tainted 5.4.34+ #5
[   11.823496] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[   11.823612] Call Trace:
[   11.823645]  dump_stack+0x7a/0xa5
[   11.823681]  lockdep_rcu_suspicious+0xc5/0x100
[   11.823726]  __kvm_xen_has_interrupt+0x179/0x190
[   11.823773]  kvm_cpu_has_extint+0x6d/0x90
[   11.823813]  kvm_cpu_accept_dm_intr+0xd/0x40
[   11.823853]  kvm_vcpu_ready_for_interrupt_injection+0x20/0x30
               < post_kvm_run_save() inlined here >
[   11.823906]  kvm_arch_vcpu_ioctl_run+0x135/0x6a0
[   11.823947]  kvm_vcpu_ioctl+0x263/0x680

Fixes: 40da8ccd724f ("KVM: x86/xen: Add event channel interrupt vector upcall")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---

There are potentially other ways of doing this, by shuffling the tail
of kvm_arch_vcpu_ioctl_run() around a little and holding the lock once
there instead of taking it within vcpu_run(). But the call to
post_kvm_run_save() occurs even on the error paths, and it gets complex
to untangle. This is the simple approach.

This doesn't show up when I enable event channel delivery in
xen_shinfo_test because we have pic_in_kernel() there. I'll fix the
tests not to hardcode that, as I expand the event channel support and
add more testing of it.

 arch/x86/kvm/x86.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f0874c3d12ce..cd42d58008f7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8783,9 +8783,17 @@ static void post_kvm_run_save(struct kvm_vcpu *vcpu)
 
 	kvm_run->cr8 = kvm_get_cr8(vcpu);
 	kvm_run->apic_base = kvm_get_apic_base(vcpu);
+
+	/*
+	 * The call to kvm_ready_for_interrupt_injection() may end up in
+	 * kvm_xen_has_interrupt() which may require the srcu lock to be
+	 * held to protect against changes in the vcpu_info address.
+	 */
+	vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
 	kvm_run->ready_for_interrupt_injection =
 		pic_in_kernel(vcpu->kvm) ||
 		kvm_vcpu_ready_for_interrupt_injection(vcpu);
+	srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
 
 	if (is_smm(vcpu))
 		kvm_run->flags |= KVM_RUN_X86_SMM;
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread
* Re: [PATCH] KVM: x86: Take srcu lock in post_kvm_run_save()
  2021-10-26  3:12 [PATCH] KVM: x86: Take srcu lock in post_kvm_run_save() David Woodhouse
@ 2021-10-26 14:42 ` Sean Christopherson
  2021-10-26 16:00   ` David Woodhouse
  2021-10-28 14:11 ` Paolo Bonzini
  1 sibling, 1 reply; 7+ messages in thread

From: Sean Christopherson @ 2021-10-26 14:42 UTC (permalink / raw)
To: David Woodhouse
Cc: kvm, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, mtosatti

On Tue, Oct 26, 2021, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> The Xen interrupt injection for event channels relies on accessing the
> guest's vcpu_info structure in __kvm_xen_has_interrupt(), through a
> gfn_to_hva_cache.
> 
> This requires the srcu lock to be held, which is mostly the case except
> for this code path:
> 
> [   11.822877] WARNING: suspicious RCU usage
> [   11.822965] -----------------------------
> [   11.823013] include/linux/kvm_host.h:664 suspicious rcu_dereference_check() usage!
> [   11.823131]
> [   11.823131] other info that might help us debug this:
> [   11.823131]
> [   11.823196]
> [   11.823196] rcu_scheduler_active = 2, debug_locks = 1
> [   11.823253] 1 lock held by dom:0/90:
> [   11.823292]  #0: ffff998956ec8118 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x85/0x680
> [   11.823379]
> [   11.823379] stack backtrace:
> [   11.823428] CPU: 2 PID: 90 Comm: dom:0 Kdump: loaded Not tainted 5.4.34+ #5
> [   11.823496] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> [   11.823612] Call Trace:
> [   11.823645]  dump_stack+0x7a/0xa5
> [   11.823681]  lockdep_rcu_suspicious+0xc5/0x100
> [   11.823726]  __kvm_xen_has_interrupt+0x179/0x190
> [   11.823773]  kvm_cpu_has_extint+0x6d/0x90
> [   11.823813]  kvm_cpu_accept_dm_intr+0xd/0x40
> [   11.823853]  kvm_vcpu_ready_for_interrupt_injection+0x20/0x30
>                < post_kvm_run_save() inlined here >
> [   11.823906]  kvm_arch_vcpu_ioctl_run+0x135/0x6a0
> [   11.823947]  kvm_vcpu_ioctl+0x263/0x680
> 
> Fixes: 40da8ccd724f ("KVM: x86/xen: Add event channel interrupt vector upcall")
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
> 
> There are potentially other ways of doing this, by shuffling the tail
> of kvm_arch_vcpu_ioctl_run() around a little and holding the lock once
> there instead of taking it within vcpu_run(). But the call to
> post_kvm_run_save() occurs even on the error paths, and it gets complex
> to untangle. This is the simple approach.

What about taking the lock well early on so that the tail doesn't need to juggle
errors?  Dropping the lock for the KVM_MP_STATE_UNINITIALIZED case is a little
unfortunate, but that at least pairs with similar logic in x86's other call to
kvm_vcpu_block().  Relocking if xfer_to_guest_mode_handle_work() triggers an exit
to userspace is also unfortunate but it's not the end of the world.

On the plus side, the complete_userspace_io() callback doesn't need to worry
about taking the lock.

---
 arch/x86/kvm/x86.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ac83d873d65b..90751a080447 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10039,7 +10039,6 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
 	int r;
 	struct kvm *kvm = vcpu->kvm;
 
-	vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
 	vcpu->arch.l1tf_flush_l1d = true;
 
 	for (;;) {
@@ -10067,25 +10066,18 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
 		if (__xfer_to_guest_mode_work_pending()) {
 			srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
 			r = xfer_to_guest_mode_handle_work(vcpu);
+			vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
 			if (r)
 				return r;
-			vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
 		}
 	}
 
-	srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
-
 	return r;
 }
 
 static inline int complete_emulated_io(struct kvm_vcpu *vcpu)
 {
-	int r;
-
-	vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
-	r = kvm_emulate_instruction(vcpu, EMULTYPE_NO_DECODE);
-	srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
-	return r;
+	return kvm_emulate_instruction(vcpu, EMULTYPE_NO_DECODE);
 }
 
 static int complete_emulated_pio(struct kvm_vcpu *vcpu)
@@ -10224,12 +10216,16 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 	kvm_run->flags = 0;
 	kvm_load_guest_fpu(vcpu);
 
+	vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
 	if (unlikely(vcpu->arch.mp_state == KVM_MP_STATE_UNINITIALIZED)) {
 		if (kvm_run->immediate_exit) {
 			r = -EINTR;
 			goto out;
 		}
+		srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
 		kvm_vcpu_block(vcpu);
+		vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
+
 		if (kvm_apic_accept_events(vcpu) < 0) {
 			r = 0;
 			goto out;
@@ -10279,10 +10275,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 	r = vcpu_run(vcpu);
 
 out:
-	kvm_put_guest_fpu(vcpu);
 	if (kvm_run->kvm_valid_regs)
 		store_regs(vcpu);
 	post_kvm_run_save(vcpu);
+
+	srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
+
+	kvm_put_guest_fpu(vcpu);
 	kvm_sigset_deactivate(vcpu);
 
 	vcpu_put(vcpu);
--
* Re: [PATCH] KVM: x86: Take srcu lock in post_kvm_run_save()
  2021-10-26 14:42 ` Sean Christopherson
@ 2021-10-26 16:00   ` David Woodhouse
  2021-10-26 16:05     ` Sean Christopherson
  0 siblings, 1 reply; 7+ messages in thread

From: David Woodhouse @ 2021-10-26 16:00 UTC (permalink / raw)
To: Sean Christopherson
Cc: kvm, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, mtosatti

On Tue, 2021-10-26 at 14:42 +0000, Sean Christopherson wrote:
> What about taking the lock well early on so that the tail doesn't need to juggle
> errors?  Dropping the lock for the KVM_MP_STATE_UNINITIALIZED case is a little
> unfortunate, but that at least pairs with similar logic in x86's other call to
> kvm_vcpu_block().  Relocking if xfer_to_guest_mode_handle_work() triggers an exit
> to userspace is also unfortunate but it's not the end of the world.
> 
> On the plus side, the complete_userspace_io() callback doesn't need to worry
> about taking the lock.

Yeah, that seems sensible for master, but I suspect I'd err on the side
of caution for backporting to stable first?
* Re: [PATCH] KVM: x86: Take srcu lock in post_kvm_run_save()
  2021-10-26 16:00   ` David Woodhouse
@ 2021-10-26 16:05     ` Sean Christopherson
  0 siblings, 0 replies; 7+ messages in thread

From: Sean Christopherson @ 2021-10-26 16:05 UTC (permalink / raw)
To: David Woodhouse
Cc: kvm, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, mtosatti

On Tue, Oct 26, 2021, David Woodhouse wrote:
> On Tue, 2021-10-26 at 14:42 +0000, Sean Christopherson wrote:
> > What about taking the lock well early on so that the tail doesn't need to juggle
> > errors?  Dropping the lock for the KVM_MP_STATE_UNINITIALIZED case is a little
> > unfortunate, but that at least pairs with similar logic in x86's other call to
> > kvm_vcpu_block().  Relocking if xfer_to_guest_mode_handle_work() triggers an exit
> > to userspace is also unfortunate but it's not the end of the world.
> > 
> > On the plus side, the complete_userspace_io() callback doesn't need to worry
> > about taking the lock.
> 
> Yeah, that seems sensible for master, but I suspect I'd err on the side
> of caution for backporting to stable first?

Agreed, dirty-but-simple for stable.
* Re: [PATCH] KVM: x86: Take srcu lock in post_kvm_run_save()
  2021-10-26  3:12 [PATCH] KVM: x86: Take srcu lock in post_kvm_run_save() David Woodhouse
  2021-10-26 14:42 ` Sean Christopherson
@ 2021-10-28 14:11 ` Paolo Bonzini
  2021-10-28 14:23   ` David Woodhouse
  1 sibling, 1 reply; 7+ messages in thread

From: Paolo Bonzini @ 2021-10-28 14:11 UTC (permalink / raw)
To: David Woodhouse, kvm
Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, mtosatti

On 26/10/21 05:12, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> The Xen interrupt injection for event channels relies on accessing the
> guest's vcpu_info structure in __kvm_xen_has_interrupt(), through a
> gfn_to_hva_cache.
> 
> This requires the srcu lock to be held, which is mostly the case except
> for this code path:
> 
> [   11.822877] WARNING: suspicious RCU usage
> [   11.822965] -----------------------------
> [   11.823013] include/linux/kvm_host.h:664 suspicious rcu_dereference_check() usage!
> [   11.823131]
> [   11.823131] other info that might help us debug this:
> [   11.823131]
> [   11.823196]
> [   11.823196] rcu_scheduler_active = 2, debug_locks = 1
> [   11.823253] 1 lock held by dom:0/90:
> [   11.823292]  #0: ffff998956ec8118 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x85/0x680
> [   11.823379]
> [   11.823379] stack backtrace:
> [   11.823428] CPU: 2 PID: 90 Comm: dom:0 Kdump: loaded Not tainted 5.4.34+ #5
> [   11.823496] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> [   11.823612] Call Trace:
> [   11.823645]  dump_stack+0x7a/0xa5
> [   11.823681]  lockdep_rcu_suspicious+0xc5/0x100
> [   11.823726]  __kvm_xen_has_interrupt+0x179/0x190
> [   11.823773]  kvm_cpu_has_extint+0x6d/0x90
> [   11.823813]  kvm_cpu_accept_dm_intr+0xd/0x40
> [   11.823853]  kvm_vcpu_ready_for_interrupt_injection+0x20/0x30
>                < post_kvm_run_save() inlined here >
> [   11.823906]  kvm_arch_vcpu_ioctl_run+0x135/0x6a0
> [   11.823947]  kvm_vcpu_ioctl+0x263/0x680
> 
> Fixes: 40da8ccd724f ("KVM: x86/xen: Add event channel interrupt vector upcall")
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
> 
> There are potentially other ways of doing this, by shuffling the tail
> of kvm_arch_vcpu_ioctl_run() around a little and holding the lock once
> there instead of taking it within vcpu_run(). But the call to
> post_kvm_run_save() occurs even on the error paths, and it gets complex
> to untangle. This is the simple approach.
> 
> This doesn't show up when I enable event channel delivery in
> xen_shinfo_test because we have pic_in_kernel() there. I'll fix the
> tests not to hardcode that, as I expand the event channel support and
> add more testing of it.

Queued for 5.15, thanks.

Paolo

>  arch/x86/kvm/x86.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index f0874c3d12ce..cd42d58008f7 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -8783,9 +8783,17 @@ static void post_kvm_run_save(struct kvm_vcpu *vcpu)
>  
>  	kvm_run->cr8 = kvm_get_cr8(vcpu);
>  	kvm_run->apic_base = kvm_get_apic_base(vcpu);
> +
> +	/*
> +	 * The call to kvm_ready_for_interrupt_injection() may end up in
> +	 * kvm_xen_has_interrupt() which may require the srcu lock to be
> +	 * held to protect against changes in the vcpu_info address.
> +	 */
> +	vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
>  	kvm_run->ready_for_interrupt_injection =
>  		pic_in_kernel(vcpu->kvm) ||
>  		kvm_vcpu_ready_for_interrupt_injection(vcpu);
> +	srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
>  
>  	if (is_smm(vcpu))
>  		kvm_run->flags |= KVM_RUN_X86_SMM;
* Re: [PATCH] KVM: x86: Take srcu lock in post_kvm_run_save()
  2021-10-28 14:11 ` Paolo Bonzini
@ 2021-10-28 14:23   ` David Woodhouse
  2021-10-28 16:15     ` Paolo Bonzini
  0 siblings, 1 reply; 7+ messages in thread

From: David Woodhouse @ 2021-10-28 14:23 UTC (permalink / raw)
To: Paolo Bonzini, kvm
Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, mtosatti

On Thu, 2021-10-28 at 16:11 +0200, Paolo Bonzini wrote:
> > There are potentially other ways of doing this, by shuffling the tail
> > of kvm_arch_vcpu_ioctl_run() around a little and holding the lock once
> > there instead of taking it within vcpu_run(). But the call to
> > post_kvm_run_save() occurs even on the error paths, and it gets complex
> > to untangle. This is the simple approach.
> > 
> > This doesn't show up when I enable event channel delivery in
> > xen_shinfo_test because we have pic_in_kernel() there. I'll fix the
> > tests not to hardcode that, as I expand the event channel support and
> > add more testing of it.
> 
> Queued for 5.15, thanks.

Thanks.

Sean, since you actually went ahead and implemented the alternative
approach along the lines I describe above, I'll let you submit that as
a subsequent cleanup while keeping the simple version for stable as
discussed?
* Re: [PATCH] KVM: x86: Take srcu lock in post_kvm_run_save()
  2021-10-28 14:23   ` David Woodhouse
@ 2021-10-28 16:15     ` Paolo Bonzini
  0 siblings, 0 replies; 7+ messages in thread

From: Paolo Bonzini @ 2021-10-28 16:15 UTC (permalink / raw)
To: David Woodhouse, kvm
Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, mtosatti

On 28/10/21 16:23, David Woodhouse wrote:
> > Queued for 5.15, thanks.
> 
> Thanks.
> 
> Sean, since you actually went ahead and implemented the alternative
> approach along the lines I describe above, I'll let you submit that as
> a subsequent cleanup while keeping the simple version for stable as
> discussed?

Yes, I can also take care of sending it out for review.

Paolo