From: Shanker Donthineni <sdonthineni@nvidia.com>
To: Marc Zyngier <maz@kernel.org>, James Morse <james.morse@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>, Will Deacon <will@kernel.org>, Shanker Donthineni <sdonthineni@nvidia.com>, <linux-arm-kernel@lists.infradead.org>, <kvmarm@lists.linux.dev>, <kvmarm@lists.cs.columbia.edu>, <linux-kernel@vger.kernel.org>, Vikram Sethi <vsethi@nvidia.com>
Subject: [PATCH] KVM: arm64: vgic: Fix soft lockup during VM teardown
Date: Tue, 17 Jan 2023 20:23:48 -0600
Message-ID: <20230118022348.4137094-1-sdonthineni@nvidia.com>

We are seeing intermittent CPU soft lockups during virtual machine
teardown on a system with GICv4 features enabled. With the current
implementation, __synchronize_hardirq() waits forever for
IRQD_IRQ_INPROGRESS to be cleared. A CPU stuck in this loop for a long
time triggers the soft lockup:

	while (irqd_irq_inprogress(&desc->irq_data))
		cpu_relax();

Call trace from the lockup CPU:

[   87.238866] watchdog: BUG: soft lockup - CPU#37 stuck for 23s!
[   87.250025] CPU: 37 PID: 1031 Comm: qemu-system-aarch64
[   87.358397] Call trace:
[   87.360891]  __synchronize_hardirq+0x48/0x140
[   87.365343]  free_irq+0x138/0x424
[   87.368727]  vgic_v4_teardown+0xa4/0xe0
[   87.372649]  __kvm_vgic_destroy+0x18c/0x194
[   87.376922]  kvm_vgic_destroy+0x28/0x3c
[   87.380839]  kvm_arch_destroy_vm+0x24/0x44
[   87.385024]  kvm_destroy_vm+0x158/0x2c4
[   87.388943]  kvm_vm_release+0x6c/0x98
[   87.392681]  __fput+0x70/0x220
[   87.395800]  ____fput+0x10/0x20
[   87.399005]  task_work_run+0xb4/0x23c
[   87.402746]  do_exit+0x2bc/0x8a4
[   87.406042]  do_group_exit+0x34/0xb0
[   87.409693]  get_signal+0x878/0x8a0
[   87.413254]  do_notify_resume+0x138/0x1530
[   87.417440]  el0_svc+0xdc/0xf0
[   87.420559]  el0t_64_sync_handler+0xf0/0x11c
[   87.424919]  el0t_64_sync+0x18c/0x190

The IRQD_IRQ_INPROGRESS state is lost inside irq_domain_activate_irq(),
which runs before free_irq() is called.
Instrumenting the code confirmed that the IRQD state changes from
0x10401400 to 0x10441600 instead of the expected 0x10401600, which
causes the problem. Call trace from irqd_set_activated():

[   78.983544] irqd_set_activated: lost IRQD_IRQ_INPROGRESS old=0x10401400, new=0x10441600
[   78.992093] CPU: 19 PID: 1511 Comm: qemu-system-aarch64
[   79.008461] Call trace:
[   79.010956]  dump_backtrace.part.0+0xc8/0xe0
[   79.015328]  show_stack+0x18/0x54
[   79.018713]  dump_stack_lvl+0x64/0x7c
[   79.022459]  dump_stack+0x18/0x30
[   79.025842]  irq_domain_activate_irq+0x88/0x94
[   79.030385]  vgic_v3_save_pending_tables+0x260/0x29c
[   79.035463]  vgic_set_common_attr+0xac/0x23c
[   79.039826]  vgic_v3_set_attr+0x48/0x60
[   79.043742]  kvm_device_ioctl+0x120/0x19c
[   79.047840]  __arm64_sys_ioctl+0x42c/0xe00
[   79.052027]  invoke_syscall.constprop.0+0x50/0xe0
[   79.056835]  do_el0_svc+0x58/0x180
[   79.060308]  el0_svc+0x38/0xf0
[   79.063425]  el0t_64_sync_handler+0xf0/0x11c
[   79.067785]  el0t_64_sync+0x18c/0x190

The flag is set and cleared in handle_irq_event(), with desc->lock
dropped while the handler runs:

	irqreturn_t handle_irq_event(struct irq_desc *desc)
	{
		irqreturn_t ret;

		irqd_set(&desc->irq_data, IRQD_IRQ_INPROGRESS);
		raw_spin_unlock(&desc->lock);

		ret = handle_irq_event_percpu(desc);

		raw_spin_lock(&desc->lock);
		irqd_clear(&desc->irq_data, IRQD_IRQ_INPROGRESS);
		return ret;
	}

Based on the traces, in this failure irqd_set_activated() and
handle_irq_event() modify the IRQD state concurrently, with
irqd_set_activated() not holding desc->lock. The irqd_set_activated()
path reads the 'state_use_accessors' word between the set and the clear
of IRQD_IRQ_INPROGRESS, and writes the modified value back after
irqd_clear(&desc->irq_data, IRQD_IRQ_INPROGRESS) has already run, so
the stale IRQD_IRQ_INPROGRESS bit is written back as set.

To fix the lockup issue, hold desc->lock when calling
irq_domain_activate_irq() and irq_domain_deactivate_irq().
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
---
 arch/arm64/kvm/vgic/vgic-v3.c | 6 ++++++
 arch/arm64/kvm/vgic/vgic-v4.c | 4 ++++
 2 files changed, 10 insertions(+)

diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
index 2074521d4a8c..e6aa909fcbe2 100644
--- a/arch/arm64/kvm/vgic/vgic-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-v3.c
@@ -353,22 +353,28 @@ int vgic_v3_lpi_sync_pending_status(struct kvm *kvm, struct vgic_irq *irq)
 static void unmap_all_vpes(struct vgic_dist *dist)
 {
 	struct irq_desc *desc;
+	unsigned long flags;
 	int i;
 
 	for (i = 0; i < dist->its_vm.nr_vpes; i++) {
 		desc = irq_to_desc(dist->its_vm.vpes[i]->irq);
+		raw_spin_lock_irqsave(&desc->lock, flags);
 		irq_domain_deactivate_irq(irq_desc_get_irq_data(desc));
+		raw_spin_unlock_irqrestore(&desc->lock, flags);
 	}
 }
 
 static void map_all_vpes(struct vgic_dist *dist)
 {
 	struct irq_desc *desc;
+	unsigned long flags;
 	int i;
 
 	for (i = 0; i < dist->its_vm.nr_vpes; i++) {
 		desc = irq_to_desc(dist->its_vm.vpes[i]->irq);
+		raw_spin_lock_irqsave(&desc->lock, flags);
 		irq_domain_activate_irq(irq_desc_get_irq_data(desc), false);
+		raw_spin_unlock_irqrestore(&desc->lock, flags);
 	}
 }
diff --git a/arch/arm64/kvm/vgic/vgic-v4.c b/arch/arm64/kvm/vgic/vgic-v4.c
index ad06ba6c9b00..a01b8313e82c 100644
--- a/arch/arm64/kvm/vgic/vgic-v4.c
+++ b/arch/arm64/kvm/vgic/vgic-v4.c
@@ -139,8 +139,10 @@ static void vgic_v4_enable_vsgis(struct kvm_vcpu *vcpu)
 		/* Transfer the full irq state to the vPE */
 		vgic_v4_sync_sgi_config(vpe, irq);
 		desc = irq_to_desc(irq->host_irq);
+		raw_spin_lock(&desc->lock);
 		ret = irq_domain_activate_irq(irq_desc_get_irq_data(desc),
 					      false);
+		raw_spin_unlock(&desc->lock);
 		if (!WARN_ON(ret)) {
 			/* Transfer pending state */
 			ret = irq_set_irqchip_state(irq->host_irq,
@@ -177,7 +179,9 @@ static void vgic_v4_disable_vsgis(struct kvm_vcpu *vcpu)
 		WARN_ON(ret);
 
 		desc = irq_to_desc(irq->host_irq);
+		raw_spin_lock(&desc->lock);
 		irq_domain_deactivate_irq(irq_desc_get_irq_data(desc));
+		raw_spin_unlock(&desc->lock);
 unlock:
 		raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
 		vgic_put_irq(vcpu->kvm, irq);
-- 
2.25.1