* [PATCH v2] KVM: arm/arm64: BUG FIX: Do not inject spurious interrupts
@ 2015-09-25 14:00 Pavel Fedin
  2015-09-29  7:25 ` Christoffer Dall
  0 siblings, 1 reply; 6+ messages in thread
From: Pavel Fedin @ 2015-09-25 14:00 UTC (permalink / raw)
  To: kvmarm, kvm; +Cc: Christoffer Dall, 'Marc Zyngier'

Commit 71760950bf3dc796e5e53ea3300dec724a09f593
("arm/arm64: KVM: add a common vgic_queue_irq_to_lr fn") introduced the
vgic_queue_irq_to_lr() function with an additional
vgic_dist_irq_is_pending() check before setting the LR_STATE_PENDING bit.
This can lead to the following situation when userland quickly drops a
level-sensitive IRQ back to the inactive state:
1. Userland injects an IRQ with level == 1, this ends up in
   vgic_update_irq_pending(), which in turn calls
   vgic_dist_irq_set_pending() for this IRQ.
2. The vCPU gets kicked, but the kernel does not manage to reschedule it
   quickly (!!!)
3. Userland quickly resets the IRQ to level == 0. vgic_update_irq_pending()
   in this case will call vgic_dist_irq_clear_pending() and reset the
   pending flag.
4. vCPU finally wakes up. It successfully rolls through
   __kvm_vgic_flush_hwstate(), which populates vGIC registers. However,
   since neither pending nor active flags are now set for this IRQ,
   vgic_queue_irq_to_lr() does not set any state bits on this LR at all.
   Since this is a level-sensitive IRQ, we end up with an LR containing
   only the LR_EOI_INT bit, causing an unnecessary immediate exit from the
   guest.
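
The four steps above can be replayed in a small user-space sketch. This is a toy model only, not the kernel code: struct toy_irq, the toy_* functions and the numeric values of the LR_* bits are assumptions made up for illustration, and the real vgic keeps this state in bitmaps guarded by the distributor lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values; the real LR_* definitions may differ. */
#define LR_STATE_PENDING (1u << 0)
#define LR_STATE_ACTIVE  (1u << 1)
#define LR_EOI_INT       (1u << 2)

/* Toy per-IRQ state (hypothetical layout). */
struct toy_irq {
	bool level_triggered;
	bool dist_pending;	/* distributor-level pending flag */
	bool dist_active;	/* distributor-level active flag */
	bool cpu_pending;	/* per-vCPU "try to queue this IRQ" bit */
};

/* Steps 1 and 3: userland changes the line level. */
static void toy_update_irq_pending(struct toy_irq *irq, bool level)
{
	if (level) {
		irq->dist_pending = true;
		irq->cpu_pending = true;
	} else if (irq->level_triggered) {
		irq->dist_pending = false;
		/* Pre-patch behaviour: cpu_pending is left set. */
	}
}

/* Step 4: build the list register state as the pre-patch code would. */
static uint32_t toy_queue_irq_to_lr(const struct toy_irq *irq)
{
	uint32_t lr = 0;

	if (irq->dist_active)
		lr |= LR_STATE_ACTIVE;
	else if (irq->dist_pending)
		lr |= LR_STATE_PENDING;

	if (irq->level_triggered)
		lr |= LR_EOI_INT;
	return lr;
}

/* Returns the LR state the vCPU ends up with after the raise/lower race. */
uint32_t toy_race_lr(void)
{
	struct toy_irq irq = { .level_triggered = true };

	toy_update_irq_pending(&irq, true);	/* step 1 */
	/* step 2: vCPU kicked, but not rescheduled yet */
	toy_update_irq_pending(&irq, false);	/* step 3 */

	/* The per-vCPU bit survived the drop, so the IRQ still gets queued. */
	assert(irq.cpu_pending);
	return toy_queue_irq_to_lr(&irq);	/* step 4 */
}
```

In this model toy_race_lr() returns a value with only LR_EOI_INT set, which corresponds to the state-less LR described in step 4.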

This patch fixes the problem by adding the missing vgic_cpu_irq_clear()
call. As a result, the IRQ is not included in any list if it is picked up
after being dropped to the inactive level. Since this is a level-sensitive
IRQ, this is correct behavior. Additionally, irq_pending_on_cpu is also
reset if this was the only pending interrupt, saving us from unnecessary
wakeups.
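
A rough user-space sketch of the patched drop path, under the same caveat: struct toy_vcpu and the toy_* helpers are hypothetical names, and the real compute_pending_for_cpu() works on enabled and pending bitmaps rather than a single 32-bit word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy per-vCPU state; field names only loosely follow the patch. */
struct toy_vcpu {
	uint32_t dist_pending;		/* distributor pending bits, one per IRQ */
	uint32_t cpu_pending;		/* IRQs this vCPU would try to queue */
	bool irq_pending_on_cpu;	/* summary bit that triggers a wakeup */
};

/* Simplified stand-in for compute_pending_for_cpu(). */
static bool toy_compute_pending_for_cpu(const struct toy_vcpu *v)
{
	return v->cpu_pending != 0;
}

/* Userland raises the line for the given IRQ. */
void toy_raise(struct toy_vcpu *v, unsigned int irq)
{
	assert(irq < 32);
	v->dist_pending |= 1u << irq;
	v->cpu_pending |= 1u << irq;
	v->irq_pending_on_cpu = true;
}

/* Userland drops a level-sensitive line, following the patched logic. */
void toy_lower_level_irq(struct toy_vcpu *v, unsigned int irq)
{
	assert(irq < 32);
	v->dist_pending &= ~(1u << irq);
	v->cpu_pending &= ~(1u << irq);		/* models vgic_cpu_irq_clear() */
	if (!toy_compute_pending_for_cpu(v))	/* models the v2 recheck */
		v->irq_pending_on_cpu = false;
}
```

With two IRQs raised, lowering the first leaves irq_pending_on_cpu set; lowering the second clears it, so the vCPU is not woken up for nothing.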

The bug was caught on ARM64 kernel v4.1.6, running a qemu "virt" guest,
where it was triggered by the emulated pl011.

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
---
v1 => v2:
Recheck status and clear irq_pending_on_cpu if needed
---
 virt/kvm/arm/vgic.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 6718135..2a2e945 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1109,7 +1109,8 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
 		kvm_debug("Set active, clear distributor: 0x%x\n", vlr.state);
 		vgic_irq_clear_active(vcpu, irq);
 		vgic_update_state(vcpu->kvm);
-	} else if (vgic_dist_irq_is_pending(vcpu, irq)) {
+	} else {
+		WARN_ON(!vgic_dist_irq_is_pending(vcpu, irq));
 		vlr.state |= LR_STATE_PENDING;
 		kvm_debug("Set pending: 0x%x\n", vlr.state);
 	}
@@ -1565,8 +1566,12 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
 	} else {
 		if (level_triggered) {
 			vgic_dist_irq_clear_level(vcpu, irq_num);
-			if (!vgic_dist_irq_soft_pend(vcpu, irq_num))
+			if (!vgic_dist_irq_soft_pend(vcpu, irq_num)) {
 				vgic_dist_irq_clear_pending(vcpu, irq_num);
+				vgic_cpu_irq_clear(vcpu, irq_num);
+				if (!compute_pending_for_cpu(vcpu))
+					clear_bit(cpuid, dist->irq_pending_on_cpu);
+			}
 		}
 
 		ret = false;
-- 
2.4.4



* Re: [PATCH v2] KVM: arm/arm64: BUG FIX: Do not inject spurious interrupts
  2015-09-25 14:00 [PATCH v2] KVM: arm/arm64: BUG FIX: Do not inject spurious interrupts Pavel Fedin
@ 2015-09-29  7:25 ` Christoffer Dall
  2015-09-30 10:24   ` Pavel Fedin
  2015-10-09 14:41   ` Pavel Fedin
  0 siblings, 2 replies; 6+ messages in thread
From: Christoffer Dall @ 2015-09-29  7:25 UTC (permalink / raw)
  To: Pavel Fedin; +Cc: kvmarm, kvm, 'Marc Zyngier'

On Fri, Sep 25, 2015 at 05:00:29PM +0300, Pavel Fedin wrote:
> Commit 71760950bf3dc796e5e53ea3300dec724a09f593
> ("arm/arm64: KVM: add a common vgic_queue_irq_to_lr fn") introduced the
> vgic_queue_irq_to_lr() function with an additional
> vgic_dist_irq_is_pending() check before setting the LR_STATE_PENDING bit.
> This can lead to the following situation when userland quickly drops a
> level-sensitive IRQ back to the inactive state:
> 1. Userland injects an IRQ with level == 1, this ends up in
>    vgic_update_irq_pending(), which in turn calls
>    vgic_dist_irq_set_pending() for this IRQ.
> 2. vCPU gets kicked. But kernel does not manage to reschedule it quickly
>    (!!!)
> 3. Userland quickly resets the IRQ to level == 0. vgic_update_irq_pending()
>    in this case will call vgic_dist_irq_clear_pending() and reset the
>    pending flag.
> 4. vCPU finally wakes up. It successfully rolls through
>    __kvm_vgic_flush_hwstate(), which populates vGIC registers. However,
>    since neither pending nor active flags are now set for this IRQ,
>    vgic_queue_irq_to_lr() does not set any state bits on this LR at all.
>    Since this is a level-sensitive IRQ, we end up with an LR containing
>    only the LR_EOI_INT bit, causing an unnecessary immediate exit from the
>    guest.
> 
> This patch fixes the problem by adding the missing vgic_cpu_irq_clear()
> call. As a result, the IRQ is not included in any list if it is picked up
> after being dropped to the inactive level. Since this is a level-sensitive
> IRQ, this is correct behavior. Additionally, irq_pending_on_cpu is also
> reset if this was the only pending interrupt, saving us from unnecessary
> wakeups.
> 
> The bug was caught on ARM64 kernel v4.1.6, running qemu "virt" guest,
> where it was caused by emulated pl011.
> 
> Signed-off-by: Pavel Fedin <p.fedin@samsung.com>

I reworked the commit message and applied this patch.

Thanks,
-Christoffer


* RE: [PATCH v2] KVM: arm/arm64: BUG FIX: Do not inject spurious interrupts
  2015-09-29  7:25 ` Christoffer Dall
@ 2015-09-30 10:24   ` Pavel Fedin
  2015-10-09 14:41   ` Pavel Fedin
  1 sibling, 0 replies; 6+ messages in thread
From: Pavel Fedin @ 2015-09-30 10:24 UTC (permalink / raw)
  To: 'Christoffer Dall'; +Cc: 'Marc Zyngier', kvmarm, kvm

 Hello!

> I reworked the commit message and applied this patch.

 Thank you very much.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia


* RE: [PATCH v2] KVM: arm/arm64: BUG FIX: Do not inject spurious interrupts
  2015-09-29  7:25 ` Christoffer Dall
  2015-09-30 10:24   ` Pavel Fedin
@ 2015-10-09 14:41   ` Pavel Fedin
  2015-10-10 15:00     ` Christoffer Dall
  1 sibling, 1 reply; 6+ messages in thread
From: Pavel Fedin @ 2015-10-09 14:41 UTC (permalink / raw)
  To: 'Christoffer Dall'
  Cc: 'Marc Zyngier', andre.przywara, kvmarm, kvm

 Hello!

> I reworked the commit message and applied this patch.

 During testing I discovered a problem with this patch when combined with the vITS series by Andre.
 The problem is that compute_pending_for_cpu() does not know anything about LPIs. Therefore, we can
reset the irq_pending_on_cpu bit even if some LPIs (and only LPIs) are pending. This causes LPI loss.
 This confirms that clearing irq_pending_on_cpu anywhere other than
__kvm_vgic_flush_hwstate() is a bad idea. I would suggest going back to v1 of the patch (without
clearing this bit). We can add a clarifying description to the commit message like this:

--- cut ---
In some situations a level-sensitive IRQ disappears before it has been
processed. This is normal, and in this situation we lose the IRQ, the same
as real HW does. The aim of this patch is to handle this situation more
correctly. However, dist->irq_pending_on_cpu stays set until the vCPU
itself rechecks its status. Therefore, this bit does not guarantee that
something is pending at the given moment; it should be treated as an
attention flag, saying that something has happened on this vCPU. The event
may already be gone, but a wakeup and a status recheck are still needed.
--- cut ---

 Would you be happy with this? An alternative would be to add a check for pending LPIs, but wouldn't
that be too complex for a simple problem?

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia


* Re: [PATCH v2] KVM: arm/arm64: BUG FIX: Do not inject spurious interrupts
  2015-10-09 14:41   ` Pavel Fedin
@ 2015-10-10 15:00     ` Christoffer Dall
  2015-10-12  7:14       ` Pavel Fedin
  0 siblings, 1 reply; 6+ messages in thread
From: Christoffer Dall @ 2015-10-10 15:00 UTC (permalink / raw)
  To: Pavel Fedin; +Cc: 'Marc Zyngier', andre.przywara, kvmarm, kvm

On Fri, Oct 09, 2015 at 05:41:11PM +0300, Pavel Fedin wrote:
>  Hello!
> 
> > I reworked the commit message and applied this patch.
> 
>  During testing I discovered a problem with this patch when combined with the vITS series by Andre.
>  The problem is that compute_pending_for_cpu() does not know anything about LPIs. Therefore, we can
> reset the irq_pending_on_cpu bit even if some LPIs (and only LPIs) are pending. This causes LPI loss.

I haven't looked at the ITS series in detail yet, so I cannot comment on
this.

>  This confirms that clearing irq_pending_on_cpu anywhere other than
> __kvm_vgic_flush_hwstate() is a bad idea. I would suggest going back to v1 of the patch (without
> clearing this bit). We can add a clarifying description to the commit message like this:
> 
> --- cut ---
> In some situations a level-sensitive IRQ disappears before it has been
> processed. This is normal, and in this situation we lose the IRQ, the same
> as real HW does. The aim of this patch is to handle this situation more
> correctly. However, dist->irq_pending_on_cpu stays set until the vCPU
> itself rechecks its status. Therefore, this bit does not guarantee that
> something is pending at the given moment; it should be treated as an
> attention flag, saying that something has happened on this vCPU. The event
> may already be gone, but a wakeup and a status recheck are still needed.
> --- cut ---

I really don't want to have an inconsistent state in our data
structures, this whole thing is plenty fragile as it is.

> 
>  Would you be happy with this? An alternative would be to add a check for pending LPIs, but wouldn't
> it just be too complex for a simple problem?
> 

My concern at this point is to try to keep this thing stable.

It is really up to whoever adds support for LPIs to make sure it's done
correctly.  So I think this is for Andre to work out in his ITS series.

This patch fixes an issue with the current code in the correct way as
far as I can tell.

Thanks,
-Christoffer


* RE: [PATCH v2] KVM: arm/arm64: BUG FIX: Do not inject spurious interrupts
  2015-10-10 15:00     ` Christoffer Dall
@ 2015-10-12  7:14       ` Pavel Fedin
  0 siblings, 0 replies; 6+ messages in thread
From: Pavel Fedin @ 2015-10-12  7:14 UTC (permalink / raw)
  To: 'Christoffer Dall'
  Cc: kvmarm, kvm, 'Marc Zyngier', andre.przywara

 Hello!

> It is really up to whoever adds support for LPIs to make sure it's done
> correctly.  So I think this is for Andre to work out in his ITS series.
> 
> This patch fixes an issue with the current code in the correct way as
> far as I can tell.

 Ok. An alternative is to introduce a function that checks for any pending LPIs. I will
propose a fix on top of the current series later.
 I'm cc'ing this whole discussion to Andre.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia


