* interrupt problem with multiple vnet devices
@ 2009-04-29 21:41 Chris Torek
  2009-04-29 22:04 ` David Miller
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Chris Torek @ 2009-04-29 21:41 UTC (permalink / raw)
  To: sparclinux

One of our guys is working on a problem with Linux as a guest
domain client and multiple vnets bound to the same vswitch:

   ldm add-vnet vnet0 primary-vsw1 ldom1
   ldm add-vnet vnet1 primary-vsw1 ldom1

When configured this way, Linux does not boot properly, as
it winds up perpetually servicing vnet interrupts.

The fix he sent me is below, but I am very suspicious about
this as it essentially removes the linked-list of buckets
that the irq code runs.

I think the real problem is that, when multiple vnet devices are
on the same vswitch, the interrupts get re-enabled "too soon" so
that the sun4v_ivec.S code ends up making the linked list become
circular (by queueing work for a vnet cookie that is already queued).
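
For illustration only (the struct and function names below are
invented, not the kernel's), here is a minimal C sketch of the
head-insertion scheme, showing how re-queueing a bucket that is
already on the per-cpu chain turns the list into a cycle and orphans
anything linked in after it:

    #include <stdio.h>

    /* Hypothetical stand-in for an ino bucket on the per-cpu chain. */
    struct bucket {
        struct bucket *next;
        int ino;
    };

    /* Head insertion, the way the trap code queues work:
     * new->next = head; head = new. */
    static void push_head(struct bucket **head, struct bucket *b)
    {
        b->next = *head;
        *head = b;
    }

    int main(void)
    {
        struct bucket a = { .ino = 1 }, b = { .ino = 2 }, c = { .ino = 3 };
        struct bucket *head = NULL;
        int steps = 0;

        push_head(&head, &a);   /* chain: a                  */
        push_head(&head, &b);   /* chain: b -> a             */
        push_head(&head, &c);   /* chain: c -> b -> a        */
        push_head(&head, &b);   /* duplicate event for b:
                                   b -> c -> b -> ... and a is lost */

        /* A walk like handler_irq()'s never reaches the end of this
         * chain; cap it so the demo terminates. */
        for (struct bucket *p = head; p && steps < 8; p = p->next, steps++)
            printf("servicing ino %d\n", p->ino);
        return 0;
    }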

It seems a little odd to me, though, that the sun4v_ivec.S code
builds up a backwards list (by pushing things onto the head of the
chain after setting the "next" based on the current head).  I guess
this is just because it is too hard to add to the end of the list.
(I would probably do this with the assembly code equivalent of:

    new->next = NULL;
    *tail = new;
    tail = &new->next;

but this requires adding a "tail" pointer to the per-cpu block,
and of course updating it in the irq_64.c code.  Note that we are
still working with older code that has a sparc64 directory.)
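
Purely as a sketch of that pseudocode (the real change would live in
sun4v_ivec.S and the per-cpu trap block; these names are again
invented), the same idea in C with an explicit tail pointer:

    #include <stddef.h>

    struct bucket {
        struct bucket *next;
        int ino;
    };

    /* Hypothetical per-cpu chain kept in arrival order by appending at
     * the tail instead of pushing onto the head. */
    struct percpu_chain {
        struct bucket *head;
        struct bucket **tail;    /* last "next" slot, or &head when empty */
    };

    static void chain_init(struct percpu_chain *c)
    {
        c->head = NULL;
        c->tail = &c->head;
    }

    static void push_tail(struct percpu_chain *c, struct bucket *b)
    {
        b->next = NULL;          /* new->next = NULL;  */
        *c->tail = b;            /* *tail = new;       */
        c->tail = &b->next;      /* tail = &new->next; */
    }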

(This patch also includes code to handle irq redistribution
on ldom guests that have a small number of CPUs, with a kernel
built to run on many more CPUs.  It is still not ideal, as it
does not do chip-then-strand distribution; I'm just including
it here out of laziness :-) / desire-not-to-break-diffs.)

Chris


diff --git a/arch/sparc64/kernel/sun4v_ivec.S b/arch/sparc64/kernel/sun4v_ivec.S
index e2f8e1b..edcd71e 100644
--- a/arch/sparc64/kernel/sun4v_ivec.S
+++ b/arch/sparc64/kernel/sun4v_ivec.S
@@ -108,8 +108,7 @@ sun4v_dev_mondo:
 	sllx	%g3, 4, %g3
 	add	%g4, %g3, %g4
 
-1:	ldx	[%g1], %g2
-	stxa	%g2, [%g4] ASI_PHYS_USE_EC
+1:	stxa	%g0, [%g4] ASI_PHYS_USE_EC
 	stx	%g4, [%g1]
 
 	/* Signal the interrupt by setting (1 << pil) in %softint.  */
diff --git a/arch/sparc64/kernel/irq.c b/arch/sparc64/kernel/irq.c
index 7872476..2584b9e 100644
--- a/arch/sparc64/kernel/irq.c
+++ b/arch/sparc64/kernel/irq.c
@@ -70,6 +70,7 @@ static unsigned long bucket_get_chain_pa(unsigned long bucket_pa)
 	return ret;
 }
 
+#if 0 /* This one is no longer needed. */
 static void bucket_clear_chain_pa(unsigned long bucket_pa)
 {
 	__asm__ __volatile__("stxa	%%g0, [%0] %1"
@@ -79,6 +80,7 @@ static void bucket_clear_chain_pa(unsigned long bucket_pa)
 					     __irq_chain_pa)),
 			       "i" (ASI_PHYS_USE_EC));
 }
+#endif
 
 static unsigned int bucket_get_virt_irq(unsigned long bucket_pa)
 {
@@ -251,7 +253,7 @@ static int irq_choose_cpu(unsigned int virt_irq)
 	cpumask_t mask = irq_desc[virt_irq].affinity;
 	int cpuid;
 
-	if (cpus_equal(mask, CPU_MASK_ALL)) {
+	if (cpus_subset(mask, CPU_MASK_ALL)) {
 		static int irq_rover;
 		static DEFINE_SPINLOCK(irq_rover_lock);
 		unsigned long flags;
@@ -735,10 +737,12 @@ void handler_irq(int irq, struct pt_regs *regs)
 
 		next_pa = bucket_get_chain_pa(bucket_pa);
 		virt_irq = bucket_get_virt_irq(bucket_pa);
-		bucket_clear_chain_pa(bucket_pa);
 
 		desc = irq_desc + virt_irq;
 
+		if (desc->chip->set_affinity)
+			desc->chip->set_affinity(virt_irq, cpu_online_map);
+
 		desc->handle_irq(virt_irq, desc);
 
 		bucket_pa = next_pa;

* Re: interrupt problem with multiple vnet devices
  2009-04-29 21:41 interrupt problem with multiple vnet devices Chris Torek
@ 2009-04-29 22:04 ` David Miller
  2009-04-29 22:52 ` David Miller
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: David Miller @ 2009-04-29 22:04 UTC (permalink / raw)
  To: sparclinux

From: Chris Torek <chris.torek@windriver.com>
Date: Wed, 29 Apr 2009 15:41:48 -0600

> When configured this way, Linux does not boot properly, as
> it winds up perpetually servicing vnet interrupts.
> 
> The fix he sent me is below, but I am very suspicious about
> this as it essentially removes the linked-list of buckets
> that the irq code runs.
> 
> I think the real problem is that, when multiple vnet devices are
> on the same vswitch, the interrupts get re-enabled "too soon" so
> that the sun4v_ivec.S code ends up making the linked list become
> circular (by queueing work for a vnet cookie that is already queued).

Older versions of the hypervisor have a bug in its LDOM code
wherein interrupt vectors from virtual devices can be sent
twice before an intervening CLEAR event.

But we should be working around that properly.  This check in
sun4v_irq_eoi() and sun4v_virq_eoi():

	if (unlikely(desc->status & (IRQ_DISABLED|IRQ_INPROGRESS)))
		return;

Should prevent any problems due to that bug.

* Re: interrupt problem with multiple vnet devices
  2009-04-29 21:41 interrupt problem with multiple vnet devices Chris Torek
  2009-04-29 22:04 ` David Miller
@ 2009-04-29 22:52 ` David Miller
  2009-04-30  0:06 ` Chris Torek
  2009-04-30  0:18 ` David Miller
  3 siblings, 0 replies; 5+ messages in thread
From: David Miller @ 2009-04-29 22:52 UTC (permalink / raw)
  To: sparclinux

From: David Miller <davem@davemloft.net>
Date: Wed, 29 Apr 2009 15:04:08 -0700 (PDT)

> But we should be working around that properly.  This check in
> sun4v_irq_eoi() and sun4v_virq_eoi():
> 
> 	if (unlikely(desc->status & (IRQ_DISABLED|IRQ_INPROGRESS)))
> 		return;
> 
> Should prevent any problems due to that bug.

This is the commit that test comes from.  I assume you guys
would check the commit logs for something like this, but just
in case...

commit 5a606b72a4309a656cd1a19ad137dc5557c4b8ea
Author: David S. Miller <davem@sunset.davemloft.net>
Date:   Mon Jul 9 22:40:36 2007 -0700

    [SPARC64]: Do not ACK an INO if it is disabled or inprogress.
    
    This is also a partial workaround for a bug in the LDOM firmware which
    double-transmits RX inos during high load.  Without this, such an
    event causes the kernel to loop forever in the interrupt call chain
    ACK'ing but never actually running the IRQ handler (and thus clearing
    the interrupt condition in the device).
    
    There is still a bad potential effect when double INOs occur,
    not covered by this changeset.  Namely, if the INO is already on
    the per-cpu INO vector list, we still blindly re-insert it and
    thus we can end up losing interrupts already linked in after
    it.
    
    We could deal with that by traversing the list before insertion,
    but that's too expensive for this edge case.
    
    Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/arch/sparc64/kernel/irq.c b/arch/sparc64/kernel/irq.c
index 6b6165d..634194e 100644
--- a/arch/sparc64/kernel/irq.c
+++ b/arch/sparc64/kernel/irq.c
@@ -309,6 +309,10 @@ static void sun4u_irq_disable(unsigned int virt_irq)
 static void sun4u_irq_end(unsigned int virt_irq)
 {
 	struct irq_handler_data *data = get_irq_chip_data(virt_irq);
+	struct irq_desc *desc = irq_desc + virt_irq;
+
+	if (unlikely(desc->status & (IRQ_DISABLED|IRQ_INPROGRESS)))
+		return;
 
 	if (likely(data))
 		upa_writeq(ICLR_IDLE, data->iclr);
@@ -373,6 +377,10 @@ static void sun4v_irq_end(unsigned int virt_irq)
 {
 	struct ino_bucket *bucket = virt_irq_to_bucket(virt_irq);
 	unsigned int ino = bucket - &ivector_table[0];
+	struct irq_desc *desc = irq_desc + virt_irq;
+
+	if (unlikely(desc->status & (IRQ_DISABLED|IRQ_INPROGRESS)))
+		return;
 
 	if (likely(bucket)) {
 		int err;
@@ -443,6 +451,10 @@ static void sun4v_virq_end(unsigned int virt_irq)
 {
 	struct ino_bucket *bucket = virt_irq_to_bucket(virt_irq);
 	unsigned int ino = bucket - &ivector_table[0];
+	struct irq_desc *desc = irq_desc + virt_irq;
+
+	if (unlikely(desc->status & (IRQ_DISABLED|IRQ_INPROGRESS)))
+		return;
 
 	if (likely(bucket)) {
 		unsigned long dev_handle, dev_ino;

* Re: interrupt problem with multiple vnet devices
  2009-04-29 21:41 interrupt problem with multiple vnet devices Chris Torek
  2009-04-29 22:04 ` David Miller
  2009-04-29 22:52 ` David Miller
@ 2009-04-30  0:06 ` Chris Torek
  2009-04-30  0:18 ` David Miller
  3 siblings, 0 replies; 5+ messages in thread
From: Chris Torek @ 2009-04-30  0:06 UTC (permalink / raw)
  To: sparclinux

>This is the commit that test comes from.  I assume you guys
>would check the commit logs for something like this, but just
>in case...
[snippage - will just retain commit ID]
>commit 5a606b72a4309a656cd1a19ad137dc5557c4b8ea

Yes, this one is in.  I did (and still do) suspect a hypervisor
bug.  I think in this case the problem is that this test is not
sufficient for the "two vnets on one vswitch" case, though,
because this results in the desc->status not having
IRQ_INPROGRESS set on the "other" vnet that gets a double
interrupt.
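
As a hedged illustration of that point (the struct and flag values
below are stand-ins, not the kernel's definitions): the EOI-time guard
quoted earlier only consults the descriptor of the virt_irq being
EOI'd, so a duplicate event delivered on the other vnet's virt_irq
sees a status with neither bit set and is not suppressed:

    #include <stdbool.h>

    #define IRQ_DISABLED    0x1  /* stand-in values, not the kernel's */
    #define IRQ_INPROGRESS  0x2

    struct fake_irq_desc {
        unsigned int status;
    };

    /* Mirrors the quoted test:
     *     if (unlikely(desc->status & (IRQ_DISABLED|IRQ_INPROGRESS)))
     *             return;
     * It only consults the descriptor it is handed, so a duplicate that
     * arrives on a different virt_irq (status still 0) sails through. */
    static bool duplicate_suppressed(const struct fake_irq_desc *desc)
    {
        return desc->status & (IRQ_DISABLED | IRQ_INPROGRESS);
    }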

I'm also wondering now if it has something to do with the code we
added that attempts to redistribute interrupts, so that in the case
of a double interrupt, the second one goes to a different CPU
than the CPU already handling the first one.  Although we
should still see IRQ_INPROGRESS then...

Chris

* Re: interrupt problem with multiple vnet devices
  2009-04-29 21:41 interrupt problem with multiple vnet devices Chris Torek
                   ` (2 preceding siblings ...)
  2009-04-30  0:06 ` Chris Torek
@ 2009-04-30  0:18 ` David Miller
  3 siblings, 0 replies; 5+ messages in thread
From: David Miller @ 2009-04-30  0:18 UTC (permalink / raw)
  To: sparclinux

From: Chris Torek <chris.torek@windriver.com>
Date: Wed, 29 Apr 2009 18:06:57 -0600

> Yes, this one is in.  I did (and still do) suspect a hypervisor
> bug.  I think in this case the problem is that this test is not
> sufficient for the "two vnets on one vswitch" case, though,
> because this results in the desc->status not having
> IRQ_INPROGRESS set on the "other" vnet that gets a double
> interrupt.

You can find out whether it's this hypervisor bug by simply updating
your hypervisor to the latest version available.

> I'm also wondering now if it has something to do with the code we
> added that attempts to redistribute interrupts, so that in the case
> of a double interrupt, the second one goes to a different CPU
> than the CPU already handling the first one.  Although we
> should still see IRQ_INPROGRESS then...

If you're making local changes on that scale, it's irresponsible
to not make mention of them when reporting a bug you want this
list to look at :-(
