linux-kernel.vger.kernel.org archive mirror
* [PATCH] skge: fix occasional BUG during MTU change
@ 2009-04-07 16:36 Michal Schmidt
  2009-04-08 23:01 ` David Miller
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Michal Schmidt @ 2009-04-07 16:36 UTC
  To: netdev; +Cc: linux-kernel, Stephen Hemminger

The BUG_ON(skge->tx_ring.to_use != skge->tx_ring.to_clean) in skge_up()
was sometimes observed when setting MTU.

skge_down() stops the TX queue, but then re-enables it by mistake because
it calls skge_tx_clean(), which ends with netif_wake_queue().
Fix it by moving the netif_wake_queue() call out of skge_tx_clean() and
into its other caller, skge_tx_timeout(). Also, to make sure start_xmit
is not in progress on another CPU, skge_down() now calls
netif_tx_disable() instead of netif_stop_queue().

The bug was reported to me by Jiri Jilek, whose Debian system sometimes
failed to boot. He tested the patch and the bug no longer occurred.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
 drivers/net/skge.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/skge.c b/drivers/net/skge.c
index 952d37f..b2a05af 100644
--- a/drivers/net/skge.c
+++ b/drivers/net/skge.c
@@ -2674,7 +2674,7 @@ static int skge_down(struct net_device *dev)
 	if (netif_msg_ifdown(skge))
 		printk(KERN_INFO PFX "%s: disabling interface\n", dev->name);
 
-	netif_stop_queue(dev);
+	netif_tx_disable(dev);
 
 	if (hw->chip_id == CHIP_ID_GENESIS && hw->phy_type == SK_PHY_XMAC)
 		del_timer_sync(&skge->link_timer);
@@ -2881,7 +2881,6 @@ static void skge_tx_clean(struct net_device *dev)
 	}
 
 	skge->tx_ring.to_clean = e;
-	netif_wake_queue(dev);
 }
 
 static void skge_tx_timeout(struct net_device *dev)
@@ -2893,6 +2892,7 @@ static void skge_tx_timeout(struct net_device *dev)
 
 	skge_write8(skge->hw, Q_ADDR(txqaddr[skge->port], Q_CSR), CSR_STOP);
 	skge_tx_clean(dev);
+	netif_wake_queue(dev);
 }
 
 static int skge_change_mtu(struct net_device *dev, int new_mtu)
-- 
1.6.2.2
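
For readers following the changelog, the sequence it describes can be
sketched as a small userspace model. This is only an illustration under
stated assumptions: the toy_* names and struct toy_port are hypothetical
and are not the driver's real types, and the model is single-threaded, so
it shows the "queue woken by mistake" half of the problem but not the
cross-CPU synchronisation that netif_tx_disable() adds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's per-port state (not struct skge_port). */
struct toy_port {
	bool queue_stopped;	/* models the netdev TX queue state */
	int to_use;		/* next ring slot the transmit path would fill */
	int to_clean;		/* next ring slot the cleanup path would free */
};

/* Models start_xmit: a packet is only queued while the queue is awake. */
static bool toy_xmit(struct toy_port *p)
{
	if (p->queue_stopped)
		return false;
	p->to_use++;
	return true;
}

/*
 * Models skge_tx_clean(): reclaim every pending slot.
 * 'wake' selects the pre-patch behaviour, where the cleanup
 * also woke the queue.
 */
static void toy_tx_clean(struct toy_port *p, bool wake)
{
	assert(p->to_clean <= p->to_use);
	p->to_clean = p->to_use;
	if (wake)
		p->queue_stopped = false;
}

/* Models skge_down(): stop the queue, then clean the ring. */
static void toy_down(struct toy_port *p, bool clean_wakes_queue)
{
	p->queue_stopped = true;
	toy_tx_clean(p, clean_wakes_queue);
}

/* Models the check in skge_up(): true means the BUG_ON would fire. */
static bool toy_up_would_bug(const struct toy_port *p)
{
	return p->to_use != p->to_clean;
}

/* One MTU change: down, a transmit attempt in between, then the up check. */
static bool toy_mtu_change_bugs(bool clean_wakes_queue)
{
	struct toy_port p = { .queue_stopped = false, .to_use = 3, .to_clean = 1 };

	toy_down(&p, clean_wakes_queue);
	(void)toy_xmit(&p);	/* transmit attempt arriving before the restart */
	return toy_up_would_bug(&p);
}
```

In this model, toy_mtu_change_bugs(true) corresponds to the pre-patch
code, where the cleanup wakes the queue and a packet can slip into the
ring before the restart, and toy_mtu_change_bugs(false) corresponds to
the patched code, where the queue stays stopped until the restart.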



* Re: [PATCH] skge: fix occasional BUG during MTU change
  2009-04-07 16:36 [PATCH] skge: fix occasional BUG during MTU change Michal Schmidt
@ 2009-04-08 23:01 ` David Miller
  2009-04-08 23:06   ` Stephen Hemminger
  2009-04-10  4:59 ` Andrew Morton
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: David Miller @ 2009-04-08 23:01 UTC
  To: mschmidt; +Cc: netdev, linux-kernel, shemminger

From: Michal Schmidt <mschmidt@redhat.com>
Date: Tue, 7 Apr 2009 18:36:23 +0200

> The BUG_ON(skge->tx_ring.to_use != skge->tx_ring.to_clean) in skge_up()
> was sometimes observed when setting MTU.
> 
> skge_down() disables the TX queue, but then reenables it by mistake via
> skge_tx_clean().
> Fix it by moving the waking of the queue from skge_tx_clean() to the
> other caller. And to make sure start_xmit is not in progress on another
> CPU, skge_down() should call netif_tx_disable().
> 
> The bug was reported to me by Jiri Jilek whose Debian system sometimes
> failed to boot. He tested the patch and the bug did not happen anymore.
> 
> Signed-off-by: Michal Schmidt <mschmidt@redhat.com>

Stephen, an ACK possibly?


* Re: [PATCH] skge: fix occasional BUG during MTU change
  2009-04-08 23:01 ` David Miller
@ 2009-04-08 23:06   ` Stephen Hemminger
  2009-04-08 23:08     ` David Miller
  0 siblings, 1 reply; 8+ messages in thread
From: Stephen Hemminger @ 2009-04-08 23:06 UTC
  To: David Miller; +Cc: mschmidt, netdev, linux-kernel

On Wed, 08 Apr 2009 16:01:52 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> From: Michal Schmidt <mschmidt@redhat.com>
> Date: Tue, 7 Apr 2009 18:36:23 +0200
> 
> > The BUG_ON(skge->tx_ring.to_use != skge->tx_ring.to_clean) in skge_up()
> > was sometimes observed when setting MTU.
> > 
> > skge_down() disables the TX queue, but then reenables it by mistake via
> > skge_tx_clean().
> > Fix it by moving the waking of the queue from skge_tx_clean() to the
> > other caller. And to make sure start_xmit is not in progress on another
> > CPU, skge_down() should call netif_tx_disable().
> > 
> > The bug was reported to me by Jiri Jilek whose Debian system sometimes
> > failed to boot. He tested the patch and the bug did not happen anymore.
> > 
> > Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
> 
> Stephen, an ACK possibly?

I wanted to test on real hardware, and am offsite this week.


* Re: [PATCH] skge: fix occasional BUG during MTU change
  2009-04-08 23:06   ` Stephen Hemminger
@ 2009-04-08 23:08     ` David Miller
  0 siblings, 0 replies; 8+ messages in thread
From: David Miller @ 2009-04-08 23:08 UTC
  To: shemminger; +Cc: mschmidt, netdev, linux-kernel

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Wed, 8 Apr 2009 16:06:21 -0700

> On Wed, 08 Apr 2009 16:01:52 -0700 (PDT)
> David Miller <davem@davemloft.net> wrote:
> 
>> From: Michal Schmidt <mschmidt@redhat.com>
>> Date: Tue, 7 Apr 2009 18:36:23 +0200
>> 
>> > The BUG_ON(skge->tx_ring.to_use != skge->tx_ring.to_clean) in skge_up()
>> > was sometimes observed when setting MTU.
>> > 
>> > skge_down() disables the TX queue, but then reenables it by mistake via
>> > skge_tx_clean().
>> > Fix it by moving the waking of the queue from skge_tx_clean() to the
>> > other caller. And to make sure start_xmit is not in progress on another
>> > CPU, skge_down() should call netif_tx_disable().
>> > 
>> > The bug was reported to me by Jiri Jilek whose Debian system sometimes
>> > failed to boot. He tested the patch and the bug did not happen anymore.
>> > 
>> > Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
>> 
>> Stephen, an ACK possibly?
> 
> I wanted to test on real hardware, and am offsite this week.

Ok, I'll wait for that, thanks!


* Re: [PATCH] skge: fix occasional BUG during MTU change
  2009-04-07 16:36 [PATCH] skge: fix occasional BUG during MTU change Michal Schmidt
  2009-04-08 23:01 ` David Miller
@ 2009-04-10  4:59 ` Andrew Morton
  2009-04-13 23:23 ` David Miller
  2009-04-14 17:55 ` Stephen Hemminger
  3 siblings, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2009-04-10  4:59 UTC
  To: Michal Schmidt; +Cc: netdev, linux-kernel, Stephen Hemminger

On Tue, 7 Apr 2009 18:36:23 +0200 Michal Schmidt <mschmidt@redhat.com> wrote:

> The BUG_ON(skge->tx_ring.to_use != skge->tx_ring.to_clean) in skge_up()
> was sometimes observed when setting MTU.
> 
> skge_down() disables the TX queue, but then reenables it by mistake via
> skge_tx_clean().
> Fix it by moving the waking of the queue from skge_tx_clean() to the
> other caller. And to make sure start_xmit is not in progress on another
> CPU, skge_down() should call netif_tx_disable().
> 
> The bug was reported to me by Jiri Jilek whose Debian system sometimes
> failed to boot. He tested the patch and the bug did not happen anymore.

It's conventional to add the reporter's "Reported-by:" tag to the
changelog in this situation.

> Signed-off-by: Michal Schmidt <mschmidt@redhat.com>

As the bug is present in 2.6.29 (and possibly earlier?) it's
appropriate to add a Cc: <stable@kernel.org> too.  This makes davem go
mad at you, but I prefer getting madded at over possibly losing bugfixes ;)

> 
> diff --git a/drivers/net/skge.c b/drivers/net/skge.c
> index 952d37f..b2a05af 100644
> --- a/drivers/net/skge.c
> +++ b/drivers/net/skge.c
> @@ -2674,7 +2674,7 @@ static int skge_down(struct net_device *dev)
>  	if (netif_msg_ifdown(skge))
>  		printk(KERN_INFO PFX "%s: disabling interface\n", dev->name);
>  
> -	netif_stop_queue(dev);
> +	netif_tx_disable(dev);
>  
>  	if (hw->chip_id == CHIP_ID_GENESIS && hw->phy_type == SK_PHY_XMAC)
>  		del_timer_sync(&skge->link_timer);
> @@ -2881,7 +2881,6 @@ static void skge_tx_clean(struct net_device *dev)
>  	}
>  
>  	skge->tx_ring.to_clean = e;
> -	netif_wake_queue(dev);
>  }
>  
>  static void skge_tx_timeout(struct net_device *dev)
> @@ -2893,6 +2892,7 @@ static void skge_tx_timeout(struct net_device *dev)
>  
>  	skge_write8(skge->hw, Q_ADDR(txqaddr[skge->port], Q_CSR), CSR_STOP);
>  	skge_tx_clean(dev);
> +	netif_wake_queue(dev);
>  }
>  
>  static int skge_change_mtu(struct net_device *dev, int new_mtu)



* Re: [PATCH] skge: fix occasional BUG during MTU change
  2009-04-07 16:36 [PATCH] skge: fix occasional BUG during MTU change Michal Schmidt
  2009-04-08 23:01 ` David Miller
  2009-04-10  4:59 ` Andrew Morton
@ 2009-04-13 23:23 ` David Miller
  2009-04-14 17:55 ` Stephen Hemminger
  3 siblings, 0 replies; 8+ messages in thread
From: David Miller @ 2009-04-13 23:23 UTC
  To: mschmidt; +Cc: netdev, linux-kernel, shemminger

From: Michal Schmidt <mschmidt@redhat.com>
Date: Tue, 7 Apr 2009 18:36:23 +0200

> The BUG_ON(skge->tx_ring.to_use != skge->tx_ring.to_clean) in skge_up()
> was sometimes observed when setting MTU.
> 
> skge_down() disables the TX queue, but then reenables it by mistake via
> skge_tx_clean().
> Fix it by moving the waking of the queue from skge_tx_clean() to the
> other caller. And to make sure start_xmit is not in progress on another
> CPU, skge_down() should call netif_tx_disable().
> 
> The bug was reported to me by Jiri Jilek whose Debian system sometimes
> failed to boot. He tested the patch and the bug did not happen anymore.
> 
> Signed-off-by: Michal Schmidt <mschmidt@redhat.com>

Stephen, have you had a chance to test this yet?

Thanks.


* Re: [PATCH] skge: fix occasional BUG during MTU change
  2009-04-07 16:36 [PATCH] skge: fix occasional BUG during MTU change Michal Schmidt
                   ` (2 preceding siblings ...)
  2009-04-13 23:23 ` David Miller
@ 2009-04-14 17:55 ` Stephen Hemminger
  2009-04-14 22:17   ` David Miller
  3 siblings, 1 reply; 8+ messages in thread
From: Stephen Hemminger @ 2009-04-14 17:55 UTC
  To: Michal Schmidt; +Cc: netdev, linux-kernel

On Tue, 7 Apr 2009 18:36:23 +0200
Michal Schmidt <mschmidt@redhat.com> wrote:

> The BUG_ON(skge->tx_ring.to_use != skge->tx_ring.to_clean) in skge_up()
> was sometimes observed when setting MTU.
> 
> skge_down() disables the TX queue, but then reenables it by mistake via
> skge_tx_clean().
> Fix it by moving the waking of the queue from skge_tx_clean() to the
> other caller. And to make sure start_xmit is not in progress on another
> CPU, skge_down() should call netif_tx_disable().
> 
> The bug was reported to me by Jiri Jilek whose Debian system sometimes
> failed to boot. He tested the patch and the bug did not happen anymore.
> 
> Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
> ---
>  drivers/net/skge.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)

Tested fine. This should go to stable as well.

Acked-by: Stephen Hemminger <shemminger@vyatta.com>


* Re: [PATCH] skge: fix occasional BUG during MTU change
  2009-04-14 17:55 ` Stephen Hemminger
@ 2009-04-14 22:17   ` David Miller
  0 siblings, 0 replies; 8+ messages in thread
From: David Miller @ 2009-04-14 22:17 UTC
  To: shemminger; +Cc: mschmidt, netdev, linux-kernel

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Tue, 14 Apr 2009 10:55:39 -0700

> On Tue, 7 Apr 2009 18:36:23 +0200
> Michal Schmidt <mschmidt@redhat.com> wrote:
> 
>> The BUG_ON(skge->tx_ring.to_use != skge->tx_ring.to_clean) in skge_up()
>> was sometimes observed when setting MTU.
>> 
>> skge_down() disables the TX queue, but then reenables it by mistake via
>> skge_tx_clean().
>> Fix it by moving the waking of the queue from skge_tx_clean() to the
>> other caller. And to make sure start_xmit is not in progress on another
>> CPU, skge_down() should call netif_tx_disable().
>> 
>> The bug was reported to me by Jiri Jilek whose Debian system sometimes
>> failed to boot. He tested the patch and the bug did not happen anymore.
>> 
>> Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
>> ---
>>  drivers/net/skge.c |    4 ++--
>>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> Tested fine. This should go to stable as well.
> 
> Acked-by: Stephen Hemminger <shemminger@vyatta.com>

Applied, thanks everyone.


