netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] via-rhine: Fix tx_timeout handling
@ 2013-07-19 21:30 Richard Weinberger
  2013-07-21 16:18 ` Ben Hutchings
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Weinberger @ 2013-07-19 21:30 UTC (permalink / raw)
  To: rl; +Cc: netdev, linux-kernel, Richard Weinberger

rhine_reset_task() misses to call netif_stop_queue(),
this can lead to a crash if work is still scheduled while
we're resetting the tx queue.

Fixes:
[   93.591707] BUG: unable to handle kernel NULL pointer dereference at 0000004c
[   93.595514] IP: [<c119d10d>] rhine_napipoll+0x491/0x6e

Signed-off-by: Richard Weinberger <richard@nod.at>
---
 drivers/net/ethernet/via/via-rhine.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/via/via-rhine.c b/drivers/net/ethernet/via/via-rhine.c
index b75eb9e..57e1b40 100644
--- a/drivers/net/ethernet/via/via-rhine.c
+++ b/drivers/net/ethernet/via/via-rhine.c
@@ -1615,6 +1615,7 @@ static void rhine_reset_task(struct work_struct *work)
 		goto out_unlock;
 
 	napi_disable(&rp->napi);
+	netif_stop_queue(dev);
 	spin_lock_bh(&rp->lock);
 
 	/* clear all descriptors */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] via-rhine: Fix tx_timeout handling
  2013-07-19 21:30 [PATCH] via-rhine: Fix tx_timeout handling Richard Weinberger
@ 2013-07-21 16:18 ` Ben Hutchings
  2013-07-21 16:32   ` Richard Weinberger
  0 siblings, 1 reply; 4+ messages in thread
From: Ben Hutchings @ 2013-07-21 16:18 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: rl, netdev, linux-kernel

On Fri, 2013-07-19 at 23:30 +0200, Richard Weinberger wrote:
> rhine_reset_task() misses to call netif_stop_queue(),
> this can lead to a crash if work is still scheduled while
> we're resetting the tx queue.
> 
> Fixes:
> [   93.591707] BUG: unable to handle kernel NULL pointer dereference at 0000004c
> [   93.595514] IP: [<c119d10d>] rhine_napipoll+0x491/0x6e
> 
> Signed-off-by: Richard Weinberger <richard@nod.at>
> ---
>  drivers/net/ethernet/via/via-rhine.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/ethernet/via/via-rhine.c b/drivers/net/ethernet/via/via-rhine.c
> index b75eb9e..57e1b40 100644
> --- a/drivers/net/ethernet/via/via-rhine.c
> +++ b/drivers/net/ethernet/via/via-rhine.c
> @@ -1615,6 +1615,7 @@ static void rhine_reset_task(struct work_struct *work)
>  		goto out_unlock;
>  
>  	napi_disable(&rp->napi);
> +	netif_stop_queue(dev);

This is not really fixing the bug because there is no synchronisation
with the TX scheduler.  You can call netif_tx_disable() instead to do
that.

(I also think that it is preferable to use
netif_device_{detach,attach}() to stop the queue during reconfiguration,
as this is independent of TX completions and the watchdog.)

Ben.

>  	spin_lock_bh(&rp->lock);
>  
>  	/* clear all descriptors */

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] via-rhine: Fix tx_timeout handling
  2013-07-21 16:18 ` Ben Hutchings
@ 2013-07-21 16:32   ` Richard Weinberger
  2013-07-21 20:24     ` Ben Hutchings
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Weinberger @ 2013-07-21 16:32 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: rl, netdev, linux-kernel

Am 21.07.2013 18:18, schrieb Ben Hutchings:
> On Fri, 2013-07-19 at 23:30 +0200, Richard Weinberger wrote:
>> rhine_reset_task() misses to call netif_stop_queue(),
>> this can lead to a crash if work is still scheduled while
>> we're resetting the tx queue.
>>
>> Fixes:
>> [   93.591707] BUG: unable to handle kernel NULL pointer dereference at 0000004c
>> [   93.595514] IP: [<c119d10d>] rhine_napipoll+0x491/0x6e
>>
>> Signed-off-by: Richard Weinberger <richard@nod.at>
>> ---
>>  drivers/net/ethernet/via/via-rhine.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/net/ethernet/via/via-rhine.c b/drivers/net/ethernet/via/via-rhine.c
>> index b75eb9e..57e1b40 100644
>> --- a/drivers/net/ethernet/via/via-rhine.c
>> +++ b/drivers/net/ethernet/via/via-rhine.c
>> @@ -1615,6 +1615,7 @@ static void rhine_reset_task(struct work_struct *work)
>>  		goto out_unlock;
>>  
>>  	napi_disable(&rp->napi);
>> +	netif_stop_queue(dev);
> 
> This is not really fixing the bug because there is no synchronisation
> with the TX scheduler.  You can call netif_tx_disable() instead to do
> that.

I guess other drivers suffer from that too.
A quick grep showed that not many drivers are using netif_tx_disable().

> (I also think that it is preferable to use
> netif_device_{detach,attach}() to stop the queue during reconfiguration,
> as this is independent of TX completions and the watchdog.)

So the correct down sequence is napi_disable() -> netif_tx_disable() -> netif_device_detach()?

Thanks,
//richard

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] via-rhine: Fix tx_timeout handling
  2013-07-21 16:32   ` Richard Weinberger
@ 2013-07-21 20:24     ` Ben Hutchings
  0 siblings, 0 replies; 4+ messages in thread
From: Ben Hutchings @ 2013-07-21 20:24 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: rl, netdev, linux-kernel

On Sun, 2013-07-21 at 18:32 +0200, Richard Weinberger wrote:
> Am 21.07.2013 18:18, schrieb Ben Hutchings:
> > On Fri, 2013-07-19 at 23:30 +0200, Richard Weinberger wrote:
> >> rhine_reset_task() misses to call netif_stop_queue(),
> >> this can lead to a crash if work is still scheduled while
> >> we're resetting the tx queue.
> >>
> >> Fixes:
> >> [   93.591707] BUG: unable to handle kernel NULL pointer dereference at 0000004c
> >> [   93.595514] IP: [<c119d10d>] rhine_napipoll+0x491/0x6e
> >>
> >> Signed-off-by: Richard Weinberger <richard@nod.at>
> >> ---
> >>  drivers/net/ethernet/via/via-rhine.c | 1 +
> >>  1 file changed, 1 insertion(+)
> >>
> >> diff --git a/drivers/net/ethernet/via/via-rhine.c b/drivers/net/ethernet/via/via-rhine.c
> >> index b75eb9e..57e1b40 100644
> >> --- a/drivers/net/ethernet/via/via-rhine.c
> >> +++ b/drivers/net/ethernet/via/via-rhine.c
> >> @@ -1615,6 +1615,7 @@ static void rhine_reset_task(struct work_struct *work)
> >>  		goto out_unlock;
> >>  
> >>  	napi_disable(&rp->napi);
> >> +	netif_stop_queue(dev);
> > 
> > This is not really fixing the bug because there is no synchronisation
> > with the TX scheduler.  You can call netif_tx_disable() instead to do
> > that.
> 
> I guess other drivers suffer from that too.
> A quick grep showed that not many drivers are using netif_tx_disable().
> 
> > (I also think that it is preferable to use
> > netif_device_{detach,attach}() to stop the queue during reconfiguration,
> > as this is independent of TX completions and the watchdog.)

Actually, this is not independent of TX completions - netif_wake_queue()
will still start the TX scheduler while the device is not present, so
you have to avoid that.

> So the correct down sequence is napi_disable() -> netif_tx_disable() -> netif_device_detach()?

No, that's redundant.  You can do:

	napi_disable();
	netif_tx_lock_bh(); /* sync with TX scheduler */
	netif_device_detach();
	netif_tx_unlock_bh();

and then when the queue is ready to use again:

	netif_device_attach();
	napi_enable();

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2013-07-21 20:24 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-19 21:30 [PATCH] via-rhine: Fix tx_timeout handling Richard Weinberger
2013-07-21 16:18 ` Ben Hutchings
2013-07-21 16:32   ` Richard Weinberger
2013-07-21 20:24     ` Ben Hutchings

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).