All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock
@ 2011-08-25  1:34 Thomas Pedersen
  2011-08-25  9:48 ` Bob Copeland
                   ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: Thomas Pedersen @ 2011-08-25  1:34 UTC (permalink / raw)
  To: linux-wireless; +Cc: jirislaby, mickflemm, lrodriguez, me, Javier Cardona

From: Javier Cardona <javier@cozybit.com>

This driver reports transmission status to the upper layer
(ath5k_tx_frame_completed()) while holding the lock on the transmission
queue (txq->lock).  Under failure conditions, the mesh stack will
attempt to send PERR messages to the previous sender of the failed
frame.  When that happens the driver will attempt to re-acquire the
txq->lock lock causing a deadlock.  There are two possible fixes for
this, (1) we could defer the transmission of the PERR frame until the
lock is released or (2) release the lock before invoking
ieee80211_tx_status().  The ath9k driver implements the second approach
(see ath_tx_complete() in ath9k/xmit.c) as well as the rt2x00 and b43
drivers.  The iwl driver, on the other hand, avoids this problem by
invoking  ieee80211_tx_status_irqsafe() which effectively defers
processing of transmission feedback status.  This last approach is the
least intrusive is implemented here.

Reported by Pedro Larbig (ASPj)
---
 drivers/net/wireless/ath/ath5k/base.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/wireless/ath/ath5k/base.c b/drivers/net/wireless/ath/ath5k/base.c
index e9ea38d..9b488f9 100644
--- a/drivers/net/wireless/ath/ath5k/base.c
+++ b/drivers/net/wireless/ath/ath5k/base.c
@@ -1630,7 +1630,7 @@ ath5k_tx_frame_completed(struct ath5k_hw *ah, struct sk_buff *skb,
 		ah->stats.antenna_tx[0]++; /* invalid */
 
 	trace_ath5k_tx_complete(ah, skb, txq, ts);
-	ieee80211_tx_status(ah->hw, skb);
+	ieee80211_tx_status_irqsafe(ah->hw, skb);
 }
 
 static void
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock
  2011-08-25  1:34 [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock Thomas Pedersen
@ 2011-08-25  9:48 ` Bob Copeland
  2011-08-26 14:22 ` John W. Linville
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 19+ messages in thread
From: Bob Copeland @ 2011-08-25  9:48 UTC (permalink / raw)
  To: Thomas Pedersen
  Cc: linux-wireless, jirislaby, mickflemm, lrodriguez, Javier Cardona

On Wed, Aug 24, 2011 at 06:34:24PM -0700, Thomas Pedersen wrote:
> From: Javier Cardona <javier@cozybit.com>
> 
> This driver reports transmission status to the upper layer
> (ath5k_tx_frame_completed()) while holding the lock on the transmission
> queue (txq->lock).  Under failure conditions, the mesh stack will
> attempt to send PERR messages to the previous sender of the failed
> frame.  When that happens the driver will attempt to re-acquire the
> txq->lock lock causing a deadlock.  There are two possible fixes for
> this, (1) we could defer the transmission of the PERR frame until the
> lock is released or 

Of course, as a lazy driver writer I would like this one.  Esp. without
at least a comment in tx_status that it can call back into the driver.

> (2) release the lock before invoking ieee80211_tx_status().  

I was initially worried that we might race against draining buffers
that might also want to free the skb.  That is not the case (skb is
already removed from the structure) -- but there could potentially
be other races with drain since we're mucking with the list that
we're iterating over.  I'm least enthused with this approach.

> (see ath_tx_complete() in ath9k/xmit.c) as well as the rt2x00 and b43
> drivers.  The iwl driver, on the other hand, avoids this problem by
> invoking  ieee80211_tx_status_irqsafe() which effectively defers
> processing of transmission feedback status.  This last approach is the
> least intrusive is implemented here.

I guess it's ok.  The only real potential fallout would be worse rate
adaptation?  Perhaps needs a comment so no one fixes it later.

-- 
Bob Copeland %% www.bobcopeland.com


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock
  2011-08-25  1:34 [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock Thomas Pedersen
  2011-08-25  9:48 ` Bob Copeland
@ 2011-08-26 14:22 ` John W. Linville
  2011-08-29  2:07   ` Thomas Pedersen
  2011-08-30 12:18 ` Johannes Berg
  2011-08-30 12:21 ` Johannes Berg
  3 siblings, 1 reply; 19+ messages in thread
From: John W. Linville @ 2011-08-26 14:22 UTC (permalink / raw)
  To: Thomas Pedersen
  Cc: linux-wireless, jirislaby, mickflemm, lrodriguez, me, Javier Cardona

On Wed, Aug 24, 2011 at 06:34:24PM -0700, Thomas Pedersen wrote:
> From: Javier Cardona <javier@cozybit.com>
> 
> This driver reports transmission status to the upper layer
> (ath5k_tx_frame_completed()) while holding the lock on the transmission
> queue (txq->lock).  Under failure conditions, the mesh stack will
> attempt to send PERR messages to the previous sender of the failed
> frame.  When that happens the driver will attempt to re-acquire the
> txq->lock lock causing a deadlock.  There are two possible fixes for
> this, (1) we could defer the transmission of the PERR frame until the
> lock is released or (2) release the lock before invoking
> ieee80211_tx_status().  The ath9k driver implements the second approach
> (see ath_tx_complete() in ath9k/xmit.c) as well as the rt2x00 and b43
> drivers.  The iwl driver, on the other hand, avoids this problem by
> invoking  ieee80211_tx_status_irqsafe() which effectively defers
> processing of transmission feedback status.  This last approach is the
> least intrusive is implemented here.
> 
> Reported by Pedro Larbig (ASPj)
> ---
>  drivers/net/wireless/ath/ath5k/base.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)

Missing Signed-off-by...

-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock
  2011-08-26 14:22 ` John W. Linville
@ 2011-08-29  2:07   ` Thomas Pedersen
  2011-08-29 14:09     ` Bob Copeland
  0 siblings, 1 reply; 19+ messages in thread
From: Thomas Pedersen @ 2011-08-29  2:07 UTC (permalink / raw)
  To: John W. Linville
  Cc: linux-wireless, jirislaby, mickflemm, lrodriguez, me, Javier Cardona

On Fri, Aug 26, 2011 at 7:22 AM, John W. Linville
<linville@tuxdriver.com> wrote:
> On Wed, Aug 24, 2011 at 06:34:24PM -0700, Thomas Pedersen wrote:
>> From: Javier Cardona <javier@cozybit.com>
>>
>> This driver reports transmission status to the upper layer
>> (ath5k_tx_frame_completed()) while holding the lock on the transmission
>> queue (txq->lock).  Under failure conditions, the mesh stack will
>> attempt to send PERR messages to the previous sender of the failed
>> frame.  When that happens the driver will attempt to re-acquire the
>> txq->lock lock causing a deadlock.  There are two possible fixes for
>> this, (1) we could defer the transmission of the PERR frame until the
>> lock is released or (2) release the lock before invoking
>> ieee80211_tx_status().  The ath9k driver implements the second approach
>> (see ath_tx_complete() in ath9k/xmit.c) as well as the rt2x00 and b43
>> drivers.  The iwl driver, on the other hand, avoids this problem by
>> invoking  ieee80211_tx_status_irqsafe() which effectively defers
>> processing of transmission feedback status.  This last approach is the
>> least intrusive is implemented here.
>>
>> Reported by Pedro Larbig (ASPj)
>> ---
>>  drivers/net/wireless/ath/ath5k/base.c |    2 +-
>>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> Missing Signed-off-by...
>
Yikes. Also, it looks like ieee80211_tx_status() should not be called
from irq context. Will resubmit a v2 with signoff and comment shortly.

Thomas

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock
  2011-08-29  2:07   ` Thomas Pedersen
@ 2011-08-29 14:09     ` Bob Copeland
  2011-08-29 18:13       ` Thomas Pedersen
  0 siblings, 1 reply; 19+ messages in thread
From: Bob Copeland @ 2011-08-29 14:09 UTC (permalink / raw)
  To: Thomas Pedersen
  Cc: John W. Linville, linux-wireless, jirislaby, mickflemm,
	lrodriguez, Javier Cardona

On Sun, Aug 28, 2011 at 10:07 PM, Thomas Pedersen <thomas@cozybit.com> wrote:
> On Fri, Aug 26, 2011 at 7:22 AM, John W. Linville
> <linville@tuxdriver.com> wrote:
>> On Wed, Aug 24, 2011 at 06:34:24PM -0700, Thomas Pedersen wrote:
>>> From: Javier Cardona <javier@cozybit.com>
>>>
>>> This driver reports transmission status to the upper layer
>>> (ath5k_tx_frame_completed()) while holding the lock on the transmission
>>> queue (txq->lock).  Under failure conditions, the mesh stack will
>>> attempt to send PERR messages to the previous sender of the failed
>>> frame.  When that happens the driver will attempt to re-acquire the
>>> txq->lock lock causing a deadlock.  There are two possible fixes for
>>> this, (1) we could defer the transmission of the PERR frame until the
>>> lock is released or (2) release the lock before invoking
>>> ieee80211_tx_status().  The ath9k driver implements the second approach
>>> (see ath_tx_complete() in ath9k/xmit.c) as well as the rt2x00 and b43
>>> drivers.  The iwl driver, on the other hand, avoids this problem by
>>> invoking  ieee80211_tx_status_irqsafe() which effectively defers
>>> processing of transmission feedback status.  This last approach is the
>>> least intrusive is implemented here.
>>>
>>> Reported by Pedro Larbig (ASPj)
>>> ---
>>>  drivers/net/wireless/ath/ath5k/base.c |    2 +-
>>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> Missing Signed-off-by...
>>
> Yikes. Also, it looks like ieee80211_tx_status() should not be called
> from irq context. Will resubmit a v2 with signoff and comment shortly.

I don't get the last statement -- ieee80211_tx_status() -> irqsafe was
the very change the patch added.  It's running in a bottom half
though, not a hard irq.

-- 
Bob Copeland %% www.bobcopeland.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock
  2011-08-29 14:09     ` Bob Copeland
@ 2011-08-29 18:13       ` Thomas Pedersen
  2011-08-30 11:50         ` Bob Copeland
  0 siblings, 1 reply; 19+ messages in thread
From: Thomas Pedersen @ 2011-08-29 18:13 UTC (permalink / raw)
  To: Bob Copeland
  Cc: John W. Linville, linux-wireless, jirislaby, mickflemm,
	lrodriguez, Javier Cardona

On Mon, Aug 29, 2011 at 7:09 AM, Bob Copeland <me@bobcopeland.com> wrote:
> On Sun, Aug 28, 2011 at 10:07 PM, Thomas Pedersen <thomas@cozybit.com> wrote:
>> On Fri, Aug 26, 2011 at 7:22 AM, John W. Linville
>> <linville@tuxdriver.com> wrote:
>>> On Wed, Aug 24, 2011 at 06:34:24PM -0700, Thomas Pedersen wrote:
>>>> From: Javier Cardona <javier@cozybit.com>
>>>>
>>>> This driver reports transmission status to the upper layer
>>>> (ath5k_tx_frame_completed()) while holding the lock on the transmission
>>>> queue (txq->lock).  Under failure conditions, the mesh stack will
>>>> attempt to send PERR messages to the previous sender of the failed
>>>> frame.  When that happens the driver will attempt to re-acquire the
>>>> txq->lock lock causing a deadlock.  There are two possible fixes for
>>>> this, (1) we could defer the transmission of the PERR frame until the
>>>> lock is released or (2) release the lock before invoking
>>>> ieee80211_tx_status().  The ath9k driver implements the second approach
>>>> (see ath_tx_complete() in ath9k/xmit.c) as well as the rt2x00 and b43
>>>> drivers.  The iwl driver, on the other hand, avoids this problem by
>>>> invoking  ieee80211_tx_status_irqsafe() which effectively defers
>>>> processing of transmission feedback status.  This last approach is the
>>>> least intrusive is implemented here.
>>>>
>>>> Reported by Pedro Larbig (ASPj)
>>>> ---
>>>>  drivers/net/wireless/ath/ath5k/base.c |    2 +-
>>>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>>
>>> Missing Signed-off-by...
>>>
>> Yikes. Also, it looks like ieee80211_tx_status() should not be called
>> from irq context. Will resubmit a v2 with signoff and comment shortly.
>
> I don't get the last statement -- ieee80211_tx_status() -> irqsafe was

I meant to say "In addition to the above discussion,
ieee80211_tx_status() should not be called from interrupt context
anyway".

> the very change the patch added.  It's running in a bottom half
> though, not a hard irq.
>

Even in a bottom half we're still in "interrupt" context, right?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock
  2011-08-29 18:13       ` Thomas Pedersen
@ 2011-08-30 11:50         ` Bob Copeland
  0 siblings, 0 replies; 19+ messages in thread
From: Bob Copeland @ 2011-08-30 11:50 UTC (permalink / raw)
  To: Thomas Pedersen
  Cc: John W. Linville, linux-wireless, jirislaby, mickflemm,
	lrodriguez, Javier Cardona

On Mon, Aug 29, 2011 at 11:13:10AM -0700, Thomas Pedersen wrote:
> I meant to say "In addition to the above discussion,
> ieee80211_tx_status() should not be called from interrupt context
> anyway".
> 
> > the very change the patch added.  It's running in a bottom half
> > though, not a hard irq.
> 
> Even in a bottom half we're still in "interrupt" context, right?

Yes, but my interpretation is that _irqsafe() in this case refers to
hard irq context and not softirq context.  So it's still atomic,
but tx_status() does things like kfree_skb() directly instead of
deferring that to another softirq.

There are 3 contexts in Linux: process, softirq, and hardirq, and
given that we have 3 corresponding tx_status functions (tx_status_ni,
tx_status, tx_status_irqsafe), this interpretation makes sense, no?
Someone correct me if I'm wrong here.

It seems to me this patch warrants some discussion or some comments
in the .h files about how drivers can be called back into themselves;
otherwise it's very hard for a driver writer to get locking right
without examining the stack.

And dropping locks around various API calls means you have to validate
the protected state when you get the lock back, it's not a panacea.

-- 
Bob Copeland %% www.bobcopeland.com


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock
  2011-08-25  1:34 [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock Thomas Pedersen
  2011-08-25  9:48 ` Bob Copeland
  2011-08-26 14:22 ` John W. Linville
@ 2011-08-30 12:18 ` Johannes Berg
  2011-08-30 12:21 ` Johannes Berg
  3 siblings, 0 replies; 19+ messages in thread
From: Johannes Berg @ 2011-08-30 12:18 UTC (permalink / raw)
  To: Thomas Pedersen
  Cc: linux-wireless, jirislaby, mickflemm, lrodriguez, me, Javier Cardona

Well, NACK. You shouldn't mix rx() with tx_status_irqsafe().

johannes


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock
  2011-08-25  1:34 [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock Thomas Pedersen
                   ` (2 preceding siblings ...)
  2011-08-30 12:18 ` Johannes Berg
@ 2011-08-30 12:21 ` Johannes Berg
  2011-08-30 16:22   ` Javier Cardona
  2011-08-30 18:29   ` [PATCH] mac80211: Defer tranmission of mesh path errors Javier Cardona
  3 siblings, 2 replies; 19+ messages in thread
From: Johannes Berg @ 2011-08-30 12:21 UTC (permalink / raw)
  To: Thomas Pedersen
  Cc: linux-wireless, jirislaby, mickflemm, lrodriguez, me, Javier Cardona

On Wed, 2011-08-24 at 18:34 -0700, Thomas Pedersen wrote:
> From: Javier Cardona <javier@cozybit.com>
> 
> This driver reports transmission status to the upper layer
> (ath5k_tx_frame_completed()) while holding the lock on the transmission
> queue (txq->lock).  Under failure conditions, the mesh stack will
> attempt to send PERR messages to the previous sender of the failed
> frame.

I'd feel a lot more comfortable if the mesh stack didn't do that. I
suppose it could use ieee80211_add_pending_skb() instead?

johannes


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock
  2011-08-30 12:21 ` Johannes Berg
@ 2011-08-30 16:22   ` Javier Cardona
  2011-08-30 18:29   ` [PATCH] mac80211: Defer tranmission of mesh path errors Javier Cardona
  1 sibling, 0 replies; 19+ messages in thread
From: Javier Cardona @ 2011-08-30 16:22 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Thomas Pedersen, linux-wireless, jirislaby, mickflemm, lrodriguez, me

On Tue, Aug 30, 2011 at 5:21 AM, Johannes Berg
<johannes@sipsolutions.net> wrote:
> On Wed, 2011-08-24 at 18:34 -0700, Thomas Pedersen wrote:
>> From: Javier Cardona <javier@cozybit.com>
>>
>> This driver reports transmission status to the upper layer
>> (ath5k_tx_frame_completed()) while holding the lock on the transmission
>> queue (txq->lock).  Under failure conditions, the mesh stack will
>> attempt to send PERR messages to the previous sender of the failed
>> frame.
>
> I'd feel a lot more comfortable if the mesh stack didn't do that. I
> suppose it could use ieee80211_add_pending_skb() instead?

Suggestion taken.  We'll implement that.

Cheers,

Javier

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH] mac80211: Defer tranmission of mesh path errors
  2011-08-30 12:21 ` Johannes Berg
  2011-08-30 16:22   ` Javier Cardona
@ 2011-08-30 18:29   ` Javier Cardona
  2011-08-30 18:43     ` Johannes Berg
  1 sibling, 1 reply; 19+ messages in thread
From: Javier Cardona @ 2011-08-30 18:29 UTC (permalink / raw)
  To: John W. Linville
  Cc: Javier Cardona, Thomas Pedersen, devel, Johannes Berg,
	linux-wireless, jlopex

Under failure conditions, the mesh stack sends PERR messages to the
previous sender of the failed frame.  This happens in the tx feedback
path, in which the transmission queue lock may be taken.  Avoid a
deadlock by sending the path error via the pending queue.

Signed-off-by: Javier Cardona <javier@cozybit.com>
---
 net/mac80211/mesh_hwmp.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/net/mac80211/mesh_hwmp.c b/net/mac80211/mesh_hwmp.c
index fd4f76a..f760679 100644
--- a/net/mac80211/mesh_hwmp.c
+++ b/net/mac80211/mesh_hwmp.c
@@ -209,6 +209,10 @@ static int mesh_path_sel_frame_tx(enum mpath_frame_type action, u8 flags,
  * @target_sn: SN of the broken destination
  * @target_rcode: reason code for this PERR
  * @ra: node this frame is addressed to
+ *
+ * Note: This function may be called from the tx feedback path, possibly with
+ * the transmission queue lock taken.  To avoid a deadlock we don't transmit
+ * the frame directly but add it to the pending queue instead.
  */
 int mesh_path_error_tx(u8 ttl, u8 *target, __le32 target_sn,
 		       __le16 target_rcode, const u8 *ra,
@@ -263,7 +267,8 @@ int mesh_path_error_tx(u8 ttl, u8 *target, __le32 target_sn,
 	pos += 4;
 	memcpy(pos, &target_rcode, 2);
 
-	ieee80211_tx_skb(sdata, skb);
+	/* see note in function header */
+	ieee80211_add_pending_skb(local, skb);
 	return 0;
 }
 
-- 
1.7.6


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH] mac80211: Defer tranmission of mesh path errors
  2011-08-30 18:29   ` [PATCH] mac80211: Defer tranmission of mesh path errors Javier Cardona
@ 2011-08-30 18:43     ` Johannes Berg
  2011-08-30 21:38       ` Javier Cardona
  0 siblings, 1 reply; 19+ messages in thread
From: Johannes Berg @ 2011-08-30 18:43 UTC (permalink / raw)
  To: Javier Cardona
  Cc: John W. Linville, Thomas Pedersen, devel, linux-wireless, jlopex

On Tue, 2011-08-30 at 11:29 -0700, Javier Cardona wrote:
> Under failure conditions, the mesh stack sends PERR messages to the
> previous sender of the failed frame.  This happens in the tx feedback
> path, in which the transmission queue lock may be taken.  Avoid a
> deadlock by sending the path error via the pending queue.
> 
> Signed-off-by: Javier Cardona <javier@cozybit.com>
> ---
>  net/mac80211/mesh_hwmp.c |    7 ++++++-
>  1 files changed, 6 insertions(+), 1 deletions(-)
> 
> diff --git a/net/mac80211/mesh_hwmp.c b/net/mac80211/mesh_hwmp.c
> index fd4f76a..f760679 100644
> --- a/net/mac80211/mesh_hwmp.c
> +++ b/net/mac80211/mesh_hwmp.c
> @@ -209,6 +209,10 @@ static int mesh_path_sel_frame_tx(enum mpath_frame_type action, u8 flags,
>   * @target_sn: SN of the broken destination
>   * @target_rcode: reason code for this PERR
>   * @ra: node this frame is addressed to
> + *
> + * Note: This function may be called from the tx feedback path, possibly with
> + * the transmission queue lock taken.  To avoid a deadlock we don't transmit
> + * the frame directly but add it to the pending queue instead.

I'd change that to say "with driver locks taken that the driver also
acquires in the TX path" or something like that maybe?

But that's not a big thing. Have you tested it? I'm wondering if because
we take a lock here we might run into lock dependencies issues (lockdep
would say) but I don't think so because stop_queue etc. all take the
same lock from an arbitrary driver context already.

Reviewed-by: Johannes Berg <johannes@sipsolutions.net>

johannes


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] mac80211: Defer tranmission of mesh path errors
  2011-08-30 18:43     ` Johannes Berg
@ 2011-08-30 21:38       ` Javier Cardona
  2011-08-31  1:50         ` Javier Cardona
  0 siblings, 1 reply; 19+ messages in thread
From: Javier Cardona @ 2011-08-30 21:38 UTC (permalink / raw)
  To: Johannes Berg
  Cc: John W. Linville, Thomas Pedersen, devel, linux-wireless, jlopex

On Tue, Aug 30, 2011 at 11:43 AM, Johannes Berg
<johannes@sipsolutions.net> wrote:
> On Tue, 2011-08-30 at 11:29 -0700, Javier Cardona wrote:
>> Under failure conditions, the mesh stack sends PERR messages to the
>> previous sender of the failed frame.  This happens in the tx feedback
>> path, in which the transmission queue lock may be taken.  Avoid a
>> deadlock by sending the path error via the pending queue.
>>
>> Signed-off-by: Javier Cardona <javier@cozybit.com>
>> ---
>>  net/mac80211/mesh_hwmp.c |    7 ++++++-
>>  1 files changed, 6 insertions(+), 1 deletions(-)
>>
>> diff --git a/net/mac80211/mesh_hwmp.c b/net/mac80211/mesh_hwmp.c
>> index fd4f76a..f760679 100644
>> --- a/net/mac80211/mesh_hwmp.c
>> +++ b/net/mac80211/mesh_hwmp.c
>> @@ -209,6 +209,10 @@ static int mesh_path_sel_frame_tx(enum mpath_frame_type action, u8 flags,
>>   * @target_sn: SN of the broken destination
>>   * @target_rcode: reason code for this PERR
>>   * @ra: node this frame is addressed to
>> + *
>> + * Note: This function may be called from the tx feedback path, possibly with
>> + * the transmission queue lock taken.  To avoid a deadlock we don't transmit
>> + * the frame directly but add it to the pending queue instead.
>
> I'd change that to say "with driver locks taken that the driver also
> acquires in the TX path" or something like that maybe?

OK

> But that's not a big thing. Have you tested it? I'm wondering if because
> we take a lock here we might run into lock dependencies issues (lockdep
> would say) but I don't think so because stop_queue etc. all take the
> same lock from an arbitrary driver context already.

We don't have a test specific for that particular perr.  Other tests
run fine and lockdep does not complain.
I can resubmit the patch with the clearer comment once we've run our
whole test suite if that gives you peace of mind.

Javier

> Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
>
> johannes
>
>



-- 
Javier Cardona
cozybit Inc.
http://www.cozybit.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] mac80211: Defer tranmission of mesh path errors
  2011-08-30 21:38       ` Javier Cardona
@ 2011-08-31  1:50         ` Javier Cardona
  2011-08-31  5:11           ` Johannes Berg
  0 siblings, 1 reply; 19+ messages in thread
From: Javier Cardona @ 2011-08-31  1:50 UTC (permalink / raw)
  To: Johannes Berg
  Cc: John W. Linville, Thomas Pedersen, devel, linux-wireless, jlopex

On Tue, Aug 30, 2011 at 2:38 PM, Javier Cardona <javier@cozybit.com> wrote:
> On Tue, Aug 30, 2011 at 11:43 AM, Johannes Berg
>> But that's not a big thing. Have you tested it? I'm wondering if because
>> we take a lock here we might run into lock dependencies issues (lockdep
>> would say) but I don't think so because stop_queue etc. all take the
>> same lock from an arbitrary driver context already.
>
> We don't have a test specific for that particular perr.  Other tests
> run fine and lockdep does not complain.
> I can resubmit the patch with the clearer comment once we've run our
> whole test suite if that gives you peace of mind.

The stress tests do trigger perrs and, with this change, we see a
warning due to missing info->control.vif

[14202.988077] ------------[ cut here ]------------
[14202.988351] WARNING: at net/mac80211/util.c:358
ieee80211_add_pending_skb+0x97/0xa0()
[14202.988353] Hardware name: Bochs
[14202.988355] Modules linked in: mac80211_hwsim [last unloaded: mac80211_hwsim]
[14202.988359] Pid: 15051, comm: iperf Tainted: G        W   3.1.0-rc4-wl+ #47
[14202.988361] Call Trace:
[14202.988364]  [<c103ad2d>] warn_slowpath_common+0x6d/0xa0
[14202.988367]  [<c15fbca7>] ? ieee80211_add_pending_skb+0x97/0xa0
[14202.988370]  [<c15fbca7>] ? ieee80211_add_pending_skb+0x97/0xa0
[14202.988373]  [<c103ad7d>] warn_slowpath_null+0x1d/0x20
[14202.988376]  [<c15fbca7>] ieee80211_add_pending_skb+0x97/0xa0
[14202.988379]  [<c1605ab1>] mesh_path_error_tx+0x151/0x190
[14202.988382]  [<c160362b>] mesh_path_discard_frame+0xfb/0x100
[14202.988385]  [<c1603580>] ? mesh_path_discard_frame+0x50/0x100
[14202.988387]  [<c1605dd7>] mesh_nexthop_lookup+0x157/0x1d0
[14202.988390]  [<c1605c80>] ? mesh_queue_preq+0x190/0x190
[14202.988393]  [<c15f826a>] ieee80211_xmit+0x10a/0x240
[14202.988395]  [<c15f8160>] ? ieee80211_tx+0xe0/0xe0
[14202.988398]  [<c15f5c3a>] ? ieee80211_skb_resize+0x7a/0x100
[14202.988401]  [<c15f8ba7>] ieee80211_subif_start_xmit+0x307/0x810
[14202.988404]  [<c15f8cf8>] ? ieee80211_subif_start_xmit+0x458/0x810

I'll look into it tomorrow.

Cheers,

j



-- 
Javier Cardona
cozybit Inc.
http://www.cozybit.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] mac80211: Defer tranmission of mesh path errors
  2011-08-31  1:50         ` Javier Cardona
@ 2011-08-31  5:11           ` Johannes Berg
  2011-09-01 17:04             ` Javier Cardona
  0 siblings, 1 reply; 19+ messages in thread
From: Johannes Berg @ 2011-08-31  5:11 UTC (permalink / raw)
  To: Javier Cardona
  Cc: John W. Linville, Thomas Pedersen, devel, linux-wireless, jlopex

On Tue, 2011-08-30 at 18:50 -0700, Javier Cardona wrote:

> The stress tests do trigger perrs and, with this change, we see a
> warning due to missing info->control.vif
> 
> [14202.988077] ------------[ cut here ]------------
> [14202.988351] WARNING: at net/mac80211/util.c:358
> ieee80211_add_pending_skb+0x97/0xa0()

Interesting. Ok, that means we can't just use add_pending_skb(), we need
to do some setup before, all the setup that ieee80211_tx_skb() and
ieee80211_xmit() would do up to ieee80211_tx()...

I guess we need a little bit of helper code to enable this.

johannes


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] mac80211: Defer tranmission of mesh path errors
  2011-08-31  5:11           ` Johannes Berg
@ 2011-09-01 17:04             ` Javier Cardona
  2011-09-01 17:04               ` [PATCH v2] " Javier Cardona
  0 siblings, 1 reply; 19+ messages in thread
From: Javier Cardona @ 2011-09-01 17:04 UTC (permalink / raw)
  To: John W. Linville
  Cc: Javier Cardona, Thomas Pedersen, devel, Johannes Berg,
	linux-wireless, jlopex

On Tue, Aug 30, 2011 at 10:11 PM, Johannes Berg <johannes@sipsolutions.net> wrote:
> On Tue, 2011-08-30 at 18:50 -0700, Javier Cardona wrote:
>
>> The stress tests do trigger perrs and, with this change, we see a
>> warning due to missing info->control.vif
>>
>> [14202.988077] ------------[ cut here ]------------
>> [14202.988351] WARNING: at net/mac80211/util.c:358
>> ieee80211_add_pending_skb+0x97/0xa0()
>
> Interesting. Ok, that means we can't just use add_pending_skb(), we need
> to do some setup before, all the setup that ieee80211_tx_skb() and
> ieee80211_xmit() would do up to ieee80211_tx()...
>
> I guess we need a little bit of helper code to enable this.

The patch that follows seems to do it.  Passes our stress test without
warnings.  I increased the headroom on allocation to avoid a resize.  Does it
look correct to you?

Cheers,

Javier

-- 
1.7.6

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v2] mac80211: Defer tranmission of mesh path errors
  2011-09-01 17:04             ` Javier Cardona
@ 2011-09-01 17:04               ` Javier Cardona
  2011-09-02 10:59                 ` Johannes Berg
  0 siblings, 1 reply; 19+ messages in thread
From: Javier Cardona @ 2011-09-01 17:04 UTC (permalink / raw)
  To: John W. Linville
  Cc: Javier Cardona, Thomas Pedersen, devel, Johannes Berg,
	linux-wireless, jlopex

Under failure conditions, the mesh stack sends PERR messages to the
previous sender of the failed frame.  This happens in the tx feedback
path, in which the transmission queue lock may be taken.  Avoid a
deadlock by sending the path error via the pending queue.

Signed-off-by: Javier Cardona <javier@cozybit.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
---
v2: - Helper to prepare skb for deferred transmission (Johannes)
    - Reword comment (Johannes)

 net/mac80211/mesh_hwmp.c |   30 ++++++++++++++++++++++++++++--
 1 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/net/mac80211/mesh_hwmp.c b/net/mac80211/mesh_hwmp.c
index fd4f76a..cda4818 100644
--- a/net/mac80211/mesh_hwmp.c
+++ b/net/mac80211/mesh_hwmp.c
@@ -8,6 +8,7 @@
  */
 
 #include <linux/slab.h>
+#include "wme.h"
 #include "mesh.h"
 
 #ifdef CONFIG_MAC80211_VERBOSE_MHWMP_DEBUG
@@ -202,6 +203,25 @@ static int mesh_path_sel_frame_tx(enum mpath_frame_type action, u8 flags,
 	return 0;
 }
 
+
+static void prepare_frame_for_deferred_tx(struct ieee80211_sub_if_data *sdata,
+		struct sk_buff *skb)
+{
+	struct ieee80211_local *local = sdata->local;
+	struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb);
+
+	skb_set_mac_header(skb, 0);
+	skb_set_network_header(skb, 0);
+	skb_set_transport_header(skb, 0);
+
+	/* Send all internal mgmt frames on VO. Accordingly set TID to 7. */
+	skb_set_queue_mapping(skb, IEEE80211_AC_VO);
+	skb->priority = 7;
+
+	info->control.vif = &sdata->vif;
+	ieee80211_set_qos_hdr(local, skb);
+}
+
 /**
  * mesh_send_path error - Sends a PERR mesh management frame
  *
@@ -209,6 +229,10 @@ static int mesh_path_sel_frame_tx(enum mpath_frame_type action, u8 flags,
  * @target_sn: SN of the broken destination
  * @target_rcode: reason code for this PERR
  * @ra: node this frame is addressed to
+ *
+ * Note: This function may be called with driver locks taken that the driver
+ * also acquires in the TX path.  To avoid a deadlock we don't transmit the
+ * frame directly but add it to the pending queue instead.
  */
 int mesh_path_error_tx(u8 ttl, u8 *target, __le32 target_sn,
 		       __le16 target_rcode, const u8 *ra,
@@ -222,7 +246,7 @@ int mesh_path_error_tx(u8 ttl, u8 *target, __le32 target_sn,
 
 	if (!skb)
 		return -1;
-	skb_reserve(skb, local->hw.extra_tx_headroom);
+	skb_reserve(skb, local->tx_headroom + local->hw.extra_tx_headroom);
 	/* 25 is the size of the common mgmt part (24) plus the size of the
 	 * common action part (1)
 	 */
@@ -263,7 +287,9 @@ int mesh_path_error_tx(u8 ttl, u8 *target, __le32 target_sn,
 	pos += 4;
 	memcpy(pos, &target_rcode, 2);
 
-	ieee80211_tx_skb(sdata, skb);
+	/* see note in function header */
+	prepare_frame_for_deferred_tx(sdata, skb);
+	ieee80211_add_pending_skb(local, skb);
 	return 0;
 }
 
-- 
1.7.6


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] mac80211: Defer tranmission of mesh path errors
  2011-09-01 17:04               ` [PATCH v2] " Javier Cardona
@ 2011-09-02 10:59                 ` Johannes Berg
  2011-09-06 19:10                   ` [PATCH v3] " Javier Cardona
  0 siblings, 1 reply; 19+ messages in thread
From: Johannes Berg @ 2011-09-02 10:59 UTC (permalink / raw)
  To: Javier Cardona
  Cc: John W. Linville, Thomas Pedersen, devel, linux-wireless, jlopex

On Thu, 2011-09-01 at 10:04 -0700, Javier Cardona wrote:
> Under failure conditions, the mesh stack sends PERR messages to the
> previous sender of the failed frame.  This happens in the tx feedback
> path, in which the transmission queue lock may be taken.  Avoid a
> deadlock by sending the path error via the pending queue.
> 
> Signed-off-by: Javier Cardona <javier@cozybit.com>
> Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
> ---
> v2: - Helper to prepare skb for deferred transmission (Johannes)
>     - Reword comment (Johannes)
> 
>  net/mac80211/mesh_hwmp.c |   30 ++++++++++++++++++++++++++++--
>  1 files changed, 28 insertions(+), 2 deletions(-)
> 
> diff --git a/net/mac80211/mesh_hwmp.c b/net/mac80211/mesh_hwmp.c
> index fd4f76a..cda4818 100644
> --- a/net/mac80211/mesh_hwmp.c
> +++ b/net/mac80211/mesh_hwmp.c
> @@ -8,6 +8,7 @@
>   */
>  
>  #include <linux/slab.h>
> +#include "wme.h"
>  #include "mesh.h"
>  
>  #ifdef CONFIG_MAC80211_VERBOSE_MHWMP_DEBUG
> @@ -202,6 +203,25 @@ static int mesh_path_sel_frame_tx(enum mpath_frame_type action, u8 flags,
>  	return 0;
>  }
>  
> +
> +static void prepare_frame_for_deferred_tx(struct ieee80211_sub_if_data *sdata,
> +		struct sk_buff *skb)
> +{
> +	struct ieee80211_local *local = sdata->local;
> +	struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb);
> +
> +	skb_set_mac_header(skb, 0);
> +	skb_set_network_header(skb, 0);
> +	skb_set_transport_header(skb, 0);
> +
> +	/* Send all internal mgmt frames on VO. Accordingly set TID to 7. */
> +	skb_set_queue_mapping(skb, IEEE80211_AC_VO);
> +	skb->priority = 7;
> +
> +	info->control.vif = &sdata->vif;
> +	ieee80211_set_qos_hdr(local, skb);
> +}

Maybe add a note that if the frame might be encrypted enough headroom
needs to be there?

johannes


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v3] mac80211: Defer tranmission of mesh path errors
  2011-09-02 10:59                 ` Johannes Berg
@ 2011-09-06 19:10                   ` Javier Cardona
  0 siblings, 0 replies; 19+ messages in thread
From: Javier Cardona @ 2011-09-06 19:10 UTC (permalink / raw)
  To: John W. Linville
  Cc: Javier Cardona, Thomas Pedersen, devel, Johannes Berg,
	linux-wireless, jlopex

Under failure conditions, the mesh stack sends PERR messages to the
previous sender of the failed frame.  This happens in the tx feedback
path, in which the transmission queue lock may be taken.  Avoid a
deadlock by sending the path error via the pending queue.

---
v3: - Comment on headroom size (Johannes)
v2: - Prepare skb for deferred transmission (Johannes)
    - Reword comment (Johannes)

Signed-off-by: Javier Cardona <javier@cozybit.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>

 net/mac80211/mesh_hwmp.c |   32 ++++++++++++++++++++++++++++++--
 1 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/net/mac80211/mesh_hwmp.c b/net/mac80211/mesh_hwmp.c
index fd4f76a..63df0bc 100644
--- a/net/mac80211/mesh_hwmp.c
+++ b/net/mac80211/mesh_hwmp.c
@@ -8,6 +8,7 @@
  */
 
 #include <linux/slab.h>
+#include "wme.h"
 #include "mesh.h"
 
 #ifdef CONFIG_MAC80211_VERBOSE_MHWMP_DEBUG
@@ -202,6 +203,27 @@ static int mesh_path_sel_frame_tx(enum mpath_frame_type action, u8 flags,
 	return 0;
 }
 
+
+/*  Headroom is not adjusted.  Caller should ensure that skb has sufficient
+ *  headroom in case the frame is encrypted. */
+static void prepare_frame_for_deferred_tx(struct ieee80211_sub_if_data *sdata,
+		struct sk_buff *skb)
+{
+	struct ieee80211_local *local = sdata->local;
+	struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb);
+
+	skb_set_mac_header(skb, 0);
+	skb_set_network_header(skb, 0);
+	skb_set_transport_header(skb, 0);
+
+	/* Send all internal mgmt frames on VO. Accordingly set TID to 7. */
+	skb_set_queue_mapping(skb, IEEE80211_AC_VO);
+	skb->priority = 7;
+
+	info->control.vif = &sdata->vif;
+	ieee80211_set_qos_hdr(local, skb);
+}
+
 /**
  * mesh_send_path error - Sends a PERR mesh management frame
  *
@@ -209,6 +231,10 @@ static int mesh_path_sel_frame_tx(enum mpath_frame_type action, u8 flags,
  * @target_sn: SN of the broken destination
  * @target_rcode: reason code for this PERR
  * @ra: node this frame is addressed to
+ *
+ * Note: This function may be called with driver locks taken that the driver
+ * also acquires in the TX path.  To avoid a deadlock we don't transmit the
+ * frame directly but add it to the pending queue instead.
  */
 int mesh_path_error_tx(u8 ttl, u8 *target, __le32 target_sn,
 		       __le16 target_rcode, const u8 *ra,
@@ -222,7 +248,7 @@ int mesh_path_error_tx(u8 ttl, u8 *target, __le32 target_sn,
 
 	if (!skb)
 		return -1;
-	skb_reserve(skb, local->hw.extra_tx_headroom);
+	skb_reserve(skb, local->tx_headroom + local->hw.extra_tx_headroom);
 	/* 25 is the size of the common mgmt part (24) plus the size of the
 	 * common action part (1)
 	 */
@@ -263,7 +289,9 @@ int mesh_path_error_tx(u8 ttl, u8 *target, __le32 target_sn,
 	pos += 4;
 	memcpy(pos, &target_rcode, 2);
 
-	ieee80211_tx_skb(sdata, skb);
+	/* see note in function header */
+	prepare_frame_for_deferred_tx(sdata, skb);
+	ieee80211_add_pending_skb(local, skb);
 	return 0;
 }
 
-- 
1.7.6


^ permalink raw reply related	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2011-09-06 19:10 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-08-25  1:34 [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock Thomas Pedersen
2011-08-25  9:48 ` Bob Copeland
2011-08-26 14:22 ` John W. Linville
2011-08-29  2:07   ` Thomas Pedersen
2011-08-29 14:09     ` Bob Copeland
2011-08-29 18:13       ` Thomas Pedersen
2011-08-30 11:50         ` Bob Copeland
2011-08-30 12:18 ` Johannes Berg
2011-08-30 12:21 ` Johannes Berg
2011-08-30 16:22   ` Javier Cardona
2011-08-30 18:29   ` [PATCH] mac80211: Defer tranmission of mesh path errors Javier Cardona
2011-08-30 18:43     ` Johannes Berg
2011-08-30 21:38       ` Javier Cardona
2011-08-31  1:50         ` Javier Cardona
2011-08-31  5:11           ` Johannes Berg
2011-09-01 17:04             ` Javier Cardona
2011-09-01 17:04               ` [PATCH v2] " Javier Cardona
2011-09-02 10:59                 ` Johannes Berg
2011-09-06 19:10                   ` [PATCH v3] " Javier Cardona

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.