* [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock @ 2011-08-25 1:34 Thomas Pedersen 2011-08-25 9:48 ` Bob Copeland ` (3 more replies) 0 siblings, 4 replies; 19+ messages in thread From: Thomas Pedersen @ 2011-08-25 1:34 UTC (permalink / raw) To: linux-wireless; +Cc: jirislaby, mickflemm, lrodriguez, me, Javier Cardona From: Javier Cardona <javier@cozybit.com> This driver reports transmission status to the upper layer (ath5k_tx_frame_completed()) while holding the lock on the transmission queue (txq->lock). Under failure conditions, the mesh stack will attempt to send PERR messages to the previous sender of the failed frame. When that happens the driver will attempt to re-acquire the txq->lock lock causing a deadlock. There are two possible fixes for this, (1) we could defer the transmission of the PERR frame until the lock is released or (2) release the lock before invoking ieee80211_tx_status(). The ath9k driver implements the second approach (see ath_tx_complete() in ath9k/xmit.c) as well as the rt2x00 and b43 drivers. The iwl driver, on the other hand, avoids this problem by invoking ieee80211_tx_status_irqsafe() which effectively defers processing of transmission feedback status. This last approach is the least intrusive is implemented here. Reported by Pedro Larbig (ASPj) --- drivers/net/wireless/ath/ath5k/base.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/net/wireless/ath/ath5k/base.c b/drivers/net/wireless/ath/ath5k/base.c index e9ea38d..9b488f9 100644 --- a/drivers/net/wireless/ath/ath5k/base.c +++ b/drivers/net/wireless/ath/ath5k/base.c @@ -1630,7 +1630,7 @@ ath5k_tx_frame_completed(struct ath5k_hw *ah, struct sk_buff *skb, ah->stats.antenna_tx[0]++; /* invalid */ trace_ath5k_tx_complete(ah, skb, txq, ts); - ieee80211_tx_status(ah->hw, skb); + ieee80211_tx_status_irqsafe(ah->hw, skb); } static void -- 1.7.4.1 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock 2011-08-25 1:34 [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock Thomas Pedersen @ 2011-08-25 9:48 ` Bob Copeland 2011-08-26 14:22 ` John W. Linville ` (2 subsequent siblings) 3 siblings, 0 replies; 19+ messages in thread From: Bob Copeland @ 2011-08-25 9:48 UTC (permalink / raw) To: Thomas Pedersen Cc: linux-wireless, jirislaby, mickflemm, lrodriguez, Javier Cardona On Wed, Aug 24, 2011 at 06:34:24PM -0700, Thomas Pedersen wrote: > From: Javier Cardona <javier@cozybit.com> > > This driver reports transmission status to the upper layer > (ath5k_tx_frame_completed()) while holding the lock on the transmission > queue (txq->lock). Under failure conditions, the mesh stack will > attempt to send PERR messages to the previous sender of the failed > frame. When that happens the driver will attempt to re-acquire the > txq->lock lock causing a deadlock. There are two possible fixes for > this, (1) we could defer the transmission of the PERR frame until the > lock is released or Of course, as a lazy driver writer I would like this one. Esp. without at least a comment in tx_status that it can call back into the driver. > (2) release the lock before invoking ieee80211_tx_status(). I was initially worried that we might race against draining buffers that might also want to free the skb. That is not the case (skb is already removed from the structure) -- but there could potentially be other races with drain since we're mucking with the list that we're iterating over. I'm least enthused with this approach. > (see ath_tx_complete() in ath9k/xmit.c) as well as the rt2x00 and b43 > drivers. The iwl driver, on the other hand, avoids this problem by > invoking ieee80211_tx_status_irqsafe() which effectively defers > processing of transmission feedback status. This last approach is the > least intrusive is implemented here. I guess it's ok. The only real potential fallout would be worse rate adaptation? Perhaps needs a comment so no one fixes it later. -- Bob Copeland %% www.bobcopeland.com ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock 2011-08-25 1:34 [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock Thomas Pedersen 2011-08-25 9:48 ` Bob Copeland @ 2011-08-26 14:22 ` John W. Linville 2011-08-29 2:07 ` Thomas Pedersen 2011-08-30 12:18 ` Johannes Berg 2011-08-30 12:21 ` Johannes Berg 3 siblings, 1 reply; 19+ messages in thread From: John W. Linville @ 2011-08-26 14:22 UTC (permalink / raw) To: Thomas Pedersen Cc: linux-wireless, jirislaby, mickflemm, lrodriguez, me, Javier Cardona On Wed, Aug 24, 2011 at 06:34:24PM -0700, Thomas Pedersen wrote: > From: Javier Cardona <javier@cozybit.com> > > This driver reports transmission status to the upper layer > (ath5k_tx_frame_completed()) while holding the lock on the transmission > queue (txq->lock). Under failure conditions, the mesh stack will > attempt to send PERR messages to the previous sender of the failed > frame. When that happens the driver will attempt to re-acquire the > txq->lock lock causing a deadlock. There are two possible fixes for > this, (1) we could defer the transmission of the PERR frame until the > lock is released or (2) release the lock before invoking > ieee80211_tx_status(). The ath9k driver implements the second approach > (see ath_tx_complete() in ath9k/xmit.c) as well as the rt2x00 and b43 > drivers. The iwl driver, on the other hand, avoids this problem by > invoking ieee80211_tx_status_irqsafe() which effectively defers > processing of transmission feedback status. This last approach is the > least intrusive is implemented here. > > Reported by Pedro Larbig (ASPj) > --- > drivers/net/wireless/ath/ath5k/base.c | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) Missing Signed-off-by... -- John W. Linville Someday the world will need a hero, and you linville@tuxdriver.com might be all we have. Be ready. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock 2011-08-26 14:22 ` John W. Linville @ 2011-08-29 2:07 ` Thomas Pedersen 2011-08-29 14:09 ` Bob Copeland 0 siblings, 1 reply; 19+ messages in thread From: Thomas Pedersen @ 2011-08-29 2:07 UTC (permalink / raw) To: John W. Linville Cc: linux-wireless, jirislaby, mickflemm, lrodriguez, me, Javier Cardona On Fri, Aug 26, 2011 at 7:22 AM, John W. Linville <linville@tuxdriver.com> wrote: > On Wed, Aug 24, 2011 at 06:34:24PM -0700, Thomas Pedersen wrote: >> From: Javier Cardona <javier@cozybit.com> >> >> This driver reports transmission status to the upper layer >> (ath5k_tx_frame_completed()) while holding the lock on the transmission >> queue (txq->lock). Under failure conditions, the mesh stack will >> attempt to send PERR messages to the previous sender of the failed >> frame. When that happens the driver will attempt to re-acquire the >> txq->lock lock causing a deadlock. There are two possible fixes for >> this, (1) we could defer the transmission of the PERR frame until the >> lock is released or (2) release the lock before invoking >> ieee80211_tx_status(). The ath9k driver implements the second approach >> (see ath_tx_complete() in ath9k/xmit.c) as well as the rt2x00 and b43 >> drivers. The iwl driver, on the other hand, avoids this problem by >> invoking ieee80211_tx_status_irqsafe() which effectively defers >> processing of transmission feedback status. This last approach is the >> least intrusive is implemented here. >> >> Reported by Pedro Larbig (ASPj) >> --- >> drivers/net/wireless/ath/ath5k/base.c | 2 +- >> 1 files changed, 1 insertions(+), 1 deletions(-) > > Missing Signed-off-by... > Yikes. Also, it looks like ieee80211_tx_status() should not be called from irq context. Will resubmit a v2 with signoff and comment shortly. Thomas ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock 2011-08-29 2:07 ` Thomas Pedersen @ 2011-08-29 14:09 ` Bob Copeland 2011-08-29 18:13 ` Thomas Pedersen 0 siblings, 1 reply; 19+ messages in thread From: Bob Copeland @ 2011-08-29 14:09 UTC (permalink / raw) To: Thomas Pedersen Cc: John W. Linville, linux-wireless, jirislaby, mickflemm, lrodriguez, Javier Cardona On Sun, Aug 28, 2011 at 10:07 PM, Thomas Pedersen <thomas@cozybit.com> wrote: > On Fri, Aug 26, 2011 at 7:22 AM, John W. Linville > <linville@tuxdriver.com> wrote: >> On Wed, Aug 24, 2011 at 06:34:24PM -0700, Thomas Pedersen wrote: >>> From: Javier Cardona <javier@cozybit.com> >>> >>> This driver reports transmission status to the upper layer >>> (ath5k_tx_frame_completed()) while holding the lock on the transmission >>> queue (txq->lock). Under failure conditions, the mesh stack will >>> attempt to send PERR messages to the previous sender of the failed >>> frame. When that happens the driver will attempt to re-acquire the >>> txq->lock lock causing a deadlock. There are two possible fixes for >>> this, (1) we could defer the transmission of the PERR frame until the >>> lock is released or (2) release the lock before invoking >>> ieee80211_tx_status(). The ath9k driver implements the second approach >>> (see ath_tx_complete() in ath9k/xmit.c) as well as the rt2x00 and b43 >>> drivers. The iwl driver, on the other hand, avoids this problem by >>> invoking ieee80211_tx_status_irqsafe() which effectively defers >>> processing of transmission feedback status. This last approach is the >>> least intrusive is implemented here. >>> >>> Reported by Pedro Larbig (ASPj) >>> --- >>> drivers/net/wireless/ath/ath5k/base.c | 2 +- >>> 1 files changed, 1 insertions(+), 1 deletions(-) >> >> Missing Signed-off-by... >> > Yikes. Also, it looks like ieee80211_tx_status() should not be called > from irq context. Will resubmit a v2 with signoff and comment shortly. I don't get the last statement -- ieee80211_tx_status() -> irqsafe was the very change the patch added. It's running in a bottom half though, not a hard irq. -- Bob Copeland %% www.bobcopeland.com ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock 2011-08-29 14:09 ` Bob Copeland @ 2011-08-29 18:13 ` Thomas Pedersen 2011-08-30 11:50 ` Bob Copeland 0 siblings, 1 reply; 19+ messages in thread From: Thomas Pedersen @ 2011-08-29 18:13 UTC (permalink / raw) To: Bob Copeland Cc: John W. Linville, linux-wireless, jirislaby, mickflemm, lrodriguez, Javier Cardona On Mon, Aug 29, 2011 at 7:09 AM, Bob Copeland <me@bobcopeland.com> wrote: > On Sun, Aug 28, 2011 at 10:07 PM, Thomas Pedersen <thomas@cozybit.com> wrote: >> On Fri, Aug 26, 2011 at 7:22 AM, John W. Linville >> <linville@tuxdriver.com> wrote: >>> On Wed, Aug 24, 2011 at 06:34:24PM -0700, Thomas Pedersen wrote: >>>> From: Javier Cardona <javier@cozybit.com> >>>> >>>> This driver reports transmission status to the upper layer >>>> (ath5k_tx_frame_completed()) while holding the lock on the transmission >>>> queue (txq->lock). Under failure conditions, the mesh stack will >>>> attempt to send PERR messages to the previous sender of the failed >>>> frame. When that happens the driver will attempt to re-acquire the >>>> txq->lock lock causing a deadlock. There are two possible fixes for >>>> this, (1) we could defer the transmission of the PERR frame until the >>>> lock is released or (2) release the lock before invoking >>>> ieee80211_tx_status(). The ath9k driver implements the second approach >>>> (see ath_tx_complete() in ath9k/xmit.c) as well as the rt2x00 and b43 >>>> drivers. The iwl driver, on the other hand, avoids this problem by >>>> invoking ieee80211_tx_status_irqsafe() which effectively defers >>>> processing of transmission feedback status. This last approach is the >>>> least intrusive is implemented here. >>>> >>>> Reported by Pedro Larbig (ASPj) >>>> --- >>>> drivers/net/wireless/ath/ath5k/base.c | 2 +- >>>> 1 files changed, 1 insertions(+), 1 deletions(-) >>> >>> Missing Signed-off-by... >>> >> Yikes. Also, it looks like ieee80211_tx_status() should not be called >> from irq context. Will resubmit a v2 with signoff and comment shortly. > > I don't get the last statement -- ieee80211_tx_status() -> irqsafe was I meant to say "In addition to the above discussion, ieee80211_tx_status() should not be called from interrupt context anyway". > the very change the patch added. It's running in a bottom half > though, not a hard irq. > Even in a bottom half we're still in "interrupt" context, right? ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock 2011-08-29 18:13 ` Thomas Pedersen @ 2011-08-30 11:50 ` Bob Copeland 0 siblings, 0 replies; 19+ messages in thread From: Bob Copeland @ 2011-08-30 11:50 UTC (permalink / raw) To: Thomas Pedersen Cc: John W. Linville, linux-wireless, jirislaby, mickflemm, lrodriguez, Javier Cardona On Mon, Aug 29, 2011 at 11:13:10AM -0700, Thomas Pedersen wrote: > I meant to say "In addition to the above discussion, > ieee80211_tx_status() should not be called from interrupt context > anyway". > > > the very change the patch added. It's running in a bottom half > > though, not a hard irq. > > Even in a bottom half we're still in "interrupt" context, right? Yes, but my interpretation is that _irqsafe() in this case refers to hard irq context and not softirq context. So it's still atomic, but tx_status() does things like kfree_skb() directly instead of deferring that to another softirq. There are 3 contexts in Linux: process, softirq, and hardirq, and given that we have 3 corresponding tx_status functions (tx_status_ni, tx_status, tx_status_irqsafe), this interpretation makes sense, no? Someone correct me if I'm wrong here. It seems to me this patch warrants some discussion or some comments in the .h files about how drivers can be called back into themselves; otherwise it's very hard for a driver writer to get locking right without examining the stack. And dropping locks around various API calls means you have to validate the protected state when you get the lock back, it's not a panacea. -- Bob Copeland %% www.bobcopeland.com ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock 2011-08-25 1:34 [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock Thomas Pedersen 2011-08-25 9:48 ` Bob Copeland 2011-08-26 14:22 ` John W. Linville @ 2011-08-30 12:18 ` Johannes Berg 2011-08-30 12:21 ` Johannes Berg 3 siblings, 0 replies; 19+ messages in thread From: Johannes Berg @ 2011-08-30 12:18 UTC (permalink / raw) To: Thomas Pedersen Cc: linux-wireless, jirislaby, mickflemm, lrodriguez, me, Javier Cardona Well, NACK. You shouldn't mix rx() with tx_status_irqsafe(). johannes ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock 2011-08-25 1:34 [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock Thomas Pedersen ` (2 preceding siblings ...) 2011-08-30 12:18 ` Johannes Berg @ 2011-08-30 12:21 ` Johannes Berg 2011-08-30 16:22 ` Javier Cardona 2011-08-30 18:29 ` [PATCH] mac80211: Defer tranmission of mesh path errors Javier Cardona 3 siblings, 2 replies; 19+ messages in thread From: Johannes Berg @ 2011-08-30 12:21 UTC (permalink / raw) To: Thomas Pedersen Cc: linux-wireless, jirislaby, mickflemm, lrodriguez, me, Javier Cardona On Wed, 2011-08-24 at 18:34 -0700, Thomas Pedersen wrote: > From: Javier Cardona <javier@cozybit.com> > > This driver reports transmission status to the upper layer > (ath5k_tx_frame_completed()) while holding the lock on the transmission > queue (txq->lock). Under failure conditions, the mesh stack will > attempt to send PERR messages to the previous sender of the failed > frame. I'd feel a lot more comfortable if the mesh stack didn't do that. I suppose it could use ieee80211_add_pending_skb() instead? johannes ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock 2011-08-30 12:21 ` Johannes Berg @ 2011-08-30 16:22 ` Javier Cardona 2011-08-30 18:29 ` [PATCH] mac80211: Defer tranmission of mesh path errors Javier Cardona 1 sibling, 0 replies; 19+ messages in thread From: Javier Cardona @ 2011-08-30 16:22 UTC (permalink / raw) To: Johannes Berg Cc: Thomas Pedersen, linux-wireless, jirislaby, mickflemm, lrodriguez, me On Tue, Aug 30, 2011 at 5:21 AM, Johannes Berg <johannes@sipsolutions.net> wrote: > On Wed, 2011-08-24 at 18:34 -0700, Thomas Pedersen wrote: >> From: Javier Cardona <javier@cozybit.com> >> >> This driver reports transmission status to the upper layer >> (ath5k_tx_frame_completed()) while holding the lock on the transmission >> queue (txq->lock). Under failure conditions, the mesh stack will >> attempt to send PERR messages to the previous sender of the failed >> frame. > > I'd feel a lot more comfortable if the mesh stack didn't do that. I > suppose it could use ieee80211_add_pending_skb() instead? Suggestion taken. We'll implement that. Cheers, Javier ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH] mac80211: Defer tranmission of mesh path errors 2011-08-30 12:21 ` Johannes Berg 2011-08-30 16:22 ` Javier Cardona @ 2011-08-30 18:29 ` Javier Cardona 2011-08-30 18:43 ` Johannes Berg 1 sibling, 1 reply; 19+ messages in thread From: Javier Cardona @ 2011-08-30 18:29 UTC (permalink / raw) To: John W. Linville Cc: Javier Cardona, Thomas Pedersen, devel, Johannes Berg, linux-wireless, jlopex Under failure conditions, the mesh stack sends PERR messages to the previous sender of the failed frame. This happens in the tx feedback path, in which the transmission queue lock may be taken. Avoid a deadlock by sending the path error via the pending queue. Signed-off-by: Javier Cardona <javier@cozybit.com> --- net/mac80211/mesh_hwmp.c | 7 ++++++- 1 files changed, 6 insertions(+), 1 deletions(-) diff --git a/net/mac80211/mesh_hwmp.c b/net/mac80211/mesh_hwmp.c index fd4f76a..f760679 100644 --- a/net/mac80211/mesh_hwmp.c +++ b/net/mac80211/mesh_hwmp.c @@ -209,6 +209,10 @@ static int mesh_path_sel_frame_tx(enum mpath_frame_type action, u8 flags, * @target_sn: SN of the broken destination * @target_rcode: reason code for this PERR * @ra: node this frame is addressed to + * + * Note: This function may be called from the tx feedback path, possibly with + * the transmission queue lock taken. To avoid a deadlock we don't transmit + * the frame directly but add it to the pending queue instead. */ int mesh_path_error_tx(u8 ttl, u8 *target, __le32 target_sn, __le16 target_rcode, const u8 *ra, @@ -263,7 +267,8 @@ int mesh_path_error_tx(u8 ttl, u8 *target, __le32 target_sn, pos += 4; memcpy(pos, &target_rcode, 2); - ieee80211_tx_skb(sdata, skb); + /* see note in function header */ + ieee80211_add_pending_skb(local, skb); return 0; } -- 1.7.6 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH] mac80211: Defer tranmission of mesh path errors 2011-08-30 18:29 ` [PATCH] mac80211: Defer tranmission of mesh path errors Javier Cardona @ 2011-08-30 18:43 ` Johannes Berg 2011-08-30 21:38 ` Javier Cardona 0 siblings, 1 reply; 19+ messages in thread From: Johannes Berg @ 2011-08-30 18:43 UTC (permalink / raw) To: Javier Cardona Cc: John W. Linville, Thomas Pedersen, devel, linux-wireless, jlopex On Tue, 2011-08-30 at 11:29 -0700, Javier Cardona wrote: > Under failure conditions, the mesh stack sends PERR messages to the > previous sender of the failed frame. This happens in the tx feedback > path, in which the transmission queue lock may be taken. Avoid a > deadlock by sending the path error via the pending queue. > > Signed-off-by: Javier Cardona <javier@cozybit.com> > --- > net/mac80211/mesh_hwmp.c | 7 ++++++- > 1 files changed, 6 insertions(+), 1 deletions(-) > > diff --git a/net/mac80211/mesh_hwmp.c b/net/mac80211/mesh_hwmp.c > index fd4f76a..f760679 100644 > --- a/net/mac80211/mesh_hwmp.c > +++ b/net/mac80211/mesh_hwmp.c > @@ -209,6 +209,10 @@ static int mesh_path_sel_frame_tx(enum mpath_frame_type action, u8 flags, > * @target_sn: SN of the broken destination > * @target_rcode: reason code for this PERR > * @ra: node this frame is addressed to > + * > + * Note: This function may be called from the tx feedback path, possibly with > + * the transmission queue lock taken. To avoid a deadlock we don't transmit > + * the frame directly but add it to the pending queue instead. I'd change that to say "with driver locks taken that the driver also acquires in the TX path" or something like that maybe? But that's not a big thing. Have you tested it? I'm wondering if because we take a lock here we might run into lock dependencies issues (lockdep would say) but I don't think so because stop_queue etc. all take the same lock from an arbitrary driver context already. Reviewed-by: Johannes Berg <johannes@sipsolutions.net> johannes ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] mac80211: Defer tranmission of mesh path errors 2011-08-30 18:43 ` Johannes Berg @ 2011-08-30 21:38 ` Javier Cardona 2011-08-31 1:50 ` Javier Cardona 0 siblings, 1 reply; 19+ messages in thread From: Javier Cardona @ 2011-08-30 21:38 UTC (permalink / raw) To: Johannes Berg Cc: John W. Linville, Thomas Pedersen, devel, linux-wireless, jlopex On Tue, Aug 30, 2011 at 11:43 AM, Johannes Berg <johannes@sipsolutions.net> wrote: > On Tue, 2011-08-30 at 11:29 -0700, Javier Cardona wrote: >> Under failure conditions, the mesh stack sends PERR messages to the >> previous sender of the failed frame. This happens in the tx feedback >> path, in which the transmission queue lock may be taken. Avoid a >> deadlock by sending the path error via the pending queue. >> >> Signed-off-by: Javier Cardona <javier@cozybit.com> >> --- >> net/mac80211/mesh_hwmp.c | 7 ++++++- >> 1 files changed, 6 insertions(+), 1 deletions(-) >> >> diff --git a/net/mac80211/mesh_hwmp.c b/net/mac80211/mesh_hwmp.c >> index fd4f76a..f760679 100644 >> --- a/net/mac80211/mesh_hwmp.c >> +++ b/net/mac80211/mesh_hwmp.c >> @@ -209,6 +209,10 @@ static int mesh_path_sel_frame_tx(enum mpath_frame_type action, u8 flags, >> * @target_sn: SN of the broken destination >> * @target_rcode: reason code for this PERR >> * @ra: node this frame is addressed to >> + * >> + * Note: This function may be called from the tx feedback path, possibly with >> + * the transmission queue lock taken. To avoid a deadlock we don't transmit >> + * the frame directly but add it to the pending queue instead. > > I'd change that to say "with driver locks taken that the driver also > acquires in the TX path" or something like that maybe? OK > But that's not a big thing. Have you tested it? I'm wondering if because > we take a lock here we might run into lock dependencies issues (lockdep > would say) but I don't think so because stop_queue etc. all take the > same lock from an arbitrary driver context already. We don't have a test specific for that particular perr. Other tests run fine and lockdep does not complain. I can resubmit the patch with the clearer comment once we've run our whole test suite if that gives you peace of mind. Javier > Reviewed-by: Johannes Berg <johannes@sipsolutions.net> > > johannes > > -- Javier Cardona cozybit Inc. http://www.cozybit.com ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] mac80211: Defer tranmission of mesh path errors 2011-08-30 21:38 ` Javier Cardona @ 2011-08-31 1:50 ` Javier Cardona 2011-08-31 5:11 ` Johannes Berg 0 siblings, 1 reply; 19+ messages in thread From: Javier Cardona @ 2011-08-31 1:50 UTC (permalink / raw) To: Johannes Berg Cc: John W. Linville, Thomas Pedersen, devel, linux-wireless, jlopex On Tue, Aug 30, 2011 at 2:38 PM, Javier Cardona <javier@cozybit.com> wrote: > On Tue, Aug 30, 2011 at 11:43 AM, Johannes Berg >> But that's not a big thing. Have you tested it? I'm wondering if because >> we take a lock here we might run into lock dependencies issues (lockdep >> would say) but I don't think so because stop_queue etc. all take the >> same lock from an arbitrary driver context already. > > We don't have a test specific for that particular perr. Other tests > run fine and lockdep does not complain. > I can resubmit the patch with the clearer comment once we've run our > whole test suite if that gives you peace of mind. The stress tests do trigger perrs and, with this change, we see a warning due to missing info->control.vif [14202.988077] ------------[ cut here ]------------ [14202.988351] WARNING: at net/mac80211/util.c:358 ieee80211_add_pending_skb+0x97/0xa0() [14202.988353] Hardware name: Bochs [14202.988355] Modules linked in: mac80211_hwsim [last unloaded: mac80211_hwsim] [14202.988359] Pid: 15051, comm: iperf Tainted: G W 3.1.0-rc4-wl+ #47 [14202.988361] Call Trace: [14202.988364] [<c103ad2d>] warn_slowpath_common+0x6d/0xa0 [14202.988367] [<c15fbca7>] ? ieee80211_add_pending_skb+0x97/0xa0 [14202.988370] [<c15fbca7>] ? ieee80211_add_pending_skb+0x97/0xa0 [14202.988373] [<c103ad7d>] warn_slowpath_null+0x1d/0x20 [14202.988376] [<c15fbca7>] ieee80211_add_pending_skb+0x97/0xa0 [14202.988379] [<c1605ab1>] mesh_path_error_tx+0x151/0x190 [14202.988382] [<c160362b>] mesh_path_discard_frame+0xfb/0x100 [14202.988385] [<c1603580>] ? mesh_path_discard_frame+0x50/0x100 [14202.988387] [<c1605dd7>] mesh_nexthop_lookup+0x157/0x1d0 [14202.988390] [<c1605c80>] ? mesh_queue_preq+0x190/0x190 [14202.988393] [<c15f826a>] ieee80211_xmit+0x10a/0x240 [14202.988395] [<c15f8160>] ? ieee80211_tx+0xe0/0xe0 [14202.988398] [<c15f5c3a>] ? ieee80211_skb_resize+0x7a/0x100 [14202.988401] [<c15f8ba7>] ieee80211_subif_start_xmit+0x307/0x810 [14202.988404] [<c15f8cf8>] ? ieee80211_subif_start_xmit+0x458/0x810 I'll look into it tomorrow. Cheers, j -- Javier Cardona cozybit Inc. http://www.cozybit.com ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] mac80211: Defer tranmission of mesh path errors 2011-08-31 1:50 ` Javier Cardona @ 2011-08-31 5:11 ` Johannes Berg 2011-09-01 17:04 ` Javier Cardona 0 siblings, 1 reply; 19+ messages in thread From: Johannes Berg @ 2011-08-31 5:11 UTC (permalink / raw) To: Javier Cardona Cc: John W. Linville, Thomas Pedersen, devel, linux-wireless, jlopex On Tue, 2011-08-30 at 18:50 -0700, Javier Cardona wrote: > The stress tests do trigger perrs and, with this change, we see a > warning due to missing info->control.vif > > [14202.988077] ------------[ cut here ]------------ > [14202.988351] WARNING: at net/mac80211/util.c:358 > ieee80211_add_pending_skb+0x97/0xa0() Interesting. Ok, that means we can't just use add_pending_skb(), we need to do some setup before, all the setup that ieee80211_tx_skb() and ieee80211_xmit() would do up to ieee80211_tx()... I guess we need a little bit of helper code to enable this. johannes ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] mac80211: Defer tranmission of mesh path errors 2011-08-31 5:11 ` Johannes Berg @ 2011-09-01 17:04 ` Javier Cardona 2011-09-01 17:04 ` [PATCH v2] " Javier Cardona 0 siblings, 1 reply; 19+ messages in thread From: Javier Cardona @ 2011-09-01 17:04 UTC (permalink / raw) To: John W. Linville Cc: Javier Cardona, Thomas Pedersen, devel, Johannes Berg, linux-wireless, jlopex On Tue, Aug 30, 2011 at 10:11 PM, Johannes Berg <johannes@sipsolutions.net> wrote: > On Tue, 2011-08-30 at 18:50 -0700, Javier Cardona wrote: > >> The stress tests do trigger perrs and, with this change, we see a >> warning due to missing info->control.vif >> >> [14202.988077] ------------[ cut here ]------------ >> [14202.988351] WARNING: at net/mac80211/util.c:358 >> ieee80211_add_pending_skb+0x97/0xa0() > > Interesting. Ok, that means we can't just use add_pending_skb(), we need > to do some setup before, all the setup that ieee80211_tx_skb() and > ieee80211_xmit() would do up to ieee80211_tx()... > > I guess we need a little bit of helper code to enable this. The patch that follows seems to do it. Passes our stress test without warnings. I increased the headroom on allocation to avoid a resize. Does it look correct to you? Cheers, Javier -- 1.7.6 ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2] mac80211: Defer tranmission of mesh path errors 2011-09-01 17:04 ` Javier Cardona @ 2011-09-01 17:04 ` Javier Cardona 2011-09-02 10:59 ` Johannes Berg 0 siblings, 1 reply; 19+ messages in thread From: Javier Cardona @ 2011-09-01 17:04 UTC (permalink / raw) To: John W. Linville Cc: Javier Cardona, Thomas Pedersen, devel, Johannes Berg, linux-wireless, jlopex Under failure conditions, the mesh stack sends PERR messages to the previous sender of the failed frame. This happens in the tx feedback path, in which the transmission queue lock may be taken. Avoid a deadlock by sending the path error via the pending queue. Signed-off-by: Javier Cardona <javier@cozybit.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> --- v2: - Helper to prepare skb for deferred transmission (Johannes) - Reword comment (Johannes) net/mac80211/mesh_hwmp.c | 30 ++++++++++++++++++++++++++++-- 1 files changed, 28 insertions(+), 2 deletions(-) diff --git a/net/mac80211/mesh_hwmp.c b/net/mac80211/mesh_hwmp.c index fd4f76a..cda4818 100644 --- a/net/mac80211/mesh_hwmp.c +++ b/net/mac80211/mesh_hwmp.c @@ -8,6 +8,7 @@ */ #include <linux/slab.h> +#include "wme.h" #include "mesh.h" #ifdef CONFIG_MAC80211_VERBOSE_MHWMP_DEBUG @@ -202,6 +203,25 @@ static int mesh_path_sel_frame_tx(enum mpath_frame_type action, u8 flags, return 0; } + +static void prepare_frame_for_deferred_tx(struct ieee80211_sub_if_data *sdata, + struct sk_buff *skb) +{ + struct ieee80211_local *local = sdata->local; + struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb); + + skb_set_mac_header(skb, 0); + skb_set_network_header(skb, 0); + skb_set_transport_header(skb, 0); + + /* Send all internal mgmt frames on VO. Accordingly set TID to 7. */ + skb_set_queue_mapping(skb, IEEE80211_AC_VO); + skb->priority = 7; + + info->control.vif = &sdata->vif; + ieee80211_set_qos_hdr(local, skb); +} + /** * mesh_send_path error - Sends a PERR mesh management frame * @@ -209,6 +229,10 @@ static int mesh_path_sel_frame_tx(enum mpath_frame_type action, u8 flags, * @target_sn: SN of the broken destination * @target_rcode: reason code for this PERR * @ra: node this frame is addressed to + * + * Note: This function may be called with driver locks taken that the driver + * also acquires in the TX path. To avoid a deadlock we don't transmit the + * frame directly but add it to the pending queue instead. */ int mesh_path_error_tx(u8 ttl, u8 *target, __le32 target_sn, __le16 target_rcode, const u8 *ra, @@ -222,7 +246,7 @@ int mesh_path_error_tx(u8 ttl, u8 *target, __le32 target_sn, if (!skb) return -1; - skb_reserve(skb, local->hw.extra_tx_headroom); + skb_reserve(skb, local->tx_headroom + local->hw.extra_tx_headroom); /* 25 is the size of the common mgmt part (24) plus the size of the * common action part (1) */ @@ -263,7 +287,9 @@ int mesh_path_error_tx(u8 ttl, u8 *target, __le32 target_sn, pos += 4; memcpy(pos, &target_rcode, 2); - ieee80211_tx_skb(sdata, skb); + /* see note in function header */ + prepare_frame_for_deferred_tx(sdata, skb); + ieee80211_add_pending_skb(local, skb); return 0; } -- 1.7.6 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v2] mac80211: Defer tranmission of mesh path errors 2011-09-01 17:04 ` [PATCH v2] " Javier Cardona @ 2011-09-02 10:59 ` Johannes Berg 2011-09-06 19:10 ` [PATCH v3] " Javier Cardona 0 siblings, 1 reply; 19+ messages in thread From: Johannes Berg @ 2011-09-02 10:59 UTC (permalink / raw) To: Javier Cardona Cc: John W. Linville, Thomas Pedersen, devel, linux-wireless, jlopex On Thu, 2011-09-01 at 10:04 -0700, Javier Cardona wrote: > Under failure conditions, the mesh stack sends PERR messages to the > previous sender of the failed frame. This happens in the tx feedback > path, in which the transmission queue lock may be taken. Avoid a > deadlock by sending the path error via the pending queue. > > Signed-off-by: Javier Cardona <javier@cozybit.com> > Reviewed-by: Johannes Berg <johannes@sipsolutions.net> > --- > v2: - Helper to prepare skb for deferred transmission (Johannes) > - Reword comment (Johannes) > > net/mac80211/mesh_hwmp.c | 30 ++++++++++++++++++++++++++++-- > 1 files changed, 28 insertions(+), 2 deletions(-) > > diff --git a/net/mac80211/mesh_hwmp.c b/net/mac80211/mesh_hwmp.c > index fd4f76a..cda4818 100644 > --- a/net/mac80211/mesh_hwmp.c > +++ b/net/mac80211/mesh_hwmp.c > @@ -8,6 +8,7 @@ > */ > > #include <linux/slab.h> > +#include "wme.h" > #include "mesh.h" > > #ifdef CONFIG_MAC80211_VERBOSE_MHWMP_DEBUG > @@ -202,6 +203,25 @@ static int mesh_path_sel_frame_tx(enum mpath_frame_type action, u8 flags, > return 0; > } > > + > +static void prepare_frame_for_deferred_tx(struct ieee80211_sub_if_data *sdata, > + struct sk_buff *skb) > +{ > + struct ieee80211_local *local = sdata->local; > + struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb); > + > + skb_set_mac_header(skb, 0); > + skb_set_network_header(skb, 0); > + skb_set_transport_header(skb, 0); > + > + /* Send all internal mgmt frames on VO. Accordingly set TID to 7. */ > + skb_set_queue_mapping(skb, IEEE80211_AC_VO); > + skb->priority = 7; > + > + info->control.vif = &sdata->vif; > + ieee80211_set_qos_hdr(local, skb); > +} Maybe add a note that if the frame might be encrypted enough headroom needs to be there? johannes ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v3] mac80211: Defer tranmission of mesh path errors 2011-09-02 10:59 ` Johannes Berg @ 2011-09-06 19:10 ` Javier Cardona 0 siblings, 0 replies; 19+ messages in thread From: Javier Cardona @ 2011-09-06 19:10 UTC (permalink / raw) To: John W. Linville Cc: Javier Cardona, Thomas Pedersen, devel, Johannes Berg, linux-wireless, jlopex Under failure conditions, the mesh stack sends PERR messages to the previous sender of the failed frame. This happens in the tx feedback path, in which the transmission queue lock may be taken. Avoid a deadlock by sending the path error via the pending queue. --- v3: - Comment on headroom size (Johannes) v2: - Prepare skb for deferred transmission (Johannes) - Reword comment (Johannes) Signed-off-by: Javier Cardona <javier@cozybit.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> net/mac80211/mesh_hwmp.c | 32 ++++++++++++++++++++++++++++++-- 1 files changed, 30 insertions(+), 2 deletions(-) diff --git a/net/mac80211/mesh_hwmp.c b/net/mac80211/mesh_hwmp.c index fd4f76a..63df0bc 100644 --- a/net/mac80211/mesh_hwmp.c +++ b/net/mac80211/mesh_hwmp.c @@ -8,6 +8,7 @@ */ #include <linux/slab.h> +#include "wme.h" #include "mesh.h" #ifdef CONFIG_MAC80211_VERBOSE_MHWMP_DEBUG @@ -202,6 +203,27 @@ static int mesh_path_sel_frame_tx(enum mpath_frame_type action, u8 flags, return 0; } + +/* Headroom is not adjusted. Caller should ensure that skb has sufficient + * headroom in case the frame is encrypted. */ +static void prepare_frame_for_deferred_tx(struct ieee80211_sub_if_data *sdata, + struct sk_buff *skb) +{ + struct ieee80211_local *local = sdata->local; + struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb); + + skb_set_mac_header(skb, 0); + skb_set_network_header(skb, 0); + skb_set_transport_header(skb, 0); + + /* Send all internal mgmt frames on VO. Accordingly set TID to 7. */ + skb_set_queue_mapping(skb, IEEE80211_AC_VO); + skb->priority = 7; + + info->control.vif = &sdata->vif; + ieee80211_set_qos_hdr(local, skb); +} + /** * mesh_send_path error - Sends a PERR mesh management frame * @@ -209,6 +231,10 @@ static int mesh_path_sel_frame_tx(enum mpath_frame_type action, u8 flags, * @target_sn: SN of the broken destination * @target_rcode: reason code for this PERR * @ra: node this frame is addressed to + * + * Note: This function may be called with driver locks taken that the driver + * also acquires in the TX path. To avoid a deadlock we don't transmit the + * frame directly but add it to the pending queue instead. */ int mesh_path_error_tx(u8 ttl, u8 *target, __le32 target_sn, __le16 target_rcode, const u8 *ra, @@ -222,7 +248,7 @@ int mesh_path_error_tx(u8 ttl, u8 *target, __le32 target_sn, if (!skb) return -1; - skb_reserve(skb, local->hw.extra_tx_headroom); + skb_reserve(skb, local->tx_headroom + local->hw.extra_tx_headroom); /* 25 is the size of the common mgmt part (24) plus the size of the * common action part (1) */ @@ -263,7 +289,9 @@ int mesh_path_error_tx(u8 ttl, u8 *target, __le32 target_sn, pos += 4; memcpy(pos, &target_rcode, 2); - ieee80211_tx_skb(sdata, skb); + /* see note in function header */ + prepare_frame_for_deferred_tx(sdata, skb); + ieee80211_add_pending_skb(local, skb); return 0; } -- 1.7.6 ^ permalink raw reply related [flat|nested] 19+ messages in thread
end of thread, other threads:[~2011-09-06 19:10 UTC | newest] Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2011-08-25 1:34 [PATCH] ath5k: Invoke irqsafe version of ieee80211_tx_status() to avoid deadlock Thomas Pedersen 2011-08-25 9:48 ` Bob Copeland 2011-08-26 14:22 ` John W. Linville 2011-08-29 2:07 ` Thomas Pedersen 2011-08-29 14:09 ` Bob Copeland 2011-08-29 18:13 ` Thomas Pedersen 2011-08-30 11:50 ` Bob Copeland 2011-08-30 12:18 ` Johannes Berg 2011-08-30 12:21 ` Johannes Berg 2011-08-30 16:22 ` Javier Cardona 2011-08-30 18:29 ` [PATCH] mac80211: Defer tranmission of mesh path errors Javier Cardona 2011-08-30 18:43 ` Johannes Berg 2011-08-30 21:38 ` Javier Cardona 2011-08-31 1:50 ` Javier Cardona 2011-08-31 5:11 ` Johannes Berg 2011-09-01 17:04 ` Javier Cardona 2011-09-01 17:04 ` [PATCH v2] " Javier Cardona 2011-09-02 10:59 ` Johannes Berg 2011-09-06 19:10 ` [PATCH v3] " Javier Cardona
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.