* [PATCH] mac80211: remove ignore_plink_timer flag
From: Bob Copeland @ 2014-06-04 13:27 UTC
  To: johannes; +Cc: linux-wireless, Bob Copeland

The mesh_plink code is doing some interesting things with the
ignore_plink_timer flag.  It seems the original intent was to
handle this race:

cpu 0                           cpu 1
-----                           -----
                                start timer handler for state X
acquire sta_lock
change state from X to Y
mod_timer() / del_timer()
release sta_lock
                                acquire sta_lock
                                execute state Y timer too soon

However, using the mod_timer()/del_timer() return values to
detect these cases is broken.  As a result, timers get ignored
unnecessarily, and stations can get stuck in the peering state
machine.

Instead, we can detect the case by looking at the timer expiration.
In the case of del_timer, just ignore the timers in the following
(LISTEN/ESTAB) states since they won't have timers anyway.

Signed-off-by: Bob Copeland <me@bobcopeland.com>
---
 net/mac80211/mesh_plink.c | 30 +++++++++++++++++++++++-------
 net/mac80211/sta_info.h   |  2 --
 2 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/net/mac80211/mesh_plink.c b/net/mac80211/mesh_plink.c
index e8f60aa..63b8741 100644
--- a/net/mac80211/mesh_plink.c
+++ b/net/mac80211/mesh_plink.c
@@ -551,11 +551,30 @@ static void mesh_plink_timer(unsigned long data)
 		return;
 
 	spin_lock_bh(&sta->lock);
-	if (sta->ignore_plink_timer) {
-		sta->ignore_plink_timer = false;
+
+	/* If a timer fires just before a state transition on another CPU,
+	 * we may have already extended the timeout and changed state by the
+	 * time we've acquired the lock and arrived here.  In that case,
+	 * skip this timer and wait for the new one.
+	 */
+	if (time_before(jiffies, sta->plink_timer.expires)) {
+		mpl_dbg(sta->sdata,
+			"Ignoring timer for %pM in state %s (timer adjusted)\n",
+			sta->sta.addr, mplstates[sta->plink_state]);
 		spin_unlock_bh(&sta->lock);
 		return;
 	}
+
+	/* del_timer() and handler may race when entering these states */
+	if (sta->plink_state == NL80211_PLINK_LISTEN ||
+	    sta->plink_state == NL80211_PLINK_ESTAB) {
+		mpl_dbg(sta->sdata,
+			"Ignoring timer for %pM in state %s (timer deleted)\n",
+			sta->sta.addr, mplstates[sta->plink_state]);
+		spin_unlock_bh(&sta->lock);
+		return;
+	}
+
 	mpl_dbg(sta->sdata,
 		"Mesh plink timer for %pM fired on state %s\n",
 		sta->sta.addr, mplstates[sta->plink_state]);
@@ -773,9 +792,7 @@ static u32 mesh_plink_fsm(struct ieee80211_sub_if_data *sdata,
 			break;
 		case CNF_ACPT:
 			sta->plink_state = NL80211_PLINK_CNF_RCVD;
-			if (!mod_plink_timer(sta,
-					     mshcfg->dot11MeshConfirmTimeout))
-				sta->ignore_plink_timer = true;
+			mod_plink_timer(sta, mshcfg->dot11MeshConfirmTimeout);
 			break;
 		default:
 			break;
@@ -834,8 +851,7 @@ static u32 mesh_plink_fsm(struct ieee80211_sub_if_data *sdata,
 	case NL80211_PLINK_HOLDING:
 		switch (event) {
 		case CLS_ACPT:
-			if (del_timer(&sta->plink_timer))
-				sta->ignore_plink_timer = 1;
+			del_timer(&sta->plink_timer);
 			mesh_plink_fsm_restart(sta);
 			break;
 		case OPN_ACPT:
diff --git a/net/mac80211/sta_info.h b/net/mac80211/sta_info.h
index dee0b64..159cac9 100644
--- a/net/mac80211/sta_info.h
+++ b/net/mac80211/sta_info.h
@@ -306,7 +306,6 @@ struct ieee80211_tx_latency_stat {
  * @plid: Peer link ID
  * @reason: Cancel reason on PLINK_HOLDING state
  * @plink_retries: Retries in establishment
- * @ignore_plink_timer: ignore the peer-link timer (used internally)
  * @plink_state: peer link state
  * @plink_timeout: timeout of peer link
  * @plink_timer: peer link watch timer
@@ -421,7 +420,6 @@ struct sta_info {
 	u16 plid;
 	u16 reason;
 	u8 plink_retries;
-	bool ignore_plink_timer;
 	enum nl80211_plink_state plink_state;
 	u32 plink_timeout;
 	struct timer_list plink_timer;
-- 
1.9.2
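
As a reading aid for the patch above, here is a minimal user-space sketch of the decision the reworked mesh_plink_timer() handler makes once it holds the lock. It is not kernel code: the model_* names, the enum values, and the verdict type are hypothetical, and model_time_before() only mirrors the wraparound-safe comparison that the kernel's time_before() performs on jiffies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the peer link states named in the patch. */
enum model_plink_state {
	MODEL_LISTEN,
	MODEL_OPN_SNT,
	MODEL_OPN_RCVD,
	MODEL_CNF_RCVD,
	MODEL_ESTAB,
	MODEL_HOLDING,
};

/* What the handler decides to do with a timer that has just fired. */
enum model_verdict {
	MODEL_SKIP_ADJUSTED,	/* timer was re-armed; wait for the new expiry */
	MODEL_SKIP_DELETED,	/* state has no timer; del_timer() raced with us */
	MODEL_RUN,		/* genuine timeout for the current state */
};

/*
 * Wraparound-safe "a is before b", in the style of the kernel's
 * time_before(). Relies on the usual two's-complement conversion
 * from unsigned long to long.
 */
static bool model_time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Same order of checks as the patched handler: expiry first, then state. */
static enum model_verdict model_timer_verdict(unsigned long now,
					      unsigned long expires,
					      enum model_plink_state state)
{
	if (model_time_before(now, expires))
		return MODEL_SKIP_ADJUSTED;

	if (state == MODEL_LISTEN || state == MODEL_ESTAB)
		return MODEL_SKIP_DELETED;

	return MODEL_RUN;
}

/* Small usage example covering the three outcomes. */
static void model_self_check(void)
{
	/* Timeout was extended to tick 150 while the handler waited. */
	assert(model_timer_verdict(100, 150, MODEL_CNF_RCVD) ==
	       MODEL_SKIP_ADJUSTED);
	/* Expiry reached, but the new state never arms a timer. */
	assert(model_timer_verdict(200, 150, MODEL_ESTAB) ==
	       MODEL_SKIP_DELETED);
	/* Expiry reached in a state that does use the timer. */
	assert(model_timer_verdict(150, 150, MODEL_HOLDING) == MODEL_RUN);
}
```

Calling model_self_check() from a small test program exercises all three outcomes. The point of the sketch is that the verdict depends only on values read under the lock (the current time, the stored expiry, and the state), so no extra flag has to be kept in sync with the timer.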



* Re: [PATCH] mac80211: remove ignore_plink_timer flag
From: Johannes Berg @ 2014-06-04 14:08 UTC
  To: Bob Copeland; +Cc: linux-wireless

On Wed, 2014-06-04 at 09:27 -0400, Bob Copeland wrote:
> The mesh_plink code is doing some interesting things with the
> ignore_plink_timer flag.  It seems the original intent was to
> handle this race:
> 
> cpu 0                           cpu 1
> -----                           -----
>                                 start timer handler for state X
> acquire sta_lock
> change state from X to Y
> mod_timer() / del_timer()
> release sta_lock
>                                 acquire sta_lock
>                                 execute state Y timer too soon
> 
> However, using the mod_timer()/del_timer() return values to
> detect these cases is broken.  As a result, timers get ignored
> unnecessarily, and stations can get stuck in the peering state
> machine.
> 
> Instead, we can detect the case by looking at the timer expiration.
> In the case of del_timer, just ignore the timers in the following
> (LISTEN/ESTAB) states since they won't have timers anyway.

I'm not entirely sure about the expiration thing - doesn't seem
different from the outside flag? But anyway - applied.

johannes



* Re: [PATCH] mac80211: remove ignore_plink_timer flag
From: Bob Copeland @ 2014-06-04 14:53 UTC
  To: Johannes Berg; +Cc: linux-wireless

On Wed, Jun 04, 2014 at 04:08:28PM +0200, Johannes Berg wrote:
> On Wed, 2014-06-04 at 09:27 -0400, Bob Copeland wrote:
> > However, using the mod_timer()/del_timer() return values to
> > detect these cases is broken.  As a result, timers get ignored
> > unnecessarily, and stations can get stuck in the peering state
> > machine.
> > 
> > Instead, we can detect the case by looking at the timer expiration.
> > In the case of del_timer, just ignore the timers in the following
> > (LISTEN/ESTAB) states since they won't have timers anyway.
> 
> I'm not entirely sure about the expiration thing - doesn't seem
> different from the outside flag? But anyway - applied.

Happy to be enlightened -- for what it's worth this is my reasoning:

My understanding of mod_timer() return value is that it just means
the timer was scheduled, not actually running, and the original code
assumed the latter.  So let's say the timer is already scheduled to
happen 5 seconds from now, the old code would do:

lock
/* changing state, timer now reflects a different state timeout */
rv = mod_timer(10 secs from now);
if (rv) /* timer was already scheduled */
   ignore_plink_timer = 1;
unlock

So when the timer runs in 10 seconds it would skip execution even though
the first handler hadn't executed yet.  It got the race case right, but
not the race-free case.
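
The race-free scenario described above can be replayed in a small user-space model. This is only a sketch under stated assumptions: every model_* name is hypothetical, time is counted in abstract ticks rather than jiffies, and model_mod_timer() imitates just one property of the kernel's mod_timer(), namely that it returns nonzero when the timer was already pending. The old-style state change follows the pseudocode in this message, not the exact removed kernel lines.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a kernel timer: pending state plus expiry. */
struct model_timer {
	bool pending;
	unsigned long expires;
};

struct model_sta {
	struct model_timer timer;
	bool ignore_flag;	/* models the removed ignore_plink_timer */
};

/* Re-arm the timer; return 1 if it was already pending, else 0. */
static int model_mod_timer(struct model_timer *t, unsigned long expires)
{
	int was_pending;

	assert(t);
	was_pending = t->pending;
	t->pending = true;
	t->expires = expires;
	return was_pending;
}

/* Old approach, as in the pseudocode above: flag set from the return value. */
static void model_old_state_change(struct model_sta *sta,
				   unsigned long new_expires)
{
	if (model_mod_timer(&sta->timer, new_expires))
		sta->ignore_flag = true;
}

/* Old handler: returns true if the timeout was actually processed. */
static bool model_old_handler(struct model_sta *sta)
{
	sta->timer.pending = false;
	if (sta->ignore_flag) {
		sta->ignore_flag = false;
		return false;	/* timeout silently dropped */
	}
	return true;
}

/* New handler: process the timeout only once the stored expiry is reached. */
static bool model_new_handler(struct model_sta *sta, unsigned long now)
{
	sta->timer.pending = false;
	return (long)(now - sta->timer.expires) >= 0;
}

/* Timer pending for tick 5, state change re-arms it to 10, it fires at 10. */
static bool model_race_free_old(void)
{
	struct model_sta sta = { .timer = { .pending = true, .expires = 5 } };

	model_old_state_change(&sta, 10);
	return model_old_handler(&sta);
}

static bool model_race_free_new(void)
{
	struct model_sta sta = { .timer = { .pending = true, .expires = 5 } };

	model_mod_timer(&sta.timer, 10);
	return model_new_handler(&sta, 10);
}

/* Usage example: the flag drops the timeout, the expiry check keeps it. */
static void model_self_check(void)
{
	assert(!model_race_free_old());
	assert(model_race_free_new());
}
```

In this model no handler ever overlaps with the state change, yet the flag-based variant still discards the only timeout it will get, which is how a station could end up stuck. The expiry-based variant processes the same timeout because the handler runs at or after the stored expiry.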

-- 
Bob Copeland %% www.bobcopeland.com
