All of lore.kernel.org
 help / color / mirror / Atom feed
* [REGRESSION] mac80211: IBSS vif queue stopped when started after 11s vif
@ 2017-03-22  9:15 Sven Eckelmann
  2017-03-29  7:49 ` Johannes Berg
  0 siblings, 1 reply; 6+ messages in thread
From: Sven Eckelmann @ 2017-03-22  9:15 UTC (permalink / raw)
  To: Johannes Berg, linux-wireless; +Cc: netdev

[-- Attachment #1: Type: text/plain, Size: 1705 bytes --]

Hi,

I had following "simple" setup with LEDE with a single ath9k phy and multiple 
vif:

 * encrypted AP
 * encrypted 802.11s meshpoint
 * encrypted IBSS

Everything was started in the order by hostapd/wpa_supplicant (but immediately 
after each other).

The problem which I've experienced was that IBSS was not able to communicate 
with its link partner. The reason for that problem was that the IBSS 
interface's queue was stopped (QUEUE_STATE_DRV_XOFF). This problem disappeared 
when either the IBSS or meshpoint interface was changed to unencrypted (which 
disables wpa_supplicant in LEDE).

It looks like ieee80211_do_open didn't start the queues via 
netif_start_subqueue because local->queue_stop_reasons was for all queues set 
to IEEE80211_QUEUE_STOP_REASON_OFFCHANNEL. This happened because 
ieee80211_offchannel_stop_vifs was called from somewhere in the scan code at 
that time and ieee80211_offchannel_return was not yet called.


This behavior seems to be introduced by 2b730daacee6 ("mac80211: don't start 
new netdev queues if driver stopped"). I have therefore chosen to call it for 
now a regression by this change. Especially because it is rather odd that the 
commit talked about not starting the queues for AP_VLAN and 2b436312f091 
("mac80211: fix queue handling crash") introduced extra code to use the old 
behavior again for AP_VLAN.

But I could be completely wrong about it. It would therefore be interesting 
for me to know who would be responsible to start the queues when 
ieee80211_do_open rejected it for IBSS.

I am now simply using this setup with a revert of 
2b436312f0919c05804fed5aa4b7f255db196e7a and 
2b730daacee6c318bce7b6373c19909e36a74590.

Kind regards,
	Sven

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [REGRESSION] mac80211: IBSS vif queue stopped when started after 11s vif
  2017-03-22  9:15 [REGRESSION] mac80211: IBSS vif queue stopped when started after 11s vif Sven Eckelmann
@ 2017-03-29  7:49 ` Johannes Berg
  2017-03-29 11:07   ` Sven Eckelmann
  0 siblings, 1 reply; 6+ messages in thread
From: Johannes Berg @ 2017-03-29  7:49 UTC (permalink / raw)
  To: Sven Eckelmann, linux-wireless; +Cc: netdev

Hi Sven,


> But I could be completely wrong about it. It would therefore be
> interesting for me to know who would be responsible to start the
> queues when ieee80211_do_open rejected it for IBSS.

Well, once ieee80211_offchannel_return() is called, that should do the
needful and end up in ieee80211_propagate_queue_wake().

Can you check what the IBSS vif's queues are (vif->hw_queue[...])?

However, I also don't understand the difference between encrypted and
unencrypted here.

johannes

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [REGRESSION] mac80211: IBSS vif queue stopped when started after 11s vif
  2017-03-29  7:49 ` Johannes Berg
@ 2017-03-29 11:07   ` Sven Eckelmann
  2017-03-29 11:53       ` Johannes Berg
  0 siblings, 1 reply; 6+ messages in thread
From: Sven Eckelmann @ 2017-03-29 11:07 UTC (permalink / raw)
  To: Johannes Berg; +Cc: linux-wireless, netdev, Michal Kazior


[-- Attachment #1.1: Type: text/plain, Size: 3245 bytes --]

On Mittwoch, 29. März 2017 09:49:21 CEST Johannes Berg wrote:
> > But I could be completely wrong about it. It would therefore be
> > interesting for me to know who would be responsible to start the
> > queues when ieee80211_do_open rejected it for IBSS.
> 
> Well, once ieee80211_offchannel_return() is called, that should do the
> needful and end up in ieee80211_propagate_queue_wake().
> 
> Can you check what the IBSS vif's queues are (vif->hw_queue[...])?

I've just dumped the data in ieee80211_propagate_queue_wake and checked when 
the function returns. The test patch (sorry, really ugly debug printk stuff) 
is attached. The most interesting part is that

    if (local->ops->wake_tx_queue)
    		return;

evaluates to true. The rest rest of the function is therefore always skipped 
for ath9k.

This was noticed when looking at the debug output:

    root@lede:/# dmesg|grep ieee80211_propagate_queue_wake
    [   20.865005] ieee80211_propagate_queue_wake:248 queue 00000000
    [   20.870839] ieee80211_propagate_queue_wake:248 queue 00000001
    [   20.876661] ieee80211_propagate_queue_wake:248 queue 00000002
    [   20.882487] ieee80211_propagate_queue_wake:248 queue 00000003
    [   21.794795] ieee80211_propagate_queue_wake:248 queue 00000000
    [   21.800629] ieee80211_propagate_queue_wake:248 queue 00000001
    [   21.806452] ieee80211_propagate_queue_wake:248 queue 00000002
    [   21.812278] ieee80211_propagate_queue_wake:248 queue 00000003
    [   21.830078] ieee80211_propagate_queue_wake:248 queue 00000000
    [   21.835918] ieee80211_propagate_queue_wake:248 queue 00000001
    [   21.841740] ieee80211_propagate_queue_wake:248 queue 00000002
    [   21.847566] ieee80211_propagate_queue_wake:248 queue 00000003
    [   23.320814] ieee80211_propagate_queue_wake:248 queue 00000000
    [   23.326643] ieee80211_propagate_queue_wake:248 queue 00000001
    [   23.332469] ieee80211_propagate_queue_wake:248 queue 00000002
    [   23.338294] ieee80211_propagate_queue_wake:248 queue 00000003
    [   41.930942] ieee80211_propagate_queue_wake:248 queue 00000000
    [   41.940709] ieee80211_propagate_queue_wake:248 queue 00000002
    [   46.949087] ieee80211_propagate_queue_wake:248 queue 00000000
    [   82.999021] ieee80211_propagate_queue_wake:248 queue 00000000

Removing this is enough to fix the problem. And now you will propably say 
"hey, this is not my code". And this is the reason why I have now CC'ed the 
author of 80a83cfc434b ("mac80211: skip netdev queue control with software 
queuing"). This change in ieee80211_propagate_queue_wake is basically breaking 
the (delayed) startup of the ibss netdev queue [1] when the device was offchan 
during the ieee80211_do_open of the ibss interface.

Not sure whether removing it in ieee80211_propagate_queue_wake will have other 
odd side effects with software queuing. Maybe Michal Kazior can tell us if it 
is safe to remove it.

> However, I also don't understand the difference between encrypted and
> unencrypted here.

My best guess is timing. LEDE is not using wpa_supplicant when encryption is 
disabled.

Kind regards,
	Sven

[1] https://lkml.kernel.org/r/1978424.XTv2Qph05K@bentobox

[-- Attachment #1.2: 999-test.patch --]
[-- Type: text/x-patch, Size: 5230 bytes --]

diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index 036fa1d..9a1079f 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -517,6 +517,10 @@ int ieee80211_do_open(struct wireless_dev *wdev, bool coming_up)
 	u32 changed = 0;
 	int res;
 	u32 hw_reconf_flags = 0;
+	const char *ifname = "unknown";
+
+	if (sdata->dev)
+		ifname = sdata->dev->name;
 
 	switch (sdata->vif.type) {
 	case NL80211_IFTYPE_WDS:
@@ -745,11 +749,14 @@ int ieee80211_do_open(struct wireless_dev *wdev, bool coming_up)
 	if (sdata->vif.type == NL80211_IFTYPE_MONITOR ||
 	    sdata->vif.type == NL80211_IFTYPE_AP_VLAN) {
 		/* XXX: for AP_VLAN, actually track AP queues */
+		
+		printk("%s:%u netif_tx_start_all_queues %s\n", __func__, __LINE__, ifname);
 		netif_tx_start_all_queues(dev);
 	} else if (dev) {
 		unsigned long flags;
 		int n_acs = IEEE80211_NUM_ACS;
 		int ac;
+		int started = 0;
 
 		if (local->hw.queues < IEEE80211_NUM_ACS)
 			n_acs = 1;
@@ -762,11 +769,20 @@ int ieee80211_do_open(struct wireless_dev *wdev, bool coming_up)
 				int ac_queue = sdata->vif.hw_queue[ac];
 
 				if (local->queue_stop_reasons[ac_queue] == 0 &&
-				    skb_queue_empty(&local->pending[ac_queue]))
+				    skb_queue_empty(&local->pending[ac_queue])) {
+		//printk("%s:%u netif_start_subqueue type %u %s\n", __func__, __LINE__, sdata->vif.type, ifname);
 					netif_start_subqueue(dev, ac);
+					started = 1;
+				} else {
+		printk("%s:%u NOT netif_start_subqueue type %u stop_reasons %d queue_empty %d %s\n", __func__, __LINE__, sdata->vif.type, local->queue_stop_reasons[ac_queue], skb_queue_empty(&local->pending[ac_queue]), ifname);
+				}
 			}
+		} else {
+			printk("%s:%u NOT netif_start_subqueue type %u cab_queue %d stop_reasons %d queue_empty %d %s\n", __func__, __LINE__, sdata->vif.type, sdata->vif.cab_queue, local->queue_stop_reasons[sdata->vif.cab_queue], skb_queue_empty(&local->pending[sdata->vif.cab_queue]), ifname);
+
 		}
 		spin_unlock_irqrestore(&local->queue_stop_reason_lock, flags);
+		WARN_ON(started);
 	}
 
 	return 0;
@@ -816,6 +832,10 @@ static void ieee80211_do_stop(struct ieee80211_sub_if_data *sdata,
 	struct cfg80211_chan_def chandef;
 	bool cancel_scan;
 	struct cfg80211_nan_func *func;
+	const char *ifname = "unknown";
+
+	if (sdata->dev)
+		ifname = sdata->dev->name;
 
 	clear_bit(SDATA_STATE_RUNNING, &sdata->state);
 
@@ -826,8 +846,10 @@ static void ieee80211_do_stop(struct ieee80211_sub_if_data *sdata,
 	/*
 	 * Stop TX on this interface first.
 	 */
-	if (sdata->dev)
+	if (sdata->dev) {
+		printk("%s:%u netif_tx_stop_all_queues %s\n", __func__, __LINE__, ifname);
 		netif_tx_stop_all_queues(sdata->dev);
+	}
 
 	ieee80211_roc_purge(local, sdata);
 
diff --git a/net/mac80211/offchannel.c b/net/mac80211/offchannel.c
index fc2a601..6681c5c 100644
--- a/net/mac80211/offchannel.c
+++ b/net/mac80211/offchannel.c
@@ -118,6 +118,7 @@ void ieee80211_offchannel_stop_vifs(struct ieee80211_local *local)
 	 * Stop queues and transmit all frames queued by the driver
 	 * before sending nullfunc to enable powersave at the AP.
 	 */
+	printk("%s:%u ieee80211_stop_queues_by_reason\n", __func__, __LINE__);
 	ieee80211_stop_queues_by_reason(&local->hw, IEEE80211_MAX_QUEUE_MAP,
 					IEEE80211_QUEUE_STOP_REASON_OFFCHANNEL,
 					false);
@@ -183,6 +184,7 @@ void ieee80211_offchannel_return(struct ieee80211_local *local)
 	}
 	mutex_unlock(&local->iflist_mtx);
 
+	printk("%s:%u ieee80211_offchannel_return\n", __func__, __LINE__);
 	ieee80211_wake_queues_by_reason(&local->hw, IEEE80211_MAX_QUEUE_MAP,
 					IEEE80211_QUEUE_STOP_REASON_OFFCHANNEL,
 					false);
diff --git a/net/mac80211/util.c b/net/mac80211/util.c
index 73069f9..8bb0853 100644
--- a/net/mac80211/util.c
+++ b/net/mac80211/util.c
@@ -243,10 +243,15 @@ void ieee80211_propagate_queue_wake(struct ieee80211_local *local, int queue)
 {
 	struct ieee80211_sub_if_data *sdata;
 	int n_acs = IEEE80211_NUM_ACS;
+	const char *ifname = "unknown";
+
+	printk("%s:%u queue %08x\n", __func__, __LINE__, queue);
 
 	if (local->ops->wake_tx_queue)
 		return;
 
+	printk("%s:%u queue %08x\n", __func__, __LINE__, queue);
+
 	if (local->hw.queues < IEEE80211_NUM_ACS)
 		n_acs = 1;
 
@@ -256,13 +261,24 @@ void ieee80211_propagate_queue_wake(struct ieee80211_local *local, int queue)
 		if (!sdata->dev)
 			continue;
 
+		ifname = sdata->dev->name;
+
+		printk("%s:%u %s queue %08x\n", __func__, __LINE__, ifname, queue);
+
 		if (sdata->vif.cab_queue != IEEE80211_INVAL_HW_QUEUE &&
 		    local->queue_stop_reasons[sdata->vif.cab_queue] != 0)
 			continue;
 
+
+		printk("%s:%u %s\n", __func__, __LINE__, ifname);
+
 		for (ac = 0; ac < n_acs; ac++) {
 			int ac_queue = sdata->vif.hw_queue[ac];
 
+			printk("%s:%u %s queue %08x sdata->vif.hw_queue[ac] %08x\n", __func__, __LINE__, ifname, queue, ac_queue);
+			printk("%s:%u %s queue %08x local->queue_stop_reasons[ac_queue] %08x\n", __func__, __LINE__, ifname, queue, local->queue_stop_reasons[ac_queue]);
+			printk("%s:%u %s queue %08x skb_queue_empty(...) %08x\n", __func__, __LINE__, ifname, queue, skb_queue_empty(&local->pending[ac_queue]));
+
 			if (ac_queue == queue ||
 			    (sdata->vif.cab_queue == queue &&
 			     local->queue_stop_reasons[ac_queue] == 0 &&

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [REGRESSION] mac80211: IBSS vif queue stopped when started after 11s vif
  2017-03-29 11:07   ` Sven Eckelmann
@ 2017-03-29 11:53       ` Johannes Berg
  0 siblings, 0 replies; 6+ messages in thread
From: Johannes Berg @ 2017-03-29 11:53 UTC (permalink / raw)
  To: Sven Eckelmann; +Cc: linux-wireless, netdev, Michal Kazior


>     if (local->ops->wake_tx_queue)
>     		return;
> 
> evaluates to true. The rest rest of the function is therefore always
> skipped for ath9k.

Ahh, yes, ok.

> Removing this is enough to fix the problem. And now you will propably
> say "hey, this is not my code". And this is the reason why I have now
> CC'ed the author of 80a83cfc434b ("mac80211: skip netdev queue
> control with software queuing"). This change in
> ieee80211_propagate_queue_wake is basically breaking 
> the (delayed) startup of the ibss netdev queue [1] when the device
> was offchan during the ieee80211_do_open of the ibss interface.
> 
> Not sure whether removing it in ieee80211_propagate_queue_wake will
> have other odd side effects with software queuing. Maybe Michal
> Kazior can tell us if it is safe to remove it.

No, it's the other way around.

Michal's patches correctly added a test for this to
__ieee80211_stop_queue(), the only missing thing is that this test
should also be in ieee80211_do_open() like this:

diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index 40813dd3301c..5bb0c5012819 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -718,7 +718,8 @@ int ieee80211_do_open(struct wireless_dev *wdev, bool coming_up)
 	ieee80211_recalc_ps(local);
 
 	if (sdata->vif.type == NL80211_IFTYPE_MONITOR ||
-	    sdata->vif.type == NL80211_IFTYPE_AP_VLAN) {
+	    sdata->vif.type == NL80211_IFTYPE_AP_VLAN ||
+	    local->ops->wake_tx_queue) {
 		/* XXX: for AP_VLAN, actually track AP queues */
 		netif_tx_start_all_queues(dev);
 	} else if (dev) {

johannes

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [REGRESSION] mac80211: IBSS vif queue stopped when started after 11s vif
@ 2017-03-29 11:53       ` Johannes Berg
  0 siblings, 0 replies; 6+ messages in thread
From: Johannes Berg @ 2017-03-29 11:53 UTC (permalink / raw)
  To: Sven Eckelmann
  Cc: linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, Michal Kazior


>     if (local->ops->wake_tx_queue)
>     		return;
> 
> evaluates to true. The rest rest of the function is therefore always
> skipped for ath9k.

Ahh, yes, ok.

> Removing this is enough to fix the problem. And now you will propably
> say "hey, this is not my code". And this is the reason why I have now
> CC'ed the author of 80a83cfc434b ("mac80211: skip netdev queue
> control with software queuing"). This change in
> ieee80211_propagate_queue_wake is basically breaking 
> the (delayed) startup of the ibss netdev queue [1] when the device
> was offchan during the ieee80211_do_open of the ibss interface.
> 
> Not sure whether removing it in ieee80211_propagate_queue_wake will
> have other odd side effects with software queuing. Maybe Michal
> Kazior can tell us if it is safe to remove it.

No, it's the other way around.

Michal's patches correctly added a test for this to
__ieee80211_stop_queue(), the only missing thing is that this test
should also be in ieee80211_do_open() like this:

diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index 40813dd3301c..5bb0c5012819 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -718,7 +718,8 @@ int ieee80211_do_open(struct wireless_dev *wdev, bool coming_up)
 	ieee80211_recalc_ps(local);
 
 	if (sdata->vif.type == NL80211_IFTYPE_MONITOR ||
-	    sdata->vif.type == NL80211_IFTYPE_AP_VLAN) {
+	    sdata->vif.type == NL80211_IFTYPE_AP_VLAN ||
+	    local->ops->wake_tx_queue) {
 		/* XXX: for AP_VLAN, actually track AP queues */
 		netif_tx_start_all_queues(dev);
 	} else if (dev) {

johannes

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [REGRESSION] mac80211: IBSS vif queue stopped when started after 11s vif
  2017-03-29 11:53       ` Johannes Berg
  (?)
@ 2017-03-29 12:11       ` Sven Eckelmann
  -1 siblings, 0 replies; 6+ messages in thread
From: Sven Eckelmann @ 2017-03-29 12:11 UTC (permalink / raw)
  To: Johannes Berg; +Cc: linux-wireless, netdev, Michal Kazior

[-- Attachment #1: Type: text/plain, Size: 1139 bytes --]

On Mittwoch, 29. März 2017 13:53:06 CEST Johannes Berg wrote:
[...]
> > Not sure whether removing it in ieee80211_propagate_queue_wake will
> > have other odd side effects with software queuing. Maybe Michal
> > Kazior can tell us if it is safe to remove it.
> 
> No, it's the other way around.
> 
> Michal's patches correctly added a test for this to
> __ieee80211_stop_queue(), the only missing thing is that this test
> should also be in ieee80211_do_open() like this:
> 
> diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
> index 40813dd3301c..5bb0c5012819 100644
> --- a/net/mac80211/iface.c
> +++ b/net/mac80211/iface.c
> @@ -718,7 +718,8 @@ int ieee80211_do_open(struct wireless_dev *wdev, bool coming_up)
>  	ieee80211_recalc_ps(local);
>  
>  	if (sdata->vif.type == NL80211_IFTYPE_MONITOR ||
> -	    sdata->vif.type == NL80211_IFTYPE_AP_VLAN) {
> +	    sdata->vif.type == NL80211_IFTYPE_AP_VLAN ||
> +	    local->ops->wake_tx_queue) {
>  		/* XXX: for AP_VLAN, actually track AP queues */
>  		netif_tx_start_all_queues(dev);
>  	} else if (dev) {

Yes, this also works.

Kind regards,
	Sven

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-03-29 12:11 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-22  9:15 [REGRESSION] mac80211: IBSS vif queue stopped when started after 11s vif Sven Eckelmann
2017-03-29  7:49 ` Johannes Berg
2017-03-29 11:07   ` Sven Eckelmann
2017-03-29 11:53     ` Johannes Berg
2017-03-29 11:53       ` Johannes Berg
2017-03-29 12:11       ` Sven Eckelmann

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.