All of lore.kernel.org
 help / color / mirror / Atom feed
* 802.11n IBSS: wlan0 stops receiving packets due to aggregation after sender reboot
@ 2019-10-21 12:11 Krzysztof Hałasa
  2019-10-21 12:18 ` [PATCH] " Krzysztof Hałasa
                   ` (7 more replies)
  0 siblings, 8 replies; 24+ messages in thread
From: Krzysztof Hałasa @ 2019-10-21 12:11 UTC (permalink / raw)
  To: Johannes Berg; +Cc: David S. Miller, linux-wireless, netdev, linux-kernel

Johannes,

it seems I've encountered a bug in mac80211 RX aggregation handler.
The hw is a pair of stations using AR9580 (PCI ID 168c:0033) PCIe
adapters. Linux 5.4-rc4.
The driver shows the chip is Atheros AR9300 Rev:4.
I'm using (on both ends):
	iw wlan0 set type ibss
	ip link set wlan0 up
	iw dev wlan0 ibss join $ESSID $FREQ HT20

The problem manifests itself after one of the stations is restarted
(or the ath9k driver is reloaded, or a station is out of range for
some time etc).
It appears that the mac80211 RX aggregation code sets a new aggregation
"session" at the remote station's request, but the head_seq_num
(the sequence number the receiver expects to receive) isn't reset.
I've added some debugging code to ___ieee80211_start_rx_ba_session()
and ieee80211_sta_manage_reorder_buf() and it produced the following:

Both stations boot and join the IBSS, packets get through:
[   61.123131] AGG RX OK: ssn 1
[   61.125346] SEQ OK: 1 vs 1
[   61.125484] SEQ OK: 2 vs 2
[   62.100841] SEQ OK: 3 vs 3
...
[  180.124210] SEQ OK: 130 vs 130
[  181.123888] SEQ OK: 131 vs 131
[  182.126046] SEQ OK: 132 vs 132

Now I'm rebooting the remote station. It joins IBSS, packets can be seen
on mon0 monitoring interface (on the local station), but they aren't
arriving on wlan0:

[  192.131102] SEQ BAD: 0 vs 133
[  192.151243] AGG RX no change - OK: ssn 1
[  192.242760] SEQ BAD: 1 vs 133
[  193.133819] SEQ BAD: 2 vs 133
[  193.272802] SEQ BAD: 3 vs 133
...
[  421.272374] SEQ BAD: 130 vs 133
[  421.303630] SEQ BAD: 131 vs 133
[  422.327924] SEQ BAD: 132 vs 133

Then the sequence number catches up and the communication is
reestablished:
[  423.167023] SEQ OK: 133 vs 133
[  423.169061] SEQ OK: 134 vs 134
[  423.351618] SEQ OK: 135 vs 135

I'll attach a patch in a separate mail but I'm not sure if it's
the optimal fix - one packet (the "SEQ BAD: 0 vs 133) is still dropped,
and I guess it won't work if the sender decides to not request
aggregation anymore.

Comments?

The debugging code:
--- a/net/mac80211/agg-rx.c
+++ b/net/mac80211/agg-rx.c
@@ -354,9 +354,10 @@ void ___ieee80211_start_rx_ba_session(struct sta_info *sta,
 			 */
 			rcu_read_lock();
 			tid_rx = rcu_dereference(sta->ampdu_mlme.tid_rx[tid]);
-			if (tid_rx && tid_rx->timeout == timeout)
+			if (tid_rx && tid_rx->timeout == timeout) {
 				status = WLAN_STATUS_SUCCESS;
-			else
+				printk(KERN_DEBUG "AGG RX no change - OK: ssn %u\n", start_seq_num);
+			} else
 				status = WLAN_STATUS_REQUEST_DECLINED;
 			rcu_read_unlock();
 			goto end;
@@ -434,6 +437,7 @@ void ___ieee80211_start_rx_ba_session(struct sta_info *sta,
 	tid_agg_rx->tid = tid;
 	tid_agg_rx->sta = sta;
 	status = WLAN_STATUS_SUCCESS;
+	printk(KERN_DEBUG "AGG RX OK: ssn %u\n", start_seq_num);
 
 	/* activate it for RX */
 	rcu_assign_pointer(sta->ampdu_mlme.tid_rx[tid], tid_agg_rx);
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -1298,9 +1298,11 @@ static bool ieee80211_sta_manage_reorder_buf(struct ieee80211_sub_if_data *sdata
 
 	/* frame with out of date sequence number */
 	if (ieee80211_sn_less(mpdu_seq_num, head_seq_num)) {
+		printk(KERN_DEBUG "SEQ BAD: %u vs %u\n", mpdu_seq_num, head_seq_num);
 		dev_kfree_skb(skb);
 		goto out;
-	}
+	} else
+		printk(KERN_DEBUG "SEQ OK: %u vs %u\n", mpdu_seq_num, head_seq_num);
 
 	/*
 	 * If frame the sequence number exceeds our buffering window

-- 
Krzysztof Hałasa

ŁUKASIEWICZ Research Network
Industrial Research Institute for Automation and Measurements PIAP
Al. Jerozolimskie 202, 02-486 Warsaw, Poland

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH] 802.11n IBSS: wlan0 stops receiving packets due to aggregation after sender reboot
  2019-10-21 12:11 802.11n IBSS: wlan0 stops receiving packets due to aggregation after sender reboot Krzysztof Hałasa
  2019-10-21 12:18 ` [PATCH] " Krzysztof Hałasa
@ 2019-10-21 12:18 ` Krzysztof Hałasa
  2019-10-25 10:21 ` [PATCH v2] " Krzysztof Hałasa
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 24+ messages in thread
From: Krzysztof Hałasa @ 2019-10-21 12:18 UTC (permalink / raw)
  To: Johannes Berg; +Cc: David S. Miller, linux-wireless, netdev, linux-kernel

Fix a bug where the mac80211 RX aggregation code sets a new aggregation
"session" at the remote station's request, but the head_seq_num
(the sequence number the receiver expects to receive) isn't reset.

Spotted on a pair of AR9580 in IBSS mode.

diff --git a/net/mac80211/agg-rx.c b/net/mac80211/agg-rx.c
index 4d1c335e06e5..775a51cc51c9 100644
--- a/net/mac80211/agg-rx.c
+++ b/net/mac80211/agg-rx.c
@@ -354,9 +354,11 @@ void ___ieee80211_start_rx_ba_session(struct sta_info *sta,
 			 */
 			rcu_read_lock();
 			tid_rx = rcu_dereference(sta->ampdu_mlme.tid_rx[tid]);
-			if (tid_rx && tid_rx->timeout == timeout)
+			if (tid_rx && tid_rx->timeout == timeout) {
+				tid_rx->ssn = start_seq_num;
+				tid_rx->head_seq_num = start_seq_num;
 				status = WLAN_STATUS_SUCCESS;
-			else
+			} else
 				status = WLAN_STATUS_REQUEST_DECLINED;
 			rcu_read_unlock();
 			goto end;

-- 
Krzysztof Hałasa

ŁUKASIEWICZ Research Network
Industrial Research Institute for Automation and Measurements PIAP
Al. Jerozolimskie 202, 02-486 Warsaw, Poland

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH] 802.11n IBSS: wlan0 stops receiving packets due to aggregation after sender reboot
  2019-10-21 12:11 802.11n IBSS: wlan0 stops receiving packets due to aggregation after sender reboot Krzysztof Hałasa
@ 2019-10-21 12:18 ` Krzysztof Hałasa
  2019-10-22  9:42   ` Sergei Shtylyov
  2019-10-21 12:18 ` Krzysztof Hałasa
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 24+ messages in thread
From: Krzysztof Hałasa @ 2019-10-21 12:18 UTC (permalink / raw)
  To: Johannes Berg; +Cc: David S. Miller, linux-wireless, netdev, linux-kernel

Fix a bug where the mac80211 RX aggregation code sets a new aggregation
"session" at the remote station's request, but the head_seq_num
(the sequence number the receiver expects to receive) isn't reset.

Spotted on a pair of AR9580 in IBSS mode.

Signed-off-by: Krzysztof Halasa <khalasa@piap.pl>

diff --git a/net/mac80211/agg-rx.c b/net/mac80211/agg-rx.c
index 4d1c335e06e5..775a51cc51c9 100644
--- a/net/mac80211/agg-rx.c
+++ b/net/mac80211/agg-rx.c
@@ -354,9 +354,11 @@ void ___ieee80211_start_rx_ba_session(struct sta_info *sta,
 			 */
 			rcu_read_lock();
 			tid_rx = rcu_dereference(sta->ampdu_mlme.tid_rx[tid]);
-			if (tid_rx && tid_rx->timeout == timeout)
+			if (tid_rx && tid_rx->timeout == timeout) {
+				tid_rx->ssn = start_seq_num;
+				tid_rx->head_seq_num = start_seq_num;
 				status = WLAN_STATUS_SUCCESS;
-			else
+			} else
 				status = WLAN_STATUS_REQUEST_DECLINED;
 			rcu_read_unlock();
 			goto end;

-- 
Krzysztof Hałasa

ŁUKASIEWICZ Research Network
Industrial Research Institute for Automation and Measurements PIAP
Al. Jerozolimskie 202, 02-486 Warsaw, Poland

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] 802.11n IBSS: wlan0 stops receiving packets due to aggregation after sender reboot
  2019-10-21 12:18 ` [PATCH] " Krzysztof Hałasa
@ 2019-10-22  9:42   ` Sergei Shtylyov
  0 siblings, 0 replies; 24+ messages in thread
From: Sergei Shtylyov @ 2019-10-22  9:42 UTC (permalink / raw)
  To: Krzysztof Hałasa, Johannes Berg
  Cc: David S. Miller, linux-wireless, netdev, linux-kernel

Hello!

On 21.10.2019 15:18, Krzysztof Hałasa wrote:

> Fix a bug where the mac80211 RX aggregation code sets a new aggregation
> "session" at the remote station's request, but the head_seq_num
> (the sequence number the receiver expects to receive) isn't reset.
> 
> Spotted on a pair of AR9580 in IBSS mode.
> 
> Signed-off-by: Krzysztof Halasa <khalasa@piap.pl>
> 
> diff --git a/net/mac80211/agg-rx.c b/net/mac80211/agg-rx.c
> index 4d1c335e06e5..775a51cc51c9 100644
> --- a/net/mac80211/agg-rx.c
> +++ b/net/mac80211/agg-rx.c
> @@ -354,9 +354,11 @@ void ___ieee80211_start_rx_ba_session(struct sta_info *sta,
>   			 */
>   			rcu_read_lock();
>   			tid_rx = rcu_dereference(sta->ampdu_mlme.tid_rx[tid]);
> -			if (tid_rx && tid_rx->timeout == timeout)
> +			if (tid_rx && tid_rx->timeout == timeout) {
> +				tid_rx->ssn = start_seq_num;
> +				tid_rx->head_seq_num = start_seq_num;
>   				status = WLAN_STATUS_SUCCESS;
> -			else
> +			} else

    If you add {} on one branch of *if*, you also need to add {} to all other 
branches, says CodingStyle...

[...]

MBR, Sergei

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v2] 802.11n IBSS: wlan0 stops receiving packets due to aggregation after sender reboot
  2019-10-21 12:11 802.11n IBSS: wlan0 stops receiving packets due to aggregation after sender reboot Krzysztof Hałasa
  2019-10-21 12:18 ` [PATCH] " Krzysztof Hałasa
  2019-10-21 12:18 ` Krzysztof Hałasa
@ 2019-10-25 10:21 ` Krzysztof Hałasa
  2019-10-28 12:21   ` Johannes Berg
  2019-12-09 10:28 ` [RFC] Allow userspace to reset IBSS stations to fix aggregation issue Nicolas Cavallari
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 24+ messages in thread
From: Krzysztof Hałasa @ 2019-10-25 10:21 UTC (permalink / raw)
  To: Johannes Berg; +Cc: David S. Miller, linux-wireless, netdev, linux-kernel

Fix a bug where the mac80211 RX aggregation code sets a new aggregation
"session" at the remote station's request, but the head_seq_num
(the sequence number the receiver expects to receive) isn't reset.

Spotted on a pair of AR9580 in IBSS mode.

Signed-off-by: Krzysztof Halasa <khalasa@piap.pl>

diff --git a/net/mac80211/agg-rx.c b/net/mac80211/agg-rx.c
index 4d1c335e06e5..67733bd61297 100644
--- a/net/mac80211/agg-rx.c
+++ b/net/mac80211/agg-rx.c
@@ -354,10 +354,13 @@ void ___ieee80211_start_rx_ba_session(struct sta_info *sta,
 			 */
 			rcu_read_lock();
 			tid_rx = rcu_dereference(sta->ampdu_mlme.tid_rx[tid]);
-			if (tid_rx && tid_rx->timeout == timeout)
+			if (tid_rx && tid_rx->timeout == timeout) {
+				tid_rx->ssn = start_seq_num;
+				tid_rx->head_seq_num = start_seq_num;
 				status = WLAN_STATUS_SUCCESS;
-			else
+			} else {
 				status = WLAN_STATUS_REQUEST_DECLINED;
+			}
 			rcu_read_unlock();
 			goto end;
 		}

-- 
Krzysztof Hałasa

ŁUKASIEWICZ Research Network
Industrial Research Institute for Automation and Measurements PIAP
Al. Jerozolimskie 202, 02-486 Warsaw, Poland

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] 802.11n IBSS: wlan0 stops receiving packets due to aggregation after sender reboot
  2019-10-25 10:21 ` [PATCH v2] " Krzysztof Hałasa
@ 2019-10-28 12:21   ` Johannes Berg
  2019-10-29  8:41     ` Koen Vandeputte
  2019-10-29  8:54     ` Krzysztof Hałasa
  0 siblings, 2 replies; 24+ messages in thread
From: Johannes Berg @ 2019-10-28 12:21 UTC (permalink / raw)
  To: Krzysztof Hałasa
  Cc: David S. Miller, linux-wireless, netdev, linux-kernel

On Fri, 2019-10-25 at 12:21 +0200, Krzysztof Hałasa wrote:
> Fix a bug where the mac80211 RX aggregation code sets a new aggregation
> "session" at the remote station's request, but the head_seq_num
> (the sequence number the receiver expects to receive) isn't reset.
> 
> Spotted on a pair of AR9580 in IBSS mode.
> 
> Signed-off-by: Krzysztof Halasa <khalasa@piap.pl>
> 
> diff --git a/net/mac80211/agg-rx.c b/net/mac80211/agg-rx.c
> index 4d1c335e06e5..67733bd61297 100644
> --- a/net/mac80211/agg-rx.c
> +++ b/net/mac80211/agg-rx.c
> @@ -354,10 +354,13 @@ void ___ieee80211_start_rx_ba_session(struct sta_info *sta,
>  			 */
>  			rcu_read_lock();
>  			tid_rx = rcu_dereference(sta->ampdu_mlme.tid_rx[tid]);
> -			if (tid_rx && tid_rx->timeout == timeout)
> +			if (tid_rx && tid_rx->timeout == timeout) {
> +				tid_rx->ssn = start_seq_num;
> +				tid_rx->head_seq_num = start_seq_num;
>  				status = WLAN_STATUS_SUCCESS;

This is wrong, this is the case of *updating an existing session*, we
must not reset the head SN then.

I think you just got very lucky (or unlucky) to have the same dialog
token, because we start from 0 - maybe we should initialize it to a
random value to flush out such issues.

Really what I think probably happened is that one of your stations lost
the connection to the other, and didn't tell it about it in any way - so
the other kept all the status alive.

I suspect to make all this work well we need to not only have the fixes
I made recently to actually send and parse deauth frames, but also to
even send an auth and reset the state when we receive that, so if we
move out of range and even the deauth frame is lost, we can still reset
properly.

In any case, this is not the right approach - we need to handle the
"lost connection" case better I suspect, but since you don't say what
really happened I don't really know that that's what you're seeing.

johannes


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] 802.11n IBSS: wlan0 stops receiving packets due to aggregation after sender reboot
  2019-10-28 12:21   ` Johannes Berg
@ 2019-10-29  8:41     ` Koen Vandeputte
  2019-10-29  8:58       ` Sebastian Gottschall
  2019-10-29  9:03       ` Johannes Berg
  2019-10-29  8:54     ` Krzysztof Hałasa
  1 sibling, 2 replies; 24+ messages in thread
From: Koen Vandeputte @ 2019-10-29  8:41 UTC (permalink / raw)
  To: Johannes Berg, Krzysztof Hałasa
  Cc: David S. Miller, linux-wireless, netdev, linux-kernel


On 28.10.19 13:21, Johannes Berg wrote:
> On Fri, 2019-10-25 at 12:21 +0200, Krzysztof Hałasa wrote:
>> Fix a bug where the mac80211 RX aggregation code sets a new aggregation
>> "session" at the remote station's request, but the head_seq_num
>> (the sequence number the receiver expects to receive) isn't reset.
>>
>> Spotted on a pair of AR9580 in IBSS mode.
>>
>> Signed-off-by: Krzysztof Halasa <khalasa@piap.pl>
>>
>> diff --git a/net/mac80211/agg-rx.c b/net/mac80211/agg-rx.c
>> index 4d1c335e06e5..67733bd61297 100644
>> --- a/net/mac80211/agg-rx.c
>> +++ b/net/mac80211/agg-rx.c
>> @@ -354,10 +354,13 @@ void ___ieee80211_start_rx_ba_session(struct sta_info *sta,
>>   			 */
>>   			rcu_read_lock();
>>   			tid_rx = rcu_dereference(sta->ampdu_mlme.tid_rx[tid]);
>> -			if (tid_rx && tid_rx->timeout == timeout)
>> +			if (tid_rx && tid_rx->timeout == timeout) {
>> +				tid_rx->ssn = start_seq_num;
>> +				tid_rx->head_seq_num = start_seq_num;
>>   				status = WLAN_STATUS_SUCCESS;
> This is wrong, this is the case of *updating an existing session*, we
> must not reset the head SN then.
>
> I think you just got very lucky (or unlucky) to have the same dialog
> token, because we start from 0 - maybe we should initialize it to a
> random value to flush out such issues.
>
> Really what I think probably happened is that one of your stations lost
> the connection to the other, and didn't tell it about it in any way - so
> the other kept all the status alive.
>
> I suspect to make all this work well we need to not only have the fixes
> I made recently to actually send and parse deauth frames, but also to
> even send an auth and reset the state when we receive that, so if we
> move out of range and even the deauth frame is lost, we can still reset
> properly.
>
> In any case, this is not the right approach - we need to handle the
> "lost connection" case better I suspect, but since you don't say what
> really happened I don't really know that that's what you're seeing.
>
> johannes

Hi all,

I can confirm the issue as I'm also seeing this sometimes in the field here.

Sometimes when a devices goes out of range and then re-enters,
the link refuses to "come up", as in rx looks to be "stuck" without any 
reports in system log or locking issues (lockdep enabled)

I have dozens of devices installed offshore (802.11n based), both on 
static and moving assets,
which cover from short (250m) up to very long distances (~35km)

So .. while there is some momentum for this issue,
I'm more than happy to provide extensive testing should fixes be posted 
regarding IBSS in general.

Regards,

Koen


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] 802.11n IBSS: wlan0 stops receiving packets due to aggregation after sender reboot
  2019-10-28 12:21   ` Johannes Berg
  2019-10-29  8:41     ` Koen Vandeputte
@ 2019-10-29  8:54     ` Krzysztof Hałasa
  2019-10-29  9:07       ` Johannes Berg
  1 sibling, 1 reply; 24+ messages in thread
From: Krzysztof Hałasa @ 2019-10-29  8:54 UTC (permalink / raw)
  To: Johannes Berg; +Cc: David S. Miller, linux-wireless, netdev, linux-kernel

Johannes Berg <johannes@sipsolutions.net> writes:

> I think you just got very lucky (or unlucky) to have the same dialog
> token, because we start from 0

Right, it seems to be the case.

> - maybe we should initialize it to a
> random value to flush out such issues.

The problem I can see is that the dialog_tokens are 8-bit, way too small
to eliminate conflicts.

> Really what I think probably happened is that one of your stations lost
> the connection to the other, and didn't tell it about it in any way - so
> the other kept all the status alive.

You must have missed my previous mail - I simply rebooted that station,
and alternatively rmmoded/modprobed ath9k. But the problem originated in
a station going out of and back in range, in fact.

> I suspect to make all this work well we need to not only have the fixes
> I made recently to actually send and parse deauth frames, but also to
> even send an auth and reset the state when we receive that, so if we
> move out of range and even the deauth frame is lost, we can still reset
> properly.

That's one thing. The other is a station trying ADDBA for the first time
after boot (while the local station has seen it before that reboot).

> In any case, this is not the right approach - we need to handle the
> "lost connection" case better I suspect, but since you don't say what
> really happened I don't really know that that's what you're seeing.

I guess we need to identify "new connection" reliably. Otherwise,
the new connections are treated as old ones and it doesn't work.

Now how can it be fixed?
-- 
Krzysztof Halasa

ŁUKASIEWICZ Research Network
Industrial Research Institute for Automation and Measurements PIAP
Al. Jerozolimskie 202, 02-486 Warsaw, Poland

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] 802.11n IBSS: wlan0 stops receiving packets due to aggregation after sender reboot
  2019-10-29  8:41     ` Koen Vandeputte
@ 2019-10-29  8:58       ` Sebastian Gottschall
  2019-10-29  9:40         ` Koen Vandeputte
  2019-10-29  9:03       ` Johannes Berg
  1 sibling, 1 reply; 24+ messages in thread
From: Sebastian Gottschall @ 2019-10-29  8:58 UTC (permalink / raw)
  To: Koen Vandeputte, Johannes Berg, Krzysztof Hałasa
  Cc: David S. Miller, linux-wireless, netdev, linux-kernel

35 km? for 802.11n with ht40 this is out of the ack timing range the 
chipset supports. so this should be considered at any troubles with 
connections

Am 29.10.2019 um 09:41 schrieb Koen Vandeputte:
>
> On 28.10.19 13:21, Johannes Berg wrote:
>> On Fri, 2019-10-25 at 12:21 +0200, Krzysztof Hałasa wrote:
>>> Fix a bug where the mac80211 RX aggregation code sets a new aggregation
>>> "session" at the remote station's request, but the head_seq_num
>>> (the sequence number the receiver expects to receive) isn't reset.
>>>
>>> Spotted on a pair of AR9580 in IBSS mode.
>>>
>>> Signed-off-by: Krzysztof Halasa <khalasa@piap.pl>
>>>
>>> diff --git a/net/mac80211/agg-rx.c b/net/mac80211/agg-rx.c
>>> index 4d1c335e06e5..67733bd61297 100644
>>> --- a/net/mac80211/agg-rx.c
>>> +++ b/net/mac80211/agg-rx.c
>>> @@ -354,10 +354,13 @@ void ___ieee80211_start_rx_ba_session(struct 
>>> sta_info *sta,
>>>                */
>>>               rcu_read_lock();
>>>               tid_rx = rcu_dereference(sta->ampdu_mlme.tid_rx[tid]);
>>> -            if (tid_rx && tid_rx->timeout == timeout)
>>> +            if (tid_rx && tid_rx->timeout == timeout) {
>>> +                tid_rx->ssn = start_seq_num;
>>> +                tid_rx->head_seq_num = start_seq_num;
>>>                   status = WLAN_STATUS_SUCCESS;
>> This is wrong, this is the case of *updating an existing session*, we
>> must not reset the head SN then.
>>
>> I think you just got very lucky (or unlucky) to have the same dialog
>> token, because we start from 0 - maybe we should initialize it to a
>> random value to flush out such issues.
>>
>> Really what I think probably happened is that one of your stations lost
>> the connection to the other, and didn't tell it about it in any way - so
>> the other kept all the status alive.
>>
>> I suspect to make all this work well we need to not only have the fixes
>> I made recently to actually send and parse deauth frames, but also to
>> even send an auth and reset the state when we receive that, so if we
>> move out of range and even the deauth frame is lost, we can still reset
>> properly.
>>
>> In any case, this is not the right approach - we need to handle the
>> "lost connection" case better I suspect, but since you don't say what
>> really happened I don't really know that that's what you're seeing.
>>
>> johannes
>
> Hi all,
>
> I can confirm the issue as I'm also seeing this sometimes in the field 
> here.
>
> Sometimes when a devices goes out of range and then re-enters,
> the link refuses to "come up", as in rx looks to be "stuck" without 
> any reports in system log or locking issues (lockdep enabled)
>
> I have dozens of devices installed offshore (802.11n based), both on 
> static and moving assets,
> which cover from short (250m) up to very long distances (~35km)
>
> So .. while there is some momentum for this issue,
> I'm more than happy to provide extensive testing should fixes be 
> posted regarding IBSS in general.
>
> Regards,
>
> Koen
>
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] 802.11n IBSS: wlan0 stops receiving packets due to aggregation after sender reboot
  2019-10-29  8:41     ` Koen Vandeputte
  2019-10-29  8:58       ` Sebastian Gottschall
@ 2019-10-29  9:03       ` Johannes Berg
  2019-10-29  9:47         ` Koen Vandeputte
  1 sibling, 1 reply; 24+ messages in thread
From: Johannes Berg @ 2019-10-29  9:03 UTC (permalink / raw)
  To: Koen Vandeputte, Krzysztof Hałasa
  Cc: David S. Miller, linux-wireless, netdev, linux-kernel

On Tue, 2019-10-29 at 09:41 +0100, Koen Vandeputte wrote:

> I can confirm the issue as I'm also seeing this sometimes in the field here.
> 
> Sometimes when a devices goes out of range and then re-enters,
> the link refuses to "come up", as in rx looks to be "stuck" without any 
> reports in system log or locking issues (lockdep enabled)

Right. I've recently debugged this due to issues in distributed
beaconing (rather than moving in/out of range), but I guess it would be
relatively simple to reproduce this with wmediumd, if that can be
controlled dynamically?

What kernel are you running? You could check if you have

95697f9907bf ("mac80211: accept deauth frames in IBSS mode")
4b08d1b6a994 ("mac80211: IBSS: send deauth when expiring inactive STAs")

which might help somewhat, but don't fully cover the case of moving out
of range.

johannes


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] 802.11n IBSS: wlan0 stops receiving packets due to aggregation after sender reboot
  2019-10-29  8:54     ` Krzysztof Hałasa
@ 2019-10-29  9:07       ` Johannes Berg
  2019-10-29 10:51         ` Krzysztof Hałasa
  0 siblings, 1 reply; 24+ messages in thread
From: Johannes Berg @ 2019-10-29  9:07 UTC (permalink / raw)
  To: Krzysztof Hałasa
  Cc: David S. Miller, linux-wireless, netdev, linux-kernel

On Tue, 2019-10-29 at 09:54 +0100, Krzysztof Hałasa wrote:

> The problem I can see is that the dialog_tokens are 8-bit, way too small
> to eliminate conflicts.

Well, they're also per station, we could just randomize the start and
then we'd delete the old session and start a new one, on the receiver.

So that would improve robustness somewhat (down to a 1/256 chance to hit
this problem).

> > Really what I think probably happened is that one of your stations lost
> > the connection to the other, and didn't tell it about it in any way - so
> > the other kept all the status alive.
> 
> You must have missed my previous mail - I simply rebooted that station,
> and alternatively rmmoded/modprobed ath9k. But the problem originated in
> a station going out of and back in range, in fact.

I was on vacation, so yeah, quite possible I missed it.

Sounds like we need not just
4b08d1b6a994 ("mac80211: IBSS: send deauth when expiring inactive STAs")

but also send a deauth when we disconnect. Surprising we don't do that,
actually.

> > I suspect to make all this work well we need to not only have the fixes
> > I made recently to actually send and parse deauth frames, but also to
> > even send an auth and reset the state when we receive that, so if we
> > move out of range and even the deauth frame is lost, we can still reset
> > properly.
> 
> That's one thing. The other is a station trying ADDBA for the first time
> after boot (while the local station has seen it before that reboot).

That's the situation though - the local station needs to know that it
has in fact *not* seen the same instance of the station, but that the
station has reset and needs to be removed & re-added.

> I guess we need to identify "new connection" reliably. Otherwise,
> the new connections are treated as old ones and it doesn't work.

Right. But we can implement the (optional) authentication (which you
actually already get when you implement [encrypted] IBSS with wpa_s),
and reset the station state when we get an authentication frame.

johannes


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] 802.11n IBSS: wlan0 stops receiving packets due to aggregation after sender reboot
  2019-10-29  8:58       ` Sebastian Gottschall
@ 2019-10-29  9:40         ` Koen Vandeputte
  0 siblings, 0 replies; 24+ messages in thread
From: Koen Vandeputte @ 2019-10-29  9:40 UTC (permalink / raw)
  To: Sebastian Gottschall, Johannes Berg, Krzysztof Hałasa
  Cc: David S. Miller, linux-wireless, netdev, linux-kernel


On 29.10.19 09:58, Sebastian Gottschall wrote:
> 35 km? for 802.11n with ht40 this is out of the ack timing range the 
> chipset supports. so this should be considered at any troubles with 
> connections
>
(Please don't top-post)

When we know a link can exceed ~ 21km, it's set to HT20 for this reason.

Koen

> Am 29.10.2019 um 09:41 schrieb Koen Vandeputte:
>>
>> On 28.10.19 13:21, Johannes Berg wrote:
>>> On Fri, 2019-10-25 at 12:21 +0200, Krzysztof Hałasa wrote:
>>>> Fix a bug where the mac80211 RX aggregation code sets a new 
>>>> aggregation
>>>> "session" at the remote station's request, but the head_seq_num
>>>> (the sequence number the receiver expects to receive) isn't reset.
>>>>
>>>> Spotted on a pair of AR9580 in IBSS mode.
>>>>
>>>> Signed-off-by: Krzysztof Halasa <khalasa@piap.pl>
>>>>
>>>> diff --git a/net/mac80211/agg-rx.c b/net/mac80211/agg-rx.c
>>>> index 4d1c335e06e5..67733bd61297 100644
>>>> --- a/net/mac80211/agg-rx.c
>>>> +++ b/net/mac80211/agg-rx.c
>>>> @@ -354,10 +354,13 @@ void ___ieee80211_start_rx_ba_session(struct 
>>>> sta_info *sta,
>>>>                */
>>>>               rcu_read_lock();
>>>>               tid_rx = rcu_dereference(sta->ampdu_mlme.tid_rx[tid]);
>>>> -            if (tid_rx && tid_rx->timeout == timeout)
>>>> +            if (tid_rx && tid_rx->timeout == timeout) {
>>>> +                tid_rx->ssn = start_seq_num;
>>>> +                tid_rx->head_seq_num = start_seq_num;
>>>>                   status = WLAN_STATUS_SUCCESS;
>>> This is wrong, this is the case of *updating an existing session*, we
>>> must not reset the head SN then.
>>>
>>> I think you just got very lucky (or unlucky) to have the same dialog
>>> token, because we start from 0 - maybe we should initialize it to a
>>> random value to flush out such issues.
>>>
>>> Really what I think probably happened is that one of your stations lost
>>> the connection to the other, and didn't tell it about it in any way 
>>> - so
>>> the other kept all the status alive.
>>>
>>> I suspect to make all this work well we need to not only have the fixes
>>> I made recently to actually send and parse deauth frames, but also to
>>> even send an auth and reset the state when we receive that, so if we
>>> move out of range and even the deauth frame is lost, we can still reset
>>> properly.
>>>
>>> In any case, this is not the right approach - we need to handle the
>>> "lost connection" case better I suspect, but since you don't say what
>>> really happened I don't really know that that's what you're seeing.
>>>
>>> johannes
>>
>> Hi all,
>>
>> I can confirm the issue as I'm also seeing this sometimes in the 
>> field here.
>>
>> Sometimes when a devices goes out of range and then re-enters,
>> the link refuses to "come up", as in rx looks to be "stuck" without 
>> any reports in system log or locking issues (lockdep enabled)
>>
>> I have dozens of devices installed offshore (802.11n based), both on 
>> static and moving assets,
>> which cover from short (250m) up to very long distances (~35km)
>>
>> So .. while there is some momentum for this issue,
>> I'm more than happy to provide extensive testing should fixes be 
>> posted regarding IBSS in general.
>>
>> Regards,
>>
>> Koen
>>
>>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] 802.11n IBSS: wlan0 stops receiving packets due to aggregation after sender reboot
  2019-10-29  9:03       ` Johannes Berg
@ 2019-10-29  9:47         ` Koen Vandeputte
  0 siblings, 0 replies; 24+ messages in thread
From: Koen Vandeputte @ 2019-10-29  9:47 UTC (permalink / raw)
  To: Johannes Berg, Krzysztof Hałasa
  Cc: David S. Miller, linux-wireless, netdev, linux-kernel


On 29.10.19 10:03, Johannes Berg wrote:
> On Tue, 2019-10-29 at 09:41 +0100, Koen Vandeputte wrote:
>
>> I can confirm the issue as I'm also seeing this sometimes in the field here.
>>
>> Sometimes when a devices goes out of range and then re-enters,
>> the link refuses to "come up", as in rx looks to be "stuck" without any
>> reports in system log or locking issues (lockdep enabled)
> Right. I've recently debugged this due to issues in distributed
> beaconing (rather than moving in/out of range), but I guess it would be
> relatively simple to reproduce this with wmediumd, if that can be
> controlled dynamically?
>
> What kernel are you running? You could check if you have
>
> 95697f9907bf ("mac80211: accept deauth frames in IBSS mode")
> 4b08d1b6a994 ("mac80211: IBSS: send deauth when expiring inactive STAs")
>
> which might help somewhat, but don't fully cover the case of moving out
> of range.
>
> johannes
>
I'm running OpenWrt (kernel 4.14.150 with 4.19.79 mac80211)
I noticed these fixes last week and made a build 2 days ago with them 
backported to it.
Running in the field on roughly 4 devices since a day.

Koen


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] 802.11n IBSS: wlan0 stops receiving packets due to aggregation after sender reboot
  2019-10-29  9:07       ` Johannes Berg
@ 2019-10-29 10:51         ` Krzysztof Hałasa
  2019-10-29 10:57           ` Johannes Berg
  0 siblings, 1 reply; 24+ messages in thread
From: Krzysztof Hałasa @ 2019-10-29 10:51 UTC (permalink / raw)
  To: Johannes Berg; +Cc: David S. Miller, linux-wireless, netdev, linux-kernel

Johannes Berg <johannes@sipsolutions.net> writes:

>> The problem I can see is that the dialog_tokens are 8-bit, way too small
>> to eliminate conflicts.
>
> Well, they're also per station, we could just randomize the start and
> then we'd delete the old session and start a new one, on the receiver.
>
> So that would improve robustness somewhat (down to a 1/256 chance to hit
> this problem).

That was what I meant. Still, 1/256 seems hardly acceptable to me -
unless there is some work around (a short timeout or something similar).
Remember that when it doesn't work, it doesn't work - it won't recover
until the sequence catches up, which may mean basically forever.

Or, maybe the remote station can request de-aggregation first, so the
subsequent aggregation request is always treated as new?

Alternatively, perhaps the remote can signal that it's a new request and
not merely an existing session?

> That's the situation though - the local station needs to know that it
> has in fact *not* seen the same instance of the station, but that the
> station has reset and needs to be removed & re-added.

Precisely. And it seems to me that the first time the local station
learns of this is when a new, regular, non-aggregated packet arrives.
Or, when a new aggregation request arrives.
-- 
Krzysztof Halasa

ŁUKASIEWICZ Research Network
Industrial Research Institute for Automation and Measurements PIAP
Al. Jerozolimskie 202, 02-486 Warsaw, Poland

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] 802.11n IBSS: wlan0 stops receiving packets due to aggregation after sender reboot
  2019-10-29 10:51         ` Krzysztof Hałasa
@ 2019-10-29 10:57           ` Johannes Berg
  0 siblings, 0 replies; 24+ messages in thread
From: Johannes Berg @ 2019-10-29 10:57 UTC (permalink / raw)
  To: Krzysztof Hałasa
  Cc: David S. Miller, linux-wireless, netdev, linux-kernel

On Tue, 2019-10-29 at 11:51 +0100, Krzysztof Hałasa wrote:
> Johannes Berg <johannes@sipsolutions.net> writes:
> 
> > > The problem I can see is that the dialog_tokens are 8-bit, way too small
> > > to eliminate conflicts.
> > 
> > Well, they're also per station, we could just randomize the start and
> > then we'd delete the old session and start a new one, on the receiver.
> > 
> > So that would improve robustness somewhat (down to a 1/256 chance to hit
> > this problem).
> 
> That was what I meant. Still, 1/256 seems hardly acceptable to me -
> unless there is some work around (a short timeout or something similar).
> Remember that when it doesn't work, it doesn't work - it won't recover
> until the sequence catches up, which may mean basically forever.

Agree, it just helps in "most" cases to do this. Perhaps we shouldn't do
this then so that we find the problem more easily...

> Or, maybe the remote station can request de-aggregation first, so the
> subsequent aggregation request is always treated as new?

> Alternatively, perhaps the remote can signal that it's a new request and
> not merely an existing session?

I think we should just implement authentication and reset of the station
properly, instead of fudging around with aggregation. This is just one
possible problematic scenario ... what if the station was reconfigured
with a different number of antennas in the meantime, for example, or
whatnot. There's a lot of state we keep for each station.

> > That's the situation though - the local station needs to know that it
> > has in fact *not* seen the same instance of the station, but that the
> > station has reset and needs to be removed & re-added.
> 
> Precisely. And it seems to me that the first time the local station
> learns of this is when a new, regular, non-aggregated packet arrives.
> Or, when a new aggregation request arrives.

Well, it should learn about the station when there's a beacon from it,
or if not ... we have a patch to force a probe request/response cycle so
we have all the capabilities properly. We should upstream that patch,
but need to do something to avoid being able to use this for traffic
amplification attacks.

johannes


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [RFC] Allow userspace to reset IBSS stations to fix aggregation issue
  2019-10-21 12:11 802.11n IBSS: wlan0 stops receiving packets due to aggregation after sender reboot Krzysztof Hałasa
                   ` (2 preceding siblings ...)
  2019-10-25 10:21 ` [PATCH v2] " Krzysztof Hałasa
@ 2019-12-09 10:28 ` Nicolas Cavallari
  2019-12-11  6:58   ` Krzysztof Hałasa
  2019-12-09 10:28 ` [RFC PATCH v1 1/4] ath6kl: Do not allow deleting station in IBSS mode Nicolas Cavallari
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 24+ messages in thread
From: Nicolas Cavallari @ 2019-12-09 10:28 UTC (permalink / raw)
  To: Krzysztof Hałasa, Johannes Berg; +Cc: linux-wireless, Kalle Valo

I encountered the same issue in an IBSS-RSN network, where quick reboot
of a station would cause issues with aggregation because the kernel is
not aware of the reboot.

I figured out that since wpa_supplicant already detect reboots, the
simplest way to fix it would be for wpa_supplicant to reset the entire
state of the station in the kernel, instead of just resetting keys and
port.

This means extending NL80211_CMD_DEL_STATION to work in IBSS mode too.
My current wpa_supplicant patch tries this new API but fall back to the
old methods if an error is returned.

The changes required for drivers would be as follow:
- mac80211: will just work
- wil6210: does not seem to support IBSS mode
- ath6kl: need a patch
- brcmfmac: need a patch
- mwifiex: need a patch
- qtnfmac: no IBSS support
- staging/wilc1000: no IBSS support
- staging/rtl8723bs: already returns EINVAL in this case.

I made patches for ath6kl, brcmfmac and mwifiex that plainly reject
the request. If there is a better way, i'll glad to hear it.





^ permalink raw reply	[flat|nested] 24+ messages in thread

* [RFC PATCH v1 1/4] ath6kl: Do not allow deleting station in IBSS mode.
  2019-10-21 12:11 802.11n IBSS: wlan0 stops receiving packets due to aggregation after sender reboot Krzysztof Hałasa
                   ` (3 preceding siblings ...)
  2019-12-09 10:28 ` [RFC] Allow userspace to reset IBSS stations to fix aggregation issue Nicolas Cavallari
@ 2019-12-09 10:28 ` Nicolas Cavallari
  2019-12-09 10:28 ` [RFC PATCH v1 2/4] brcmfmac: " Nicolas Cavallari
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 24+ messages in thread
From: Nicolas Cavallari @ 2019-12-09 10:28 UTC (permalink / raw)
  To: Krzysztof Hałasa, Johannes Berg; +Cc: linux-wireless, Kalle Valo

In preparation of allowing userspace to explicitly reset station state
in IBSS mode, reject attempts to delete stations in IBSS mode with error
ENOTSUPP.

Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
--
I do not know if the driver could possibly support this, so take the
safe route and reject it.
---
 drivers/net/wireless/ath/ath6kl/cfg80211.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/wireless/ath/ath6kl/cfg80211.c b/drivers/net/wireless/ath/ath6kl/cfg80211.c
index 37cf602d8adf..c894b9756605 100644
--- a/drivers/net/wireless/ath/ath6kl/cfg80211.c
+++ b/drivers/net/wireless/ath/ath6kl/cfg80211.c
@@ -2993,6 +2993,9 @@ static int ath6kl_del_station(struct wiphy *wiphy, struct net_device *dev,
 	struct ath6kl *ar = ath6kl_priv(dev);
 	struct ath6kl_vif *vif = netdev_priv(dev);
 	const u8 *addr = params->mac ? params->mac : bcast_addr;
+	if (vif->wdev.iftype != NL80211_IFTYPE_AP &&
+	    vif->wdev.iftype != NL80211_IFTYPE_P2P_GO)
+		return -ENOTSUPP;
 
 	return ath6kl_wmi_ap_set_mlme(ar->wmi, vif->fw_vif_idx, WMI_AP_DEAUTH,
 				      addr, WLAN_REASON_PREV_AUTH_NOT_VALID);
-- 
2.24.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC PATCH v1 2/4] brcmfmac: Do not allow deleting station in IBSS mode.
  2019-10-21 12:11 802.11n IBSS: wlan0 stops receiving packets due to aggregation after sender reboot Krzysztof Hałasa
                   ` (4 preceding siblings ...)
  2019-12-09 10:28 ` [RFC PATCH v1 1/4] ath6kl: Do not allow deleting station in IBSS mode Nicolas Cavallari
@ 2019-12-09 10:28 ` Nicolas Cavallari
  2019-12-09 10:28 ` [RFC PATCH v1 3/4] mwifiex: " Nicolas Cavallari
  2019-12-09 10:28 ` [RFC PATCH v1 4/4] nl80211: Allow deleting stations in ibss mode to reset their state Nicolas Cavallari
  7 siblings, 0 replies; 24+ messages in thread
From: Nicolas Cavallari @ 2019-12-09 10:28 UTC (permalink / raw)
  To: Krzysztof Hałasa, Johannes Berg; +Cc: linux-wireless, Kalle Valo

In preparation of allowing userspace to explicitly reset station state
in IBSS mode, reject attempts to delete stations in IBSS mode with error
EOPNOTSUPP.

Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
--
I do not know if the driver could possibly support this, so take the
safe route and reject it.
---
 drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c
index 5598bbd09b62..5436ffd74ecb 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c
@@ -4848,6 +4848,8 @@ brcmf_cfg80211_del_station(struct wiphy *wiphy, struct net_device *ndev,
 
 	if (!params->mac)
 		return -EFAULT;
+	if (!brcmf_is_apmode(ifp->vif))
+		return -EOPNOTSUPP;
 
 	brcmf_dbg(TRACE, "Enter %pM\n", params->mac);
 
-- 
2.24.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC PATCH v1 3/4] mwifiex: Do not allow deleting station in IBSS mode.
  2019-10-21 12:11 802.11n IBSS: wlan0 stops receiving packets due to aggregation after sender reboot Krzysztof Hałasa
                   ` (5 preceding siblings ...)
  2019-12-09 10:28 ` [RFC PATCH v1 2/4] brcmfmac: " Nicolas Cavallari
@ 2019-12-09 10:28 ` Nicolas Cavallari
  2019-12-09 10:28 ` [RFC PATCH v1 4/4] nl80211: Allow deleting stations in ibss mode to reset their state Nicolas Cavallari
  7 siblings, 0 replies; 24+ messages in thread
From: Nicolas Cavallari @ 2019-12-09 10:28 UTC (permalink / raw)
  To: Krzysztof Hałasa, Johannes Berg; +Cc: linux-wireless, Kalle Valo

In preparation of allowing userspace to explicitly reset station state
in IBSS mode, reject attempts to delete stations in IBSS mode with error
ENOTSUPP.

Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
--
I do not know if the driver could possibly support this, so take the
safe route and reject it.
---
 drivers/net/wireless/marvell/mwifiex/cfg80211.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/wireless/marvell/mwifiex/cfg80211.c b/drivers/net/wireless/marvell/mwifiex/cfg80211.c
index d89684168500..9135fc6d5a70 100644
--- a/drivers/net/wireless/marvell/mwifiex/cfg80211.c
+++ b/drivers/net/wireless/marvell/mwifiex/cfg80211.c
@@ -1828,6 +1828,10 @@ mwifiex_cfg80211_del_station(struct wiphy *wiphy, struct net_device *dev,
 	struct mwifiex_sta_node *sta_node;
 	u8 deauth_mac[ETH_ALEN];
 
+	if (priv->bss_mode != NL80211_IFTYPE_AP &&
+	    priv->bss_mode != NL80211_IFTYPE_P2P_GO)
+		return -ENOTSUPP;
+
 	if (!priv->bss_started && priv->wdev.cac_started) {
 		mwifiex_dbg(priv->adapter, INFO, "%s: abort CAC!\n", __func__);
 		mwifiex_abort_cac(priv);
-- 
2.24.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC PATCH v1 4/4] nl80211: Allow deleting stations in ibss mode to reset their state.
  2019-10-21 12:11 802.11n IBSS: wlan0 stops receiving packets due to aggregation after sender reboot Krzysztof Hałasa
                   ` (6 preceding siblings ...)
  2019-12-09 10:28 ` [RFC PATCH v1 3/4] mwifiex: " Nicolas Cavallari
@ 2019-12-09 10:28 ` Nicolas Cavallari
  2019-12-11 21:32   ` Johannes Berg
  7 siblings, 1 reply; 24+ messages in thread
From: Nicolas Cavallari @ 2019-12-09 10:28 UTC (permalink / raw)
  To: Krzysztof Hałasa, Johannes Berg; +Cc: linux-wireless, Kalle Valo

Sometimes, userspace is able to detect that a peer silently lost its
state (like, if the peer reboots). wpa_supplicant does this for IBSS-RSN
and currently only resets the key of the peer so that it can attempt
another handshake.

However, the kernel also hold state about the station, such as BA
sessions, probe response parameters and the like.  They also need to be
resetted correctly.

This patch allow userspace to use NL80211_CMD_DEL_STATION in IBSS mode,
which should send a deauth and reset the state of the station, just
like in mesh point mode.

This has successfully been tested with mac80211/ath9k. Drivers that do
not support it should just return an error, so that userspace can fall
back to the old methods.

Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
---
 net/wireless/nl80211.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
index da5262b2298b..82046c990a2a 100644
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -6073,7 +6073,8 @@ static int nl80211_del_station(struct sk_buff *skb, struct genl_info *info)
 	if (dev->ieee80211_ptr->iftype != NL80211_IFTYPE_AP &&
 	    dev->ieee80211_ptr->iftype != NL80211_IFTYPE_AP_VLAN &&
 	    dev->ieee80211_ptr->iftype != NL80211_IFTYPE_MESH_POINT &&
-	    dev->ieee80211_ptr->iftype != NL80211_IFTYPE_P2P_GO)
+	    dev->ieee80211_ptr->iftype != NL80211_IFTYPE_P2P_GO &&
+	    dev->ieee80211_ptr->iftype != NL80211_IFTYPE_ADHOC)
 		return -EINVAL;
 
 	if (!rdev->ops->del_station)
-- 
2.24.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [RFC] Allow userspace to reset IBSS stations to fix aggregation issue
  2019-12-09 10:28 ` [RFC] Allow userspace to reset IBSS stations to fix aggregation issue Nicolas Cavallari
@ 2019-12-11  6:58   ` Krzysztof Hałasa
  2019-12-11 10:31     ` Nicolas Cavallari
  0 siblings, 1 reply; 24+ messages in thread
From: Krzysztof Hałasa @ 2019-12-11  6:58 UTC (permalink / raw)
  To: Nicolas Cavallari; +Cc: Johannes Berg, linux-wireless, Kalle Valo

Hi Nicolas,

Nicolas Cavallari <nicolas.cavallari@green-communications.fr> writes:

> I encountered the same issue in an IBSS-RSN network, where quick reboot
> of a station would cause issues with aggregation because the kernel is
> not aware of the reboot.
>
> I figured out that since wpa_supplicant already detect reboots, the
> simplest way to fix it would be for wpa_supplicant to reset the entire
> state of the station in the kernel, instead of just resetting keys and
> port.

Just to make sure everybody is aware that it would only be a partial fix
- unencrypted ad hoc mode (the simplest thing available) doesn't need
any userspace and thus must be fixed in the kernel. Alternatively this
functionality may be moved to wpa_supplicant and only then userspace fix
could really fix it completely.
-- 
Krzysztof Halasa

ŁUKASIEWICZ Research Network
Industrial Research Institute for Automation and Measurements PIAP
Al. Jerozolimskie 202, 02-486 Warsaw, Poland

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] Allow userspace to reset IBSS stations to fix aggregation issue
  2019-12-11  6:58   ` Krzysztof Hałasa
@ 2019-12-11 10:31     ` Nicolas Cavallari
  0 siblings, 0 replies; 24+ messages in thread
From: Nicolas Cavallari @ 2019-12-11 10:31 UTC (permalink / raw)
  To: Krzysztof Hałasa; +Cc: Johannes Berg, linux-wireless, Kalle Valo

Hi Krzysztof,

On 11/12/2019 07:58, Krzysztof Hałasa wrote:
> Hi Nicolas,
> 
> Just to make sure everybody is aware that it would only be a partial fix
> - unencrypted ad hoc mode (the simplest thing available) doesn't need
> any userspace and thus must be fixed in the kernel. Alternatively this
> functionality may be moved to wpa_supplicant and only then userspace fix
> could really fix it completely.

A partial fix is better than no fix.

In any case, the IBSS-RSN and unencrypted IBSS cannot be fixed the same way.
Having the kernel automatically reset a station while wpa_supplicant may have
started a 4 way handshake with it had already lead to problems in the past.
Which is why the kernel no longer reset stations (52874a5e).

In any case, the unencrypted ibss case is probably much harder to fix, since
even in a linux-only world, there are probably fullmac drivers for hardware that
don't reply to open system auth in an open ibss. Anyway, this patch allows
userspace to fix it, using whatever policy the user chooses to detect it.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH v1 4/4] nl80211: Allow deleting stations in ibss mode to reset their state.
  2019-12-09 10:28 ` [RFC PATCH v1 4/4] nl80211: Allow deleting stations in ibss mode to reset their state Nicolas Cavallari
@ 2019-12-11 21:32   ` Johannes Berg
  2019-12-12  9:56     ` Nicolas Cavallari
  0 siblings, 1 reply; 24+ messages in thread
From: Johannes Berg @ 2019-12-11 21:32 UTC (permalink / raw)
  To: Nicolas Cavallari, Krzysztof Hałasa; +Cc: linux-wireless, Kalle Valo

On Mon, 2019-12-09 at 11:28 +0100, Nicolas Cavallari wrote:
> 
> diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
> index da5262b2298b..82046c990a2a 100644
> --- a/net/wireless/nl80211.c
> +++ b/net/wireless/nl80211.c
> @@ -6073,7 +6073,8 @@ static int nl80211_del_station(struct sk_buff *skb, struct genl_info *info)
>  	if (dev->ieee80211_ptr->iftype != NL80211_IFTYPE_AP &&
>  	    dev->ieee80211_ptr->iftype != NL80211_IFTYPE_AP_VLAN &&
>  	    dev->ieee80211_ptr->iftype != NL80211_IFTYPE_MESH_POINT &&
> -	    dev->ieee80211_ptr->iftype != NL80211_IFTYPE_P2P_GO)
> +	    dev->ieee80211_ptr->iftype != NL80211_IFTYPE_P2P_GO &&
> +	    dev->ieee80211_ptr->iftype != NL80211_IFTYPE_ADHOC)
>  		return -EINVAL;

If we go this route (and I'm not sure, shouldn't be _that_ hard to do
some kind of auth/deauth thing?) then you probably should make this
depend on an nl80211 extended feature bit.

That way, not only do you get visibility in userspace whether it's
supported, but also avoid the need to change those non-mac80211 drivers
(by having only mac80211 set the extended feature)

johannes


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH v1 4/4] nl80211: Allow deleting stations in ibss mode to reset their state.
  2019-12-11 21:32   ` Johannes Berg
@ 2019-12-12  9:56     ` Nicolas Cavallari
  0 siblings, 0 replies; 24+ messages in thread
From: Nicolas Cavallari @ 2019-12-12  9:56 UTC (permalink / raw)
  To: Johannes Berg, Krzysztof Hałasa; +Cc: linux-wireless, Kalle Valo

On 11/12/2019 22:32, Johannes Berg wrote:
> and I'm not sure, shouldn't be _that_ hard to do some kind of auth/deauth
> thing?

wpa_supplicant already does this for IBSS RSN. And it does not reset the station
for each received auth frame, the logic is more advanced than that.

(not to mention that it should switch to SAE at some point, but there is this
long standing issue of wpa_supplicant not being able to get the rsn ie of the
peer's probe response)

But for the unencrypted case, i guess simply re-reverting 52874a5e (Revert
"mac80211: in IBSS use the Auth frame to trigger STA reinsertion") and making it
conditional on !privacy should be enough ?

> That way, not only do you get visibility in userspace whether it's
> supported, but also avoid the need to change those non-mac80211 drivers
> (by having only mac80211 set the extended feature)

Ok.

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2019-12-12  9:56 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-21 12:11 802.11n IBSS: wlan0 stops receiving packets due to aggregation after sender reboot Krzysztof Hałasa
2019-10-21 12:18 ` [PATCH] " Krzysztof Hałasa
2019-10-22  9:42   ` Sergei Shtylyov
2019-10-21 12:18 ` Krzysztof Hałasa
2019-10-25 10:21 ` [PATCH v2] " Krzysztof Hałasa
2019-10-28 12:21   ` Johannes Berg
2019-10-29  8:41     ` Koen Vandeputte
2019-10-29  8:58       ` Sebastian Gottschall
2019-10-29  9:40         ` Koen Vandeputte
2019-10-29  9:03       ` Johannes Berg
2019-10-29  9:47         ` Koen Vandeputte
2019-10-29  8:54     ` Krzysztof Hałasa
2019-10-29  9:07       ` Johannes Berg
2019-10-29 10:51         ` Krzysztof Hałasa
2019-10-29 10:57           ` Johannes Berg
2019-12-09 10:28 ` [RFC] Allow userspace to reset IBSS stations to fix aggregation issue Nicolas Cavallari
2019-12-11  6:58   ` Krzysztof Hałasa
2019-12-11 10:31     ` Nicolas Cavallari
2019-12-09 10:28 ` [RFC PATCH v1 1/4] ath6kl: Do not allow deleting station in IBSS mode Nicolas Cavallari
2019-12-09 10:28 ` [RFC PATCH v1 2/4] brcmfmac: " Nicolas Cavallari
2019-12-09 10:28 ` [RFC PATCH v1 3/4] mwifiex: " Nicolas Cavallari
2019-12-09 10:28 ` [RFC PATCH v1 4/4] nl80211: Allow deleting stations in ibss mode to reset their state Nicolas Cavallari
2019-12-11 21:32   ` Johannes Berg
2019-12-12  9:56     ` Nicolas Cavallari

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.