All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: Possible bug - DISASSOC_LOW_ACK
@ 2021-09-22 13:56 Denis Kenzior
  0 siblings, 0 replies; 3+ messages in thread
From: Denis Kenzior @ 2021-09-22 13:56 UTC (permalink / raw)
  To: iwd

[-- Attachment #1: Type: text/plain, Size: 2983 bytes --]

Hi Michael,

On 9/22/21 3:59 AM, Michael Johnson wrote:
> Hi everyone,
> 
> In my use case that involves heavy roaming, I've been noticing that
> occasionally `iwd` disconnects completely from the network before
> rejoining which can cause a few seconds of downtime. In addition to
> the downtime it seems to put the network in an odd state which is
> hitting a bunch of corner cases in code that uses the network more
> than just a normal disconnect. Looking at the logs, it seems like iwd
> might be getting locked up somewhere as it also stops reacting to
> threshold notifications?

This is indeed strange.  iwmon would capture this, but maybe it is easier to put 
in a bit more debugging into the CQM paths and see if we get some more clues? 
Maybe the CQM events are propagated correctly but the signal isn't below the 
roaming threshold?

> 
> I'm not sure how to reproduce this as it's relatively rare, happening
> 3 times in about a week. The logs below are from a machine running
> version 1.17.
> I'll see if I can start a rolling iwmon capture to get more info. Is
> there anything else that would be useful?

Might be chasing ghosts, but the only other thing I see is a a few CMD_FRAME 
events (see below) that we receive which are not recognized?  Or at least I 
don't see any logs of them being parsed.  iwmon trace will capture these, but 
may be easier to log them in your case.

> 
> ```
> Sep 21 15:40:22.138376 p3-683 iwd[425]:
> src/netdev.c:netdev_mlme_notify() MLME notification Frame TX
> Status(60)
> Sep 21 15:40:22.140101 p3-683 iwd[425]:
> src/netdev.c:netdev_unicast_notify() Unicast notification 59
> Sep 21 15:40:22.140101 p3-683 iwd[425]:
> src/station.c:station_early_neighbor_report_cb() ifindex: 5, error:
> 0()

Did we just roam, or is something else going on?

> Sep 21 15:40:22.140101 p3-683 iwd[425]:
> src/station.c:parse_neighbor_report() Neighbor report received for
> ea:9e:38:78:07:b1: ch 44 (oper class 0), MD not set
> Sep 21 15:40:22.141734 p3-683 iwd[425]:
> src/netdev.c:netdev_unicast_notify() Unicast notification 59

We're getting a CMD_FRAME here

> Sep 21 15:40:22.142298 p3-683 iwd[425]:
> src/netdev.c:netdev_unicast_notify() Unicast notification 59

And here.  Wonder what they are?

> Sep 21 15:40:22.166050 p3-683 kernel: wlan0: Limiting TX power to 27
> (30 - 3) dBm as advertised by ea:9e:38:78:08:71
> Sep 21 15:40:22.473706 p3-683 iwd[425]:
> src/netdev.c:netdev_mlme_notify() MLME notification Notify CQM(64)
> Sep 21 15:41:12.767433 p3-683 iwd[425]:
> src/netdev.c:netdev_mlme_notify() MLME notification Notify CQM(64)

Two CQM events.  We need to know what they actually contain.

> Sep 21 15:46:41.998639 p3-683 kernel: wlan0: deauthenticated from
> ea:9e:38:78:08:71 (Reason: 34=DISASSOC_LOW_ACK)

Hmm, never seen this before.  I wonder if the power limit event the kernel 
prints about earlier has something to do with this?

Regards,
-Denis

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Possible bug - DISASSOC_LOW_ACK
@ 2021-09-22 14:53 Michael Johnson
  0 siblings, 0 replies; 3+ messages in thread
From: Michael Johnson @ 2021-09-22 14:53 UTC (permalink / raw)
  To: iwd

[-- Attachment #1: Type: text/plain, Size: 1736 bytes --]

Hi Denis,

On Wed, 22 Sept 2021 at 15:13, Denis Kenzior <denkenz(a)gmail.com> wrote:
> This is indeed strange.  iwmon would capture this, but maybe it is easier to put
> in a bit more debugging into the CQM paths and see if we get some more clues?
> Maybe the CQM events are propagated correctly but the signal isn't below the
> roaming threshold?

Maybe. I'll try and capture the contents of the CQM packets when it happens.

> > I'm not sure how to reproduce this as it's relatively rare, happening
> > 3 times in about a week. The logs below are from a machine running
> > version 1.17.
> > I'll see if I can start a rolling iwmon capture to get more info. Is
> > there anything else that would be useful?
>
> Might be chasing ghosts, but the only other thing I see is a a few CMD_FRAME
> events (see below) that we receive which are not recognized?  Or at least I
> don't see any logs of them being parsed.  iwmon trace will capture these, but
> may be easier to log them in your case.

I can do that. I'll add extra logging to those `_notify()` methods and
update when I see it happen again.

> Did we just roam, or is something else going on?

We roamed at 15:40:21.954094 and then lost comms at 15:42:30.885817.
I've put the full logs here https://pastebin.com/FJxpGmfd.

> > Sep 21 15:46:41.998639 p3-683 kernel: wlan0: deauthenticated from
> > ea:9e:38:78:08:71 (Reason: 34=DISASSOC_LOW_ACK)
>
> Hmm, never seen this before.  I wonder if the power limit event the kernel
> prints about earlier has something to do with this?

The "Limit TX power" message is always printed after a roam. I'm not
entirely sure why but I believe APs to advertise which power level to
use?

Regards,
Michael

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Possible bug - DISASSOC_LOW_ACK
@ 2021-09-22  8:59 Michael Johnson
  0 siblings, 0 replies; 3+ messages in thread
From: Michael Johnson @ 2021-09-22  8:59 UTC (permalink / raw)
  To: iwd

[-- Attachment #1: Type: text/plain, Size: 3265 bytes --]

Hi everyone,

In my use case that involves heavy roaming, I've been noticing that
occasionally `iwd` disconnects completely from the network before
rejoining which can cause a few seconds of downtime. In addition to
the downtime it seems to put the network in an odd state which is
hitting a bunch of corner cases in code that uses the network more
than just a normal disconnect. Looking at the logs, it seems like iwd
might be getting locked up somewhere as it also stops reacting to
threshold notifications?

I'm not sure how to reproduce this as it's relatively rare, happening
3 times in about a week. The logs below are from a machine running
version 1.17.
I'll see if I can start a rolling iwmon capture to get more info. Is
there anything else that would be useful?

```
Sep 21 15:40:22.138376 p3-683 iwd[425]:
src/netdev.c:netdev_mlme_notify() MLME notification Frame TX
Status(60)
Sep 21 15:40:22.140101 p3-683 iwd[425]:
src/netdev.c:netdev_unicast_notify() Unicast notification 59
Sep 21 15:40:22.140101 p3-683 iwd[425]:
src/station.c:station_early_neighbor_report_cb() ifindex: 5, error:
0()
Sep 21 15:40:22.140101 p3-683 iwd[425]:
src/station.c:parse_neighbor_report() Neighbor report received for
ea:9e:38:78:07:b1: ch 44 (oper class 0), MD not set
Sep 21 15:40:22.141734 p3-683 iwd[425]:
src/netdev.c:netdev_unicast_notify() Unicast notification 59
Sep 21 15:40:22.142298 p3-683 iwd[425]:
src/netdev.c:netdev_unicast_notify() Unicast notification 59
Sep 21 15:40:22.166050 p3-683 kernel: wlan0: Limiting TX power to 27
(30 - 3) dBm as advertised by ea:9e:38:78:08:71
Sep 21 15:40:22.473706 p3-683 iwd[425]:
src/netdev.c:netdev_mlme_notify() MLME notification Notify CQM(64)
Sep 21 15:41:12.767433 p3-683 iwd[425]:
src/netdev.c:netdev_mlme_notify() MLME notification Notify CQM(64)
Sep 21 15:46:41.998639 p3-683 kernel: wlan0: deauthenticated from
ea:9e:38:78:08:71 (Reason: 34=DISASSOC_LOW_ACK)
Sep 21 15:46:41.997194 p3-683 systemd-networkd[341]: wlan0: Lost carrier
Sep 21 15:46:41.999173 p3-683 iwd[425]:
src/netdev.c:netdev_link_notify() event 16 on ifindex 5
Sep 21 15:46:42.107670 p3-683 iwd[425]:
src/netdev.c:netdev_mlme_notify() MLME notification Del Station(20)
Sep 21 15:46:42.111369 p3-683 iwd[425]:
src/netdev.c:netdev_mlme_notify() MLME notification Deauthenticate(39)
Sep 21 15:46:42.111369 p3-683 iwd[425]:
src/netdev.c:netdev_deauthenticate_event()
Sep 21 15:46:42.111369 p3-683 iwd[425]:
src/netdev.c:netdev_mlme_notify() MLME notification Disconnect(48)
Sep 21 15:46:42.111369 p3-683 iwd[425]: src/netdev.c:netdev_disconnect_event()
Sep 21 15:46:42.111369 p3-683 iwd[425]: Received Deauthentication
event, reason: 34, from_ap: true
Sep 21 15:46:42.111369 p3-683 iwd[425]:
src/station.c:station_disconnect_event() 5
Sep 21 15:46:42.111369 p3-683 iwd[425]: src/station.c:station_disassociated() 5
Sep 21 15:46:42.111369 p3-683 iwd[425]:
src/resolve.c:resolve_systemd_revert() ifindex: 5
Sep 21 15:46:42.136257 p3-683 iwd[425]:
src/station.c:station_enter_state() Old State: connected, new state:
disconnected
Sep 21 15:46:42.136257 p3-683 iwd[425]:
src/station.c:station_enter_state() Old State: disconnected, new
state: autoconnect_quick
```

Kind Regards,
Michael

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-09-22 14:53 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-22 13:56 Possible bug - DISASSOC_LOW_ACK Denis Kenzior
  -- strict thread matches above, loose matches on Subject: below --
2021-09-22 14:53 Michael Johnson
2021-09-22  8:59 Michael Johnson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.