iwd.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/1] Fix wiphy regdom logic
@ 2023-05-10 20:45 James Prestwood
  2023-05-10 20:45 ` [PATCH v2 1/1] wiphy: fix regdom change wiphy dump logic James Prestwood
  0 siblings, 1 reply; 6+ messages in thread
From: James Prestwood @ 2023-05-10 20:45 UTC (permalink / raw)
  To: iwd; +Cc: James Prestwood

Below are some logs when this behavior was seen. IWD never
transitions out of the ft-roaming state due to the key setting
callbacks never being called. Its hard to know exactly what
messages were sent/received since its coming from iwmon but
netlink appears to completely lock up until a watchdog restarts
the IWD service.

After the patch was applied there were zero instances of this
behavior which seems to indicate the kernel is unable to handle
multiple GET_WIPHY calls.

May 10 00:11:54 iwd[336880]: src/station.c:station_try_next_transition() 14, target aa:46:8d:1c:16:e6
May 10 00:11:54 iwd[336880]: src/wiphy.c:wiphy_radio_work_insert() Inserting work item 14498
May 10 00:11:54 iwd[336880]: src/wiphy.c:wiphy_radio_work_insert() Inserting work item 14499
May 10 00:11:54 iwd[336880]: src/wiphy.c:wiphy_radio_work_done() Work item 14497 done
May 10 00:11:54 iwd[336880]: src/wiphy.c:wiphy_radio_work_next() Starting work item 14498
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_mlme_notify() MLME notification Remain on Channel(55)
May 10 00:11:54 iwd[336880]: src/ft.c:ft_send_authenticate()
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_mlme_notify() MLME notification Frame TX Status(60)
May 10 00:11:54 kernel: wlan0: disconnect from AP aa:46:8d:1c:17:25 for new assoc to aa:46:8d:1c:16:e6
May 10 00:11:54 locus[917]: State changed to roaming
May 10 00:11:54 locus[917]: Start 10000ms timeout waiting for connected/roaming
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_unicast_notify() Unicast notification Frame(59)
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_mlme_notify() MLME notification Cancel Remain on Channel(56)
May 10 00:11:54 iwd[336880]: src/wiphy.c:wiphy_radio_work_done() Work item 14498 done
May 10 00:11:54 iwd[336880]: src/wiphy.c:wiphy_radio_work_next() Starting work item 14499
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_cqm_rssi_update()
May 10 00:11:54 iwd[336880]: src/station.c:station_enter_state() Old State: connected, new state: ft-roaming
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_mlme_notify() MLME notification Frame Wait Cancel(67)
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_mlme_notify() MLME notification Del Station(20)
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_link_notify() event 16 on ifindex 14
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_mlme_notify() MLME notification Deauthenticate(39)
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_deauthenticate_event()
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_mlme_notify() MLME notification Disconnect(48)
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_disconnect_event()
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_mlme_notify() MLME notification New Station(19)
May 10 00:11:54 kernel: wlan0: associate with aa:46:8d:1c:16:e6 (try 1/3)
May 10 00:11:54 kernel: wlan0: RX ReassocResp from aa:46:8d:1c:16:e6 (capab=0x1511 status=0 aid=7)
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_link_notify() event 16 on ifindex 14
May 10 00:11:54 iwd[336880]: src/wiphy.c:wiphy_reg_notify() Notification of command Reg Change(36)
May 10 00:11:54 iwd[336880]: src/wiphy.c:wiphy_update_reg_domain() New reg domain country code for (global) is XX
May 10 00:11:54 systemd-networkd[393]: wlan0: Lost carrier
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_mlme_notify() MLME notification Associate(38)
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_associate_event()
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_set_gtk() ifindex=14 key_idx=2
May 10 00:11:54 iwd[336880]: src/station.c:station_handshake_event() Setting keys
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_set_tk() ifindex=14 key_idx=0
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_link_notify() event 16 on ifindex 14
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_mlme_notify() MLME notification Connect(46)
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_connect_event()
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_connect_event() aborting and ignore_connect_event not set, proceed
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_connect_event() expect_connect_failure not set, proceed
May 10 00:11:54 iwd[336880]: src/netdev.c:parse_request_ies()
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_connect_event() Request / Response IEs parsed
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_get_oci()
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_link_notify() event 16 on ifindex 14
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_link_notify() event 16 on ifindex 14
May 10 00:11:54 iwd[336880]: src/netdev.c:netdev_link_notify() event 16 on ifindex 14
May 10 00:11:54 iwd[336880]: src/wiphy.c:wiphy_reg_notify() Notification of command Reg Change(36)
May 10 00:11:54 iwd[336880]: src/wiphy.c:wiphy_update_reg_domain() New reg domain country code for (global) is US
May 10 00:11:54 iwd[336880]: src/wiphy.c:wiphy_cancel_last_dump() Canceling pending regdom wiphy dump (global)
May 10 00:11:54 kernel: wlan0: associated
May 10 00:11:54 kernel: ath: EEPROM regdomain: 0x8348
May 10 00:11:54 kernel: ath: EEPROM indicates we should expect a country code
May 10 00:11:54 kernel: ath: doing EEPROM country->regdmn map search
May 10 00:11:54 kernel: ath: country maps to regdmn code: 0x3a
May 10 00:11:54 kernel: ath: Country alpha2 being used: US
May 10 00:11:54 kernel: ath: Regpair used: 0x3a
May 10 00:11:54 kernel: ath: regdomain 0x8348 dynamically updated by country element
May 10 00:11:55 iwd[336880]: src/netdev.c:netdev_mlme_notify() MLME notification Notify CQM(64)
May 10 00:11:55 iwd[336880]: src/netdev.c:netdev_cqm_event() Signal change event (above=0 signal=-76)

--- Nothing here for 10 seconds, until the watchdog triggers ---

May 10 00:12:04 locus[917]: IWD watchdog triggered (reason=state)
May 10 00:12:04 iwd[336880]: Terminate
May 10 00:12:04 iwd[336880]: src/netdev.c:netdev_free() Freeing netdev wlan0[14]
May 10 00:12:04 iwd[336880]: src/device.c:device_free()
May 10 00:12:04 iwd[336880]: src/station.c:station_free()
May 10 00:12:04 iwd[336880]: src/netconfig.c:netconfig_destroy()
May 10 00:12:04 iwd[336880]: src/netconfig.c:netconfig_event_handler() l_netconfig event 2
May 10 00:12:04 iwd[336880]: src/netconfig-commit.c:netconfig_commit_print_addrs() removing address: 10.193.158.118
May 10 00:12:04 iwd[336880]: src/resolve.c:resolve_systemd_revert() ifindex: 14
May 10 00:12:04 systemd[1]: Stopping Wireless service...

The iwmon logs are lengthy, but I've included the bit where the
logs abruptly stop for 10 seconds until the watchdog kicks in.
Prior there was no key setting or anything connection related
after the connect event.

May 10 00:11:55:     Extended Capabilities: len 8
May 10 00:11:55:         Capability: bit  2: Extended channel switching
May 10 00:11:55:         Capability: bit 62: Opmode Notification
May 10 00:11:55:     Extended Capabilities Mask: 170 len 8
May 10 00:12:04:         04 00 00 00 00 00 00 40                          .......@
May 10 00:12:04:     VHT Capability Mask: 176 len 12
May 10 00:12:04:         f0 1f 80 33 ff ff 00 00 ff ff 00 00              ...3........
May 10 00:12:04:     MAC Address 00:0E:8E:A3:02:FE
May 10 00:12:04: > Result: New Wiphy (0x03) len 36 [multi]                     1683677515.942843
May 10 00:12:04:     Wiphy: 0 (0x00000000)
May 10 00:12:04:     Wiphy Name: phy0
May 10 00:12:04:     Generation: 1 (0x00000001)
May 10 00:12:04:     Max AP Assoc Station: 202 len 4
May 10 00:12:04:         20 00 00 00    

James Prestwood (1):
  wiphy: fix regdom change wiphy dump logic

 src/wiphy.c | 103 +++++++++++++++++++++++++++-------------------------
 1 file changed, 54 insertions(+), 49 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2 1/1] wiphy: fix regdom change wiphy dump logic
  2023-05-10 20:45 [PATCH v2 0/1] Fix wiphy regdom logic James Prestwood
@ 2023-05-10 20:45 ` James Prestwood
  2023-05-17  0:53   ` Denis Kenzior
  0 siblings, 1 reply; 6+ messages in thread
From: James Prestwood @ 2023-05-10 20:45 UTC (permalink / raw)
  To: iwd; +Cc: James Prestwood

Certain drivers change the regulatory domain when the client roams.
First XX, then the country the roam target is using. Depending on the
timing the second regdom event can come while the first is still
dumping. This was not handled gracefully/nicely in IWD and it would
just issue another dump (it tries to cancel but if the dump started
there isn't anything that can be done). AFAIK the kernel should be
able to handle this but based on the behavior we saw it seems to not
be able to, or at least not reliably.

When the second dump comes in and GET_WIPHY is issued the kernel
stops replying to any messages which obviously causes IWD to completely
hang up waiting for various acks (set_station, set_key, etc.).

Now if another dump is ongoing (and cant be safely canceled) a flag
will be set when another dump is needed. The first dump will be
waited for and only once completed will another start. This also
ensures there is at most 1 pending dump when before any number of
dumps could be queued which is pointless since only the last dump
request matters.
---
 src/wiphy.c | 103 +++++++++++++++++++++++++++-------------------------
 1 file changed, 54 insertions(+), 49 deletions(-)

diff --git a/src/wiphy.c b/src/wiphy.c
index 2db2d2cd..dbffa44e 100644
--- a/src/wiphy.c
+++ b/src/wiphy.c
@@ -67,6 +67,7 @@ static int mac_randomize_bytes = 6;
 static char regdom_country[2];
 static uint32_t work_ids;
 static unsigned int wiphy_dump_id;
+static bool dump_queued;
 
 enum driver_flag {
 	DEFAULT_IF = 0x1,
@@ -123,6 +124,7 @@ struct wiphy {
 	bool work_in_callback;
 	unsigned int get_reg_id;
 	unsigned int dump_id;
+	bool dump_queued : 1;
 
 	bool support_scheduled_scan:1;
 	bool support_rekey_offload:1;
@@ -2066,18 +2068,23 @@ static void wiphy_setup_rm_enabled_capabilities(struct wiphy *wiphy)
 	 */
 }
 
+static void wiphy_dump_after_regdom(struct wiphy *wiphy);
+
 static void wiphy_dump_done(void *user_data)
 {
 	struct wiphy *wiphy = user_data;
 	const struct l_queue_entry *e;
 
-	/* This dump was canceled due to another dump */
-	if ((wiphy && !wiphy->dump_id) || (!wiphy && !wiphy_dump_id))
-		return;
-
 	if (wiphy) {
 		wiphy->dump_id = 0;
 
+		if (wiphy->dump_queued) {
+			wiphy->dump_queued = false;
+			wiphy_dump_after_regdom(wiphy);
+			return;
+		}
+
+		/* No other dumps queued, emit done */
 		WATCHLIST_NOTIFY(&wiphy->state_watches,
 				wiphy_state_watch_func_t, wiphy,
 				WIPHY_STATE_WATCH_EVENT_REGDOM_DONE);
@@ -2087,6 +2094,12 @@ static void wiphy_dump_done(void *user_data)
 
 	wiphy_dump_id = 0;
 
+	if (dump_queued) {
+		dump_queued = false;
+		wiphy_dump_after_regdom(NULL);
+		return;
+	}
+
 	for (e = l_queue_get_entries(wiphy_list); e; e = e->next) {
 		wiphy = e->data;
 
@@ -2160,38 +2173,48 @@ static void wiphy_dump_callback(struct l_genl_msg *msg,
 	}
 }
 
-static bool wiphy_cancel_last_dump(struct wiphy *wiphy)
+static void wiphy_dump_after_regdom(struct wiphy *wiphy)
 {
-	unsigned int id = 0;
+	const struct l_queue_entry *e;
+	struct l_genl_msg *msg;
+	unsigned int *id = NULL;
 
-	/*
-	 * Zero command ID to signal that wiphy_dump_done doesn't need to do
-	 * anything.
-	 */
-	if (wiphy && wiphy->dump_id) {
-		id = wiphy->dump_id;
-		wiphy->dump_id = 0;
-	} else if (!wiphy && wiphy_dump_id) {
-		id = wiphy_dump_id;
-		wiphy_dump_id = 0;
-	}
+	if (wiphy)
+		id = &wiphy->dump_id;
+	else
+		id = &wiphy_dump_id;
 
-	if (id) {
-		l_debug("Canceling pending regdom wiphy dump (%s)",
+	if (*id) {
+		/*
+		 * If the dump hasn't hit the kernel we can safely cancel it.
+		 * Otherwise set the dump_queued flag and once the current dump
+		 * has completed, another will start. This also ensures there is
+		 * only at most 1 pending dump.
+		 */
+		if (!l_genl_family_request_sent(nl80211, *id)) {
+			l_debug("Canceling pending regdom wiphy dump (%s)",
 					wiphy ? wiphy->name : "global");
 
-		l_genl_family_cancel(nl80211, id);
+			l_genl_family_cancel(nl80211, *id);
+
+			/*
+			 * If there is already dump queued (dump_queued=true)
+			 * the destroy callback will end up starting another
+			 * dump. If this happens nothing more needs to be done.
+			 */
+			if (*id)
+				return;
+		} else if (wiphy) {
+			wiphy->dump_queued = true;
+			return;
+		} else {
+			dump_queued = true;
+			return;
+		}
 	}
 
-	return id != 0;
-}
-
-static void wiphy_dump_after_regdom(struct wiphy *wiphy)
-{
-	const struct l_queue_entry *e;
-	struct l_genl_msg *msg;
-	unsigned int id;
-	bool no_start_event;
+	l_debug("Dumping wiphy after regdom change (%s)",
+			wiphy ? wiphy->name : "global");
 
 	msg = l_genl_msg_new_sized(NL80211_CMD_GET_WIPHY, 128);
 
@@ -2199,37 +2222,22 @@ static void wiphy_dump_after_regdom(struct wiphy *wiphy)
 		l_genl_msg_append_attr(msg, NL80211_ATTR_WIPHY, 4, &wiphy->id);
 
 	l_genl_msg_append_attr(msg, NL80211_ATTR_SPLIT_WIPHY_DUMP, 0, NULL);
-	id = l_genl_family_dump(nl80211, msg, wiphy_dump_callback,
+	*id = l_genl_family_dump(nl80211, msg, wiphy_dump_callback,
 						wiphy, wiphy_dump_done);
-	if (!id) {
+	if (!*id) {
 		l_error("Wiphy information dump failed");
 		l_genl_msg_unref(msg);
 		return;
 	}
 
-	/*
-	 * Another update while dumping wiphy. This next dump should supercede
-	 * the first and not result in a DONE event until this new dump is
-	 * finished. This is because the disabled frequencies are in an unknown
-	 * state and could cause incorrect behavior by any watchers.
-	 */
-	no_start_event = wiphy_cancel_last_dump(wiphy);
-
 	/* Limited dump so just emit the event for this wiphy */
 	if (wiphy) {
-		wiphy->dump_id = id;
-
-		if (no_start_event)
-			return;
-
 		WATCHLIST_NOTIFY(&wiphy->state_watches,
 				wiphy_state_watch_func_t, wiphy,
 				WIPHY_STATE_WATCH_EVENT_REGDOM_STARTED);
 		return;
 	}
 
-	wiphy_dump_id = id;
-
 	/* Otherwise for a global regdom change notify for all wiphy's */
 	for (e = l_queue_get_entries(wiphy_list); e; e = e->next) {
 		struct wiphy *w = e->data;
@@ -2237,9 +2245,6 @@ static void wiphy_dump_after_regdom(struct wiphy *wiphy)
 		if (w->self_managed)
 			continue;
 
-		if (no_start_event)
-			continue;
-
 		WATCHLIST_NOTIFY(&w->state_watches, wiphy_state_watch_func_t,
 				w, WIPHY_STATE_WATCH_EVENT_REGDOM_STARTED);
 	}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 1/1] wiphy: fix regdom change wiphy dump logic
  2023-05-10 20:45 ` [PATCH v2 1/1] wiphy: fix regdom change wiphy dump logic James Prestwood
@ 2023-05-17  0:53   ` Denis Kenzior
  2023-05-17 15:11     ` James Prestwood
  0 siblings, 1 reply; 6+ messages in thread
From: Denis Kenzior @ 2023-05-17  0:53 UTC (permalink / raw)
  To: James Prestwood, iwd

Hi James,

On 5/10/23 15:45, James Prestwood wrote:
> Certain drivers change the regulatory domain when the client roams.
> First XX, then the country the roam target is using. Depending on the
> timing the second regdom event can come while the first is still
> dumping. This was not handled gracefully/nicely in IWD and it would
> just issue another dump (it tries to cancel but if the dump started
> there isn't anything that can be done). AFAIK the kernel should be
> able to handle this but based on the behavior we saw it seems to not
> be able to, or at least not reliably.

Okay.  Might be useful to know what the kernel actually expects to happen in 
case we send on a socket during an ongoing command.

> 
> When the second dump comes in and GET_WIPHY is issued the kernel
> stops replying to any messages which obviously causes IWD to completely
> hang up waiting for various acks (set_station, set_key, etc.).
> 

Sounds like genl isn't doing the right thing...

> Now if another dump is ongoing (and cant be safely canceled) a flag
> will be set when another dump is needed. The first dump will be
> waited for and only once completed will another start. This also
> ensures there is at most 1 pending dump when before any number of
> dumps could be queued which is pointless since only the last dump
> request matters.

NAK.  Please stop right there.  genl is already a queue.  There's no sense 
adding queuing on top of it in this particular case.  If l_genl_family_cancel 
isn't doing the right thing, then fix that instead.

> ---
>   src/wiphy.c | 103 +++++++++++++++++++++++++++-------------------------
>   1 file changed, 54 insertions(+), 49 deletions(-)
> 

Regards,
-Denis

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 1/1] wiphy: fix regdom change wiphy dump logic
  2023-05-17  0:53   ` Denis Kenzior
@ 2023-05-17 15:11     ` James Prestwood
  2023-05-18  3:21       ` Denis Kenzior
  0 siblings, 1 reply; 6+ messages in thread
From: James Prestwood @ 2023-05-17 15:11 UTC (permalink / raw)
  To: Denis Kenzior, iwd

Hi Denis,

On 5/16/23 5:53 PM, Denis Kenzior wrote:
> Hi James,
> 
> On 5/10/23 15:45, James Prestwood wrote:
>> Certain drivers change the regulatory domain when the client roams.
>> First XX, then the country the roam target is using. Depending on the
>> timing the second regdom event can come while the first is still
>> dumping. This was not handled gracefully/nicely in IWD and it would
>> just issue another dump (it tries to cancel but if the dump started
>> there isn't anything that can be done). AFAIK the kernel should be
>> able to handle this but based on the behavior we saw it seems to not
>> be able to, or at least not reliably.
> 
> Okay.  Might be useful to know what the kernel actually expects to 
> happen in case we send on a socket during an ongoing command.

I can try and get this to happen locally to investigate further, but so 
far its been a rare case. Maybe happening 1 time in a 24 hour period on 
20-30 clients. I'm thinking maybe a locking/race problem in 
nl80211_dump_wiphy(), but its just a guess.

> 
>>
>> When the second dump comes in and GET_WIPHY is issued the kernel
>> stops replying to any messages which obviously causes IWD to completely
>> hang up waiting for various acks (set_station, set_key, etc.).
>>
> 
> Sounds like genl isn't doing the right thing...
> 
>> Now if another dump is ongoing (and cant be safely canceled) a flag
>> will be set when another dump is needed. The first dump will be
>> waited for and only once completed will another start. This also
>> ensures there is at most 1 pending dump when before any number of
>> dumps could be queued which is pointless since only the last dump
>> request matters.
> 
> NAK.  Please stop right there.  genl is already a queue.  There's no 
> sense adding queuing on top of it in this particular case.  If 
> l_genl_family_cancel isn't doing the right thing, then fix that instead.

Well l_genl_family_cancel really cant do anything about it. Once the 
dump hits the kernel you cant cancel it (this is why we added 
l_genl_family_request_sent). So the only alternative I can think of, if 
we cant do it in IWD, would be to make l_genl queue dumps and not issue 
more dumps until the previous one finishes. This is obviously a 
significant behavioral change.

> 
>> ---
>>   src/wiphy.c | 103 +++++++++++++++++++++++++++-------------------------
>>   1 file changed, 54 insertions(+), 49 deletions(-)
>>
> 
> Regards,
> -Denis

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 1/1] wiphy: fix regdom change wiphy dump logic
  2023-05-17 15:11     ` James Prestwood
@ 2023-05-18  3:21       ` Denis Kenzior
  2023-05-18 14:03         ` James Prestwood
  0 siblings, 1 reply; 6+ messages in thread
From: Denis Kenzior @ 2023-05-18  3:21 UTC (permalink / raw)
  To: James Prestwood, iwd

Hi James,

>> NAK.  Please stop right there.  genl is already a queue.  There's no sense 
>> adding queuing on top of it in this particular case.  If l_genl_family_cancel 
>> isn't doing the right thing, then fix that instead.
> 
> Well l_genl_family_cancel really cant do anything about it. Once the dump hits 
> the kernel you cant cancel it (this is why we added l_genl_family_request_sent). 

This was done for a different reason.  In the case of CMD_REMAIN_ON_CHANNEL we 
must wait for the ack to obtain the cookie.  We can cancel the ROC only if we 
know the cookie.  Otherwise, even if we call l_genl_family_cancel, the ROC would 
still go ahead.

What you have here is a different and simpler case.

> So the only alternative I can think of, if we cant do it in IWD, would be to 
> make l_genl queue dumps and not issue more dumps until the previous one 
> finishes. This is obviously a significant behavioral change.

I'm lost?  That is what is supposed to happen and happens already today.  If you 
issue two dumps like:

l_genl_family_dump(..., msg1, ...);
l_genl_family_dump(..., msg2, ...);

Then msg2 will not be sent to the kernel until NLMSG_DONE flagged message is 
received.  So what exactly would be the behavioral change you are referring to?

Anyway, looking at l_genl_family_cancel, it is doing the wrong thing, so that 
should be fixed.

Regards,
-Denis

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 1/1] wiphy: fix regdom change wiphy dump logic
  2023-05-18  3:21       ` Denis Kenzior
@ 2023-05-18 14:03         ` James Prestwood
  0 siblings, 0 replies; 6+ messages in thread
From: James Prestwood @ 2023-05-18 14:03 UTC (permalink / raw)
  To: Denis Kenzior, iwd

Hi Denis,

On 5/17/23 8:21 PM, Denis Kenzior wrote:
> Hi James,
> 
>>> NAK.  Please stop right there.  genl is already a queue.  There's no 
>>> sense adding queuing on top of it in this particular case.  If 
>>> l_genl_family_cancel isn't doing the right thing, then fix that instead.
>>
>> Well l_genl_family_cancel really cant do anything about it. Once the 
>> dump hits the kernel you cant cancel it (this is why we added 
>> l_genl_family_request_sent). 
> 
> This was done for a different reason.  In the case of 
> CMD_REMAIN_ON_CHANNEL we must wait for the ack to obtain the cookie.  We 
> can cancel the ROC only if we know the cookie.  Otherwise, even if we 
> call l_genl_family_cancel, the ROC would still go ahead.
> 
> What you have here is a different and simpler case.
> 
>> So the only alternative I can think of, if we cant do it in IWD, would 
>> be to make l_genl queue dumps and not issue more dumps until the 
>> previous one finishes. This is obviously a significant behavioral change.
> 
> I'm lost?  That is what is supposed to happen and happens already 
> today.  If you issue two dumps like:
> 
> l_genl_family_dump(..., msg1, ...);
> l_genl_family_dump(..., msg2, ...);
> 
> Then msg2 will not be sent to the kernel until NLMSG_DONE flagged 
> message is received.  So what exactly would be the behavioral change you 
> are referring to?

Ok this was not my understanding of how l_genl messaging worked. I 
thought you could have multiple in-flight messages. But looking at it 
you can't, the IO writer only sends a single message.

So I'll fix l_genl_family_cancel to not remove the message from the 
pending queue until we get NLMSG_DONE.

> 
> Anyway, looking at l_genl_family_cancel, it is doing the wrong thing, so 
> that should be fixed.
> 
> Regards,
> -Denis

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-05-18 14:03 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-10 20:45 [PATCH v2 0/1] Fix wiphy regdom logic James Prestwood
2023-05-10 20:45 ` [PATCH v2 1/1] wiphy: fix regdom change wiphy dump logic James Prestwood
2023-05-17  0:53   ` Denis Kenzior
2023-05-17 15:11     ` James Prestwood
2023-05-18  3:21       ` Denis Kenzior
2023-05-18 14:03         ` James Prestwood

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).