From: Denis Kenzior <denkenz@gmail.com>
To: James Prestwood <prestwoj@gmail.com>, iwd@lists.linux.dev
Subject: Re: [PATCH] station: fix reentrant FT wiphy radio work insertion
Date: Mon, 8 May 2023 12:56:20 -0500 [thread overview]
Message-ID: <7c38d268-7d00-e16c-ff18-de198ba11987@gmail.com> (raw)
In-Reply-To: <20230501220604.366100-1-prestwoj@gmail.com>
Hi James,
On 5/1/23 17:06, James Prestwood wrote:
> The failure path for FT was inserting the same work item inside
> the 'do_work' callback. This caused duplicate work items to be
> inside the radio work queue, and upon destroy the ID would get
> zero'ed out. This caused a host of valgrind complaints when this
> behavior is triggered.
>
> This can be seen with slight modifiations to the FT test:
Typo here, 'modifiations'
>
> src/station.c:station_try_next_transition() 9, target 12:00:00:00:00:02
> src/wiphy.c:wiphy_radio_work_insert() Inserting work item 6
> src/wiphy.c:wiphy_radio_work_insert() Inserting work item 7
> src/wiphy.c:wiphy_radio_work_done() Work item 5 done
> src/wiphy.c:wiphy_radio_work_next() Starting work item 6
> src/netdev.c:netdev_mlme_notify() MLME notification Remain on Channel(55)
> src/ft.c:ft_send_authenticate()
> src/netdev.c:netdev_mlme_notify() MLME notification Frame TX Status(60)
> src/netdev.c:netdev_mlme_notify() MLME notification Frame Wait Cancel(67)
> src/netdev.c:netdev_mlme_notify() MLME notification Cancel Remain on Channel(56)
> src/wiphy.c:wiphy_radio_work_done() Work item 6 done
>
> --- Work item 7 is 'station_ft_work_ready()' ---
>
> src/wiphy.c:wiphy_radio_work_next() Starting work item 7
> src/station.c:station_debug_event() StationDebug.Event(ft-roam-failed)
> src/station.c:station_try_next_transition() 9, target 12:00:00:00:00:03
> src/wiphy.c:wiphy_radio_work_insert() Inserting work item 8
>
> --- Here we insert work item 9, which is a duplicate of 7 ---
>
Duplicate in what way? We're going to a different BSS? Or do you mean that we
overwrite station->ft_work because we queue the same work item in the work queue
callback?
> src/wiphy.c:wiphy_radio_work_insert() Inserting work item 9
> src/wiphy.c:wiphy_radio_work_next() Starting work item 8
> src/netdev.c:netdev_mlme_notify() MLME notification Remain on Channel(55)
> src/ft.c:ft_send_authenticate()
> src/netdev.c:netdev_mlme_notify() MLME notification Frame TX Status(60)
> src/netdev.c:netdev_mlme_notify() MLME notification Frame Wait Cancel(67)
> src/netdev.c:netdev_mlme_notify() MLME notification Cancel Remain on Channel(56)
> src/wiphy.c:wiphy_radio_work_done() Work item 8 done
>
> --- And finally, the duplicated work item starts ---
>
> src/wiphy.c:wiphy_radio_work_next() Starting work item 0
> src/station.c:station_debug_event() StationDebug.Event(ft-roam-failed)
> src/station.c:station_roam_failed() 9
> src/station.c:station_roam_trigger_cb() 9
> ---
> src/station.c | 19 ++++++++++++++++++-
> 1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/src/station.c b/src/station.c
> index 0d8f7628..3c6e43ce 100644
> --- a/src/station.c
> +++ b/src/station.c
> @@ -122,6 +122,7 @@ struct station {
>
> uint32_t wiphy_watch;
>
> + struct l_idle *ft_idle;
> struct wiphy_radio_work_item ft_work;
>
> bool preparing_roam : 1;
> @@ -1739,6 +1740,11 @@ static void station_roam_state_clear(struct station *station)
> station->roam_freqs = NULL;
> }
>
> + if (station->ft_idle) {
> + l_idle_remove(station->ft_idle);
> + station->ft_idle = NULL;
> + }
> +
> l_queue_clear(station->roam_bss_list, l_free);
>
> ft_clear_authentications(netdev_get_ifindex(station->netdev));
> @@ -2282,6 +2288,16 @@ static void station_preauthenticate_cb(struct netdev *netdev,
>
> static void station_transition_start(struct station *station);
>
> +static void station_transition_idle(struct l_idle *idle, void *user_data)
> +{
> + struct station *station = user_data;
> +
> + l_idle_remove(station->ft_idle);
> + station->ft_idle = NULL;
> +
> + station_transition_start(station);
> +}
> +
> static bool station_ft_work_ready(struct wiphy_radio_work_item *item)
> {
> struct station *station = l_container_of(item, struct station, ft_work);
> @@ -2300,7 +2316,8 @@ static bool station_ft_work_ready(struct wiphy_radio_work_item *item)
> if (ret == -ENOENT) {
> station_debug_event(station, "ft-roam-failed");
> try_next:
> - station_transition_start(station);
> + station->ft_idle = l_idle_create(station_transition_idle,
> + station, NULL);
> return true;
> } else if (ret < 0)
> goto assoc_failed;
The patch makes sense to me. However, I do wonder whether we can take care of
this in the wiphy work queue itself? Maybe by adding wiphy_work_reschedule() or
something to that effect? I would hate to keep doing this l_idle business every
time we need to do something similar.
Regards,
-Denis
next prev parent reply other threads:[~2023-05-08 18:06 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-01 22:06 [PATCH] station: fix reentrant FT wiphy radio work insertion James Prestwood
2023-05-08 17:56 ` Denis Kenzior [this message]
2023-05-08 18:10 ` James Prestwood
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7c38d268-7d00-e16c-ff18-de198ba11987@gmail.com \
--to=denkenz@gmail.com \
--cc=iwd@lists.linux.dev \
--cc=prestwoj@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).