linux-wireless.vger.kernel.org archive mirror
* Re: Kernel oops / WiFi connection failure with wpa_supplicant 2.7
       [not found] ` <20190103154921.GA25015@w1.fi>
@ 2019-01-05 19:44   ` Arend Van Spriel
  2019-01-08 17:44     ` Denis Kenzior
  0 siblings, 1 reply; 7+ messages in thread
From: Arend Van Spriel @ 2019-01-05 19:44 UTC (permalink / raw)
  To: Jouni Malinen, Eric Blau
  Cc: hostap, linux-wireless, Johannes Berg, Denis Kenzior

On 1/3/2019 4:49 PM, Jouni Malinen wrote:
> On Thu, Jan 03, 2019 at 10:38:32AM -0500, Eric Blau wrote:
>> Since upgrading to wpa_supplicant 2.7, myself and many others have hit
>> issues with wpa_supplicant failing to connect due to invalid arguments
>> being passed to the underlying kernel driver. Reverting to version 2.6
>> makes these issues go away.
> 
>> kernel: WARNING: CPU: 0 PID: 16169 at
>> drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c:5130
>> brcmf_cfg80211_set_pmk+0x50/0x70 [brcmfmac]
> 
> Which is this WARN_ON in the driver:
> 
>      /* expect using firmware supplicant for 1X */
>      ifp = netdev_priv(dev);
>      if (WARN_ON(ifp->vif->profile.use_fwsup != BRCMF_PROFILE_FWSUP_1X))
> 	return -EINVAL;

Yes. It means the firmware is not configured to use the 1x offload, so the 
driver rejects the PMK setting here.

>> Notice that the oops references wpa_supplicant as the offending
>> process, although maybe the firmware or driver is at fault for
>> advertising 4-way handshake offload support.
> 
> That's not an oops and wpa_supplicant is not the "offending process", it
> is just the user space process in which context the driver hits this
> issue.

Well, I beg to differ but I will get to that.

>> Any ideas what the issue could be here? If there's anything else I can
>> do to help track down the problem, please let me know.
> 
> That should be reported to the maintainers of the kernel driver that has
> this issue:

So the issue is that the nl80211 API requires wpa_supplicant to provide 
an attribute in NL80211_CMD_CONNECT to indicate that the driver/firmware 
should do the 1x offload, as described in the second paragraph below:

/**
  * DOC: WPA/WPA2 EAPOL handshake offload
  *
  * By setting @NL80211_EXT_FEATURE_4WAY_HANDSHAKE_STA_PSK flag drivers
  * can indicate they support offloading EAPOL handshakes for WPA/WPA2
  * preshared key authentication. In %NL80211_CMD_CONNECT the preshared
  * key should be specified using %NL80211_ATTR_PMK. Drivers supporting
  * this offload may reject the %NL80211_CMD_CONNECT when no preshared
  * key material is provided, for example when that driver does not
  * support setting the temporal keys through %CMD_NEW_KEY.
  *
  * Similarly @NL80211_EXT_FEATURE_4WAY_HANDSHAKE_STA_1X flag can be
  * set by drivers indicating offload support of the PTK/GTK EAPOL
  * handshakes during 802.1X authentication. In order to use the offload
  * the %NL80211_CMD_CONNECT should have %NL80211_ATTR_WANT_1X_4WAY_HS
  * attribute flag. Drivers supporting this offload may reject the
  * %NL80211_CMD_CONNECT when the attribute flag is not present.
  *
  * For 802.1X the PMK or PMK-R0 are set by providing %NL80211_ATTR_PMK
  * using %NL80211_CMD_SET_PMK. For offloaded FT support also
  * %NL80211_ATTR_PMKR0_NAME must be provided.
  */

For testing I had modified wpa_supplicant to add the required flag 
to the CONNECT command, but it was a bit too hacky to submit. I will rebase 
those changes and clean them up.
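
To illustrate the interaction described above, here is a minimal sketch of how the missing flag could lead to the -EINVAL from the set_pmk path. It is a simplified model, not the brcmfmac code: every type and function name (model_vif, model_connect, model_set_pmk, and so on) is hypothetical, and the connect-side behaviour is an assumption based on the quoted nl80211 documentation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins; NOT the real brcmfmac or nl80211 definitions. */
enum model_fwsup { MODEL_FWSUP_NONE, MODEL_FWSUP_1X };

struct model_connect_req {
	bool want_1x_4way_hs; /* models the NL80211_ATTR_WANT_1X_4WAY_HS flag */
};

struct model_vif {
	enum model_fwsup use_fwsup;
};

/* Assumed connect path: 1X offload is armed only when userspace asks for it. */
static void model_connect(struct model_vif *vif,
			  const struct model_connect_req *req)
{
	assert(vif && req);
	vif->use_fwsup = req->want_1x_4way_hs ? MODEL_FWSUP_1X : MODEL_FWSUP_NONE;
}

/* Mirrors the quoted check: setting a PMK fails unless 1X offload is armed. */
static int model_set_pmk(const struct model_vif *vif)
{
	assert(vif);
	return vif->use_fwsup == MODEL_FWSUP_1X ? 0 : -EINVAL;
}
```

In this model, a connect request without the flag followed by model_set_pmk() returns -EINVAL, which corresponds to the warning in the report; with the flag set, the same call succeeds.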

However, there is more to it. When these offloads were introduced, we 
discussed whether to have a PORT_AUTHORIZED event. It was decided 
passing an attribute in CONNECT and ROAMED event would suffice and that 
is what was implemented in brcmfmac. However, it seems time passed and 
the need for an explicit PORT_AUTHORIZED was there (probably Denis 
knows), which wpa_supplicant now supports thus ignoring the attribute in 
the CONNECT and ROAMED events. The brcmfmac driver was not changed 
accordingly. For this there are patches pending in linux-wireless which 
are necessary to have a working connection.
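
The mismatch between the two signalling schemes can be sketched as follows. This is only an illustrative model under stated assumptions: the event and policy names are hypothetical and are not nl80211 identifiers, and the real wpa_supplicant and driver logic is more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event model; names are illustrative, not nl80211 identifiers. */
enum model_event { MODEL_EV_CONNECT, MODEL_EV_ROAMED, MODEL_EV_PORT_AUTHORIZED };

struct model_msg {
	enum model_event type;
	bool authorized_attr; /* the attribute carried in CONNECT/ROAMED events */
};

enum model_policy {
	MODEL_TRUST_ATTRIBUTE,    /* original scheme: read the attribute */
	MODEL_WAIT_EXPLICIT_EVENT /* later scheme: wait for PORT_AUTHORIZED */
};

/* Replays events and reports whether userspace ends up with the port open. */
static bool model_port_authorized(enum model_policy policy,
				  const struct model_msg *msgs, size_t n)
{
	bool authorized = false;

	assert(msgs || n == 0);
	for (size_t i = 0; i < n; i++) {
		switch (msgs[i].type) {
		case MODEL_EV_CONNECT:
		case MODEL_EV_ROAMED:
			/* Each (re)association restarts the authorization state. */
			authorized = policy == MODEL_TRUST_ATTRIBUTE &&
				     msgs[i].authorized_attr;
			break;
		case MODEL_EV_PORT_AUTHORIZED:
			if (policy == MODEL_WAIT_EXPLICIT_EVENT)
				authorized = true;
			break;
		}
	}
	return authorized;
}
```

A driver that only sets the attribute never opens the port for a userspace that waits for the explicit event, which is the situation the pending patches are meant to address.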

Regards,
Arend

* Re: Kernel oops / WiFi connection failure with wpa_supplicant 2.7
  2019-01-05 19:44   ` Kernel oops / WiFi connection failure with wpa_supplicant 2.7 Arend Van Spriel
@ 2019-01-08 17:44     ` Denis Kenzior
  2019-01-14 20:12       ` Arend Van Spriel
  0 siblings, 1 reply; 7+ messages in thread
From: Denis Kenzior @ 2019-01-08 17:44 UTC (permalink / raw)
  To: Arend Van Spriel, Jouni Malinen, Eric Blau
  Cc: hostap, linux-wireless, Johannes Berg

Hi Arend,

> However, there is more to it. When these offloads were introduced, we 
> discussed whether to have a PORT_AUTHORIZED event. It was decided 
> passing an attribute in CONNECT and ROAMED event would suffice and that 
> is what was implemented in brcmfmac. However, it seems time passed and 
> the need for an explicit PORT_AUTHORIZED was there (probably Denis 
> knows), which wpa_supplicant now supports thus ignoring the attribute in 
> the CONNECT and ROAMED events. The brcmfmac driver was not changed 
> accordingly. For this there are patches pending in linux-wireless which 
> are necessary to have a working connection.
> 

Coming in a bit late to this discussion, but it does raise a few points 
I wouldn't mind some clarification on:

- With commit 503c1fb98ba3, the kernel effectively changed the userspace 
API.  So I take it that breaking userspace APIs is OK sometimes? If so, 
I have lots of suggestions to make ;)

- Is RTNL LINK_MODE / OPER_STATE status being (supposed to be?) affected 
by the driver during a roam?  E.g. if we're in a 802.1X network with 
userspace authentication, and driver roamed requiring a new 802.1X auth, 
then in theory the RTNL mode needs to be brought back out of UP state...

- The new API leaves a lot to be desired in terms of race conditions. 
For example, how long should userspace wait for EAPoL-EAP packets to 
arrive (before triggering its own EAPoL-Start for example) if a 
CMD_ROAMED event comes?

- What happens if userspace does send an EAPoL-Start in the middle of an 
offloaded 4-way handshake?

Regards,
-Denis

* Re: Kernel oops / WiFi connection failure with wpa_supplicant 2.7
  2019-01-08 17:44     ` Denis Kenzior
@ 2019-01-14 20:12       ` Arend Van Spriel
  2019-01-14 21:18         ` Denis Kenzior
  0 siblings, 1 reply; 7+ messages in thread
From: Arend Van Spriel @ 2019-01-14 20:12 UTC (permalink / raw)
  To: Denis Kenzior, Jouni Malinen, Eric Blau
  Cc: hostap, linux-wireless, Johannes Berg

On 1/8/2019 6:44 PM, Denis Kenzior wrote:
> Hi Arend,
> 
>> However, there is more to it. When these offloads were introduced, we 
>> discussed whether to have a PORT_AUTHORIZED event. It was decided 
>> passing an attribute in CONNECT and ROAMED event would suffice and 
>> that is what was implemented in brcmfmac. However, it seems time 
>> passed and the need for an explicit PORT_AUTHORIZED was there 
>> (probably Denis knows), which wpa_supplicant now supports thus 
>> ignoring the attribute in the CONNECT and ROAMED events. The brcmfmac 
>> driver was not changed accordingly. For this there are patches pending 
>> in linux-wireless which are necessary to have a working connection.
>>
> 
> Coming in a bit late to this discussion, but it does raise a few points 
> I wouldn't mind some clarification on:
> 
> - With commit 503c1fb98ba3, the kernel effectively changed the userspace 
> API.  So I take it that breaking userspace APIs is OK sometimes? If so, 
> I have lots of suggestions to make ;)

I bet you do :-p I think the rule of thumb is that such a change is 
acceptable when no driver provides the functionality behind the 
user-space API and/or no user-space application is using that API.

> - Is RTNL LINK_MODE / OPER_STATE status being (supposed to be?) affected 
> by the driver during a roam?  E.g. if we're in a 802.1X network with 
> userspace authentication, and driver roamed requiring a new 802.1X auth, 
> then in theory the RTNL mode needs to be brought back out of UP state...

So do you expect the driver/cfg80211 to take care of that or the 
supplicant? I assumed wpa_supplicant would be doing that.

> - The new API leaves a lot to be desired in terms of race conditions. 
> For example, how long should userspace wait for EAPoL-EAP packets to 
> arrive (before triggering its own EAPoL-Start for example) if a 
> CMD_ROAMED event comes?

I think that question applies to CMD_CONNECT as well, right? Not sure if 
the specs provide any guidance for that. I can dive into that, but maybe 
someone like Jouni or Johannes knows. If so, let me know ;-)

> - What happens if userspace does send an EAPoL-Start in the middle of an 
> offloaded 4-way handshake?

Probably those would be dropped.

Regards,
Arend

* Re: Kernel oops / WiFi connection failure with wpa_supplicant 2.7
  2019-01-14 20:12       ` Arend Van Spriel
@ 2019-01-14 21:18         ` Denis Kenzior
  2019-01-14 23:04           ` Arend Van Spriel
  0 siblings, 1 reply; 7+ messages in thread
From: Denis Kenzior @ 2019-01-14 21:18 UTC (permalink / raw)
  To: Arend Van Spriel, Jouni Malinen, Eric Blau
  Cc: hostap, linux-wireless, Johannes Berg

Hi Arend,

On 01/14/2019 02:12 PM, Arend Van Spriel wrote:
> On 1/8/2019 6:44 PM, Denis Kenzior wrote:
>> Hi Arend,
>>
>>> However, there is more to it. When these offloads were introduced, we 
>>> discussed whether to have a PORT_AUTHORIZED event. It was decided 
>>> passing an attribute in CONNECT and ROAMED event would suffice and 
>>> that is what was implemented in brcmfmac. However, it seems time 
>>> passed and the need for an explicit PORT_AUTHORIZED was there 
>>> (probably Denis knows), which wpa_supplicant now supports thus 
>>> ignoring the attribute in the CONNECT and ROAMED events. The brcmfmac 
>>> driver was not changed accordingly. For this there are patches 
>>> pending in linux-wireless which are necessary to have a working 
>>> connection.
>>>
>>
>> Coming in a bit late to this discussion, but it does raise a few 
>> points I wouldn't mind some clarification on:
>>
>> - With commit 503c1fb98ba3, the kernel effectively changed the 
>> userspace API.  So I take it that breaking userspace APIs is OK 
>> sometimes? If so, I have lots of suggestions to make ;)
> 
> I bet you do :-p I think the rule of thumb is that such a change is 
> acceptable when no driver provides the functionality behind the 
> user-space API and/or no user-space application is using that API.

Maybe this is a question for Johannes as well, but define 'user-space 
applications'?  If that includes wpa_s, wasn't the rule of thumb broken 
with that commit?

> 
>> - Is RTNL LINK_MODE / OPER_STATE status being (supposed to be?) 
>> affected by the driver during a roam?  E.g. if we're in a 802.1X 
>> network with userspace authentication, and driver roamed requiring a 
>> new 802.1X auth, then in theory the RTNL mode needs to be brought back 
>> out of UP state...
> 
> So do you expect the driver/cfg80211 to take care of that or the 
> supplicant? I assumed wpa_supplicant would be doing that.
> 

With regular roaming where we trigger a Disassociate/Deauthenticate 
(either explicitly or implicitly) first, the interface goes into dormant 
mode by virtue of the carrier going down.

With this it isn't really clear whether the same is happening and who 
(kernel/userspace) should be doing what.  I would actually assume the 
kernel is/should be turning carrier off for the duration of the roam 
operation?

>> - The new API leaves a lot to be desired in terms of race conditions. 
>> For example, how long should userspace wait for EAPoL-EAP packets to 
>> arrive (before triggering its own EAPoL-Start for example) if a 
>> CMD_ROAMED event comes?
> 
> I think that question applies to CMD_CONNECT as well, right? Not sure if 
> the specs provide any guidance for that. I can dive into that, but maybe 
> someone like Jouni or Johannes knows. If so, let me know ;-)

With CMD_CONNECT it is a bit more clear because you're most likely not 
specifying a PMKID for the first time, so you expect the authentication 
to happen in all cases.  If the AP doesn't respond after some small 
timeout, the supplicant can send its own EAPoL-Start.

With CMD_ROAMED it is less clear.
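
For the CMD_CONNECT case, the fallback described above can be sketched as a simple timeout check. This is a hypothetical illustration only: the 500 ms value and all names are assumptions, since the thread does not name a timeout, and it does not resolve the open CMD_ROAMED question.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical value; the thread does not specify a timeout. */
#define MODEL_EAPOL_START_TIMEOUT_MS 500u

struct model_eapol_wait {
	bool eap_frame_seen; /* an EAP frame from the AP has arrived */
	unsigned elapsed_ms; /* time since the connect event */
};

/* Send EAPoL-Start only if the AP stayed silent past the timeout. */
static bool model_should_send_eapol_start(const struct model_eapol_wait *w)
{
	assert(w);
	return !w->eap_frame_seen &&
	       w->elapsed_ms >= MODEL_EAPOL_START_TIMEOUT_MS;
}
```

After a roam, userspace may not know whether a full authentication is expected at all, so it is unclear when such a timer should even be started.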

> 
>> - What happens if userspace does send an EAPoL-Start in the middle of 
>> an offloaded 4-way handshake?
> 
> Probably those would be dropped.
> 

I would love to have something more definitive than 'Probably', and it 
might be worth mentioning this hint in the documentation somewhere.

Regards,
-Denis

* Re: Kernel oops / WiFi connection failure with wpa_supplicant 2.7
  2019-01-14 21:18         ` Denis Kenzior
@ 2019-01-14 23:04           ` Arend Van Spriel
  2019-01-15 13:00             ` Johannes Berg
  2019-01-15 15:55             ` Denis Kenzior
  0 siblings, 2 replies; 7+ messages in thread
From: Arend Van Spriel @ 2019-01-14 23:04 UTC (permalink / raw)
  To: Denis Kenzior, Jouni Malinen, Eric Blau
  Cc: hostap, linux-wireless, Johannes Berg

On 1/14/2019 10:18 PM, Denis Kenzior wrote:
> Hi Arend,
> 
> On 01/14/2019 02:12 PM, Arend Van Spriel wrote:
>> On 1/8/2019 6:44 PM, Denis Kenzior wrote:
>>> Hi Arend,
>>>
>>>> However, there is more to it. When these offloads were introduced, 
>>>> we discussed whether to have a PORT_AUTHORIZED event. It was 
>>>> decided passing an attribute in CONNECT and ROAMED event would 
>>>> suffice and that is what was implemented in brcmfmac. However, it 
>>>> seems time passed and the need for an explicit PORT_AUTHORIZED was 
>>>> there (probably Denis knows), which wpa_supplicant now supports thus 
>>>> ignoring the attribute in the CONNECT and ROAMED events. The 
>>>> brcmfmac driver was not changed accordingly. For this there are 
>>>> patches pending in linux-wireless which are necessary to have a 
>>>> working connection.
>>>>
>>>
>>> Coming in a bit late to this discussion, but it does raise a few 
>>> points I wouldn't mind some clarification on:
>>>
>>> - With commit 503c1fb98ba3, the kernel effectively changed the 
>>> userspace API.  So I take it that breaking userspace APIs is OK 
>>> sometimes? If so, I have lots of suggestions to make ;)
>>
>> I bet you do :-p I think the rule of thumb is that such a change is 
>> acceptable when no driver provides the functionality behind the 
>> user-space API and/or no user-space application is using that API.
> 
> Maybe this is a question for Johannes as well, but define 'user-space 
> applications'?  If that includes wpa_s, wasn't the rule of thumb broken 
> with that commit?

In my previous reply I wanted to add that it would be hard to prove that 
no user-space applications are using the API. Not sure exactly when 
things were added in wpa_s, but I suspect it was 
post-commit-503c1fb98ba3 so it did not have support for the user-space 
API before the commit.

>>
>>> - Is RTNL LINK_MODE / OPER_STATE status being (supposed to be?) 
>>> affected by the driver during a roam?  E.g. if we're in a 802.1X 
>>> network with userspace authentication, and driver roamed requiring a 
>>> new 802.1X auth, then in theory the RTNL mode needs to be brought 
>>> back out of UP state...
>>
>> So do you expect the driver/cfg80211 to take care of that or the 
>> supplicant? I assumed wpa_supplicant would be doing that.
>>
> 
> With regular roaming where we trigger a Disassociate/Deauthenticate 
> (either explicitly or implicitly) first, the interface goes into dormant 
> mode by virtue of the carrier going down.
> 
> With this it isn't really clear whether the same is happening and who 
> (kernel/userspace) should be doing what.  I would actually assume the 
> kernel is/should be turning carrier off for the duration of the roam 
> operation?

On what layer do we know 802.1X re-auth is required?

>>> - The new API leaves a lot to be desired in terms of race conditions. 
>>> For example, how long should userspace wait for EAPoL-EAP packets to 
>>> arrive (before triggering its own EAPoL-Start for example) if a 
>>> CMD_ROAMED event comes?
>>
>> I think that question applies to CMD_CONNECT as well, right? Not sure 
>> if the specs provide any guidance for that. I can dive into that, but 
>> maybe someone like Jouni or Johannes knows. If so, let me know ;-)
> 
> With CMD_CONNECT it is a bit more clear because you're most likely not 
> specifying a PMKID for the first time, so you expect the authentication 
> to happen in all cases.  If the AP doesn't respond after some small 
> timeout, the supplicant can send its own EAPoL-Start.
> 
> With CMD_ROAMED it is less clear.
> 
>>
>>> - What happens if userspace does send an EAPoL-Start in the middle of 
>>> an offloaded 4-way handshake?
>>
>> Probably those would be dropped.
>>
> 
> I would love to have something more definitive than 'Probably', and it 
> might be worth mentioning this hint in the documentation somewhere.

I was hesitant to use that word, but decided to do so simply because I 
cannot speak for every driver, and even for the brcmfmac driver that I 
maintain I will need to look into the firmware to be sure. I agree that 
a remark about that possibility is worth adding.

Regards,
Arend

* Re: Kernel oops / WiFi connection failure with wpa_supplicant 2.7
  2019-01-14 23:04           ` Arend Van Spriel
@ 2019-01-15 13:00             ` Johannes Berg
  2019-01-15 15:55             ` Denis Kenzior
  1 sibling, 0 replies; 7+ messages in thread
From: Johannes Berg @ 2019-01-15 13:00 UTC (permalink / raw)
  To: Arend Van Spriel, Denis Kenzior, Jouni Malinen, Eric Blau
  Cc: hostap, linux-wireless


> > Maybe this is a question for Johannes as well, but define 'user-space 
> > applications'?  If that includes wpa_s, wasn't the rule of thumb broken 
> > with that commit?
> 
> In my previous reply I wanted to add that it would be hard to prove that 
> no user-space applications are using the API. Not sure exactly when 
> things were added in wpa_s, but I suspect it was 
> post-commit-503c1fb98ba3 so it did not have support for the user-space 
> API before the commit.

I don't know about this really.

My thought at the time likely was that if there's no driver implementing
it, no userspace could've existed? Or maybe that just wasn't true, and I
got confused?

In any case, it certainly wasn't an intentional API break.

> > > > - What happens if userspace does send an EAPoL-Start in the middle of 
> > > > an offloaded 4-way handshake?
> > > 
> > > Probably those would be dropped.
> > > 
> > 
> > I would love to have something more definitive than 'Probably', and it 
> > might be worth mentioning this hint in the documentation somewhere.
> 
> I was hesitant to use that word, but decided to do so simply because I 
> cannot speak for every driver, and even for the brcmfmac driver that I 
> maintain I will need to look into the firmware to be sure. I agree that 
> a remark about that possibility is worth adding.

I don't really know if we should really cover all possible error
scenarios like that?

johannes


* Re: Kernel oops / WiFi connection failure with wpa_supplicant 2.7
  2019-01-14 23:04           ` Arend Van Spriel
  2019-01-15 13:00             ` Johannes Berg
@ 2019-01-15 15:55             ` Denis Kenzior
  1 sibling, 0 replies; 7+ messages in thread
From: Denis Kenzior @ 2019-01-15 15:55 UTC (permalink / raw)
  To: Arend Van Spriel, Jouni Malinen, Eric Blau
  Cc: hostap, linux-wireless, Johannes Berg

Hi Arend,

>>>
>>>> - Is RTNL LINK_MODE / OPER_STATE status being (supposed to be?) 
>>>> affected by the driver during a roam?  E.g. if we're in a 802.1X 
>>>> network with userspace authentication, and driver roamed requiring a 
>>>> new 802.1X auth, then in theory the RTNL mode needs to be brought 
>>>> back out of UP state...
>>>
>>> So do you expect the driver/cfg80211 to take care of that or the 
>>> supplicant? I assumed wpa_supplicant would be doing that.
>>>
>>
>> With regular roaming where we trigger a Disassociate/Deauthenticate 
>> (either explicitly or implicitly) first, the interface goes into 
>> dormant mode by virtue of the carrier going down.
>>
>> With this it isn't really clear whether the same is happening and who 
>> (kernel/userspace) should be doing what.  I would actually assume the 
>> kernel is/should be turning carrier off for the duration of the roam 
>> operation?
> 
> On what layer do we know 802.1X re-auth is required?
> 

Not sure what you mean by 'layer'?  If re-auth is required, then only 
the supplicant has the proper info and it will handle this via EAPoL frames.

But that is beside the point.  Regardless of whether a roam needs 
re-auth or not, network interface dormant notification is needed.  For 
example: userspace DHCP clients need to know when to renew the address. 
And yes, there are weird networks out there that expect you to 
re-negotiate your DHCP address on a roam.  Such clients are not 
integrated in any way with a supplicant and rely on rtnl.
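
As a rough illustration of why the notification matters, here is a minimal sketch of a DHCP client that only watches operstate changes. The client structure and handler are hypothetical; the two numeric values are chosen to match Linux's IF_OPER_DORMANT and IF_OPER_UP.

```c
#include <assert.h>

/* Values chosen to match Linux's IF_OPER_DORMANT and IF_OPER_UP. */
enum model_operstate { MODEL_OPER_DORMANT = 5, MODEL_OPER_UP = 6 };

/* Hypothetical DHCP client state; it sees nothing but operstate changes. */
struct model_dhcp_client {
	enum model_operstate last;
	unsigned renewals;
};

/* Renew the lease whenever the link comes back from dormant to up. */
static void model_on_operstate(struct model_dhcp_client *c,
			       enum model_operstate now)
{
	assert(c);
	if (c->last == MODEL_OPER_DORMANT && now == MODEL_OPER_UP)
		c->renewals++;
	c->last = now;
}
```

If an offloaded roam never takes the interface through dormant, a client like this sees no transition and never renews its address.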

Regards,
-Denis

Thread overview: 7+ messages
     [not found] <CADU241PtPeiTQWHwb=uF6Ohuua_asOwCarCAKVC8jdVVNAsByA@mail.gmail.com>
     [not found] ` <20190103154921.GA25015@w1.fi>
2019-01-05 19:44   ` Kernel oops / WiFi connection failure with wpa_supplicant 2.7 Arend Van Spriel
2019-01-08 17:44     ` Denis Kenzior
2019-01-14 20:12       ` Arend Van Spriel
2019-01-14 21:18         ` Denis Kenzior
2019-01-14 23:04           ` Arend Van Spriel
2019-01-15 13:00             ` Johannes Berg
2019-01-15 15:55             ` Denis Kenzior
