* [ath9k-devel] ath9k not connecting to one particular network..
@ 2013-03-25  4:06 Linus Torvalds
  2013-03-25  4:22 ` Outback Dingo
                   ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Linus Torvalds @ 2013-03-25  4:06 UTC (permalink / raw)
  To: ath9k-devel

Ok, I'm sure this is something people have seen before, but it has me stumped.

I'm on the road for a couple of days with my Google Chromebook Pixel,
which worked perfectly fine in the previous location I was at, and
works fine at home. But in the current condo, the machine simply
*cannot* connect to the wireless network here.

The Pixel has an "Atheros AR9462 Rev:2" in it, and I know the network
itself works, because my cellphone connects to it just fine.

This is a WPA2-encrypted network, but so is my home network, and so
was the one I connected to last week when traveling. I'm writing this
on the Pixel, but I'm writing it connected to my phone doing a
hotspot, because that works. Very odd.

Working connection:

  wlan0: authenticate with 02:1a:11:fc:8c:3b
  wlan0: send auth to 02:1a:11:fc:8c:3b (try 1/3)
  wlan0: authenticated
  wlan0: associate with 02:1a:11:fc:8c:3b (try 1/3)
  wlan0: RX AssocResp from 02:1a:11:fc:8c:3b (capab=0x411 status=0 aid=1)
  wlan0: associated

Non-working connection:

  wlan0: authenticate with 50:46:5d:02:85:08
  wlan0: send auth to 50:46:5d:02:85:08 (try 1/3)
  wlan0: authenticated
  wlan0: associate with 50:46:5d:02:85:08 (try 1/3)
  wlan0: RX AssocResp from 50:46:5d:02:85:08 (capab=0x411 status=0 aid=10)
  wlan0: associated

.. now NM asks for the password again, and then I get:

  wlan0: disassociating from 50:46:5d:02:85:08 by local choice (reason=3)

I'm not seeing what the difference is. Except for that "aid=1" vs
"aid=10", which I have no idea what it means.
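
For readers of the archive, the two AssocResp lines above can be compared mechanically. This is only a sketch that parses the quoted log text; it is not mac80211 code, the helper names are made up for illustration, and the table names just a few of the standard 802.11 Capability Information bits.

```python
import re

# Partial table of 802.11 Capability Information bits (bit position -> name).
CAPABILITY_BITS = {
    0: "ESS",
    1: "IBSS",
    4: "Privacy",
    5: "Short Preamble",
    10: "Short Slot Time",
}

ASSOC_RE = re.compile(
    r"RX AssocResp from (?P<bssid>[0-9a-f:]{17}) "
    r"\(capab=(?P<capab>0x[0-9a-f]+) status=(?P<status>\d+) aid=(?P<aid>\d+)\)"
)


def parse_assoc_resp(line):
    """Return the fields of an 'RX AssocResp' log line, or None if no match."""
    m = ASSOC_RE.search(line)
    if m is None:
        return None
    capab = int(m.group("capab"), 16)
    return {
        "bssid": m.group("bssid"),
        "capab": capab,
        "status": int(m.group("status")),
        "aid": int(m.group("aid")),
        # Names of the capability bits that are set, lowest bit first.
        "flags": [name for bit, name in sorted(CAPABILITY_BITS.items())
                  if capab & (1 << bit)],
    }


working = parse_assoc_resp(
    "wlan0: RX AssocResp from 02:1a:11:fc:8c:3b (capab=0x411 status=0 aid=1)")
failing = parse_assoc_resp(
    "wlan0: RX AssocResp from 50:46:5d:02:85:08 (capab=0x411 status=0 aid=10)")

# Which parsed fields differ between the two networks?
differing = [key for key in working if working[key] != failing[key]]
print(working["flags"])
print(differing)
```

Under these assumptions the only differences are the BSSID and the AID, which matches the observation in the message.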

Looking in /var/log/messages, a successful connection looks like this:

  Mar 24 20:48:42 pixel NetworkManager[832]: <info> (wlan0): supplicant interface state: associating -> associated
  Mar 24 20:48:43 pixel NetworkManager[832]: <info> (wlan0): supplicant interface state: associated -> 4-way handshake
  Mar 24 20:48:43 pixel NetworkManager[832]: <info> (wlan0): supplicant interface state: 4-way handshake -> completed

while an unsuccessful one does this:

  Mar 24 20:48:24 pixel NetworkManager[832]: <info> (wlan0): supplicant interface state: authenticating -> associated
  Mar 24 20:48:24 pixel NetworkManager[832]: <info> (wlan0): supplicant interface state: associated -> 4-way handshake

10-second delay because it asks for password again, followed by:

  Mar 24 20:48:34 pixel kernel: [ 2350.973914] wlan0: disassociating from 50:46:5d:02:85:08 by local choice (reason=3)
  Mar 24 20:48:34 pixel kernel: [ 2350.980860] cfg80211: Calling CRDA to update world regulatory domain
  Mar 24 20:48:34 pixel kernel: [ 2350.984796] wlan0: deauthenticating from 50:46:5d:02:85:08 by local choice (reason=3)
  Mar 24 20:48:34 pixel NetworkManager[832]: <info> (wlan0): supplicant interface state: 4-way handshake -> disconnected
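
The difference between the two /var/log/messages excerpts is easier to see if the supplicant state transitions are pulled out of the log. The following is a rough sketch, assuming each log entry has been joined back onto a single line; the function names are hypothetical and the only format it relies on is the "supplicant interface state: A -> B" text shown above.

```python
import re

# Matches "supplicant interface state: <old> -> <new>" at the end of a line.
TRANSITION_RE = re.compile(r"supplicant interface state: (\S.*?) -> (\S.*)$")


def transitions(log_lines):
    """Return (old_state, new_state) pairs in the order they were logged."""
    found = []
    for line in log_lines:
        m = TRANSITION_RE.search(line)
        if m:
            found.append((m.group(1).strip(), m.group(2).strip()))
    return found


def handshake_completed(log_lines):
    """True if any transition ends in the 'completed' state."""
    return any(new == "completed" for _, new in transitions(log_lines))


good = [
    "NetworkManager[832]: <info> (wlan0): supplicant interface state: associating -> associated",
    "NetworkManager[832]: <info> (wlan0): supplicant interface state: associated -> 4-way handshake",
    "NetworkManager[832]: <info> (wlan0): supplicant interface state: 4-way handshake -> completed",
]
bad = [
    "NetworkManager[832]: <info> (wlan0): supplicant interface state: authenticating -> associated",
    "NetworkManager[832]: <info> (wlan0): supplicant interface state: associated -> 4-way handshake",
    "kernel: [ 2350.973914] wlan0: disassociating from 50:46:5d:02:85:08 by local choice (reason=3)",
    "NetworkManager[832]: <info> (wlan0): supplicant interface state: 4-way handshake -> disconnected",
]

print(handshake_completed(good), handshake_completed(bad))
print(transitions(bad)[-1])
```

In the failing case the last transition leaves the 4-way handshake state for "disconnected", so the association itself succeeds and the failure is in the key exchange that follows it.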

Any idea? I tried loading the ath9k driver with debug=0xffff
nohwcrypt=1, but that didn't really make any difference...

Sure, the failing network has a different SSID and obviously a
different password, but no, I didn't screw up typing in the password.
What else can be different? Anything I can try? Relying on the really
slow Edge connection from my cellphone gets me a working internet, but
it's *slow*....

                 Linus

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-25  4:06 [ath9k-devel] ath9k not connecting to one particular network Linus Torvalds
@ 2013-03-25  4:22 ` Outback Dingo
  2013-03-25  9:05   ` Linus Torvalds
  2013-03-25  4:26 ` Joel Wirāmu Pauling
  2013-03-25  5:38 ` Adrian Chadd
  2 siblings, 1 reply; 27+ messages in thread
From: Outback Dingo @ 2013-03-25  4:22 UTC (permalink / raw)
  To: ath9k-devel

Probably a bad firmware load on a crappy wireless AP. I just flashed the
latest firmware on a new Netgear and had exactly the same issue; I
flashed back to the original firmware and the issue went away. The
question is, what's the router and version? I've also seen it
occasionally with OpenWRT on some of my home units, where a client would
simply refuse to connect. You could also ask the location you're staying
at to reboot the wireless AP; it might clear the error.



^ permalink raw reply	[flat|nested] 27+ messages in thread

* [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-25  4:06 [ath9k-devel] ath9k not connecting to one particular network Linus Torvalds
  2013-03-25  4:22 ` Outback Dingo
@ 2013-03-25  4:26 ` Joel Wirāmu Pauling
  2013-03-25  9:23   ` Linus Torvalds
  2013-03-25  5:38 ` Adrian Chadd
  2 siblings, 1 reply; 27+ messages in thread
From: Joel Wirāmu Pauling @ 2013-03-25  4:26 UTC (permalink / raw)
  To: ath9k-devel

Can you try disabling network manager from the init scripts? (I am not
sure which distro you are using as a base, but
/etc/init.d/network-manager stop tends to work on a percentage of
machines.)

Then run wpa_supplicant manually.

i.e.:

Create a hashed passphrase config snippet with:

aenertia@ue-ufb:~/openwrt/trunk$ wpa_passphrase youressid

wpa_passphrase youressid yourpass > wpa_conf.txt

Then run wpa_supplicant manually (with sufficient privileges):

wpa_supplicant -i wlan0 -c wpa_conf.txt

You should see a connection establish. From there you can run dhclient
etc. manually.

If that works, it shows the problem is further up the stack, with
network-manager failing to call something. If it doesn't work, then it's
possibly back to looking at the ath9k part.
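
For reference, the "hashed passphrase" that wpa_passphrase writes is the WPA pre-shared key: PBKDF2-HMAC-SHA1 over the passphrase, with the SSID as salt, 4096 iterations and a 32-byte output. The sketch below reproduces that derivation with the Python standard library; the function names are made up, the SSID and passphrase are placeholders, and the generated block is a minimal approximation of the tool's output rather than a byte-for-byte copy.

```python
import hashlib


def wpa_psk(ssid, passphrase):
    """Derive the 256-bit WPA/WPA2-Personal PSK as 64 hex digits."""
    if not 8 <= len(passphrase) <= 63:
        raise ValueError("WPA passphrase must be 8..63 characters")
    key = hashlib.pbkdf2_hmac(
        "sha1",                      # HMAC-SHA1 as the PRF
        passphrase.encode("utf-8"),  # password
        ssid.encode("utf-8"),        # salt is the SSID
        4096,                        # iteration count
        32,                          # output length in bytes
    )
    return key.hex()


def network_block(ssid, passphrase):
    """Build a minimal wpa_supplicant 'network' block (illustrative only)."""
    return 'network={\n\tssid="%s"\n\tpsk=%s\n}\n' % (ssid, wpa_psk(ssid, passphrase))


print(network_block("youressid", "yourpass"))
```

Because the SSID is the salt, the same passphrase yields a different PSK on every network, so a config generated for one SSID cannot be reused for another.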

-Joel



^ permalink raw reply	[flat|nested] 27+ messages in thread

* [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-25  4:06 [ath9k-devel] ath9k not connecting to one particular network Linus Torvalds
  2013-03-25  4:22 ` Outback Dingo
  2013-03-25  4:26 ` Joel Wirāmu Pauling
@ 2013-03-25  5:38 ` Adrian Chadd
  2 siblings, 0 replies; 27+ messages in thread
From: Adrian Chadd @ 2013-03-25  5:38 UTC (permalink / raw)
  To: ath9k-devel

On 24 March 2013 21:06, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> Ok, I'm sure this is something people have seen before, but it has me stumped.

>   wlan0: authenticate with 50:46:5d:02:85:08
>   wlan0: send auth to 50:46:5d:02:85:08 (try 1/3)
>   wlan0: authenticated
>   wlan0: associate with 50:46:5d:02:85:08 (try 1/3)
>   wlan0: RX AssocResp from 50:46:5d:02:85:08 (capab=0x411 status=0 aid=10)
>   wlan0: associated

[snip]

> I'm not seeing what the difference is. Except for that "aid=1" vs
> "aid=10", which I have no idea what it means.

AID is the "Association ID", which is a field used by the AP to
identify which particular offset inside various things (eg the TIM
bitmap, which identifies which stations have traffic in each beacon
frame, so stations can stay asleep until they see a beacon with their
bit set in said TIM) to pay attention to.

Now, the bulk of the driver/chip doesn't care about the AID. It only
affects things in power save.
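
To make the AID-to-TIM relationship concrete: in the traffic indication virtual bitmap, bit number N corresponds to AID N, so a station looks at octet N // 8, bit N % 8. The sketch below is a simplification under stated assumptions: it treats the bitmap as a full bytes object starting at octet 0, whereas a real TIM element carries only a partial bitmap plus an offset, and the function names are hypothetical.

```python
def tim_position(aid):
    """Return (octet index, bit index) of an AID in the virtual bitmap."""
    if not 1 <= aid <= 2007:
        raise ValueError("AID must be in the range 1..2007")
    return aid // 8, aid % 8


def has_buffered_traffic(virtual_bitmap, aid):
    """True if the AP has flagged buffered frames for this AID."""
    octet, bit = tim_position(aid)
    if octet >= len(virtual_bitmap):
        return False
    return bool(virtual_bitmap[octet] & (1 << bit))


def flag_traffic(virtual_bitmap, aid):
    """Return a copy of the bitmap with the bit for this AID set."""
    octet, bit = tim_position(aid)
    updated = bytearray(virtual_bitmap)
    updated[octet] |= 1 << bit
    return bytes(updated)


bitmap = flag_traffic(bytes(4), 10)   # AP buffers a frame for AID 10
print(tim_position(1), tim_position(10))
print(has_buffered_traffic(bitmap, 10), has_buffered_traffic(bitmap, 1))
```

A sleeping station with AID 10 only needs to check that one bit in each beacon, which is why a wrongly programmed AID would show up in power save and not during association.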

So perhaps you can try disabling power save on your chromebook
(there's an iw command to do this; please check the archives to see
what it is.)

If disabling station-side power save works, maybe the AID isn't being
programmed or restored correctly when the hardware is being reset
(which occurs during scans, for example.)

Thanks,


Adrian

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-25  4:22 ` Outback Dingo
@ 2013-03-25  9:05   ` Linus Torvalds
  0 siblings, 0 replies; 27+ messages in thread
From: Linus Torvalds @ 2013-03-25  9:05 UTC (permalink / raw)
  To: ath9k-devel

On Sun, Mar 24, 2013 at 9:22 PM, Outback Dingo <outbackdingo@gmail.com> wrote:
> bad firmware load on a crappy wireless AP probably, I just flashed the
> latest firmware on a new Netgear, and had exactly the same issue
> flashed back to the original firmware and the issue goes away, question is
> whats the router and version? Ive seen it before also occasionally
> using OpenWRT on some of my home units where a client would simply refuse to
> connect, you could also ask the location your staying to reboot
> the wireless AP, it might clear the error.

I don't think you really thought that through.

I agree that APs are often buggy and have horrid firmware, but asking
various hotels to go around fixing their network just because it
doesn't work for you - when it works for everybody else - just isn't
an option.

As mentioned, I guarantee that the wireless network is actually
working. There's something in ath9k (or possibly NetworkManager) that
doesn't work with this network. It's a problem on the client side, and
in the real world you can't just go say "hey, please bend over
backwards because my client is being picky". No hotel will care
enough.

                   Linus

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-25  4:26 ` Joel Wirāmu Pauling
@ 2013-03-25  9:23   ` Linus Torvalds
  2013-03-25  9:38     ` Linus Torvalds
                       ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Linus Torvalds @ 2013-03-25  9:23 UTC (permalink / raw)
  To: ath9k-devel

On Sun, Mar 24, 2013 at 9:26 PM, Joel Wirāmu Pauling <joel@aenertia.net> wrote:
> Can you try disabling network manager from the init scripts (I am not
> sure which distro you are using as a base but
> /etc/init.d/network-manager stop ) tends to work for a percentage of
> machines.
>
> Then running wpa_supplicant manually.

Bingo.

Apparently this really is NetworkManager doing something wrong.

I'm running F18, so doing

        systemctl stop NetworkManager.service
        ifconfig wlan0 up
        iwlist scan
        wpa_passphrase *essid* *password* > wpa_conf.txt
        wpa_supplicant -i wlan0 -c wpa_conf.txt

and then

        dhclient wlan0

worked. Presumably NetworkManager should do the same, but doesn't. Or
more likely, does something *more*, and that "something more" ends up
confusing the driver and resetting.

Interestingly, the kernel messages from doing this were different:

  wlan0: RX AssocResp from 50:46:5d:02:85:08 (capab=0x411 status=0 aid=16)

notice how now it says "aid=16" instead of "aid=10". WTF?

So now I have working internet, can anybody suggest something for the
NM bug-report? Some way for the NM people to guess at what they do
wrong? Any ideas of why it would happen only for this particular
network?

                            Linus

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-25  9:23   ` Linus Torvalds
@ 2013-03-25  9:38     ` Linus Torvalds
  2013-03-25 10:12     ` Jouni Malinen
  2013-03-25 10:43     ` Oleksij Rempel
  2 siblings, 0 replies; 27+ messages in thread
From: Linus Torvalds @ 2013-03-25  9:38 UTC (permalink / raw)
  To: ath9k-devel

On Mon, Mar 25, 2013 at 2:23 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Interestingly, the kernel messages from doing this were different:
>
>   wlan0: RX AssocResp from 50:46:5d:02:85:08 (capab=0x411 status=0 aid=16)
>
> notice how now it says "aid=16" instead of "aid=10". WTF?

This seems to be irrelevant. Restarting NetworkManager (just to test)
still didn't give me a working connection, but I got that "aid=16"
again.

So the driver messages don't tell me anything, except that NM does
*something* that triggers that

   wlan0: disassociating from 50:46:5d:02:85:08 by local choice (reason=3)

immediately after associating. It also results in

   cfg80211: Calling CRDA to update world regulatory domain

and then all the frequencies are shown again, so I assume it's some
kind of link reset. "reason=3" seems to be WLAN_REASON_DEAUTH_LEAVING.
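
The reason code in those kernel lines can be looked up mechanically. This is a small sketch with a deliberately partial table; the names follow the WLAN_REASON_* constants in the kernel's ieee80211.h as far as I know them, and the helper name is made up for illustration.

```python
import re

# Partial table of 802.11 reason codes, using the kernel's constant names.
REASON_NAMES = {
    1: "WLAN_REASON_UNSPECIFIED",
    2: "WLAN_REASON_PREV_AUTH_NOT_VALID",
    3: "WLAN_REASON_DEAUTH_LEAVING",
    4: "WLAN_REASON_DISASSOC_DUE_TO_INACTIVITY",
    15: "WLAN_REASON_4WAY_HANDSHAKE_TIMEOUT",
}

REASON_RE = re.compile(r"\(reason=(\d+)\)")


def describe_reason(line):
    """Return (code, name) for a log line containing '(reason=N)', else None."""
    m = REASON_RE.search(line)
    if m is None:
        return None
    code = int(m.group(1))
    return code, REASON_NAMES.get(code, "unknown (not in this partial table)")


print(describe_reason(
    "wlan0: disassociating from 50:46:5d:02:85:08 by local choice (reason=3)"))
```

Reason 3 together with "by local choice" indicates that the station side, i.e. something above the driver, asked for the disconnect; the AP did not send it.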

I'll do a NM bug-report, but if somebody has a clue what NM might be
doing that would trigger this (when running wpa_supplicant manually
does not), that would be lovely to add to the bugreport.

                              Linus

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-25  9:23   ` Linus Torvalds
  2013-03-25  9:38     ` Linus Torvalds
@ 2013-03-25 10:12     ` Jouni Malinen
  2013-03-25 10:38       ` Linus Torvalds
  2013-03-25 10:43     ` Oleksij Rempel
  2 siblings, 1 reply; 27+ messages in thread
From: Jouni Malinen @ 2013-03-25 10:12 UTC (permalink / raw)
  To: ath9k-devel

On Mon, Mar 25, 2013 at 02:23:52AM -0700, Linus Torvalds wrote:
> Apparently this really is NetworkManager doing something wrong.
> 
> I'm running F18, so doing
> 
>         systemctl stop NetworkManager.service
>         ifconfig wlan0 up
>         iwlist scan
>         wpa_passphrase *essid* *password* > wpa_conf.txt
>         wpa_supplicant -i wlan0 -c wpa_conf.txt
> 
> and then
> 
>         dhclient wlan0
> 
> worked.

Did you happen to notice whether wpa_supplicant showed more than one
attempt at associating with the AP?

> Presumably NetworkManager should do the same, but doesn't. Or
> more likely, does something *more*, and that "something more" ends up
> confusing the driver and resetting.

It could be helpful to get wpa_supplicant debug output from both the
successful case (running it with manual configuration) and with NM
running it. I haven't done much debugging with NM, but this page seems
to show some guidance on how to collect the latter:
https://live.gnome.org/NetworkManager/Debugging
(for the former, just add -ddt on wpa_supplicant command line)

> Interestingly, the kernel messages from doing this were different:
> 
>   wlan0: RX AssocResp from 50:46:5d:02:85:08 (capab=0x411 status=0 aid=16)
> 
> notice how now it says "aid=16" instead of "aid=10". WTF?

That is likely irrelevant to the issue here. The AP usually allocates
the next available Association ID for the station and the exact value
depends on what other stations are currently associated. The only area
where this would have an effect during the association is in power
saving (buffered frames on the AP are indicated for a specific AID).

-- 
Jouni Malinen                                            PGP id EFC895FA

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-25 10:12     ` Jouni Malinen
@ 2013-03-25 10:38       ` Linus Torvalds
  2013-03-25 11:06         ` Linus Torvalds
  0 siblings, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2013-03-25 10:38 UTC (permalink / raw)
  To: ath9k-devel

On Mon, Mar 25, 2013 at 3:12 AM, Jouni Malinen <jouni@qca.qualcomm.com> wrote:
>
> Did you happen to notice whether wpa_supplicant showed more than one
> attempt at associating with the AP?

According to the kernel messages, there seems to be just a single
quick association.

IOW, this is what happens with wpa_supplicant:

  [ 4219.297875] wlan0: authenticate with 50:46:5d:02:85:08
  [ 4219.312013] wlan0: send auth to 50:46:5d:02:85:08 (try 1/3)
  [ 4219.314029] wlan0: authenticated
  [ 4219.314099] wlan0: associate with 50:46:5d:02:85:08 (try 1/3)
  [ 4219.317809] wlan0: RX AssocResp from 50:46:5d:02:85:08 (capab=0x411 status=0 aid=16)
  [ 4219.317913] wlan0: associated
  [ 4219.317934] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready

and everything is happy.
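
How quick "quick" is can be read off the kernel timestamps. The sketch below assumes dmesg-style lines with a "[ seconds.microseconds]" prefix, each on a single line, and uses hypothetical helper names; Decimal is used so the subtraction is exact.

```python
import re
from decimal import Decimal

# "[ 4219.297875] message" -> (timestamp, message)
STAMP_RE = re.compile(r"^\s*\[\s*(\d+\.\d+)\]\s+(.*)$")


def parse_stamped(line):
    """Split a kernel log line into (Decimal seconds, message), or None."""
    m = STAMP_RE.match(line)
    if m is None:
        return None
    return Decimal(m.group(1)), m.group(2)


def elapsed_between(lines, start_text, end_text):
    """Seconds from the first line containing start_text to the next
    line containing end_text, or None if either one is missing."""
    start = None
    for line in lines:
        parsed = parse_stamped(line)
        if parsed is None:
            continue
        stamp, message = parsed
        if start is None:
            if start_text in message:
                start = stamp
        elif end_text in message:
            return stamp - start
    return None


kernel_log = [
    "[ 4219.297875] wlan0: authenticate with 50:46:5d:02:85:08",
    "[ 4219.312013] wlan0: send auth to 50:46:5d:02:85:08 (try 1/3)",
    "[ 4219.314029] wlan0: authenticated",
    "[ 4219.314099] wlan0: associate with 50:46:5d:02:85:08 (try 1/3)",
    "[ 4219.317809] wlan0: RX AssocResp from 50:46:5d:02:85:08 (capab=0x411 status=0 aid=16)",
    "[ 4219.317913] wlan0: associated",
]

delta = elapsed_between(kernel_log, "authenticate with", "wlan0: associated")
print(delta * 1000, "ms")
```

For the lines quoted above this comes to roughly 20 ms from the first authenticate message to "associated".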

>> Presumably NetworkManager should do the same, but doesn't. Or
>> more likely, does something *more*, and that "something more" ends up
>> confusing the driver and resetting.
>
> It could be helpful to get wpa_supplicant debug output from both the
> successful case (running it with manual configuration) and with NM
> running it. I haven't done much debugging with NM, but this page seems
> to show some guidance on how to collect the latter:
> https://live.gnome.org/NetworkManager/Debugging
> (for the former, just add -ddt on wpa_supplicant command line)

Hmm. That points to /var/log/wpa_supplicant.log, which I hadn't
noticed before. That looks useful, except for the fact that it just
shows

  wlan0: SME: Trying to authenticate with 50:46:5d:02:85:08 (SSID='lodgecondo' freq=2412 MHz)
  wlan0: Trying to associate with 50:46:5d:02:85:08 (SSID='lodgecondo' freq=2412 MHz)
  wlan0: Associated with 50:46:5d:02:85:08
  wlan0: Authentication with 50:46:5d:02:85:08 timed out.
  wlan0: CTRL-EVENT-DISCONNECTED bssid=00:00:00:00:00:00 reason=3

which looks pretty bogus, since wpa_supplicant on its own seems to
associate/authenticate pretty much immediately. There are no
timestamps in that log-file, so it's hard to tell, but that web page
tells me how to add them, so I'll do that (I obviously lose my network
when I try this, so I'm not doing it while writing this email ;)

                  Linus

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-25  9:23   ` Linus Torvalds
  2013-03-25  9:38     ` Linus Torvalds
  2013-03-25 10:12     ` Jouni Malinen
@ 2013-03-25 10:43     ` Oleksij Rempel
  2 siblings, 0 replies; 27+ messages in thread
From: Oleksij Rempel @ 2013-03-25 10:43 UTC (permalink / raw)
  To: ath9k-devel

Am 25.03.2013 10:23, schrieb Linus Torvalds:
> On Sun, Mar 24, 2013 at 9:26 PM, Joel Wirāmu Pauling <joel@aenertia.net> wrote:
>> Can you try disabling network manager from the init scripts (I am not
>> sure which distro you are using as a base but
>> /etc/init.d/network-manager stop ) tends to work for a percentage of
>> machines.
>>
>> Then running wpa_supplicant manually.
>
> Bingo.
>
> Apparently this really is NetworkManager doing something wrong.
>
> I'm running F18, so doing
>
>          systemctl stop NetworkManager.service
>          ifconfig wlan0 up
>          iwlist scan
>          wpa_passphrase *essid* *password* > wpa_conf.txt
>          wpa_supplicant -i wlan0 -c wpa_conf.txt
>
> and then
>
>          dhclient wlan0
>
> worked. Presumably NetworkManager should do the same, but doesn't. Or
> more likely, does something *more*, and that "something more" ends up
> confusing the driver and resetting.
>
> Interestingly, the kernel messages from doing this were different:
>
>    wlan0: RX AssocResp from 50:46:5d:02:85:08 (capab=0x411 status=0 aid=16)
>
> notice how now it says "aid=16" instead of "aid=10". WTF?
>
> So now I have working internet, can anybody suggest something for the
> NM bug-report? Some way for the NM people to guess at what they do
> wrong? Any ideas of why it would happen only for this particular
> network?
>

For some months there was a bug in NetworkManager or the DHCP client it
used: if the DHCP lease time is unlimited, it fails to connect. It was
possible to work around it by changing the default DHCP client for
NetworkManager. Maybe it is that issue.


-- 
Regards,
Oleksij

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-25 10:38       ` Linus Torvalds
@ 2013-03-25 11:06         ` Linus Torvalds
  2013-03-25 11:30           ` Jouni Malinen
  0 siblings, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2013-03-25 11:06 UTC (permalink / raw)
  To: ath9k-devel

On Mon, Mar 25, 2013 at 3:38 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> that web page
> tells me how to add them, so I'll do that (I obviously lose my network
> when I try this, so I'm not doing it while writing this email ;)

Ok, full wpa_supplicant log added to the RH NetworkManager bugzilla,
let's hope that somebody sees the problem there. I see that there's a
gnome bugzilla for NM too, but it appears that at least Dan Williams
is on both the RH and the gnome bugzilla, so I hope the RH bugzilla
entry is sufficient.

Added Dan to the cc. Dan, is there anything else you'd want me to do?
I'll only be at the wireless network that shows the problem for one
more day, so after that debugging will be harder. The bugzilla is

  https://bugzilla.redhat.com/show_bug.cgi?id=927191

and while you didn't see the early part of the thread, it just boils
down to "NM doesn't work, setting things up manually with
wpa_supplicant does".

Thanks to Jouni and Joel for pointing out how to get NM out of the
equation (since I originally assumed this was an ath9k issue - I don't
recall having ever seen this particular problem with my previous
laptop, but it's clearly something special about this network)

                 Linus

> I'm running F18, so doing
>
>         systemctl stop NetworkManager.service
>         ifconfig wlan0 up
>         iwlist scan
>         wpa_passphrase lodgecondo condo000 > wpa_conf.txt
>         wpa_supplicant -i wlan0 -c wpa_conf.txt
>
> and then
>
>         dhclient wlan0
>
> worked.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-25 11:06         ` Linus Torvalds
@ 2013-03-25 11:30           ` Jouni Malinen
  2013-03-25 12:17             ` Joel Wirāmu Pauling
  2013-03-25 16:04             ` Linus Torvalds
  0 siblings, 2 replies; 27+ messages in thread
From: Jouni Malinen @ 2013-03-25 11:30 UTC (permalink / raw)
  To: ath9k-devel

On Mon, Mar 25, 2013 at 04:06:49AM -0700, Linus Torvalds wrote:
> Ok, full wpasupplicant log added to the RH NetworkManager bugzilla,
> let's hope that somebody sees the problem there. I see that there's a
> gnome bugzilla for NM too, but it appears that at least Dan Williams
> is on both the RH and the gnome bugzilla, so I hope the RH bugzilla
> entry is sufficient.

>   https://bugzilla.redhat.com/show_bug.cgi?id=927191
> 
> and while you didn't see the early part of the thread, it just boils
> down to "NM doesn't work, setting things up manually with
> wpa_supplicant does":.

This looks like a very basic case, taking into account the AP
configuration (just WPA2-Personal/CCMP) and a passphrase that should not
allow much of a chance for typos or encoding issues (non-ASCII..). However, I cannot
find anything obvious from the debug log. For some reason, the AP just
rejects EAPOL-Key message 2/4 from the station. In most cases, this is
because of a typo in the PSK/passphrase, but it sounds quite unlikely
here unless NM somehow configured this differently (which would also be
surprising if it applies only for this specific network).

Since it looks like this particular passphrase is not in much need of
protection, it could be useful if you could collect wpa_supplicant debug
logs with -K added to the command line (-ddKt for the manual test and
something similar with NM, I'd assume). This -K adds passwords and keys
into the log so it is not something that is enabled by default, but it
sounds like this is fine here and it will include the details that would
be needed to check what exactly are the differences in the 4-way
handshake between NM and manual wpa_supplicant configuration cases.

-- 
Jouni Malinen                                            PGP id EFC895FA

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-25 11:30           ` Jouni Malinen
@ 2013-03-25 12:17             ` Joel Wirāmu Pauling
  2013-03-25 12:46               ` Jouni Malinen
  2013-03-25 16:04             ` Linus Torvalds
  1 sibling, 1 reply; 27+ messages in thread
From: Joel Wirāmu Pauling @ 2013-03-25 12:17 UTC (permalink / raw)
  To: ath9k-devel

Likely due to some odd DHCP options or behaviours coming from whatever
oddball vendor-implemented DHCP server hands out the leases on the
wireless network. Often in hotels or corporates they use an offloaded
RADIUS and crypto upstream of the AP itself. These tend to be dark
voodoo; Cisco and Aruba/Alcatel-Lucent are two such examples where I have
seen this sort of issue, among others, with NM failing to recognise the
DHCP offer.

Wireshark the interface and watch the dhcp negotiation sequence on a couple
of different networks and devices if you really want to pinpoint it.

Disclaimer: I work for ALU.

Glad to hear you got it working.

-Joel
http://gplus.to/aenertia
On 26 Mar 2013 00:30, "Jouni Malinen" <jouni@qca.qualcomm.com> wrote:

> On Mon, Mar 25, 2013 at 04:06:49AM -0700, Linus Torvalds wrote:
> > Ok, full wpasupplicant log added to the RH NetworkManager bugzilla,
> > let's hope that somebody sees the problem there. I see that there's a
> > gnome bugzilla for NM too, but it appears that at least Dan Williams
> > is on both the RH and the gnome bugzilla, so I hope the RH bugzilla
> > entry is sufficient.
>
> >   https://bugzilla.redhat.com/show_bug.cgi?id=927191
> >
> > and while you didn't see the early part of the thread, it just boils
> > down to "NM doesn't work, setting things up manually with
> > wpa_supplicant does":.
>
> This looks very basic case taken into account the AP configuration (just
> WPA2-Personal/CCMP) and passphrase that should not allow much of a
> chance for typos or encoding issues (non-ASCII..). However, I cannot
> find anything obvious from the debug log. For some reason, the AP just
> rejects EAPOL-Key message 2/4 from the station. In most cases, this is
> because of a typo in the PSK/passphrase, but it sounds quite unlikely
> here unless NM somehow configured this differently (which would also be
> surprising if it applies only for this specific network).
>
> Since it looks like the particular passphrase is of not much need for
> protection, it could be useful if you could collect wpa_supplicant debug
> logs with -K added to the command line (-ddKt for the manual test and
> something similar with NM, I'd assume). This -K adds passwords and keys
> into the log so it is not something that is enabled by default, but it
> sounds like this is fine here and it will include the details that would
> be needed to check what exactly are the differences in the 4-way
> handshake between NM and manual wpa_supplicant configuration cases.
>
> --
> Jouni Malinen                                            PGP id EFC895FA
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-25 12:17             ` Joel Wirāmu Pauling
@ 2013-03-25 12:46               ` Jouni Malinen
  0 siblings, 0 replies; 27+ messages in thread
From: Jouni Malinen @ 2013-03-25 12:46 UTC (permalink / raw)
  To: ath9k-devel

On Tue, Mar 26, 2013 at 01:17:10AM +1300, Joel Wirāmu Pauling wrote:
>    Likely due to some odd DHCP options or behaviours coming from whatever
>    oddball vendor-implemented DHCP server hands out leases on that wireless
>    network. Hotels and corporate sites often use offloaded RADIUS and crypto
>    upstream of the AP itself. These tend to be dark voodoo; Cisco and
>    Aruba/Alcatel-Lucent are two examples where I have seen this sort of
>    issue, among others, with NM failing to recognise the DHCP offer.

It did not even complete the 4-way handshake, so there were no DHCP
issues here (it didn't get that far).

-- 
Jouni Malinen                                            PGP id EFC895FA

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-25 11:30           ` Jouni Malinen
  2013-03-25 12:17             ` Joel Wirāmu Pauling
@ 2013-03-25 16:04             ` Linus Torvalds
  2013-03-25 16:35               ` Jouni Malinen
  1 sibling, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2013-03-25 16:04 UTC (permalink / raw)
  To: ath9k-devel

On Mon, Mar 25, 2013 at 4:30 AM, Jouni Malinen <jouni@qca.qualcomm.com> wrote:
>
> This looks very basic case taken into account the AP configuration (just
> WPA2-Personal/CCMP) and passphrase that should not allow much of a
> chance for typos or encoding issues (non-ASCII..).

Hmm. I'm certain I didn't mistype it, and I did it several times (with
"show text"), and it really is a very simple 8-character passphrase
("condo000").

But it is certainly possible that the key got corrupted in between the
GUI element and the actual PSK key. When I do it manually, there's
only the "wpa_passphrase" thing that writes out the PSK key, and there
is much less room for that to get screwed up somewhere.

> However, I cannot
> find anything obvious from the debug log. For some reason, the AP just
> rejects EAPOL-Key message 2/4 from the station. In most cases, this is
> because of a typo in the PSK/passphrase, but it sounds quite unlikely
> here unless NM somehow configured this differently (which would also be
> surprising if it applies only for this specific network).

Yeah, as mentioned, I can literally switch between the two wireless
networks here in the room, and they are both WPA2. My phone just has a
much more complex passphrase.

> Since it looks like the particular passphrase is of not much need for
> protection, it could be useful if you could collect wpa_supplicant debug
> logs with -K added to the command line (-ddKt for the manual test and
> something similar with NM, I'd assume). This -K adds passwords and keys
> into the log so it is not something that is enabled by default, but it
> sounds like this is fine here and it will include the details that would
> be needed to check what exactly are the differences in the 4-way
> handshake between NM and manual wpa_supplicant configuration cases.

Ok, I've collected a successful trace with wpa_supplicant, and added
it to the bugzilla entry.

I do not know how to get NM to expose the PSK key it is using, though.
 So the NM-generated failing log doesn't have that information, and
unless somebody tells me the magic dbus sequence to enable it, I
likely won't get it...

                Linus

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-25 16:04             ` Linus Torvalds
@ 2013-03-25 16:35               ` Jouni Malinen
  2013-03-25 16:42                 ` Linus Torvalds
  0 siblings, 1 reply; 27+ messages in thread
From: Jouni Malinen @ 2013-03-25 16:35 UTC (permalink / raw)
  To: ath9k-devel

On Mon, Mar 25, 2013 at 09:04:14AM -0700, Linus Torvalds wrote:
> Ok, I've collected a successful trace with wpa_supplicant, and added
> it to the bugzilla entry.

Thanks. This log does actually show a retry of the EAPOL-Key message 1/4
about a second after the first attempt and that was the area I wanted to
get more details on to allow this to be compared to the one with NM
where EAPOL-Key message 3/4 does not come from the AP. Based on this, it
is now less clear to me why the NM case would have behaved differently.

> I do not know how to get NM to expose the PSK key it is using, though.
>  So the NM-generated failing log doesn't have that information, and
> unless somebody tells me the magic dbus sequence to enable it, I
> likely won't get it...

I'm not really familiar with debugging with NM enabled, but it could be
possible that the instructions for the older wpa_supplicant on
https://live.gnome.org/NetworkManager/Debugging could be adapted for
this. The key here would be to get -K added to the wpa_supplicant command
line, since exposing the keys cannot be enabled dynamically at runtime.

It could be possible to do that by editing one of these files:
/usr/share/dbus-1/system-services/fi.w1.wpa_supplicant1.service
/usr/share/dbus-1/system-services/fi.epitest.hostap.WPASupplicant.service
I had both on my desktop; don't know which one is used now, but I'd
assume it is the first one since that uses a newer D-Bus interface.

Adding -dK to the Exec item there would hopefully add those to the
wpa_supplicant command line. "sudo killall wpa_supplicant" should force
wpa_supplicant to be restarted with the updated command line.

(And yes, you probably want to remove that -K from the command line
after having collected the logs to avoid future passwords for any other
network getting written into a log file..)
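
A minimal sketch of that edit, assuming the service file has a single
Exec= line. The helper name add_exec_flags is made up for illustration,
and whether the change takes effect depends on how the daemon is
actually launched on a given system:

```shell
#!/bin/sh
# add_exec_flags FILE FLAGS
# Appends FLAGS to every Exec= line in FILE, keeping FILE.orig as a backup
# so the -K flag can be removed again after the logs are collected.
add_exec_flags() {
    file=$1
    flags=$2
    cp "$file" "$file.orig" || return 1
    # "&" in the replacement stands for the whole matched Exec= line.
    sed "s|^Exec=.*|& $flags|" "$file.orig" > "$file"
}

# Example (run as root; path taken from the list above):
#   add_exec_flags /usr/share/dbus-1/system-services/fi.w1.wpa_supplicant1.service "-dK"
#   killall wpa_supplicant
# Revert afterwards by moving the .orig backup over the edited file.
```

The backup copy makes it easy to undo the change, which matters here
because -K writes keys into the log.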

-- 
Jouni Malinen                                            PGP id EFC895FA

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-25 16:35               ` Jouni Malinen
@ 2013-03-25 16:42                 ` Linus Torvalds
  2013-03-25 18:03                   ` Peter Stuge
  2013-03-26  2:12                   ` Linus Torvalds
  0 siblings, 2 replies; 27+ messages in thread
From: Linus Torvalds @ 2013-03-25 16:42 UTC (permalink / raw)
  To: ath9k-devel

On Mon, Mar 25, 2013 at 9:35 AM, Jouni Malinen <jouni@qca.qualcomm.com> wrote:
>
> I'm not really familiar with debugging with NM enabled, but it could be
> possible that the instructions for the older wpa_supplicant on
> https://live.gnome.org/NetworkManager/Debugging could be adopted for
> this. The key here would be to get -K added wpa_supplicant command line
> since exposing of the keys cannot be enabled dynamically at runtime.

I tried that already, but couldn't get it to work. I suspect there's
some tighter integration between NM and wpa_supplicant, so that it
doesn't actually use that Exec= line any more, or edits it, or
something.

> It could be possible to do that by editing one of these files:
> /usr/share/dbus-1/system-services/fi.w1.wpa_supplicant1.service
> /usr/share/dbus-1/system-services/fi.epitest.hostap.WPASupplicant.service
> I had both on my desktop; don't know which one is used now, but I'd
> assume it is the first one since that uses a newer D-Bus interface.

I did it in both, and restarted both wpa_supplicant.service and
NetworkManager.service, and still wpa_supplicant didn't get the -ddKt
argument I tried to add.

It is possible that those files are all cached by systemd or dbusd
from boot or something. I did not reboot or force systemd/dbusd to
reload their files (if that is even possible).

                   Linus

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-25 16:42                 ` Linus Torvalds
@ 2013-03-25 18:03                   ` Peter Stuge
  2013-03-25 19:01                     ` Adrian Chadd
  2013-03-26  2:12                   ` Linus Torvalds
  1 sibling, 1 reply; 27+ messages in thread
From: Peter Stuge @ 2013-03-25 18:03 UTC (permalink / raw)
  To: ath9k-devel

Me and other ath9k users have experienced various similar problems
for years, across different ath9k hardware. NM was usually not part
of the picture, at least never in my case, my problems were always
with only wpa_supplicant, if that.

ath9k has improved, but I still have "soft" issues that nobody at QCA
will spend time on, because I'm not significant enough for remote
debugging, because QCA doesn't talk about hardware details with
users, and because problems (obviously) can't be reproduced in
their lab.

Ironic that when you have an issue and get attention, it seems to lie
outside ath9k. (It may still be the driver's fault, just that the
problem isn't triggered when starting wpa_supplicant manually.)


Linus Torvalds wrote:
> I did not reboot or force systemd/dbusd to
> reload their files (if that is even possible).

systemctl [--system] daemon-reload


I hate to say this, but part of me hopes that you'll have some more
problems with the Pixel wifi. :\


//Peter

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-25 18:03                   ` Peter Stuge
@ 2013-03-25 19:01                     ` Adrian Chadd
  0 siblings, 0 replies; 27+ messages in thread
From: Adrian Chadd @ 2013-03-25 19:01 UTC (permalink / raw)
  To: ath9k-devel

On 25 March 2013 11:03, Peter Stuge <peter@stuge.se> wrote:
> Me and other ath9k users have experienced various similar problems
> for years, across different ath9k hardware. NM was usually not part
> of the picture, at least never in my case, my problems were always
> with only wpa_supplicant, if that.
>
> ath9k has improved, but I still have "soft" issues that nobody at QCA
> will spend time on, because I'm not significant enough for remote
> debugging, because QCA doesn't talk about hardware details with
> users, and because problems (obviously) can't be reproduced in
> their lab.

Dude, I don't know if you've been paying attention, but Felix, Luis
and I have been ratcheting up both the developer access (still under
NDA for now), information about the hardware in general, and now open
source HAL code for the AR9300 and later chips.

_I_ got involved in this stuff deep enough and quick enough that they
hired me. There's enough information and code out there about how
their kit works. A sufficiently motivated developer can dive in and
get a lot of interesting work done. The unfortunate truth of the
matter is that we don't have the time as developers to support users.
But it hasn't stopped developers from diving in and fixing things.

It's just that for the most part, people seem scared. I don't know
why, Felix/Luis were very supportive and patient with me when I was
coming up to scratch on how the 11n chips work as part of FreeBSD
development _AND_ we found bugs in ath9k together. Heck, Felix and I
are _still_ finding bugs/bad assumptions in ath9k even today, because
I'm going through the motions of adding AR9300 support to FreeBSD and
doing my own code/documentation review.

Now - Linus. I'm glad we got to the root cause of this. I'd really
appreciate it if you could find an official Google support path to
lodge an actual bug report and get the google chromeos people to work
on this. Google have people working both on ath9k and the general
Linux wireless infrastructure. They're the ones using open source and
they can work with the open source community to get the bugs fixed.

_I_ would really like to see more community involvement with OEMs
using open source wireless (Linux, BSD, Haiku, etc.) That's partially
going to happen through the vendors and it's partially going to happen
through OEMs who are deploying them. Let's try to work with both to
keep the communication lines open.

Finally - Peter, I know you've been burnt in the past. Your posts
aren't productive. Some of the bugs can be fixed in software. Some
bugs you've brought up in the past require a PCI(e) analyser and logic
analyser. I don't (yet) have those at home - but if someone wants to
work on an FPGA project to build open source versions of both, I'm all
ears. There's practical aspects to all of this which I, Luis and Felix
are working on but we don't have the magic answer(s) yet. But please
acknowledge all the damned fine work that's been going on lately. Luis
and I have spent a lot of my non-work time doing code review and
working with QCA legal to get things opened up. You conveniently seem
to miss that.

2c,


Adrian

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-25 16:42                 ` Linus Torvalds
  2013-03-25 18:03                   ` Peter Stuge
@ 2013-03-26  2:12                   ` Linus Torvalds
  2013-03-26  9:58                     ` Jouni Malinen
  1 sibling, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2013-03-26  2:12 UTC (permalink / raw)
  To: ath9k-devel

On Mon, Mar 25, 2013 at 9:42 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Mar 25, 2013 at 9:35 AM, Jouni Malinen <jouni@qca.qualcomm.com> wrote:
>>
>> I'm not really familiar with debugging with NM enabled, but it could be
>> possible that the instructions for the older wpa_supplicant on
>> https://live.gnome.org/NetworkManager/Debugging could be adopted for
>> this. The key here would be to get -K added wpa_supplicant command line
>> since exposing of the keys cannot be enabled dynamically at runtime.
>
> I tried that already, but couldn't get it to work. I suspect there's
> some tighter integration between NM and wpa_supplicant, so that it
> doesn't actually use that Exec= line any more, or edits it, or
> something.

Nothing sane worked, so I did a brute-force "let's just make
wpa_supplicant a shell-script that adds the debug fields and then runs
the real wpa_supplicant binary with the extra flags".

And that worked.
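
A rough sketch of what such a wrapper could look like. The actual script
is not shown in this thread, so the renamed binary path
(/usr/sbin/wpa_supplicant.real) and the choice of -ddKt (the flags
suggested earlier in the thread) are assumptions:

```shell
#!/bin/sh
# Sketch of a wrapper installed under the name "wpa_supplicant".
# ASSUMPTION: the original binary was moved aside to the path in REAL.
REAL=${WPA_SUPPLICANT_REAL:-/usr/sbin/wpa_supplicant.real}
# -dd: verbose debug, -K: include keys in the log, -t: timestamps.
EXTRA_FLAGS="-ddKt"

run_real() {
    # Prepend the debug flags, then pass the caller's arguments unchanged.
    "$REAL" $EXTRA_FLAGS "$@"
}

# Only act as the wrapper when invoked under the daemon's own name, so the
# file can also be sourced for inspection without starting anything.
if [ "${0##*/}" = "wpa_supplicant" ]; then
    run_real "$@"
fi
```

Because the caller's arguments are forwarded as-is, whatever starts
wpa_supplicant does not need to know about the extra flags. The wrapper
should be removed again once the logs are collected, since -K exposes
keys.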

The bugzilla at

  https://bugzilla.redhat.com/show_bug.cgi?id=927191

now has a wpa_supplicant trace with keys and timestamps for the
non-working NetworkManager case too.

I see no sane difference. There are several dbus-related setup
differences, but then in the actual handshake, afaik they do the same
thing, except the non-working one never gets the reply after sending
EAPOL-Key 2/4. I dunno. I have no idea what that thing is actually
doing.

Hopefully somebody else can see what the difference is.

               Linus

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-26  2:12                   ` Linus Torvalds
@ 2013-03-26  9:58                     ` Jouni Malinen
  2013-03-26 10:28                       ` Peter Stuge
                                         ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Jouni Malinen @ 2013-03-26  9:58 UTC (permalink / raw)
  To: ath9k-devel

On Mon, Mar 25, 2013 at 07:12:57PM -0700, Linus Torvalds wrote:
> Nothing sane worked, so I did a brute-force "let's just make
> wpa_supplicant a shell-script that adds the debug fields and then runs
> the real wpa_supplicant binary with the extra flags".

Ah, yes. I remember having done that at some point a long time ago when
giving up on NM.. ;-)

> The bugzilla at
> 
>   https://bugzilla.redhat.com/show_bug.cgi?id=927191
> 
> now has a wpa_supplicant trace with keys and timestamps for the
> non-working NetworkManager case too.
> 
> I see no sane difference. There are several dbus-related setup
> differences, but then in the actual handshake, afaik they do the same
> thing, except the non-working one never gets the reply after sending
> EAPOL-Key 2/4. I dunno. I have no idea what that thing is actually
> doing.

I could not find any real difference in the security negotiation of
EAPOL-Key messages 1-2. The point that Robert made in the bugzilla case
is interesting, though. I did not notice this at first, but there is
indeed a clear difference in the driver interface (nl80211 vs. WEXT)
that is being used here. This should not have really caused the issue
since both cases used cfg80211 and same set of parameters for
association. Anyway, that is the only clear difference..

If you still happen to be at the location with this AP, it could be
useful to confirm that this kernel interface difference is indeed the
reason by running the manual configuration case with -Dnl80211 added to
the wpa_supplicant command line to force nl80211 interface to be used.
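
One way to run that comparison, as a sketch: start the same manual
configuration once per driver backend and compare the debug logs. The
interface name and the config file path are placeholders, and
run_with_driver is a made-up helper name:

```shell
#!/bin/sh
# Sketch: run the same manual configuration with a chosen driver backend.
# ASSUMPTIONS: IFACE and CONF are placeholders for the real interface and
# the config file used in the manual test.
IFACE=${IFACE:-wlan0}
CONF=${CONF:-/tmp/wpa-manual.conf}

run_with_driver() {
    driver=$1   # "nl80211" or "wext"
    # -D selects the driver interface, -i the network interface, -c the
    # config file; -ddKt gives verbose logs with keys and timestamps.
    wpa_supplicant -D"$driver" -i "$IFACE" -c "$CONF" -ddKt
}

# Example (as root, one run at a time, stopping each with Ctrl-C):
#   run_with_driver wext    2>&1 | tee /tmp/wext.log
#   run_with_driver nl80211 2>&1 | tee /tmp/nl80211.log
```

If the nl80211 run fails against this AP while the wext run succeeds,
the difference is in the kernel interface rather than in NM.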

In addition, you could try to collect the frames exchanged by the
success and failure cases using a monitor interface:

iw wlan0 interface add mon0 type monitor
ifconfig mon0 up
dumpcap -i mon0 -w /tmp/capture.pkt

And then run the success/failure case.
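
The same steps wrapped into small functions, as a sketch, with a
teardown step added so the monitor interface does not linger. The
function names are made up; the interface names and output path are the
ones from the commands above:

```shell
#!/bin/sh
# Sketch: monitor-interface capture with explicit setup and teardown.
IFACE=${IFACE:-wlan0}
MON=${MON:-mon0}
OUT=${OUT:-/tmp/capture.pkt}

capture_setup() {
    # Create the monitor interface next to the managed one and bring it up.
    iw dev "$IFACE" interface add "$MON" type monitor &&
    ifconfig "$MON" up
}

capture_run() {
    # Runs until interrupted (Ctrl-C); reproduce the connection attempt
    # from another terminal while this is running.
    dumpcap -i "$MON" -w "$OUT"
}

capture_teardown() {
    # Remove the monitor interface once the capture is done.
    iw dev "$MON" del
}

# Example (as root):
#   capture_setup && capture_run; capture_teardown
```

Using a separate OUT path for the working and the failing attempt keeps
the two captures apart for comparison.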

-- 
Jouni Malinen                                            PGP id EFC895FA

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-26  9:58                     ` Jouni Malinen
@ 2013-03-26 10:28                       ` Peter Stuge
  2013-03-26 13:56                       ` Dan Williams
  2013-03-26 15:25                       ` Linus Torvalds
  2 siblings, 0 replies; 27+ messages in thread
From: Peter Stuge @ 2013-03-26 10:28 UTC (permalink / raw)
  To: ath9k-devel

Jouni Malinen wrote:
> a clear difference in the driver interface (nl80211 vs. WEXT)
> that is being used here. This should not have really caused the
> issue since both cases used cfg80211 and same set of parameters for
> association. Anyway, that is the only clear difference..

I can confirm that nl80211 and WEXT behave differently on ath5k AR5414.

I'll look for repeatability and make captures if you want.


//Peter

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-26  9:58                     ` Jouni Malinen
  2013-03-26 10:28                       ` Peter Stuge
@ 2013-03-26 13:56                       ` Dan Williams
  2013-03-26 15:25                       ` Linus Torvalds
  2 siblings, 0 replies; 27+ messages in thread
From: Dan Williams @ 2013-03-26 13:56 UTC (permalink / raw)
  To: ath9k-devel

On Tue, 2013-03-26 at 11:58 +0200, Jouni Malinen wrote:
> On Mon, Mar 25, 2013 at 07:12:57PM -0700, Linus Torvalds wrote:
> > Nothing sane worked, so I did a brute-force "let's just make
> > wpa_supplicant a shell-script that adds the debug fields and then runs
> > the real wpa_supplicant binary with the extra flags".
> 
> Ah, yes. I remember having done that at some point long time ago when
> giving up with NM.. ;-)

That debugging page should really be updated now that most people are on
0.9.  In any case, when developing, to get around systemd + NM
respawning the supplicant when it dies, I typically do:

mv /usr/sbin/wpa_supplicant /
killall -TERM wpa_supplicant
/wpa_supplicant -dddtuK

and then NM will see the supplicant start again, and reconnect, and you
get all the debug logs.  You can also poke the supplicant via D-Bus to
increase the logging level, but I'm not sure if the D-Bus interface
allows exposing the keys.

> > The bugzilla at
> > 
> >   https://bugzilla.redhat.com/show_bug.cgi?id=927191
> > 
> > now has a wpa_supplicant trace with keys and timestamps for the
> > non-working NetworkManager case too.
> > 
> > I see no sane difference. There are several dbus-related setup
> > differences, but then in the actual handshake, afaik they do the same
> > thing, except the non-working one never gets the reply after sending
> > EAPOL-Key 2/4. I dunno. I have no idea what that thing is actually
> > doing.
> 
> I could not find any real difference in the security negotiation of
> EAPOL-Key messages 1-2. The point that Robert made in the bugzilla case
> is interesting, though. I did not notice this at first, but there is
> indeed a clear difference in the driver interface (nl80211 vs. WEXT)
> that is being used here. This should not have really caused the issue
> since both cases used cfg80211 and same set of parameters for
> association. Anyway, that is the only clear difference..

NM has passed "nl80211,wext" as the supplicant driver for a year or so
now, because we all know nl80211 is the way to go.  Besides that, the
network config sent to the supplicant does not change at all between
nl80211 and wext; you can see the options in /var/log/messages:

<info> Config: added 'ssid' value 'rio grande'
<info> Config: added 'scan_ssid' value '1'
<info> Config: added 'key_mgmt' value 'WPA-PSK'
<info> Config: added 'psk' value '<omitted>'
<info> Config: added 'proto' value 'WPA RSN'
<info> Activation (wlan0) Stage 2 of 5 (Device Configure) complete.

NM doesn't dump the key, but that feature seems useful and could be
added with some huge warnings.

> If you still happen to be at the location with this AP, it could be
> useful to confirm that this kernel interface difference is indeed the
> reason by running the manual configuration case with -Dnl80211 added to
> the wpa_supplicant command line to force nl80211 interface to be used.

This would be a great test; obviously if nl80211 fails to work but wext
does work, we need to fix that in the supplicant or kernel.

Dan

> In addition, you could try to collect the frames exchanged by the
> success and failure cases using a monitor interface:
> 
> iw wlan0 interface add mon0 type monitor
> ifconfig mon0 up
> dumpcap -i mon0 -w /tmp/capture.pkt
> 
> And then run the success/failure case.
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-26  9:58                     ` Jouni Malinen
  2013-03-26 10:28                       ` Peter Stuge
  2013-03-26 13:56                       ` Dan Williams
@ 2013-03-26 15:25                       ` Linus Torvalds
  2013-03-26 15:40                         ` Jouni Malinen
  2 siblings, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2013-03-26 15:25 UTC (permalink / raw)
  To: ath9k-devel

On Tue, Mar 26, 2013 at 2:58 AM, Jouni Malinen <jouni@qca.qualcomm.com> wrote:
>
> If you still happen to be at the location with this AP, it could be
> useful to confirm that this kernel interface difference is indeed the
> reason by running the manual configuration case with -Dnl80211 added to
> the wpa_supplicant command line to force nl80211 interface to be used.

Ok, added to the bugzilla. I'm still in the condo, but I'm leaving
soon and need to pack up, so this is likely the last thing I can do..

It is indeed the nl80211 part that causes problems, so I guess we're
back to an actual ath9k issue.

> In addition, you could try to collect the frames exchanged by the
> success and failure cases using a monitor interface:
>
> iw wlan0 interface add mon0 type monitor
> ifconfig mon0 up
> dumpcap -i mon0 -w /tmp/capture.pkt
>
> And then run the success/failure case.

Ok, will try to do that too.

                Linus

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-26 15:25                       ` Linus Torvalds
@ 2013-03-26 15:40                         ` Jouni Malinen
  2013-03-26 19:20                             ` Luis R. Rodriguez
  0 siblings, 1 reply; 27+ messages in thread
From: Jouni Malinen @ 2013-03-26 15:40 UTC (permalink / raw)
  To: ath9k-devel

On Tue, Mar 26, 2013 at 08:25:20AM -0700, Linus Torvalds wrote:
> Ok, added to the bugzilla. I'm still in the condo, but I'm leaving
> soon and need to pack up, so this is likely the last thing I can do..
> 
> It is indeed the nl80211 part that causes problems, so I guess we're
> back to an actual ath9k issue.

Thanks. This is peculiar, but indeed pretty clear on the issue being
triggered by use of nl80211 instead of WEXT regardless of what the
actual issue is. I haven't used wpa_supplicant v1.0-rc3 pretty much
at all myself and there have been quite a few changes in nl80211
interaction since then. Anyway, this should give a good test case with
the current kernel and that somewhat older version with nl80211
implementation in wpa_supplicant.

> > In addition, you could try to collect the frames exchanged by the
> > success and failure cases using a monitor interface:
> >
> > iw wlan0 interface add mon0 type monitor
> > ifconfig mon0 up
> > dumpcap -i mon0 -w /tmp/capture.pkt
> >
> > And then run the success/failure case.
> 
> Ok, will try to do that too.

If you'll get a chance to do that, it would be quite helpful since it
could make it easier to try to reproduce this if there is indeed
something special about that AP behavior that allows the issue to show
up but does not show up in the wpa_supplicant debug log.

-- 
Jouni Malinen                                            PGP id EFC895FA

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [ath9k-devel] ath9k not connecting to one particular network..
  2013-03-26 15:40                         ` Jouni Malinen
@ 2013-03-26 19:20                             ` Luis R. Rodriguez
  0 siblings, 0 replies; 27+ messages in thread
From: Luis R. Rodriguez @ 2013-03-26 19:20 UTC (permalink / raw)
  To: Jouni Malinen, linux-wireless
  Cc: Linus Torvalds, Vasanthakumar Thiagarajan, Dan Williams,
	John W. Linville, ath9k-devel, Joel Wirāmu Pauling,
	Senthil Balasubramanian

Adding linux-wireless, not sure if many people read this list.

  Luis

On Tue, Mar 26, 2013 at 8:40 AM, Jouni Malinen <jouni@qca.qualcomm.com> wrote:
> On Tue, Mar 26, 2013 at 08:25:20AM -0700, Linus Torvalds wrote:
>> Ok, added to the bugzilla. I'm still in the condo, but I'm leaving
>> soon and need to pack up, so this is likely the last thing I can do..
>>
>> It is indeed the nl80211 part that causes problems, so I guess we're
>> back to an actual ath9k issue.
>
> Thanks. This is peculiar, but indeed pretty clear on the issue being
> triggered by use of nl80211 instead of WEXT regardless of what the
> actual issue is. I haven't used wpa_supplicant v1.0-rc3 pretty much
> at all myself and there has been quite a few changes in nl80211
> interaction since then. Anyway, this should give a good test case with
> the current kernel and that somewhat older version with nl80211
> implementation in wpa_supplicant.
>
>> > In addition, you could try to collect the frames exchanged by the
>> > success and failure cases using a monitor interface:
>> >
>> > iw wlan0 interface add mon0 type monitor
>> > ifconfig mon0 up
>> > dumpcap -i mon0 -w /tmp/capture.pkt
>> >
>> > And then run the success/failure case.
>>
>> Ok, will try to do that too.
>
> If you'll get a chance to do that, it would be quite helpful since it
> could make it easier to try to reproduce this if there is indeed
> something special about that AP behavior that allows the issue to show
> up but does not show up in the wpa_supplicant debug log.
>
> --
> Jouni Malinen                                            PGP id EFC895FA

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2013-03-26 19:20 UTC | newest]

Thread overview: 27+ messages
2013-03-25  4:06 [ath9k-devel] ath9k not connecting to one particular network Linus Torvalds
2013-03-25  4:22 ` Outback Dingo
2013-03-25  9:05   ` Linus Torvalds
2013-03-25  4:26 ` Joel Wirāmu Pauling
2013-03-25  9:23   ` Linus Torvalds
2013-03-25  9:38     ` Linus Torvalds
2013-03-25 10:12     ` Jouni Malinen
2013-03-25 10:38       ` Linus Torvalds
2013-03-25 11:06         ` Linus Torvalds
2013-03-25 11:30           ` Jouni Malinen
2013-03-25 12:17             ` Joel Wirāmu Pauling
2013-03-25 12:46               ` Jouni Malinen
2013-03-25 16:04             ` Linus Torvalds
2013-03-25 16:35               ` Jouni Malinen
2013-03-25 16:42                 ` Linus Torvalds
2013-03-25 18:03                   ` Peter Stuge
2013-03-25 19:01                     ` Adrian Chadd
2013-03-26  2:12                   ` Linus Torvalds
2013-03-26  9:58                     ` Jouni Malinen
2013-03-26 10:28                       ` Peter Stuge
2013-03-26 13:56                       ` Dan Williams
2013-03-26 15:25                       ` Linus Torvalds
2013-03-26 15:40                         ` Jouni Malinen
2013-03-26 19:20                           ` Luis R. Rodriguez
2013-03-26 19:20                             ` Luis R. Rodriguez
2013-03-25 10:43     ` Oleksij Rempel
2013-03-25  5:38 ` Adrian Chadd
