* Malformed AddBA Response prevents clients from associating?
@ 2014-08-25  2:17 Denton Gentry
  2014-08-25  2:22 ` Adrian Chadd
  0 siblings, 1 reply; 6+ messages in thread
From: Denton Gentry @ 2014-08-25  2:17 UTC
  To: ath10k

[-- Attachment #1: Type: text/plain, Size: 2066 bytes --]

I have an AP using ath10k as a 5 GHz 802.11ac interface. Very
occasionally it gets into a mode where stations are unable to
associate. I don’t know how it gets into this mode, but once there the
sequence of events when a client tries to associate is:

1. Client and AP exchange Authentication, Association Request, and
Association Response frames.

2. AP sends the first packet in the EAPOL exchange.

3. Client sends an AddBA request.

4. AP sends back a malformed AddBA Response. The frame is recognizable
as an AddBA Response, but it has 10 extra bytes appended and carries a
bad FCS. Once the AP is in this state, every AddBA Response I’ve
captured shows the same two defects.

5. The client discards the AddBA Response because the FCS is wrong.
Gathering pcaps on the client shows that it does try to send its EAPOL
response, but that packet never makes it onto the air (presumably
because the AddBA is pending).
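
The symptom in steps 4 and 5 can be checked offline on a captured frame.
The sketch below is an illustration only. It assumes a 24-byte
management header with no HT Control field, the fixed 9-byte ADDBA
Response body (category, action, dialog token, status code, Block Ack
parameter set, timeout), and a capture that still includes the 4-byte
FCS. The function names and the synthetic frame are hypothetical and
are not taken from the attached pcaps.

```python
import struct
import zlib

MGMT_HEADER_LEN = 24      # assumption: no HT Control field present
ADDBA_RESP_BODY_LEN = 9   # category, action, token, status(2), params(2), timeout(2)
FCS_LEN = 4


def inspect_addba_response(frame: bytes) -> dict:
    """Report FCS validity and trailing bytes for a frame that includes its FCS."""
    if len(frame) < MGMT_HEADER_LEN + ADDBA_RESP_BODY_LEN + FCS_LEN:
        raise ValueError("frame too short to be an ADDBA Response")
    payload, fcs = frame[:-FCS_LEN], frame[-FCS_LEN:]
    body = payload[MGMT_HEADER_LEN:]
    expected_fcs = struct.pack("<I", zlib.crc32(payload) & 0xFFFFFFFF)
    return {
        # Category 3 is Block Ack; action 1 is ADDBA Response.
        "is_addba_response": body[0] == 3 and body[1] == 1,
        "fcs_ok": expected_fcs == fcs,
        "extra_bytes": len(body) - ADDBA_RESP_BODY_LEN,
    }


def build_frame(extra: bytes = b"", corrupt_fcs: bool = False) -> bytes:
    """Build a synthetic ADDBA Response with placeholder addresses."""
    header = struct.pack("<HH", 0x00D0, 0)   # frame control (Action frame), duration
    header += bytes(6) * 3                   # addr1, addr2, addr3 (all zero here)
    header += struct.pack("<H", 0)           # sequence control
    # Token 1, status 0 (success), params 0x1002 (immediate BA, TID 0,
    # buffer size 64), timeout 0.
    body = struct.pack("<BBBHHH", 3, 1, 1, 0, 0x1002, 0)
    payload = header + body + extra
    fcs = zlib.crc32(payload) & 0xFFFFFFFF
    if corrupt_fcs:
        fcs ^= 0xFFFFFFFF                    # flip every bit so the check fails
    return payload + struct.pack("<I", fcs)


good = inspect_addba_response(build_frame())
bad = inspect_addba_response(build_frame(extra=bytes(10), corrupt_fcs=True))
print(good)
print(bad)
```

Trailing bytes alone are not proof of corruption, since an action frame
can carry optional elements after the fixed fields. The combination
described above, extra bytes together with a failed FCS check, is what
makes the client drop the frame.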


Rebooting the client does not resolve the problem; after reboot it
still cannot associate. Rebooting the AP resolves the problem.

I don’t know a lot about what causes the AP to get into this state. I
wrote a script to make clients sit in a loop disassociating and
associating every few seconds. A single client running this loop
worked for many hours; it never failed. Starting the loop on two
clients made the problem happen in about two hours. I have not yet
tried it with more clients.
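
A loop of this kind could look roughly like the sketch below. This is
not the script used in the test, which is not shown in this thread. It
assumes the client is managed by wpa_supplicant and driven through
wpa_cli; the interface name, the delays, and the function names are
made up for illustration.

```python
import subprocess
import time
from typing import Callable


def wpa_cli(iface: str, *args: str) -> str:
    """Run wpa_cli on iface and return its stdout (assumes wpa_supplicant is running)."""
    result = subprocess.run(["wpa_cli", "-i", iface, *args],
                            capture_output=True, text=True, check=True)
    return result.stdout


def is_associated(status_output: str) -> bool:
    """True when 'wpa_cli status' reports that the key handshake completed."""
    return "wpa_state=COMPLETED" in status_output.splitlines()


def cycle(iface: str, iterations: int,
          run: Callable[..., str] = wpa_cli,
          sleep: Callable[[float], None] = time.sleep,
          settle: float = 5.0) -> int:
    """Disconnect and reconnect repeatedly.

    Returns the index of the first iteration that failed to reach
    COMPLETED, or -1 if every iteration succeeded.
    """
    for i in range(iterations):
        run(iface, "disconnect")
        sleep(1.0)
        run(iface, "reconnect")
        sleep(settle)          # give the EAPOL exchange time to finish
        if not is_associated(run(iface, "status")):
            return i
    return -1
```

Checking for wpa_state=COMPLETED rather than mere association matters
here, because in the failure described above the association itself
succeeds and it is the EAPOL exchange that never finishes.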

Because the AddBA Response is generated by the firmware, I believe the
problem must be in the firmware. For some reason, it starts sending
malformed AddBA Responses.

I’ve attached two pcaps, which I’ve trimmed down to manageable size by
removing lots of beacons. The Malformed_AddBAR.pcap is from a system
in the state where clients cannot associate. The Normal_AddBAR.pcap is
captured using the same AP and client, but with everything working
normally (the EAPOL exchange completes, and the client begins sending
QoS Data frames).

[-- Attachment #2: Malformed_AddBAR.pcap --]
[-- Type: application/octet-stream, Size: 4680 bytes --]

[-- Attachment #3: Normal_AddBAR.pcap --]
[-- Type: application/octet-stream, Size: 4687 bytes --]

[-- Attachment #4: Type: text/plain, Size: 146 bytes --]

_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k


* Re: Malformed AddBA Response prevents clients from associating?
  2014-08-25  2:17 Malformed AddBA Response prevents clients from associating? Denton Gentry
@ 2014-08-25  2:22 ` Adrian Chadd
  2014-08-25  2:28   ` Denton Gentry
  0 siblings, 1 reply; 6+ messages in thread
From: Adrian Chadd @ 2014-08-25  2:22 UTC
  To: Denton Gentry; +Cc: ath10k

Which firmware build are you using?



-a




* Re: Malformed AddBA Response prevents clients from associating?
  2014-08-25  2:22 ` Adrian Chadd
@ 2014-08-25  2:28   ` Denton Gentry
  2014-08-25  6:59     ` Kalle Valo
  0 siblings, 1 reply; 6+ messages in thread
From: Denton Gentry @ 2014-08-25  2:28 UTC
  To: Adrian Chadd; +Cc: ath10k

[   18.047597] ath10k: qca988x hw2.0 (0x4100016c, 0x043202ff) fw 10.1.467.2-1 api 2 htt 2.1
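
For anyone collecting this from several APs, the banner can be pulled
apart with a small regular expression. This is a sketch based only on
the one log line quoted here; other driver versions may print a
different format. The field names are my own labels, and the two hex
words are left uninterpreted.

```python
import re
from typing import Optional

ATH10K_BANNER = re.compile(
    r"ath10k: (?P<chip>\S+) (?P<hw>hw\S+) "
    r"\((?P<word1>0x[0-9a-fA-F]+), (?P<word2>0x[0-9a-fA-F]+)\) "
    r"fw (?P<fw>\S+) api (?P<api>\d+) htt (?P<htt>\S+)"
)


def parse_banner(line: str) -> Optional[dict]:
    """Extract firmware details from an ath10k boot banner, or None if no match."""
    match = ATH10K_BANNER.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["api"] = int(info["api"])
    # First two dotted components of the firmware version, e.g. "10.1".
    info["fw_branch"] = ".".join(info["fw"].split(".")[:2])
    return info


line = ("[   18.047597] ath10k: qca988x hw2.0 (0x4100016c, 0x043202ff) "
        "fw 10.1.467.2-1 api 2 htt 2.1")
info = parse_banner(line)
print(info["fw"], info["fw_branch"], info["api"])   # prints: 10.1.467.2-1 10.1 2
```

If the log line has been wrapped by a mail client, join the pieces with
a single space before parsing.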

On Sun, Aug 24, 2014 at 7:22 PM, Adrian Chadd <adrian@freebsd.org> wrote:
> Which firmware build are you using?


* Re: Malformed AddBA Response prevents clients from associating?
  2014-08-25  2:28   ` Denton Gentry
@ 2014-08-25  6:59     ` Kalle Valo
  2014-08-26 16:44       ` Denton Gentry
  0 siblings, 1 reply; 6+ messages in thread
From: Kalle Valo @ 2014-08-25  6:59 UTC
  To: Denton Gentry; +Cc: Adrian Chadd, ath10k

Denton Gentry <denton.gentry@gmail.com> writes:

> [   18.047597] ath10k: qca988x hw2.0 (0x4100016c, 0x043202ff) fw
> 10.1.467.2-1 api 2 htt 2.1
>
> On Sun, Aug 24, 2014 at 7:22 PM, Adrian Chadd <adrian@freebsd.org> wrote:
>> Which firmware build are you using?

(Standard whining: top posting is annoying.)

Denton, any chance you could test with 10.2 firmware and see if the bug
happens on that firmware as well? I don't know what version of ath10k
you are using, but I think it should be enough just to have this
patch:

https://github.com/kvalo/ath/commit/24c88f7807fb7c723690474d0a5d3441468185d9

And then take the firmware from here:

https://github.com/kvalo/ath10k-firmware/tree/master/10.2

Do not rename the firmware to firmware-2.bin or anything like that. To
avoid problems keep the name firmware-3.bin.

-- 
Kalle Valo


* Re: Malformed AddBA Response prevents clients from associating?
  2014-08-25  6:59     ` Kalle Valo
@ 2014-08-26 16:44       ` Denton Gentry
  2014-08-26 18:33         ` Kalle Valo
  0 siblings, 1 reply; 6+ messages in thread
From: Denton Gentry @ 2014-08-26 16:44 UTC
  To: Kalle Valo; +Cc: Adrian Chadd, ath10k

On Sun, Aug 24, 2014 at 11:59 PM, Kalle Valo <kvalo@qca.qualcomm.com> wrote:
> Denton Gentry <denton.gentry@gmail.com> writes:
>
>> [   18.047597] ath10k: qca988x hw2.0 (0x4100016c, 0x043202ff) fw
>> 10.1.467.2-1 api 2 htt 2.1
>>
>> On Sun, Aug 24, 2014 at 7:22 PM, Adrian Chadd <adrian@freebsd.org> wrote:
>>> Which firmware build are you using?
>
> (Standard whining: top posting is annoying.)
>
> Denton, any chance you could test with 10.2 firmware and see if the bug
> happens on that firmware as well? I don't know what version of ath10k
> you are using, but I think it should be enough just to have this
> patch:
>
> https://github.com/kvalo/ath/commit/24c88f7807fb7c723690474d0a5d3441468185d9
>
> And then take the firmware from here:
>
> https://github.com/kvalo/ath10k-firmware/tree/master/10.2
>
> Do not rename the firmware to firmware-2.bin or anything like that. To
> avoid problems keep the name firmware-3.bin.
>
> --
> Kalle Valo

I built an image with the 10.2 firmware and left it running with two
clients disassociating and reassociating every few seconds. After 8
hours I stopped it, with no failures detected.

I restored the 10.1 firmware, and it failed in 9 minutes. There is a
substantial amount of randomness, but it generally happens within 2
hours.

So whatever this issue is, firmware 10.2 does appear to resolve it or
at least make it way less likely to happen.
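
As a rough sanity check on how much an 8-hour clean run says, the
back-of-the-envelope sketch below assumes the time to failure is
exponentially distributed. That model, and the 2-hour mean plugged into
it, are assumptions made purely for illustration; neither was measured.

```python
import math


def prob_no_failure(hours_run: float, mean_hours_to_failure: float) -> float:
    """Chance of seeing no failure in hours_run under an exponential model."""
    return math.exp(-hours_run / mean_hours_to_failure)


# If 10.2 failed as often as an assumed 2-hour mean for 10.1, an 8-hour
# run without a failure would be unlikely.
print(round(prob_no_failure(8, 2), 3))   # prints: 0.018
```

Under those assumptions an 8-hour run with no failure would happen
under 2% of the time if nothing had changed, which is consistent with
10.2 either fixing the issue or making it much rarer. A single run
cannot distinguish between those two.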


* Re: Malformed AddBA Response prevents clients from associating?
  2014-08-26 16:44       ` Denton Gentry
@ 2014-08-26 18:33         ` Kalle Valo
  0 siblings, 0 replies; 6+ messages in thread
From: Kalle Valo @ 2014-08-26 18:33 UTC
  To: Denton Gentry; +Cc: Adrian Chadd, ath10k

Denton Gentry <denton.gentry@gmail.com> writes:

> I built an image with the 10.2 firmware and left it running with two
> clients disassociating and reassociating every few seconds. After 8
> hours I stopped it, with no failures detected.
>
> I restored the 10.1 firmware, and it failed in 9 minutes. There is a
> substantial amount of randomness, but it generally happens within 2
> hours.
>
> So whatever this issue is, firmware 10.2 does appear to resolve it or
> at least make it way less likely to happen.

That's good to hear, thanks for testing.

-- 
Kalle Valo

