* Frames acknowledged and silently discarded in firmware
@ 2018-02-07 22:15 Javier Cardona
  2018-02-08 23:23 ` Adrian Chadd
  2018-03-13 23:42 ` Javier Cardona
  0 siblings, 2 replies; 12+ messages in thread
From: Javier Cardona @ 2018-02-07 22:15 UTC (permalink / raw)
  To: ath10k

Hi,

We have observed a problem where, under certain conditions, the ath10k firmware will acknowledge frames but not send them up to the driver.  
Frames are sent by a mesh access point (MAP1) to a second mesh AP (MAP2) at MCS 9/NSS-3, which at that distance is probably marginal.  Since the frames get acknowledged by MAP2, MAP1 will not try a lower rate.  But the driver at MAP2 does not receive the frames.

We have captures of this exchange for both the unsuccessful case and the successful case (the latter happens when we move MAP2 closer to MAP1).  They can be found here:
https://www.dropbox.com/sh/0or86c8vxotygdc/AABI7RmQ2nztcOBF3UDXbPUma

In both scenarios, the frames from MAP1 to MAP2 are acknowledged, as observed in sniffer captures.

In the successful scenario, the driver logs show the frames being received by the driver.  Sequence numbers in the debug logs match those in the sniffer captures.

root@nbg-3ed9da:~# journalctl -kf | grep "peer 60:31:97:3e:82:e6" | grep ucast | grep 'len 374'
Feb 05 05:43:31 nbg-3ed9da kernel: ath10k_pci 0000:01:00.0: rx skb d734f6c0 len 374 peer 60:31:97:3e:82:e6 tid 0 (BE) ucast sn 1027 vht sgi rate_idx 9 vht_nss 3 freq 5825 band 1 flag 0x600800 fcs-err 0 mic-err 0 amsdu-more 0
Feb 05 05:43:42 nbg-3ed9da kernel: ath10k_pci 0000:01:00.0: rx skb d6aaa480 len 374 peer 60:31:97:3e:82:e6 tid 0 (BE) ucast sn 1031 vht sgi rate_idx 9 vht_nss 3 freq 5825 band 1 flag 0x600800 fcs-err 0 mic-err 0 amsdu-more 0
Feb 05 05:43:53 nbg-3ed9da kernel: ath10k_pci 0000:01:00.0: rx skb d53a8d80 len 374 peer 60:31:97:3e:82:e6 tid 0 (BE) ucast sn 1037 vht sgi rate_idx 9 vht_nss 3 freq 5825 band 1 flag 0x600800 fcs-err 0 mic-err 0 amsdu-more 0
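
The cross-check between driver logs and the sniffer capture can be automated. The following is a minimal Python sketch, not part of the original report: it parses rx debug lines of the form shown above and lists the sequence numbers that appear in a capture but never reached the driver. The function names and the capture-side list are hypothetical; the sniffer sequence numbers would have to be exported from the capture separately.

```python
import re

# Matches the ath10k rx debug lines quoted above.
RX_LINE = re.compile(
    r"rx skb \S+ len (?P<length>\d+) peer (?P<peer>[0-9a-f:]{17}) "
    r".*?\bsn (?P<sn>\d+) .*?fcs-err (?P<fcs_err>\d+)"
)


def parse_rx_line(line):
    """Return the fields of one rx debug line, or None if it does not match."""
    match = RX_LINE.search(line)
    if match is None:
        return None
    return {
        "length": int(match.group("length")),
        "peer": match.group("peer"),
        "sn": int(match.group("sn")),
        "fcs_err": int(match.group("fcs_err")),
    }


def missing_in_driver(log_lines, capture_sns, peer, length):
    """Sequence numbers present in the capture but absent from the driver log."""
    seen = set()
    for line in log_lines:
        fields = parse_rx_line(line)
        if fields and fields["peer"] == peer and fields["length"] == length:
            seen.add(fields["sn"])
    return sorted(set(capture_sns) - seen)


# Two of the log lines from the successful run above.
SAMPLE_LOG = [
    "Feb 05 05:43:31 nbg-3ed9da kernel: ath10k_pci 0000:01:00.0: rx skb d734f6c0 "
    "len 374 peer 60:31:97:3e:82:e6 tid 0 (BE) ucast sn 1027 vht sgi rate_idx 9 "
    "vht_nss 3 freq 5825 band 1 flag 0x600800 fcs-err 0 mic-err 0 amsdu-more 0",
    "Feb 05 05:43:42 nbg-3ed9da kernel: ath10k_pci 0000:01:00.0: rx skb d6aaa480 "
    "len 374 peer 60:31:97:3e:82:e6 tid 0 (BE) ucast sn 1031 vht sgi rate_idx 9 "
    "vht_nss 3 freq 5825 band 1 flag 0x600800 fcs-err 0 mic-err 0 amsdu-more 0",
]

# Hypothetical capture-side sequence numbers, for illustration only.
CAPTURE_SNS = [1027, 1031, 1040]

print(missing_in_driver(SAMPLE_LOG, CAPTURE_SNS, "60:31:97:3e:82:e6", 374))
```

Any sequence number this reports was acknowledged on the air (according to the capture) but not logged by the driver, which is the symptom described in this report.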

In the failure scenario, the driver logs show no frames, even though the capture shows that the frames are acknowledged.

root@nbg-3ed9da:~# journalctl -kf | grep "peer 60:31:97:3e:82:e6" | grep ucast | grep 'len 374'
<nothing here>

If we force MAP1 to use a single stream, the frames are received successfully.

  # iw mesh0 set bitrates legacy-5 ht-mcs-5 vht-mcs-5 1:0-9

It seems as if the firmware is acknowledging but silently discarding frames… is that possible?
Can anyone provide some pointers on how to troubleshoot this?  

We are using this firmware: https://github.com/kvalo/ath10k-firmware/blob/master/QCA9984/hw1.0/3.4/firmware-5.bin_10.4-3.4-00104 and kernel 4.9.31 with a few cherry-picked patches from the ath10k branch.
The hardware is QCA9984.

Best,

Javier




_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k


* Re: Frames acknowledged and silently discarded in firmware
  2018-02-07 22:15 Frames acknowledged and silently discarded in firmware Javier Cardona
@ 2018-02-08 23:23 ` Adrian Chadd
  2018-02-09  1:21   ` Javier Cardona
  2018-03-13 23:42 ` Javier Cardona
  1 sibling, 1 reply; 12+ messages in thread
From: Adrian Chadd @ 2018-02-08 23:23 UTC (permalink / raw)
  To: Javier Cardona; +Cc: ath10k

hi,

do you have firmware source at the current gig? or access to the pktlog tools?

Have you verified that the frames weren't at all indicated up via HTT?
Like are they absolutely NOT coming up via HTT, or are they coming up
via HTT but being thrown out before the driver is ready to pass them
up.


-a



* Re: Frames acknowledged and silently discarded in firmware
  2018-02-08 23:23 ` Adrian Chadd
@ 2018-02-09  1:21   ` Javier Cardona
  2018-02-09  1:27     ` Adrian Chadd
  0 siblings, 1 reply; 12+ messages in thread
From: Javier Cardona @ 2018-02-09  1:21 UTC (permalink / raw)
  To: Adrian Chadd; +Cc: ath10k

Hi Adrian,

We do not have access to firmware source, nor to any other QCA proprietary tools at this time.
The logs were collected by enabling ATH10K_DBG_DATA in debug_mask.

AFAICT, those messages are generated inside ath10k_htt_rx_h_deliver(), and the only place where the frames could be dropped before that is in ath10k_htt_rx_h_filter(), for conditions that do not apply (unconfigured channel or CAC in progress).  But if there is any other way to rule out that frames are being dropped before that point, I’ll definitely try it out.

In the same logs, we do receive some fcs-err=1 frames, but they are not the “missing” frames: MAC address and sequence number do not match.  In other words, frames with an invalid FCS, which the driver later drops, still make it to the debug logs.
My current working assumption is that the missing frames fail integrity checks on reception but are still acknowledged and then dropped in firmware.  This is what the observations so far seem to indicate… is that at all possible?

Thanks for your help!

Javier


* Re: Frames acknowledged and silently discarded in firmware
  2018-02-09  1:21   ` Javier Cardona
@ 2018-02-09  1:27     ` Adrian Chadd
  0 siblings, 0 replies; 12+ messages in thread
From: Adrian Chadd @ 2018-02-09  1:27 UTC (permalink / raw)
  To: Javier Cardona; +Cc: ath10k

Hi,

Yeah, it's possible it's being dropped in the HTT notification path -
the hardware afaik is doing the ACKing there.

I think pktlog is the primary way I'd try to debug this but yeah,
there's no open pktlog tool iirc for even just looking at the packet
data..



-adrian


* Re: Frames acknowledged and silently discarded in firmware
  2018-02-07 22:15 Frames acknowledged and silently discarded in firmware Javier Cardona
  2018-02-08 23:23 ` Adrian Chadd
@ 2018-03-13 23:42 ` Javier Cardona
  2018-03-14  0:09   ` Ben Greear
  2018-03-14  0:52   ` Thomas Pedersen
  1 sibling, 2 replies; 12+ messages in thread
From: Javier Cardona @ 2018-03-13 23:42 UTC (permalink / raw)
  To: ath10k

Hi,

We have resolved this issue.  I'm sharing the details in case they help others.

The description of the problem was accurate EXCEPT that the Acks we observed on the sniffer were not being sent by the failing station (MAP2, in the context of my original e-mail) but by a third station, MAP3.  Those Acks were sent by MAP3 but with MAP2's address in the Transmitter Address field.

These anomalous Block Acks were sent because the MAC-ADDRESS-FILTER was misconfigured at MAP3, which caused that station to respond to addresses other than its own.  The reasons for this misconfiguration were:
  (1) in mesh (and other) mode(s), the driver creates a hidden monitor vif alongside the mesh vif
  (2) this second monitor vif is assigned the default MAC address reported by the firmware (arvif->mac_addr)
  (3) this MAC address (which the driver was getting from the pre-cal files) is XOR'd with the mesh vif address to configure MAC-ADDRESS-FILTER
  (4) once that happens, the hardware will ack all addresses that pass the MAC-ADDRESS-FILTER.  If the two MAC addresses (vif->addr and arvif->mac_addr) are very dissimilar, that will result in a storm of invalid Block Acks
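
Steps (3) and (4) can be illustrated with a short Python sketch. This is an editorial illustration, not the firmware's actual code, and it rests on assumptions that are not stated in this message: that the filter is the bitwise complement of the XOR of the two addresses (a set bit means "this address bit must match"), that the address is packed little-endian (first octet in bits 0-7), and that the result is split into a low 32-bit and an upper 16-bit register. These assumptions were picked because they give all-ones register values when the two addresses are equal, which is what the register dump in this message shows. The addresses in the example are hypothetical.

```python
MASK_48 = (1 << 48) - 1


def mac_to_int(mac):
    """Pack a MAC string little-endian: the first octet lands in bits 0-7 (assumed layout)."""
    octets = bytes.fromhex(mac.replace(":", ""))
    if len(octets) != 6:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return int.from_bytes(octets, "little")


def filter_registers(addr_a, addr_b):
    """Return (low32, upper16) of the assumed filter: bits where both addresses agree."""
    differing_bits = mac_to_int(addr_a) ^ mac_to_int(addr_b)
    mask = ~differing_bits & MASK_48
    return mask & 0xFFFFFFFF, mask >> 32


# Hypothetical addresses: identical in the first call, one bit apart in the second.
for a, b in [("02:00:00:00:00:01", "02:00:00:00:00:01"),
             ("02:00:00:00:00:01", "02:00:00:00:00:03")]:
    low32, upper16 = filter_registers(a, b)
    print(f"{a} vs {b}: L32 0x{low32:08x}  U16 0x{upper16:08x}")
```

Every bit in which the two addresses differ becomes a cleared ("don't care") bit in the filter, so the more dissimilar the addresses are, the more foreign addresses the hardware would accept and acknowledge.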

We resolved the issue by patching the pre-cal data with the same address as the mesh interface, so that vif->addr == arvif->mac_addr.  This is more a workaround than a real fix, because a misconfigured MAC-ADDRESS-FILTER can easily go unnoticed.  In fact, what unblocked us on this issue was switching to Candela Tech's custom firmware and driver (ath10k-ct), which provide a nice interface to the hardware registers:

cat /debug/ieee80211/wiphy1/ath10k/fw_regs

        ath10k Target Register Dump
        =================
        MAC-FILTER-ADDR-L32 0xffffffff
        MAC-FILTER-ADDR-U16 0x0000ffff

We would probably still be trying to solve this if Ben Greear had not pointed us in the right direction.

Cheers,

Javier
 


* Re: Frames acknowledged and silently discarded in firmware
  2018-03-13 23:42 ` Javier Cardona
@ 2018-03-14  0:09   ` Ben Greear
  2018-03-14  0:20     ` Adrian Chadd
  2018-03-14  0:52   ` Thomas Pedersen
  1 sibling, 1 reply; 12+ messages in thread
From: Ben Greear @ 2018-03-14  0:09 UTC (permalink / raw)
  To: Javier Cardona, ath10k

On 03/13/2018 04:42 PM, Javier Cardona wrote:
> We would probably be still trying to solve this if Ben Greear did not point us in the right direction.

I'm glad I could help.

Can anyone think of any reason why a monitor vdev's MAC address should be
considered when calculating the filter?  I cannot imagine it should ever ACK
frames, so I think I will remove it from consideration when calculating
the filter in my ath10k-ct firmware unless someone suggests otherwise...

Thanks,
Ben



-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



* Re: Frames acknowledged and silently discarded in firmware
  2018-03-14  0:09   ` Ben Greear
@ 2018-03-14  0:20     ` Adrian Chadd
  2018-03-14  2:36       ` Javier Cardona
  0 siblings, 1 reply; 12+ messages in thread
From: Adrian Chadd @ 2018-03-14  0:20 UTC (permalink / raw)
  To: Ben Greear; +Cc: Javier Cardona, ath10k

Hi,

What were the mac addresses? Why were they so dissimilar?



-adrian


* Re: Frames acknowledged and silently discarded in firmware
  2018-03-13 23:42 ` Javier Cardona
  2018-03-14  0:09   ` Ben Greear
@ 2018-03-14  0:52   ` Thomas Pedersen
  2018-03-14  2:33     ` Javier Cardona
  2018-03-14 16:47     ` Ben Greear
  1 sibling, 2 replies; 12+ messages in thread
From: Thomas Pedersen @ 2018-03-14  0:52 UTC (permalink / raw)
  To: Javier Cardona; +Cc: ath10k

Javier,

On Tue, Mar 13, 2018 at 4:42 PM, Javier Cardona <jcardona@fb.com> wrote:
> These anomalous Block Acks were sent because the MAC-ADDRESS-FILTER was misconfigured at MAP3, which caused that station to respond to addresses different than its own.  The reasons for this misconfiguration were:
>   (1) in mesh (and other) mode(s), the driver creates a hidden monitor vif along the mesh vif

At least since some 10.4 firmware, the hidden monitor vdev is no longer
required. See https://www.spinics.net/lists/linux-wireless/msg156475.html





-- 
thomas


* Re: Frames acknowledged and silently discarded in firmware
  2018-03-14  0:52   ` Thomas Pedersen
@ 2018-03-14  2:33     ` Javier Cardona
  2018-03-14 16:47     ` Ben Greear
  1 sibling, 0 replies; 12+ messages in thread
From: Javier Cardona @ 2018-03-14  2:33 UTC (permalink / raw)
  To: Thomas Pedersen; +Cc: ath10k

Thomas,

On 3/13/18, 5:53 PM, "Thomas Pedersen" <thomas@eero.com> wrote:

    At least since some 10.4 firmware, the hidden monitor vdev is no longer
    required. See https://www.spinics.net/lists/linux-wireless/msg156475.html
    
Oh yes, you are right.  That monitor vif is created only when using ath10k-ct firmware.  The original stock firmware does not create the additional vif.
However, the original issue (MAC-ADDRESS-FILTER getting corrupted, which triggers anomalous Block Acks) still occurs with the stock driver and 10.4 firmware.
It seems that the firmware still uses the original MAC address provided by the driver on load, even if the mesh interface uses a different address.

If the driver uses the bogus address in the pre-cal file (12:34:56:78:90:12), the MAC filter registers are misconfigured:

MAC FILTERS ==
phy0:0x00032018:0xb93efa8d
phy0:0x0003201c:0x00009bec

Once the pre-cal file is patched to match the address that will be assigned to the mesh vif, the MAC filter is correct:

MAC FILTERS ==
phy0:0x00032018:0xffffffff
phy0:0x0003201c:0x0000ffff
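
For anyone decoding these values: both dumps are consistent with the filter being a bit mask computed as ~(vif->addr XOR arvif->mac_addr), with the first address octet stored in the least significant byte. This reading is an inference from the register values and from the XOR step described in the original report, not something confirmed by firmware documentation. Under that assumption the misconfigured dump corresponds to a mesh vif address of 60:31:97:3e:83:76, which is derived here rather than stated anywhere in this thread. A minimal Python sketch (the helper names are made up for illustration):

```python
def mac_to_int(mac):
    """Pack a MAC string into a 48-bit integer, first octet in the low byte.

    The little-endian layout is an assumption inferred from the register dumps.
    """
    return int.from_bytes(bytes.fromhex(mac.replace(":", "")), "little")


def filter_mask(vif_addr, default_addr):
    """Return (L32, U16): a mask with a 1 wherever the two addresses agree."""
    mask = ~(mac_to_int(vif_addr) ^ mac_to_int(default_addr)) & 0xFFFFFFFFFFFF
    return mask & 0xFFFFFFFF, mask >> 32


def show(vif_addr, default_addr):
    """Format the mask as two zero-padded hex words, low word first."""
    l32, u16 = filter_mask(vif_addr, default_addr)
    return f"0x{l32:08x} 0x{u16:08x}"


PRECAL_ADDR = "12:34:56:78:90:12"   # bogus pre-cal address from this thread
MESH_ADDR = "60:31:97:3e:83:76"     # derived under the assumptions above

print(show(MESH_ADDR, PRECAL_ADDR))  # 0xb93efa8d 0x00009bec
print(show(MESH_ADDR, MESH_ADDR))    # 0xffffffff 0x0000ffff
```

Identical addresses give the all-ones mask of the second dump, which would explain why patching the pre-cal address to match the mesh vif restores a correct filter.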

Thanks!

Javier

    >   (2) this second monitor vif is assigned the default mac address reported by the firmware (arvif->mac_addr)
    >   (3) this mac address (which the driver was getting from the pre-cal files) is XOR'd with the mesh vif address to configure MAC-ADDRESS-FILTER
    >   (4) once that happens, the hardware will ack all addresses that pass the MAC-ADDRESS-FILTER.  If the two mac addresses (vif->addr and arvif->mac_addr) are very dissimilar, that will result in a storm of invalid Block Acks
    >
    > We resolved the issue by patching the pre-cal data with the same address as the mesh interface, so that vif->addr == arvif->mac_addr.  This is more a workaround than a real fix, because a misconfiguration of this MAC-ADDRESS-FILTER can easily go unnoticed.  In fact, what unblocked us on this issue was switching to Candela Tech's custom firmware and driver (ath10k-ct).  This provides a nice interface to the hardware registers:
    >
    > cat /debug/ieee80211/wiphy1/ath10k/fw_regs
    >
    >         ath10k Target Register Dump
    >         =================
    >         MAC-FILTER-ADDR-L32 0xffffffff
    >         MAC-FILTER-ADDR-U16 0x0000ffff
    >
    > We would probably still be trying to solve this if Ben Greear had not pointed us in the right direction.
    >
    > Cheers,
    >
    > Javier
    >
    >
    > On 2/7/18, 2:15 PM, "Javier Cardona" <jcardona@fb.com> wrote:
    >
    >     Hi,
    >
    >     We have observed a problem where, under certain conditions, the ath10k firmware will acknowledge frames but not send them up to the driver.
    >     Frames are sent by a mesh access point (MAP1) to a second mesh AP (MAP2) at MCS 9/NSS-3, which at that distance is probably marginal.  Since frames get acknowledged by MAP2, MAP1 will not try a lower rate.  But the driver at MAP2 does not receive the frames.
    >
    >     We have captures of this exchange for both the unsuccessful as well as the successful case, which happens when we move MAP2 closer to MAP1.   They can be found here:
    >     https://www.dropbox.com/sh/0or86c8vxotygdc/AABI7RmQ2nztcOBF3UDXbPUma
    >
    >     In both scenarios, the frames from MAP1 to MAP2 are acknowledged, as observed in sniffer captures.
    >
    >     In the successful scenario, the driver logs show the frames being received by the driver.  Sequence numbers in the debug logs match those in the sniffer captures.
    >
    >     root@nbg-3ed9da:~# journalctl -kf | grep "peer 60:31:97:3e:82:e6" | grep ucast | grep 'len 374'
    >     Feb 05 05:43:31 nbg-3ed9da kernel: ath10k_pci 0000:01:00.0: rx skb d734f6c0 len 374 peer 60:31:97:3e:82:e6 tid 0 (BE) ucast sn 1027 vht sgi rate_idx 9 vht_nss 3 freq 5825 band 1 flag 0x600800 fcs-err 0 mic-err 0 amsdu-more 0
    >     Feb 05 05:43:42 nbg-3ed9da kernel: ath10k_pci 0000:01:00.0: rx skb d6aaa480 len 374 peer 60:31:97:3e:82:e6 tid 0 (BE) ucast sn 1031 vht sgi rate_idx 9 vht_nss 3 freq 5825 band 1 flag 0x600800 fcs-err 0 mic-err 0 amsdu-more 0
    >     Feb 05 05:43:53 nbg-3ed9da kernel: ath10k_pci 0000:01:00.0: rx skb d53a8d80 len 374 peer 60:31:97:3e:82:e6 tid 0 (BE) ucast sn 1037 vht sgi rate_idx 9 vht_nss 3 freq 5825 band 1 flag 0x600800 fcs-err 0 mic-err 0 amsdu-more 0
    >
    >     In the failure scenario, the driver logs show no frames, even if the capture shows that the frames are acknowledged.
    >
    >     root@nbg-3ed9da:~# journalctl -kf | grep "peer 60:31:97:3e:82:e6" | grep ucast | grep 'len 374'
    >     <nothing here>
    >
    >     If we force MAP1 to use a single stream, the frames are received successfully.
    >
    >       # iw mesh0 set bitrates legacy-5 ht-mcs-5 vht-mcs-5 1:0-9
    >
    >     It seems as if the firmware is acknowledging but silently discarding frames… is that possible?
    >     Can anyone provide some pointers on how to troubleshoot this?
    >
    >     We are using this firmware: https://github.com/kvalo/ath10k-firmware/blob/master/QCA9984/hw1.0/3.4/firmware-5.bin_10.4-3.4-00104 and kernel 4.9.31 with a few cherry-picked patches from the ath10k branch.
    >     The hardware is QCA9984.
    >
    >     Best,
    >
    >     Javier
    >
    >
    >
    >
    >
    >
    > _______________________________________________
    > ath10k mailing list
    > ath10k@lists.infradead.org
    > http://lists.infradead.org/mailman/listinfo/ath10k
    
    
    
    -- 
    thomas
    

_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Frames acknowledged and silently discarded in firmware
  2018-03-14  0:20     ` Adrian Chadd
@ 2018-03-14  2:36       ` Javier Cardona
  0 siblings, 0 replies; 12+ messages in thread
From: Javier Cardona @ 2018-03-14  2:36 UTC (permalink / raw)
  To: Adrian Chadd, Ben Greear; +Cc: ath10k

Hi Adrian,

On 3/13/18, 5:21 PM, "Adrian Chadd" <adrian@freebsd.org> wrote:

    Hi,
    
    What were the mac addresses? Why were they so dissimilar?
    
The addresses in the pre-cal files were 12:34:56:78:90:12.
We changed them to manufacturer-assigned MACs.
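
To put a number on "dissimilar": if the filter ignores every bit in which the two addresses differ (an assumption based on the XOR description earlier in the thread), then n differing bits let 2^n destination addresses through. The sketch below counts them for the pre-cal address and an illustrative mesh address. The value 60:31:97:3e:83:76 is not taken from the thread, and the function names are not from ath10k.

```python
def differing_bits(addr_a, addr_b):
    """Count the bit positions in which two MAC addresses differ."""
    a = int(addr_a.replace(":", ""), 16)
    b = int(addr_b.replace(":", ""), 16)
    return bin(a ^ b).count("1")


def addresses_matched(addr_a, addr_b):
    """Number of 48-bit addresses that agree with both inputs on every shared bit.

    Assumes each differing bit is a "don't care" in the hardware filter.
    """
    return 2 ** differing_bits(addr_a, addr_b)


PRECAL_ADDR = "12:34:56:78:90:12"   # placeholder address from the pre-cal file
MESH_ADDR = "60:31:97:3e:83:76"     # illustrative value, not from the thread

print(differing_bits(PRECAL_ADDR, MESH_ADDR))     # 18
print(addresses_matched(PRECAL_ADDR, MESH_ADDR))  # 262144
print(addresses_matched(MESH_ADDR, MESH_ADDR))    # 1
```

With a placeholder like 12:34:56:78:90:12 the two addresses have essentially nothing in common, so the set of addresses that pass grows exponentially with each mismatched bit.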

Thanks!

Javier

    
    -adrian
    

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Frames acknowledged and silently discarded in firmware
  2018-03-14  0:52   ` Thomas Pedersen
  2018-03-14  2:33     ` Javier Cardona
@ 2018-03-14 16:47     ` Ben Greear
  2018-03-14 17:00       ` Thomas Pedersen
  1 sibling, 1 reply; 12+ messages in thread
From: Ben Greear @ 2018-03-14 16:47 UTC (permalink / raw)
  To: Thomas Pedersen, Javier Cardona; +Cc: ath10k

On 03/13/2018 05:52 PM, Thomas Pedersen wrote:
> Javier,
>
> On Tue, Mar 13, 2018 at 4:42 PM, Javier Cardona <jcardona@fb.com> wrote:
>> Hi,
>>
>> We have resolved this issue.  I'm sharing the details in case that might help others.
>>
>> The description of the problem was accurate EXCEPT that the acks that we observed on the sniffer were not being sent by the failed station (MAP2, in the context of my original e-mail) but by a third station MAP3.  Those Acks were sent by MAP3 but with MAP2's address in the Transmitter Address field.
>>
>> These anomalous Block Acks were sent because the MAC-ADDRESS-FILTER was misconfigured at MAP3, which caused that station to respond to addresses different than its own.  The reasons for this misconfiguration were:
>>   (1) in mesh (and other) mode(s), the driver creates a hidden monitor vif along the mesh vif
>
> At least since some 10.4 firmware, the hidden monitor vdev is no longer
> required. See https://www.spinics.net/lists/linux-wireless/msg156475.html

I'd like to fix my firmware to support this same feature (in case it does not currently
support it) and add the pertinent firmware feature flags.

Can anyone share the filter flag name(s) that needs to be enabled in the firmware for mesh
mode so that monitor devices are not required?

Thanks,
Ben


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Frames acknowledged and silently discarded in firmware
  2018-03-14 16:47     ` Ben Greear
@ 2018-03-14 17:00       ` Thomas Pedersen
  0 siblings, 0 replies; 12+ messages in thread
From: Thomas Pedersen @ 2018-03-14 17:00 UTC (permalink / raw)
  To: Ben Greear; +Cc: Javier Cardona, ath10k

On Wed, Mar 14, 2018 at 9:47 AM, Ben Greear <greearb@candelatech.com> wrote:
> On 03/13/2018 05:52 PM, Thomas Pedersen wrote:
>>
>> Javier,
>>
>> On Tue, Mar 13, 2018 at 4:42 PM, Javier Cardona <jcardona@fb.com> wrote:
>>>
>>> Hi,
>>>
>>> We have resolved this issue.  I'm sharing the details in case that might
>>> help others.
>>>
>>> The description of the problem was accurate EXCEPT that the acks that we
>>> observed on the sniffer were not being sent by the failed station (MAP2, in
>>> the context of my original e-mail) but by a third station MAP3.  Those Acks
>>> were sent by MAP3 but with MAP2's address in the Transmitter Address field.
>>>
>>> These anomalous Block Acks were sent because the MAC-ADDRESS-FILTER was
>>> misconfigured at MAP3, which caused that station to respond to addresses
>>> different than its own.  The reasons for this misconfiguration were:
>>>   (1) in mesh (and other) mode(s), the driver creates a hidden monitor
>>> vif along the mesh vif
>>
>>
>> At least since some 10.4 firmware, the hidden monitor vdev is no longer
>> required. See https://www.spinics.net/lists/linux-wireless/msg156475.html
>
>
> I'd like to fix my firmware to support this same feature (in case it does
> not currently
> support it) and add the pertinent firmware feature flags.
>
> Can anyone share the filter flag name(s) that needs to be enabled in the
> firmware for mesh
> mode so that monitor devices are not required?

It should be FIF_ALLMULTI | FIF_OTHER_BSS

-- 
thomas

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2018-03-14 17:00 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-07 22:15 Frames acknowledged and silently discarded in firmware Javier Cardona
2018-02-08 23:23 ` Adrian Chadd
2018-02-09  1:21   ` Javier Cardona
2018-02-09  1:27     ` Adrian Chadd
2018-03-13 23:42 ` Javier Cardona
2018-03-14  0:09   ` Ben Greear
2018-03-14  0:20     ` Adrian Chadd
2018-03-14  2:36       ` Javier Cardona
2018-03-14  0:52   ` Thomas Pedersen
2018-03-14  2:33     ` Javier Cardona
2018-03-14 16:47     ` Ben Greear
2018-03-14 17:00       ` Thomas Pedersen
