regressions.lists.linux.dev archive mirror
* Re: [regression] STP on 80211s is broken in 6.4-rc4
       [not found] <CT5GNZSK28AI.2K6M69OXM9RW5@syracuse>
@ 2023-06-10  6:44 ` Bagas Sanjaya
  2023-06-15 12:54   ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 1 reply; 10+ messages in thread
From: Bagas Sanjaya @ 2023-06-10  6:44 UTC (permalink / raw)
  To: Nicolas Escande, nbd, Toke Høiland-Jørgensen,
	Kalle Valo, Johannes Berg
  Cc: linux-wireless, Linux Regressions


On Tue, Jun 06, 2023 at 12:55:57PM +0200, Nicolas Escande wrote:
> Hello Felix,
> 
> As user of the mesh part of mac80211 on multiple products at work let me say
> thank you for all the work you do on wifi, especially on 80211s, and especially
> the recent improvements you made for mesh fast RX/TX & cross vendor AMSDU compat
> 
> We upgraded our kernel from an older (5.15) to a newer 6.4. The problem is STP 
> doesn't work anymore and alas we use it for now (for the better or worse).
> 
> What I gathered so far from my setup:
>  - we use ath9k & ath10k
>  - in my case STP frames are received as regular packet and not as amsdu
>  - the received packets have a wrong length of 44 in tcpdump
>    (instead of 38 with our previous kernel)
>  - llc_fixup_skb() tries to pull some 41 bytes out of a 35 bytes packet
>    this makes llc_rcv() discard the frames & breaks STP
> 
> From bisecting the culprit seems to be 986e43b19ae9176093da35e0a844e65c8bf9ede7
> (wifi: mac80211: fix receiving A-MSDU frames on mesh interfaces)
> 
> I guess that your changes to handle both ampdu subframes & normal frames in the
> same datapath ends up putting a wrong skb->len for STP (multicast) frames ?
> Honestly I don't understand enough of the 80211 internals & spec to pinpoint the
> exact problem.
> 
> It seems this change was already in the 6.3 kernel so I guess someone should
> have seen it before (but I didn't find anything..) ? Maybe I missed something...
> 
> Anyway I'm happy to provide more info or try anything you throw at me.
> 
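
The numbers in the report above fit together if the 802.3 length field of the
frame claims more data than is actually present. As a rough illustration only
(a simplified userspace model, not the kernel's llc_fixup_skb(); the name
llc_length_ok and the constant are made up for this sketch, and it assumes the
38 bytes seen with the old kernel are a 3-byte LLC header plus a 35-byte BPDU):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed LLC header size for STP: DSAP + SSAP + 1-byte U-format control. */
enum { LLC_U_HDR_LEN = 3 };

/*
 * Hypothetical model of the length check, not kernel code.
 * frame_len: bytes present after the 14-byte Ethernet header
 * len_field: 802.3 length field in host byte order
 * Returns true when the payload size claimed by the length field
 * fits into the bytes that remain after the LLC header.
 */
static bool llc_length_ok(int frame_len, uint16_t len_field)
{
	assert(frame_len >= 0);

	int remaining = frame_len - LLC_U_HDR_LEN;      /* bytes left after LLC */
	int data_size = (int)len_field - LLC_U_HDR_LEN; /* bytes the field claims */

	return remaining >= 0 && data_size >= 0 && data_size <= remaining;
}
```

With a length field of 44 and 38 bytes present, the check wants 41 bytes but
only 35 remain, which matches the figures in the report; with a length field of
38 it passes.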

Thanks for the regression report. I'm adding it to regzbot:

(Felix: it looks like this regression was introduced by a commit authored by you.
Would you like to take a look at it?)

#regzbot ^introduced: 986e43b19ae917

-- 
An old man doll... just what I always wanted! - Clara



* Re: [regression] STP on 80211s is broken in 6.4-rc4
  2023-06-10  6:44 ` [regression] STP on 80211s is broken in 6.4-rc4 Bagas Sanjaya
@ 2023-06-15 12:54   ` Linux regression tracking (Thorsten Leemhuis)
  2023-06-16  7:45     ` Nicolas Escande
  0 siblings, 1 reply; 10+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-06-15 12:54 UTC (permalink / raw)
  To: Bagas Sanjaya, Nicolas Escande, nbd,
	Toke Høiland-Jørgensen, Kalle Valo, Johannes Berg
  Cc: linux-wireless, Linux Regressions

On 10.06.23 08:44, Bagas Sanjaya wrote:
> On Tue, Jun 06, 2023 at 12:55:57PM +0200, Nicolas Escande wrote:
>> Hello Felix,
>>
>> As user of the mesh part of mac80211 on multiple products at work let me say
>> thank you for all the work you do on wifi, especially on 80211s, and especially
>> the recent improvements you made for mesh fast RX/TX & cross vendor AMSDU compat
>>
>> We upgraded our kernel from an older (5.15) to a newer 6.4. The problem is STP 
>> doesn't work anymore and alas we use it for now (for the better or worse).
>>
>> What I gathered so far from my setup:
>>  - we use ath9k & ath10k
>>  - in my case STP frames are received as regular packet and not as amsdu
>>  - the received packets have a wrong length of 44 in tcpdump
>>    (instead of 38 with our previous kernel)
>>  - llc_fixup_skb() tries to pull some 41 bytes out of a 35 bytes packet
>>    this makes llc_rcv() discard the frames & breaks STP
>>
>> From bisecting the culprit seems to be 986e43b19ae9176093da35e0a844e65c8bf9ede7
>> (wifi: mac80211: fix receiving A-MSDU frames on mesh interfaces)
>>
>> I guess that your changes to handle both ampdu subframes & normal frames in the
>> same datapath ends up putting a wrong skb->len for STP (multicast) frames ?
>> Honestly I don't understand enough of the 80211 internals & spec to pinpoint the
>> exact problem.
>>
>> It seems this change was already in the 6.3 kernel so I guess someone should
>> have seen it before (but I didn't find anything..) ? Maybe I missed something...
>>
>> Anyway I'm happy to provide more info or try anything you throw at me.
>>
> 
> Thanks for the regression report. I'm adding it to regzbot:
> 
> (Felix: it looks like this regression was introduced by a commit authored by you.
> Would you like to take a look at it?)
> 
> #regzbot ^introduced: 986e43b19ae917

Hmmm, Felix did not reply. But let's ignore that for now.

Nicolas, I noticed there are a few patches in next that refer to the
culprit. Might be worth giving this series a try:

https://lore.kernel.org/all/20230314095956.62085-1-nbd@nbd.name/

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.


* Re: [regression] STP on 80211s is broken in 6.4-rc4
  2023-06-15 12:54   ` Linux regression tracking (Thorsten Leemhuis)
@ 2023-06-16  7:45     ` Nicolas Escande
  2023-06-16  9:25       ` Linux regression tracking (Thorsten Leemhuis)
  2023-07-10 11:32       ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 2 replies; 10+ messages in thread
From: Nicolas Escande @ 2023-06-16  7:45 UTC (permalink / raw)
  To: Linux regressions mailing list, Bagas Sanjaya, nbd,
	Toke Høiland-Jørgensen, Kalle Valo, Johannes Berg
  Cc: linux-wireless

On Thu Jun 15, 2023 at 2:54 PM CEST, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 10.06.23 08:44, Bagas Sanjaya wrote:
> > On Tue, Jun 06, 2023 at 12:55:57PM +0200, Nicolas Escande wrote:
> >> Hello Felix,
> >>
> >> As user of the mesh part of mac80211 on multiple products at work let me say
> >> thank you for all the work you do on wifi, especially on 80211s, and especially
> >> the recent improvements you made for mesh fast RX/TX & cross vendor AMSDU compat
> >>
> >> We upgraded our kernel from an older (5.15) to a newer 6.4. The problem is STP 
> >> doesn't work anymore and alas we use it for now (for the better or worse).
> >>
> >> What I gathered so far from my setup:
> >>  - we use ath9k & ath10k
> >>  - in my case STP frames are received as regular packet and not as amsdu
> >>  - the received packets have a wrong length of 44 in tcpdump
> >>    (instead of 38 with our previous kernel)
> >>  - llc_fixup_skb() tries to pull some 41 bytes out of a 35 bytes packet
> >>    this makes llc_rcv() discard the frames & breaks STP
> >>
> >> From bisecting the culprit seems to be 986e43b19ae9176093da35e0a844e65c8bf9ede7
> >> (wifi: mac80211: fix receiving A-MSDU frames on mesh interfaces)
> >>
> >> I guess that your changes to handle both ampdu subframes & normal frames in the
> >> same datapath ends up putting a wrong skb->len for STP (multicast) frames ?
> >> Honestly I don't understand enough of the 80211 internals & spec to pinpoint the
> >> exact problem.
> >>
> >> It seems this change was already in the 6.3 kernel so I guess someone should
> >> have seen it before (but I didn't find anything..) ? Maybe I missed something...
> >>
> >> Anyway I'm happy to provide more info or try anything you throw at me.
> >>
> > 
> > Thanks for the regression report. I'm adding it to regzbot:
> > 
> > (Felix: it looks like this regression was introduced by a commit authored by you.
> > Would you like to take a look at it?)
> > 
> > #regzbot ^introduced: 986e43b19ae917
>
> Hmmm, Felix did not reply. But let's ignore that for now.

I haven't seen mails from Felix on the list for a few days, I'm guessing he's
unavailable for now but I'll happily wait.

>
> Nicolas, I noticed there are a few patches in next that refer to the
> culprit. Might be worth giving this series a try:
>
> https://lore.kernel.org/all/20230314095956.62085-1-nbd@nbd.name/

Well this series already landed in 6.4 and that is the version I did my initial
testing with. So no luck there.

>
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.


* Re: [regression] STP on 80211s is broken in 6.4-rc4
  2023-06-16  7:45     ` Nicolas Escande
@ 2023-06-16  9:25       ` Linux regression tracking (Thorsten Leemhuis)
  2023-06-16 12:17         ` Bagas Sanjaya
  2023-07-10 11:32       ` Linux regression tracking (Thorsten Leemhuis)
  1 sibling, 1 reply; 10+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-06-16  9:25 UTC (permalink / raw)
  To: Nicolas Escande, Linux regressions mailing list, Bagas Sanjaya,
	nbd, Toke Høiland-Jørgensen, Kalle Valo, Johannes Berg
  Cc: linux-wireless

On 16.06.23 09:45, Nicolas Escande wrote:
> On Thu Jun 15, 2023 at 2:54 PM CEST, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 10.06.23 08:44, Bagas Sanjaya wrote:
>>> On Tue, Jun 06, 2023 at 12:55:57PM +0200, Nicolas Escande wrote:

>> Hmmm, Felix did not reply. But let's ignore that for now.
> I haven't seen mails from Felix on the list for a few days, I'm guessing he's
> unavailable for now but I'll happily wait.

Okay.

>> Nicolas, I noticed there are a few patches in next that refer to the
>> culprit. Might be worth giving this series a try:
>> https://lore.kernel.org/all/20230314095956.62085-1-nbd@nbd.name/
> Well this series already landed in 6.4 and that is the version I did my initial
> testing with. So no luck there.

What? Ohh, sorry for the noise, I had missed that they were in mainline
already.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.


* Re: [regression] STP on 80211s is broken in 6.4-rc4
  2023-06-16  9:25       ` Linux regression tracking (Thorsten Leemhuis)
@ 2023-06-16 12:17         ` Bagas Sanjaya
  2023-06-16 12:33           ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 1 reply; 10+ messages in thread
From: Bagas Sanjaya @ 2023-06-16 12:17 UTC (permalink / raw)
  To: Linux regressions mailing list, Nicolas Escande, nbd,
	Toke Høiland-Jørgensen, Kalle Valo, Johannes Berg
  Cc: linux-wireless

On 6/16/23 16:25, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 16.06.23 09:45, Nicolas Escande wrote:
>> On Thu Jun 15, 2023 at 2:54 PM CEST, Linux regression tracking (Thorsten Leemhuis) wrote:
>>> On 10.06.23 08:44, Bagas Sanjaya wrote:
>>>> On Tue, Jun 06, 2023 at 12:55:57PM +0200, Nicolas Escande wrote:
> 
>>> Hmmm, Felix did not reply. But let's ignore that for now.
>> I haven't seen mails from Felix on the list for a few days, I'm guessing he's
>> unavailable for now but I'll happily wait.
> 
> Okay.
> 
>>> Nicolas, I noticed there are a few patches in next that refer to the
>>> culprit. Might be worth giving this series a try:
>>> https://lore.kernel.org/all/20230314095956.62085-1-nbd@nbd.name/
>> Well this series already landed in 6.4 and that is the version I did my initial
>> testing with. So no luck there.
> 
> What? Ohh, sorry for the noise, I had missed that they were in mainline
> already.
> 

Hi Thorsten,

Should this be removed from tracking as inconclusive?

-- 
An old man doll... just what I always wanted! - Clara



* Re: [regression] STP on 80211s is broken in 6.4-rc4
  2023-06-16 12:17         ` Bagas Sanjaya
@ 2023-06-16 12:33           ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 0 replies; 10+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-06-16 12:33 UTC (permalink / raw)
  To: Bagas Sanjaya, Linux regressions mailing list, Nicolas Escande,
	nbd, Toke Høiland-Jørgensen, Kalle Valo, Johannes Berg
  Cc: linux-wireless

On 16.06.23 14:17, Bagas Sanjaya wrote:
> On 6/16/23 16:25, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 16.06.23 09:45, Nicolas Escande wrote:
>>> On Thu Jun 15, 2023 at 2:54 PM CEST, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>> On 10.06.23 08:44, Bagas Sanjaya wrote:
>>>>> On Tue, Jun 06, 2023 at 12:55:57PM +0200, Nicolas Escande wrote:
>>
>>>> Hmmm, Felix did not reply. But let's ignore that for now.
>>> I haven't seen mails from Felix on the list for a few days, I'm guessing he's
>>> unavailable for now but I'll happily wait.
>>
>> Okay.
>>
>>>> Nicolas, I noticed there are a few patches in next that refer to the
>>>> culprit. Might be worth giving this series a try:
>>>> https://lore.kernel.org/all/20230314095956.62085-1-nbd@nbd.name/
>>> Well this series already landed in 6.4 and that is the version I did my initial
>>> testing with. So no luck there.
>>
>> What? Ohh, sorry for the noise, I had missed that they were in mainline
>> already.
> 
> Should this be removed from tracking as inconclusive?

Ehh, why? Afaics this is still a regression, just not one the reporter
considers urgent; that is fine for me, unless more people start to
report the problem.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.



* Re: [regression] STP on 80211s is broken in 6.4-rc4
  2023-06-16  7:45     ` Nicolas Escande
  2023-06-16  9:25       ` Linux regression tracking (Thorsten Leemhuis)
@ 2023-07-10 11:32       ` Linux regression tracking (Thorsten Leemhuis)
  2023-07-10 16:50         ` Nicolas Escande
  1 sibling, 1 reply; 10+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-07-10 11:32 UTC (permalink / raw)
  To: Nicolas Escande, Linux regressions mailing list, Bagas Sanjaya,
	nbd, Toke Høiland-Jørgensen, Kalle Valo, Johannes Berg
  Cc: linux-wireless

On 16.06.23 09:45, Nicolas Escande wrote:
> On Thu Jun 15, 2023 at 2:54 PM CEST, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 10.06.23 08:44, Bagas Sanjaya wrote:
>>> On Tue, Jun 06, 2023 at 12:55:57PM +0200, Nicolas Escande wrote:
>>>>
>>>> As user of the mesh part of mac80211 on multiple products at work let me say
>>>> thank you for all the work you do on wifi, especially on 80211s, and especially
>>>> the recent improvements you made for mesh fast RX/TX & cross vendor AMSDU compat
>>>>
>>>> We upgraded our kernel from an older (5.15) to a newer 6.4. The problem is STP 
>>>> doesn't work anymore and alas we use it for now (for the better or worse).
>>>>
>>>> What I gathered so far from my setup:
>>>>  - we use ath9k & ath10k
>>>>  - in my case STP frames are received as regular packet and not as amsdu
>>>>  - the received packets have a wrong length of 44 in tcpdump
>>>>    (instead of 38 with our previous kernel)
>>>>  - llc_fixup_skb() tries to pull some 41 bytes out of a 35 bytes packet
>>>>    this makes llc_rcv() discard the frames & breaks STP
>>>>
>>>> From bisecting the culprit seems to be 986e43b19ae9176093da35e0a844e65c8bf9ede7
>>>> (wifi: mac80211: fix receiving A-MSDU frames on mesh interfaces)
>>>>
>>>> I guess that your changes to handle both ampdu subframes & normal frames in the
>>>> same datapath ends up putting a wrong skb->len for STP (multicast) frames ?
>>>> Honestly I don't understand enough of the 80211 internals & spec to pinpoint the
>>>> exact problem.
>>>>
>>>> It seems this change was already in the 6.3 kernel so I guess someone should
>>>> have seen it before (but I didn't find anything..) ? Maybe I missed something...
>>>>
>>>> Anyway I'm happy to provide more info or try anything you throw at me.
>> [...]
>> Hmmm, Felix did not reply. But let's ignore that for now.
> 
> I haven't seen mails from Felix on the list for a few days, I'm guessing he's
> unavailable for now but I'll happily wait.

Still no progress. Hmmm. Are you still okay with that? I've seen no
other reports about this, so waiting is somewhat (albeit not completely)
fine for me if it is for you.

But in any case it might be good if you could recheck 6.5-rc1.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.


* Re: [regression] STP on 80211s is broken in 6.4-rc4
  2023-07-10 11:32       ` Linux regression tracking (Thorsten Leemhuis)
@ 2023-07-10 16:50         ` Nicolas Escande
  2023-07-11 11:12           ` Felix Fietkau
  0 siblings, 1 reply; 10+ messages in thread
From: Nicolas Escande @ 2023-07-10 16:50 UTC (permalink / raw)
  To: Linux regressions mailing list, Bagas Sanjaya, nbd,
	Toke Høiland-Jørgensen, Kalle Valo, Johannes Berg
  Cc: linux-wireless

On Mon Jul 10, 2023 at 1:32 PM CEST, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 16.06.23 09:45, Nicolas Escande wrote:
> > On Thu Jun 15, 2023 at 2:54 PM CEST, Linux regression tracking (Thorsten Leemhuis) wrote:
> >> On 10.06.23 08:44, Bagas Sanjaya wrote:
> >>> On Tue, Jun 06, 2023 at 12:55:57PM +0200, Nicolas Escande wrote:
> >>>>
> >>>> As user of the mesh part of mac80211 on multiple products at work let me say
> >>>> thank you for all the work you do on wifi, especially on 80211s, and especially
> >>>> the recent improvements you made for mesh fast RX/TX & cross vendor AMSDU compat
> >>>>
> >>>> We upgraded our kernel from an older (5.15) to a newer 6.4. The problem is STP 
> >>>> doesn't work anymore and alas we use it for now (for the better or worse).
> >>>>
> >>>> What I gathered so far from my setup:
> >>>>  - we use ath9k & ath10k
> >>>>  - in my case STP frames are received as regular packet and not as amsdu
> >>>>  - the received packets have a wrong length of 44 in tcpdump
> >>>>    (instead of 38 with our previous kernel)
> >>>>  - llc_fixup_skb() tries to pull some 41 bytes out of a 35 bytes packet
> >>>>    this makes llc_rcv() discard the frames & breaks STP
> >>>>
> >>>> From bisecting the culprit seems to be 986e43b19ae9176093da35e0a844e65c8bf9ede7
> >>>> (wifi: mac80211: fix receiving A-MSDU frames on mesh interfaces)
> >>>>
> >>>> I guess that your changes to handle both ampdu subframes & normal frames in the
> >>>> same datapath ends up putting a wrong skb->len for STP (multicast) frames ?
> >>>> Honestly I don't understand enough of the 80211 internals & spec to pinpoint the
> >>>> exact problem.
> >>>>
> >>>> It seems this change was already in the 6.3 kernel so I guess someone should
> >>>> have seen it before (but I didn't find anything..) ? Maybe I missed something...
> >>>>
> >>>> Anyway I'm happy to provide more info or try anything you throw at me.
> >> [...]
> >> Hmmm, Felix did not reply. But let's ignore that for now.
> > 
> > I haven't seen mails from Felix on the list for a few days, I'm guessing he's
> > unavailable for now but I'll happily wait.
>
> Still no progress. Hmmm. Are you still okay with that? I've seen no
> other reports about this, so waiting is somewhat (albeit not completely)
> fine for me if it is for you.
I'm not so surprised no one else reported it, using STP on wifi (and 802.11s) is
not a really common thing to do, to be honest (and STP on wifi is unreliable).
Even though some openwrt guys do it for sure, I'm guessing their kernel version
is lagging behind...
>
> But in any case it might be good if you could recheck 6.5-rc1.
Testing on 6.5 as a whole won't be as easy for me as testing a single patch on
top of 6.4. I'll do my best to try but from what I saw nothing got merged that
would even remotely help me on this issue. 

I am not losing hope that Felix or someone who understands this stuff better
will find the time to look into this. I'm guessing it's the summer vacation effect.

>
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.


* Re: [regression] STP on 80211s is broken in 6.4-rc4
  2023-07-10 16:50         ` Nicolas Escande
@ 2023-07-11 11:12           ` Felix Fietkau
  2023-07-11 12:15             ` Nicolas Escande
  0 siblings, 1 reply; 10+ messages in thread
From: Felix Fietkau @ 2023-07-11 11:12 UTC (permalink / raw)
  To: Nicolas Escande, Linux regressions mailing list, Bagas Sanjaya,
	Toke Høiland-Jørgensen, Kalle Valo, Johannes Berg
  Cc: linux-wireless

On 10.07.23 18:50, Nicolas Escande wrote:
> On Mon Jul 10, 2023 at 1:32 PM CEST, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 16.06.23 09:45, Nicolas Escande wrote:
>> > On Thu Jun 15, 2023 at 2:54 PM CEST, Linux regression tracking (Thorsten Leemhuis) wrote:
>> >> On 10.06.23 08:44, Bagas Sanjaya wrote:
>> >>> On Tue, Jun 06, 2023 at 12:55:57PM +0200, Nicolas Escande wrote:
>> >>>>
>> >>>> As user of the mesh part of mac80211 on multiple products at work let me say
>> >>>> thank you for all the work you do on wifi, especially on 80211s, and especially
>> >>>> the recent improvements you made for mesh fast RX/TX & cross vendor AMSDU compat
>> >>>>
>> >>>> We upgraded our kernel from an older (5.15) to a newer 6.4. The problem is STP 
>> >>>> doesn't work anymore and alas we use it for now (for the better or worse).
>> >>>>
>> >>>> What I gathered so far from my setup:
>> >>>>  - we use ath9k & ath10k
>> >>>>  - in my case STP frames are received as regular packet and not as amsdu
>> >>>>  - the received packets have a wrong length of 44 in tcpdump
>> >>>>    (instead of 38 with our previous kernel)
>> >>>>  - llc_fixup_skb() tries to pull some 41 bytes out of a 35 bytes packet
>> >>>>    this makes llc_rcv() discard the frames & breaks STP
>> >>>>
>> >>>> From bisecting the culprit seems to be 986e43b19ae9176093da35e0a844e65c8bf9ede7
>> >>>> (wifi: mac80211: fix receiving A-MSDU frames on mesh interfaces)
>> >>>>
>> >>>> I guess that your changes to handle both ampdu subframes & normal frames in the
>> >>>> same datapath ends up putting a wrong skb->len for STP (multicast) frames ?
>> >>>> Honestly I don't understand enough of the 80211 internals & spec to pinpoint the
>> >>>> exact problem.
>> >>>>
>> >>>> It seems this change was already in the 6.3 kernel so I guess someone should
>> >>>> have seen it before (but I didn't find anything..) ? Maybe I missed something...
>> >>>>
>> >>>> Anyway I'm happy to provide more info or try anything you throw at me.
>> >> [...]
>> >> Hmmm, Felix did not reply. But let's ignore that for now.
>> > 
>> > I haven't seen mails from Felix on the list for a few days, I'm guessing he's
>> > unavailable for now but I'll happily wait.
>>
>> Still no progress. Hmmm. Are you still okay with that? I've seen no
>> other reports about this, so waiting is somewhat (albeit not completely)
>> fine for me if it is for you.
> I'm not so surprised no one else reported it, using STP on wifi (and 802.11s) is
> not a really common thing to do, to be honest (and STP on wifi is unreliable).
> Even though some openwrt guys do it for sure, I'm guessing their kernel version
> is lagging behind...
>>
>> But in any case it might be good if you could recheck 6.5-rc1.
> Testing on 6.5 as a whole won't be as easy for me as testing a single patch on
> top of 6.4. I'll do my best to try but from what I saw nothing got merged that
> would even remotely help me on this issue.
> 
> I am not losing hope that Felix or someone who understands this stuff better
> will find the time to look into this. I'm guessing it's the summer vacation effect.

Sorry for the delay. This should fix the regression, please test.
I will submit it for 6.5 soon.
---
--- a/net/wireless/util.c
+++ b/net/wireless/util.c
@@ -580,6 +580,8 @@ int ieee80211_strip_8023_mesh_hdr(struct
 		hdrlen += ETH_ALEN + 2;
 	else if (!pskb_may_pull(skb, hdrlen))
 		return -EINVAL;
+	else
+		payload.eth.h_proto = htons(skb->len - hdrlen);
 
 	mesh_addr = skb->data + sizeof(payload.eth) + ETH_ALEN;
 	switch (payload.flags & MESH_FLAGS_AE) {
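
For frames that take the new else branch, the patch appears to recompute the
802.3 length field from the bytes left after the headers. A minimal userspace
sketch of that arithmetic follows. The helper name, the split of hdrlen into a
14-byte Ethernet header plus a 6-byte mesh header, and the 58-byte total are
assumptions picked to line up with the 44 vs 38 figures from the report; none
of them is confirmed in this thread.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size of an Ethernet header: two 6-byte addresses plus a 2-byte type/length. */
enum { ETH_HDR_LEN = 14 };

/*
 * Hypothetical helper, not kernel code: returns the length field (network
 * byte order) that skb->len - hdrlen would produce, assuming hdrlen covers
 * the Ethernet header plus the mesh header.
 */
static uint16_t recomputed_len_field(size_t skb_len, size_t mesh_hdr_len)
{
	size_t hdrlen = ETH_HDR_LEN + mesh_hdr_len;

	assert(skb_len >= hdrlen); /* caller must have checked the headers fit */
	return htons((uint16_t)(skb_len - hdrlen));
}
```

Under those assumptions, a 58-byte buffer with a 6-byte mesh header gives a
length field of 38, the value seen with the older kernel, instead of 44.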




* Re: [regression] STP on 80211s is broken in 6.4-rc4
  2023-07-11 11:12           ` Felix Fietkau
@ 2023-07-11 12:15             ` Nicolas Escande
  0 siblings, 0 replies; 10+ messages in thread
From: Nicolas Escande @ 2023-07-11 12:15 UTC (permalink / raw)
  To: Felix Fietkau, Linux regressions mailing list, Bagas Sanjaya,
	Toke Høiland-Jørgensen, Kalle Valo, Johannes Berg
  Cc: linux-wireless

On Tue Jul 11, 2023 at 1:12 PM CEST, Felix Fietkau wrote:
> On 10.07.23 18:50, Nicolas Escande wrote:
> > On Mon Jul 10, 2023 at 1:32 PM CEST, Linux regression tracking (Thorsten Leemhuis) wrote:
> >> On 16.06.23 09:45, Nicolas Escande wrote:
> >> > On Thu Jun 15, 2023 at 2:54 PM CEST, Linux regression tracking (Thorsten Leemhuis) wrote:
> >> >> On 10.06.23 08:44, Bagas Sanjaya wrote:
> >> >>> On Tue, Jun 06, 2023 at 12:55:57PM +0200, Nicolas Escande wrote:
> >> >>>>
> >> >>>> As user of the mesh part of mac80211 on multiple products at work let me say
> >> >>>> thank you for all the work you do on wifi, especially on 80211s, and especially
> >> >>>> the recent improvements you made for mesh fast RX/TX & cross vendor AMSDU compat
> >> >>>>
> >> >>>> We upgraded our kernel from an older (5.15) to a newer 6.4. The problem is STP 
> >> >>>> doesn't work anymore and alas we use it for now (for the better or worse).
> >> >>>>
> >> >>>> What I gathered so far from my setup:
> >> >>>>  - we use ath9k & ath10k
> >> >>>>  - in my case STP frames are received as regular packet and not as amsdu
> >> >>>>  - the received packets have a wrong length of 44 in tcpdump
> >> >>>>    (instead of 38 with our previous kernel)
> >> >>>>  - llc_fixup_skb() tries to pull some 41 bytes out of a 35 bytes packet
> >> >>>>    this makes llc_rcv() discard the frames & breaks STP
> >> >>>>
> >> >>>> From bisecting the culprit seems to be 986e43b19ae9176093da35e0a844e65c8bf9ede7
> >> >>>> (wifi: mac80211: fix receiving A-MSDU frames on mesh interfaces)
> >> >>>>
> >> >>>> I guess that your changes to handle both ampdu subframes & normal frames in the
> >> >>>> same datapath ends up putting a wrong skb->len for STP (multicast) frames ?
> >> >>>> Honestly I don't understand enough of the 80211 internals & spec to pinpoint the
> >> >>>> exact problem.
> >> >>>>
> >> >>>> It seems this change was already in the 6.3 kernel so I guess someone should
> >> >>>> have seen it before (but I didn't find anything..) ? Maybe I missed something...
> >> >>>>
> >> >>>> Anyway I'm happy to provide more info or try anything you throw at me.
> >> >> [...]
> >> >> Hmmm, Felix did not reply. But let's ignore that for now.
> >> > 
> >> > I haven't seen mails from Felix on the list for a few days, I'm guessing he's
> >> > unavailable for now but I'll happily wait.
> >>
> >> Still no progress. Hmmm. Are you still okay with that? I've seen no
> >> other reports about this, so waiting is somewhat (albeit not completely)
> >> fine for me if it is for you.
> > I'm not so surprised no one else reported it, using STP on wifi (and 802.11s) is
> > not a really common thing to do, to be honest (and STP on wifi is unreliable).
> > Even though some openwrt guys do it for sure, I'm guessing their kernel version
> > is lagging behind...
> >>
> >> But in any case it might be good if you could recheck 6.5-rc1.
> > Testing on 6.5 as a whole won't be as easy for me as testing a single patch on
> > top of 6.4. I'll do my best to try but from what I saw nothing got merged that
> > would even remotely help me on this issue.
> > 
> > I am not losing hope that Felix or someone who understands this stuff better
> > will find the time to look into this. I'm guessing it's the summer vacation effect.
>
> Sorry for the delay. This should fix the regression, please test.
> I will submit it for 6.5 soon.
> ---
> --- a/net/wireless/util.c
> +++ b/net/wireless/util.c
> @@ -580,6 +580,8 @@ int ieee80211_strip_8023_mesh_hdr(struct
>   		hdrlen += ETH_ALEN + 2;
>   	else if (!pskb_may_pull(skb, hdrlen))
>   		return -EINVAL;
> +	else
> +		payload.eth.h_proto = htons(skb->len - hdrlen);
>
>   	mesh_addr = skb->data + sizeof(payload.eth) + ETH_ALEN;
>   	switch (payload.flags & MESH_FLAGS_AE) {

Great, that does the trick for me
Thanks Felix


end of thread, other threads:[~2023-07-11 12:15 UTC | newest]

Thread overview: 10+ messages
     [not found] <CT5GNZSK28AI.2K6M69OXM9RW5@syracuse>
2023-06-10  6:44 ` [regression] STP on 80211s is broken in 6.4-rc4 Bagas Sanjaya
2023-06-15 12:54   ` Linux regression tracking (Thorsten Leemhuis)
2023-06-16  7:45     ` Nicolas Escande
2023-06-16  9:25       ` Linux regression tracking (Thorsten Leemhuis)
2023-06-16 12:17         ` Bagas Sanjaya
2023-06-16 12:33           ` Linux regression tracking (Thorsten Leemhuis)
2023-07-10 11:32       ` Linux regression tracking (Thorsten Leemhuis)
2023-07-10 16:50         ` Nicolas Escande
2023-07-11 11:12           ` Felix Fietkau
2023-07-11 12:15             ` Nicolas Escande
